STCN Major 2

standard topics in complex networks 2

Stanford CS224W: Machine Learning with Graphs
Instructor: Prof. Jure Leskovec, Stanford University
http://cs224w.stanford.edu

Lecture 2.3: Traditional Feature-Based Methods: Graph-Level Features and Graph Kernels

Graph-level features
= We want features that characterize the structure of an entire graph.

Background: kernel methods
= Kernel methods are widely used in traditional ML for graph-level prediction.
= Idea: design kernels instead of feature vectors.
= A kernel K(G, G') ∈ R measures the similarity between data points (here, between graphs).
= The kernel matrix K = (K(G, G'))_{G,G'} must always be positive semidefinite (i.e., have non-negative eigenvalues).
= There exists a feature representation φ(·) such that K(G, G') = φ(G)^T φ(G').
= Once the kernel is defined, an off-the-shelf ML model, such as a kernel SVM, can be used to make predictions.

Graph kernels: measure similarity between two graphs
= Graphlet kernel [1]
= Weisfeiler-Lehman (WL) kernel [2]
= Other kernels are also proposed in the literature (beyond the scope of this lecture): the random-walk kernel, the shortest-path graph kernel, and many more.
[1] "Efficient graphlet kernels for large graph comparison." Artificial Intelligence and Statistics, 2009.
[2] "Weisfeiler-Lehman graph kernels." Journal of Machine Learning Research 12 (2011).

Key idea: Bag-of-Words (BoW) for a graph
= Recall: BoW simply uses word counts as features for documents (no ordering considered).
= Naive extension to a graph: regard nodes as words. Since both example graphs have 4 red nodes, we get the same feature vector for two different graphs: φ(G1) = φ(G2).
= Using a bag of node degrees instead (Deg1, Deg2, Deg3):
  * φ(G1) = count(degrees in G1) = [1, 3, 0]
  * φ(G2) = count(degrees in G2) = [0, 2, 2]
  * This obtains different features for different graphs!
= Both the Graphlet Kernel and the Weisfeiler-Lehman (WL) Kernel use a Bag-of-* representation of a graph, where * is more sophisticated than node degrees.

Graphlet features
= Key idea: count the number of different graphlets in a graph.
= Note: the definition of graphlets here is slightly different from the one used for node-level features. The two differences are:
  * Nodes in graphlets here do not need to be connected (isolated nodes are allowed).
  * The graphlets here are not rooted.
= For k = 3, there are 4 graphlets (g1, g2, g3, g4). For k = 4, there are 11 graphlets.

Graphlet count vector
= Given a graph G and a graphlet list G_k = (g1, g2, ..., g_{n_k}), define the graphlet count vector f_G ∈ R^{n_k} by (f_G)_i = #(g_i ⊆ G) for i = 1, 2, ..., n_k.
= Example for k = 3: f_G = (1, 3, 6, 0)^T.
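As a concrete illustration (not from the lecture), here is a minimal sketch of computing the size-3 graphlet count vector by brute-force enumeration of node triples with NetworkX; for k = 3 the four graphlets are uniquely identified by the number of edges in the induced subgraph.

```python
from itertools import combinations
import networkx as nx

def graphlet_count_vector_k3(G: nx.Graph):
    """Count size-3 graphlets by enumerating all node triples (O(n^3)).

    For k = 3 the four (not necessarily connected) graphlets correspond to
    induced subgraphs with 3 edges (triangle), 2 edges (path),
    1 edge (edge + isolated node), and 0 edges (empty)."""
    counts = [0, 0, 0, 0]  # index 0: triangle, 1: path, 2: single edge, 3: no edges
    for triple in combinations(G.nodes(), 3):
        m = G.subgraph(triple).number_of_edges()
        counts[3 - m] += 1
    return counts

# Toy usage: a triangle with one pendant node and one isolated node
G = nx.Graph([(0, 1), (1, 2), (0, 2), (2, 3)])
G.add_node(4)
print(graphlet_count_vector_k3(G))
```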
Graphlet kernel
= Given two graphs G and G', the graphlet kernel is computed as K(G, G') = f_G^T f_{G'}.
= Problem: if G and G' have different sizes, that will greatly skew the value.
= Solution: normalize each feature vector, h_G = f_G / sum(f_G), and compute K(G, G') = h_G^T h_{G'}.

Limitations: counting graphlets is expensive!
= Counting size-k graphlets of a graph with n nodes by enumeration takes O(n^k) time.
= This is unavoidable in the worst case, since the subgraph isomorphism test (judging whether a graph is a subgraph of another graph) is NP-hard.
= If a graph's node degree is bounded by d, an O(n d^{k-1}) algorithm exists to count all graphlets of size k.
= Can we design a more efficient graph kernel?

Weisfeiler-Lehman (WL) kernel: color refinement
= Idea: iteratively enrich node "colors" using neighborhood structure.
= Steps (illustrated on two example graphs):
  * Assign initial colors to all nodes.
  * Aggregate neighboring colors: each node collects the multiset of its neighbors' colors together with its own color.
  * Hash the aggregated colors into new, refined colors (using a hash table shared across both graphs).
  * Repeat the aggregate-and-hash step for K rounds.

WL graph features
= After K rounds of color refinement, represent each graph by the counts of nodes with each color (accumulated over all rounds). Example with 13 colors:
  * Graph 1: counts over colors 1-13 = [6, 2, 1, 2, 1, 0, 2, 1, 0, 0, 0, 2, 1]
  * Graph 2: counts over colors 1-13 = [6, 2, 1, 2, 1, 1, 1, 0, 1, 1, 1, 0, 1]
= The WL kernel value is the inner product of these color-count vectors.

WL kernel is computationally efficient
= The time complexity of color refinement at each step is linear in #(edges), since it only involves aggregating neighboring colors.
= When computing a kernel value, only colors that appear in the two graphs need to be tracked. Thus, #(colors) is at most the total number of nodes.
= Counting colors takes linear time w.r.t. #(nodes).
= In total, the time complexity is linear in #(edges).

Graph-level kernels: summary
= Graphlet kernel:
  * The graph is represented as a bag of graphlets.
  * Computationally expensive.
= Weisfeiler-Lehman kernel:
  * Apply a K-step color refinement algorithm to enrich node colors; different colors capture different K-hop neighborhood structures.
  * The graph is represented as a bag of colors.
  * Computationally efficient.
  * Closely related to Graph Neural Networks (as we will see!).
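A minimal sketch of K-step color refinement and the resulting WL kernel value (the function names and the use of plain adjacency dictionaries are illustrative choices, not the lecture's code):

```python
from collections import Counter

def wl_color_counts(graphs, K):
    """graphs: list of adjacency dicts (node -> list of neighbors).
    Runs K rounds of color refinement with a hash table shared across all
    graphs; returns one Counter of accumulated node colors per graph."""
    colors = [{v: 0 for v in adj} for adj in graphs]   # same initial color for every node
    counts = [Counter(c.values()) for c in colors]
    table = {}                                         # shared hash table: signature -> color id
    for _ in range(K):
        for g, adj in enumerate(graphs):
            new = {}
            for v in adj:
                # aggregate a node's own color with the multiset of its neighbors' colors
                sig = (colors[g][v], tuple(sorted(colors[g][u] for u in adj[v])))
                new[v] = table.setdefault(sig, len(table) + 1)
            colors[g] = new
            counts[g].update(new.values())
    return counts

def wl_kernel(adj1, adj2, K=2):
    """WL kernel value = inner product of the two color-count vectors."""
    c1, c2 = wl_color_counts([adj1, adj2], K)
    return sum(c1[c] * c2[c] for c in set(c1) | set(c2))

# Toy usage: a triangle vs. a 3-node path
triangle = {0: [1, 2], 1: [0, 2], 2: [0, 1]}
path = {0: [1], 1: [0, 2], 2: [1]}
print(wl_kernel(triangle, path, K=2))
```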
Summary: traditional feature-based methods
= Hand-crafted features + an ML model.
= Node-level: node degree, centrality, clustering coefficient, graphlets.
= Link-level: distance-based features, local/global neighborhood overlap.
= Graph-level: graphlet kernel, WL kernel.

Lecture 3.1: Node Embeddings
Stanford CS224W: Machine Learning with Graphs
Instructor: Prof. Jure Leskovec, Stanford University
http://cs224w.stanford.edu

Recap: traditional ML for graphs
= Given an input graph, extract node-, link-, and graph-level features, then learn a model (SVM, neural network, etc.) that maps features to labels.
= Pipeline: input graph -> structured features -> learning algorithm -> prediction.
= Feature engineering (node-level, edge-level, graph-level features) feeds the downstream prediction task.

Graph representation learning
= Graph representation learning alleviates the need to do feature engineering every single time: the features are learned automatically.
= Pipeline: input graph -> representation learning (automatically learned features) -> downstream prediction task.
= Goal: efficient, task-independent feature learning for machine learning with graphs; map each node u to a vector f(u) = z_u ∈ R^d, its feature representation or embedding.

Why embeddings?
= Similarity of embeddings between nodes indicates their similarity in the network. For example, nodes that are close to each other (connected by an edge) should have similar embeddings.
= Embeddings encode network information.
= They can potentially be used for many downstream prediction tasks: node classification, link prediction, graph classification, anomalous node detection, clustering, ...
= Example: a 2D embedding of the nodes of Zachary's Karate Club network (input graph on the left, output embedding in R^2 on the right). [figure]

Encoder and decoder: setup
= Assume we have a graph G:
  * V is the vertex set.
  * A is the adjacency matrix (assume binary).
  * For simplicity, no node features or extra information is used.
= Example: V = {1, 2, 3, 4} and
  A = [[0, 1, 0, 1],
       [1, 0, 0, 1],
       [0, 0, 0, 1],
       [1, 1, 1, 0]]

Node embedding goal
= The goal is to encode nodes so that similarity in the embedding space (e.g., the dot product) approximates similarity in the graph.
= Goal: similarity(u, v) ≈ z_v^T z_u, where similarity(u, v) is defined in the original network and z_v^T z_u is the similarity of the embeddings.
= ENC(u) and ENC(v) map nodes u and v from the original network into the embedding space.
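A tiny numpy sketch of this framework on the 4-node example above: the encoder is any map from nodes to vectors, and the decoder is the dot product between embeddings (the random embeddings here are just placeholders for whatever the optimization would learn):

```python
import numpy as np

# Adjacency matrix of the 4-node example graph (nodes 1..4 -> indices 0..3)
A = np.array([[0, 1, 0, 1],
              [1, 0, 0, 1],
              [0, 0, 0, 1],
              [1, 1, 1, 0]])

d, num_nodes = 8, A.shape[0]
rng = np.random.default_rng(0)
Z = rng.normal(size=(d, num_nodes))   # embedding matrix: one column per node

def decode(u, v):
    """Decoder: similarity score in embedding space = z_u^T z_v."""
    return Z[:, u] @ Z[:, v]

# We would like decode(u, v) to be large exactly when similarity(u, v)
# in the graph is high (e.g., when A[u, v] = 1).
print(decode(0, 3), A[0, 3])
```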
Learning node embeddings
1. The encoder ENC maps from nodes to embeddings.
2. Define a node similarity function (i.e., a measure of similarity in the original network).
3. The decoder DEC maps from embeddings to a similarity score.
4. Optimize the parameters of the encoder so that similarity(u, v) ≈ z_v^T z_u.

Two key components
= Encoder: maps each node to a low-dimensional vector, ENC(v) = z_v.
= Similarity function: specifies how relationships in the vector space map to relationships in the original network; the decoder computes similarity(u, v) ≈ z_v^T z_u, i.e., the similarity of u and v in the original network is approximated by the dot product between their node embeddings.

"Shallow" encoding
= Simplest encoding approach: the encoder is just an embedding lookup,
  ENC(v) = z_v = Z · v
  * Z ∈ R^{d×|V|}: a matrix in which each column is a node embedding (this is what we learn/optimize); d is the dimension/size of the embeddings, and there is one column per node.
  * v ∈ I^{|V|}: an indicator (one-hot) vector, all zeroes except a one in the column indicating node v.
= Each node is assigned a unique embedding vector (i.e., we directly optimize the embedding of each node).
= Many methods use this approach: DeepWalk, node2vec.

Framework summary
= Encoder: shallow encoder, i.e., an embedding lookup.
= Parameters to optimize: Z, which contains the node embeddings z_u for all nodes u ∈ V.
= We will cover deep encoders (GNNs) in Lecture 6.
= Decoder: based on node similarity.
= Objective: maximize z_v^T z_u for node pairs (u, v) that are similar.

How to define node similarity?
= The key choice across methods is how they define node similarity.
= Should two nodes have a similar embedding if they...
  * are linked?
  * share neighbors?
  * have similar "structural roles"?
= We will now learn a node similarity definition that uses random walks, and how to optimize embeddings for such a similarity measure.
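A minimal sketch of shallow encoding as an embedding lookup, here using PyTorch's nn.Embedding as one common way to hold the learnable matrix Z (the sizes are illustrative):

```python
import torch
import torch.nn as nn

num_nodes, d = 4, 8
# Z is a learnable |V| x d table; each row is the embedding of one node.
Z = nn.Embedding(num_embeddings=num_nodes, embedding_dim=d)

def encode(v: int) -> torch.Tensor:
    """Shallow encoder: ENC(v) = z_v is just a row lookup in Z."""
    return Z(torch.tensor(v))

z_u, z_v = encode(0), encode(3)
score = torch.dot(z_u, z_v)   # decoder: dot-product similarity
print(score.item())
```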
Note on node embeddings
= This is an unsupervised/self-supervised way of learning node embeddings:
  * We are not utilizing node labels.
  * We are not utilizing node features.
= The goal is to directly estimate a set of coordinates (i.e., the embedding) of a node so that some aspect of the network structure (captured by DEC) is preserved.
= These embeddings are task independent: they are not trained for a specific task but can be used for any task.

Lecture 3.2: Random Walk Approaches for Node Embeddings
Stanford CS224W: Machine Learning with Graphs
Instructor: Prof. Jure Leskovec, Stanford University
http://cs224w.stanford.edu

Notation
= Vector z_u: the embedding of node u (what we aim to find).
= Probability P(v | z_u): our model's prediction, based on z_u, of the probability of visiting node v on random walks starting from node u.
= Non-linear functions used to produce predicted probabilities:
  * Softmax function: turns a vector of K real values (model predictions) into K probabilities that sum to 1: σ(z)_i = e^{z_i} / Σ_{j=1}^{K} e^{z_j}.
  * Sigmoid function: an S-shaped function that turns real values into the range (0, 1), written as S(x) = 1 / (1 + e^{-x}).

Random walk
= Given a graph and a starting point, we select a neighbor of it at random and move to this neighbor; then we select a neighbor of this point at random and move to it, and so on. The (random) sequence of points visited this way is a random walk on the graph.

Random-walk embeddings
= z_u^T z_v ≈ the probability that u and v co-occur on a random walk over the graph.
1. Estimate the probability P_R(v | u) of visiting node v on a random walk starting from node u, using some random walk strategy R.
2. Optimize the embeddings to encode these random walk statistics: similarity in the embedding space (here the dot product, i.e., cos(θ) up to scaling) encodes random walk "similarity".

Why random walks?
1. Expressivity: a flexible stochastic definition of node similarity that incorporates both local and higher-order neighborhood information. Idea: if a random walk starting from node u visits v with high probability, then u and v are similar (high-order, multi-hop information).
2. Efficiency: we do not need to consider all node pairs when training; we only need to consider the pairs that co-occur on random walks.

Unsupervised feature learning
= Intuition: find an embedding of nodes in d-dimensional space that preserves similarity.
= Idea: learn node embeddings such that nearby nodes are close together in the network.
= Given a node u, how do we define nearby nodes?
  * N_R(u): the neighbourhood of u obtained by some random walk strategy R.
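A minimal sketch (helper names assumed, not the lecture's code) of running short fixed-length unbiased random walks and collecting the neighborhood multiset N_R(u):

```python
import random
from collections import Counter

def random_walk(adj, start, length):
    """Unbiased random walk: repeatedly jump to a uniformly random neighbor."""
    walk = [start]
    for _ in range(length):
        walk.append(random.choice(adj[walk[-1]]))
    return walk

def neighborhood(adj, u, num_walks=10, walk_length=5):
    """N_R(u): multiset of nodes visited on random walks starting from u."""
    visited = Counter()
    for _ in range(num_walks):
        visited.update(random_walk(adj, u, walk_length)[1:])  # exclude the start node itself
    return visited

# Toy usage on a small graph given as an adjacency list
adj = {0: [1, 2], 1: [0, 2, 3], 2: [0, 1], 3: [1]}
print(neighborhood(adj, u=0))
```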
Feature learning as optimization
= Given G = (V, E), our goal is to learn a mapping f: u → R^d with f(u) = z_u.
= Log-likelihood objective:
  max_f Σ_{u ∈ V} log P(N_R(u) | z_u)
  where N_R(u) is the neighborhood of node u obtained by strategy R.
= Given node u, we want to learn feature representations that are predictive of the nodes in its neighborhood N_R(u).

Random walk optimization
1. Run short fixed-length random walks starting from each node u in the graph, using some random walk strategy R.
2. For each node u, collect N_R(u), the multiset of nodes visited on random walks starting from u (a multiset, because nodes can be visited multiple times).
3. Optimize the embeddings according to: given node u, predict its neighbors N_R(u), i.e., the maximum likelihood objective
  max_f Σ_{u ∈ V} log P(N_R(u) | z_u).
= Equivalently, minimize
  L = Σ_{u ∈ V} Σ_{v ∈ N_R(u)} −log P(v | z_u).
= Intuition: optimize the embeddings z_u to maximize the likelihood of random walk co-occurrences.
= Parameterize P(v | z_u) with a softmax:
  P(v | z_u) = exp(z_u^T z_v) / Σ_{n ∈ V} exp(z_u^T z_n).
= Why a softmax? We want node v to be the most similar node to u (out of all nodes); intuitively, Σ_i exp(x_i) ≈ max_i exp(x_i).
= Putting it all together:
  L = Σ_{u ∈ V} Σ_{v ∈ N_R(u)} −log( exp(z_u^T z_v) / Σ_{n ∈ V} exp(z_u^T z_n) ).
= Optimizing random-walk embeddings = finding the embeddings z_u that minimize L.

But doing this naively is too expensive!
= The nested sum over nodes gives O(|V|^2) complexity.
= The normalization term from the softmax is the culprit... can we approximate it?

Negative sampling
= Solution: negative sampling. Approximate the per-pair loss as
  −log P(v | z_u) ≈ −log σ(z_u^T z_v) − Σ_{i=1}^{k} log σ(−z_u^T z_{n_i}),  n_i ~ P_V
  where σ is the sigmoid function and P_V is a random distribution over nodes.
= Instead of normalizing w.r.t. all nodes, we just normalize against k random "negative samples" n_i.
= Sample the k negative nodes with probability proportional to their degree.
= Two considerations for k (the number of negative samples):
  1. A higher k gives more robust estimates.
  2. A higher k corresponds to a higher bias on negative events.
= In practice, k is typically 5-20.
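A small numpy sketch contrasting the full softmax loss with its negative-sampling approximation for a single (u, v) training pair (function and variable names are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)

def full_softmax_loss(Z, u, v):
    """-log P(v | z_u) with the full softmax over all |V| nodes: O(|V|) per pair."""
    scores = Z @ Z[u]                      # z_u^T z_n for every node n
    return -(scores[v] - np.log(np.sum(np.exp(scores))))

def neg_sampling_loss(Z, u, v, degrees, k=5):
    """Negative-sampling approximation: one positive term plus k negatives
    sampled with probability proportional to node degree."""
    sigmoid = lambda x: 1.0 / (1.0 + np.exp(-x))
    probs = degrees / degrees.sum()
    negatives = rng.choice(len(degrees), size=k, p=probs)
    loss = -np.log(sigmoid(Z[u] @ Z[v]))
    loss += -np.sum(np.log(sigmoid(-Z[u] @ Z[negatives].T)))
    return loss

# Toy usage: 6 nodes with 16-dimensional embeddings
Z = rng.normal(size=(6, 16))
degrees = np.array([3, 2, 2, 1, 1, 1], dtype=float)
print(full_softmax_loss(Z, u=0, v=1), neg_sampling_loss(Z, u=0, v=1, degrees=degrees))
```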
Optimizing the objective: gradient descent
= After we have obtained the objective function, how do we optimize (minimize) it?
  L = Σ_{u ∈ V} Σ_{v ∈ N_R(u)} −log P(v | z_u)
= Gradient descent: a simple way to minimize L:
  * Initialize z_i at some random value for all nodes i.
  * Iterate until convergence: for all i, compute the derivative ∂L/∂z_i, then step against the direction of the derivative, z_i ← z_i − η ∂L/∂z_i, where η is the learning rate.

Stochastic gradient descent (SGD)
= Instead of evaluating the gradient over all examples, evaluate it for each individual training example:
  * Initialize z_i at some random value for all nodes i.
  * Iterate until convergence, with L^(u) = Σ_{v ∈ N_R(u)} −log P(v | z_u):
    sample a node u; for all j, compute the derivative ∂L^(u)/∂z_j and update z_j ← z_j − η ∂L^(u)/∂z_j.

Random walks: summary
1. Run short fixed-length random walks starting from each node of the graph.
2. For each node u, collect N_R(u), the multiset of nodes visited on random walks starting from u.
3. Optimize the embeddings using stochastic gradient descent:
  L = Σ_{u ∈ V} Σ_{v ∈ N_R(u)} −log P(v | z_u),
  where P(v | z_u) can be efficiently approximated using negative sampling.

How should we randomly walk?
= So far we have described how to optimize embeddings given a random walk strategy R. What strategies should we use to run these random walks?
= Simplest idea: just run fixed-length, unbiased random walks starting from each node (i.e., DeepWalk from Perozzi et al., 2014).
= The issue is that such a notion of similarity is too constrained. How can we generalize this?
Reference: Perozzi et al. 2014. DeepWalk: Online Learning of Social Representations. KDD.
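Putting the pieces together, here is a compact sketch of DeepWalk-style training: unbiased walks plus SGD on the negative-sampling objective. This is a toy reimplementation under simplifying assumptions (nodes labeled 0..n-1, a fixed co-occurrence window, plain numpy), not the original DeepWalk code.

```python
import random
import numpy as np

def train_embeddings(adj, d=16, num_walks=10, walk_len=5, window=2,
                     k_neg=5, lr=0.025, epochs=5, seed=0):
    """adj: dict node -> list of neighbors, with nodes labeled 0..n-1."""
    rng = np.random.default_rng(seed)
    random.seed(seed)
    nodes = list(adj)
    Z = rng.normal(scale=0.1, size=(len(nodes), d))
    degrees = np.array([len(adj[n]) for n in nodes], dtype=float)
    neg_probs = degrees / degrees.sum()
    sigmoid = lambda x: 1.0 / (1.0 + np.exp(-x))

    for _ in range(epochs):
        for u in nodes:
            for _ in range(num_walks):
                walk = [u]
                for _ in range(walk_len):
                    walk.append(random.choice(adj[walk[-1]]))
                # every pair of nodes within `window` steps of each other co-occurs
                for i, src in enumerate(walk):
                    for dst in walk[max(0, i - window): i + window + 1]:
                        if dst == src:
                            continue
                        negs = rng.choice(len(nodes), size=k_neg, p=neg_probs)
                        # SGD step on: -log σ(z_src·z_dst) - Σ log σ(-z_src·z_neg)
                        s_pos = sigmoid(Z[src] @ Z[dst])
                        grad_src = -(1 - s_pos) * Z[dst]
                        Z[dst] += lr * (1 - s_pos) * Z[src]
                        for n in negs:
                            s_neg = sigmoid(Z[src] @ Z[n])
                            grad_src += s_neg * Z[n]
                            Z[n] -= lr * s_neg * Z[src]
                        Z[src] -= lr * grad_src
    return Z

# Toy usage: two triangles joined by one edge
adj = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2, 4, 5], 4: [3, 5], 5: [3, 4]}
Z = train_embeddings(adj)
print(Z[0] @ Z[1], Z[0] @ Z[4])  # compare similarity of a nearby pair vs. a distant pair
```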
Overview of node2vec
= Goal: embed nodes with similar network neighborhoods close together in the feature space.
= We frame this goal as a maximum likelihood optimization problem, independent of the downstream prediction task.
= Key observation: a flexible notion of the network neighborhood N_R(u) of node u leads to rich node embeddings.
= node2vec develops a biased 2nd-order random walk R to generate the network neighborhood N_R(u) of node u.
= Idea: use flexible, biased random walks that can trade off between local and global views of the network (Grover and Leskovec, 2016).
Reference: Grover and Leskovec. 2016. node2vec: Scalable Feature Learning for Networks. KDD.

node2vec: biased walks
= Two classic strategies to define a neighborhood N_R(u) of a given node u (for a walk of length 3, so N_R(u) has size 3):
  * N_BFS(u) = {s1, s2, s3}: a local, microscopic view of the neighbourhood (BFS).
  * N_DFS(u) = {s4, s5, s6}: a global, macroscopic view of the neighbourhood (DFS).

Interpolating BFS and DFS
= Use a biased fixed-length random walk R that, given a node u, generates the neighborhood N_R(u). Two parameters:
  * Return parameter p: controls returning back to the previous node.
  * In-out parameter q: controls moving outwards (DFS) vs. inwards (BFS); intuitively, q is the "ratio" of BFS vs. DFS.
= Biased 2nd-order random walks explore network neighborhoods:
  * Suppose the walk just traversed the edge (s1, w) and is now at w.
  * Insight: the neighbors of w can only be at the same distance to s1, farther from s1, or back at s1 itself.
  * Idea: remember where the walk came from.
= The walker came over the edge (s1, w) and is at w. Where does it go next? The unnormalized transition probabilities are:

  Target | Unnormalized prob. | Distance from s1
  s1     | 1/p                | 0
  s2     | 1                  | 1
  s3     | 1/q                | 2
  s4     | 1/q                | 2

= p and q model the transition probabilities: p is the return parameter, q the "walk away" parameter.
= A BFS-like walk corresponds to a low value of p; a DFS-like walk corresponds to a low value of q.
= N_R(u) consists of the nodes visited by the biased walk.

node2vec algorithm
1. Compute the random walk transition probabilities.
2. Simulate r random walks of length l starting from each node u.
3. Optimize the node2vec objective using stochastic gradient descent.
= Linear-time complexity; all three steps are individually parallelizable.

Other random walk ideas
= Different kinds of biased random walks:
  * based on node attributes (Dong et al., 2017);
  * based on learned weights (Abu-El-Haija et al., 2017).
= Alternative optimization schemes: directly optimize based on 1-hop and 2-hop random walk probabilities (as in LINE from Tang et al., 2015).
= Network preprocessing techniques: run random walks on modified versions of the original network (e.g., struct2vec by Ribeiro et al. 2017, HARP by Chen et al. 2016).

Summary so far
= Core idea: embed nodes so that distances in the embedding space reflect node similarities in the original network.
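A minimal sketch of one step of the node2vec biased second-order walk, using the unnormalized weights 1/p, 1, and 1/q from the table above (a toy illustration, not the reference implementation):

```python
import random

def node2vec_step(adj, prev, curr, p=1.0, q=1.0):
    """Choose the next node given that the walk came from `prev` and is at `curr`.
    Unnormalized weights: 1/p to return to prev, 1 to a common neighbor of prev
    and curr (same distance from prev), 1/q to a node farther from prev."""
    neighbors = adj[curr]
    weights = []
    for x in neighbors:
        if x == prev:                 # distance 0 from prev: return
            weights.append(1.0 / p)
        elif x in adj[prev]:          # distance 1 from prev: stay close
            weights.append(1.0)
        else:                         # distance 2 from prev: move outward
            weights.append(1.0 / q)
    return random.choices(neighbors, weights=weights, k=1)[0]

def node2vec_walk(adj, start, length, p=1.0, q=1.0):
    walk = [start, random.choice(adj[start])]   # the first step is uniform
    while len(walk) < length:
        walk.append(node2vec_step(adj, walk[-2], walk[-1], p, q))
    return walk

# Toy usage: a low p encourages backtracking (BFS-like), a low q encourages outward moves (DFS-like)
adj = {0: [1, 2], 1: [0, 2, 3], 2: [0, 1], 3: [1, 4], 4: [3]}
print(node2vec_walk(adj, start=0, length=6, p=0.25, q=4.0))
```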
= Different notions of node similarity:
  * Naive: two nodes are similar if they are connected.
  * Neighborhood overlap (covered in Lecture 2).
  * Random walk approaches (covered today).

So what method should I use?
= No one method wins in all cases. E.g., node2vec performs better on node classification, while alternative methods perform better on link prediction (Goyal and Ferrara, 2017 survey).
= Random walk approaches are generally more efficient.
= In general: you must choose the definition of node similarity that matches your application!

Lecture 3.3: Embedding Entire Graphs
Stanford CS224W: Machine Learning with Graphs
Instructor: Prof. Jure Leskovec, Stanford University
http://cs224w.stanford.edu

= Goal: we want to embed a subgraph or an entire graph G, producing a graph embedding z_G.
= Tasks:
  * classifying toxic vs. non-toxic molecules;
  * identifying anomalous graphs.

Approach 1: aggregate node embeddings
= Run a standard node embedding technique on the (sub)graph G.
= Then simply sum (or average) the node embeddings in the (sub)graph G:
  z_G = Σ_{v ∈ G} z_v
= Used by Duvenaud et al., 2016 to classify molecules based on their graph structure.

Approach 2: virtual node
= Introduce a "virtual node" to represent the (sub)graph, connect it to the (sub)graph's nodes, and run a standard node embedding technique.
= Proposed by Li et al., 2016 as a general technique for subgraph embedding.

Approach 3: anonymous walk embeddings
= States in anonymous walks correspond to the index of the first time we visited the node in a random walk.
= Anonymous walks are agnostic to the identity of the nodes visited (hence "anonymous").
= Example (random walk RW1):
  * Step 1: node A → state 1
  * Step 2: node B → state 2 (different from node 1)
  * Step 3: node C → state 3 (different from nodes 1 and 2)
  * Step 4: node B → state 2 (same as the node in step 2)
  * Step 5: node C → state 3 (same as the node in step 3)
  * So RW1 = A-B-C-B-C maps to the anonymous walk 1-2-3-2-3.
= Note: a different random walk RW2 over different nodes can give the same anonymous walk.
Reference: Ivanov and Burnaev. 2018. Anonymous Walk Embeddings. ICML.
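A tiny sketch of turning a random walk into its anonymous walk (state = index of first visit), matching the example above:

```python
def anonymize(walk):
    """Map each node to the index of its first appearance in the walk."""
    first_seen = {}
    states = []
    for node in walk:
        if node not in first_seen:
            first_seen[node] = len(first_seen) + 1
        states.append(first_seen[node])
    return tuple(states)

print(anonymize(["A", "B", "C", "B", "C"]))  # -> (1, 2, 3, 2, 3)
print(anonymize(["C", "D", "B", "D", "B"]))  # a different walk, same anonymous walk
```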
Growth of the number of anonymous walks
= The number of distinct anonymous walks grows exponentially with the walk length. [figure: number of anonymous walks vs. walk length]
= For example, there are 5 anonymous walks w_i of length 3:
  w_1 = 111, w_2 = 112, w_3 = 121, w_4 = 122, w_5 = 123.

Simple use of anonymous walks
= Simulate anonymous walks w_i of l steps and record their counts.
= Represent the graph as a probability distribution over these walks.
= For example, set l = 3:
  * We can then represent the graph as a 5-dimensional vector, since there are 5 anonymous walks w_i of length 3: 111, 112, 121, 122, 123.
  * z_G[i] = the probability of anonymous walk w_i in G.
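A short sketch of this simple anonymous-walk representation: sample random walks of length l = 3, anonymize them, and estimate the distribution over the 5 possible anonymous walks (helper names are illustrative; `anonymize` is repeated from the sketch above so this block is self-contained):

```python
import random
from collections import Counter

ANON_WALKS_L3 = [(1, 1, 1), (1, 1, 2), (1, 2, 1), (1, 2, 2), (1, 2, 3)]

def anonymize(walk):
    first_seen, states = {}, []
    for node in walk:
        first_seen.setdefault(node, len(first_seen) + 1)
        states.append(first_seen[node])
    return tuple(states)

def anonymous_walk_embedding(adj, num_samples=1000, length=3):
    """z_G[i] = estimated probability of anonymous walk w_i among sampled walks."""
    counts = Counter()
    nodes = list(adj)
    for _ in range(num_samples):
        walk = [random.choice(nodes)]
        while len(walk) < length:
            walk.append(random.choice(adj[walk[-1]]))
        counts[anonymize(walk)] += 1
    return [counts[w] / num_samples for w in ANON_WALKS_L3]

# Toy usage: two triangles joined by an edge
adj = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2, 4, 5], 4: [3, 5], 5: [3, 4]}
print(anonymous_walk_embedding(adj))
```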
