19MAM81-GRL Midsem 1 Answer Key
An adjacency matrix is a square matrix used to represent a finite graph. The elements of the matrix
indicate whether pairs of vertices are adjacent (connected) or not in the graph.
Definition
For a graph with n vertices, the adjacency matrix A is an n × n matrix where:
● A[i][j] = 1 if there is an edge between vertex i and vertex j.
● A[i][j] = 0 if there is no edge between vertex i and vertex j.
For weighted graphs, the matrix contains the weight of the edge instead of just 1s and 0s.
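A minimal sketch of building such a matrix, assuming a small undirected example graph (numpy):

import numpy as np

# Hypothetical undirected graph on 4 vertices with edges (0,1), (1,2), (2,3)
edges = [(0, 1), (1, 2), (2, 3)]
n = 4

A = np.zeros((n, n), dtype=int)
for i, j in edges:
    A[i][j] = 1  # edge between vertex i and vertex j
    A[j][i] = 1  # symmetric entry, since the graph is undirected

print(A)
# [[0 1 0 0]
#  [1 0 1 0]
#  [0 1 0 1]
#  [0 0 1 0]]

For a weighted graph, the same loop would store the edge weight instead of 1.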
The empirical reconstruction loss L is defined over a set of training node pairs D. Each pair (u, v) ∈ D consists of two nodes whose decoded similarity is compared with the true similarity measure:
L = Σ_{(u,v) ∈ D} ℓ(DEC(z_u, z_v), S[u, v]),
where ℓ measures the discrepancy between the decoded and true similarity values.
8 Bring out the differences between encoder and decoder.
Encoder - Converts a graph (nodes & edges) into a low-dimensional latent representation
(embeddings).
Decoder - Reconstructs the graph or performs tasks like link prediction using the latent
representations.
9 Differentiate between biased random walk and personalized random walk.
Biased Random Walks (e.g., GraRep) – skip over nodes to capture long-range dependencies.
Personalized Random Walks – modify walk probabilities based on external attributes such as edge weights or node types (see the sketch below).
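A minimal sketch of the personalized case, assuming a hypothetical weighted adjacency list (the next node is chosen in proportion to edge weight):

import random

# Hypothetical adjacency list with edge weights
graph = {
    'A': [('B', 3.0), ('C', 1.0)],
    'B': [('A', 3.0), ('C', 2.0)],
    'C': [('A', 1.0), ('B', 2.0)],
}

def personalized_walk(start, length):
    """Random walk whose transition probabilities follow edge weights."""
    walk = [start]
    for _ in range(length):
        nbrs, weights = zip(*graph[walk[-1]])
        walk.append(random.choices(nbrs, weights=weights, k=1)[0])
    return walk

print(personalized_walk('A', 5))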
10 Define uniform and filtered sampling.
Uniform Sampling
Definition: In uniform sampling, each node or edge in the graph has an equal probability of being selected, regardless of its importance or characteristics.
Filtered Sampling
Definition: Filtered sampling selects nodes or edges based on specific criteria or constraints (such as degree, centrality, importance, or label distribution). A sketch of both follows.
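A minimal sketch of both, assuming a hypothetical adjacency list and a degree-based filter:

import random

# Hypothetical adjacency list
graph = {1: [2, 3, 4], 2: [1], 3: [1, 4], 4: [1, 3]}

# Uniform sampling: every node is equally likely to be picked
uniform = random.sample(list(graph), k=2)

# Filtered sampling: restrict to nodes meeting a criterion (here, degree >= 2)
candidates = [v for v in graph if len(graph[v]) >= 2]
filtered = random.sample(candidates, k=2)

print(uniform, filtered)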
Eigenvector Centrality
• Definition: accounts for the importance of a node's neighbors.
• A node's centrality is proportional to the average centrality of its neighbors:
e_u = (1/λ) Σ_{v ∈ V} A[u, v] e_v, i.e., A e = λ e,
where λ is a constant (the leading eigenvalue of A).
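A minimal sketch, assuming a small example adjacency matrix, that computes these scores by power iteration (the vector converges to the leading eigenvector of A):

import numpy as np

# Adjacency matrix of a small undirected graph (assumed example)
A = np.array([[0, 1, 1, 0],
              [1, 0, 1, 1],
              [1, 1, 0, 0],
              [0, 1, 0, 0]], dtype=float)

# Power iteration: repeatedly apply A and renormalize; the result
# converges to the leading eigenvector, i.e. the centrality scores.
e = np.ones(A.shape[0])
for _ in range(100):
    e = A @ e
    e = e / np.linalg.norm(e)

print(np.round(e, 3))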
Full Graph:
• Represents the entire network with all edges (both training and test edges).
• The solid blue edges are the training edges.
• The dashed red edges are the test edges that need to be predicted.
Training Graph:
• A subsampled version of the full graph, where the test edges (red dashed lines) have been removed.
• This graph is used for training models or computing overlap statistics.
2. Salton Index
Normalizes the neighborhood overlap by the product of the degrees:
Salton(u, v) = |N(u) ∩ N(v)| / √(d_u · d_v)
Useful for balancing the influence of degrees in large, heterogeneous graphs (e.g., some nodes may have hundreds of neighbors, while others have only a few).
3. Jaccard Index
Considers both the shared and total neighbors of nodes u and v:
Jaccard(u, v) = |N(u) ∩ N(v)| / |N(u) ∪ N(v)|
A widely used metric for neighborhood similarity; a computational sketch follows.
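A minimal sketch of both indices, assuming example neighbor sets for u and v:

# Neighbor sets of two nodes u and v (assumed example)
N_u = {1, 2, 3, 4}
N_v = {3, 4, 5}

common = N_u & N_v  # shared neighbors

# Salton index: |N(u) ∩ N(v)| / sqrt(d_u * d_v)
salton = len(common) / (len(N_u) * len(N_v)) ** 0.5

# Jaccard index: |N(u) ∩ N(v)| / |N(u) ∪ N(v)|
jaccard = len(common) / len(N_u | N_v)

print(salton, jaccard)  # 0.577..., 0.4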
or
14 a) Explain Katz index and Leicht-Holme-Newman (LHN) Similarity.
Katz Index
The Katz index computes the similarity between two nodes based on the count of paths of all lengths between them, with shorter paths receiving higher weight:
S_Katz[u, v] = Σ_{i=1}^{∞} β^i A^i[u, v], with 0 < β < 1/λ_max so that the series converges.
LHN Similarity
The LHN similarity normalizes the Katz index by accounting for the expected number of paths under a random graph model. This reduces the bias toward high-degree nodes.
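A minimal sketch of the Katz index via its closed form (I − βA)⁻¹ − I, assuming a small example graph and β = 0.1:

import numpy as np

# Adjacency matrix of a small graph (assumed example)
A = np.array([[0, 1, 0],
              [1, 0, 1],
              [0, 1, 0]], dtype=float)

beta = 0.1  # must satisfy beta < 1 / lambda_max(A) for convergence

# Katz similarity: sum over i >= 1 of beta^i * A^i,
# computed via the closed form (I - beta*A)^{-1} - I
I = np.eye(A.shape[0])
S_katz = np.linalg.inv(I - beta * A) - I

print(np.round(S_katz, 4))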
b) Find the unnormalized laplacian for the graph with edges {(2,3),(3,4),(4,5)}.
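Worked answer: the vertices {2, 3, 4, 5} form the path 2–3–4–5, with degrees 1, 2, 2, 1. The unnormalized Laplacian is L = D − A (rows/columns ordered as vertices 2, 3, 4, 5):

L =
[  1  -1   0   0 ]
[ -1   2  -1   0 ]
[  0  -1   2  -1 ]
[  0   0  -1   1 ]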
c)
d)
15 a) Explain shallow embedding approach with example.
Encoder:
The encoder's job is to transform each node in the graph into a low-dimensional vector
(embedding). This mapping is learned such that similar nodes in the graph should be
represented by similar vectors in the embedding space.
Pairwise Decoders
The most common type of decoder used in node embedding models is the pairwise decoder.
This decoder takes in a pair of node embeddings and predicts their relationship or similarity
based on the graph structure.
This means that the decoder receives two d-dimensional embeddings as input and outputs a
positive scalar value that represents the relationship between the two nodes.
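A minimal sketch of a shallow encoder (an embedding lookup table) paired with an inner-product pairwise decoder, assuming random initial embeddings of hypothetical sizes; a sigmoid squashes the score into (0, 1) to give the positive scalar described above:

import numpy as np

n_nodes, d = 5, 2  # 5 nodes, 2-dimensional embeddings (assumed sizes)

# Shallow encoder: a lookup table Z with one row per node,
# learned directly as free parameters during training.
Z = np.random.randn(n_nodes, d)

def encode(u):
    return Z[u]  # ENC(u) = Z[u]

def decode(z_u, z_v):
    # Inner-product pairwise decoder; the sigmoid maps the raw
    # score to (0, 1), i.e. a positive scalar similarity
    return 1.0 / (1.0 + np.exp(-np.dot(z_u, z_v)))

print(decode(encode(0), encode(3)))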
or
The loss function ensures that similar nodes stay close in the embedding space:
L = Σ_{(u,v) ∈ E} ||z_u − z_v||²
Step 5: Interpretation
Each row of Z represents a 2D embedding for a node.
Similar nodes (e.g., A and C, B and D) have closer embeddings.
The embeddings preserve local graph structure by minimizing differences between connected
nodes.
This assumes that node similarity (e.g., neighborhood overlap) is proportional to the dot
product of their embeddings.
Some methods using this approach:
Graph Factorization (GF): uses the adjacency matrix (S = A) as the similarity measure.
GraRep: uses powers of the adjacency matrix to capture long-range connections.
HOPE: uses more general node similarity measures.
The loss function minimizes the difference between predicted and actual similarities:
L = Σ_{(u,v) ∈ D} ||z_u^T z_v − S[u, v]||²
These methods can be solved using Singular Value Decomposition (SVD), reducing the problem to matrix factorization: learn embeddings Z such that Z Z^T ≈ S (see the sketch below).
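A minimal sketch of the SVD route, assuming S is taken as a power of the adjacency matrix (in the spirit of GraRep):

import numpy as np

# Adjacency matrix of a 4-cycle (assumed example)
A = np.array([[0, 1, 1, 0],
              [1, 0, 0, 1],
              [1, 0, 0, 1],
              [0, 1, 1, 0]], dtype=float)

# Similarity matrix: a power of A, as in GraRep (here the 2-step matrix)
S = A @ A

d = 2  # embedding dimension

# Truncated SVD: keep the top-d singular pairs, so S ≈ Z Z^T
U, s, Vt = np.linalg.svd(S)
Z = U[:, :d] * np.sqrt(s[:d])  # embeddings: one row z_u per node

# The dot product z_u . z_v approximates S[u, v]
print(np.round(Z @ Z.T, 2))
print(S)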
or
18 a) How does reconstruction of multi-relational graph happen with loss function?
In a multi-relational graph, nodes are connected by different types of edges (relationships).
An important task is to embed these nodes into a low-dimensional space and reconstruct their relationships.
Unlike simple graphs, where we only consider node pairs, multi-relational graphs require us
to consider edge types as well.
Decoder Function for Multi-Relational Graphs
The decoder function DEC(u, τ, v) computes the likelihood that an edge (u, v) of type τ
exists.
One early approach is RESCAL, which represents each relation τ by a learnable matrix R_τ and uses the bilinear decoder
DEC(u, τ, v) = z_u^T R_τ z_v.
The reconstruction loss sums the squared error between the decoded scores and the true edges over all node pairs and relation types:
L = Σ_u Σ_τ Σ_v ||z_u^T R_τ z_v − A[u, τ, v]||²
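A minimal sketch of this bilinear decoder, assuming randomly initialized parameters of hypothetical sizes:

import numpy as np

n_nodes, d, n_rels = 4, 3, 2  # assumed sizes

# Learnable parameters: node embeddings Z and one matrix R_tau per relation
Z = np.random.randn(n_nodes, d)
R = np.random.randn(n_rels, d, d)

def dec(u, tau, v):
    """RESCAL bilinear decoder: DEC(u, tau, v) = z_u^T R_tau z_v."""
    return float(Z[u] @ R[tau] @ Z[v])

# Score for a candidate edge (0, 2) under relation type 1
print(dec(0, 1, 2))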