
Chapter 4 – Machine Learning with Graphs II

Prepared by: Shier Nee, SAW


Announcement
• 31 Dec 2021 - Week 10
• Proposal submission
• Asynchronous
History
• GNN was introduced back in 2005
• First paper published in 2008.
• Started to gain popularity in the past 5 years because it is able to model
relationships between nodes and produce a numeric representation of
them.
• GNN originates from two ML techniques: recursive neural networks
and Markov chains.
• Idea: encode data as a graph and exploit the relationships between
nodes.
Why do we need Graph Neural Networks?
• Conventional NNs are designed for simple sequences and grids (rigid)
• GNNs are more flexible
• They allow us to investigate interactions / relationships between nodes
• Shown to be more generalizable
• Many kinds of real-world data can be represented as graphs.
Representation of Real-World Data as Graphs
Examples: social network graph, scene graph, transportation network graph,
molecule network graph, geometric network graph
Graph Neural Network
• A graph is made up of nodes (vertices) and edges
• A graph is defined as G = (V, E)
• An edge can be undirected or directed.
• The direction of an edge indicates the dependency between the two
nodes: unidirectional or bidirectional.
• A graph is often represented using an adjacency matrix, A.
• If a graph has n nodes, A has a dimension of n × n

• Flexible, but this also means it is harder to build

Figure from https://neptune.ai/blog/graph-neural-network-and-some-of-gnn-applications


Graph Neural Network
Examples: image as graph, text as graph, molecules as graph

Figure from https://distill.pub/2021/gnn-intro/


Applications of Graph Neural Networks
• Medical diagnosis & electronic health records modeling
• Traffic forecasting
• Recommendation systems

Figure from https://jonathan-hui.medium.com/applications-of-graph-neural-networks-gnn-d487fd5ed17d


Basic idea of a Classifier
Features → Classifier → Who is the murderer?

From https://www.youtube.com/watch?v=eybCCtNKwzA&ab_channel=Hung-yiLee
GNN
Features → Classifier → Who is the murderer?
The classifier also uses the relationships between the characters as a graph:
husband-wife, sister-brother, colleague, teacher-student, neighbour

From https://www.youtube.com/watch?v=eybCCtNKwzA&ab_channel=Hung-yiLee
Convolutional Neural Network (CNN)
• A 3×3 kernel W is convolved with the 6×6 binary input of layer i: the
window slides across the input one position at a time, and at each position
the element-wise products of the window and the kernel are summed to give
one value of the 4×4 output in layer i+1.
• Repeating this for every window position performs feature extraction.

From https://developersbreach.com/convolution-neural-network-deep-learning/
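To make the sliding-window idea concrete, here is a minimal NumPy sketch of the
convolution above. The 6×6 binary input is taken from the slide; the kernel
values are not shown on the slide, so an all-ones kernel is assumed purely for
illustration.

```python
import numpy as np

# 6x6 binary input from the slide; the 3x3 kernel W is not specified there,
# so an all-ones kernel is assumed for illustration.
x = np.array([[1, 0, 1, 1, 0, 1],
              [1, 1, 1, 1, 1, 0],
              [0, 0, 0, 1, 1, 1],
              [0, 0, 1, 1, 1, 0],
              [1, 1, 1, 1, 0, 0],
              [0, 1, 0, 1, 0, 0]])
w = np.ones((3, 3))          # assumed kernel values

# Slide the 3x3 window over the input (stride 1, no padding) -> 4x4 feature map.
out = np.zeros((4, 4))
for i in range(4):
    for j in range(4):
        out[i, j] = np.sum(x[i:i+3, j:j+3] * w)
print(out)
```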
GNN
How do we embed a node into a feature space using convolution?

This is not trivial:
• There is no fixed node ordering or reference point
• Graphs have arbitrary size and complex topological structure
GNN
• Approach 1: Spatial-based convolution (same idea as CNN)
• Approach 2: Spectral-based convolution (same idea as convolution
in signal processing)
Spatial-based convolution
Aggregate local network neighbourhoods:
• Sum
• Mean
• Weighted sum
• LSTM
• Max pooling
Message Passing

Figure from https://www.cs.mcgill.ca/~wlh/grl_book/files/GRL_Book-Chapter_5-GNNs.pdf


Spatial-based convolution
• Every node has a hidden embedding that is updated layer by layer through
repeated aggregation, e.g. h_2^(1) → h_2^(2) → h_2^(3) for node 2.
• Aggregate: perform some operation on the neighbours' features and use the
result to update the node's state in the next layer.
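A minimal sketch of one such aggregate-and-update step for a single node,
assuming node embeddings are stored in a NumPy array and neighbours in an
adjacency list; the function name, weight matrices and tanh non-linearity are
illustrative choices, not taken from the slides.

```python
import numpy as np

def aggregate_and_update(h, neighbours, node, W_self, W_neigh, agg="sum"):
    """One spatial-convolution step for a single node (illustrative sketch).

    h          : (num_nodes, d) current hidden embeddings
    neighbours : dict mapping node id -> list of neighbour ids
    agg        : 'sum', 'mean' or 'max' over neighbour embeddings
    """
    msgs = h[neighbours[node]]                 # gather neighbour embeddings
    if agg == "sum":
        m = msgs.sum(axis=0)
    elif agg == "mean":
        m = msgs.mean(axis=0)
    else:
        m = msgs.max(axis=0)
    # Combine the node's own embedding with the aggregated neighbour message.
    return np.tanh(h[node] @ W_self + m @ W_neigh)

# Example usage on a toy graph with 2-dimensional embeddings (made up).
h = np.array([[1.0, 0.0], [0.5, 0.5], [0.0, 1.0], [1.0, 1.0]])
neighbours = {0: [1], 1: [0, 2, 3], 2: [1, 3], 3: [1, 2]}
W_self = W_neigh = np.eye(2)
print(aggregate_and_update(h, neighbours, node=1,
                           W_self=W_self, W_neigh=W_neigh, agg="mean"))
```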
Aggregation: Sum
Input Layer → Hidden Layer 0 → Hidden Layer 1

• Hidden Layer 0 transforms each node's own input feature, e.g. for node 2:
h_2^(0) = w_0 · x_2
• Hidden Layer 1 sums the Layer-0 embeddings of the node's neighbours and
adds the transformed input feature of the node itself, e.g.:
h_2^(1) = ŵ_{1,0} · (h_1^(0) + h_3^(0)) + w_0 · x_2
• The other nodes are updated in the same way using their own neighbours.

NN4G (Neural Network for Graph)
https://ieeexplore.ieee.org/document/4773279
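A small sketch of this NN4G-style sum aggregation on a toy graph; the adjacency
list, feature values and scalar weights below are made up for illustration.

```python
# NN4G-style sum aggregation on a hypothetical 4-node graph.
x = {0: 1.5, 1: 1.0, 2: 2.0, 3: 0.5}           # input feature per node (made up)
neigh = {0: [1], 1: [0, 2, 3], 2: [1, 3], 3: [1, 2]}
w0, w10 = 0.3, 0.7                              # scalar weights for simplicity

# Hidden Layer 0: transform each node's own input feature.
h0 = {v: w0 * x[v] for v in x}

# Hidden Layer 1: sum the Layer-0 embeddings of the neighbours, then add the
# transformed input feature of the node itself.
h1 = {v: w10 * sum(h0[u] for u in neigh[v]) + w0 * x[v] for v in x}
print(h1)
```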
Aggregation: Mean
Input Layer → Hidden Layer 0 → Hidden Layer 1

• Each hidden layer takes the MEAN of the input features of the nodes at a
given distance from the node being updated, e.g. Hidden Layer 1 for node 3
uses the nodes two hops away:
h_3^(1) = w_3^(1) · MEAN(x_u for all nodes u with d(3, u) = 2)
• By stacking many hidden layers, we can extract relationships between
nodes that are far apart.

DCNN (Diffusion-Convolutional Neural Network)
https://arxiv.org/abs/1511.02136
d = distance
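A sketch of this diffusion-style mean aggregation, assuming hop distances are
computed with a breadth-first search; the example graph, feature values and
weight are hypothetical.

```python
import numpy as np
from collections import deque

def bfs_distances(neigh, src):
    """Shortest-path (hop) distance from src to every reachable node."""
    dist = {src: 0}
    q = deque([src])
    while q:
        v = q.popleft()
        for u in neigh[v]:
            if u not in dist:
                dist[u] = dist[v] + 1
                q.append(u)
    return dist

def dcnn_layer(x, neigh, node, k, w):
    """DCNN-style update: weight times the mean input feature of all nodes
    exactly k hops away from `node` (illustrative sketch)."""
    dist = bfs_distances(neigh, node)
    feats = [x[u] for u, d in dist.items() if d == k]
    return w * np.mean(feats) if feats else 0.0

# Hypothetical example graph and features.
neigh = {0: [1], 1: [0, 2, 3], 2: [1, 3], 3: [1, 2]}
x = {0: 1.5, 1: 1.0, 2: 2.0, 3: 0.5}
print(dcnn_layer(x, neigh, node=3, k=2, w=0.4))   # uses nodes 2 hops from node 3
```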
Aggregation: Weighted Sum
• Takes into account the interactions of the neighbouring nodes.
• A weight is derived from the degrees of the two nodes on an edge:
d_{x,y} = ( 1/√degree(x) , 1/√degree(y) )^T
• Each neighbour's embedding is scaled by a weight computed from this
measure before summing, e.g. for node 2:
h_2^(1) = w(d_{1,2}) × h_1^(0) + w(d_{2,3}) × h_3^(0)

MoNET (Mixture Model Network) takes this "distance" between nodes into
account when weighting their effect.
https://arxiv.org/pdf/1611.08402.pdf
d = distance
degree = degree of a node (i.e. how many nodes it connects to)
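A simplified sketch of the degree-based weighted sum, using
1/√(degree(x)·degree(y)) directly as the edge weight; note that MoNET actually
learns the weighting function w(·), and the graph and embeddings below are
made up.

```python
import numpy as np

def weighted_sum_update(h, neigh, node):
    """Degree-normalised weighted sum over neighbours (simplified sketch)."""
    deg = {v: len(neigh[v]) for v in neigh}
    out = np.zeros_like(h[node])
    for u in neigh[node]:
        w = 1.0 / np.sqrt(deg[node] * deg[u])   # weight from the two node degrees
        out += w * h[u]
    return out

# Hypothetical graph with 2-dimensional node embeddings.
neigh = {0: [1], 1: [0, 2, 3], 2: [1, 3], 3: [1, 2]}
h = np.array([[1.0, 0.0], [0.5, 0.5], [0.0, 1.0], [1.0, 1.0]])
print(weighted_sum_update(h, neigh, node=1))
```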
Aggregation: Weighted Sum

Weighting the sum is a useful strategy for increasing the representational
capacity of a GNN model, especially in cases where you have prior knowledge
to indicate that some neighbours might be more informative than others.
Aggregation: Weighted Sum
Hidden Layer 0 → Hidden Layer 1

• Here the weight of each neighbour is an attention coefficient α, e.g. for
node 2:
h_2^(1) = w_0(x_2) + α_{1,2} · h_1^(0) + α_{2,3} · h_3^(0)

α = attention on the neighbour
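A sketch of attention-weighted aggregation, where the scores are normalised
with a softmax over the node's neighbours; the dot-product scoring function
and the example graph are assumptions made only for illustration.

```python
import numpy as np

def attention_update(h, neigh, node, score):
    """Attention-weighted aggregation over neighbours (illustrative sketch).

    `score(h_v, h_u)` produces an unnormalised attention score; the scores
    are normalised with a softmax so the coefficients alpha sum to 1.
    """
    scores = np.array([score(h[node], h[u]) for u in neigh[node]])
    alpha = np.exp(scores) / np.exp(scores).sum()      # softmax over neighbours
    return sum(alpha[i] * h[u] for i, u in enumerate(neigh[node]))

# Hypothetical graph and embeddings; dot product used as the scoring function.
neigh = {0: [1], 1: [0, 2, 3], 2: [1, 3], 3: [1, 2]}
h = np.array([[1.0, 0.0], [0.5, 0.5], [0.0, 1.0], [1.0, 1.0]])
print(attention_update(h, neigh, node=1, score=lambda hv, hu: hv @ hu))
```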
Spatial-based convolution
Aggregate local network neighbourhoods:
• Sum
• Mean
• Weighted sum
• LSTM
• Max pooling
Types of prediction tasks on graphs
• Graph-level → predict a single property for a whole graph
• Node-level → predict some property for each node in a graph
• Edge-level → predict the property or presence of edges in a graph
Graph Level Prediction
• Molecules: predict whether a molecule binds to a certain receptor – is it
implicated in a disease?
• Image classification, e.g. MNIST
• Text analysis: based on a graph built from the text, we predict the type
of mood → happy / sad / depressed / excited

Figure from https://www.researchgate.net/publication/308716386_Graph_Based_Convolutional_Neural_Network/figures?lo=1&utm_source=google&utm_medium=organic
Node Level Prediction
• Predict a category / property of each node.
• Example: Zachary's Karate Club [1] – predict the loyalty of each member to
either John A. or Mr. Hi.
• Each node represents a member of the karate club and each edge represents
an interaction between two members.

[1] http://www1.ind.ku.dk/complexLearning/zachary1977.pdf
Node Level Prediction
• With images, image segmentation is an analogous task, where we try to
label the role of each pixel in an image.
• With text, a similar task would be predicting the part-of-speech of each
word in a sentence (e.g. noun, verb, adverb, etc.).

[1] http://www1.ind.ku.dk/complexLearning/zachary1977.pdf
Edge Level Prediction
• Image scene understanding - predict the relationship between nodes.

[1] https://distill.pub/2021/gnn-intro/
Classification

Adjacency matrix, A
GNN
• We have learned about the graph concepts behind GNNs.
• How do we represent these graph concepts (nodes/edges) in a form that a
computer can understand?
• If you notice, all of these are matrix computations.
• One thing to note: we have to represent the graph in a way that is
efficient for matrix operations.
Node embedding
Task: map nodes to an embedding space
• Encode network information
• Similarity of embeddings between nodes indicates their similarity in the
network (e.g. both nodes are close to each other / connected by an edge)
• Potentially useful for many downstream predictions

f : u → ℝ^d maps a node u to a d-dimensional vector z_u
(its feature representation, or embedding)
Node embedding

From http://web.stanford.edu/class/cs224w/slides/03-nodeemb.pdf
Graph
• A graph, G(V, E), is often represented with four components:
• Nodes/Vertices, V: {a, b, c, d}
• Edges, E: {(a,b), (b,c), (b,d), (c,d)}
• Global context
• Connectivity → adjacency matrix, A

Figure from https://www.kdnuggets.com/2020/11/friendly-introduction-graph-neural-networks.html
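For the example graph on this slide, the adjacency matrix can be built as
follows (assuming the edges are undirected, so A is symmetric):

```python
import numpy as np

# Adjacency matrix A for the example graph on this slide:
# V = {a, b, c, d}, E = {(a,b), (b,c), (b,d), (c,d)}.
nodes = ["a", "b", "c", "d"]
edges = [("a", "b"), ("b", "c"), ("b", "d"), ("c", "d")]

idx = {v: i for i, v in enumerate(nodes)}
A = np.zeros((len(nodes), len(nodes)), dtype=int)
for u, v in edges:
    A[idx[u], idx[v]] = 1
    A[idx[v], idx[u]] = 1        # undirected edge -> symmetric matrix

print(A)
# [[0 1 0 0]
#  [1 0 1 1]
#  [0 1 0 1]
#  [0 1 1 0]]
```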


Node embedding

Original network → Embedding space:
Enc(c) = z(c), Enc(b) = z(b)

Goal: to encode nodes so that similarity in the embedding space
(e.g., dot product) approximates similarity in the graph:
Similarity(c, b) ≈ z_c^T z_b
Learning node embedding
1. The encoder ENC maps nodes to embeddings, e.g. Enc(c) = z(c) and
Enc(b) = z(b).
2. Define a node similarity function: it specifies how relationships in the
vector space map to relationships in the original network.
3. The decoder DEC maps from the embedding space to a similarity score,
e.g. the dot product of z(c) and z(b).
4. Optimize the parameters of the encoder such that:
Similarity(c, b) ≈ z_c^T z_b
“Shallow” Encoding
• Simplest encoding approach: embedding-lookup
• Each node is assigned a unique embedding vector (i.e., we directly
optimize the embedding of each node)
Framework

Encoder + Decoder Framework
• Shallow encoder: embedding lookup
• Parameters to optimize: 𝐙, which contains the node
embeddings 𝐳𝑢 for all nodes 𝑢 ∈ 𝑉
• Decoder: based on node similarity
• Objective: maximize z_c^T z_b for node pairs (c, b) that are similar
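A minimal sketch of this encoder + decoder framework: the "shallow" encoder is
just a row lookup into the embedding matrix Z, the decoder is a dot product,
and Z is optimised so that connected ("similar") pairs score higher than
non-edges. The toy graph, learning rate and logistic objective below are
illustrative assumptions, not the exact recipe from the slides.

```python
import numpy as np

rng = np.random.default_rng(0)

# "Shallow" encoding: the parameter set is the embedding matrix Z itself,
# and Enc(v) is simply a row lookup. Graph and sizes are made up.
num_nodes, d, lr = 4, 2, 0.1
Z = rng.normal(size=(num_nodes, d))
pos = [(0, 1), (1, 2), (1, 3), (2, 3)]          # connected ("similar") pairs
neg = [(0, 2), (0, 3)]                          # non-edges used as negatives

sigmoid = lambda t: 1.0 / (1.0 + np.exp(-t))

# Maximise sigmoid(z_c^T z_b) for similar pairs and minimise it for negatives
# by gradient ascent on the log-likelihood.
for _ in range(200):
    for (c, b), y in [(p, 1.0) for p in pos] + [(n, 0.0) for n in neg]:
        g = y - sigmoid(Z[c] @ Z[b])            # gradient scale of log-likelihood
        Z[c], Z[b] = Z[c] + lr * g * Z[b], Z[b] + lr * g * Z[c]

# Decoder: dot-product similarity of a connected pair vs. a negative pair.
print(sigmoid(Z[0] @ Z[1]), sigmoid(Z[0] @ Z[2]))
```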
Your Task
• Draw a flow chart and write the description of the Neural Network
training process in your own words
• Submit Week6 – Clustering Lab
• Deadline by 24 Dec 2021
References
• https://distill.pub/2021/gnn-intro/
• https://neptune.ai/blog/graph-neural-network-and-some-of-gnn-applications
• https://towardsdatascience.com/an-introduction-to-graph-neural-network-gnn-for-analysing-structured-data-afce79f4cfdc
• https://www.cs.ubc.ca/~lsigal/532S_2018W2/Lecture18a.pdf
• http://web.stanford.edu/class/cs224w/slides/06-GNN1.pdf