0% found this document useful (0 votes)
30 views51 pages

Lecture 1

Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
30 views51 pages

Lecture 1

Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 51

Graph Analytics for Big Data

Slides is adapted from CS224W:


Machine Learning with Graphs
Prerequisites
▪ Course title: Graph Analytics for Big Data
▪ Good background in:
▪ Machine Learning (ML)
▪ Algorithms and graph theory
▪ Probability and statistics
▪ Programming:
▪ You should be able to write non trivial programs (in
Python)
▪ Familiarity with Python libraries in ML
Graph machine learning tools
▪ PyG (PyTorch Geometric)
▪ The ultimate library for Graph Neural Networks
▪ Python igraph
▪ NetworkX
▪ GsPan
▪ Top python libraries for data science: numpy,
pandas, scikit-learn,
Instructors
▪ Dr. Van Hao Can

▪ Dr. Van Hoan Do


Textbooks & Communication
▪ Text book
▪ Nagiza F. Samatova, William Hendrix, John Jenkins, Kanchana
Padmanabhan, Arpan Chakraborty, Practical Graph Mining with R,
Chapman and Hall/CRC, 2013.
▪ Graph Representation Learning Book by Will Hamilton
▪ Homework and slides will be sent through Moodle.
▪ E-mail
▪ Van Hoan Do: vanhoan310@gmail.com
▪ Van Hao Can: cvhao89@gmail.com
Grading
▪ Attendance/Attitude: 10%
▪ Midterm Exam: 40%
▪ Written or presentation.
▪ Homework assignment (+1, +2 points)
▪ Final Exam: 50%
▪ Written exam.
▪ 50% of exercises and 50% of code in Python.
ML with graphs
Representation learning
Representation learning
Graph Theory
Representing Graphs:
Adjacency Lists
Definition: An adjacency list can be used to
represent a graph with no multiple edges by
specifying the vertices that are adjacent to each
vertex of the graph.
Example:

Example:
Special Types of Simple Graphs: Complete
Graphs
A complete graph on n vertices, denoted by Kn,
is the simple graph that contains exactly one
edge between each pair of distinct vertices.
Special Types of Simple Graphs: Cycles
and Wheels
A cycle Cn for n ≥ 3 consists of n vertices v1, v2
,⋯ , vn, and edges {v1, v2}, {v2, v3} ,⋯ , {vn-1, vn},
{vn, v1}.

A wheel Wn is obtained by adding an additional


vertex to a cycle Cn for n ≥ 3 and connecting this
new vertex to each of the n vertices in Cn by new
edges.
New Graphs from Old
Definition: A subgraph of a graph G = (V,E) is a graph (W,F), where W ⊂
V and F ⊂ E. A subgraph H of G is a proper subgraph of G if H ≠ G.

Example: Here we show K5 and


one of its subgraphs.

Definition: Let G = (V, E) be a simple graph. The subgraph induced by a


subset W of the vertex set V is the graph (W,F), where the edge set F
contains an edge in E if and only if both endpoints are in W.

Example: Here we show K5 and the subgraph


induced by W = {a,b,c,e}.
New Graphs from Old 2
Definition: The union of two simple graphs
G1 = (V1, E1) and G2 = (V2, E2) is the simple
graph with vertex set V1 ⋃ V2 and edge set E1 ⋃
E2. The union of G1 and G2 is denoted by G1 ⋃
G2.

Example:
Trees
Definition: A tree is a connected undirected graph with no simple circuits.

Example: Which of these


graphs are trees?

Solution: G1 and G2 are trees - both are connected and have no simple
circuits. Because e, b, a, d, e is a simple circuit, G3 is not a tree. G4 is not a
tree because it is not connected.

Definition: A forest is a graph that has no simple circuit,


but is not connected. Each of the connected
components in a forest is a tree.
Trees (continued)
Theorem: An undirected graph is a tree if and only if there is a
unique simple path between any two of its vertices.

Proof: Assume that T is a tree. Then T is connected with no simple


circuits. Hence, if x and y are distinct vertices of T, there is a simple
path between them (by Theorem 1 of Section 10.4). This path must
be unique - for if there were a second path, there would be a simple
circuit in T (by Exercise 59 of Section 10.4). Hence, there is a unique
simple path between any two vertices of a tree.

Now assume that there is a unique simple path between any two
vertices of a graph T. Then T is connected because there is a path
between any two of its vertices. Furthermore, T can have no simple
circuits since if there were a simple circuit, there would be two paths
between some two vertices.

Hence, a graph with a unique simple path between any two vertices
is a tree.
Properties of Trees
Theorem 2: A tree with n vertices has n − 1 edges.

Proof (by mathematical induction):


BASIS STEP: When n = 1, a tree with one vertex has no
edges. Hence, the theorem holds when n = 1.
INDUCTIVE STEP: Assume that every tree with k
vertices has k − 1 edges.
Suppose that a tree T has k + 1 vertices and that v is a
leaf of T. Let w be the parent of v. Removing the vertex v
and the edge connecting w to v produces a tree T′ with k
vertices. By the inductive hypothesis, T′ has k − 1
edges. Because T has one more edge than T′, we see
that T has k edges. This completes the inductive step.
Spanning Trees
Definition: Let G be a simple graph. A spanning tree of G is a
subgraph of G that is a tree containing every vertex of G.
Example: Find the spanning tree of this
simple graph:

Solution: The graph is connected, but is not a tree because it


contains simple circuits. Remove the edge {a, e}. Now one simple
circuit is gone, but the remaining subgraph still has a simple circuit.
Remove the edge {e, f} and then the edge {c, g} to produce a simple
graph with no simple circuits. It is a spanning tree, because it
contains every vertex of the original graph.

You might also like

pFad - Phonifier reborn

Pfad - The Proxy pFad of © 2024 Garber Painting. All rights reserved.

Note: This service is not intended for secure transactions such as banking, social media, email, or purchasing. Use at your own risk. We assume no liability whatsoever for broken pages.


Alternative Proxies:

Alternative Proxy

pFad Proxy

pFad v3 Proxy

pFad v4 Proxy