15 Link 2

The document outlines a lecture on Link Analysis in the context of Big Data Analytics, covering topics such as PageRank, TrustRank, and community detection in graphs. It discusses the challenges posed by spammers and link farms, as well as methods for clustering and partitioning graph data. Additionally, it introduces concepts like Maximum Likelihood Estimation and the Affiliation Graph Model for generating social graphs.

Uploaded by

asansyzbai


Announcements

• Homeworks:
  • HW2 (due: 11/08)
  • HW3 (will be posted on 11/06)

Link Analysis 2
EE412: Foundation of Big Data Analytics
Fall 2024

Jaemin Yoo

Recap

1. Web Search as a Graph
2. PageRank
3. PageRank: Implementation

Google Matrix: $A = \beta M + (1 - \beta) \left[ \frac{1}{N} \right]_{N \times N}$

[Figure: example graph with labeled nodes A to F and their PageRank scores]

Outline

1. Topic-Specific PageRank
2. Clustering and Partitioning
3. Finding Overlapping Communities
Google vs. Spammers: Round 2

• Spammers began to work out ways to fool Google’s PageRank.
• Link spam:
  • They create a link structure that boosts PageRank of certain pages.
• Link farm:
  • Collection of pages for link spam.

Link Farms

• Three kinds of web pages from a spammer’s point of view.
• Owned pages: Completely controlled by spammer.
  • May span multiple domain names.
• Accessible pages: Spammer can post links to his pages.
  • E.g., comments on blogs or newspapers, Wikipedia, etc.
• Inaccessible pages: Majority of the web.
  • Spammers cannot do anything about them.

Link Farms

[Figure: link farm structure showing inaccessible, accessible, and owned pages, the target page t, and farm pages 1, 2, …, M (millions of farm pages)]

Link Farms: Analysis 1

• Goal: Maximize the PageRank score of target page $t$
• Symbols:
  • $N \gg 0$: The number of pages in the web.
  • $x$: PageRank contributed by accessible pages.
  • $y$: PageRank of the target page $t$.
  • $z$: PageRank contributed by owned pages.
• Then, $y = x + z + \frac{1 - \beta}{N} \approx x + z$.
  • $\frac{1 - \beta}{N}$: Small constant; can be ignored.
Link Farms: Analysis 1

• Symbols:
  • $N \gg 0$: The number of pages in the web.
  • $M > 0$: The number of pages owned by the spammer.
  • $x$: PageRank contributed by accessible pages.
  • $y$: PageRank of the target page $t$.
• PageRank of each “owned” page is given as $\frac{\beta y}{M} + \frac{1 - \beta}{N}$.
• Then, $z = \beta M \left( \frac{\beta y}{M} + \frac{1 - \beta}{N} \right)$, and $y = x + \beta M \left( \frac{\beta y}{M} + \frac{1 - \beta}{N} \right)$.
• If we solve it for $y$, we get $y = \frac{x}{1 - \beta^2} + \frac{\beta}{1 + \beta} \cdot \frac{M}{N}$.

Link Farms: Analysis 1

• How can we interpret $y = \frac{x}{1 - \beta^2} + \frac{\beta}{1 + \beta} \cdot \frac{M}{N}$?
  • If $\beta = 0.85$, it becomes $y = 3.6x + 0.46 \frac{M}{N}$.
• We can make $y$ as large as we want by making $M$ large.
  • Bots create millions of farm pages.
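The closed form for $y$ can be checked numerically. The sketch below is only an illustration: the function names and the sample values of `x`, `M`, and `N` are assumptions, and the small $(1 - \beta)/N$ term for $t$ itself is ignored, as in the analysis above.

```python
def target_pagerank(x, M, N, beta=0.85):
    """Closed form: y = x / (1 - beta^2) + beta / (1 + beta) * M / N."""
    return x / (1 - beta**2) + (beta / (1 + beta)) * (M / N)


def target_pagerank_iterative(x, M, N, beta=0.85, steps=200):
    """Fixed-point iteration of y = x + beta * M * (beta * y / M + (1 - beta) / N)."""
    y = 0.0
    for _ in range(steps):
        owned_page = beta * y / M + (1 - beta) / N  # PageRank of one owned page
        y = x + beta * M * owned_page               # y = x + z
    return y


amplification = 1 / (1 - 0.85**2)    # multiplier on x, about 3.6
farm_coefficient = 0.85 / (1 + 0.85) # multiplier on M / N, about 0.46

# Hypothetical numbers: a web of one billion pages and a farm of one million pages.
y_closed = target_pagerank(x=1e-6, M=1e6, N=1e9)
y_iter = target_pagerank_iterative(x=1e-6, M=1e6, N=1e9)
```

Both functions should agree, which confirms that the closed form solves the recurrence for these inputs.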

TrustRank

• Assumption: Trustworthy page is unlikely to link to a spam page.
• TrustRank: Let’s bias the random walk to trustworthy pages.
  • When the walker teleports, pick a page from the teleport set $S$.
• Two approaches for developing a trustworthy teleport set:
  • Let humans examine a set of pages with highest PageRanks.
  • Pick a domain (e.g., .edu, .gov, .ac.kr, etc.) where membership is controlled.

[Figure: chain of pages one link hop apart, labeled 0% spam, 0.1% spam, 1% spam, 10% spam; example domains kaist.ac.kr, icml.cc, my.blogger.com, abc123.biz]

Matrix Formulation

• Update the teleportation part of the PageRank formulation:

$A_{ij} = \begin{cases} \beta M_{ij} + (1 - \beta) / |S| & \text{if } i \in S \\ \beta M_{ij} & \text{otherwise} \end{cases}$

• Note that $A$ is still a stochastic matrix.
• We can also assign different weights to the pages in $S$.

Example: Topic-Specific PageRank

• TrustRank is an example of a topic-specific PageRank.
• Teleport set has high PageRank scores even with few in-links.

Suppose $S = \{1\}$, $\beta = 0.8$.

Node   Iteration 0   1      2      …   stable
1      0.25          0.4    0.28   …   0.294
2      0.25          0.1    0.16   …   0.118
3      0.25          0.3    0.32   …   0.327
4      0.25          0.2    0.24   …   0.261

[Figure: directed graph on nodes 1 to 4 with edge labels 0.5, 1, 0.4, 0.8, and 0.2]

Application: Proximity on Graphs

• Proximity: How close are nodes $A$ and $B$ in this graph?

[Figure: example graph containing nodes A, B, D, E, F, G]
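The iteration table in the topic-specific PageRank example can be reproduced with a few lines of power iteration. This is a sketch under one stated assumption: the edge list `out_links` is inferred from the numbers in the table (the figure itself is not recoverable), and the function name is hypothetical.

```python
def topic_specific_pagerank(out_links, teleport_set, beta=0.8, iterations=100):
    """Power iteration r <- A r with A_ij = beta * M_ij + (1 - beta) / |S| for i in S."""
    nodes = sorted(out_links)
    r = {v: 1.0 / len(nodes) for v in nodes}  # iteration 0: uniform scores
    history = [dict(r)]
    for _ in range(iterations):
        nxt = {v: 0.0 for v in nodes}
        # Follow out-links with probability beta.
        for u, targets in out_links.items():
            for v in targets:
                nxt[v] += beta * r[u] / len(targets)
        # Teleport with probability 1 - beta, only to pages in S.
        # The scores sum to 1, so each page in S receives (1 - beta) / |S|.
        for s in teleport_set:
            nxt[s] += (1 - beta) / len(teleport_set)
        r = nxt
        history.append(dict(r))
    return r, history


# Assumed edges, chosen to match the table: 1 -> 2, 1 -> 3, 2 -> 1, 3 -> 4, 4 -> 3.
out_links = {1: [2, 3], 2: [1], 3: [4], 4: [3]}
scores, history = topic_specific_pagerank(out_links, teleport_set={1})
```

Here `history[1]` and `history[2]` correspond to the iteration 1 and 2 columns, and `scores` approximates the stable column.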

Good Proximity Measure?

• Shortest path between nodes is not good enough:
  • (Left) Degree-1 nodes $E$, $F$, and $G$ have no effects.
  • (Right) Multi-faceted relationships are not considered.

[Figure: two example graphs (left and right) with nodes A and B]

Random Walk with Restarts

• Solution: Random walk with restarts.
  • Also known as Personalized PageRank.
  • Teleport always to the query node.
  • E.g., $S = \{A\}$ in this case.
• The score $r_j$ of each node $j$ is the proximity with node $i$.

[Figure: example graph containing nodes A, B, D, E, F, G]
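A random walk with restarts can be sketched as PageRank whose teleport set is the single query node. The toy graph below is hypothetical (it is not the graph in the figure): `B` and `C` are both two hops from the query `A`, but `B` is reachable through two different paths, so it should receive the higher proximity score even though shortest-path distance cannot tell the two apart.

```python
def random_walk_with_restarts(edges, query, beta=0.8, iterations=200):
    """Proximity scores for an undirected graph; the walker always restarts at `query`."""
    neighbors = {}
    for a, b in edges:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)
    r = {v: 1.0 / len(neighbors) for v in neighbors}
    for _ in range(iterations):
        nxt = {v: 0.0 for v in neighbors}
        # With probability beta, move to a uniformly random neighbor.
        for u, nbrs in neighbors.items():
            for v in nbrs:
                nxt[v] += beta * r[u] / len(nbrs)
        # With probability 1 - beta, restart at the query node.
        nxt[query] += 1 - beta
        r = nxt
    return r


# Hypothetical graph: A-X-B and A-Y-B (two paths to B), A-Z-C (one path to C).
toy_edges = [("A", "X"), ("A", "Y"), ("X", "B"), ("Y", "B"), ("A", "Z"), ("Z", "C")]
proximity = random_walk_with_restarts(toy_edges, query="A")
```

Changing the query node requires rerunning the whole iteration, since the restart vector changes.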

Application: Recommendation

• RWR can be naturally used for recommendation.
  1. Create a bipartite graph from a utility matrix.
  2. Run RWR starting from the query user.
  3. Proximity scores represent the probabilities to buy.
• Finding similar items can also be done with RWR.
• Limitation: Need to rerun RWR for each query.

Source: PGL

Outline

1. Topic-Specific PageRank
2. Clustering and Partitioning
3. Finding Overlapping Communities

Clustering of Nodes

• What else can we do in graph-structured data?
• Clustering: Way to find communities of nodes.
  • Useful for various graph data, not just the web.
  • E.g., finding user groups in social media.
• We need an approach specialized for graph data.
  • Graph is a special non-Euclidean space.
  • Existing clustering algorithms won’t be enough.

Source: Wikipedia

Betweenness

• Idea: Let’s find inter-cluster edges and remove them.
• Assumptions:
  • There will be many edges inside each cluster.
  • There won’t be many edges between clusters.
• We find such edges by computing their betweenness.
  = Score for being an inter-cluster edge.

Betweenness

• Betweenness of edge $(a, b)$ is defined as
  • The number of pairs of nodes $x$ and $y$ such that $(a, b)$ lies on the shortest path between $x$ and $y$.
  • If there are several shortest paths, credit with a fraction of them.

Example: Betweenness

• Betweenness of $(B, D)$ is 12.
  • It is on every shortest path between any of $\{A, B, C\}$ to any of $\{D, E, F, G\}$.
• Betweenness of $(E, F)$ is 1.5.
  • $1 \times (E, F) + 0.5 \times (E, G)$. Why? $(E, G)$ has another shortest path $(E, D, G)$.

[Figure: graph on nodes A to G with edge betweenness labels 5, 12, 4.5, 1, 5, 4, 4.5, 1.5, 1.5]
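The two betweenness values above can be checked with a short breadth-first-search computation. This is a minimal sketch, not an optimized implementation: the edge list is an assumption reconstructed from the betweenness labels and the textbook version of this example, and the function name is hypothetical.

```python
from collections import deque


def edge_betweenness(edges):
    """Edge betweenness of an undirected, unweighted graph via BFS from every node."""
    adj = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    score = {frozenset(e): 0.0 for e in edges}
    for source in adj:
        # Step 1: BFS to get distances and the number of shortest paths to each node.
        dist = {source: 0}
        paths = {source: 1.0}
        order = []
        queue = deque([source])
        while queue:
            u = queue.popleft()
            order.append(u)
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    paths[v] = 0.0
                    queue.append(v)
                if dist[v] == dist[u] + 1:
                    paths[v] += paths[u]
        # Step 2: push credit from the deepest nodes back toward the source,
        # splitting it among parents in proportion to their shortest-path counts.
        credit = {u: 1.0 for u in order}
        for v in reversed(order):
            for u in adj[v]:
                if dist[u] == dist[v] - 1:
                    share = credit[v] * paths[u] / paths[v]
                    score[frozenset((u, v))] += share
                    credit[u] += share
    # Every pair (x, y) was counted from both endpoints, so halve the totals.
    return {edge: total / 2 for edge, total in score.items()}


# Assumed edges of the example graph.
example_edges = [("A", "B"), ("A", "C"), ("B", "C"), ("B", "D"), ("D", "E"),
                 ("D", "F"), ("D", "G"), ("E", "F"), ("F", "G")]
betweenness = edge_betweenness(example_edges)
```

Removing the edge with the largest value in `betweenness` and recomputing is the basic loop of betweenness-based clustering.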

Betweenness for Clustering

• Betweenness can be directly used for clustering:
  • High betweenness suggests $(B, D)$ connects different communities.
  • Repeatedly remove edges with highest betweenness to get clusters.
  • See the Girvan-Newman algorithm (Chapter 10.2.4).

[Figure: the example graph on nodes A to G with its edge betweenness labels, without the edge of betweenness 12]

Graph Partitioning

• Partitioning: Similar to clustering, but more focus on cut.
• Given a graph, divide nodes into two sets so that
  • The size of the cut, the set of edges between different sets, is minimized.

[Figure: graph on nodes A to H with the “smallest cut” and the “best cut” marked]

Normalized Cuts

• It is a better partitioning if the two node sets are similar in size.
• Normalized cut for $S$ and $T$ is:

$\frac{\mathrm{cut}(S, T)}{\mathrm{vol}(S)} + \frac{\mathrm{cut}(S, T)}{\mathrm{vol}(T)}$

  • $\mathrm{cut}(S, T)$ = The number of edges that connect $S$ and $T$.
  • $\mathrm{vol}(S)$ = The number of edges with at least one end in $S$.

Example: Normalized Cuts

• The smallest cut has a normalized cut $1/1 + 1/11 = 1.09$.
• The best cut has a normalized cut $2/6 + 2/7 = 0.62$.
• Note that computing (or finding) the optimal cut is NP-hard.

[Figure: graph on nodes A to H with the “smallest cut” and the “best cut” marked]
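The two normalized-cut values can be recomputed directly from the definitions of cut and vol. The edge list below is an assumption: it has the 11 edges implied by the numbers above, with `H` attached to `C`, which is a guess since the figure is not recoverable (the results do not depend on which of A, B, C it attaches to).

```python
def normalized_cut(edges, S):
    """cut(S, T) / vol(S) + cut(S, T) / vol(T), where T is every node not in S."""
    S = set(S)
    # Edges with exactly one endpoint in S cross the cut.
    cut = sum(1 for a, b in edges if (a in S) != (b in S))
    # vol counts edges with at least one endpoint in the set.
    vol_S = sum(1 for a, b in edges if a in S or b in S)
    vol_T = sum(1 for a, b in edges if a not in S or b not in S)
    return cut / vol_S + cut / vol_T


# Assumed edge list (11 edges).
graph_edges = [("A", "B"), ("A", "C"), ("B", "C"), ("C", "H"), ("B", "D"), ("C", "G"),
               ("D", "E"), ("D", "F"), ("D", "G"), ("E", "F"), ("F", "G")]

smallest_cut = normalized_cut(graph_edges, {"H"})             # 1/1 + 1/11
best_cut = normalized_cut(graph_edges, {"A", "B", "C", "H"})  # 2/6 + 2/7
```

The single-edge cut around `H` minimizes the cut size but scores worse once the volumes are taken into account.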

Pop Quiz

• Find the betweenness values for the edges below.

[Figure: graph with nodes A, B, C, D, E]

Outline

1. Topic-Specific PageRank
2. Clustering and Partitioning
3. Finding Overlapping Communities

Overlapping Communities

• Clustering and partitioning:
  • Two different ways to detect non-overlapping communities.
• We often want to find overlapping communities.
  • Can give us better communities by relaxing the constraint.
  • Limitation: Much harder to find an optimal solution.

[Figure: example graph with nodes A, B, D, E]

Maximum Likelihood Estimation

• Idea: Let’s learn the cluster assignment of nodes from data.
• Maximum likelihood estimation (MLE):
  • Model the generative process of a graph as a function $f$.
  • $f$ has a set of parameters $\theta$ that determine the likelihood $\mathcal{L}$.
  • The values of $\theta$ are “optimal” with the highest likelihood.
• We apply gradient ascent on $\mathcal{L}$ (equivalently, gradient descent on $-\mathcal{L}$) to find optimal $\theta$: $\theta \leftarrow \theta + \eta \, \partial \mathcal{L} / \partial \theta$, where $\eta > 0$ is a step size.

Example: MLE

• Let’s assume the generative process (or model) as follows:
  • All edges are independent of each other.
  • Each edge is present with probability $p = 0.1$.
• Then, a random graph $X$ of 3 nodes follows one of four cases.
  • The probability $P(X = G)$ for each $G$ is determined by $p$.

[Figure: the four cases, from three edges down to no edges]
Pr(instance) = $0.1^3$
Pr(instance) = $3 \times 0.1^2 \times 0.9$
Pr(instance) = $3 \times 0.1 \times 0.9^2$
Pr(instance) = $0.9^3$

Example: MLE

• Given a graph $G$, the “correct” value of $p$ makes highest $\Pr(G)$.
  • The probability of generating an observed graph.
• Suppose we observe a graph $G$ having 15 nodes and 23 edges.
  • The number of pairs of nodes is $15 \times 14 / 2 = 105$.
  • Easy to check that $\Pr(G) = p^{23} (1 - p)^{82}$ is maximized when $p = 23/105$.
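Both MLE examples are easy to verify numerically. The sketch below recomputes the four 3-node probabilities and then searches a grid of candidate values of $p$ for the 15-node, 23-edge graph; the grid resolution and the function name are arbitrary choices made for illustration.

```python
import math


def log_likelihood(p, n_nodes, n_edges):
    """log Pr(G) = |E| * log(p) + (#pairs - |E|) * log(1 - p) for independent edges."""
    pairs = n_nodes * (n_nodes - 1) // 2
    return n_edges * math.log(p) + (pairs - n_edges) * math.log(1 - p)


# 3-node example with p = 0.1: probability of observing 3, 2, 1, or 0 edges.
three_node_cases = [math.comb(3, k) * 0.1**k * 0.9**(3 - k) for k in (3, 2, 1, 0)]

# Observed graph with 15 nodes and 23 edges: scan p over (0, 1).
candidates = [k / 10000 for k in range(1, 10000)]
p_hat = max(candidates, key=lambda p: log_likelihood(p, 15, 23))
```

The grid maximizer `p_hat` lands next to $23/105 \approx 0.219$, matching the value obtained by setting the derivative of the log likelihood to zero.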
Affiliation Graph Model

• Affiliation graph model is a mechanism to generate social graphs.
  • Users (= nodes) belong to different communities.
  • Users (= nodes) are connected only in each community.
  • Graph has edges if users are connected in any community.

[Figure: model (communities A and B with probabilities $p_A$ and $p_B$, memberships, individuals) and the social graph it generates]

Affiliation Graph Model

• Fixed (and given): The numbers of nodes and communities.
• Parameter 1: Community assignment of nodes.
  • Each community can have any set of individuals as members.
• Parameter 2: Probability $p_C$ for each community $C$ such that
  • Two members of $C$ create a connection with probability $p_C$.

[Figure: 2 communities A and B with probabilities $p_A$ and $p_B$, memberships, individuals]

Example: Affiliation Graph Model

• Q: Likelihood of a graph (top) given a model (bottom)?
  • Model: Cluster assignments of nodes, and probabilities $p_A$, $p_B$, $p_C$
• $\Pr(G) = p_{uv} \, p_{vw} \, (1 - p_{uw})$
  • $p_{uv} = p_A$
  • $p_{vw} = 1 - (1 - p_B)(1 - p_C)$
  • $p_{uw} = \epsilon$ (a small number)

[Figure: observed graph on nodes u, v, w (top) and model with communities A, B, C and probabilities $p_A$, $p_B$, $p_C$ (bottom)]

Likelihood of AGM

• Probability of an edge $(u, v)$ if they are in communities $M$:

$p_{uv} = 1 - \prod_{C \in M} (1 - p_C)$

  • $p_{uv} = \epsilon$ if $u$ and $v$ are not in any communities together.
  • Since they can still be friends although unlikely.
• Likelihood of an observed graph $G$ with edges $E$ is given as:

$p(G) = \prod_{(u, v) \in E} p_{uv} \prod_{(u, v) \notin E} (1 - p_{uv})$
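The likelihood formula can be evaluated on the three-node example. In this sketch the memberships are inferred from the formulas above ($u$, $v$ share community A; $v$, $w$ share B and C), while the numeric values of $p_A$, $p_B$, $p_C$, and $\epsilon$ are hypothetical, as are the function names.

```python
from itertools import combinations


def edge_probability(u, v, members, p, eps=1e-4):
    """p_uv = 1 - prod over shared communities of (1 - p_C); eps if none are shared."""
    shared = [c for c, nodes in members.items() if u in nodes and v in nodes]
    if not shared:
        return eps
    no_edge = 1.0
    for c in shared:
        no_edge *= 1 - p[c]
    return 1 - no_edge


def agm_likelihood(nodes, edges, members, p, eps=1e-4):
    """Product of p_uv over observed edges and (1 - p_uv) over missing ones."""
    edge_set = {frozenset(e) for e in edges}
    likelihood = 1.0
    for u, v in combinations(nodes, 2):
        p_uv = edge_probability(u, v, members, p, eps)
        likelihood *= p_uv if frozenset((u, v)) in edge_set else 1 - p_uv
    return likelihood


members = {"A": {"u", "v"}, "B": {"v", "w"}, "C": {"v", "w"}}
p = {"A": 0.5, "B": 0.3, "C": 0.2}  # hypothetical community probabilities
likelihood = agm_likelihood(["u", "v", "w"], [("u", "v"), ("v", "w")], members, p)
```

With these values, `likelihood` equals $p_A \cdot (1 - (1 - p_B)(1 - p_C)) \cdot (1 - \epsilon)$.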

Optimization of Community Assignments

• Community assignment of nodes is a discrete parameter.
  • MLE solution is the assignment that has the highest likelihood.
  • However, we cannot use gradient descent.
• Once we fix on an assignment, we can find the probabilities $p_C$.

[Figure: candidate assignments of nodes to communities A and B]

Continuous Community Assignment

• Assume a “strength of membership” for each node and community.
  • Common trick to allow gradient descent.
• For each community $C$,
  • There is a strength of membership $F_{xC} \ge 0$ for each node $x$.
  • Probability for edge $(u, v)$ is $p_C(u, v) = 1 - \exp(-F_{uC} F_{vC})$.

[Figure: nodes u, v, w with membership strengths $F_{uA}$, $F_{vA}$, $F_{wB}$ to communities A and B]

Continuous Community Assignment

• Recall that the likelihood of the graph $G$ with edges $E$ is:

$p(G) = \prod_{(u, v) \in E} p_{uv} \prod_{(u, v) \notin E} (1 - p_{uv})$

• Now, the probability of an edge between nodes $u$ and $v$ is:

$p_{uv} = 1 - \prod_{C \in M} (1 - p_C(u, v)) = 1 - \exp\left( -\sum_{C} F_{uC} F_{vC} \right)$

• Now we can use gradient descent to maximize the likelihood.

Log Likelihood

• Finally, we use the negative log likelihood as an objective function.
  • Update all parameters to minimize $l(\theta) = -\log \Pr(G)$.
• In machine learning, we usually compute the log likelihood.
  • Products become sums, which often simplifies expressions.
  • Summing many numbers is less prone to numerical rounding errors.
  • Compared to taking the product of many tiny numbers.

$\log(10^{-10} \times 10^{-10}) = \log 10^{-10} + \log 10^{-10} = -20$
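The continuous model and the negative log likelihood objective can be sketched together. Everything here is illustrative rather than the lecture's implementation: the function names, the initial strengths in `F0`, the learning rate, the finite-difference gradient, and the clipping constant `eps` are all assumptions. The last lines show why the log form matters numerically.

```python
import math


def edge_prob(F, u, v):
    """p_uv = 1 - exp(-sum_C F[u][C] * F[v][C])."""
    return 1 - math.exp(-sum(fu * fv for fu, fv in zip(F[u], F[v])))


def negative_log_likelihood(F, edges, eps=1e-12):
    """l = -sum_{(u,v) in E} log p_uv - sum_{(u,v) not in E} log(1 - p_uv)."""
    nodes = sorted(F)
    edge_set = {frozenset(e) for e in edges}
    nll = 0.0
    for i, u in enumerate(nodes):
        for v in nodes[i + 1:]:
            p_uv = min(max(edge_prob(F, u, v), eps), 1 - eps)  # keep log() finite
            nll -= math.log(p_uv if frozenset((u, v)) in edge_set else 1 - p_uv)
    return nll


def gradient_step(F, edges, lr=0.1, h=1e-6):
    """One gradient descent step on the NLL, using finite-difference gradients."""
    base = negative_log_likelihood(F, edges)
    updated = {}
    for u in F:
        updated[u] = []
        for c in range(len(F[u])):
            bumped = {k: list(vals) for k, vals in F.items()}
            bumped[u][c] += h
            grad = (negative_log_likelihood(bumped, edges) - base) / h
            updated[u].append(max(0.0, F[u][c] - lr * grad))  # keep F_xC >= 0
    return updated


# Hypothetical start: 3 nodes, 2 communities, observed edges (u, v) and (v, w).
F0 = {"u": [0.5, 0.5], "v": [0.5, 0.5], "w": [0.5, 0.5]}
observed = [("u", "v"), ("v", "w")]
F1 = gradient_step(F0, observed)
nll_before = negative_log_likelihood(F0, observed)
nll_after = negative_log_likelihood(F1, observed)

# Product of 40 probabilities of 1e-10 underflows to 0.0; the sum of logs does not.
product = 1.0
log_sum = 0.0
for _ in range(40):
    product *= 1e-10
    log_sum += math.log10(1e-10)
```

After one step the objective should drop, and `log_sum` stays at about -400 while `product` has already collapsed to zero.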

Pop Quiz

• What is the likelihood of the observed graph using the new model?

[Figure: model with communities A, B, C over nodes u, v, w, and the observed graph on u, v, w]

Summary

1. Topic-Specific PageRank
  • TrustRank
  • Random walk with restarts
2. Clustering and Partitioning
  • Betweenness
  • Normalized cuts
3. Finding Overlapping Communities
  • Maximum likelihood estimation
  • Log likelihood
