Unit 2 Complete Notes

The document provides comprehensive notes on social media analytics, focusing on the adjacency matrix as a key tool for representing network data, including its binary representation and application in directed and undirected networks. It discusses essential concepts such as nodes, ties, paths, connectivity, and centrality measures, which help analyze social interactions and identify influencers within networks. Additionally, it covers graph traversal algorithms like BFS and DFS, emphasizing their relevance in analyzing social media dynamics.

Uploaded by

Aditya Sharma

B.tech (Dr. A.P.J. Abdul Kalam Technical University)

Downloaded by Aditya Sharma (aditya10462004@gmail.com)
Social Media Analytics and Data Analysis (BCAM061)

Unit 2

Adjacency Matrix
In social media analytics, the adjacency matrix is a fundamental tool for representing and analyzing
network data.
What is an Adjacency Matrix?
• Essentially, it is a way to represent a network (a graph) in the form of a matrix (a table of
numbers).
• Rows and columns represent the nodes (users) in the network.
• The values within the matrix indicate whether or not there's a connection (edge) between
those nodes.
How it Works in Social Media:
• Binary Representation:
o Often, the matrix uses binary values:
▪ "1" indicates that there is a connection between two users.
▪ "0" indicates that there is no connection.
o For example, if user A follows user B, the cell corresponding to row A and column
B would have a "1".

• Directed vs. Undirected Networks:

o If the social network is "directed" (e.g., Twitter follows), the matrix is asymmetric.
This means the value in cell (A, B) may not be the same as in cell (B, A).

o If the network is "undirected" (e.g., Facebook friendships), the matrix is symmetric:
if A is friends with B, then B is friends with A, so the value in cell (A, B) always
equals the value in cell (B, A).
• Weighted Adjacency Matrices:
o In some cases, you might use "weighted" adjacency matrices.
o Instead of just "0" or "1", the cells contain values representing the strength of the
connection (e.g., frequency of interactions).
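
As a minimal sketch of the ideas above, the following Python builds a binary adjacency matrix from a list of follow relationships and checks whether it is symmetric. The user names, the `follows` list, and the helper names are illustrative assumptions, not part of the original notes.

```python
def build_adjacency_matrix(users, follows):
    """Return a binary adjacency matrix (list of rows) for a directed network."""
    index = {user: i for i, user in enumerate(users)}
    matrix = [[0] * len(users) for _ in users]
    for follower, followed in follows:
        # Row = the user who follows, column = the user being followed.
        matrix[index[follower]][index[followed]] = 1
    return matrix


def is_symmetric(matrix):
    """True if cell (i, j) equals cell (j, i) for every pair of nodes."""
    n = len(matrix)
    return all(matrix[i][j] == matrix[j][i] for i in range(n) for j in range(n))


users = ["A", "B", "C"]                           # hypothetical users
follows = [("A", "B"), ("B", "A"), ("A", "C")]    # hypothetical follow pairs

directed = build_adjacency_matrix(users, follows)

# An undirected friendship is stored in both directions.
friendships = [("A", "B"), ("B", "C")]
mirrored = friendships + [(b, a) for a, b in friendships]
undirected = build_adjacency_matrix(users, mirrored)
```

Here `directed` is asymmetric because A follows C but C does not follow A, while `undirected` is symmetric by construction. A weighted version would store an interaction count in each cell instead of 1.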

Why it is Useful in Social Media Analytics:

• Data Representation:
o Provides a structured way to store and manipulate network data.
• Mathematical Analysis:
o Allows for the application of matrix algebra to analyze network properties.
o This is crucial for calculating centrality measures, detecting communities, and
performing other network analyses.
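• Computational Efficiency:
o Matrix operations are well-suited to computer processing. For very large, sparse
social networks, sparse-matrix or adjacency-list storage is usually preferred, because
a full matrix needs one cell for every pair of users.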
• Pattern Recognition:
o By viewing the matrix, patterns of connections can sometimes be visually
recognized.
• Input for algorithms:
o Many network analysis algorithms require the network information to be in the
form of an adjacency matrix.
In simpler terms:
• Imagine a grid where each row and column is a person on a social media site.
• If two people are friends, you put a mark in the box where their row and column meet.
• This grid is the adjacency matrix.
By using adjacency matrices, social media analysts can gain valuable insights into the structure
and dynamics of online social networks.

Key Components
When analyzing social networks, several key concepts help us understand the structure and
dynamics of these interconnected systems: nodes, ties, paths, connectivity, and influencers.
Social network analysis explores how these elements interact to create complex social
structures, giving analysts insight into the dynamics of online and offline social interactions.
1. Nodes:
• Nodes are the fundamental units of a network. They represent the individual actors within
the network.

• In social media, nodes can be:

o Individual users.
o Groups or organizations.
o Even pieces of content.
2. Ties (Edges/Links):
• Ties represent the relationships or connections between nodes.
• These can vary greatly:
o Friendships.
o Follower relationships.
o Communication patterns (e.g., messages, mentions).
o Collaborations.
• Ties can be:
o Directed: Indicating a one-way relationship (e.g., following).
o Undirected: Indicating a mutual relationship (e.g., a two-way friendship).
o Weighted: Indicating the strength of the relationship.

3. Paths:
• A path is a sequence of ties that connects two nodes.
• Analyzing paths helps understand how information or influence flows through a network.
• Shortest paths are often of particular interest, as they represent the most efficient routes of
communication.

4. Connectivity:
• Connectivity refers to how well-connected the nodes in a network are.
• It can be measured in various ways:
o Overall network density: The proportion of existing ties to possible ties.
Density = Total no. of edges / Total possible no. of edges
Total possible no. of edges = N (N-1)/2 for an undirected graph,
or N (N-1) for a directed graph
where,
N = Total no. of nodes in a graph
Note: Density will be 1 (i.e., 100%) if all possible connections are present in the graph.
o The presence of connected components: Groups of nodes that are connected to
each other, but not to other parts of the network.
• High connectivity generally facilitates the spread of information and influence.
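
The density formula above can be sketched in a few lines of Python. The function name and the `directed` flag are assumptions made for illustration; the directed case uses N(N-1) possible ties, since each ordered pair of nodes counts separately.

```python
def density(num_nodes, num_edges, directed=False):
    """Proportion of existing ties to possible ties."""
    if num_nodes < 2:
        return 0.0  # no ties are possible with fewer than two nodes
    possible = num_nodes * (num_nodes - 1)
    if not directed:
        possible //= 2  # each unordered pair is counted once
    return num_edges / possible


# Four users with three friendships: 3 of 6 possible ties exist.
half_full = density(4, 3)
# Four users with all six possible friendships.
complete = density(4, 6)
# The same six ties in a directed network fill only half of the 12 possible ties.
directed_case = density(4, 6, directed=True)
```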

5. Influencers:
• Influencers are nodes that have a disproportionate impact on the network.

• They can be identified using various measures:

o Degree centrality: The number of connections a node has.
o Betweenness centrality: How often a node lies on the shortest path between other
nodes.
o Eigenvector centrality/PageRank: Measures of a node's influence based on the
influence of its neighbors.
• Influencers play a crucial role in:
o Shaping opinions.
o Spreading trends.
o Driving behavior.

DFS and BFS

Depth-first search (DFS) and breadth-first search (BFS) are fundamental graph traversal
algorithms that can be adapted for social media analytics, although BFS is generally more useful
in this context.
1. Breadth-First Search (BFS):
• How it works:
o BFS explores a network layer by layer, starting from a given node.
o It visits all the node's immediate neighbors, then their neighbors, and so on.
o It effectively finds the shortest paths from the starting node to all other reachable
nodes.
o BFS is generally more applicable for tasks like finding social distances, analyzing
information spread, and community detection.
• Applications in Social Media Analytics:
o Finding Shortest Paths:
▪ BFS can determine the shortest "social distance" between two users (e.g.,
how many "friends of friends" separate them).
▪ This is useful for understanding the efficiency of information flow.
o Community Detection:
▪ BFS can be used as a building block for community detection algorithms,
by exploring the local neighborhood of nodes.

o Information Diffusion:
▪ Simulating how information spreads outward from a source user, layer by
layer, can approximate how viral content spreads.
o Network Visualization:
▪ BFS can be used to generate layers of a network, which can then be visualized.
• Why it is preferred:
o In social networks, we are often interested in the shortest paths and immediate
neighborhoods, which BFS excels at.
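
The layer-by-layer exploration described above can be sketched as follows. This is a minimal example that assumes the network is stored as a dictionary mapping each user to a list of neighbors; the `friends` data and the function name are hypothetical.

```python
from collections import deque


def bfs_distances(graph, source):
    """Return the shortest hop count from source to every reachable node."""
    distances = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in graph.get(node, ()):
            if neighbor not in distances:
                # First visit is along a shortest path in an unweighted graph.
                distances[neighbor] = distances[node] + 1
                queue.append(neighbor)
    return distances


friends = {
    "Ana": ["Ben", "Cy"],
    "Ben": ["Ana", "Dee"],
    "Cy": ["Ana"],
    "Dee": ["Ben", "Eli"],
    "Eli": ["Dee"],
    "Fay": [],  # isolated user, unreachable from everyone else
}

social_distance = bfs_distances(friends, "Ana")
```

Users that never appear in the result (Fay in this example) are not reachable from the starting user. Grouping the result by distance gives the layers mentioned under network visualization.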

2. Depth-First Search (DFS):

• How it works:
o DFS explores a network by going as deep as possible along each branch before
backtracking.
o It follows a single path until it reaches a dead end, then backtracks and explores
another path.
o DFS is less commonly used, but can be useful for specific tasks like finding
connected components or cycles.
• Potential Applications (Less Common) in Social Media Analytics:
o Finding Connected Components:
▪ DFS can determine if a network is fully connected or if it contains separate
components.

o Identifying Cycles:
▪ DFS can detect cycles (loops) in a network, which might indicate certain
patterns of interaction.
o Exploring Deep Connections:
▪ In some very specific cases, if there is a need to explore very deep
connections, DFS might be useful.
• Why it is less common:
o DFS can get "lost" in long, less relevant paths.
o It does not guarantee finding shortest paths, which are often the primary interest in
social network analysis.
o Social networks are generally very broad, and not very deep, so DFS is not usually
the most efficient way to analyze them.
Key Differences:
• Exploration Strategy:
o BFS: Layer by layer (breadth).
o DFS: Depth first.
• Path Finding:
o BFS: Finds shortest paths.
o DFS: Does not guarantee shortest paths.
• Memory Usage:
o BFS: Can require more memory for large networks, because it holds a whole layer of
nodes in its queue.
o DFS: Generally uses less memory, since it mainly tracks the current path.
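
As an illustration of the connected-components use case, here is a minimal sketch of an iterative DFS that uses an explicit stack. It assumes an undirected network stored as a dictionary in which every user appears as a key; the `network` data is made up for the example.

```python
def connected_components(graph):
    """Group nodes into connected components using depth-first search."""
    seen = set()
    components = []
    for start in graph:
        if start in seen:
            continue
        component = set()
        stack = [start]
        while stack:
            node = stack.pop()  # take the most recently discovered node first
            if node in seen:
                continue
            seen.add(node)
            component.add(node)
            stack.extend(n for n in graph[node] if n not in seen)
        components.append(component)
    return components


network = {
    "A": ["B"],
    "B": ["A", "C"],
    "C": ["B"],
    "D": ["E"],
    "E": ["D"],
    "F": [],
}

groups = connected_components(network)
is_fully_connected = len(groups) == 1
```

Replacing the stack with a queue would turn this into BFS and find the same components; DFS is simply a natural fit because the visiting order does not matter here.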

Centrality
Centrality measures are fundamental tools that help us understand the importance of individual
nodes (people, entities) within a network. They essentially quantify how "central" a node is, but
"central" can have different meanings.
Why Centrality Matters:
• Identifying Influence:
o Central nodes often have significant influence over the network.

• Understanding Information Flow:

o Central nodes can play key roles in how information spreads.
• Detecting Key Players:
o Centrality measures help identify important individuals or entities.
Common Centrality Measures:
• Degree Centrality:
o This is the simplest measure.
o It counts the number of direct connections a node has.
o A node with many connections has high degree centrality.
o In social media, this might represent someone with a large number of friends or
followers.

• Betweenness Centrality:
o This measures how often a node lies on the shortest path between other nodes.
o Nodes with high betweenness centrality act as "bridges" between different parts of
the network.
o They control the flow of information or resources.

• Closeness Centrality:
o This measures how close a node is to all other nodes in the network.
o Nodes with high closeness centrality can quickly reach other nodes.
o They are efficient at spreading information.

• Eigenvector Centrality:

o This measures a node's influence based on the influence of its neighbors.

o A node is important if it is connected to other important nodes.
o It considers the quality of connections, not just the quantity.

• PageRank:
o A variant of eigenvector centrality, famously used by Google, that also takes into
account the direction of links.
Key Considerations:
• Context Matters:
o The most appropriate centrality measure depends on the specific network and
research question.
• Network Type:

o Some measures are more suitable for directed networks (where connections have a
direction) than undirected networks.
• Weighted Networks:
o Centrality measures can be adapted to weighted networks, where connections have
different strengths.
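
Two of the measures above, degree and closeness centrality, can be sketched in plain Python. This is only an illustration under stated assumptions: the network is undirected, unweighted, connected, and has at least two nodes; degree is normalized by N-1, and closeness is taken as (number of other reachable nodes) / (sum of shortest distances). The star-shaped `star` network is hypothetical.

```python
from collections import deque


def degree_centrality(graph):
    """Fraction of the other nodes that each node is directly connected to."""
    n = len(graph)
    return {node: len(neighbors) / (n - 1) for node, neighbors in graph.items()}


def closeness_centrality(graph, node):
    """Reachable nodes divided by the total shortest-path distance to them."""
    distances = {node: 0}
    queue = deque([node])
    while queue:  # breadth-first search gives shortest hop counts
        current = queue.popleft()
        for neighbor in graph[current]:
            if neighbor not in distances:
                distances[neighbor] = distances[current] + 1
                queue.append(neighbor)
    total = sum(distances.values())
    return (len(distances) - 1) / total if total else 0.0


# A star network: Hub is connected to everyone, the others only to Hub.
star = {
    "Hub": ["A", "B", "C"],
    "A": ["Hub"],
    "B": ["Hub"],
    "C": ["Hub"],
}

degree = degree_centrality(star)
closeness_hub = closeness_centrality(star, "Hub")
closeness_a = closeness_centrality(star, "A")
```

In the star network the hub scores 1.0 on both measures, while each outer node reaches the others only through the hub and scores lower.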

Making connections
Making connections refers to the fundamental process of identifying and representing the
relationships between individuals, groups, or other entities within a network. It is the building
block upon which all other analyses are built.
In other words, it is the process of translating real-world social interactions into a format that can
be analyzed using network science methods. It is the critical first step that allows us to understand
the complex dynamics of social networks.
What Making Connections Entails:
• Data Collection:
o This is the first step, gathering the raw data that represents social interactions.
o Sources can include:
▪ Social media platforms (friendships, follows, mentions, shares).
▪ Communication logs (emails, messages).
▪ Collaboration records (co-authorships, project teams).
▪ Surveys or questionnaires.
• Defining Relationships:
o Clearly defining what constitutes a "connection" is crucial.
o This can vary depending on the context:
▪ "Friendship" on Facebook.
▪ "Following" on Twitter.
▪ "Co-worker" in a corporate network.
▪ "Interaction" based on frequency of comments.
• Representing Connections:
o Once connections are defined, they need to be represented in a way that can be
analyzed.

o This is typically done using:

▪ Graphs: Mathematical structures consisting of nodes (entities) and edges
(connections).
▪ Adjacency matrices: Tables that represent which nodes are connected.
▪ Edge lists: Simple lists of pairs of connected nodes.
• Quantifying Connections:
o Sometimes, connections are not simply present or absent, but have varying degrees
of strength.
o This leads to the use of:
▪ Weighted networks: Where edges are assigned numerical values
representing the strength of the connection.
▪ These weights can represent frequency of interaction, sentiment, or other
relevant factors.
Why Making Connections Is Important:
• Foundation for Analysis:
o Without accurately representing connections, all subsequent analyses would be
flawed.
• Revealing Network Structure:
o Connections define the overall structure of the network, including:
▪ Clusters and communities.
▪ Central nodes and influential individuals.
▪ Paths and flows of information.
• Understanding Social Dynamics:
o Connections provide insights into how individuals and groups interact, collaborate,
and influence each other.
• Enabling Predictions:
o By analyzing connection patterns, we can make predictions about future behavior,
such as:
▪ The spread of information.
▪ The formation of new relationships.
▪ The evolution of trends.

Link analysis
Link analysis is a core technique within social network analysis, focused on understanding the
relationships between entities by examining their connections.
Core Concept:
• At its heart, link analysis explores the connections (links) between entities (nodes) within
a network.
• This involves analyzing the patterns and properties of these links to gain insights into the
network's structure and dynamics.
Key Applications in Social Network Analysis:
• Identifying Influence:
o Link analysis helps determine which individuals or entities are most influential
within a network.
o This can involve analyzing the number of connections a person has, or their position
within the network's structure.
• Community Detection:
o By examining link patterns, analysts can identify clusters of tightly connected
individuals, revealing communities or sub-groups within the larger network.
• Information Flow:
o Link analysis helps trace how information spreads through a network.
o This is crucial for understanding how trends, ideas, or even misinformation
propagate.
• Anomaly Detection:
o Unusual link patterns can indicate suspicious activity, such as fake accounts or
coordinated manipulation campaigns.
• Relationship Mapping:
o It creates a visual representation of relationships, making it easier to understand
complex social structures.
Techniques and Measures:
• Centrality Measures:
o These metrics quantify the importance of nodes within a network.
▪ Degree centrality: The number of connections a node has.

▪ Betweenness centrality: How often a node lies on the shortest path
between other nodes.
▪ Closeness centrality: How close a node is to all other nodes in the
network.
• Network Visualization:
o Visualizing networks helps identify patterns and relationships that might be
difficult to see in raw data.
• Path Analysis:
o This involves tracing the routes between nodes to understand how information or
influence flows.
Why It is Important:
• Social media platforms generate vast amounts of relational data, making link analysis
essential for understanding online interactions.
• It provides valuable insights for:
o Marketing and advertising.
o Political analysis.
o Cybersecurity.
o Public health.
Link analysis helps us see the hidden structure of social networks, revealing how individuals and
groups are connected and how they interact.

PageRank Algorithm
The PageRank algorithm, originally developed by Google's founders Larry Page and Sergey Brin,
is a way to measure the importance of nodes within a network. While famously used for ranking
web pages, its principles are highly applicable to social network analysis.
Core Idea:
• PageRank assigns a numerical value to each node in a network, representing its
importance.
• The core concept is that a node is considered important if it is linked to by other important
nodes.
• In simpler terms, it is like a measure of "influence" or "prestige" within the network.
How It Works:

• Random Surfer Model:

o The algorithm simulates a "random surfer" who clicks on links within the network.
o The PageRank of a node is the probability that the random surfer will land on that
node.
o Nodes with more incoming links, especially from other high-ranking nodes, are
more likely to be visited.
• Iterative Calculation:
o PageRank is calculated iteratively, meaning the algorithm repeatedly updates the
rank of each node until it converges to a stable value.
o This process takes into account the structure of the entire network.
• Damping Factor:
o To account for the possibility that the random surfer might get bored and jump to a
random node, a "damping factor" is introduced: the probability that the surfer follows
a link rather than jumping (0.85 is a commonly used value).
o The random jumps prevent the surfer from getting stuck in loops or dead ends.
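
The iterative calculation can be sketched as a simple power iteration. This is a minimal illustration rather than a reference implementation: it assumes the network is a dictionary mapping each user to the users they link to, that every user appears as a key, a damping factor of 0.85, and that rank held by dead-end nodes is spread evenly over all nodes. The `links` data and parameter names are hypothetical.

```python
def pagerank(links, damping=0.85, tol=1e-10, max_iter=200):
    """Approximate PageRank scores by repeated updating until they stabilize."""
    nodes = list(links)
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(max_iter):
        # Rank sitting on dead ends (no outgoing links) is shared by everyone.
        dangling = sum(rank[node] for node in nodes if not links[node])
        base = (1.0 - damping) / n + damping * dangling / n
        new_rank = {node: base for node in nodes}
        for node in nodes:
            targets = links[node]
            if targets:
                share = damping * rank[node] / len(targets)
                for target in targets:
                    new_rank[target] += share
        change = sum(abs(new_rank[node] - rank[node]) for node in nodes)
        rank = new_rank
        if change < tol:
            break
    return rank


links = {
    "A": ["B", "C"],
    "B": ["C"],
    "C": ["A"],
    "D": ["C"],  # D links out, but nobody links to D
}

ranks = pagerank(links)
most_important = max(ranks, key=ranks.get)
```

In this small example C collects links from three users and ends up with the highest score, while D, which nobody links to, keeps only the baseline share from random jumps.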

Application in Social Network Analysis:

• Identifying Influential Users:
o PageRank can identify users who are highly influential within a social network.
o Users with high PageRank scores are those who are well-connected and receive
attention from other influential users.
• Analyzing Information Diffusion:
o By understanding the PageRank of different users, analysts can gain insights into
how information flows through the network.
o Information is more likely to spread quickly through high-ranking users.
• Community Detection:
o While not its primary purpose, PageRank can contribute to community detection
by highlighting clusters of interconnected high-ranking users.
• Recommendation Systems:
o A variation of PageRank, called personalized PageRank, is very useful in
recommendation systems. By biasing the random walk, the algorithm can find
nodes that are more relevant to a specific starting node.
Key Considerations:
• PageRank is sensitive to the structure of the network.
• It works best on directed networks, where connections have a direction (e.g., follows,
links).

• It provides a global measure of importance, meaning it considers the entire network
structure.

Random Graphs
Random graphs are incredibly valuable tools, providing a foundation for understanding the
structure and dynamics of online social interactions.

Core Concepts and Applications:

• Baseline Models:
o Random graphs, particularly the Erdős-Rényi model, serve as baseline models.
They allow researchers to compare real-world social networks to what would be
expected by chance. This helps identify non-random patterns, such as:
▪ Preferential attachment: Where popular users attract even more
connections.
▪ Community structures: Where groups of tightly connected users emerge.
• Understanding Network Properties:
o Random graph theory helps analyze key network properties, including:
▪ Degree distribution: The distribution of the number of connections each
user has.
▪ Clustering coefficient: The tendency of users' friends to also be friends.
▪ Path lengths: The average distance between any two users in the network.
• Modeling Information Diffusion:
o Random graphs can simulate how information spreads through social networks.
This is crucial for:

▪ Predicting the reach of viral content.

▪ Analyzing the spread of misinformation.
▪ Understanding how opinions and trends propagate.
• Identifying Influential Users:
o By applying centrality measures to random graph models, researchers can:
▪ Identify users who play a central role in connecting different parts of the
network.
▪ Determine whether these users' influence is statistically significant or
simply a result of random chance.
• Creating Null Models:
o Random graphs are often used to create "null models". These are graphs that have
some, but not all, of the properties of real-world networks. By comparing real-world
networks to these null models, researchers can determine which properties
of the networks are statistically significant.
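
One of the properties listed above, the clustering coefficient, can be computed directly. The sketch below assumes an undirected network stored as a dictionary of neighbor sets and defines a node's local clustering as the fraction of its neighbor pairs that are themselves connected (nodes with fewer than two neighbors get 0). The `friends` data is invented for illustration.

```python
from itertools import combinations


def local_clustering(graph, node):
    """Fraction of a node's neighbor pairs that are connected to each other."""
    neighbors = graph[node]
    k = len(neighbors)
    if k < 2:
        return 0.0
    linked = sum(1 for u, v in combinations(neighbors, 2) if v in graph[u])
    return linked / (k * (k - 1) / 2)


def average_clustering(graph):
    """Mean of the local clustering coefficients over all nodes."""
    return sum(local_clustering(graph, node) for node in graph) / len(graph)


# A, B, and C form a triangle; D is attached only to A.
friends = {
    "A": {"B", "C", "D"},
    "B": {"A", "C"},
    "C": {"A", "B"},
    "D": {"A"},
}

clustering_a = local_clustering(friends, "A")
clustering_b = local_clustering(friends, "B")
network_clustering = average_clustering(friends)
```

Comparing `network_clustering` for a real network with the value obtained on a random graph of the same size and density is one way to test whether friends of friends are connected more often than chance would suggest.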
Key Considerations:
• Real-world social networks often deviate significantly from simple random graph models.
This is because social interactions are influenced by factors like:
o Homophily (the tendency to connect with similar people).
o Social influence.
o External events.
• Therefore, researchers often use more sophisticated random graph models that incorporate
these factors.
Example:
Imagine this:
• Social Networks as Maps:
o Think of a social media platform like a big map. Each person using the platform is
a dot (a "node") on that map.
o When two people are friends or follow each other, we draw a line (an "edge")
connecting their dots.
o This map of dots and lines is what we call a "social network graph."
• What are Random Graphs?

o Now, imagine creating a map like that, but instead of real friendships, you just draw
lines between dots completely randomly.
o That is a "random graph." It is a map where connections are made by chance.
o We use mathematical rules to define "how random" it is. For example, we might
say, "Each pair of dots has a 10% chance of being connected."
Why do we use Random Graphs in Social Network Analysis?
• To Find What's "Normal":
o We use random graphs as a kind of "normal" or "expected" pattern.
o Then, we compare real social networks to these random ones.
o If a real network looks very different from a random one, it tells us that something
interesting is happening.
o For example, if in a real network, some people have way more connections than
others, and this is far more than what would happen in a random graph, it tells us
that those people are probably very influential.
• To See Patterns:
o Random graphs help us understand patterns in social networks.
o For instance:
▪ How connected are people? Do most people have a few friends, or are
there a few people with tons of friends?
▪ Do friends of friends know each other? Random graphs help us see if
people's social circles are tightly knit.
▪ How fast does information spread? We can simulate how information
travels through a random graph to see how quickly it could spread through
a real social network.
• To Spot Influence:
o We can use random graphs to see if someone's popularity is just luck, or if they are
genuinely influential.
o If someone has way more connections than you would expect in a random graph,
they are likely a key player.
In simpler terms:
• Random graphs are like a "control group" for social networks.
• They help us see what social networks look like when things happen by chance.

• By comparing real social networks to random ones, we can discover the hidden forces that
shape our online interactions.
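
The "each pair of dots has a 10% chance of being connected" rule corresponds to the Erdős-Rényi model, and can be sketched with Python's standard library. The function names, the graph size, and the seed below are assumptions chosen for the example.

```python
import random
from itertools import combinations


def erdos_renyi(num_nodes, p, seed=None):
    """Connect each pair of nodes independently with probability p."""
    rng = random.Random(seed)  # a fixed seed makes the graph reproducible
    graph = {node: set() for node in range(num_nodes)}
    for u, v in combinations(range(num_nodes), 2):
        if rng.random() < p:
            graph[u].add(v)
            graph[v].add(u)
    return graph


def edge_count(graph):
    """Number of undirected edges (each edge is stored at both endpoints)."""
    return sum(len(neighbors) for neighbors in graph.values()) // 2


def expected_degree(num_nodes, p):
    """Average number of connections per node expected by chance."""
    return p * (num_nodes - 1)


baseline = erdos_renyi(100, 0.10, seed=42)
baseline_edges = edge_count(baseline)
chance_degree = expected_degree(100, 0.10)
```

A user in a real 100-person network whose number of connections is far above `chance_degree` (about 9.9 here) stands out from what this random baseline would produce.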

Network evolution
Network evolution is about understanding how online social connections change over time. It is
not just a snapshot; it is a movie of how relationships form, grow, and sometimes fade.
Why Network Evolution Matters in Social Media:
• Social media is dynamic:
o People join and leave platforms.
o Friendships and follower relationships change.
o Interests and trends shift.
• Understanding these changes is crucial for:
o Tracking trends.
o Identifying influential users.
o Analyzing the spread of information.
o Predicting user behavior.
Key Aspects of Network Evolution:
• Growth and Preferential Attachment:
o This is the "rich get richer" phenomenon.
o Users with many followers tend to attract even more.
o This leads to the formation of "hub" users with massive influence.
o Analyzing this helps to predict how a social network is going to grow.
• Network Churn:
o People unfollow others or deactivate accounts.
o Relationships weaken over time.
o Analyzing churn helps understand user engagement and platform health.
• Community Evolution:
o Online communities form and dissolve.
o New communities emerge around trending topics.

o Tracking community evolution reveals changing interests and social dynamics.

• Information Diffusion Over Time:
o How does a piece of information spread across a network as it evolves?
o Does it reach new users, or does it stay within existing clusters?
o Understanding this helps analyze the impact of social media campaigns.
• Temporal Analysis of User Behavior:
o How do users' connections change in response to events or trends?
o Do they form new connections with people who share their interests?
o Analyzing this provides insights into user behavior and social influence.
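
One simple way to study these changes is to compare two snapshots of the same network. The sketch below assumes undirected ties given as pairs of user names and defines the churn rate as the share of earlier ties that have disappeared; the snapshot data and this particular definition are illustrative assumptions.

```python
def normalize(edges):
    """Treat (A, B) and (B, A) as the same undirected tie."""
    return {frozenset(edge) for edge in edges}


def compare_snapshots(earlier, later):
    """Summarize which ties were added, removed, or kept between snapshots."""
    before, after = normalize(earlier), normalize(later)
    removed = before - after
    return {
        "added": after - before,
        "removed": removed,
        "kept": before & after,
        "churn_rate": len(removed) / len(before) if before else 0.0,
    }


week1 = [("A", "B"), ("B", "C"), ("C", "D"), ("A", "D")]
week2 = [("B", "A"), ("B", "C"), ("C", "E")]

changes = compare_snapshots(week1, week2)
```

Repeating the comparison over consecutive days or weeks gives a time series of growth and churn rather than a single snapshot.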
In simpler terms:
• Imagine a social media network as a living organism.
• Network evolution is like studying how that organism grows, changes, and adapts over
time.
• It is about tracking the "who knows who" and "who talks to whom" over days, weeks, and
months.
Why this is important:
• For marketers: To understand how campaigns spread.
• For researchers: To study social behavior.
• For platform developers: To improve user experience.
• For anyone trying to understand how information flows in our modern world.
By analyzing network evolution, we gain a deeper understanding of the complex and ever-
changing world of social media.

Weighted Networks
Weighted networks add a layer of intensity or strength to the connections we analyze.
The Basic Idea:
• In a typical social network graph, a connection (edge) simply indicates that two users are
linked (e.g., friends, followers).
• A weighted network goes further: it assigns a numerical value (weight) to each connection.
This weight represents the strength or frequency of the interaction.

How Weights Are Determined in Social Media:

• Frequency of Interaction:
o How often do users interact (e.g., messages, comments, likes)?
o A high frequency indicates a strong connection, resulting in a higher weight.
• Intensity of Interaction:
o Are interactions positive or negative (sentiment analysis)?
o Do users share content frequently?
o Stronger emotional connections or frequent content sharing would yield higher
weights.
• Type of Interaction:
o Different interactions can have different weights (e.g., a direct message might have
a higher weight than a simple like).
o This can be customized based on what the analyst is attempting to measure.
• Reciprocity:
o Are the interactions one-sided or mutual? Mutual interaction generally indicates a
stronger bond.
Why Weighted Networks Are Valuable in Social Media Analytics:
• More Realistic Representation:
o Not all social connections are equal. Weighted networks capture the nuances of
these relationships.
• Improved Community Detection:
o Weighted edges help identify tightly knit communities by emphasizing stronger
connections.
o This leads to more accurate and meaningful community analysis.
• Enhanced Influence Analysis:
o Weights allow for a more refined analysis of influence.
o Users who generate strong interactions (high weights) are likely more influential
than those with weak connections.
• Sentiment Analysis:
o Weights can represent sentiment, revealing the emotional tone of interactions.

o This helps understand public opinion and identify potential issues.

• More Accurate Trend Detection:
o By analyzing the strength of interactions around certain topics, more accurate trend
analysis can be performed.
• Refined Recommendations:
o Weighted networks can enhance recommendation systems by suggesting
connections or content based on the strength of existing relationships.
Example:
• Imagine two users, Alice and Bob.
o In a simple network, they are just connected.
o In a weighted network:
▪ If Alice and Bob frequently exchange direct messages, their connection has
a high weight.
▪ If they only occasionally like each other's posts, their connection has a low
weight.
Weighted networks add depth to social media analysis by quantifying the strength of connections.
This leads to more accurate and insightful analyses of social interactions.
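
The Alice and Bob example can be turned into a small sketch that adds up interaction weights per pair of users. The per-type weights (3 for a message, 2 for a comment, 1 for a like) and the interaction log are hypothetical values chosen only to illustrate the idea; an analyst would tune them to what is being measured.

```python
from collections import defaultdict

# Hypothetical weights: a direct message counts more than a like.
INTERACTION_WEIGHTS = {"message": 3.0, "comment": 2.0, "like": 1.0}


def build_weighted_edges(interactions, weights=INTERACTION_WEIGHTS):
    """Sum the interaction weights for each undirected pair of users."""
    totals = defaultdict(float)
    for user_a, user_b, kind in interactions:
        pair = tuple(sorted((user_a, user_b)))  # same key in both directions
        totals[pair] += weights[kind]
    return dict(totals)


log = [
    ("Alice", "Bob", "message"),
    ("Bob", "Alice", "message"),
    ("Alice", "Bob", "like"),
    ("Alice", "Cara", "like"),
]

edge_weights = build_weighted_edges(log)
```

Alice and Bob end up with a much heavier tie than Alice and Cara, matching the intuition that frequent direct messages signal a stronger connection than an occasional like.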

Hypergraphs
Hypergraphs are a powerful, but less commonly used, tool in social media analytics. They excel
at representing relationships that go beyond simple pairwise connections; such group
relationships are common in social media.
Understanding Hypergraphs:
• Beyond Pairs:
o Traditional graphs connect two nodes (users) at a time.
o Hypergraphs use "hyperedges" that can connect any number of nodes.
o Think of it like this: a regular edge is a line between two dots; a hyperedge is like
a group circle that can enclose many dots.
• Representing Group Interactions:
o Social media is full of group interactions:
▪ Online communities.

▪ Shared posts with multiple users tagged.

▪ Collaborative projects.
▪ Group chats.
o Hypergraphs can model these interactions directly.
Applications in Social Media Analytics:
• Modeling Online Communities:
o A hyperedge can represent all members of a specific online community.
o This allows for analysis of community structure and overlap.
• Analyzing Co-occurrence Patterns:
o Hypergraphs can identify groups of users who frequently participate in the same
activities.
o For example:
▪ Users who consistently comment on the same posts.
▪ Users who share the same hashtags.
▪ Users that participate in the same group chats.
• Understanding Collaborative Activities:
o Hyperedges can represent collaborative projects or shared content creation.
o This helps analyze how users work together and contribute to shared goals.
• Analyzing Tagging Behavior:
o When a post has multiple users tagged, this is a perfect situation to use a
hypergraph. The hyperedge would connect all the tagged users.
• Detecting Complex Influence:
o Traditional influence analysis often focuses on individual users.
o Hypergraphs can reveal how groups of users collectively influence each other and
spread information.
• Event Analysis:
o For events on social media, such as a viral post or a trending hashtag, a
hyperedge can connect all the users who participated in that event. This shows
who was involved and how the participants are connected through shared
events.
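Two of the applications above, co-occurrence patterns and community overlap, can be sketched as follows. The data is hypothetical (each post is treated as a hyperedge over the users who commented on it), and the Jaccard ratio is used here as one common overlap measure; the notes themselves do not prescribe a specific measure.

```python
from collections import Counter
from itertools import combinations

# Hypothetical data: each post maps to the set of users who commented on it.
commenters_by_post = {
    "post_1": {"alice", "bob", "carol"},
    "post_2": {"alice", "bob"},
    "post_3": {"bob", "carol", "dave"},
}

# Co-occurrence: count in how many hyperedges (posts) each pair of users
# appears together.
pair_counts = Counter()
for members in commenters_by_post.values():
    for pair in combinations(sorted(members), 2):
        pair_counts[pair] += 1


def jaccard(a, b):
    """Overlap between two hyperedges: shared members / all members."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


# Overlap between the groups of users active on post_1 and post_3.
overlap_1_3 = jaccard(commenters_by_post["post_1"], commenters_by_post["post_3"])
```

Pairs with a high count, such as ("alice", "bob") here, are users who consistently comment on the same posts.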

Why Hypergraphs Are Useful:
• Capturing Complex Relationships:
o They go beyond simple pairwise connections, providing a more accurate
representation of social media interactions.
• Revealing Hidden Patterns:
o They can uncover patterns that traditional graphs hide, such as whether three users
interacted as one group or only in separate pairs.
• Providing Richer Insights:
o They allow for more nuanced and comprehensive analysis of social media data.
In simpler terms:
• Imagine a regular graph as showing "who is friends with whom."
• A hypergraph shows "who is in the same group, chat, or involved in the same activity."
• Essentially, hypergraphs are tools that excel when many people are interacting together at
the same time.
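The difference between "who is friends with whom" and "who is in the same group" can be made concrete with a small sketch. The function name and the users "a", "b", "c" are hypothetical. The sketch flattens each group into all of its pairs and shows that one three-person group chat and three separate private chats produce the same ordinary graph, so the group information is lost.

```python
from itertools import combinations


def to_pairwise_edges(hyperedges):
    """Project hyperedges onto ordinary edges (every pair inside a group)."""
    edges = set()
    for members in hyperedges:
        edges.update(combinations(sorted(members), 2))
    return edges


# Scenario 1: a single group chat with three members.
one_group_chat = [{"a", "b", "c"}]

# Scenario 2: three separate private chats between the same people.
three_private_chats = [{"a", "b"}, {"b", "c"}, {"a", "c"}]

# Both scenarios collapse to the same triangle of pairwise edges.
same_graph = (
    to_pairwise_edges(one_group_chat) == to_pairwise_edges(three_private_chats)
)
```

Since `same_graph` is true, an ordinary graph cannot tell these two situations apart, whereas the hypergraph keeps them distinct.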

Network Datasets
When conducting social network analysis, having access to relevant and high-quality network
datasets is crucial. These datasets provide the raw material for exploring social structures and
dynamics.
Types of Network Datasets:
1. Government and Public Datasets:
• Data.gov:
o This U.S. government platform provides access to a wide range of public datasets,
some of which may contain network information (e.g., transportation networks,
government collaborations).
• European Data Portal:
o Similar to Data.gov, this portal offers access to public data from European
countries.
• National statistical offices:
o Many countries publish data on social and economic networks.
2. Biological and Scientific Datasets:
• Protein-Protein Interaction Networks:
o These datasets represent interactions between proteins in biological systems.
o Databases like STRING and BioGRID provide access to these networks.
• Gene Regulatory Networks:
o These datasets show how genes regulate each other's expression.
• Brain Connectivity Networks:
o These datasets represent the connections between different regions of the brain.
• arXiv datasets:
o These datasets contain co-authorship networks and citation networks of papers
published on the arXiv preprint server.
3. Online Communities and Forums:
• Stack Exchange Data Dump:
o This provides access to data from the Stack Exchange network of Q&A sites,
including user interactions and question-answer relationships.
• Web Forums and Message Boards:
o Datasets from specific online forums or message boards can be collected (with
appropriate ethical considerations).
4. Transportation and Infrastructure Networks:
• OpenStreetMap (OSM):
o This collaborative project provides data on roads, railways, and other infrastructure
networks.
• Airline Networks:
o Datasets representing airline routes and connections between airports.
5. Financial Networks:
• Financial Transaction Networks:
o Datasets representing financial transactions between entities (often proprietary or
requiring special access).
• Interbank Lending Networks:
o Datasets representing lending relationships between banks.
6. Datasets related to specific research fields:
• Social science datasets:
o Many social science researchers publish data sets related to their studies.
• Information science datasets:
o Researchers in information science publish datasets related to information flow
and retrieval.
7. Social Media Datasets:
o These datasets capture interactions from platforms like Twitter, Facebook,
Instagram, and Reddit.
o They often include information about user connections (friendships, followers),
interactions (posts, comments, shares), and user attributes (profiles).
o Examples:
▪ Twitter datasets: Containing tweets, user networks, and hashtag
interactions.
▪ Reddit datasets: Including subreddit interactions and user comments.
8. Collaboration Networks:
o These datasets represent connections based on collaborative activities, such as co-
authorship networks (scientists who have published papers together) or project
collaboration networks.
9. Communication Networks:
o These datasets track communication patterns, such as email networks, phone call
networks, or instant messaging networks.
10. Citation Networks:
o These datasets represent the relationships between academic papers, where
citations indicate connections.
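As a small illustration of a citation network, the sketch below treats each citation as a directed edge from the citing paper to the cited paper. The paper names are hypothetical. Counting incoming edges gives the number of times a paper is cited, and counting outgoing edges gives the number of references it makes.

```python
from collections import Counter

# Hypothetical citation edges: (citing_paper, cited_paper).
citations = [
    ("paper_B", "paper_A"),
    ("paper_C", "paper_A"),
    ("paper_C", "paper_B"),
    ("paper_D", "paper_C"),
]

# In-degree: number of citations a paper receives.
times_cited = Counter(cited for _, cited in citations)

# Out-degree: number of references a paper makes.
references_made = Counter(citing for citing, _ in citations)

# The most cited paper in this small example.
most_cited = times_cited.most_common(1)[0]
```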
Important Considerations:
• Data Access and Licensing: Be aware of the terms of use and licensing agreements for
any dataset you use.
• Ethical Considerations: When working with social network data, it is crucial to respect
user privacy and adhere to ethical guidelines.
• Data Preprocessing: Network datasets often require significant preprocessing before they
can be analyzed.
By exploring these diverse sources, you can find network datasets that suit a wide range of research
and analysis needs.
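The data preprocessing point can be illustrated with a minimal sketch. It assumes an undirected network stored as a list of name pairs, and the cleaning rules shown (trimming spaces, lowercasing, dropping self-loops and incomplete records, merging duplicates) are typical examples rather than a complete recipe. The function name and sample records are hypothetical.

```python
def clean_undirected_edges(raw_edges):
    """Normalize names, drop self-loops, and merge duplicate undirected edges."""
    cleaned = set()
    for u, v in raw_edges:
        u, v = u.strip().lower(), v.strip().lower()
        if not u or not v or u == v:
            continue  # skip incomplete records and self-loops
        # In an undirected network (a, b) and (b, a) are the same tie,
        # so store each pair in sorted order.
        cleaned.add(tuple(sorted((u, v))))
    return sorted(cleaned)


# Hypothetical raw records with typical problems.
raw = [
    ("Alice", "bob"),
    ("bob", "alice "),   # duplicate of the first tie, reversed with a stray space
    ("carol", "carol"),  # self-loop
    ("dave", ""),        # incomplete record
    ("bob", "carol"),
]
edges = clean_undirected_edges(raw)
```

For a directed network the sorting step would be dropped, because the order of the pair carries meaning.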

Key Characteristics of Network Datasets:
• Nodes: Represent the entities in the network (e.g., users, authors, web pages).
• Edges (Ties): Represent the relationships or connections between nodes.
• Attributes: Nodes and edges may have associated attributes (e.g., age, location, weight of
connection).
• Directed vs. Undirected: Networks can be directed (e.g., following on Twitter) or
undirected (e.g., friendships on Facebook).
• Weighted vs. Unweighted: Edges can be weighted (representing the strength of the
connection) or unweighted.
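These characteristics can be sketched together in one small example. All names, locations and weights below are hypothetical, and the weight is assumed to be a count of interactions from the source user to the target user. The sketch builds a weighted adjacency matrix for a directed network and derives the unweighted version from it.

```python
# Nodes with attributes (hypothetical values).
nodes = {
    "alice": {"location": "Delhi"},
    "bob": {"location": "Pune"},
    "carol": {"location": "Delhi"},
}

# Directed, weighted edges: (source, target, weight).
edges = [("alice", "bob", 3), ("bob", "alice", 1), ("carol", "alice", 5)]

order = sorted(nodes)
index = {name: i for i, name in enumerate(order)}

# Weighted adjacency matrix: entry [i][j] is the weight of the edge i -> j.
weighted = [[0] * len(order) for _ in order]
for source, target, weight in edges:
    weighted[index[source]][index[target]] = weight

# Unweighted view: keep only whether a tie exists.
unweighted = [[1 if w else 0 for w in row] for row in weighted]

# An undirected network always has a symmetric matrix; this directed
# example does not, because alice -> bob and bob -> alice differ.
is_symmetric = all(
    weighted[i][j] == weighted[j][i]
    for i in range(len(order))
    for j in range(len(order))
)
```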
Where to Find Network Datasets:
• Stanford Large Network Dataset Collection (SNAP):
o A widely used repository of large network datasets.
• Kaggle:
o A platform that hosts various datasets, including social network data.
• IEEE DataPort:
o A platform that houses various datasets, including many that are useful for social
network analysis.
• University Repositories:
o Many universities maintain repositories of research datasets.
• APIs of social media platforms:
o Many social media platforms provide APIs that allow researchers to collect
data, subject to each platform's terms of use.
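Many downloadable network datasets are distributed as plain-text edge lists. The sketch below assumes one such layout, which many files in the SNAP collection use: comment lines starting with "#" followed by one whitespace-separated pair of node IDs per line. The sample content and the function name are hypothetical, and an in-memory string stands in for a downloaded file.

```python
import io

# Hypothetical sample in an edge-list layout with '#' comment lines.
sample = io.StringIO(
    "# Directed graph (hypothetical sample)\n"
    "# FromNodeId\tToNodeId\n"
    "0\t1\n"
    "0\t2\n"
    "1\t2\n"
)


def read_edge_list(lines):
    """Parse (source, target) integer pairs, skipping comments and blank lines."""
    edges = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        source, target = line.split()[:2]
        edges.append((int(source), int(target)))
    return edges


edges = read_edge_list(sample)

# The node set is recovered from the endpoints of the edges.
node_ids = {node for edge in edges for node in edge}
```

With a real file, the same function could be given an open file object instead of the `io.StringIO` sample. Nodes with no edges at all would not appear in `node_ids`, since an edge list only records ties.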
Challenges in Network Datasets:
• Data Privacy: Social network data often contains sensitive information, raising privacy
concerns.
• Data Size: Social networks can be very large, requiring significant computational
resources for analysis.
• Data Quality: Social media data can be noisy and incomplete.
• Dynamic Nature: Social networks are constantly changing, so datasets may become
outdated quickly.
Importance of Network Datasets:
• They enable researchers to study real-world social phenomena.
• They provide a basis for developing and testing network analysis algorithms.
• They facilitate the discovery of insights into social behavior and network dynamics.
By understanding the characteristics and sources of network datasets, researchers can effectively
utilize these resources to advance our understanding of social networks.
