Modeling Information Diffusion

This document discusses using partial differential equations (PDEs) to model information diffusion in online social networks. Prior work has largely used ordinary differential equations (ODEs) or statistical models, but PDE models can account for both temporal and spatial patterns of diffusion. The authors propose PDE models built on intuitive network distances that incorporate both internal influences from social network structure and external influences from outside sources. They validate the PDE models using real data sets and find the models can accurately predict influenced user densities over time and network distance. The PDE approach provides a new framework for understanding how network structure and information content interact to shape diffusion processes in online social networks.

Modeling Information Diffusion in Online Social Networks with Partial Differential Equations ∗

Haiyan Wang, Feng Wang, Kuai Xu†
School of Mathematical and Natural Sciences
Arizona State University, Phoenix, AZ 85069-7100, USA
arXiv:1310.0505v1 [cs.SI] 1 Oct 2013

Abstract
Online social networks such as Twitter and Facebook have gained tremendous popularity
for information exchange. The availability of unprecedented amounts of digital data has accel-
erated research on information diffusion in online social networks. However, the mechanism of
information spreading in online social networks remains elusive due to the complexity of social
interactions and the rapid change of online social networks. Much of the prior work on information
diffusion over online social networks has been based on empirical and statistical approaches. The
majority of dynamical models arising from information diffusion over online social networks
involve ordinary differential equations, which depend only on time. In a number of recent papers,
the authors propose to use partial differential equations (PDEs) to characterize temporal
and spatial patterns of information diffusion over online social networks. Built on intuitive
cyber-distances such as friendship hops in online social networks, the reaction-diffusion equa-
tions take into account influences from various external out-of-network sources, such as the
mainstream media, and provide a new analytic framework to study the interplay of structural
and topical influences on information diffusion over online social networks. In this survey, we
discuss a number of PDE-based models that are validated with real datasets collected from
popular online social networks such as Digg and Twitter. Some new developments including
the conservation law of information flow in online social networks and information propaga-
tion speeds based on traveling wave solutions are presented to solidify the foundation of the
PDE models and highlight the new opportunities and challenges for mathematicians as well
as computer scientists and researchers in online social networks.

Keywords: Primary: 35K57, 35K45; Secondary: 92D25

∗ Research supported by NSF Grant CNS-1218212
† Email: Haiyan.Wang@asu.edu, fwang25@asu.edu, kuai.xu@asu.edu

1 Introduction
Online social networking has undoubtedly changed the way people communicate and has become
increasingly popular for information exchange. In recent years, social media (interchangeable with
online social networks in this paper) such as Twitter and Facebook have experienced explosive
growth. The increasing availability of unprecedented amounts of digital data has accelerated
research on information diffusion in online social networks. But the mechanism of information
spreading in online social networks remains elusive due to the complexity of social interactions
and rapid change of social media. A better understanding of the information diffusion process over
social media can help predict and coordinate online social activities. Insight into the information
spreading process in social media can help increase the efficiency of distributing positive
information while reducing unwanted information over social media.
A significant body of research on social media [24, 22, 13, 25, 19, 33, 34, 35, 36] has
focused on the measurement and analysis of network structures, user interactions, and traffic
characteristics of social media with empirical approaches that utilize data mining and statistical
modeling schemes. There has also been considerable effort to use mathematical models to understand
and predict information diffusion over a time period in online social networks [21, 27, 30, 37, 20,
23, 12]. Newman [29] discussed dynamical models of network growth and dynamical processes taking
place on complex networks, and reported developments on the structure and function of complex
networks. Mathematical models based on epidemiological
processes have influenced the research on information diffusion [21, 29].
However, the deterministic models proposed for online social networks in the literature are
largely based on ordinary differential equations (ODEs), which deal with collective social processes
over time. Starting from a recent paper [1], the authors of this paper proposed to use partial
differential equations (PDEs) built on intuitive cyber-distance among online users to study both
temporal and spatial patterns of the information diffusion process in social media. One of the basic
questions that the models address is the following: for a given piece of information m initiated by a
particular user, called the source s, what is the density of influenced users at network distance x
from the source at any time t? We validate our models with real datasets collected
from popular social media sites, Twitter and Digg. The dataset from Digg consists of millions of
votes on top news stories on the Digg site during June 2009, and the friendship links among thousands
of users who voted on these stories. The experimental results show that the models can achieve
over 90% accuracy and effectively predict the density of influenced users for a given distance and
a given time, using a network distance metric based on friendship links.
To the best of our knowledge, [1] is the first attempt to propose a PDE-based model for
characterizing and predicting the temporal and spatial patterns of information diffusion over social
media. According to a recent survey on information diffusion over online social networks by Guille
et al. [10], the PDE model in [1] is one of the three non-graph-based predictive modeling approaches:
epidemiological, Linear Influence Model (LIM) and PDE approaches. The epidemiological models
in [10] refer to ODE-based or probabilistic models [29]. The LIM approach developed in [20] focuses
on predicting the temporal dynamics of information diffusion through solving non-negative least
squares problems. Our PDE-based models, including epidemiological models, are spatial dynamical
systems that take into account the influence of the underlying network structure as well as information
content to predict information diffusion over both temporal and spatial dimensions.
The PDE-based models we developed directly address a number of concerns in studying information
diffusion in online social networks with epidemiological models. Tufekci et al. [9] observed
that there are significant differences between information traveling in social media and the spreading
of germs, in that online users are exposed to information from a wide range of sources and
not only from the networks they are connected to. The same issue was also raised by Myers et
al. [12] (see also [10]), where two different diffusion processes, internal and external influence,
were discussed. The internal influence results from the structure of the underlying network; the
external influence comes from various out-of-network sources, such as the mainstream media. It is
estimated in [12] that almost 27% of the information volume in Twitter can be attributed to the
external influence. The authors of [12] noticed that nearly all epidemiological models for online social
networks focus only on the internal influence, while neglecting the external influence. However, the probabilistic
model in [12] primarily focuses on separating the external influence from the internal influence,
and quantifying the impact of the external influence on information adoption over time. The
PDE-based models we developed integrate the effects of both the structure-based process (internal
influence) and the content-based process (external influence) through dynamical systems in both
temporal and spatial dimensions. It is plausible that the network of social relationships and
the set structure of topical affiliations form the backbone of online social media (Romero et al. [8]),
and that the popularity of the content of information is the key driving force behind the external
influence. As such, our PDE models provide a new analytic framework towards a better understanding
of information diffusion mechanisms by studying the interplay of structural and topical influences.
Our work extends the applications of PDEs into the research of information diffusion in online
social networks. In the last few decades, there have been numerous new developments in the
mathematical analysis of reaction-diffusion systems. In this paper, in addition to reviewing a number
of PDE models for information diffusion from our recent papers, we present some new developments,
including the conservation law for information flow in social media, to provide a more
rigorous justification for the PDE models. We discuss stability, bifurcation, the free boundary value
problem, and information propagation speeds based on the analysis of traveling wave solutions for
interaction models. The theoretical advances in partial differential equations can provide an analytic
tool to reveal mechanisms of information diffusion. For example, analysis of the free boundary
value problem arising from social media in Section 6 leads to a simple formula for how fast infor-
mation is traveling. Surprisingly, the formula is almost the same as the celebrated result of Fisher,
Kolmogorov, Petrovsky and Piscounov in 1937 [18, 17] on the spreading of advantageous genes.
These results provide reasonable predictions for how parameters influence information diffusion
over social media. However, because of the complexity of human interactions and the rapid change of
social media, PDE models from social media can be quite complex and difficult to study analytically.
This short survey presents a number of simple PDE models arising from social media and
highlights the new opportunities and challenges for mathematicians as well as computer scientists
and researchers in social media.
This paper is organized as follows. Section 2 discusses the spatial-temporal phenomena in social
media. Section 3 introduces the conservation law of information flow in online social networks.
Section 4 presents a number of PDE-based models to describe information flow and validations
of the models with real datasets. Section 5 examines several complex spatial models for complex
interactions in social media. Section 6 discusses a number of related mathematical problems and
gives some theoretical results for the problems. Section 7 concludes the paper with a wide range
of challenges in modeling online social networks.

2 Information Diffusion over Social Media
2.1 Digg and Twitter Data
In order to develop and validate PDE-based models, we use real datasets collected from Twitter.com,
the largest micro-blogging site, and Digg.com, the most popular news aggregation site. In
Digg, registered users can post links to news stories and blogs on Digg.com. Other registered users
can vote and comment on the submitted news links. Digg users can connect to one another by
establishing a friendship relationship called “follow”. The initiator or source of a news link is the
voter who first posts the news to the Digg site. In addition to followers, who can view and choose
to vote for the news submitted by the friends they follow, Digg users who are not connected to the
initiator directly or indirectly are also able to view and vote for the news once it is promoted
to the front page after a certain time. A user can also search for particular news on the web site
and vote for it. The news propagation that does not result from the structure of the online social
network behaves somewhat randomly, which resembles the random walk in the development of partial
differential equations. Thus the Digg data provides a very good opportunity for us to study
the impact of the friendship relationships on the process of information spreading with partial
differential equations.
We will validate our PDE models with the dataset from Digg, which consists of the 3553 news
stories that were voted on (also called digged) and promoted to the front page of www.digg.com due to
their popularity during June 2009. In total, there are more than 3 million votes cast on these news
stories from over 139,409 Digg users. In addition, the dataset also includes the directed friendship
links among the Digg users who have voted on these news stories. Based on these friendship links, we
construct a directed social network graph among these Digg users. For each of the news stories,
the dataset includes the user IDs of all the voters during the collection period, and the timestamps
when the votes were cast.
We also collected data from Twitter. Twitter has much in common with Digg. Within the
Twitter social network, users follow other people with Twitter accounts. These users can follow
their friends, celebrities, or even famous politicians. A follower can view a person’s tweets
and also retweet them. When a person retweets a status or a picture, he or she is
reposting the tweet so that his or her followers can now view it. By retweeting, followers
take part in information diffusion through online social networking. The timestamps and the social
network graph give us the opportunity to study the temporal and spatial patterns of information
propagation.

2.2 Cyber-distance Based on Friendship Hops

In online social networks, cyber-distances or friendship hops play a significant role in information
diffusion. Cyber-distance or social distance is used to measure the closeness of users in online social
networks. An intuitive approach for defining the distance between two users is to use the number
of friendship links in the shortest path from one user to another in the social network graph. We
use friendship hops to refer to the number of friendship links or hops. Thus the distance between
the initiator and any other user is defined as the length (the number of friendship hops) of the
shortest path from the initiator to this user in the social network graph [1]. Clearly, the direct
followers of the initiator have a distance of 1, while their own direct followers have a distance of 2
from the initiator, and so on. Figure 1 shows the distance distributions of the direct and indirect
followers from Digg users who have initiated one or more top news stories in the Digg dataset. As
we can see from the figure, the majority of online social network users have a distance of 2 to 5
from the initiators. In this figure, for all four stories, the users at distance 3 account for more than
40% of all the users reachable from the initiator directly or through other users. As the distance increases
from 6 to 8, the number of social network users reachable from the initiator drops dramatically.
More precisely, let U denote the user population in an online social network, and let s be
the source of information, such as a news story, that starts to spread in social media. Based on the
distance of social network users from this source, the user population U can be divided into a
set of groups, i.e., $U = \{U_1, U_2, \ldots, U_i, \ldots, U_m\}$, where m is the maximum distance from the users to
the source s. The group $U_x$ consists of users that share the same distance x to the source.
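
As an illustration, the partition of users into distance groups can be computed with a breadth-first search from the source. The following Python sketch is not taken from [1]; the input layout (a dictionary `followers` mapping each user to the users who follow them), the function names and the toy graph are all assumptions made for this example.

```python
from collections import deque, defaultdict


def friendship_hops(followers, source):
    """Return {user: hop distance from source} via breadth-first search.

    `followers` maps each user to the users who follow them (a hypothetical
    input layout), so information flows from a user to their followers.
    """
    dist = {source: 0}
    queue = deque([source])
    while queue:
        user = queue.popleft()
        for follower in followers.get(user, ()):
            if follower not in dist:      # first visit gives the shortest path
                dist[follower] = dist[user] + 1
                queue.append(follower)
    return dist


def distance_groups(dist):
    """Group users by distance: groups[x] is the set U_x for x >= 1."""
    groups = defaultdict(set)
    for user, x in dist.items():
        if x >= 1:                        # the source itself is in no U_x
            groups[x].add(user)
    return dict(groups)


# Toy follower graph (made-up data): s is followed by a and b, and so on.
toy_followers = {"s": ["a", "b"], "a": ["c"], "b": ["c", "d"], "d": ["e"]}
toy_dist = friendship_hops(toy_followers, "s")
toy_groups = distance_groups(toy_dist)
```

Users that cannot be reached from the source do not appear in any group, which matches the idea that the groups only cover direct and indirect followers of the initiator.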

[Plot: fraction of users (0 to 0.5) versus distance (1 to 10), one curve each for story 1 to story 4]

Figure 1: Distribution of neighbors of four stories

While the social distance in this paper is based on friendship hops, the definition of cyber-distance
is flexible and can be based on other measurements. For example, Section 4.4
discusses an alternative way to define distance metrics based on shared interests.

2.3 Temporal and Spatial Patterns of Information Diffusion

In order to model the information diffusion process in temporal and spatial dimensions, we examine
the real datasets to analyze information spreading patterns over social media. With the definition
of distance, all users can be divided into distance groups based on their distance from the news
submitter. As a news story propagates through the Digg or Twitter network, users express their
interest in the news by voting for it or retweeting it. We call such users the influenced users of the
information.
The work in [2] studies the impact of friendship hop distance on the information diffusion process by
measuring the density of influenced users within the same distance, which is the percentage of influenced
users over the total number of users within a given distance. In the context of the Digg social network,
we consider the users who have voted for the news story as influenced users. Figure 2[a-d] illustrates
the density of influenced users (at distances of 1 to 5) over the initial 50 hours after the news
story was posted on Digg for four example news stories, respectively. Each curve in Figure 2[a-d]
represents the density at a different distance.
We can observe from Figure 2[a-d] that the densities of influenced users at different distances
show consistent evolving patterns rather than increasing or decreasing with random fluctuations.
The temporal and spatial patterns resemble dynamics of evolution equations involving both time
and space variables.
[Plots, panels (a) to (d): density of influenced users of s1, s2, s3 and s4 versus time (0 to 50 hours), one curve per distance d = 1 to d = 5]

Figure 2: Densities of influenced users over 50 hours for s1 to s4

In addition, [2] validates the observations for all news stories in the Digg dataset and concludes
that 94.9% of all news stories have similar consistent evolving patterns. For most of the news
stories, densities of influenced users decrease as the distances of the users increase, reconfirming
that friendship is an important channel of information spreading. Therefore, mathematical models,
in particular, evolution equations involving both time and space variables, can be used to describe
the evolution dynamics of information diffusion over social media.
It is worth noting that there are some differences between information diffusion in online social
networks and spatial biological processes in mathematical biology. In spatial ecology, the diffusion
process often refers to the fact that animals move randomly from one physical location to another.
In the context of online social networks, online users simply pass on information from one to another
and do not necessarily change their network distances within the lifetime of the information.

3 Conservation Law of Information Flow

3.1 Embedment of Diffusion Process into Euclidean space
Based on cyber-distance from a source, one could break down the user population U into a set of
groups, i.e., $U = \{U_1, U_2, \ldots, U_i, \ldots, U_m\}$, where m is the maximum distance from the users to the
source s. The group $U_x$ consists of users that share the same distance x to the source. Following
[1], we use the x-axis as the social distance and embed the group $U_x$ at the location x. Let I(x, t)
denote the density of influenced users at distance x and time t. For Digg news, I(x, t) is the ratio
of the number of influenced users at distance x at time t over the total number of users in $U_x$.
For Twitter news, I(x, t) is the number of influenced users in $U_x$ at time t, as
Twitter has a huge number of registered users (500 million registered users in 2012 [42]) and the
ratio for Twitter would be too small to study.
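
To make the two definitions concrete, the density can be computed from vote (or retweet) timestamps and the distance groups. The sketch below is only an illustration under assumed data structures: the dictionary layouts, the function name `influenced_density` and the sample values are hypothetical and are not part of the datasets described above. It returns a fraction between 0 and 1 for the Digg-style ratio, which can be multiplied by 100 to obtain a percentage.

```python
def influenced_density(vote_times, groups, t, as_ratio=True):
    """Density I(x, t) of influenced users for each distance x.

    vote_times: {user: hours after submission at which the user voted}
    groups:     {x: non-empty set of users at distance x}, i.e. the groups U_x
    as_ratio:   True  -> fraction of U_x influenced by time t (Digg-style)
                False -> raw count of influenced users (Twitter-style)
    """
    density = {}
    for x, members in groups.items():
        influenced = sum(
            1 for user in members
            if user in vote_times and vote_times[user] <= t
        )
        density[x] = influenced / len(members) if as_ratio else influenced
    return density


# Made-up example: two distance groups and three votes.
toy_groups = {1: {"a", "b"}, 2: {"c", "d", "e", "f"}}
toy_votes = {"a": 0.5, "b": 3.0, "c": 2.0}
ratio_at_2h = influenced_density(toy_votes, toy_groups, t=2.0)
count_at_2h = influenced_density(toy_votes, toy_groups, t=2.0, as_ratio=False)
```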

[Diagram: the source followed by distance groups x = 1 to x = 5 along an axis, with the content-based process and the structure-based process indicated]

Figure 3: Information diffusion process in online social networks

As information propagates over social media, users promote the information through retweeting,
commenting, searching, voting, forwarding and other activities. In general, two decisive components
for information diffusion in online social networks are the graph structure of social networks
(follower graphs) and the content of the information, which form the backbone of online social
networks [8]. Online users are subject to information from a wide range of sources, not just the
networks they are connected to [9]. In our setting, because group $U_x$ consists of users at the
same social distance from a source, the growth of the influenced users within the group may be
viewed as a result of the network structure. Other activities that promote information diffusion, such
as search, do not result from the network structure and may happen randomly for various reasons,
in most cases mainly because of the content of the information.
As such, we divide the information diffusion process in online social networks into two separate
processes, structure-based process and content-based process. The content-based and structure-
based processes in Fig. 3 resemble the external and internal influences, respectively, in online
social networks in [12]. The interplay of the two processes essentially accounts for the change
of the density of influenced users I(x, t). The structure-based process represents the information
spreading among users in $U_x$ with the same distance because of their direct links to those who
are already influenced. The content-based process measures information spread among
users at different distances due to various other activities that result from the popularity of the content
of the information. The content-based process is usually bidirectional or reciprocal in the manner
of a random walk. Figure 3 conceptually illustrates the interplay of the two processes in an online
social network. The content-based and structure-based processes are named slightly differently in
our previous papers [1, 4], but refer to the same processes in the context of online social networks.
As social media has rapidly gained worldwide popularity in recent years, many social media sites
have experienced explosive growth of registered online users. For example, Twitter had 500 million
registered users in 2012. This gives rise to extremely complex and large network graphs in online
social networks. If we introduce a slightly more complex distance metric based on the underlying
network topology, the number of subsets in U can increase dramatically. Therefore the user
population U will be embedded into a denser set of points in some interval on the x-axis. In particular,
when we discuss traveling wave solutions, it is assumed that these discrete points are dense enough
on a large section of the x-axis that it can be mapped to (−∞, ∞).

3.2 Formulation of Conservation Law of Information Flow
Conservation laws or basic balance laws play a crucial role in the development of partial differential
equation models in Physics, Mathematical Biology and other fields. A conservation law is a
mathematical formulation of the basic fact that the rate at which a given quantity changes in a
given domain must equal the rate at which it flows across its boundary plus the rate at which it
is created, or destroyed, within the domain. Once we embed the information propagation process
into Euclidean spaces, the formulation of the conservation law for information flow is similar to
that for spatial biology [28]. We emphasize differences and their interpretations in social media.
In social media, the quantity is the amount of information spreading, such as the density of
influenced users, denoted by I = I(x, t) and measured in amount per unit length along the x-axis,
since we are embedding the users of the entire network into a one-dimensional space. We assume that
any change in the amount of information is restricted to a one-dimensional tube in which each
cross-section is labeled by the spatial variable x. While only a discrete set of points ($U_x$) on the x-axis
is meaningful for social media, we can extend the discrete points to a continuous interval.
With this understanding, we can derive the conservation law of information flow. Our spatial
models are direct applications of the conservation law of information flow, and will be validated
by real data sets.
For simplicity, we assume that a constant A is the cross-sectional area of the tube. Thus the
amount of information in a small section of width dx is I(x, t)Adx. Further, we let J = J(x, t)
denote the flux of the quantity at x, at time t. The flux measures the amount of the quantity
crossing the section at x at time t, and its units are given in amount per unit area, per unit time.
In social media, J reflects the content-based diffusion process in Fig. 3 and does not result from
the structure of the underlying network. By convention, flux is positive if the flow is to the right,
and negative if the flow is to the left.
In social media, influenced users in $U_x$ may increase because they directly link to or follow those
who are already influenced. Let f = f (I, x, t) denote the given rate at which the information is
created within the section at x at time t. The term f represents the structure-based process in Fig. 3 and
is a result of local growth due to the underlying network structure. The structure-based diffusion
process has much in common with the internal influence in [12]. f can be negative in social media
if some kind of deletion occurs. f is measured in amount per unit volume per unit time. In this
way, f (I, x, t)Adx represents the amount of information that is created in a small width dx per
unit time.
We now can formulate the law by considering a fixed, but arbitrary, section a ≤ x ≤ b of the
domain. The rate of change of the total amount of the information in the section must equal
the rate at which it flows in at x = a, minus the rate at which it flows out at x = b, plus the rate
at which it is created within a < x < b. In mathematical formulation, for any section a ≤ x ≤ b,

$$ \frac{d}{dt} \int_a^b I(x,t)\,A\,dx = A\,J(a,t) - A\,J(b,t) + \int_a^b f(I,x,t)\,A\,dx $$

From the fundamental theorem of calculus, $J(a,t) - J(b,t) = -\int_a^b \frac{\partial J}{\partial x}\,dx$. Because A is constant,
it may be canceled from the formula. We arrive at, for any section a ≤ x ≤ b,

$$ \int_a^b \left( \frac{\partial I}{\partial t} + \frac{\partial J}{\partial x} - f(I,x,t) \right) dx = 0 $$

It follows that the fundamental conservation law of information flow is

$$ \frac{\partial I}{\partial t} + \frac{\partial J}{\partial x} = f(I,x,t) $$
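
The balance expressed by the integral form can be checked numerically on a discretized section. The following Python sketch is an illustration only: the flux-form (finite-volume) update, the function name `conservative_step` and all numbers are assumptions chosen for the example, and the cross-sectional area A is taken to be 1 since it cancels. It verifies that the rate of change of the total amount in the section equals the inflow at x = a, minus the outflow at x = b, plus the amount created inside.

```python
def conservative_step(I, J_faces, f, dx, dt):
    """One flux-form update of cell densities I_i on a 1-D grid.

    I:       list of n cell densities
    J_faces: list of n + 1 fluxes at the cell faces (positive = to the right)
    f:       list of n source rates (local creation or deletion)
    """
    return [
        I[i] - dt / dx * (J_faces[i + 1] - J_faces[i]) + dt * f[i]
        for i in range(len(I))
    ]


# Made-up numbers for illustration only.
dx, dt = 1.0, 0.1
I0 = [4.0, 3.0, 2.0, 1.0]
J = [0.5, 1.0, 1.0, 1.0, 0.2]    # J[0] is the flux at x = a, J[-1] at x = b
f = [0.3, 0.2, 0.1, 0.0]

I1 = conservative_step(I0, J, f, dx, dt)

# Rate of change of the total amount of information in [a, b] ...
total_rate = (sum(I1) - sum(I0)) * dx / dt
# ... versus inflow at a, minus outflow at b, plus creation inside.
balance = J[0] - J[-1] + sum(f) * dx
```

The interior face fluxes cancel in pairs when the cells are summed, which is the discrete counterpart of the fundamental theorem of calculus step used in the derivation.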
J does not necessarily result from direct social links and behaves like a random walk. For
example, in the Digg network, besides the fact that a follower votes for news posted by its followee,
a user can also vote for any news that he/she is interested in once the news is promoted to the
front page, or through search engines provided by the network. In Twitter, the symbol # followed
by a few characters, called a hashtag, is used to mark keywords or topics in a tweet. With the
hashtag symbol anyone can search for the set of tweets that contain a hashtag. It is estimated that
Twitter handles 1.6 billion search queries per day [42]. The use of hashtags increases the propagation
of tweets. Twitter users can also send @-messages publicly to a specific user by including the
@ character before the receiving person’s username in their tweet. This unstructured phenomenon
“jumps” across the network and appears at a seemingly random node [12]. The action results
from the relevance of the content of information rather than the structure of the follower graph of
a network. In general, information flows from high density to low density and therefore a simple
expression of flux J can be
$$ J = -d\,\frac{\partial I}{\partial x} \tag{3.1} $$
which results from a principle analogous to Fick’s law [28] in biology or physics. The minus sign
indicates that the flow is down the gradient. The coefficient d represents the popularity of information,
which promotes the spread of the information through non-structure-based activities such as search. For now d
can be viewed as an average and therefore is a constant. In general, it may depend on
I, x, t, which we will investigate in the paper. Now we obtain the following PDE model to describe
information flow.
$$ \frac{\partial I}{\partial t} = d\,\frac{\partial^2 I}{\partial x^2} + f(I,x,t) \tag{3.2} $$
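
For reference, the substitution that leads from the conservation law and the flux (3.1) to (3.2) can be written out step by step. The intermediate line, which keeps d inside the derivative, is the standard form for a diffusion coefficient that varies with x; it is stated here as a general fact about this substitution and not as a model from the paper.

```latex
\begin{align*}
\frac{\partial I}{\partial t} + \frac{\partial J}{\partial x} &= f(I,x,t),
\qquad J = -d\,\frac{\partial I}{\partial x} \\
\Longrightarrow\quad
\frac{\partial I}{\partial t}
  &= \frac{\partial}{\partial x}\!\left( d\,\frac{\partial I}{\partial x} \right) + f(I,x,t) \\
  &= d\,\frac{\partial^2 I}{\partial x^2} + f(I,x,t)
  \qquad \text{(constant } d\text{)}.
\end{align*}
```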

Symbol        Description
d ∂²I/∂x²     Diffusion process (random walk)
f(I, x, t)    Local growth (birth and death)

Table 1: Equation (3.2) in mathematical biology with physical distance

Symbol        Description                           Similar concept in the literature                                    Key reason to view the information
d ∂²I/∂x²     Content-based (random action)         External influence [12] (various webpages such as cnn.com)           Search or others (due to the popularity of content)
f(I, x, t)    Structure-based (structured action)   Internal influence [12] (diffusion over the edges of the network)    Follower graph

Table 2: Equation (3.2) in online social media with friendship hops as distance in Fig. 3

Tables 1 and 2 compare the interpretations of the PDE model in mathematical biology and
in online social networks in the setting of Fig. 3. The structure-based process can
be viewed as the growth of population due to local growth in mathematical biology. The content-based
process is similar to the diffusion process in mathematical biology and behaves in the manner
of a random walk. The content-based and structure-based processes in Fig. 3 resemble the external
and internal influences, respectively, in online social networks in [12]. The key difference between the
structure-based and content-based processes is that the former results from information received
from the follower graph, while the latter results from information received from search or other actions
because of the popularity of content rather than the follower graph.

4 Partial Differential Equation Models

In this section, we will discuss three Reaction-Diffusion equation models which will be validated
by real datasets. Two of them have been published in [1] and [2]. Spatial models have been used
in mathematical biology, sociology, economics, and physics to model spatial-temporal patterns.
Spatial models are able to provide a quantitative way to integrate local data about interactions
between individual users into global conclusions about news spread over social media. By intro-
ducing PDEs into the context of social media, we capture the similarity and difference between
spreading of epidemics in biology and the information spreading in online social networks.

4.1 Diffusive Logistic Model

In this subsection we review the diffusive logistic model proposed by the authors in [1] for characterizing the
temporal and spatial patterns of information cascading over social media. The model will be
validated with the dataset from Digg. The logistic model is believed to be the simplest nonlinear
model to capture population dynamics where the rate of reproduction is proportional to both
the existing population and the amount of available resources [28]. It has been widely used to
describe various population dynamics and predict the growth of bacteria and tumors over time [28].
The structure-based process in Fig. (3) is modeled with a simple nonlinear equation. Logistic
equation is defined as follows. Denoting with N the population at time t, r the intrinsic growth
rate and K the carrying capacity which gives the upper bound of N, the population dynamics are

governed by:

$$\frac{dN}{dt} = rN\left(1 - \frac{N}{K}\right) \tag{4.1}$$

where $\frac{dN}{dt}$ is the first derivative of N with respect to t. In the context of online social networks, the term $rN\left(1 - \frac{N}{K}\right)$ describes the impact of the network structure on the growth of I(x, t), the density of influenced users at distance x and time t.
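As a worked step, and under the simplifying assumption (not made later in the text) that r is constant and that N(0) = N_0 > 0, equation (4.1) can be solved in closed form by separation of variables:

$$N(t) = \frac{K N_0 e^{rt}}{K + N_0\left(e^{rt} - 1\right)}$$

For small t this behaves like $N_0 e^{rt}$, and as $t \to \infty$ it approaches the carrying capacity K, which is the saturation behavior the logistic term is meant to capture.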
The growth rate r reflects the decay of news influence with respect to time t. While some information can take a long time to spread in social media [7], news diffusion in social media is time-sensitive, and the influence of a news story decays drastically as time elapses. Figure 4 illustrates the spread of the most popular story in the Digg dataset from the temporal perspective. It shows that interest in news decays exponentially over time. The x-axis is the distance, the y-axis is the density of influenced users, and each line represents the density at time t, where t is 1 hour, 2 hours, and so on up to 50 hours after the submission of the initial news. The gap between consecutive density curves shrinks as time passes. From our experiments, an exponential decay function seems plausible for modeling the rapid decay of news influence with respect to time.
Figure 4: Density of influenced users over 50 hours with friendship hops as distance (x-axis: distance; y-axis: density)

It is conceivable that the rate of influence of a news story decays exponentially, which can also be observed in Fig. 4. The decay process can be modeled by the following ordinary differential equation

$$\frac{dr(t)}{dt} = -\alpha r(t) + \beta, \qquad r(1) = \gamma \tag{4.2}$$

where $\frac{dr(t)}{dt}$ is the rate of change of r with respect to time t, α is the decay rate, and γ is the initial rate of influence. β represents the residual rate as time increases, which can be very small. Solving for r in (4.2), we obtain

$$r(t) = \frac{\beta}{\alpha} - e^{-\alpha(t-1)}\left(\frac{\beta}{\alpha} - \gamma\right) \tag{4.3}$$
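The closed form (4.3) can be checked numerically. The following is a minimal Python sketch, not the authors' code; the function name `influence_rate` and the parameter values are illustrative assumptions, with the values chosen so that r(t) = 1.4e^{−1.5(t−1)} + 0.25, the form reported for story 1 in Section 4.1.2.

```python
import math

def influence_rate(t, alpha, beta, gamma):
    """Closed-form solution (4.3) of dr/dt = -alpha*r + beta with r(1) = gamma."""
    residual = beta / alpha  # value approached by r(t) as t grows
    return residual - math.exp(-alpha * (t - 1.0)) * (residual - gamma)

# Illustrative values (an assumption): beta/alpha = 0.25 and gamma - beta/alpha = 1.4
alpha, beta, gamma = 1.5, 0.375, 1.65

# Check the initial condition and the ODE (4.2) with a central difference at t = 2
t, step = 2.0, 1e-6
derivative = (influence_rate(t + step, alpha, beta, gamma)
              - influence_rate(t - step, alpha, beta, gamma)) / (2.0 * step)
ode_residual = derivative - (-alpha * influence_rate(t, alpha, beta, gamma) + beta)
```

With these values, `influence_rate(1.0, ...)` returns γ, and `ode_residual` is zero up to finite-difference error, which confirms that (4.3) satisfies (4.2).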
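Based on the conservation law of information flow, combining the structure-based process and the content-based process gives the following diffusive logistic equation:

$$\begin{aligned}
&\frac{\partial I}{\partial t} = d\frac{\partial^2 I}{\partial x^2} + rI\left(1 - \frac{I}{K}\right),\\
&I(x, 1) = \phi(x), \quad l \le x \le L,\\
&\frac{\partial I}{\partial x}(l, t) = \frac{\partial I}{\partial x}(L, t) = 0, \quad t \ge 1
\end{aligned} \tag{4.4}$$

where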

• I represents the density of influenced users with a distance of x at time t;

• d represents the popularity of information which promotes the spread of the information
through non-structure based activities such as search;

• r represents the intrinsic growth rate of influenced users with the same distance, and measures
how fast the information spreads within the users with the same distance;

• K represents the carrying capacity, which is the maximum possible density of influenced
users at a given distance;

• l and L represent the lower and upper bounds of the distances between the source s and other social network users;

• φ(x) ≥ 0 is the initial density function, which can be constructed from historical data of information spreading. Each piece of information has its own unique initial function;

• $\frac{\partial I}{\partial t}$ represents the first derivative of I with respect to time t;

• $\frac{\partial^2 I}{\partial x^2}$ represents the second derivative of I with respect to distance x.

$\frac{\partial I}{\partial x}(l, t) = \frac{\partial I}{\partial x}(L, t) = 0$ is the Neumann boundary condition [28], which means that there is no flux of information across the boundaries at x = l and x = L. This assumption is plausible for social media since the users cluster in a number of groups U_x. We also assume that φ(x) ≥ 0 is not identically zero; the maximum principle then implies that (4.4) has a unique positive solution I(x, t) with 0 ≤ I(x, t) ≤ K.
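To make the structure of (4.4) concrete, here is a minimal numerical sketch in Python with NumPy. It is not the authors' Matlab implementation: the explicit finite-difference scheme, the function name `solve_diffusive_logistic`, the grid sizes, and the initial profile `phi` are all assumptions made for illustration. The values K = 25, d = 0.01 and r(t) = 1.4e^{−1.5(t−1)} + 0.25 are the ones reported for story 1 in Section 4.1.2.

```python
import numpy as np

def solve_diffusive_logistic(phi, l, L, t_end, d, K, r, nx=81, dt=0.005):
    """Explicit finite-difference sketch for (4.4) on [l, L], starting at t = 1."""
    x = np.linspace(l, L, nx)
    dx = x[1] - x[0]
    # Stability restriction of the explicit scheme for the diffusion term
    assert d * dt / dx**2 <= 0.5, "need d*dt/dx^2 <= 1/2"
    I = phi(x).astype(float)
    t = 1.0
    n_steps = int(round((t_end - 1.0) / dt))
    for _ in range(n_steps):
        # Mirrored ghost points enforce the zero-flux (Neumann) condition
        padded = np.concatenate(([I[1]], I, [I[-2]]))
        laplacian = (padded[2:] - 2.0 * I + padded[:-2]) / dx**2
        I = I + dt * (d * laplacian + r(t) * I * (1.0 - I / K))
        t += dt
    return x, I

def phi(x):
    # Hypothetical initial density with zero slope at x = 1 and x = 6
    return 3.0 + 2.0 * np.cos(np.pi * (x - 1.0) / 5.0)

def r(t):
    # r(t) as reported for story 1 in Section 4.1.2
    return 1.4 * np.exp(-1.5 * (t - 1.0)) + 0.25

x, I_final = solve_diffusive_logistic(phi, l=1.0, L=6.0, t_end=6.0, d=0.01, K=25.0, r=r)
```

With these step sizes the computed density stays between 0 and K, mirroring the bound 0 ≤ I(x, t) ≤ K stated above. An implicit time-stepping scheme would relax the step-size restriction at the cost of a linear solve per step.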

4.1.1 Initial Density Function Construction

In general, we assume that the initial density function is given and can be constructed using the data collected from the initial stage of information diffusion. Specifically, φ is a function of distance x that captures the density of influenced users at distance x at the initial time, shortly after a news story is submitted. In online social networks, only discrete values of the initial density function can be observed because the distance x is discrete. The initial density is the distribution of influenced users at time t = 1. As in [1], we apply cubic spline interpolation [26], available in the Matlab cubic spline package, to interpolate the initial discrete data when constructing φ(x). In this process, a cubic polynomial is fitted between each pair of adjacent data points, with the stipulation that the resulting curve is continuous and smooth. Hence φ(x) constructed by cubic spline interpolation is a piecewise-defined function and twice continuously differentiable. After the interpolation, we simply set the two ends to be flat so that the slopes of φ(x) at the left and right ends are zero, matching the zero-flux boundary condition in (4.4).
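The construction can be sketched in Python with SciPy rather than the Matlab package used by the authors. The data values below are hypothetical, and the sketch swaps in a clamped end condition (`bc_type="clamped"`, which sets the first derivative to zero at both ends) as one way to obtain flat ends; the paper instead flattens the ends after interpolating.

```python
import numpy as np
from scipy.interpolate import CubicSpline

# Hypothetical observed densities at friendship-hop distances 1..5 at t = 1
hops = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
density = np.array([5.8, 1.7, 1.4, 1.9, 0.9])

# Clamped cubic spline: interpolates the data with zero slope at both ends
phi = CubicSpline(hops, density, bc_type="clamped")

# Evaluate phi and its first derivative on a fine grid
grid = np.linspace(hops[0], hops[-1], 201)
phi_values = phi(grid)
end_slopes = phi(hops[[0, -1]], 1)  # first derivative at x = 1 and x = 5
```

A cubic spline can dip slightly below zero between data points, so in practice the result may need to be checked against the requirement φ(x) ≥ 0.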

4.1.2 Accuracy of Diffusive Logistic Model

In this subsection, we evaluate the performance of the diffusive logistic model by comparing the density calculated by the model with the actual observations in the Digg data set. We first present the model accuracy for the most popular news story. The accuracy of a predicted value against the actual value is defined as follows.

$$\text{model accuracy} = 1 - \frac{|\text{predicted value} - \text{actual value}|}{\text{actual value}} \tag{4.5}$$
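The accuracy measure (4.5) is simple to compute. The helper below is an illustrative Python sketch; the function name `model_accuracy` is an assumption and does not come from the paper.

```python
def model_accuracy(predicted, actual):
    """Accuracy (4.5): 1 - |predicted - actual| / actual, for a positive actual value."""
    if actual <= 0:
        raise ValueError("accuracy (4.5) assumes a positive actual value")
    return 1.0 - abs(predicted - actual) / actual

# Hypothetical densities: over- and under-prediction by the same amount score equally
over = model_accuracy(11.0, 10.0)
under = model_accuracy(9.0, 10.0)
```

Note that the measure is relative to the actual value, so it becomes negative when the prediction is off by more than the actual value itself.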
Figure 5: Prediction of (4.4) vs. real data of story 1 with 24099 votes (x-axis: distance; y-axis: density)

We numerically solved the model with Matlab. Figure 5 illustrates the prediction results for an example news story (story 1) with the proposed model, where the x-axis is the distance measured by friendship hops and the y-axis represents the density of influenced users at each distance. The solid lines denote the actual observations of the density of influenced users for a variety of time periods (i.e., 1 hour, 2 hours, 3 hours, 4 hours and 5 hours), while the dashed lines illustrate the density predicted by the model. As we can see, the proposed model is able to accurately predict the density of influenced users at different distances over time. In this case the parameter values are K = 25 and d = 0.01, with $r(t) = 1.4e^{-1.5(t-1)} + 0.25$. Table 3 gives the numerical accuracy values corresponding to Figure 5. The average accuracy at each distance lies between about 87% and 98%.

Distance   Average   t=2      t=3      t=4      t=5      t=6
1          98.27%    97.47%   97.74%   97.48%   99.55%   99.09%
2          86.99%    93.59%   96.63%   87.16%   80.80%   76.78%
3          90.28%    83.23%   87.98%   90.99%   93.35%   95.94%
4          92.98%    86.75%   91.39%   99.00%   95.68%   92.06%
5          93.77%    89.05%   91.61%   97.79%   97.92%   92.49%
6          94.56%    90.03%   89.48%   96.04%   97.57%   99.67%

Table 3: Prediction accuracy of (4.4) with friendship hops as distance for story s1

4.2 Linear Diffusive Model

The logistic growth model in [1] is the first spatial model to account for the phenomenon that the initial stage of the increase of influenced users is approximately exponential; then, as saturation begins, the growth slows, and eventually growth stops. It can achieve high accuracy, as discussed in the previous subsection. In this subsection, we present a simpler linear growth function, introduced in [2] by the authors, Wu and Xia, to model the growth of influenced users in online social networks. The linear model takes into account the effects of heterogeneity in cyber-distance and news decay with respect to time. As indicated in Fig. 1, the distribution of the density of influenced users over distance is not homogeneous. The majority of users are in the groups with distances 3 and 4. This heterogeneity in distance leads to the assumption that the growth function depends on location x. The concavity of the shape of Fig. 1 further suggests that we can use the following concave down quadratic function h(x) to describe this heterogeneity in distance.

h(x) = −(x − ρ)(x − σ) (4.6)

The coefficient of $x^2$ in h(x) is scaled to be −1. h(x) reflects the rate of change of influenced users with respect to distance x. The simplest way to model the growth of influenced users is as a linear function of I. Let

$$f = r(t)h(x)I$$

We can think of r(t) as the growth rate averaged over all distances and, likewise, of h(x) as the growth rate averaged over all times. Thus, combining the structure-based process (3) and the growth process together, the fundamental law of information flow gives the following linear diffusive equation

$$\begin{aligned}
&\frac{\partial I}{\partial t} = d\frac{\partial^2 I}{\partial x^2} + r(t)h(x)I,\\
&I(x, 1) = \phi(x), \quad l < x < L,\\
&\frac{\partial I}{\partial x}(l, t) = \frac{\partial I}{\partial x}(L, t) = 0, \quad t > 1
\end{aligned} \tag{4.7}$$

4.2.1 Accuracy of Linear Model

We evaluate the performance of the linear diffusive model by comparing the density calculated by the model with the actual observations in the Digg data set. We numerically solved the model with Matlab. Figure 6(a) illustrates the performance of the linear diffusive model for the most popular story, s1. Figure 6(b) gives the shape of r(t) and Figure 6(c) gives the shape of h(x). h(x) is a concave down function with its peak between 3 and 4, which is related to the neighbor distribution illustrated in Figure 1. The shape of h(x) suggests that there may exist highly influential users or opinion leaders at distance 4 from the submitter of news s1. The corresponding parameters are listed in Figure 6(d). The parameters are adjusted manually to best fit the actual data. The diffusion constant d is relatively small because d is the average diffusion rate for all distances. It also suggests that the structure-based process has a dominating impact on the information diffusion process. α, β and γ determine the shape of r(t), and ρ and σ determine the peak of h(x). The average accuracy at each distance is calculated over times t = 2, ..., 6 and is provided in Figure 6(e). The average accuracy exceeds 96% at every distance.

(a) Predicted (blue, solid) vs. actual data (red, dotted); x-axis: distance, y-axis: density
(b) r(t) versus time
(c) h(x) versus distance

(d) Parameter values
Parameter   Value
d           0.0020
α           1.5526
β           0.0059
γ           0.0780
ρ           -0.9478
σ           8.9149

(e) Model accuracy
Distance    Average
1           97.88%
2           97.27%
3           97.44%
4           96.20%
5           98.25%
Overall     97.41%

Figure 6: Accuracy of (4.7) for the most popular news story in the Digg data set
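The role of the fitted parameters can be illustrated with a short Python sketch. It uses the values reported in Figure 6(d) and assumes, as the parameter names suggest, that r(t) takes the form (4.3) and h(x) the form (4.6); the function names are illustrative.

```python
import math

# Parameter values reported in Figure 6(d)
alpha, beta, gamma = 1.5526, 0.0059, 0.0780
rho, sigma = -0.9478, 8.9149

def h(x):
    """Quadratic heterogeneity function (4.6)."""
    return -(x - rho) * (x - sigma)

def r(t):
    """Decay function of the form (4.3), assumed here for the linear model."""
    return beta / alpha - math.exp(-alpha * (t - 1.0)) * (beta / alpha - gamma)

# The vertex of the parabola lies midway between its roots rho and sigma
peak_location = 0.5 * (rho + sigma)  # about 3.98
```

The peak of h(x) at about x = 3.98 agrees with the statement that the peak lies between 3 and 4, and r(1) equals γ = 0.0780, the initial rate of influence.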
[2] also studies the accuracy of the model for all news stories in the Digg data set and examines whether the model can capture the heterogeneity in information diffusion over the Digg network. Specifically, it explores the overall accuracy of the linear diffusive model for all 133 news stories with over 3000 votes in the Digg data set. The results in [2] show that about 13% of news stories can be described with accuracy higher than 90%, and in total about 60% of news stories can be described with accuracy higher than 80%. The simulation is performed with a MATLAB automatic fitting program. If we manually adjust the parameters for each individual news story, higher accuracy can be achieved. For example, for the most popular news story, the average accuracy reaches 97.41% with manually adjusted parameters, while with automated parameter selection the average accuracy is still greater than 90%. These results across all news stories with over 3000 votes provide evidence that the linear diffusive model captures the heterogeneous diffusion patterns of news and can be used as an effective approach to describe news spreading in Digg.

4.3 Logistic Model with Variable Content-based Diffusion

In the previous two subsections, we assumed that the diffusion coefficient d in the flux formula

$$J = -d\frac{\partial I}{\partial x}$$

is a constant. In fact, because of the spatial heterogeneity of online users in social media, d may depend on the distance x from the source. In general, d may be a decreasing function of x since interactions between different groups U_x decrease dramatically as x increases. Therefore, we replace the constant d with the exponential function

$$d\,e^{-bx}$$

to model the effect of spatial heterogeneity of online users in the content-based process in Fig. 3. Thus the following model combines the previous two models with variable content-based diffusion.

$$\begin{aligned}
&\frac{\partial I}{\partial t} = \frac{\partial}{\partial x}\left(d\,e^{-bx}\frac{\partial I}{\partial x}\right) + r(t)I\left(h(x) - \frac{I}{K}\right),\\
&I(x, 1) = \phi(x), \quad l < x < L,\\
&\frac{\partial I}{\partial x}(l, t) = \frac{\partial I}{\partial x}(L, t) = 0, \quad t > 1,\\
&r(t) = A + Be^{-Ct}
\end{aligned} \tag{4.8}$$

where

• d represents the popularity of information; b represents the decay of the popularity of information with respect to the friendship structure in social networks;

• K represents the carrying capacity, which is the maximum possible density of influenced
users at a given distance;

• h(x) represents the heterogeneity of growth rate in distance x.
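Because the diffusion term in (4.8) is in divergence form, a natural discretization works with fluxes between neighboring grid cells. The following Python sketch shows one explicit time step under that approach. It is an illustration, not the authors' Matlab code: the scheme, the function names, the grid, the initial density, and the shape of h(x) are assumptions, while d = 1, b = 3, K = 300 and r(t) = 0.3 + e^{−2t} are the values reported in Section 4.3.1.

```python
import numpy as np

def step_variable_diffusion(I, x, t, dt, d, b, K, r, h):
    """One explicit Euler step of (4.8) in conservative (flux) form."""
    dx = x[1] - x[0]
    x_face = 0.5 * (x[:-1] + x[1:])               # faces between grid nodes
    a_face = d * np.exp(-b * x_face)              # diffusion coefficient d*exp(-b*x)
    flux = -a_face * (I[1:] - I[:-1]) / dx        # J = -a(x) dI/dx at interior faces
    flux = np.concatenate(([0.0], flux, [0.0]))   # zero flux at x = l and x = L
    width = np.full_like(I, dx)                   # cell widths
    width[0] = width[-1] = 0.5 * dx               # boundary cells are half as wide
    divergence = (flux[1:] - flux[:-1]) / width
    reaction = r(t) * I * (h(x) - I / K)
    return I + dt * (-divergence + reaction)

def growth_shape(x):
    # Hypothetical h(x) with a peak at x = 2
    return 0.6 * np.exp(-(x - 2.0) ** 2)

def rate(t):
    # r(t) as reported in Section 4.3.1
    return 0.3 + np.exp(-2.0 * t)

x = np.linspace(1.0, 5.0, 41)
I = 50.0 * np.exp(-(x - 2.0) ** 2)                # hypothetical initial density
t, dt = 1.0, 0.05
while t < 15.0 - 1e-9:
    I = step_variable_diffusion(I, x, t, dt, d=1.0, b=3.0, K=300.0, r=rate, h=growth_shape)
    t += dt
```

One design point worth noting: with the reaction term switched off, the flux form conserves the total amount of I on the grid exactly, which is the discrete counterpart of the no-flux boundary condition.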

4.3.1 Accuracy of Logistic Model with Variable Content-based Diffusion

We evaluate the performance of the logistic model with variable diffusion by comparing the density calculated by the model with the actual observations in the Twitter dataset. We choose the photo in Figure 7, tweeted by President Barack Obama on December 13, 2012, as an example to verify the accuracy of the model. Based on the Twitter user list and social network graphs collected in October 2009, we find that 383 Twitter users have retweeted this photo. Among these 383 users, 96 users have a distance of 1 to the source of the tweet, and 226, 58, 2, and 1 users have a distance of 2, 3, 4, and 5, respectively. We compare the number of users retweeting the photo at the different distances over a time frame from t = 1 to t = 15.

Figure 7: President Barack Obama tweeted the photo on Twitter at 10:20AM, Dec. 13, 2012. We choose the tweet as an example to validate (4.8).

We numerically solved the model with Matlab. Figure 8(a) illustrates the prediction results for the photo tweeted by President Barack Obama with the proposed model for the time frame from t = 1 to t = 15 hours, where the x-axis is the distance measured by friendship hops and the y-axis represents the number of users who retweeted at each distance. The red dotted lines denote the actual number of users retweeting the photo at various times, and the blue curves represent the predictions of the PDE model. The model of the diffusion of President Obama's tweet reaches an overall accuracy of 97.64%, as shown in Figure 9. These results were obtained with d = 1, b = 3, K = 300, $r(t) = 0.3 + e^{-2t}$, and the function h(x) shown in Figure 8(b). Figure 8(b) shows a peak at x = 2, which may indicate that the photo tweeted by President Obama is most popular among Twitter users at distance two.

In summary, the spatial models (4.4), (4.7) and (4.8) all achieve high accuracy in the experiments above. While (4.7) is a linear model and captures the behavior of news spread within a few hours, the nonlinear models (4.4) and (4.8) can predict news spread over a longer time frame. h(x) in (4.7) and (4.8) reflects the spatial heterogeneity of online users with respect to distance x, whereas (4.4) assumes that h(x) is constant. (4.8) also takes into account the fact that the diffusion coefficient is a decreasing function of x, whereas it is a constant in both (4.4) and (4.7).

(a) Predicted (blue, solid) vs. actual data (red, dotted); x-axis: distance, y-axis: density
(b) h(x) versus distance

Figure 8: Accuracy of (4.8) for the photo tweeted by President Obama on Twitter

Distance    Average
1           98.39%
2           98.75%
3           94.11%
4           99.31%
Overall     97.64%

Figure 9: Accuracy of (4.8) at x = 1, 2, 3, 4 for the photo tweeted by President Obama

4.4 PDE model with Distance Based on Shared Interests

While we have used the number of friendship hops as a natural distance, an alternative is the interest distance, which measures the distance between two users through their shared interests in information or content in social networks. Given two social network users a and b, we use C_a and C_b to denote the sets of contents that users a and b have interacted with, respectively. In the context of the Digg social network, C_a represents all the news stories that user a has voted for (dugg). The interest distance d_{a,b} between users a and b is defined as:

$$d_{a,b} = 1 - \frac{|C_a \cap C_b|}{|C_a \cup C_b|} \tag{4.9}$$

where |C_a ∪ C_b| is the total number of contents that either user a or user b has interacted with and |C_a ∩ C_b| is the number of shared contents that both users a and b have interacted with. Essentially, the interest distance quantifies the degree of shared interests between two users. A piece of information originating from the source s is likely to influence users who have small interest distances to the source because of their shared interests. In a recent work [6] we introduced an effective algorithm to identify shared interests in online social networks. With this distance, all previous models can be modified to reflect how information flows from those who share more common interests to those who share fewer. We shall discuss this problem further in future work.
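The interest distance (4.9) is one minus the Jaccard similarity of the two content sets. A minimal Python sketch follows; the function name and the story identifiers are hypothetical.

```python
def interest_distance(contents_a, contents_b):
    """Interest distance (4.9) between two users, given the sets of contents
    each of them has interacted with."""
    union = contents_a | contents_b
    if not union:
        raise ValueError("interest distance is undefined when both sets are empty")
    return 1.0 - len(contents_a & contents_b) / len(union)

# Hypothetical sets of story identifiers voted on by two users
user_a = {"s1", "s2", "s3"}
user_b = {"s2", "s3", "s4", "s5"}
distance_ab = interest_distance(user_a, user_b)  # 2 shared of 5 total, so 1 - 2/5
```

The distance is 0 for users with identical content sets and 1 for users with no content in common, so unlike friendship hops it takes values in a continuous range.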

5 Modeling Complex Interactions

The information diffusion process over online social networks could be influenced by more complex interactions between users and information, e.g., information originating from multiple sources and competing information from different political campaigns. Myers et al. [11] (also see Guille [10]) study cooperation and competition in information diffusion in the Twitter network using a statistical model. In this section, we start with a number of simple models to illustrate how to incorporate complex interactions in the spatial-temporal setting. We are in the process of refining and validating these models with real datasets.

5.1 Multiple Information Sources

In an online social network, the same information could originate from multiple sources. Often, breaking news stories, emergency events, and controversial topics are initiated by a number of different news sources; for example, multiple users may tweet the final result of a sports game on Twitter. News originating from multiple sources often spreads faster and reaches more users. It is therefore of practical importance to understand the diffusion patterns of multi-source information in social media.

In a recent paper [3] by Peng and the authors, we use the linear diffusion model (4.7) to predict the information diffusion process of multi-source news spread in Digg. [3] studies the basic characteristics of the diffusion process of multi-source information. The distance metric in [3] is defined as the minimum of the shortest-path lengths between a user and the multiple sources. This definition reflects the fact that, while a user of social media could be influenced by a set of news sources via multiple different paths, the nearest source, i.e., the one with the fewest friendship hops to the user, has the highest probability of influencing the behavior of that user. The prediction accuracy of the linear diffusion model (4.7) for multiple news sources is validated with the same data set from Digg. Our experiments in [3] show that the model can describe the most representative news stories initiated from multiple sources with an accuracy higher than 90%, and can achieve an average accuracy of around 75% across all multi-source news stories in the data set. These results suggest that our approach with reaction-diffusion equations is able to describe and predict the spreading patterns of multi-source information.
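The multi-source distance described above, the number of friendship hops to the nearest source, can be computed with a single breadth-first search started from all sources at once. The Python sketch below is an illustration rather than the procedure used in [3]; the graph representation (a dictionary mapping each user to the users who receive information from them) and the user names are hypothetical.

```python
from collections import deque

def multi_source_distance(graph, sources):
    """Friendship-hop distance from each reachable user to the nearest source."""
    distance = {source: 0 for source in sources}
    queue = deque(sources)
    while queue:
        user = queue.popleft()
        for neighbor in graph.get(user, ()):
            if neighbor not in distance:   # first visit gives the shortest distance
                distance[neighbor] = distance[user] + 1
                queue.append(neighbor)
    return distance

# Hypothetical follower graph with two sources, "a" and "b"
graph = {"a": ["c"], "b": ["d"], "c": ["d", "e"], "d": ["f"], "e": ["f"]}
hops = multi_source_distance(graph, ["a", "b"])
```

Here user "d" is two hops from source "a" but one hop from source "b", so its multi-source distance is 1. Grouping users by this distance yields the groups U_x on which the density I(x, t) is defined.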
The effect of multiple sources can also be viewed as mutualism in biological systems, where two or more organisms of different species interact in a relationship in which each individual derives a fitness benefit (i.e., increased or improved reproductive output). Such interactions in biological systems are known as cooperation systems [28]. The number of influenced users at any location and time can depend on multiple distances and times. Therefore dynamical equations can predict how information spreads in social media with interactions between multiple directions and channels. If we consider two news sources, then each of the two pieces of news spreads with logistic growth, along with an additional effect from the other. Let u_i, i = 1, 2, be the density of users influenced by the news from the i-th source. The positive effects can be modeled by the terms α_1 u_1 u_2 and α_2 u_1 u_2. As a result, the following simple model can be a starting point to address the interaction between multiple sources:

$$\begin{aligned}
\frac{\partial u_1}{\partial t} &= d_1\frac{\partial^2 u_1}{\partial x^2} + r_1(t)u_1\left(1 - \frac{u_1}{K_1}\right) + \alpha_1 u_1 u_2,\\
\frac{\partial u_2}{\partial t} &= d_2\frac{\partial^2 u_2}{\partial x^2} + r_2(t)u_2\left(1 - \frac{u_2}{K_2}\right) + \alpha_2 u_1 u_2,
\end{aligned} \tag{5.1}$$

where
• d_1, d_2 represent the popularity of the two pieces of information;

• r_i(t), i = 1, 2 represents the intrinsic growth rate of influenced users with the same distance, and measures how fast the information spreads within the user groups with the same distance;

• K_i, i = 1, 2 represents the carrying capacity, which is the maximum possible density of influenced users;

• α_1 measures the positive effect of news u_2 on u_1 and α_2 measures the positive effect of news u_1 on u_2.

In addition to studying solutions of (5.1) along with appropriate boundary and initial conditions, we are also interested in how fast information spreads in online social networks with multiple sources. We will discuss this in Section 6.4.1, where we assume that the underlying domain is from −∞ to ∞.

5.2 Competing Information

It is common to observe competing information from different political or product campaigns, where each piece of information competes with the others to maximize its own influence on people through online social networks. To characterize the density of users influenced by each competing piece of news, we may study competing news with the Lotka-Volterra competition models [28]. Competition models have been extensively studied in mathematical biology [28], where interactions between organisms or species can lower the presence of another due to limited resources (such as food, water, and territory). A simple mathematical model to describe news competition may be the same as equation (5.1), except that the interaction terms enter with a negative sign:

$$\begin{aligned}
\frac{\partial u_1}{\partial t} &= d_1\frac{\partial^2 u_1}{\partial x^2} + r_1(t)u_1\left(1 - \frac{u_1}{K_1}\right) - \alpha_1 u_1 u_2,\\
\frac{\partial u_2}{\partial t} &= d_2\frac{\partial^2 u_2}{\partial x^2} + r_2(t)u_2\left(1 - \frac{u_2}{K_2}\right) - \alpha_2 u_1 u_2,
\end{aligned} \tag{5.2}$$

Here α_1 measures the competition effect of news u_2 on u_1 and α_2 measures the competition effect of news u_1 on u_2. Thus the model may be able to capture the logistic growth of each piece of information and also quantify the effect of the competition on the diffusion process. Some relevant and practical questions, such as which news can win the competition, can be answered by studying the competition models in Section 6.4.2.
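As a worked step under simplifying assumptions that are not made in the text (constant rates r_1, r_2 and spatially homogeneous densities, so that the diffusion terms vanish), setting the right-hand sides of (5.2) to zero with u_1, u_2 > 0 gives the conditions for a candidate coexistence state:

$$r_1\left(1 - \frac{u_1}{K_1}\right) = \alpha_1 u_2, \qquad r_2\left(1 - \frac{u_2}{K_2}\right) = \alpha_2 u_1$$

Solving this linear system yields

$$u_1^* = \frac{K_1 r_2\,(r_1 - \alpha_1 K_2)}{r_1 r_2 - \alpha_1\alpha_2 K_1 K_2}, \qquad u_2^* = \frac{K_2 r_1\,(r_2 - \alpha_2 K_1)}{r_1 r_2 - \alpha_1\alpha_2 K_1 K_2}$$

These values are meaningful only when the denominator is nonzero and both u_1^* and u_2^* are positive. Otherwise the only nonnegative steady states of this simplified system are (0, 0), (K_1, 0) and (0, K_2), which corresponds to one piece of news excluding the other.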

5.3 Spatial Epidemiological Models

It is commonly accepted that dynamical models of disease spreading can be used to study information diffusion. There is a considerable body of research that models information diffusion with techniques for modeling infectious diseases [29]. In general, epidemiological models are neither cooperative nor competitive and are more challenging to study. In principle, the nodes of a network are classified into several classes (i.e., states), and the models focus on the evolution of the proportions of nodes in each class. SI and SIR are the two basic models, where S stands for susceptible, I for infected (i.e., having adopted the information) and R for recovered (i.e., refractory). In both cases, nodes in the S class switch to the I class due to the influence of their neighbor nodes. In the SI model, nodes then remain in the I class (in the related SIS model they switch back to the S class), whereas in the SIR model they permanently switch to the R class. The percentage of nodes in each class is expressed by simple differential equations. Both models assume that every node has the same probability of being connected to any other, and thus connections inside the population are made at random.

However, most of the work on social media has concentrated on collective analysis and involves only ordinary differential equations. With the new concept of a metric between users that we introduced in our recent papers, spatial effects can be incorporated and partial differential equations come into play. In particular, spatial models take the external influence into consideration. Many similar concepts and models in epidemiology can be further modified and expanded to study information diffusion in online social networks. For social media, S represents the density of susceptible users at time t and distance x in U_x, and I represents the density of influenced users at time t and distance x in U_x. The following SI model is a simple example of how a spatial infectious disease model can be used to study online social networks.

$$\begin{aligned}
\frac{\partial S}{\partial t} &= d_1\frac{\partial^2 S}{\partial x^2} - r(t)\frac{SI}{S + I},\\
\frac{\partial I}{\partial t} &= d_2\frac{\partial^2 I}{\partial x^2} + r(t)\frac{SI}{S + I}
\end{aligned} \tag{5.3}$$

where r(t) is the rate of influence. S + I appears in the denominator because it is not necessarily constant in spatial models. One important concept for describing the interactions between user groups is the rate of influence r(t), which is similar to the force of infection in epidemiology. The choice of the rate of influence largely depends on news and user classifications. Data mining techniques and graphical models can significantly improve the selection of the parameters. We shall determine the parameters for news from Twitter in future work.

5.4 Multiple Communication Channels

Most of the research on information diffusion is limited to studying diffusion on isolated single media sites or with respect to single factors such as friendship hops. However, many online social networks involve multiple connections between any pair of nodes (such as friendship, family relationship, geographic location and other demographic characteristics). In addition, news propagation on one social media site often affects that on another, for example, Facebook vs. Twitter. A theoretical framework for understanding information flow in online social networks over multiple communication channels is still missing. With the framework for modeling information diffusion in social media with PDE models in [1, 2], it is possible to use partial differential equations defined on multidimensional domains to analyze, characterize and predict the spatial-temporal dynamics of information diffusion with multiple communication channels. The governing partial differential equations are systems of reaction-diffusion equations defined on multidimensional domains. For example, with almost the same setting as in (4.4), a simple diffusive logistic equation defined on a multidimensional domain Ω may be able to describe information diffusion over social media with multiple communication channels:

$$\begin{aligned}
&\frac{\partial I}{\partial t} = d\Delta I + rI\left(1 - \frac{I}{K}\right),\\
&I(x, 1) = \phi(x), \quad x \in \Omega,\\
&\frac{\partial I}{\partial n} = 0 \quad \text{on } \partial\Omega \times (1, \infty)
\end{aligned} \tag{5.4}$$

Here ∆ is the Laplacian operator defined on the multidimensional domain Ω ⊂ R^n. The shape of Ω can be determined by the correlation of the communication channels. For example, if we are interested in studying a specific region with two unrelated channels, a rectangular region may be plausible to describe the combined effect of the two channels on the spreading of news in a social media site. Systems of equations such as (5.1) or (5.3) can also be defined on multidimensional domains to model information diffusion with multiple communication channels. We shall collect data from multiple social media sites to validate the model. The results can be used to help uncover the impact of multiple communication channels and multiple social media sites on information diffusion in online social networks.

6 Model Analysis and Discussion

We have presented a number of partial differential equation models to characterize spatial-temporal
patterns in social media. The spatial-temporal models are reaction-diffusion equation models built
on intuitive cyber-distance in online social networks. The basic mathematical properties such as
the existence, uniqueness and positivity of the solution of the models can be established from
the standard theorems for parabolic PDEs [38, 41]. Many other mathematical properties remain to be investigated. Mathematical analysis of the models can not only validate the models but also reveal insights into information flow over social media through the mathematical properties of the solutions of the PDE models. In this section, instead of giving technical details, we briefly highlight a number of interesting mathematical problems and their social implications.

6.1 Information Classification and Parameter Selection

The PDE models we discussed here express general principles governing information diffusion over social media. The type of information plays a significant role in the diffusion process. Online users are more likely to spread news they are interested in. Hence, it is necessary to differentiate and classify types of information. As the first step, we select the parameters that maximize the accuracy of the PDE models in comparison with real datasets. The chosen parameters can then be used to classify the information. Once we are able to classify or categorize information, the PDE models can be used to predict how information spreads by solving them with parameters appropriate to the category of information. Graphical models and data mining techniques will be utilized to categorize the information that spreads over online social networks. Mathematical studies such as stability and bifurcation analysis can determine parameter ranges or cut-off points at which information spreading patterns preserve predictable trajectories or undergo sudden changes. As we can see from the next section, one important mathematical concept, the positive principal eigenvalue of PDE models, plays a pivotal role in determining the stability and bifurcation of the model below. As such, mathematical analysis of the PDE models will solidify the foundation for information classification and parameter selection.

6.2 Stability and Bifurcation

Some information can spread in social media for a long period of time, even years [7]. The simulations with real data collected from social media suggest that our models can achieve high accuracy over an extended time frame, such as more than 40 hours. We are interested in how the size and structure of the underlying network influence the asymptotic behavior of news spreading, the coexistence of multiple competing news stories, and the decay rate of news over an extended time frame. Stability analysis of the models can predict trajectories of information flow over social media under small perturbations of initial conditions. Bifurcation analysis of the models can reveal how changes in parameter values may cause qualitative or topological changes in the behavior of the models.

Long-term behavior of the solutions of the model may depend on their corresponding elliptic equations and on the parameters of the reaction-diffusion equations, reflecting the popularity or spreadability of news in social media. In the context of spatial ecology and physics, extensive research has been done on eigenvalue analysis, stability, effects of boundary conditions, and the evolution of dispersal.
[38, 39] and references therein include some recent developments in spatial ecology. In a recent
paper [5], Dai, Ma and the first author study stability and bifurcation of the following model
arising from online social networks

 ut − (a(x)ux )x = λr(t)u h(x) − Ku ,

t > 1, l < x < L,
u(1, x) = u0 (x), l ≤ x ≤ L, (6.1)
ux (t, l) = 0, ux (t, L) + αu(t, L) = 0, t > 1,


where $a(x) = d e^{-bx}$, $\alpha$, $d$, $b$ are positive constants, $h(x)$ is positive, and $r(t)$ is a decay function
approaching $r_\infty$ as $t \to \infty$. The parameter $\lambda$ can be interpreted as a scaling factor of $r(t)$. The
Robin boundary condition at $x = L$ reflects the fact that there is an exchange of information at
the boundary. For $\alpha > 0$, the flux $-u_x(t, L) = \alpha u(t, L)$ is positive whenever $u(t, L) > 0$, and therefore information
flows to the right. The simulations with a real data set from Digg suggest that the Robin boundary
condition can also achieve high accuracy.
A steady-state solution of (6.1) satisfies
\[
\begin{cases}
-(a(x)u')' = \lambda r_\infty\,u\left(h(x) - \dfrac{u}{K}\right), & l < x < L,\\[4pt]
u'(l) = 0,\quad u'(L) + \alpha u(L) = 0.
\end{cases}
\tag{6.2}
\]
[5] studies the global bifurcation and stability of (6.1) and its implications for social media. Consider
the following associated eigenvalue problem with the parameter $\mu$:
\[
\begin{cases}
-(a(x)u')' = \mu h(x)u, & l < x < L,\\[2pt]
u'(l) = 0,\quad u'(L) + \alpha u(L) = 0.
\end{cases}
\tag{6.3}
\]
It is known that (6.3) has a positive eigenfunction and a principal eigenvalue $\mu_1^+$, which is determined by
\[
\frac{1}{\mu_1^+} = \max_{u \in W^{1,2}([l,L]),\ u \neq 0}
\left[ \frac{\int_l^L h(x)u^2\,dx}{\int_l^L |au'|^2\,dx + \alpha u^2(L)} \right].
\]

It is shown in [5] that (6.2) has an unbounded branch of positive solutions bifurcating from $(\mu_1^+/r_\infty, 0)$.
The bifurcation result for (6.1) can be derived from well-known general bifurcation results (see, e.g.,
[38]). It is also shown in [5] that if $\lambda > \mu_1^+/r_\infty$, then (6.1) has a positive steady-state solution which attracts
all its solutions as $t$ goes to infinity, while $\lambda < \mu_1^+/r_\infty$ implies that all its nonnegative solutions go to zero.
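The threshold $\mu_1^+/r_\infty$ can be estimated numerically. The following is a minimal sketch, not the method of [5]: it discretizes the eigenvalue problem (6.3) with linear finite elements and a lumped mass matrix, and returns the smallest eigenvalue. The function name `principal_eigenvalue`, the grid size `n`, and the choice of discretization are assumptions made here for illustration. The boundary term is taken from integrating (6.3) by parts, which produces the factor $a(L)\alpha$ at $x = L$.

```python
import numpy as np

def principal_eigenvalue(a, h, l, L, alpha, n=400):
    """Approximate the principal eigenvalue of (6.3) on [l, L].

    Discretizes the weak form
        int a u'v' dx + a(L)*alpha*u(L)*v(L) = mu * int h u v dx
    with linear finite elements. `a` and `h` must accept NumPy arrays,
    and h must be positive so that the lumped mass is positive.
    """
    x = np.linspace(l, L, n + 1)
    dx = (L - l) / n
    a_mid = a(0.5 * (x[:-1] + x[1:]))       # diffusion coefficient per cell
    A = np.zeros((n + 1, n + 1))
    for i in range(n):                       # assemble the stiffness matrix
        k = a_mid[i] / dx
        A[i, i] += k
        A[i + 1, i + 1] += k
        A[i, i + 1] -= k
        A[i + 1, i] -= k
    # Robin condition at x = L; the Neumann condition at x = l is natural.
    A[n, n] += float(a(np.array([L]))[0]) * alpha
    w = np.full(n + 1, dx)
    w[0] = w[-1] = 0.5 * dx                  # trapezoidal (lumped) weights
    m = h(x) * w                             # lumped mass entries
    S = A / np.sqrt(np.outer(m, m))          # M^{-1/2} A M^{-1/2}, symmetric
    return float(np.linalg.eigvalsh(S)[0])   # smallest eigenvalue
```

With hypothetical values such as `d, b = 0.01, 0.5`, one would call `principal_eigenvalue(lambda x: d * np.exp(-b * x), lambda x: np.ones_like(x), 1.0, 6.0, alpha)` and divide the result by $r_\infty$ to obtain the critical value of $\lambda$ separating persistence from extinction.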
Information diffusion, in particular news diffusion, over social media is often time-sensitive.
Reaction-diffusion equations arising from online social networks involve a decaying function $r(t)$.
Most of the existing research focuses on the cases where $r(t)$ is constant [38] or periodic [40]. Some
researchers also study eigenvalues, stability and persistence of nonautonomous parabolic PDEs [31,
32]. Mathematical analysis of the associated eigenvalue and bifurcation problems can help identify
thresholds for changes in social dynamics. Mathematically, results on eigenvalues, stability, bifurcation
and persistence of the reaction-diffusion equations have been obtained when $h(x)$ is positive, and there are
some interesting results and more challenging problems when $h(x)$ may take negative values [38, 39].
In the context of social media, $h(x)$ may be negative for some $x$, in particular when negative news
or spam is involved, as many online users will delete such content. With the availability of real data from
social media, we are in a position to study these challenging problems from both theoretical and
practical perspectives, identify conditions for stability and persistence, and, equally importantly, verify
the conditions through the real data sets collected from social media.
The study of spatial heterogeneity in information diffusion in social media has significant theoretical
and practical implications. For example, since $h(x)$ represents the adoption rate of information
for the group of users whose distance from the origin is $x$, the shape of $h(x)$ may help
locate the so-called most influential users or opinion leaders in social media. Other related
interesting problems include maximizing the total number of influenced users for certain classes of $h(x)$. These
issues are of interest as they have commercial potential and social implications. Numerous studies
on this issue have emerged in recent years to design efficient algorithms for detecting opinions from
corpora of data [10]. Our PDE models provide a new framework for designing detection algorithms by
studying mathematical properties of $h(x)$. As a result, recent theoretical developments in nonlinear
partial differential equations can facilitate research on and development of this important
social problem.

6.3 Free Boundary Value Problems

In our previous models (4.4), (4.7) and (4.8), a no-flux boundary condition is assumed, with the
understanding that no information flows across the boundaries at $x = l, L$. This might be reasonable when the
number of groups, or the maximum of $x$, is small. If the number of groups is large,
the boundary of influenced online users may change as news spreads. To describe the change of the
boundary with respect to time $t$, Lei, Lin and the first author [4] recently proposed and studied
the following free boundary model for the spreading of news in online social networks:

\[
\begin{cases}
u_t - d u_{xx} = r(t)\,u\left(1 - \dfrac{u}{K}\right), & t > 0,\ 0 < x < h(t),\\[4pt]
u_x(t, 0) = 0,\quad u(t, h(t)) = 0, & t > 0,\\[2pt]
h'(t) = -\mu u_x(t, h(t)), & t > 0,\\[2pt]
h(0) = h_0,\quad u(0, x) = u_0(x), & 0 \le x \le h_0,
\end{cases}
\tag{6.4}
\]

where the initial function $u_0(x)$ satisfies

\[
u_0 \in \Sigma(h_0) = \{\phi \in C^2([0, h_0]) : \phi'(0) = \phi(h_0) = 0 \text{ and } \phi > 0 \text{ in } [0, h_0)\}.
\tag{6.5}
\]

Here $x = h(t)$ is the moving boundary to be determined; it represents the spreading front of news (such
as a movie recommendation) among users. The equation $h'(t) = -\mu u_x(t, h(t))$ is the Stefan condition, where $\mu$
represents the diffusion ability of the information in the new area. Let $r_\infty = \lim_{t\to\infty} r(t) > 0$. It
is well known that Stefan conditions have been used in many areas involving phase transitions
in matter, such as ice melting into water, as well as in biological problems.
It was shown in [4] that the free boundary $x = h(t)$ is increasing. Further, it was shown that
the information either keeps traveling forever or stops spreading in finite time. In addition, the impact of
the initial condition of news on its spread over online social networks is discussed. Let $u_0 = \lambda\phi$
for some $\phi$ belonging to $\Sigma(h_0)$. It was shown in [4] that if $\lambda$ is sufficiently small, information
vanishing must occur. It was then shown that there exists a threshold $\lambda^*$, depending on
$\phi \in \Sigma(h_0)$, such that when $\lambda > \lambda^*$, the information with the initial data $u_0 = \lambda\phi$ spreads over the
whole distance. Otherwise, information vanishing happens.
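The role of the initial scale $\lambda$ can be explored numerically. The following is a minimal sketch under stated assumptions, not the analysis of [4]: it maps $0 < x < h(t)$ to the fixed interval $0 < y < 1$ via $y = x/h(t)$ and advances (6.4) with an explicit upwind scheme. The function name `simulate_free_boundary`, the initial profile $\phi(x) = \cos(\pi x / (2h_0))$ (which lies in $\Sigma(h_0)$), the decay function $r(t) = 1 + e^{-t}$, and all default parameter values are hypothetical choices for illustration.

```python
import numpy as np

def simulate_free_boundary(lam, h0=1.0, d=1.0, mu=1.0, K=1.0,
                           r=lambda t: 1.0 + np.exp(-t), T=1.0, n=40):
    """Explicit front-fixing scheme for the free boundary model (6.4).

    With y = x / h(t) and v(t, y) = u(t, x), the model becomes
        v_t = (d / h^2) v_yy + (y h' / h) v_y + r(t) v (1 - v / K),
        v_y(t, 0) = 0,  v(t, 1) = 0,  h' = -(mu / h) v_y(t, 1).
    Returns the final front position, the final profile, and the front history.
    """
    y = np.linspace(0.0, 1.0, n + 1)
    dy = 1.0 / n
    v = lam * np.cos(0.5 * np.pi * y)    # u0 = lam * phi, phi in Sigma(h0)
    v[-1] = 0.0                          # u(t, h(t)) = 0
    h, t = h0, 0.0
    fronts = [h]
    while T - t > 1e-12:
        dt = min(0.2 * dy**2 * h**2 / d, T - t)   # explicit stability limit
        dh = -mu * (v[-1] - v[-2]) / (dy * h)     # Stefan condition h'(t)
        left = np.concatenate(([v[1]], v[:-1]))   # ghost point: v_y(t, 0) = 0
        right = np.concatenate((v[1:], [0.0]))    # last entry is overwritten
        lap = (left - 2.0 * v + right) / dy**2
        adv = (y * dh / h) * (right - v) / dy     # upwind since y*h'/h >= 0
        v = v + dt * (d / h**2 * lap + adv + r(t) * v * (1.0 - v / K))
        v[-1] = 0.0
        h += dt * dh
        t += dt
        fronts.append(h)
    return h, v, np.array(fronts)
```

Running the function for a range of `lam` values and longer horizons `T` gives a rough picture of whether the front `h` keeps growing or stalls; the sketch does not by itself locate the threshold $\lambda^*$.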
Finally, if information spreading happens, the expanding news front $x = h(t)$ moves at a
constant speed $k_0$ for large time. It is shown in [4] that the following relation holds:

\[
\lim_{\mu K / d \to \infty} \frac{k_0}{\sqrt{r_\infty d}} = 2.
\tag{6.6}
\]

(6.6) indicates that the asymptotic traveling speed $k_0$ is close to $2\sqrt{r_\infty d}$ when $\mu K/d$ is large. This value is also the
minimum speed for the Fisher's equation associated with (6.4), as we shall discuss in depth in Section 6.4.
The asymptotic traveling speeds of news fronts from free boundary problems and the minimum
speeds from traveling wave solutions in Section 6.4 can provide a theoretical guide for how to
maximize or control information propagation in online social networks. Several free boundary
value problems related to (4.8) remain to be studied mathematically. For example, information
diffusion with multiple channels can give rise to the partial differential equation (5.4) defined in more
complex domains. In such a setting, it is interesting to investigate how the news front $h(t)$ changes in
different directions.

6.4 Traveling Wave Solutions and Spreading Speeds

In general, news diffusion in social media is time-sensitive and the influence of news decays drastically
as time elapses. However, some information can take a longer period of time to spread
in social media. Cha et al. [7] examined the aggregate growth patterns of two sets of 5,346 and
897 photos in the Flickr social network (http://www.flickr.com/) that are older than 1 year and
2 years, respectively. It is found in [7] that the long-term trend in the number of cumulative fans
exhibits a pattern of steady linear growth. Many photos show a quick rise in popularity during the
first few days after being uploaded. However, most of the pictures exhibit a period of steady linear
growth after the first few (10-20) days. More importantly, the linear growth is sustained over
extended periods of time. The growth rate continues to increase even after 1 or 2 years. Thus, for
long-term propagation of information, we may choose distance metrics in such a way that online users
can be embedded in the whole $x$-axis and the source of information can be viewed as coming from either
$-\infty$ or $\infty$. Further, the parameters may be chosen to be independent of time $t$. As such, it is
meaningful to discuss long-time behavior and traveling wave solutions of the reaction-diffusion systems
for information diffusion in online social networks. A traveling wave solution often represents a
transition process connecting two steady states of interacting populations. Traveling wave fronts
of partial differential equations are solutions of the form $u(x + ct)$ that have a fixed shape and
translate at a constant speed $c$ as time evolves. In biology, the wave speed $c$ is interpreted as the rate of
spread of the introduced population. The theoretical results on traveling wave solutions
of reaction-diffusion equations have successfully predicted spread rates of some introduced species.
For the long-time behavior and spatial spread of an advantageous gene in a population, Fisher
[18] and Kolmogorov, Petrovsky, and Piscounov [17] studied the nonlinear parabolic equation

\[
u_t = d u_{xx} + f(u),
\tag{6.7}
\]

where $u(x, t)$ represents the population density at location $x$ and time $t$, $f(0) = f(1) = 0$, and
$f(u) > 0$ for $0 < u < 1$ with no Allee effect. Traveling wave fronts of (6.7) are of interest since they enable us
to better understand how a population propagates. It was shown that (6.7) has a traveling wave
solution of the form $u(x + ct)$ if and only if $|c| \ge c^*$, and the minimum speed of propagation for
(6.7) is $c^*$, where
\[
c^* = 2\sqrt{f'(0)\,d}.
\]
This basic formula $c^* = 2\sqrt{f'(0)\,d}$ establishes the spreading speed for nonlinear parabolic equations
and indicates that the distance spread is a linear function of time and that the rate of spread can be predicted
quantitatively as a function of measurable life history parameters.
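
The threshold $c^*$ can be recovered by a short linearization argument. The following is the standard heuristic, sketched here under the assumption $f'(0) > 0$; it is not necessarily the argument used in [17, 18]. A nonnegative front must approach $0$ without oscillating, which requires the exponent $\lambda$ below to be real.

```latex
\begin{align*}
u(x,t) = U(\xi),\ \xi = x + ct
  &\;\Longrightarrow\; d\,U'' - c\,U' + f(U) = 0, \\
U(\xi) \approx e^{\lambda \xi}\ \text{near } U = 0
  &\;\Longrightarrow\; d\lambda^2 - c\lambda + f'(0) = 0, \\
\lambda = \frac{c \pm \sqrt{c^2 - 4 d f'(0)}}{2d} \in \mathbb{R}
  &\iff c^2 \ge 4 d f'(0)
  \iff |c| \ge 2\sqrt{f'(0)\,d} = c^*.
\end{align*}
```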
In spatial biology and epidemiology, it is of great interest to estimate how fast a species or
infectious disease spreads within a population. Building on the mathematical foundation for the
theory of spreading speeds for cooperative systems by Weinberger et al. [14], the first author [16]
discussed spreading speeds for a large class of systems of reaction-diffusion equations which are
not necessarily cooperative, through analysis of traveling waves via the convergence of initial data
to wave solutions. In particular, [16] provides a practical approach to calculating the propagation
speed based on the eigenvalues of the parameterized Jacobian matrix of the linearized system at
the initial state. Here we follow the direct derivation in [16] from the perspective of traveling wave
solutions. Let us consider a system of reaction-diffusion equations with the zero equilibrium and another positive
equilibrium:
\[
\mathbf{u}_t = D\mathbf{u}_{xx} + \mathbf{f}(\mathbf{u}) \quad \text{for } x \in \mathbb{R},\ t \ge 0,
\tag{6.8}
\]
where $\mathbf{u} = (u_i)$, $D = \mathrm{diag}(d_1, d_2, \ldots, d_N)$, $d_i > 0$ for $i = 1, \ldots, N$, and

\[
\mathbf{f}(\mathbf{u}) = (f_1(\mathbf{u}), f_2(\mathbf{u}), \ldots, f_N(\mathbf{u})).
\]

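We are looking for a traveling wave solution $\mathbf{u}$ of (6.8) of the form $\mathbf{u} = \mathbf{u}(x + ct)$, $\mathbf{u} \in C(\mathbb{R}, \mathbb{R}^N)$,
with speed $c$. Substituting $\mathbf{u}(x, t) = \mathbf{u}(x + ct)$ into (6.8) and letting $\xi = x + ct$, we obtain the
wave equation
\[
D\mathbf{u}''(\xi) - c\,\mathbf{u}'(\xi) + \mathbf{f}(\mathbf{u}(\xi)) = 0 \quad \text{for } \xi \in \mathbb{R}.
\tag{6.9}
\]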

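Now if we look for a solution of the form $(u_i) = e^{\lambda\xi}\eta_\lambda^i$, $\lambda > 0$, $\eta_\lambda = (\eta_\lambda^i) \gg 0$, for the linearization
of (6.9) at the initial equilibrium at the origin, we arrive at the following system

\[
\mathrm{diag}(d_i\lambda^2 - c\lambda)\,\eta_\lambda + \mathbf{f}'(0)\,\eta_\lambda = 0,
\]

which can be rewritten as the following eigenvalue problem

\[
\frac{1}{\lambda} A_\lambda \eta_\lambda = c\,\eta_\lambda,
\tag{6.10}
\]

where

\[
A_\lambda = (a_\lambda^{i,j}) = \mathrm{diag}(d_i\lambda^2) + \mathbf{f}'(0).
\]

Let $\Psi(A_\lambda)$ be the spectral radius of $A_\lambda$ for $\lambda \in [0, \infty)$, and

\[
\Phi(\lambda) = \frac{1}{\lambda}\Psi(A_\lambda) > 0.
\]

In [16], under the assumption that $\mathbf{f}'(0)$ has nonnegative off-diagonal elements and some other conditions,
it was shown that $\Phi(\lambda)$ is a convex-like function and that $\Phi(\lambda)$ goes to $\infty$ at both $0$ and $\infty$.
Therefore $\Phi(\lambda)$ attains its minimum over the domain $(0, \infty)$, that is,

\[
c^* = \inf_{\lambda > 0} \Phi(\lambda) > 0.
\tag{6.11}
\]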

The value of $c^*$ reflects the speed of information propagation within a population. $c^*$ is often called the
minimum speed for systems that are linearly determinate. However, it is a challenging mathematical
problem to prove that a system is linearly determinate, in particular for non-cooperative systems.
It is known that cooperative systems and a few other types of systems are linearly determinate
[14, 16]. In the next three subsections we will discuss spreading speeds of systems for multiple
sources, competing information and epidemiological processes. We are in the process of validating the
theoretical results with real data. Nevertheless, these results can serve as a starting point to quantify
information diffusion in online social networks.
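
Formula (6.11) can be evaluated numerically for a given diffusion vector and Jacobian. The following is a minimal sketch, not the procedure of [16]: it takes $\Psi(A_\lambda)$ to be the largest real part of the eigenvalues of $A_\lambda$ (which is the principal eigenvalue when the off-diagonal entries are nonnegative), scans $\Phi$ on a logarithmic grid, and refines the minimizer by golden-section search. The function names `phi` and `min_wave_speed`, the search bounds, and the assumption that $\Phi$ has a single minimum inside those bounds are choices made here for illustration.

```python
import numpy as np

def phi(lam, d, J):
    """Phi(lam) = Psi(A_lam) / lam with A_lam = diag(d_i * lam^2) + J."""
    A = np.diag(np.asarray(d, dtype=float) * lam**2) + np.asarray(J, dtype=float)
    return float(np.max(np.linalg.eigvals(A).real)) / lam

def min_wave_speed(d, J, lam_lo=1e-3, lam_hi=1e3, grid=400, iters=60):
    """Approximate c* = inf_{lam > 0} Phi(lam); returns (c_star, lam_star)."""
    s = np.linspace(np.log(lam_lo), np.log(lam_hi), grid)   # s = log(lam)
    vals = [phi(np.exp(si), d, J) for si in s]
    k = int(np.argmin(vals))
    a, b = s[max(k - 1, 0)], s[min(k + 1, grid - 1)]        # bracket the minimum
    g = (np.sqrt(5.0) - 1.0) / 2.0
    for _ in range(iters):                                   # golden-section search
        c1, c2 = b - g * (b - a), a + g * (b - a)
        if phi(np.exp(c1), d, J) < phi(np.exp(c2), d, J):
            b = c2
        else:
            a = c1
    lam_star = float(np.exp(0.5 * (a + b)))
    return phi(lam_star, d, J), lam_star
```

For the scalar equation (6.7) with $d = 1$ and $f'(0) = 1$, the call `min_wave_speed([1.0], [[1.0]])` reproduces the Fisher value $2\sqrt{f'(0)d} = 2$ up to numerical tolerance.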

6.4.1 Propagation Speeds for Multiple Sources

As discussed above, certain information, such as photos in the Flickr social network, can take a
long period of time to spread and exhibits a pattern of steady linear growth. In this case, we can
assume that the coefficients in (5.1) are all positive constants. (5.1) is a cooperative system since its
Jacobian
\[
\begin{pmatrix}
r_1 - 2\dfrac{r_1 u_1}{k_1} + \alpha_1 u_2 & \alpha_1 u_1 \\[6pt]
\alpha_2 u_2 & r_2 - 2\dfrac{r_2 u_2}{k_2} + \alpha_2 u_1
\end{pmatrix}
\]
has nonnegative off-diagonal elements for $u_1, u_2 \ge 0$. Now we can use (6.11) to calculate the
minimum speed of information propagation in social media over a longer period of time, where $r_i$, $\alpha_i$
are all positive constants. We are interested in a transition process connecting the two equilibria $(0, 0)$
and $(e_1, e_2)$ of (5.1), where
\[
e_1 = \frac{k_1 r_2 (\alpha_1 k_2 + r_1)}{r_1 r_2 - \alpha_1 \alpha_2 k_1 k_2}, \qquad
e_2 = \frac{k_2 r_1 (\alpha_2 k_1 + r_2)}{r_1 r_2 - \alpha_1 \alpha_2 k_1 k_2}.
\]

We assume that
\[
r_1 r_2 - \alpha_1 \alpha_2 k_1 k_2 > 0
\tag{6.12}
\]
and therefore $e_1, e_2 > 0$. As a result, we can apply the theoretical results for cooperative systems
in [14, 16] to calculate the minimum speed of (5.1) for information propagation from $(0, 0)$ to
$(e_1, e_2)$. It was shown in [14, 16] that there is a traveling wave solution connecting $(0, 0)$ and $(e_1, e_2)$
and that the minimum speed of the information propagation can be calculated by the formula (6.11).
For simplicity, assume that $d_1 \ge d_2$ and $r_1 \ge r_2$. Now it is easy to calculate that the Jacobian
of (5.1) at $(0, 0)$ is
\[
\begin{pmatrix} r_1 & 0 \\ 0 & r_2 \end{pmatrix}.
\]
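For $\lambda \ge 0$, the largest eigenvalue $\Psi(A_\lambda)$ of the matrix

\[
\begin{pmatrix} d_1\lambda^2 + r_1 & 0 \\ 0 & d_2\lambda^2 + r_2 \end{pmatrix}
\]

is $d_1\lambda^2 + r_1$. Therefore
\[
\Phi(\lambda) = \frac{1}{\lambda}\Psi(A_\lambda) = \frac{d_1\lambda^2 + r_1}{\lambda}, \qquad \lambda > 0.
\]
In view of (6.11), a standard calculation shows that the propagation speed for (5.1) is
\[
c^* = \inf_{\lambda > 0} \frac{d_1\lambda^2 + r_1}{\lambda} = 2\sqrt{d_1 r_1}.
\]

On the other hand, if $d_2 \ge d_1$ and $r_2 \ge r_1$, the propagation speed for (5.1) is

\[
c^* = 2\sqrt{d_2 r_2}.
\]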

This indicates that if (6.12) holds, that is, if the effect of the interaction between the two sources is not too
large, the propagation speed for multiple information sources is largely determined by the more
popular source.
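
The quantities in this subsection can be checked with a few lines of code. The following is a minimal sketch: since (5.1) is not reproduced in this section, the reaction terms $r_i u_i (1 - u_i/k_i) + \alpha_i u_1 u_2$ used for the residual check are inferred from the Jacobian shown above and should be read as an assumption. The function name `two_source_summary` and the sample parameter values in the usage note are hypothetical.

```python
import math

def two_source_summary(r1, r2, k1, k2, a1, a2, d1, d2):
    """Equilibrium (e1, e2), reaction residuals at it, and speed c* for (5.1)."""
    denom = r1 * r2 - a1 * a2 * k1 * k2
    if denom <= 0:
        raise ValueError("condition (6.12) fails: r1*r2 - a1*a2*k1*k2 must be > 0")
    e1 = k1 * r2 * (a1 * k2 + r1) / denom
    e2 = k2 * r1 * (a2 * k1 + r2) / denom
    # Reaction terms inferred from the Jacobian of (5.1) (assumption).
    res1 = r1 * e1 * (1.0 - e1 / k1) + a1 * e1 * e2
    res2 = r2 * e2 * (1.0 - e2 / k2) + a2 * e1 * e2
    if d1 >= d2 and r1 >= r2:
        c_star = 2.0 * math.sqrt(d1 * r1)
    elif d2 >= d1 and r2 >= r1:
        c_star = 2.0 * math.sqrt(d2 * r2)
    else:
        c_star = None   # mixed ordering: not covered by the two cases above
    return (e1, e2), (res1, res2), c_star
```

For example, `two_source_summary(1.0, 0.5, 1.0, 1.0, 0.1, 0.1, 2.0, 1.0)` satisfies (6.12), and both residuals vanish up to rounding, which confirms that $(e_1, e_2)$ is an equilibrium of the assumed reaction terms.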

6.4.2 Propagation Speeds for Competing Information

If two pieces of information compete with each other to maximize their own influence on
online social networks, (5.2) may be used to model the interaction. We assume that all coefficients
are positive constants as we focus on long-term behavior. (5.2) is not a cooperative system.
However, we are interested in a transition process connecting the two equilibria $(k_1, 0)$ and $(0, k_2)$ of
(5.2). (5.2) can be brought into a cooperative system by the transformation $v_1 = u_1$ and $v_2 = k_2 - u_2$:

\[
\begin{aligned}
\frac{\partial v_1}{\partial t} &= d_1 \frac{\partial^2 v_1}{\partial x^2} + r_1 v_1\left(1 - \frac{v_1}{k_1}\right) - \alpha_1 v_1 (k_2 - v_2), \\
\frac{\partial v_2}{\partial t} &= d_2 \frac{\partial^2 v_2}{\partial x^2} - r_2 (k_2 - v_2)\frac{v_2}{k_2} + \alpha_2 v_1 (k_2 - v_2).
\end{aligned}
\tag{6.13}
\]

The Jacobian of (6.13),

\[
\begin{pmatrix}
r_1 - 2\dfrac{r_1 v_1}{k_1} - \alpha_1 (k_2 - v_2) & \alpha_1 v_1 \\[6pt]
\alpha_2 (k_2 - v_2) & -r_2 + 2\dfrac{r_2 v_2}{k_2} - \alpha_2 v_1
\end{pmatrix},
\]

has nonnegative off-diagonal elements for $v_1, v_2 \ge 0$ and $v_2 \le k_2$. We assume that

\[
r_1 > \alpha_1 k_2
\tag{6.14}
\]

to ensure that the growth of $u_1$ is sustained even with the competition from $u_2$. We also assume that
$d_1 \ge d_2$, that is, information $u_1$ is not less popular than information $u_2$.
We are interested in a transition process connecting the two equilibria $(0, 0)$ and $(k_1, k_2)$ of (6.13),
which correspond to the two equilibria $(0, k_2)$ and $(k_1, 0)$ of (5.2). Again we can apply the theoretical
results for cooperative systems in [14, 16] to calculate the minimum speed of (6.13) for information
propagation from $(0, 0)$ to $(k_1, k_2)$. It was shown in [14, 16] that there is a traveling wave solution
of (6.13) connecting its two equilibria $(0, 0)$ and $(k_1, k_2)$ and that the minimum speed of the information
propagation can be calculated by the formula (6.11). Now it is easy to calculate that the Jacobian
of (6.13) at $(0, 0)$ is
\[
\begin{pmatrix} r_1 - \alpha_1 k_2 & 0 \\ \alpha_2 k_2 & -r_2 \end{pmatrix}.
\]
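For $\lambda \ge 0$, the largest eigenvalue $\Psi(A_\lambda)$ of the matrix

\[
\begin{pmatrix} d_1\lambda^2 + r_1 - \alpha_1 k_2 & 0 \\ \alpha_2 k_2 & d_2\lambda^2 - r_2 \end{pmatrix}
\]

is $d_1\lambda^2 + r_1 - \alpha_1 k_2$. Therefore
\[
\Phi(\lambda) = \frac{1}{\lambda}\Psi(A_\lambda) = \frac{d_1\lambda^2 + r_1 - \alpha_1 k_2}{\lambda}, \qquad \lambda > 0.
\]
In view of (6.11), a standard calculation shows that the propagation speed for (6.13) is
\[
c^* = \inf_{\lambda > 0} \frac{d_1\lambda^2 + r_1 - \alpha_1 k_2}{\lambda} = 2\sqrt{d_1 (r_1 - \alpha_1 k_2)}.
\]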

The conclusion indicates that information $u_1$ will win the competition if the growth of $u_1$ is sustained
even with the competition from $u_2$ and information $u_1$ is not less popular than information $u_2$.
The propagation speed of the information is largely determined by the popularity and growth of
the winner minus the negative effect of the competition.
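
The "standard calculation" behind this speed can be written out explicitly. The following worked minimization assumes only (6.14), so that $r_1 - \alpha_1 k_2 > 0$ and $\Phi$ is strictly convex on $(0, \infty)$; the symbol $\lambda^*$ for the minimizer is introduced here for illustration.

```latex
\begin{align*}
\Phi(\lambda) &= d_1\lambda + \frac{r_1 - \alpha_1 k_2}{\lambda}, \qquad \lambda > 0, \\
\Phi'(\lambda) &= d_1 - \frac{r_1 - \alpha_1 k_2}{\lambda^2} = 0
  \;\Longrightarrow\; \lambda^* = \sqrt{\frac{r_1 - \alpha_1 k_2}{d_1}}, \\
\Phi''(\lambda) &= \frac{2 (r_1 - \alpha_1 k_2)}{\lambda^3} > 0, \\
c^* &= \Phi(\lambda^*) = 2\sqrt{d_1 (r_1 - \alpha_1 k_2)}.
\end{align*}
```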

6.4.3 Propagation Speeds for Spatial Epidemiological Models

(6.15) below is a non-cooperative system. In general, it is still an open question whether it is linearly
determinate. Nevertheless, for (6.15), we can show that $c^*$ is the cut-off point for the existence of
traveling wave solutions. Here we would like to examine the traveling wave solutions and information
adoption rate of a spatial SIR model for information diffusion in online social networks in a closed
population consisting of susceptible individuals ($S(t)$), infected individuals ($I(t)$) (i.e., those who have adopted the
information) and removed individuals ($R(t)$) (i.e., refractory). The diffusive SIR model with the
standard incidence takes the following form:

\[
\begin{aligned}
\partial_t S &= d_1 \partial_{xx} S - \beta S I/(S + I), \\
\partial_t I &= d_2 \partial_{xx} I + \beta S I/(S + I) - \gamma I, \\
\partial_t R &= d_3 \partial_{xx} R + \gamma I.
\end{aligned}
\tag{6.15}
\]

Here $\gamma$ is the removal rate of the infected group and $\beta$ is the adoption (or influence) rate between
the susceptible and infectious groups. $d_1, d_2, d_3 > 0$ represent the popularity of information within
each of the groups. In general, the information is more popular for the $I$ group than for the $S$ group
and, therefore, it is assumed that $d_2 \ge d_1$. For the long-term propagation of information in
online social networks, it is understood that the adoption rate $\beta$ can be constant. (6.15) is an
extension of the SI model (5.3) with the refractory group $R$. A traveling wave solution of (6.15)
of the form $(S(x + ct), I(x + ct), R(x + ct))$ represents the transition process of information
diffusion from the initial adoption-free equilibrium $(S_{-\infty}, 0, R_{-\infty})$ to another adoption-free state
$(S_\infty, 0, R_\infty)$, with $S_\infty$ being determined by the influence rate $\beta$ and the removal rate $\gamma$, as well as
possibly the popularity of information. As such, it is important to determine whether traveling
waves exist and what the propagation speed $c$ is. Thus we shall look for traveling wave solutions
of the form $(S(x + ct), I(x + ct), R(x + ct))$. Because $R$ does not appear in the equations
for the susceptible individuals $S$ and infected individuals $I$, we omit the $R$ equation and study the
following system with $S$ and $I$ only:

\[
\begin{aligned}
\partial_t S &= d_1 \partial_{xx} S - \beta S I/(S + I), \\
\partial_t I &= d_2 \partial_{xx} I + \beta S I/(S + I) - \gamma I,
\end{aligned}
\tag{6.16}
\]

whose traveling wave solutions are required to satisfy the following boundary conditions at infinity:

\[
S(-\infty) = S_{-\infty}, \quad S(\infty) < S_{-\infty}, \quad I(\pm\infty) = 0.
\tag{6.17}
\]

In the context of infectious disease, Wang, the first author and Wu [15] studied the traveling waves
and propagation speed of (6.16). The result of [15] is applicable to information diffusion in online
social networks. For (6.16), the nonlinearity $\mathbf{f}$ in (6.8) is no longer cooperative and some of the
off-diagonal elements of $\mathbf{f}'$ may be negative. It is still an open question what additional conditions
would guarantee that $\Phi(\lambda)$ maintains the convex-like property. However, for (6.16), $\Phi(\lambda)$ is a
convex function and we note that the minimum wave speed can be obtained from its linearization at
the initial state $(S_{-\infty}, 0)$. In fact, it is easy to calculate that the Jacobian of (6.16) at $(S_{-\infty}, 0)$ is

\[
\begin{pmatrix} 0 & -\beta \\ 0 & \beta - \gamma \end{pmatrix}.
\]

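Its largest eigenvalue is $\beta - \gamma$ (for $\beta > \gamma$). For $\lambda \ge 0$ and $d_2 \ge d_1$, the largest eigenvalue $\Psi(A_\lambda)$ of the matrix

\[
\begin{pmatrix} d_1\lambda^2 & -\beta \\ 0 & d_2\lambda^2 + \beta - \gamma \end{pmatrix}
\]

is $d_2\lambda^2 + \beta - \gamma$. Therefore

\[
\Phi(\lambda) = \frac{1}{\lambda}\Psi(A_\lambda) = \frac{d_2\lambda^2 + \beta - \gamma}{\lambda}, \qquad \lambda > 0.
\]
In view of (6.11), a standard calculation shows that the wave speed for (6.16) is
\[
c^* = \inf_{\lambda > 0} \frac{d_2\lambda^2 + \beta - \gamma}{\lambda} = 2\sqrt{d_2(\beta - \gamma)}.
\]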
In addition, [15] shows that $c^* = 2\sqrt{d_2(\beta - \gamma)}$ is the cut-off value of $c$ for which there is a
traveling wave for (6.16) of the form $(S(x + ct), I(x + ct))$. Specifically, it is shown in [15] that if
$R_0 := \beta/\gamma > 1$ ($R_0$ is the basic reproduction number for the corresponding ordinary differential
system) and $c > c^* := 2\sqrt{d_2(\beta - \gamma)}$, then there exists a non-trivial and non-negative traveling
wave solution $(S, I)$ of (6.16) such that the boundary conditions (6.17) are satisfied. Furthermore,
$S$ is monotonically decreasing, $0 \le I(x) \le S(-\infty) - S(\infty)$ for all $x \in \mathbb{R}$, and
\[
\int_{-\infty}^{\infty} \gamma I(x)\,dx = \int_{-\infty}^{\infty} \frac{\beta S(x) I(x)}{S(x) + I(x)}\,dx = c\,[S(-\infty) - S(\infty)].
\tag{6.18}
\]
On the other hand, if $R_0 = \beta/\gamma \le 1$ or $c < c^* := 2\sqrt{d_2(\beta - \gamma)}$, then there exists no non-trivial and
non-negative traveling wave solution $(S, I)$ of (6.16) satisfying the boundary conditions (6.17).
$c^* = 2\sqrt{d_2(\beta - \gamma)}$ is of particular interest as it is the cut-off point for the existence of traveling
waves of (6.16). In other words, the cut-off speed for traveling waves of (6.16) is determined by
its linearized system. $c^* = 2\sqrt{d_2(\beta - \gamma)}$ can be viewed as the speed of (6.8) for information to
spread in a social network. The result also indicates that the diffusion speed of information is
proportional to the square root of the product of the popularity of the information and the difference
between the adoption rate and the removal rate of the adopted group.
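
The existence criterion quoted from [15] translates directly into a small helper. The following is a minimal sketch; the function names `sir_wave_speed` and `traveling_wave_exists` and the sample parameter values are hypothetical, and the borderline case $c = c^*$ is reported as undetermined because it is not covered by the statement above.

```python
import math

def sir_wave_speed(beta, gamma, d2):
    """Return (R0, c*) for (6.16); c* is None when R0 <= 1."""
    r0 = beta / gamma
    c_star = 2.0 * math.sqrt(d2 * (beta - gamma)) if r0 > 1.0 else None
    return r0, c_star

def traveling_wave_exists(c, beta, gamma, d2):
    """True/False according to the criterion quoted from [15]; None if c == c*."""
    r0, c_star = sir_wave_speed(beta, gamma, d2)
    if c_star is None or c < c_star:
        return False     # R0 <= 1 or c below the cut-off speed
    if c > c_star:
        return True      # R0 > 1 and c above the cut-off speed
    return None          # c == c*: not covered by the statement above

# Hypothetical rates: adoption 0.5, removal 0.25, popularity d2 = 1.
r0, c_star = sir_wave_speed(0.5, 0.25, 1.0)
```

With these sample values the adoption rate exceeds the removal rate, so a cut-off speed exists; lowering `beta` below `gamma` makes `sir_wave_speed` return `None` for the speed.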

7 Concluding Remarks
In this paper, we review recent developments in modeling information diffusion in online social
networks with partial differential equations. Building on intuitive cyber-distance, we propose
a number of reaction-diffusion equations to characterize information spreading in temporal and
spatial dimensions. We start with a number of simple spatial models with extensive validation
on real datasets collected from popular online social networks such as Digg.com and Twitter.com.
Our experimental results show that the models achieve high accuracy for the majority of news stories with
more than 3000 votes in Digg and Twitter. In general, our models can achieve over 90% accuracy.
We find strong evidence that it is feasible to model the information diffusion process in online
social networks such as Digg and Twitter. We also present a number of spatial models for complex
interactions. The PDE models take into account influences from various out-of-network sources
such as the mainstream media, and provide a new analytic framework to study the interplay of
structural and topical influences on information diffusion over social media.
To the best of our knowledge, our work is the first attempt to propose PDE-based models for
characterizing and predicting the temporal and spatial patterns of information diffusion over online
social networks. The temporal and spatial characteristics of the information diffusion process shed
light on how information spreads and to what extent external influences affect information diffusion
over online social networks. We are in the process of validating the theoretical results, such as the news
propagation speeds, that we present in this paper. Our goal is to predict the information diffusion process
for a given news story based on the initial phase of information spreading. Our future work includes
examining how parameter estimates of the models are related to information content, as there
are significant differences in the mechanics of information diffusion across topics. These parameters
will provide key measurements to quantify online user interactions in online social networks and
therefore can be used to classify news stories in online social networks. Mathematical analysis such
as bifurcation analysis of the models plays a significant role in parameter estimation. In addition,
mathematical analysis of the PDE model with heterogeneity in distance can shed new light on
the identification of influential spreaders or opinion leaders in online social networks. Not only does the
mathematical study of the models further confirm the validity of the models, but it also reveals and
predicts new mechanisms governing information flow in social media. As we can see from the paper,
it is a daunting task to study analytically and numerically the mathematical problems arising from
social media. The complexity of human interactions and the rapid change of social media make PDE
models from social media even more complex. We choose simple, yet accurate PDE models in this
paper to highlight the new opportunities and challenges in modeling information diffusion over
online social networks for mathematicians as well as computer scientists and researchers in social
media.

References
[1] F. Wang, H. Wang, K. Xu, Diffusive logistic model towards predicting information diffusion in online social net-
works, 32nd International Conference on Distributed Computing Systems Workshops (ICDCSW), 2012, pp. 133–139,
http://dx.doi.org/10.1109/ICDCSW.2012.16

[2] F. Wang, H. Wang, K. Xu, J. Wu and J. Xia, Characterizing Information Diffusion in Online Social Networks with
Linear Diffusive Model, 33rd International Conference on Distributed Computing Systems (ICDCS), 2013, pp. 307-316.
http://www.temple.edu/cis/icdcs2013/data/5000a307.pdf

[3] C. Peng, K. Xu, F. Wang, H. Wang, Predicting Information Diffusion Initiated from Multiple Sources in Online Social Networks,
2013 Sixth International Symposium on Computational Intelligence and Design (ISCID), accepted to appear.

[4] C. Lei, Z. Lin and H. Wang, The free boundary problem describing information diffusion in online social networks, Journal of
Differential Equations, 254 (2013) 1326-1341.

[5] G. Dai, R. Ma and H. Wang, Bifurcation and stability of partial differential models in social media, preprint.

[6] F. Wang, K. Xu and H. Wang, Discovering Shared Interests in 2012 32nd International Conference on Distributed Computing
Systems Workshops (ICDCSW), 2012, pp.163-168

[7] M. Cha, A. Mislove and K. Gummadi, A measurement-driven analysis of information propagation in the flickr social network,
Proceedings of the 18th international conference on World wide web, 2009, pp 721-730

[8] D. Romero, C. Tan, and J. Ugander, On the Interplay between Social and Topical Structure, Proc. 7th International AAAI
Conference on Weblogs and Social Media (ICWSM), 2013

[9] Z. Tufekci, Big Data: Pitfalls, Methods and Concepts for an Emergent Field (March 7, 2013). Available at SSRN:
http://ssrn.com/abstract=2229952 or http://dx.doi.org/10.2139/ssrn.2229952

[10] A. Guille, H. Hacid, C. Favre and D. Zighed, Information Diffusion in Online Social Networks: a Survey, SIGMOD Record, 42(2013)
pp.17-28

[11] S. Myers and J. Leskovec. Clash of the contagions: Cooperation and competition in information diffusion. In ICDM’12, pages
539–548, 2012.

[12] S. Myers, C. Zhu and J. Leskovec, Information Diffusion and External Influence in Networks, KDD ’12 Proceedings of the 18th
ACM, SIGKDD international conference on Knowledge discovery and data mining pp. 33-41

[13] M. Cha, A. Mislove, B. Adams, K. Gummadi. Characterizing social cascades in flickr. Proceeding WOSN ’08 Proceedings of the
first workshop on Online social networks

[14] H. Weinberger, M. Lewis and B. Li, Analysis of linear determinacy for spread in cooperative models, J. Math. Biol. 45(2002)
183-218.

[15] X-S Wang, H. Wang and J. Wu, Traveling Waves of Diffusive Predator-Prey Systems: Disease Outbreak Propagation , Discrete
and Continuous Dynamical Systems A, 32(2012) 3303–3324.

[16] H. Wang, Spreading speeds and traveling waves for non-cooperative reaction-diffusion systems, Journal of Nonlinear Science, 21
(2011) 747-783.

[17] A. Kolmogorov, I. Petrovsky, N. Piscounov, Étude de l'équation de la diffusion avec croissance de la quantité de matière et son
application à un problème biologique, Bull. Moscow Univ. Math. Mech., 1(6), 1-26 (1937)

[18] R. Fisher, The wave of advance of advantageous genes. Ann. of Eugenics, 7(1937) 355 - 369.

[19] J. Yang, and S. Counts, Comparing Information Diffusion Structure in Weblogs and Microblogs. 4th Int’l AAAI Conference on
Weblogs and Social Media, 2010.

[20] J. Yang and J. Leskovec. Modeling information diffusion in implicit networks. In ICDM 10, pages 599-608, 2010.

[21] F. Jin, E. Dougherty, P. Saraf, Y. Cao, N. Ramakrishnan, Epidemiological Modeling of News and Rumors on Twitter, Proceedings
of the 7th Workshop on Social Network Mining and Analysis, 2013, Article No. 8

[22] G. Ver Steeg, R. Ghosh and K. Lerman. What Stops Social Epidemics? Proceedings of the Fifth International AAAI Conference
on Weblogs and Social Media, 2011.

[23] R. Ghosh and K. Lerman. A framework for quantitative analysis of cascades on networks, ACM International Conference on Web
search and data mining, 2011

[24] K. Lerman, and R. Ghosh, Information Contagion: an Empirical Study of Spread of News on Digg and Twitter Social Networks.
In Proceedings of 4th International Conference on Weblogs and Social Media (ICWSM), 2010

[25] S. Tang, N. Blenn, C. Doerr and P. Van Mieghem, Digging in the Digg Social News Website, IEEE Transactions on
Multimedia, 2011

[26] C. Gerald and P. Wheatley, Applied Numerical Analysis, Addison-Wesley, 1994

[27] A. Barrat, M. Barthélemy, A. Vespignani, Dynamical Processes on Complex Networks, Cambridge University Press, 2008.

[28] J.D. Murray, Mathematical Biology I. An Introduction, Springer-Verlag, New York, 1989.

[29] M. Newman, The Structure and Function of Complex Networks, SIAM REVIEW, 45(2003), 167-256.

[30] M. Newman, Networks: An Introduction, Oxford University Press, 2010.

[31] J. Mierczyński, The principal spectrum for linear nonautonomous parabolic PDEs of second order: basic properties, Journal
of Differential Equations, 168 (2000) 453-476.

[32] J. Langa, J. Robinson, A. Rodriguez-Bernal, and A. Suarez, Permanence and asymptotically stable complete trajectories for
nonautonomous Lotka-Volterra models with diffusion, SIAM J. Math. Anal. 40 (2009) 2179-2216.

[33] F. Schneider, A. Feldmann, B. Krishnamurthy, and W. Willinger, “Understanding Online Social Network Usage from a Network
Perspective,” in Proceedings of ACM SIGCOMM International Measurement Conference, November 2009.

[34] F. Benevenuto, T. Rodrigues, M. Cha, and V. Almeida, “Characterizing User Behavior in Online Social Networks,” in Proceedings
of ACM SIGCOMM International Measurement Conference, November 2009.

[35] A. Nazir, S. Raza, D. Gupta, C.-N. Chuah, and B. Krishnamurthy, “Network Level Footprints of Facebook Applications,” in
Proceedings of ACM SIGCOMM International Measurement Conference, November 2009.

[36] B. Yu, H. Fei. Modeling Social Cascade in the Flickr Social Network, Fuzzy Systems and Knowledge Discovery, 2009.

[37] R. Kumar, J. Novak and A. Tomkins, Structure and Evolution of Online Social Networks, Proceedings of the 12th ACM SIGKDD
international conference on Knowledge discovery and data mining, 20-23, 2006.

[38] R. S. Cantrell and C. Cosner, Spatial Ecology via Reaction-diffusion Equations, John Wiley & Sons Ltd, 2003.

[39] Y. Lou, Some challenging mathematical problems in evolution of dispersal and population dynamics, Tutorials in Mathematical
Biosciences, 2008, 171-205.

[40] P. Hess, Periodic Parabolic Boundary Value Problems and Positivity, Longman Scientific & Technical, Harlow, UK, 1991.

[41] H. Smith, Monotone dynamical systems: An introduction to the theory of competitive and cooperative systems, Amer. Math. Soc.,
Providence, 1995.

[42] http://en.wikipedia.org/wiki/Twitter.

