
Computational Social Sciences

Petter Holme · Jari Saramäki, Editors

Temporal Network Theory
Second Edition

Computational Social Sciences

A series of authored and edited monographs that utilize quantitative and computational methods to model, analyze and interpret large-scale social phenomena. Titles within the series contain methods and practices that test and develop theories of complex social processes through bottom-up modeling of social interactions. Of particular interest is the study of the co-evolution of modern communication technology and social behavior and norms, in connection with emerging issues such as trust, risk, security and privacy in novel socio-technical environments. Computational Social Sciences is explicitly transdisciplinary: quantitative methods from fields such as dynamical systems, artificial intelligence, network theory, agent-based modeling, and statistical mechanics are invoked and combined with state-of-the-art mining and analysis of large data sets to help us understand social agents, their interactions on and offline, and the effect of these interactions at the macro level. Topics include, but are not limited to, social networks and media, dynamics of opinions, cultures and conflicts, socio-technical co-evolution and social psychology. Computational Social Sciences will also publish monographs and selected edited contributions from specialized conferences and workshops specifically aimed at communicating new findings to a large transdisciplinary audience. A fundamental goal of the series is to provide a single forum within which commonalities and differences in the workings of this field may be discerned, hence leading to deeper insight and understanding.

Series Editors

Elisa Bertino, Purdue University, West Lafayette, IN, USA
Claudio Cioffi-Revilla, George Mason University, Fairfax, VA, USA
Jacob Foster, University of California, Los Angeles, CA, USA
Nigel Gilbert, University of Surrey, Guildford, UK
Jennifer Golbeck, University of Maryland, College Park, MD, USA
Bruno Gonçalves, New York University, New York, NY, USA
James A. Kitts, University of Massachusetts, Amherst, MA, USA
Larry S. Liebovitch, Queens College, City University of New York, Flushing, NY, USA
Sorin A. Matei, Purdue University, West Lafayette, IN, USA
Anton Nijholt, University of Twente, Enschede, The Netherlands
Andrzej Nowak, University of Warsaw, Warsaw, Poland
Robert Savit, University of Michigan, Ann Arbor, MI, USA
Flaminio Squazzoni, University of Brescia, Brescia, Italy
Alessandro Vinciarelli, University of Glasgow, Glasgow, Scotland, UK
Petter Holme · Jari Saramäki
Editors

Temporal Network Theory


Second Edition
Editors

Petter Holme, Department of Computer Science, Aalto University, Helsinki, Finland
Jari Saramäki, Department of Computer Science, Aalto University, Espoo, Finland

Computational Social Sciences
ISSN 2509-9574    ISSN 2509-9582 (electronic)
ISBN 978-3-031-30398-2    ISBN 978-3-031-30399-9 (eBook)
https://doi.org/10.1007/978-3-031-30399-9

1st edition: © Springer Nature Switzerland AG 2019


2nd edition: © The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer
Nature Switzerland AG 2023

This work is subject to copyright. All rights are solely and exclusively licensed by the Publisher, whether
the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse
of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and
transmission or information storage and retrieval, electronic adaptation, computer software, or by similar
or dissimilar methodology now known or hereafter developed.
The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication
does not imply, even in the absence of a specific statement, that such names are exempt from the relevant
protective laws and regulations and therefore free for general use.
The publisher, the authors, and the editors are safe to assume that the advice and information in this book
are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or
the editors give a warranty, expressed or implied, with respect to the material contained herein or for any
errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional
claims in published maps and institutional affiliations.

This Springer imprint is published by the registered company Springer Nature Switzerland AG
The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland
Preface to the Second Edition

As a token of how fast the field of temporal networks is moving, already after 4 years,
we have a new edition of this book. Similar to other areas of network science, temporal
network theory is becoming increasingly influenced by machine learning which is
something we can note in three out of the four new chapters. We are happy that several
of the authors have updated their chapters from the first edition so that this volume
should still paint an accurate picture of the forefront of temporal networks, dynamic
graphs, stream graphs, link streams, time-varying networks, blinking networks, or
whatever you prefer to call this topic of many names.

Helsinki, Finland
May 2023

Petter Holme
Jari Saramäki
Preface to the First Edition

Great minds think alike! Many researchers have gotten the idea that there is structure in the times at which things happen, in addition to network structure, that can be exploited in modeling and data analysis. This, we believe, is inevitable in any
active, interdisciplinary field, and not necessarily a bad thing, especially because
great minds don’t think identically. There is a multitude of frameworks, mathematical
representations, data structures, and visualization methods that are, on the one hand,
equivalent (there are one-to-one mappings between them), and on the other hand,
emphasizing different aspects of the data. The main motivation behind this book is
to show these different ways of thinking about temporal networks.
Our second motivation is to showcase the field of temporal networks 6–7 years
after our previous edited volume Temporal Networks (in the Springer Complexity
series). At the time the book was published, temporal networks felt like an immature
subfield that had just figured out that it was sufficiently different from (static) network
science that it could not simply bake the same cake over again, this time sprinkling
temporal information on top. Now, 3/4 of a decade later, temporal networks still feels
like an immature subfield, struggling to break free from the ideas of static network
science. Where this will end is not completely clear. Maybe some great mind, relaxing
in the hammock with this book, will be able to unify the many directions taken. Or
there may be a future where the current diversity of ideas will provide ingredients
for cooking great science. Either way, temporal networks is a field that is younger
than its age.
We hope this book will inspire new methods and discoveries, and perhaps guide
applied researchers to useful approaches. We also hope that this is the last time that
the preface of a Springer temporal networks volume claims that temporal networks
is a young field!

Tokyo, Japan
Helsinki, Finland
July 2019

Petter Holme
Jari Saramäki
Contents

1  A Map of Approaches to Temporal Networks
   Petter Holme and Jari Saramäki
2  Fundamental Structures in Temporal Communication Networks
   Sune Lehmann
3  Weighted, Bipartite, or Directed Stream Graphs for the Modeling of Temporal Networks
   Matthieu Latapy, Clémence Magnien, and Tiphaine Viard
4  Modelling Temporal Networks with Markov Chains, Community Structures and Change Points
   Tiago P. Peixoto and Martin Rosvall
5  Visualisation of Structure and Processes on Temporal Networks
   Claudio D. G. Linhares, Jean R. Ponciano, Jose Gustavo S. Paiva, Bruno A. N. Travençolo, and Luis E. C. Rocha
6  Weighted Temporal Event Graphs and Temporal-Network Connectivity
   Jari Saramäki, Arash Badie-Modiri, Abbas K. Rizi, Mikko Kivelä, and Márton Karsai
7  Exploring Concurrency and Reachability in the Presence of High Temporal Resolution
   Eun Lee, James Moody, and Peter J. Mucha
8  Metrics for Temporal Text Networks
   Davide Vega and Matteo Magnani
9  Bursty Time Series Analysis for Temporal Networks
   Hang-Hyun Jo and Takayuki Hiraoka
10 Challenges in Community Discovery on Temporal Networks
   Remy Cazabet and Giulio Rossetti
11 Information Diffusion Backbone
   Huijuan Wang and Xiu-Xiu Zhan
12 Continuous-Time Random Walks and Temporal Networks
   Renaud Lambiotte
13 Spreading of Infection on Temporal Networks: An Edge-Centered, Contact-Based Perspective
   Andreas Koher, James P. Gleeson, and Philipp Hövel
14 The Effect of Concurrency on Epidemic Threshold in Time-Varying Networks
   Tomokatsu Onaga, James P. Gleeson, and Naoki Masuda
15 Dynamics and Control of Stochastically Switching Networks: Beyond Fast Switching
   Russell Jeter, Maurizio Porfiri, and Igor Belykh
16 The Effects of Local and Global Link Creation Mechanisms on Contagion Processes Unfolding on Time-Varying Networks
   Kaiyuan Sun, Enrico Ubaldi, Jie Zhang, Márton Karsai, and Nicola Perra
17 Supracentrality Analysis of Temporal Networks with Directed Interlayer Coupling
   Dane Taylor, Mason A. Porter, and Peter J. Mucha
18 Approximation Methods for Influence Maximization in Temporal Networks
   Tsuyoshi Murata and Hokuto Koga
19 Temporal Link Prediction Methods Based on Behavioral Synchrony
   Yueran Duan, Qing Guan, Petter Holme, Yacheng Yang, and Wei Guan
20 A Systematic Derivation and Illustration of Temporal Pair-Based Models
   Rory Humphries, Kieran Mulchrone, and Philipp Hövel
21 Modularity-Based Selection of the Number of Slices in Temporal Network Clustering
   Patrik Seiron, Axel Lindegren, Matteo Magnani, Christian Rohner, Tsuyoshi Murata, and Petter Holme
22 A Frequency-Structure Approach for Link Stream Analysis
   Esteban Bautista and Matthieu Latapy

Index
Chapter 1
A Map of Approaches to Temporal Networks

Petter Holme and Jari Saramäki

Abstract The study of temporal networks is motivated by the simple and important
observation that just as network structure can affect dynamics, so can structure in
time. Just as network topology can teach us about the system in question, so can its
temporal characteristics. In many cases, leaving out either one of these components
would lead to an incomplete understanding of the system or poor predictions. We
argue that including time in network modeling inevitably leads researchers away
from the trodden paths of network science. Temporal network theory requires some-
thing different—new methods, new concepts, new questions—compared to static
networks. In this introductory chapter, we overview the ideas that the field of tempo-
ral networks has brought forward in the last decade. We also place the contributions
to the current volume on this map of temporal-network approaches.

Keywords Temporal networks · Dynamic networks · Time-varying networks · Network theory · Network science · Complex networks · Complex systems · Data science

1.1 Overview

If we want to make sense of large and complicated systems via the data they leave
behind, we need to simplify them systematically. Such simplifications typically need
to be very drastic. A common first step is to represent the system as a network that only
stores information on which units are connected to which other units. To investigate
the World Wide Web with this approach, one would neglect the content, the owner, the time of creation, and the number of downloads of a web page. Instead, we would only consider individual web pages and how they are linked together. The second step is to apply network science methods to find important nodes or clusters of nodes with some special role or function or study how the network's wiring controls some dynamical system. The fundamental idea of this book is that one can learn more about a system if one does not, at the first step of simplification, discard information about when things happen. Consequently, one needs to modify the second step and develop a science of temporal networks that exploits this additional information.

[Fig. 1.1: A schematic map of the chapters of this book, positioned with respect to the three main research themes within the study of temporal networks: simplifying and coarse-graining, identifying important nodes, and how structure affects dynamics.]
The fundamental idea of retaining information about time is evidently not a hard
one to get. Temporal networks have been invented and reinvented many times.
Researchers have proposed many mathematical and computational frameworks—
some equivalent, some not. This is probably inevitable for such an extraordinarily
interdisciplinary field of science—temporal networks have been applied to neuro-
science, transportation problems (Kujala et al. 2018), criminology (Schaub et al.
2023), social theory (Brudner and White 1997), control theory (Li et al. 2017), ecol-
ogy (Ushio et al. 2018), and many more areas. The many existing frameworks could
be frustrating for a newcomer to temporal networks. Part of our idea with this book
was to showcase this diversity; see Chaps. 3, 5, and 15 for very different ways of
thinking about networks in time.
Even if you encounter a problem where both the network and the temporal aspects
should play a role, there is no general recipe to follow. This introductory chapter aims
to provide a rough map of the field–what types of questions researchers have been
interested in and what results are out there. We will also try to place the subsequent
chapters in their correct locations on this map (Fig. 1.1). This chapter is not a catalog
of techniques or an introduction to a comprehensive and self-consistent theory. For
readers interested in that, our review papers Holme and Saramäki (2012), Holme
(2015) the book by Masuda and Lambiotte (2016) or by Batagelj et al. (2014) will
be a better read.

1.2 Temporal Network Data

This section discusses the many subtleties of how to represent a system as a temporal
network in a meaningful way.

1.2.1 Events

The fundamental building blocks of temporal networks are events (or contacts, links,
or dynamic links). These represent units of interaction between a pair of nodes at
specified times. Often, they take the form of triples (i, j, t) showing that nodes i and
j are in contact at time t. Sometimes, the time can be an interval rather than just a
moment.
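
As a minimal illustration (a sketch of ours rather than anything prescribed in this chapter), a contact sequence can be stored as a plain list of such triples; all names below are hypothetical.

```python
from collections import namedtuple

# One event (contact): nodes i and j interact at time t.
Event = namedtuple("Event", ["i", "j", "t"])

# A toy contact sequence, ordered by time. For interval-based events one
# would instead store (i, j, t_start, t_end).
events = [
    Event("A", "B", 1),
    Event("B", "C", 3),
    Event("A", "C", 4),
    Event("C", "D", 9),
]

# The set of node pairs that are ever in contact (the aggregated, static links).
aggregated_links = {frozenset((e.i, e.j)) for e in events}
```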
As we will see throughout this book, temporal network modeling is far from a
straightforward generalization of static networks—often, it is fundamentally differ-
ent. As a first example, we note that events are not always a straightforward gener-
alization of the static networks’ links. Take email communication as an example. In
static network modeling, one typically assumes that the links (between people who
have exchanged emails) indicate social relationships (Williams et al. 2022). These
links can be viewed as the underlying infrastructure for e.g. information spreading
since people who know each other exchange information. The links are there not only
for one e-mail to be sent but represent persistent opportunities for spreading events
for the duration of the relationship. In contrast, an event in a temporal e-mail net-
work is simply one e-mail being sent. This usually happens for the explicit purpose
of spreading information. But there are also systems other than e-mail communi-
cation where events are more like links of static networks. Consider, for example,
transportation systems where bus, train, or flight connections are really opportunities
to travel that happen whether or not a certain person needs to use them. As we will
see, different approaches treat these two interpretations of events differently.

1.2.2 Boundaries

In the natural sciences, we can sometimes model time as a dimension, if not exactly
like space, then at least similar to it. For temporal networks, the binary connections
and the time are more fundamentally different concepts. The simplest way of seeing
this is to consider the network’s boundaries (between what is contained in a data set
and what is not). Regarding time, a temporal network data set almost always covers a
time interval, and the interval is the same for all nodes. The structural boundaries of
the network dimension are usually less controlled. Like cohort studies in the social
sciences, one would like to select nodes that are as tight-knit as possible, typically
defined by common features. For example, Stopczynski et al. (2014) is based on
data from voluntary participants among the freshmen of a university—better than a
random group of people but worse than the complete group of freshmen.
Boundaries become a problem when one wants to control the size of a data set. If a
temporal network is too large to handle or one wants to understand the effects of size,
how should one reduce its size without changing its structure? One could reduce the
number of nodes by random subsampling or truncating the sampling time. However,
both these approaches would introduce biases. While there are ways to correct some
of those Kivelä and Porter (2015), it is difficult to avoid problems. For example, if one
truncates the data, there might not be enough time for a spreading process to saturate
before the sampling interval is over. If one deletes nodes or events, one introduces
other biases. The proper way of resampling a temporal network must simultaneously
vary the number of nodes and the sampling duration, but exactly how is still an open
question.
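
As a sketch of the two size-reduction operations just mentioned (both of which introduce the biases discussed above), assuming events are stored as (i, j, t) triples; the function names are ours.

```python
import random

def truncate_time(events, t_max):
    """Keep only the events with a time stamp before t_max (shorter sampling window)."""
    return [(i, j, t) for (i, j, t) in events if t < t_max]

def subsample_nodes(events, fraction, seed=0):
    """Keep only the events between a randomly chosen subset of the nodes."""
    rng = random.Random(seed)
    nodes = sorted({n for (i, j, _) in events for n in (i, j)})
    kept = set(rng.sample(nodes, max(1, int(fraction * len(nodes)))))
    return [(i, j, t) for (i, j, t) in events if i in kept and j in kept]

events = [("A", "B", 1), ("B", "C", 3), ("A", "C", 4), ("C", "D", 9)]
print(truncate_time(events, 5))      # drops the late (C, D) contact
print(subsample_nodes(events, 0.5))  # drops contacts involving removed nodes
```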

1.2.3 Connectivity

It is fundamental to network modeling that being indirectly connected through a path is relevant to dynamic processes. This is true for temporal networks as well,
but the connections have to happen along time-respecting paths of contacts (with
strictly increasing timestamps). Indirect connections through time-respecting paths
are not transitive (see Fig. 1.2)—even if one can get from A to B and from B to C, it
might still be impossible to go from A to C because one would arrive at B too late
for a further connection to be possible. Contrary to this, all static networks, directed
networks included, are transitive.
Another important difference to static networks is that connectivity is temporal:
even if there is a path from A to B now, whether direct or indirect, there might be
none a second later. Therefore, the statement “A is connected to B” is not necessarily
even meaningful unless the time (interval) of this connection is specified. The above
issues mean that one can never reduce a temporal network to a static one without
losing information or changing the meaning of the nodes (cf. Fig. 1.3).
Since many static network tools are based on paths and distances, researchers
have sought to generalize these concepts to temporal networks. Once again, the addition
of a temporal dimension makes this task much more complicated. The most common
generalization of distance is latency (temporal distance) Lamport (1978)—the time
it would take to reach j from i starting at time t and following only time-respecting
paths. For a longer discussion of paths and connectivity, see Masuda and Lambiotte (2016), Holme and Saramäki (2012), Holme (2015).

[Fig. 1.2: An illustration of two ways to visualize small temporal networks that can be convenient for reasoning about measures and methods. Panel (a) shows a timeline graph in which an epidemic outbreak starting at node A is indicated by gray lines; in almost all cases, paths between nodes (which follow events) in temporal networks must go forward in time. Panel (b) shows the same data projected onto a static graph. The latter visualization highlights the underlying static network structure at the expense of the temporal information; the former, the timeline plot, can capture many temporal structures but is inconvenient for network structures.]

[Fig. 1.3: A time-node representation of the data in Fig. 1.2. This is a (directed, acyclic) static graph containing the same information as in Fig. 1.2, but the meaning of nodes and edges is different.]
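
To make time-respecting paths and latency concrete, here is a small single-pass sketch (ours, not code from the book) that computes earliest-arrival times from a source node, assuming undirected (i, j, t) contacts and paths with strictly increasing time stamps; the latency to a node is its earliest arrival time minus the starting time.

```python
def earliest_arrival(events, source, t_start):
    """Earliest time at which each node can be reached from `source` along
    time-respecting paths (strictly increasing time stamps), starting at t_start."""
    arrival = {source: t_start}
    for i, j, t in sorted(events, key=lambda e: e[2]):  # process contacts in time order
        if t < t_start:
            continue
        ai = arrival.get(i, float("inf"))
        aj = arrival.get(j, float("inf"))
        if ai < t and t < aj:      # i was reached earlier, so this contact reaches j
            arrival[j] = t
        elif aj < t and t < ai:
            arrival[i] = t
    return arrival

events = [("A", "B", 1), ("B", "C", 3), ("A", "C", 4), ("C", "D", 9)]
arrival = earliest_arrival(events, "A", 0)
latencies = {v: arrival[v] - 0 for v in arrival}
print(latencies)  # {'A': 0, 'B': 1, 'C': 3, 'D': 9}
```

Note that with a different starting time the same data can give completely different latencies, or none at all, which is the temporal nature of connectivity discussed above.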

1.3 Simplifying and Coarse-Graining Temporal Networks

Even if representing data as a temporal network means that information has to be discarded for simplification, this is often not enough to understand the large-scale organization of the system. There are many ideas on how to simplify a temporal network further, which we will discuss in this section.

1.3.1 Projections to Static Networks

Perhaps the most obvious way of simplifying a temporal network is to turn it into
a static network. In fact, many classic examples of static networks, such as citation
networks or affiliation studies (like the “Southern Women” study of 1941 Davis et al.
1941), have temporal link information. Still, the time is ignored simply by studying
the network of all aggregated contacts or separate “snapshot” graphs representing
different times.
If one, from time-stamped data, constructs a binary static network where nodes
are only linked or not, it is obvious that a lot of information is lost. A better option
is to include information on the number or frequency of connections between pairs
of nodes, leading to weighted networks. In this case, the link weights can provide
important insights into the network structure (see, e.g., Barrat et al. 2004; Onnela
et al. 2007). However, including links between all nodes that have been in contact
can, in some cases, result in a very dense network. In this case, one can threshold the
network, discard the weakest links, or extract the backbone of the network Serrano
et al. (2009).
The above weighted-network approach is not really temporal because if one
manipulates the times of the contacts, the outcome will remain the same. The sim-
plest static networks that truly encode some temporal effects are reachability graphs.
These graphs have a directed edge (i, j) if one can reach j from i via a time-respecting
path.
Another way of creating sparser static networks than thresholding weighted graphs
is to aggregate contacts within a time window (Krings et al. 2012b). While the thresh-
olded graphs contain information about the most common contacts in the whole
sampling interval, time-window graphs emphasize shorter time scales (Krings et al.
2012a), and their sequence captures at least part of the network dynamics. Indeed,
tuning the duration of the time windows can be a way of understanding the organi-
zation of the data (Sekara et al. 2016). Yet a similar idea is to construct networks
where links represent ongoing relationships (Holme 2003)—pairs of nodes that, at
some point in time, have had contacts before and will have them again.
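
A minimal sketch (ours, with hypothetical function names) of two of the projections described above: aggregating all contacts into a weighted static network, and slicing the data into time-window snapshots.

```python
from collections import Counter, defaultdict

def aggregate_weighted(events):
    """Weighted static projection: the weight of a link is its number of contacts."""
    weights = Counter()
    for i, j, t in events:
        weights[frozenset((i, j))] += 1
    return weights

def time_window_snapshots(events, window):
    """A sequence of static snapshot graphs, one per time window of length `window`."""
    snapshots = defaultdict(set)
    for i, j, t in events:
        snapshots[int(t // window)].add(frozenset((i, j)))
    return dict(snapshots)

events = [("A", "B", 1), ("A", "B", 2), ("B", "C", 3), ("A", "C", 4), ("C", "D", 9)]
print(aggregate_weighted(events))        # the (A, B) link gets weight 2
print(time_window_snapshots(events, 5))  # window 0: early contacts; window 1: (C, D)
```

Thresholding then amounts to discarding the links whose weight falls below some value; tuning the window length interpolates between a static picture and the full temporal resolution.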
One more elaborate way of reducing temporal networks to static ones is the
extraction of backbones, specifically concerning spreading processes on temporal
networks (Zhan et al. 2019) and Chap. 11. By this approach, links in the resultant
network correspond to node pairs that are likely to infect each other in an epidemic
outbreak.
As mentioned above, these approaches can never retain all temporal features of
the original data. Nevertheless, analyzing temporal networks by making them static
is rather attractive because there are a plethora of methods for static network analysis.
One way of circumventing information loss is to use more elaborate mappings, where
temporal networks are mapped onto static network structures whose nodes and links
represent something else than the original network’s nodes and links. One example
is temporal event graphs, whose nodes correspond to the original network’s events
(see Mellor 2018; Kivelä et al. 2018 and Chap. 6 of this book).
One common approach that can also be interpreted as static-network projection is
to use multilayer networks, as in Chap. 17 of this book: time is sliced into consequent
intervals, and the layers of a multilayer network correspond to networks aggregated
for each interval. Once the layers are coupled (e.g. with a directed link from a node to
its future self), one can then apply (static) multilayer network methods to the system.
Importantly, the layers in such a projection are ordered by time.
Finally, we can also project temporal network data to higher-order network models
that retain some information about the flows over the network. Chapter 4 discusses
such approaches.

1.3.2 Separating the Dynamics of Contacts, Links, and Nodes

Instead of reducing temporal network data to static networks, one can retain some
but not all of the temporal features. One example is the statistics of times between
contacts. It was early recognized that often, the times between events, both for nodes
and links, have heavy-tailed distributions (Holme 2003; Johansen 2004) (they are
bursty Barabási 2005; Karsai et al. 2018). Subsequent studies (e.g. Karsai et al.
2011; Miritello et al. 2011) found that this burstiness of inter-event times slows
down spreading processes: simulated spreading that takes place on bursty networks
is slower than it is on networks where the burstiness has been artificially removed.
However, the result is the same when the heavy-tailed inter-event times are part of
the dynamical process itself: when a spreading process with power-law distributed
waiting times is placed on a static network (Min et al. 2011), it is slow, too. This is
related to how events are interpreted (see Sect. 1.2.1 and Chap. 12): are they separated
from the process and just passive conduits for it, as in spreading on top of bursty
event sequences, or are the events actively generated by the process, as one could
interpret the combination of spreading with broad waiting times and a static network?
Fig. 1.4a, b illustrate homogeneous and heterogeneous (bursty) link dynamics on top
of a static network. See also Chap. 9, which goes deeper into this issue. Note that
under some conditions, burstiness may also speed up spreading (Rocha et al. 2011;
Horváth and Kertész 2014).
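
To make the inter-event-time statistics concrete, the following sketch (ours) collects the inter-event times of each link and computes the widely used burstiness coefficient B = (σ − μ)/(σ + μ), which is 0 for exponential (Poissonian) inter-event times and approaches 1 for very bursty ones; this coefficient is a standard choice from the burstiness literature rather than something defined in this chapter.

```python
import statistics
from collections import defaultdict

def inter_event_times(events):
    """Per-link inter-event times, computed from a contact list of (i, j, t) triples."""
    times = defaultdict(list)
    for i, j, t in sorted(events, key=lambda e: e[2]):
        times[frozenset((i, j))].append(t)
    return {link: [b - a for a, b in zip(ts, ts[1:])] for link, ts in times.items()}

def burstiness(taus):
    """B = (sigma - mu) / (sigma + mu): 0 for Poissonian statistics, near 1 when bursty."""
    mu = statistics.mean(taus)
    sigma = statistics.pstdev(taus)
    return (sigma - mu) / (sigma + mu)

events = [("A", "B", 0), ("A", "B", 1), ("A", "B", 2), ("A", "B", 30)]
taus = inter_event_times(events)[frozenset(("A", "B"))]
print(taus, burstiness(taus))  # [1, 1, 28] gives a positive (bursty) B
```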
Another way of simplifying temporal networks is to ignore contact dynamics and
think of links as present between the first and last observations of a contact in the
data and ignore the precise timing of contacts (Holme and Liljeros 2014). Compared
to simplifying the system as a bursty dynamics on top of static networks, this picture
emphasizes longer time structures, such as the general growth and decline of activity
in the data. Figure 1.4c illustrates a data set that is well-modeled by links appearing
and disappearing, disregarding the interevent time statistics.

[Fig. 1.4: Three scenarios of temporal edge structure, showing the timelines of contacts along the edges of a four-node graph. Panel (a) shows narrowly distributed inter-event times; (b) shows bursty edge dynamics; (c) shows a turnover of edges, where the time from the beginning of the sampling to the first contact, or from the last contact to the end of the sampling, is too long to be explained by the inter-event time distributions.]

1.3.3 Mesoscopic Structures

In science, in general, “mesoscopic” refers to the scales between macroscopic and microscopic. In network science, this would mean structures larger than nodes but
smaller than the entire network, and indeed, the term is often used in the context of
grouping nodes into classes based on how they are connected to each other and the
rest of the network. The primary example of mesoscopic structures is the community
structure—some networks have clear groups that are strongly connected within and
weakly connected between each other (Schaub et al. 2017).
Most community detection methods in static networks divide the network so
that every node belongs to one group only (Chaps. 10 and 20). The straightforward
extension of this idea to temporal networks would be to let nodes belong to different
groups at different times but only to a single group at each point in time Rossetti
and Cazabet (2018). This is also the most common assumption in the literature. See
e.g. Rosvall et al. (2014), Palla et al. (2007), Mucha et al. (2010). This view focuses
on the individual nodes and seeks to group them in some principled way. If one,
on the other hand, focuses on the communities instead of the nodes and prioritizes
definitions that give interpretable communities (one temporal-network community
could, for example, represent one seminar, one concert, etc.), it makes sense not to
require every node to be a member of a group at every point in time (Sekara et al.
2016), as in Chap. 2.

Other mesoscopic structures, such as core-periphery structures (Rombach et al. 2014), have been less studied for temporal networks (even though there are some
works—e.g. Rico-Gray et al. 2012 use core-periphery analysis to understand ant-
plant networks). Finally, temporally connected components (see Chap. 6 and Kivelä
et al. 2018) span the structural scale from mesoscopic to macroscopic, both in terms
of network structure and with respect to time.

1.3.4 Fundamental Structures

Chapter 2, together with Sekara et al. (2016) and Lehmann et al., discusses the traces that the six fundamental interaction types leave on temporal networks. Chapter 2 presents a division of the interaction types by the configuration of participants (one-to-one, one-to-many, and many-to-many) and by synchronicity (synchronous and asynchronous). In the limit
of a short time interval projection of a temporal network data set, these different com-
munication events contribute with different subgraphs—synchronous one-to-many
communication yields a star graph, and synchronous many-to-many communica-
tion yields a clique. By tuning the time window, one can identify the time scales of
influence of these “fundamental structures”.

1.4 Important Nodes, Links, and Events

Perhaps the most common question for static networks is to find important nodes
(where “important” should be interpreted in an inclusive sense). This question is just
as relevant for temporal networks. This is perhaps the topic where the approaches
borrowed from the static-network toolbox are most applicable to temporal networks.
One major difference is that it is meaningful to talk about the importance of con-
tacts (in addition to nodes and links) for temporal networks (Takaguchi et al. 2012).
Another difference is that the most principled, general measures of importance are
time-dependent simply because, in most contexts, a node can become more or less
important in time.

1.4.1 Generalizing Centrality Measures

A huge number of papers have been devoted to the generalization of classical central-
ity measures to temporal networks. See Pan and Saramäki (2011), Taylor et al. (2017)
and Chaps. 8 and 17. In many cases—for distance-based centrality measures—they
have taken the obvious approach of replacing distances with latency. Since temporal
networks are typically less connected (in the sense that the fraction of nodes that are
reachable through time-respecting paths is smaller than the corresponding quantity
in static networks), centrality measures have to work in fragmented networks. This means that one needs to combine information about how many nodes can be reached
with information on how easily they can be reached (or whatever rationale the corre-
sponding static centrality measure has). An example would be to generalize closeness
centrality by averaging reciprocal latencies rather than taking the inverse of the aver-
ages (Tang et al. 2013). This is, however, an arbitrary combination of two aspects of
centrality and quite typical for straightforward generalizations of static concepts to
temporal networks—they become less principled than their static counterparts.
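
As a sketch of this kind of generalization (ours, and not the measure of Tang et al. 2013 in detail), the temporal closeness of a node can be computed as the average reciprocal latency to all other nodes, with unreachable nodes contributing zero; the earliest-arrival routine is the same single-pass computation sketched in Sect. 1.2.3.

```python
def earliest_arrival(events, source, t_start):
    """Earliest arrival times along time-respecting paths (see Sect. 1.2.3)."""
    arrival = {source: t_start}
    for i, j, t in sorted(events, key=lambda e: e[2]):
        if t < t_start:
            continue
        ai, aj = arrival.get(i, float("inf")), arrival.get(j, float("inf"))
        if ai < t and t < aj:
            arrival[j] = t
        elif aj < t and t < ai:
            arrival[i] = t
    return arrival

def temporal_closeness(events, source, t_start):
    """Average reciprocal latency from `source` to all other nodes (unreachable -> 0)."""
    nodes = {n for i, j, _ in events for n in (i, j)}
    arrival = earliest_arrival(events, source, t_start)
    others = [v for v in nodes if v != source]
    recip = [1.0 / (arrival[v] - t_start) if v in arrival and arrival[v] > t_start else 0.0
             for v in others]
    return sum(recip) / len(recip)

events = [("A", "B", 1), ("B", "C", 3), ("A", "C", 4), ("C", "D", 9)]
print(temporal_closeness(events, "A", 0))  # averages 1/1, 1/3 and 1/9
```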

1.4.2 Controllability

The rationales of centrality measures come from reasoning about dynamic systems—
you can reach other nodes quickly from central nodes; central nodes are in the middle
of heavy traffic, etc. The purpose of measuring centrality is typically to rank the nodes
and perhaps to list the most central ones. Finding control nodes involves slightly
different thinking. Instead of ranking the nodes, the control nodes are minimal sets
of nodes needed to be manipulated for the entire network to reach a certain state.
Li et al. (2017) and subsequent works show that temporal networks can facilitate
controllability—i.e. the system can be controlled with less energy and by fewer
nodes, if it has temporal heterogeneities.

1.4.3 Vaccination, Sentinel Surveillance, and Influence Maximization

The problems of vaccination, sentinel surveillance, and influence maximization are related to questions about spreading phenomena. Similarly to controllability, one
assumes some objective and some intervention to the underlying temporal-network
structure. In this case, however, the objective is typically to minimize or maximize
the number of nodes reached by some spreading dynamics (like an infectious disease,
word-of-mouth marketing, etc.).
The vaccination problem is to select nodes that would minimize or slow down
disease spreading as much as possible. Typically, one assumes that the vaccinated
nodes are deleted from the system so that they can no longer become infected and
spread the disease. Unlike centrality measures, but similar to controllability, it usually
makes no sense to talk about the importance of individual nodes for the vaccination
procedure—vaccinating one or a few nodes in a large network would have no mea-
surable effect on the epidemics. Instead, the importance of the nodes comes from
the membership of a group that is vaccinated (Gu et al. 2017). Another important
point is that one can typically not assume knowledge about the entire network of
contacts—only the interactions that individuals could reliably report can serve as
input for vaccination protocols. For example, Génois et al. (2015), Starnini et al.
(2013) propose vaccination protocols that exploit temporal structures.
The influence maximization problem deals with finding seed nodes for spreading
dynamics that maximize the number of reached nodes (Kempe et al. 2003). The
prime application is viral marketing but, to protect against outbreaks that have not yet entered a population, influence maximization is also interesting for network epidemiology. The nodes that are important for vaccination and influence maximization
do not necessarily have to be the same—optimal node sets for vaccination typically
fragment the network efficiently. In contrast, influence maximization emphasizes
efficiently splitting the network into subnetworks of influence. The first problem is
akin to network attack or network dismantling (Braunstein et al. 2016), and the sec-
ond is to find a vertex cover (Dinur and Safra 2005). To exploit temporal structures,
one can identify nodes in a heightened state of activity or nodes that reliably influence
others (Chap. 18).
Sentinel surveillance assumes that one can put sensors (sentinels) on the nodes.
The task is to choose the sensors’ locations so that disease outbreaks are discovered
as reliably or quickly as possible. This is probably the least studied of these three
problems on temporal networks—we are only aware of Bai et al. (2017). On the
other hand, it is practically a more important problem, since it is currently used in
health care systems (Arita et al. 2004). Bai et al. (2017) tests how efficient temporal
network protocols originally developed for vaccination are for the problem of sentinel
surveillance.

1.4.4 Robustness to Failure and Attack

A problem that is very much overlapping with influence maximization, etc., is net-
work robustness. The scenario is that some adversary is trying to destroy a network.
This adversary can have different amounts of information or resources to carry out
the attack, which yields different versions of the problem. With no information about
the network, the problem reduces to node percolation (or robustness to failure). With
perfect information but limited computational resources, the problem is equivalent
to network dismantling. It is both interesting to study optimal heuristics for this
problem and what network structures contribute to the robustness of a network. For
temporal networks, Trajanovski et al. (2012), Scellato et al. (2013) studied this prob-
lem. Still, there should be several ways of extending their work, and in general,
temporal-network robustness seems to be an understudied area. This may have to do
with the fact that the temporal dimension makes the whole percolation framework
more complicated (see Chap. 6).

1.5 How Structure Affects Dynamics

For disease-spreading models, heterogeneous, heavy-tailed degree distributions are known to speed up the dynamics Barthélemy et al. (2004). One line of research in
temporal network studies is to identify similar relations between the structure of the
data and dynamics taking place over the contacts.
The types of dynamics people have been studying on underlying temporal net-
works include disease spreading of different kinds (Chaps. 7, 11, 13, 14 and 16)
Fefferman and Ng (2007), threshold models of complex contagion (Takaguchi et al.
2013; Backlund et al. 2014), random walks (Chap. 12) (Starnini et al. 2012; Delvenne
et al. 2015; Masuda and Lambiotte 2016; Saramäki and Holme 2015), navigation pro-
cesses (Lee and Holme 2019), synchronization (Chap. 15) and even game-theoretic
models (Cho and Gao 2016; Zhang et al. 2019).

1.5.1 Simulating Disease Spreading

Disease spreading typically follows standard compartmental models developed by applied mathematicians (Hethcote 2000; Britton 2010). Such models divide a popu-
lation into classes with respect to the disease and then state transition rules between
the classes. The key transition rule in all compartmental models is the contagion event
where a susceptible individual becomes infected when in contact with an infectious
individual. In the two canonical and most well-studied models—the SIS (susceptible–
infectious–susceptible) and SIR (susceptible–infectious–recovered) models—the
contagion event is paired with the recovery of individuals (in SIS, recovered individ-
uals become susceptible again, whereas in SIR they become immune to the disease
or die). The probability that a contact between a susceptible individual and an infec-
tious individual results in contagion is usually a model parameter. It is assumed to be
the same for all contacts (which is an assumption for convenience and not realism).
Many assumptions are needed to simulate a compartmental model on a temporal
network of contacts (Masuda and Holme 2015; Enright and Kao 2018). Unless modeling bioterrorism or the spread of something other than a disease, it makes no sense to select more than one seed. Without prior knowledge about the disease's entry
into the population, one should choose this seed uniformly at random. By the same
principle, one should choose the time of infection uniformly randomly as well. This
could, of course, lead to the outbreak not being able to reach all nodes so that the
measured outbreak sizes are an average of outbreak sizes of different times. For this
reason, some authors choose to start the outbreak early in the interval covered by
their data, although this introduces a bias if e.g. the activity in the data grows Rocha
and Blondel (2013). Another commonly used approach is to use periodic boundary
conditions and repeat the data from the beginning (e.g., in Karsai et al. 2011).
Another important consideration is the duration of the infectious period. In the
mathematical epidemiology literature, it is usually assumed to be exponentially dis-
tributed to achieve the Markov property (that the probability of recovering is inde-
pendent of the time since the infection). Markovian SIR and SIS are not only easier
to analyze analytically, but also allow for some tricks to speed up simulation code
(see www.github.com/pholme for fast, event-driven code for the Markovian SIR on
temporal networks). Some studies use a constant duration of infection for all nodes.
To the best of our knowledge, no studies have tried duration distributions inferred
from data.
Another decision that anyone simulating disease spreading (or random walks) on
temporal networks needs to make is what to do with contacts happening in the same
time step. There are, as we see it, two principled solutions. Either one assumes that this
is allowed, in which case one then needs to pick contacts with the same timestamp
in random order and average over different randomizations, or one assumes that
the disease cannot spread via an intermediate node in a single time step. This is
effectively to assume an SEIS or SEIR model (E stands for exposed, which means
that the individual will become infectious in the future but is not yet infectious),
with the duration of the E state being less than the time resolution of the temporal
network.
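
To make these modeling choices concrete, here is a heavily simplified SIR simulation over a contact list (a sketch of ours, not the authors' code): a single seed node, a fixed per-contact transmission probability, a constant infectious duration, and contacts sharing a time stamp processed in random order, which is one of the two conventions discussed above.

```python
import random

def sir_on_contacts(events, beta, infectious_time, seed_node, t_start, rng=None):
    """Minimal SIR over a contact list of (i, j, t) triples.
    beta: per-contact transmission probability; infectious_time: constant duration."""
    rng = rng or random.Random()
    recovery = {seed_node: t_start + infectious_time}  # infectious node -> recovery time
    recovered = set()
    # Sort by time; ties are broken randomly (same-time contacts in random order).
    for i, j, t in sorted(events, key=lambda e: (e[2], rng.random())):
        if t < t_start:
            continue
        for n in [n for n, t_rec in recovery.items() if t_rec <= t]:
            recovered.add(n)          # infectious period over: move to recovered
            del recovery[n]
        for a, b in ((i, j), (j, i)):
            if a in recovery and b not in recovery and b not in recovered:
                if rng.random() < beta:
                    recovery[b] = t + infectious_time
    return len(recovery) + len(recovered)  # final outbreak size

events = [("A", "B", 1), ("B", "C", 3), ("A", "C", 4), ("C", "D", 9)]
rng = random.Random(1)
nodes = ["A", "B", "C", "D"]
sizes = [sir_on_contacts(events, 0.8, 5, rng.choice(nodes), 0, rng) for _ in range(100)]
print(sum(sizes) / len(sizes))  # outbreak size averaged over uniformly random seeds
```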
Another slight difference in approaches, especially in studies where a model
generates the underlying temporal network, is that of link-centric and node-centric
compartmental models. In node-centric models (Masuda and Rocha 2018; Jo et al.
2014), the time to the next contact that could spread the disease is determined at a
contagion event. In link-centric models (Vazquez et al. 2007; Horváth and Kertész
2014) the contacts are generated independently of the propagation of the disease. The
node-centric model simplifies analytical calculations, whereas the link-centric model
is conceptually simpler and perhaps more realistic (even though the assumption that
the contact dynamics are independent of what spreads on the network is probably
often invalid).
Typically, papers about disease spreading have focused on understanding how
network structure affects the final outbreak size (Min et al. 2011; Holme and Liljeros
2014; Masuda and Holme 2015). Some, however, have studied early outbreak char-
acteristics such as the basic reproductive number R0 (the expected number of others
an infectious individual would infect in a completely susceptible population) Liu
et al. (2014), Rocha and Blondel (2013). From a medical perspective, there is no
obvious choice between these two—even though the societal concern is to minimize
the outbreak size. The outbreak size is also a consequence of interventions not mod-
eled by canonical disease-spreading models such as SIS and SIR. Thus the early
phase of the disease, which is better summarized by R0 , could be more informative.
Random walk studies have focused on the mean first-passage time—the expected
time since the beginning of a walk that the walker reaches a node—and reachability—
the probability that a node is reached by a walker starting at a random node (Masuda
and Lambiotte 2016; Saramäki and Holme 2015). Another topic of interest has been
how the topological and temporal structures affect the speed of diffusion (Delvenne
et al. 2015). As opposed to the spreading of disease, there is no directly obvious
real-world phenomenon that would be well-modeled by random walks on temporal networks; however, random walks equal diffusion, and diffusion can be considered a
fundamental process in any system. Often, the random walk process is simply used
as a probe of the temporal network structure.

1.5.2 Tuning Temporal Network Structure by Randomization

The most straightforward way of understanding the impact of temporal network struc-
ture on dynamic processes is, of course, to tune it and monitor the response on some
quantity describing the dynamics. There were important contributions (also involving
temporal structures) in this direction even before the turn-of-the-millennium network
boom. For example, Morris and Kretzschmar studied the effect of concurrency, or
overlapping relations, on outbreak sizes (Morris and Kretzschmar 1995).
The most common way to investigate the effect of structures on a temporal network
is to use randomization schemes. This approach starts with empirical networks and
then destroys some specific correlation by randomizing it away. For example, one can
randomly swap the timestamps of contacts or replace the timestamps with a random
timestamp chosen uniformly between the first and last of the original data (Holme
2005). The former randomization is more restrictive because it preserves the overall activity pattern (the number of contacts per time) as well as the number of contacts per node and per link (see Fig. 1.5).
Randomization schemes turn out to be much more versatile for temporal networks
than for static ones. Gauvin et al. (2018) gives a comprehensive theory of almost
40 different randomization schemes. By applying increasingly restrictive methods
to real data sets, one can see how much structure is needed to recreate the original
temporal network’s behavior.
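
The two schemes of Fig. 1.5, written out as a sketch (ours): permuting the observed time stamps among the contacts, which preserves the overall activity pattern and the number of contacts per node pair, and drawing completely new time stamps uniformly from the observation window, which preserves only the number of contacts per node pair.

```python
import random

def permute_timestamps(events, seed=0):
    """Randomly reshuffle the observed time stamps among the contacts."""
    rng = random.Random(seed)
    times = [t for _, _, t in events]
    rng.shuffle(times)
    return [(i, j, t) for (i, j, _), t in zip(events, times)]

def random_timestamps(events, seed=0):
    """Replace every time stamp by a uniformly random one within the observed window."""
    rng = random.Random(seed)
    t_min = min(t for _, _, t in events)
    t_max = max(t for _, _, t in events)
    return [(i, j, rng.uniform(t_min, t_max)) for i, j, _ in events]

events = [("A", "B", 1), ("A", "B", 2), ("B", "C", 3), ("C", "D", 9)]
print(permute_timestamps(events))
print(random_timestamps(events))
```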
In general, the terminology of temporal networks is ambiguous. The topic itself
sometimes goes by the names “dynamic networks”, “temporal graphs”, or “time-
varying networks”. The randomization schemes above are no exception—Holme
(2005) calls the scheme of Fig. 1.5b “permuted times”, Karsai et al. (2011) calls it
“shuffled times” and Gauvin et al. (2018) calls it “shuffled timestamps”.

1.5.3 Models of Temporal Networks

Another way of tuning temporal network structure, other than randomization, is


by generative models. Generative models of temporal networks serve a different
role than static networks. Static network science traditionally used network evo-
lution models as proof-of-concept models for theories about emergent properties,
like power-law degree distributions (Barabási and Albert 1999) or community struc-
ture (Grönlund and Holme 2004). There are common structures for temporal net-
works combining temporal and network structures in a nontrivial way that is non-
trivial to explain. Nevertheless, temporal network models are needed, if for nothing else than to generate underlying data sets for controlled experimentation (Presigny et al. 2021). In this section, we will mention some of the central developments in this area. For a complete overview, see Holme (2015).

[Fig. 1.5: Illustrating two types of randomization procedures. Panel (a) shows a temporal network that is randomized by randomly swapping time stamps (panel b) and by replacing time stamps with uniformly random ones (panel c). The randomization in (b) preserves both the number of contacts per time and the number of contacts per pair of nodes; the procedure in (c) preserves the number of contacts per pair of nodes but not the number of contacts per time.]
The most straightforward approach to generating a temporal network is to generate
a static network and assign a sequence of contacts to every link. For example, Holme
(2013) uses the following procedure:
1. Construct a simple graph by first generating an instance of the configuration
model (Newman 2010) and merging multiple links and self-links from it.
2. For every link, randomly generate an interval when contacts can happen.
3. Generate a sequence of contacts following an interevent time distribution.
4. Match the time sequence of contacts to the active intervals of the links.
This model is illustrated in Fig. 1.6.
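
The listing below is a rough sketch of this four-step procedure (ours, under simplifying assumptions: a random simple graph instead of the configuration model, uniformly random active intervals, and exponential inter-event times); it illustrates the structure of the construction rather than reproducing Holme (2013) exactly.

```python
import random

def generate_temporal_network(n_nodes, n_links, contact_rate, t_total, seed=0):
    """Static backbone + active interval per link + inter-event times -> contact list."""
    rng = random.Random(seed)
    # Step 1: a static, simple backbone graph (here just random pairs of nodes).
    links = set()
    while len(links) < n_links:
        i, j = rng.sample(range(n_nodes), 2)
        links.add(frozenset((i, j)))
    events = []
    for link in links:
        i, j = tuple(link)
        # Step 2: a random interval during which the link can have contacts.
        t0, t1 = sorted(rng.uniform(0, t_total) for _ in range(2))
        # Steps 3-4: exponential inter-event times placed inside the active interval.
        t = t0 + rng.expovariate(contact_rate)
        while t < t1:
            events.append((i, j, t))
            t += rng.expovariate(contact_rate)
    return sorted(events, key=lambda e: e[2])

events = generate_temporal_network(n_nodes=20, n_links=30, contact_rate=0.5, t_total=100)
print(len(events), events[:3])
```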
[Fig. 1.6: Illustrating a simple generative model for temporal networks, used in Holme (2013) and Rocha and Blondel (2013). First, one generates a static network from the configuration model by creating desired degrees from a probability distribution (a) and matching them up in random pairs (b). Then, one generates intervals for the links showing when they are active (c). Finally, one generates a time series of inter-event times (d) and matches it to the active intervals. This figure is adapted from Holme (2015).]

Perra et al. (2014) proposed a model of temporal networks—the activity-driven model—that is even simpler than the above, with the advantage that it is analytically tractable. Let G_t denote a simple graph at time t. Their generation algorithm proceeds as follows:
1. Increment the time to t and let G_t be empty.
2. Activate a node i with probability a_i Δt. Connect i to m other randomly chosen distinct nodes.
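
A minimal sketch (ours) of this generation step: in every time step the snapshot G_t starts out empty, each node i activates with probability a_i Δt, and every active node connects to m randomly chosen other nodes.

```python
import random

def activity_driven(activities, m, n_steps, dt=1.0, seed=0):
    """Generate a sequence of snapshot graphs from the activity-driven model."""
    rng = random.Random(seed)
    nodes = list(range(len(activities)))
    snapshots = []
    for _ in range(n_steps):
        edges = set()                                   # G_t starts out empty
        for i in nodes:
            if rng.random() < activities[i] * dt:       # node i activates
                for j in rng.sample([v for v in nodes if v != i], m):
                    edges.add(frozenset((i, j)))        # connect to m distinct others
        snapshots.append(edges)
    return snapshots

rng = random.Random(42)
# Broadly distributed activities: a few very active nodes, many nearly inactive ones.
activities = [min(1.0, 0.1 * rng.paretovariate(2.5)) for _ in range(50)]
print([len(g) for g in activity_driven(activities, m=2, n_steps=10)])
```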
This model has been fundamental to analytical studies of processes on temporal
networks; see e.g. Perra et al. (2012), Karsai et al. (2014), Liu et al. ( 2013, 2014),
Starnini and Pastor-Satorras (2014), Sun et al. (2015), Han et al. (2015).
Starnini et al. (2013) use a two-dimensional random walk model where the chance
of approaching node i is proportional to an increasing attractiveness parameter a_i.
This means that the more attracted a walker is to its neighbors, the slower its walk
becomes (simulating acquaintances stopping to talk when they meet on the street).
Furthermore, they also allow people not to socialize by having occasional inactive
periods. Zhang et al. (2015) propose a similar model without an explicit representa-
tion of space.
Another model of temporal networks of social contacts was proposed in Vester-
gaard et al. (2014). The authors introduced a model in which temporal effects activate
nodes and links. In their model, a link can be active or inactive and further character-
ized by the time τ(i, j) since the last time it changed state. Similarly, node i uses the
time τi since it was last involved in a contact as a basis for its decisions. The network
is initialized to N nodes, and all links are inactive. A node can activate a link with
probability depending on τ. The link is chosen from the nodes j that are currently not in contact with i, with a probability depending on the τ values of those nodes. An
active link is inactivated with a rate that is also dependent on τ .
Masuda et al. (2013) and Cho et al. (2014) use a Hawkes process to model a
similar situation to the one considered by Starnini et al. (2013) above. Masuda et al.
(2013) argues that there is a positive correlation between consecutive interevent times
in empirical data that one cannot model by interevent times alone. Their model works by defining an event rate

$$v + \sum_{i \,:\, t_i \leq t} \varphi(t - t_i), \qquad (1.1)$$

where φ is an exponentially decreasing memory kernel, and v is a base event rate. Even with an exponentially decaying kernel, the interevent time distribution becomes
broader than exponential. Similar to Masuda et al. (2013) and Cho et al. (2014),
Colman and Vukadinović Greetham (2015) introduced a model of temporal networks
based on stochastic point processes. In their model, the nodes form and break links
following a Bernoulli process with memory. Like the Hawkes process mentioned
above, the probability of an event between i and j increases with the number of
recent events between i and j. Specifically, Colman and Vukadinović Greetham
(2015) takes the probability that a link is activated or deactivated at time t to be
proportional to the number of such events in a time window.
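
To see the self-exciting mechanism of Eq. (1.1) at work, here is a toy discrete-time sketch (ours, not the exact point-process simulations used in those papers) in which the event rate is a base rate v plus an exponentially decaying contribution from every past event.

```python
import math
import random

def self_exciting_events(v, alpha, decay, t_max, dt=0.01, seed=0):
    """Discrete-time approximation of an event rate
    v + sum over past events t_i of alpha * exp(-decay * (t - t_i))."""
    rng = random.Random(seed)
    event_times = []
    t = 0.0
    while t < t_max:
        rate = v + sum(alpha * math.exp(-decay * (t - ti)) for ti in event_times)
        if rng.random() < rate * dt:    # probability of an event in [t, t + dt)
            event_times.append(t)
        t += dt
    return event_times

times = self_exciting_events(v=0.2, alpha=0.8, decay=1.0, t_max=200)
gaps = [b - a for a, b in zip(times, times[1:])]
print(len(times), round(max(gaps) / (sum(gaps) / len(gaps)), 1))  # longest gap vs mean gap
```

Increasing alpha relative to decay strengthens the self-excitation and makes the generated event train increasingly bursty.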

1.6 Other Topics

There are, of course, some themes in the temporal network literature that do not
fit into the above three categories. Two examples are generalizations of link pre-
diction (Liben-Nowell and Kleinberg 2007) and network reconstruction (Newman
2018; Peixoto 2019) to temporal networks. The motivation for both these topics is
that real data is often erroneous and incomplete. In static networks, link prediction
refers to the problem of finding the linkless pair of nodes that is most likely to be a
false negative (falsely not having a link). In the context of temporal networks, this
could be reformulated as either the question of what will be the next contact (given
the information up to a point) or of which contact was missing in the past. We are
not aware of any paper addressing these particular problems. Instead of solving these
purely temporal network questions, there is a large body of literature on link pre-
diction in static networks with a turnover of nodes and links—see e.g. Ahmed and
Chen (2016) and references therein—i.e., assuming a slower changing network than
elsewhere in this chapter.
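To make the "next contact" formulation above concrete, here is a deliberately naive baseline: rank previously observed node pairs by how often and how recently they have been in contact, and predict the top-ranked pair as the next contact. The scoring rule is an illustrative assumption, not an established method.

from collections import Counter

# Naive "next contact" baseline over contacts (u, v, t) observed up to t_now.
def predict_next_contact(contacts, t_now, recency_scale=3600.0):
    counts, last_seen = Counter(), {}
    for u, v, t in contacts:
        if t > t_now:
            continue
        pair = tuple(sorted((u, v)))
        counts[pair] += 1
        last_seen[pair] = max(last_seen.get(pair, t), t)
    def score(pair):
        # more frequent and more recently active pairs score higher (arbitrary choice)
        return counts[pair] * 2.0 ** (-(t_now - last_seen[pair]) / recency_scale)
    return max(counts, key=score) if counts else None

if __name__ == "__main__":
    toy = [("a", "b", 10), ("b", "c", 600), ("a", "b", 1200), ("c", "d", 1300)]
    print(predict_next_contact(toy, t_now=1500))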
Network reconstruction, in general, is the problem of inferring a network from
secondary, incomplete, or noisy data (Newman 2018; Peixoto 2019). So far, we are
unaware of temporal-network studies analogous to the static-network case. There are
papers about the technical difficulties of inferring temporal network contacts from
electronic sensors (Stopczynski et al. 2014; Barrat and Cattuto 2013) and papers
about how to reconstruct static networks from temporal network data (Krings et al.
2012b; Holme 2013), but we are aware of no papers that would predict false positive
and negative data in a contact sequence.

1.7 Future Perspectives

Temporal network studies have been a vivid subdiscipline of network science for
around a decade. Some issues of the early days have been settled, while others remain.
This period has seen a shift from research that simply extends static network ideas to
temporal networks to methods that are unique to temporal networks. Still, the overall
research directions are more or less the same as for static networks (cf. Fig. 1.1)—
questions about identifying important nodes, how to simplify temporal networks
further, and how their structure affects dynamics. Are there such larger research
directions that make sense for temporal networks but not static ones? An obvious
idea would be to focus on questions that involve time more directly. Researchers
have rarely asked what the optimal time to do something is, or the optimal duration to
expose the system to some treatment, etc. Change-point detection (finding the time
when a system changes between qualitatively different states) is one exception (Peel
and Clauset 2015). There are also papers about time series analysis of temporal
networks (Sikdar et al. 2016; Huang et al. 2017), but these typically do not ask
questions about time like the ones above.

Perhaps the crudest assumption of temporal network modeling to date (as men-
tioned in Sect. 1.2.1) is that the existence of a contact is independent of the dynamic
system of interest. As an example, there are many modeling studies of informa-
tion spreading on top of empirical temporal networks (e.g. mobile-phone or e-mail
data Karsai et al. 2011; Karimi and Holme 2013; Backlund et al. 2014). Of course,
information spreading via email or calls does actually happen. Still, one cannot usu-
ally view it as a random process on top of some temporal contact structure indepen-
dent of the information. While one can imagine less important information spreading
this way—“By the way, put on that Finnish heavy metal when uncle Fredrik comes
to visit, he will appreciate it”—usually, calls are made, and e-mails are sent with the
explicit purpose of spreading information. Therefore, information spreading influ-
ences or even drives the contact structure. How should one then model information
spreading on temporal networks? One possibility would be to give up using empirical
data as the basis for the analysis; such an approach would be similar to adaptive net-
works (Gross and Sayama 2009). One could also go for data that contain the content
of the messages or conversations instead of only their metadata; in this case, it might
be possible to understand the relationship between contacts’ temporal network and
the spreading dynamics. Evidently, such data is hard to come by for privacy reasons,
but interestingly, early studies of electronic communications did analyze both the
content and the structure of spreading (Danowski and Edison-Swift 1985). There are
also communication channels where everything is public, such as Twitter.
One research direction with plenty of room for improvement is temporal-network
visualization. Figure 1.2 illustrates some of the challenges where Fig. 1.2a gives a
reasonable feeling for the temporal structures but none for the network structure,
and for Fig. 1.2b, the situation is reversed. One can probably rule out a type of
visualization that manages to show all information and convey all different aspects
of the structure. However, there should be methods that discard some information but
still reveal important structures. Also, animated visualization (which has the obvious
limitation that not all the information is shown at once) probably has room for
improvement. Some such methods are discussed in Chap. 5. The “alluvial diagrams”
of Rosvall and Bergstrom (2010) are another interesting approach. Evidently, there
are some available methods, but we wish for an even wider selection to choose from.
Yet another fundamental challenge for temporal networks is how to rescale or
subsample a data set properly. In particular, many methods inspired by statistical
physics rely on ways to change the size of a network consistently. This is a challenge
even for static networks—simply making subgraphs based on a random set of nodes
will most likely change the structure of a network (other than Erdős–Rényi random
graphs; Lee et al. 2006). The same applies to more elaborate ways to reduce the
network size by merging nodes (Kim 2004; Song et al. 2006)—there is no guarantee
that such manipulation will retain the structure of networks. For temporal networks,
one might think that at least the temporal dimension could be rescaled by sampling
windows of different sizes, but this is not trivial either because it could change
whether or not a dynamic process has the time to reach a certain state. For
finite-size scaling, such as used in the study of critical phenomena (Hong et al. 2007),

one would need a way to link the size of the network and the duration of the temporal
network.
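As a small illustration of why temporal subsampling is delicate, the sketch below computes the set of nodes reachable from a source via time-respecting paths within observation windows of different lengths, assuming instantaneous, deterministic transmission; shortening the window can qualitatively change what is reachable. The contact format (u, v, t) is an assumption for illustration.

# Reachable set via time-respecting paths within an observation window [t0, t0 + w],
# for undirected instantaneous contacts (u, v, t) and deterministic transmission.
def reachable_set(contacts, source, t0, window):
    reached = {source}
    for u, v, t in sorted(contacts, key=lambda c: c[2]):
        if t < t0 or t > t0 + window:
            continue
        if u in reached or v in reached:
            reached.update((u, v))
    return reached

if __name__ == "__main__":
    toy = [("a", "b", 1), ("b", "c", 5), ("c", "d", 20), ("a", "d", 2)]
    for w in (3, 10, 25):
        print("window", w, "->", sorted(reachable_set(toy, "a", t0=0, window=w)))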
Finally, as mentioned earlier in this chapter, we feel that there is much to do
regarding temporal-network robustness and fragility, with applications ranging from
network security to public health and the efficient planning of robust public-transport
systems. This is an area where it is possible to go beyond static network analo-
gies. For example, while a static network may fragment when chosen nodes are
attacked/immunized, the range of temporal-network responses is much broader. The
network may remain temporally connected in principle, but the average latency of
time-respecting paths may grow high enough to make them useless. Or, the system’s
latency could grow enough to make it temporarily disconnected: consider, e.g.,
congestion in a public transport system. Furthermore, the range of possible attack
or immunization strategies can be much broader too: interventions to events, attacks
that aim to increase latency generally, interventions at specific times, sequences of
timed interventions at different nodes or contacts, and so on. Likewise, when the aim
is to improve network robustness, interventions are not limited to network topology
alone. For example, to improve the reliability of public transport systems, one could
only modify the temporal sequences of connections and their time-domain statistics
to minimize the disruption caused by random deviations from the planned schedules,
or one could aim at maximal synchronization of connections to minimize the latency
of time-respecting paths.
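One simple way to quantify the latency-based fragility discussed above is to compute earliest-arrival times of time-respecting paths from a source, before and after removing selected contacts. The sketch below does this for instantaneous, undirected contacts; the "attack" (dropping the earliest contact) is just one illustrative choice.

import math

# Earliest-arrival ("latency") of time-respecting paths from a source node,
# computed in one pass over time-ordered instantaneous contacts (u, v, t).
def earliest_arrival(contacts, source, t_start=0.0):
    arrival = {source: t_start}
    for u, v, t in sorted(contacts, key=lambda c: c[2]):
        if t < t_start:
            continue
        if arrival.get(u, math.inf) <= t or arrival.get(v, math.inf) <= t:
            arrival[u] = min(arrival.get(u, math.inf), t)
            arrival[v] = min(arrival.get(v, math.inf), t)
    return arrival

def average_latency(contacts, source, nodes, t_start=0.0):
    arr = earliest_arrival(contacts, source, t_start)
    # unreachable nodes contribute infinity, so the average diverges if the
    # network becomes temporally disconnected
    return sum(arr.get(n, math.inf) - t_start for n in nodes if n != source) / (len(nodes) - 1)

if __name__ == "__main__":
    toy = [("a", "b", 1), ("b", "c", 4), ("c", "d", 9), ("a", "c", 2)]
    nodes = {"a", "b", "c", "d"}
    print("intact:  ", average_latency(toy, "a", nodes))
    attacked = toy[1:]            # illustrative attack: drop the earliest contact
    print("attacked:", average_latency(attacked, "a", nodes))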

Acknowledgements PH was supported by JSPS KAKENHI Grant Number JP21H04595. JS acknowledges funding from the Strategic Research Council at the Academy of Finland (NetResilience consortium, grant numbers 345188 and 345183).

References

N.M. Ahmed, L. Chen, An efficient algorithm for link prediction in temporal uncertain social
networks. Inf. Sci. 331, 120–136 (2016)
I. Arita, M. Nakane, K. Kojima, N. Yoshihara, T. Nakano, A. El-Gohary, Role of a sentinel surveil-
lance system in the context of global surveillance of infectious diseases. Lancet Infectious Dis-
eases 4(3), 171–177 (2004)
V.P. Backlund, J. Saramäki, R.K. Pan, Effects of temporal correlations on cascades: Threshold
models on temporal networks. Phys. Rev. E 89, 062,815 (2014)
Y. Bai, B. Yang, L. Lin, J.L. Herrera, Z. Du, P. Holme, Optimizing sentinel surveillance in temporal
network epidemiology. Sci. Rep. 7(1), 4804 (2017)
A.L. Barabási, The origin of bursts and heavy tails in human dynamics. Nature 435, 207–211 (2005)
A.L. Barabási, R. Albert, Emergence of scaling in random networks. Science 286, 509–512 (1999)
A. Barrat, C. Cattuto, Temporal networks of face-to-face human interactions, in Temporal Networks.
ed. by P. Holme, J. Saramäki (Springer, Berlin, 2013), pp.191–216
A. Barrat, M. Barthélemy, R. Pastor-Satorras, A. Vespignani, The architecture of complex weighted
networks. Proc. Natl. Acad. Sci. U.S.A. 101, 3747–3752 (2004)
M. Barthélemy, A. Barrat, R. Pastor-Satorras, A. Vespignani, Velocity and hierarchical spread of
epidemic outbreaks in scale-free networks. Phys. Rev. Lett. 92, 178,701 (2004)

V. Batagelj, P. Doreian, A. Ferligoj, N. Kejzar, Understanding Large Temporal Networks and Spatial
Networks: Exploration, Pattern Searching Visualization and Network Evolution. (Wiley, 2014)
A. Braunstein, L. Dall’Asta, G. Semerjian, L. Zdeborová, Network dismantling. Proc. Natl. Acad.
Sci. U.S.A. 113(44), 12368–12373 (2016)
T. Britton, Stochastic epidemic models: a survey. Math. Biosci. 225(1), 24–35 (2010)
L.A. Brudner, D.R. White, Class, property, and structural endogamy: visualizing networked histo-
ries. Theory Soc. 26(2), 161–208 (1997)
J.H. Cho, J. Gao, Cyber war game in temporal networks. PLoS ONE 11(2), 1–16 (2016)
Y.S. Cho, A. Galstyan, P.J. Brantingham, G. Tita, Latent self-exciting point process model for
spatial-temporal networks. Discret. Contin. Dyn. Syst.- Ser. B 19(5), 1335–1354 (2014)
E.R. Colman, D. Vukadinović Greetham, Memory and burstiness in dynamic networks. Phys. Rev.
E 92, 012,817 (2015)
J.A. Danowski, P. Edison-Swift, Crisis effects on intraorganizational computer-based communica-
tion. Commun. Res. 12(2), 251–270 (1985)
A. Davis, B.B. Gardner, M.R. Gardner, Deep South (The University of Chicago Press, Chicago,
1941)
J.C. Delvenne, R. Lambiotte, L.E.C. Rocha, Diffusion on networked systems is a question of time
or structure. Nat. Commun. 6, 7366 (2015)
I. Dinur, S. Safra, On the hardness of approximating vertex cover. Ann. Math. 162(1), 439–485
(2005)
J. Enright, R.R. Kao, Epidemics on dynamic networks. Epidemics 24, 88–97 (2018)
N.H. Fefferman, K.L. Ng, How disease models in static networks can fail to approximate disease
in dynamic networks. Phys. Rev. E 76, 031,919 (2007)
L. Gauvin, M. Génois, M. Karsai, M. Kivelä, T. Takaguchi, E. Valdano, C.L. Vestergaard, Random-
ized reference models for temporal networks. SIAM Rev. 64(4), 763–830 (2022)
M. Génois, C.L. Vestergaard, J. Fournet, A. Panisson, I. Bonmarin, A. Barrat, Data on face-to-face
contacts in an office building suggest a low-cost vaccination strategy based on community linkers.
Netw. Sci. 3(3), 326–347 (2015)
A. Grönlund, P. Holme, Networking the seceder model: Group formation in social and economic
systems. Phys. Rev. E 70, 036,108 (2004)
T. Gross, H. Sayama (eds.), Adaptive Networks (Springer, Berlin, 2009)
J. Gu, S. Lee, J. Saramäki, P. Holme, Ranking influential spreaders is an ill-defined problem. EPL
(Europhys. Lett.) 118(6), 68,002 (2017)
D. Han, M. Sun, D. Li, Epidemic process on activity-driven modular networks. Phys. A 432, 354–
362 (2015)
H.W. Hethcote, The mathematics of infectious diseases. SIAM Rev. 42, 599 (2000)
P. Holme, Epidemiologically optimal static networks from temporal network data. PLoS Comput.
Biol. 9, e1003,142 (2013)
P. Holme, Network reachability of real-world contact sequences. Phys. Rev. E 71, 046,119 (2005)
P. Holme, Network dynamics of ongoing social relationships. Europhys. Lett. 64, 427–433 (2003)
P. Holme, Modern temporal network theory: A colloquium. Eur. Phys. J. B 88, 234 (2015)
P. Holme, F. Liljeros, Birth and death of links control disease spreading in empirical contact net-
works. Sci. Rep. 4, 4999 (2014)
P. Holme, J. Saramäki, Temporal networks. Phys. Rep. 519, 97–125 (2012)
H. Hong, M. Ha, H. Park, Finite-size scaling in complex networks. Phys. Rev. Lett. 98(25), 258,701
(2007)
D.X. Horváth, J. Kertész, Spreading dynamics on networks: the role of burstiness, topology and
non-stationarity. New J. Phys. 16(7), 073,037 (2014)
Q. Huang, C. Zhao, X. Zhang, X. Wang, D. Yi, Centrality measures in temporal networks with time
series analysis. EPL (Europhys. Lett.) 118(3), 36,001 (2017)
H.H. Jo, J.I. Perotti, K. Kaski, J. Kertész, Analytically solvable model of spreading dynamics with
non-poissonian processes. Phys. Rev. X 4, 011,041 (2014)
A. Johansen, Probing human response times. Phys. A 330, 286–291 (2004)

F. Karimi, P. Holme, Threshold model of cascades in empirical temporal networks. Phys. A 392(16),
3476–3483 (2013)
M. Karsai, M. Kivelä, R.K. Pan, K. Kaski, J. Kertész, A.L. Barabási, J. Saramäki, Small but slow
world: how network topology and burstiness slow down spreading. Phys. Rev. E 83, 025,102(R)
(2011)
M. Karsai, N. Perra, A. Vespignani, Time varying networks and the weakness of strong ties. Sci.
Rep. 4, 4001 (2014)
M. Karsai, H.H. Jo, K. Kaski (eds.), Bursty Human Dynamics (Springer, Berlin, 2018)
D. Kempe, J. Kleinberg, É. Tardos, Maximizing the spread of influence through a social network,
in Proceedings of the Ninth ACM SIGKDD International Conference on Knowledge Discovery
and Data Mining (ACM, 2003), pp. 137–146
B.J. Kim, Geographical coarse graining of complex networks. Phys. Rev. Lett. 93, 168,701 (2004)
M. Kivelä, J. Cambe, J. Saramäki, M. Karsai, Mapping temporal-network percolation to weighted,
static event graphs. Sci. Rep. 8, 12,357 (2018)
M. Kivelä, M.A. Porter, Estimating interevent time distributions from finite observation periods in
communication networks. Phys. Rev. E 92, 052,813 (2015)
G. Krings, M. Karsai, S. Bernhardsson, V.D. Blondel, J. Saramäki, Effects of time window size and
placement on the structure of an aggregated communication. EPJ Data Sci. 1, 4 (2012)
G. Krings, M. Karsai, S. Bernhardsson, V.D. Blondel, J. Saramäki, Effects of time window size
and placement on the structure of an aggregated communication network. EPJ Data Sci. 1(1), 4
(2012)
R. Kujala, J. Weckström, R. Darst, M. Mladenovic, J. Saramäki, A collection of public transport
network data sets for 25 cities. Sci. Data 5, 180,089 (2018)
L. Lamport, Time, clocks, and the ordering of events in a distributed system. Commun. ACM 21,
558–565 (1978)
S.H. Lee, P.J. Kim, H. Jeong, Statistical properties of sampled networks. Phys. Rev. E 73, 016,102
(2006)
S.H. Lee, P. Holme, Navigating temporal networks. Phys. A 513, 288–296 (2019)
A. Li, S.P. Cornelius, Y.Y. Liu, L. Wang, A.L. Barabási, The fundamental advantages of temporal
networks. Science 358, 1042–1046 (2017)
D. Liben-Nowell, J. Kleinberg, The link-prediction problem for social networks. J. Am. Soc. Inf.
Sci. Technol. 58(7), 1019–1031 (2007)
S.Y. Liu, A. Baronchelli, N. Perra, Contagion dynamics in time-varying metapopulation networks.
Phys. Rev. E 87, 032,805 (2013)
S. Liu, N. Perra, M. Karsai, A. Vespignani, Controlling contagion processes in activity driven
networks. Phys. Rev. Lett. 112, 118,702 (2014)
N. Masuda, P. Holme, Predicting and controlling infectious disease epidemics using temporal net-
works. F1000Prime Rep. 5, 6 (2015)
N. Masuda, R. Lambiotte, A Guide to Temporal Networks (World Scientific, Singapore, 2016)
N. Masuda, L.E.C. Rocha, A Gillespie algorithm for non-markovian stochastic processes. SIAM
Rev. 60, 95–115 (2018)
N. Masuda, T. Takaguchi, N. Sato, K. Yano, Self-exciting point process modeling of conversation
event sequences, in Temporal Networks. ed. by P. Holme, J. Saramäki (Springer, Berlin, 2013),
pp.245–264
A. Mellor, The temporal event graph. J. Compl. Netw. 6, 639–659 (2018)
B. Min, K.I. Goh, A. Vazquez, Spreading dynamics following bursty human activity patterns. Phys.
Rev. E 83, 036,102 (2011)
G. Miritello, E. Moro, R. Lara, Dynamical strength of social ties in information spreading. Phys.
Rev. E 83, 045,102 (2011)
M. Morris, M. Kretzschmar, Concurrent partnerships and transmission dynamics in networks. Soc.
Netw. 17(3), 299–318 (1995). Social networks and infectious disease: HIV/AIDS
P.J. Mucha, T. Richardson, K. Macon, M.A. Porter, J.P. Onnela, Community structure in time-
dependent, multiscale, and multiplex networks. Science 328, 876–878 (2010)

M.E.J. Newman, Estimating network structure from unreliable measurements. Phys. Rev. E 98(6),
062,321 (2018)
M.E.J. Newman, Networks: An Introduction (Oxford University Press, Oxford, 2010)
J.P. Onnela, J. Saramäki, J. Hyvönen, G. Szabó, D. Lazer, K. Kaski, J. Kertész, A.L. Barabási,
Structure and tie strengths in mobile communication networks. Proc. Natl. Acad. Sci. U.S.A.
104, 7332–7336 (2007)
G. Palla, A.L. Barabási, T. Vicsek, Quantifying social group evolution. Nature 446, 664–667 (2007)
R.K. Pan, J. Saramäki, Path lengths, correlations, and centrality in temporal networks. Phys. Rev.
E 84, 016,105 (2011)
L. Peel, A. Clauset, Detecting change points in the large-scale structure of evolving networks, in
Twenty-Ninth AAAI Conference on Artificial Intelligence (2015)
T.P. Peixoto, Network reconstruction and community detection from dynamics. Phys. Rev. Lett.
123, 128301 (2019)
N. Perra, A. Baronchelli, D. Mocanu, B. Gonçalves, R. Pastor-Satorras, A. Vespignani, Random
walks and search in time-varying networks. Phys. Rev. Lett. 109, 238,701 (2012)
N. Perra, B. Gonçalves, R. Pastor-Satorras, A. Vespignani, Activity driven modeling of time varying
networks. Sci. Rep. 4, 4001 (2014)
C. Presigny, P. Holme, A. Barrat, Building surrogate temporal network data from observed back-
bones. Phys. Rev. E 103, 052,304 (2021)
V. Rico-Gray, C. Díaz-Castelazo, A. Ramírez-Hernández, P.R. Guimarães, J.N. Holland, Abiotic
factors shape temporal variation in the structure of an ant-plant network. Arthropod-Plant Interact.
6(2), 289–295 (2012)
L.E.C. Rocha, V.D. Blondel, Bursts of vertex activation and epidemics in evolving networks. PLoS
Comput. Biol. 9(3), 1–9 (2013)
L.E.C. Rocha, F. Liljeros, P. Holme, Simulated epidemics in an empirical spatiotemporal network
of 50,185 sexual contacts. PLoS Comput. Biol. 7, 1–9 (2011)
M.P. Rombach, M.A. Porter, J.H. Fowler, P.J. Mucha, Core-periphery structure in networks. SIAM
J. Appl. Math. 74(1), 167–190 (2014)
G. Rossetti, R. Cazabet, Community discovery in dynamic networks: a survey. ACM Comput. Surv.
51, 35 (2018)
M. Rosvall, C.T. Bergstrom, Mapping change in large networks. PLoS ONE 5(1), e8694 (2010)
M. Rosvall, A.V. Esquivel, A. Lancichinetti, J.D. West, R. Lambiotte, Memory in network flows
and its effects on spreading dynamics and community detection. Nat. Commun. 5, 4630 (2014)
J. Saramäki, P. Holme, Exploring temporal networks with greedy walks. Eur. Phys. J. B 88(12), 334
(2015)
S. Scellato, I. Leontiadis, C. Mascolo, P. Basu, M. Zafer, Evaluating temporal robustness of mobile
networks. IEEE Trans. Mob. Comput. 12(1), 105–117 (2013)
M.T. Schaub, J.C. Delvenne, M. Rosvall, R. Lambiotte, The many facets of community detection
in complex networks. Appl. Netw. Sci. 2(1), 4 (2017)
M.T. Schaub, J.C. Delvenne, M. Rosvall, R. Lambiotte, Examining the importance of existing
relationships for co-offending: a temporal network analysis in Bogotá, Colombia (2005–2018).
Appl. Netw. Sci. 8, 4 (2023)
V. Sekara, A. Stopczynski, S. Lehmann, Fundamental structures of dynamic social networks. Proc.
Natl. Acad. Sci. U.S.A. 113(36), 9977–9982 (2016)
M.Á. Serrano, M. Boguná, A. Vespignani, Extracting the multiscale backbone of complex weighted
networks. Proc. Natl. Acad. Sci. U.S.A. 106(16), 6483–6488 (2009)
S. Sikdar, N. Ganguly, A. Mukherjee, Time series analysis of temporal networks. Eur. Phys. J. B
89(1), 11 (2016)
C. Song, S. Havlin, H.A. Makse, Origins of fractality in the growth of complex networks. Nat. Phys.
2(4), 275 (2006)
M. Starnini, A. Baronchelli, A. Barrat, R. Pastor-Satorras, Random walks on temporal networks.
Phys. Rev. E 85(5), 056,115 (2012)

M. Starnini, A. Baronchelli, R. Pastor-Satorras, Modeling human dynamics of face-to-face interac-
tion networks. Phys. Rev. Lett. 110, 168,701 (2013)
M. Starnini, R. Pastor-Satorras, Temporal percolation in activity-driven networks. Phys. Rev. E 89,
032,807 (2014)
M. Starnini, A. Machens, C. Cattuto, A. Barrat, R. Pastor-Satorras, Immunization strategies for
epidemic processes in time-varying contact networks. J. Theor. Biol. 337, 89–100 (2013)
A. Stopczynski, V. Sekara, P. Sapiezynski, A. Cuttone, M.M. Madsen, J.E. Larsen, S. Lehmann,
Measuring large-scale social networks with high resolution. PLOS ONE 9, e95,978 (2014)
K. Sun, A. Baronchelli, N. Perra, Contrasting effects of strong ties on sir and sis processes in
temporal networks. Eur. Phys. J. B 88(12), 326 (2015)
T. Takaguchi, N. Masuda, P. Holme, Bursty communication patterns facilitate spreading in a
threshold-based epidemic dynamics. PLOS ONE 8, e68,629 (2013)
T. Takaguchi, N. Sato, K. Yano, N. Masuda, Importance of individual events in temporal networks.
New J. Phys. 14(9), 093,003 (2012)
J. Tang, I. Leontiadis, S. Scellato, V. Nicosia, C. Mascolo, M. Musolesi, V. Latora, Applications
of temporal graph metrics to real-world networks, in Temporal Networks. ed. by P. Holme, J.
Saramäki (Springer, Berlin, 2013), pp.135–159
D. Taylor, S.A. Myers, A. Clauset, M.A. Porter, P.J. Mucha, Eigenvector-based centrality measures
for temporal networks. Multiscale Model. Simul. 15(1), 537–574 (2017)
S. Trajanovski, S. Scellato, I. Leontiadis, Error and attack vulnerability of temporal networks. Phys.
Rev. E 85, 066,105 (2012)
M. Ushio, C.H. Hsieh, R. Masuda, E.R. Deyle, H. Ye, C.W. Chang, G. Sugihara, M. Kondoh,
Fluctuating interaction network and time-varying stability of a natural fish community. Nature
554, 360–363 (2018)
A. Vazquez, B. Rácz, A. Lukács, A.L. Barabási, Impact of non-poissonian activity patterns on
spreading processes. Phys. Rev. Lett. 98, 158,702 (2007)
C.L. Vestergaard, M. Génois, A. Barrat, How memory generates heterogeneous dynamics in tem-
poral networks. Phys. Rev. E 90, 042,805 (2014)
O.E. Williams, L. Lacasa, A.P. Millán, V. Latora, The shape of memory in temporal networks. Nat.
Commun. 13, 499 (2022)
X.X. Zhan, A. Hanjalic, H. Wang, Information diffusion backbones in temporal networks. Sci. Rep.
9, 6798 (2019)
Y.Q. Zhang, X. Li, D. Liang, J. Cui, Characterizing bursts of aggregate pairs with individual pois-
sonian activity and preferential mobility. IEEE Commun. Lett. 19(7), 1225–1228 (2015)
Y. Zhang, G. Wen, G. Chen, J. Wang, M. Xiong, J. Guan, S. Zhou, Gaming temporal networks.
IEEE Trans. Circuits Syst. II Express Briefs 66(4), 672–676 (2019)
Chapter 2
Fundamental Structures in Temporal
Communication Networks

Sune Lehmann

Abstract In this paper I introduce a framework for modeling temporal communication networks and dynamical processes unfolding on such networks. The framework
originates from the (new) observation that there is a meaningful division of temporal
communication networks into six dynamic classes, where the class of a network is
determined by its generating process. In particular, each class is characterized by a
fundamental structure: a temporal-topological network motif, which corresponds to
the network representation of communication events in that class of network. These
fundamental structures constrain network configurations: only certain configurations
are possible within a dynamic class. In this way the framework presented here high-
lights strong constraints on network structures, which simplify analyses and shape
network flows. Therefore the fundamental structures hold the potential to impact how
we model temporal networks overall. I argue below that networks within the same
class can be meaningfully compared, and modeled using similar techniques, but that
integrating statistics across networks belonging to separate classes is not meaningful
in general. This paper presents a framework for how to analyze networks in general,
rather than a particular result of analyzing a particular dataset. I hope, however, that
readers interested in modeling temporal networks will find the ideas and discussion
useful in spite of the paper’s more conceptual nature.

Keywords Temporal networks · Proximity networks · Communication networks ·


Electronic communication

2.1 Introduction

Temporal networks provide an important methodology for modeling a range of dynamical systems (Holme and Saramäki 2012, 2013; Holme 2015). A central cat-
egory of temporal networks is communication networks, which—in this context—I
define to be networks that facilitate or represent communication between human

S. Lehmann (B)
Technical University of Denmark, DTU Compute, DK-2800 Kgs, Lyngby, Denmark
e-mail: sljo@dtu.dk


beings. Frequently analyzed examples of communication networks are networks of face-to-face contacts between individuals (Eagle et al. 2007; Sekara et al. 2016),
phone calls and text messages (Onnela et al. 2007), online social networks such as
Facebook (Ugander et al. 2011, 2012) or Twitter (Myers et al. 2014), and networks
of email messages (Guimera et al. 2003). But communication networks could also
represent other types of human communication, such as broadcast networks (e.g. tele-
vision or newspapers) or communication via letters or books. While the framework
discussed here is presented in the context of human communication networks, in
many cases the validity of the framework extends beyond networks of human com-
munication to describe networks of machine-machine communication, biological
signaling, etc.

2.2 Network Structure of Communication Events

The main realization underlying the ideas presented here is that each human commu-
nicative act is shaped by the medium in which it takes place. As modern communica-
tion tools have developed, the richness of the ways human beings can communicate
with one another has grown. What is perhaps less recognized in the field of network
theory is that each new medium for communication sets its own particular constraints
for the network structure of communication events within that medium. In the field of
communication studies, a key question is to understand how the technological evolu-
tion impacts human communication. Therefore, within that field, the many possible
types of human communication—old and new—have been boiled down to six fun-
damental prototypical communicative practices1 (Jensen and Helles 2011) shown in
Table 2.1.

Table 2.1 Six prototypical communicative practices and real-world examples of each practice
                 Synchronous                        Asynchronous
One-to-one       Phone call^a, voice chat           Text message, letter
One-to-many      Broadcast Radio and TV             Book, Newspaper, Webpage
Many-to-many     Face-to-face, Online chatroom      Online social network, Wiki
^a It is, of course, possible to set up a conference call, but one-to-one calls are so prevalent within this medium of communication that I allow myself to use phone calls as an example of one-to-one communication

1 In their formulation within the field of communication these practices are not connected to the
underlying communication networks (or their dynamics); rather, the prototypical practices are used
as a way to categorize real-world communication and understand their impact on, e.g. communica-
tion practices.

2.2.1 Synchronous Versus Asynchronous

In the vertical split, Table 2.1 makes a distinction between synchronous and asyn-
chronous communication. In the case of synchronous communication, both parties
are active and engaged, e.g. during a phone call. Conversely, in the case of asyn-
chronous communication, a message is initiated at some time by the sender and
then received later at some time by the recipient(s). For example, in the case of
one-to-one communication, the recipient reading a text message or a letter.

2.2.2 One-to-One, One-to-Many, Many-to-Many

Along the horizontal splits, each row in Table 2.1 refers to the configuration of partic-
ipants in a given communication act and the nature of their interaction. This division
of communicative behaviors into one-to-one, one-to-many, and many-to-many
is quite natural and recognized beyond communication theory; these distinctions are
used, for example, in the analysis of computer networks (Carlberg and Crowcroft
1997; Jo and Kim 2002), when negotiating contracts (Rahwan et al. 2002), within
marketing (Gummesson 2004), or as design patterns/data models in database design
(Jewett 2011).

2.2.3 Connecting to Network Theory

Bringing this framework, which was developed to organize different types of com-
munication, into the realm of temporal network theory, I propose that we think of
each prototypical type of communication as defining a dynamic class of network and
that every real-world communication network can be modeled as belonging to one of
these six classes.
The key concept which distinguishes the six classes is their fundamental struc-
tures. We arrive at the fundamental structures by first noticing that each row in
Table 2.1 corresponds to an archetypal network structure: one-to-one interactions
correspond to dyads, one-to-many interactions can be represented as star graphs (or
trees), and the many-to-many interactions match the network structure of cliques.
When also incorporating the temporal aspect (synchronous/asynchronous), we
arrive at the network representations of the six prototypical communicative prac-
tices, the fundamental structures.
The fundamental structures are temporal-topological network patterns, with each
pattern corresponding to a communication event (a phone call, a meeting, a text mes-
sage) in that network. We can then model the entire network as sequences of instances
of fundamental structures. Since each class is characterized by its fundamental struc-

ture, we name each class according to its fundamental structure: synchronous,
one-to-one; synchronous, one-to-many; …; asynchronous, many-to-many.
Let me provide some examples to give a sense of what I mean. In the syn-
chronous, one-to-one class (e.g. the phone call network), fundamental structures
are individual dyads, with some duration given by its start and end time. In syn-
chronous, one-to-many networks (e.g. a live-stream), the duration of the commu-
nication event is set by the node which is broadcasting, whereas receiving nodes
may participate for only part of the communication event’s duration. Finally, in the
synchronous, many-to-many class, where the fundamental structure is a sequence
of cliques (see Fig. 2.2 for an example), the start of the communication event is set by
the first participant(s) connecting—and the end occurs when the last participant(s)
stop communicating. An example of this class is face-to-face networks, where a
fundamental structure could represent a group of friends meeting for dinner at a
restaurant.
In all of the synchronous classes, infinitesimally thin temporal slices of the
communication event reveal the network pattern characteristic of that class. That
is, a dyad, tree, or clique for the one-to-one, one-to-many, and many-to-many
class respectively. I illustrate this point in Fig. 2.1, where I show brief snapshots (thin
temporal slices) of the network of phone-calls (synchronous, one-to-one) which
consists of disconnected dyads, each dyad corresponding to an ongoing conversation
(see Fig. 2.1a), whereas a slice of face-to-face meetings is well approximated as
disconnected cliques (see Fig. 2.1b). Getting a bit ahead of myself, I note that, already
at this point, it is clear from inspection that from the point of view of a dynamical
process, the possible network flows in the two networks shown in Fig. 2.1 are going
to be very different.
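The point about thin time-slices can be checked directly on contact data with durations. The sketch below extracts the instantaneous snapshot at a given time and reports whether each connected component is a dyad, a clique, or something else; the contact format (u, v, t_start, t_end) is an assumption made for illustration.

from collections import defaultdict
from itertools import combinations

# Classify the components of an instantaneous snapshot of a temporal network
# whose contacts carry durations: (u, v, t_start, t_end).
def snapshot(contacts, t):
    edges = {frozenset((u, v)) for u, v, t0, t1 in contacts if t0 <= t <= t1}
    adj = defaultdict(set)
    for e in edges:
        u, v = tuple(e)
        adj[u].add(v); adj[v].add(u)
    return edges, adj

def components(adj):
    seen, comps = set(), []
    for start in adj:
        if start in seen:
            continue
        stack, comp = [start], set()
        while stack:
            n = stack.pop()
            if n in comp:
                continue
            comp.add(n)
            stack.extend(adj[n] - comp)
        seen |= comp
        comps.append(comp)
    return comps

def classify(comp, edges):
    if len(comp) == 2:
        return "dyad"
    n_edges = sum(1 for a, b in combinations(comp, 2) if frozenset((a, b)) in edges)
    return "clique" if n_edges == len(comp) * (len(comp) - 1) // 2 else "other"

if __name__ == "__main__":
    toy = [("a", "b", 0, 10), ("b", "c", 0, 10), ("a", "c", 0, 10), ("x", "y", 0, 5)]
    edges, adj = snapshot(toy, t=3)
    for comp in components(adj):
        print(sorted(comp), "->", classify(comp, edges))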
Let us now consider examples of networks from the asynchronous classes. Here
one-to-one communication events still involve dyads, and an event starts when a
person sends a message. The communication ends when the recipient receives the
message at some later time. In the asynchronous, one-to-many class, a communi-


Fig. 2.1 Cross-sections of fundamental structures are revealed in brief snapshots of networks from
the SensibleDTU project (Stopczynski et al. 2014). a A one minute time-slice from the phone call
network at peak activity; the network is entirely composed of dyads. b Social interactions over
5 min in the face-to-face contact network. Here the network is disconnected and well-approximated
by non-overlapping cliques
2 Fundamental Structures in Temporal Communication Networks 29

cation event starts when some communication is initiated (and that node becomes
active): a book is published, a web-page is launched, etc. Now, recipients can engage
with that active node at any point until the sender-node is no longer active/available—
and thus ending that fundamental structure. Finally the asynchronous, many-to-
many class. Here, again a node becomes active (starting the communication event),
and other nodes can engage with the active node as well as all other nodes in that
conversation. The fundamental structure ends when original post ceases to be avail-
able (although activity may end sooner than that). Examples of networks from the
asynchronous, many-to-many class is a post on a message board, or a post on
Facebook.2 In the case of the asynchronous classes, infinitesimally thin time-slices
of the fundamental structures are empty, only revealing the active nodes, since the
interactions themselves typically are instantaneous and do not have a duration.

Summarizing the discussion above, the key concepts are

• Dynamic Classes. Each dynamic class is the set of networks characterized
by a certain type of fundamental structure. There are six dynamic classes,
cf. Table 2.1.
• Fundamental structures. A fundamental structure is the topological-
temporal network representation of the archetypical communication pattern
within a class of network. Each communication event corresponds to an
instance of the fundamental structure characterizing that network.
A useful way to think about real-world communication networks is as
sequences of instances of fundamental structures from a single class. In this
sense we can think of each of the fundamental structures as generating a class
of networks.
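One way to operationalize these two concepts is to carry a network's dynamic class explicitly in its data representation and to store the network itself as a sequence of communication events. The sketch below is one such hypothetical representation; all names and fields are illustrative and not part of the framework itself.

from dataclasses import dataclass, field
from enum import Enum

# Illustrative representation of the two key concepts: a dynamic class (one of
# the six cells of Table 2.1) and a fundamental structure, i.e. a single
# communication event within a network of that class.
class Timing(Enum):
    SYNCHRONOUS = "synchronous"
    ASYNCHRONOUS = "asynchronous"

class Cardinality(Enum):
    ONE_TO_ONE = "one-to-one"        # archetypal structure: dyad
    ONE_TO_MANY = "one-to-many"      # archetypal structure: star/tree
    MANY_TO_MANY = "many-to-many"    # archetypal structure: clique

@dataclass
class CommunicationEvent:            # one instance of a fundamental structure
    initiators: frozenset
    recipients: frozenset
    t_start: float
    t_end: float                     # for asynchronous classes: availability window

@dataclass
class TemporalNetwork:
    timing: Timing
    cardinality: Cardinality
    events: list = field(default_factory=list)

if __name__ == "__main__":
    # e.g. a phone-call network: synchronous, one-to-one
    calls = TemporalNetwork(Timing.SYNCHRONOUS, Cardinality.ONE_TO_ONE)
    calls.events.append(CommunicationEvent(frozenset({"alice"}), frozenset({"bob"}), 0.0, 180.0))
    print(calls.timing.value, calls.cardinality.value, len(calls.events), "event(s)")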

These distinctions prove to be important because while two networks originating from different classes may have similar topological properties when aggregated over time, a difference in network class may have a profound impact on the network dynamics and on processes unfolding on the network (cf. Fig. 2.1). Stated differently: When we
consider the networks on much shorter time-scales than those typically considered
in the literature, networks from the six classes are radically different.
I cover this point in detail below, arguing that there are a number of advan-
tages associated with thinking about temporal networks as sequences of fundamental
structures—and that while the impact is on many aspects of how we model tempo-
ral networks, the unfolding of dynamics is particularly important. Further, I argue
below that when we compare analyses of various real-world networks, we should
only expect similar behavior when we compare networks within the same dynamic
class, and that we should aggregate statistics within each class of network separately.

2It is not guaranteed that all posts become a full discussion between all readers—and if no-one
comments, such posts could display a one-to-many structure. I discuss this below.

2.2.4 The Case of Many-to-Many, Synchronous Networks

Before we move on, let me start by showing how the fundamental structures can
lead to clean, simple descriptions of temporal communication networks. A few years
ago—without realizing the connection to a larger framework—my group analyzed
a network from the class of synchronous, many-to-many interactions (Sekara
et al. 2016), arising from person-to-person interactions (estimated using Bluetooth
signal strength; Sekara and Lehmann 2014) in a large, densely connected network
(Stopczynski et al. 2014).
The key realization arose from simply plotting the contact-network at increasingly
higher temporal resolution. The green hairball (Fig. 2.2, left panel) shows connections
between everyone who has spent time together, aggregated across an entire day. The
orange network (Fig. 2.2, middle panel) shows contacts aggregated over an hour, and
the blue network (Fig. 2.2, right panel) shows the interactions during a five-minute
time slice. The discovery originates from the blue network. There, we can directly
observe cross-sections of the fundamental structures: groups (well approximated by
cliques) of people spending time together.
This was a case where analyzing the network became easier by including higher
resolution temporal data (in our case, no community detection was necessary). Usu-
ally it is the opposite. Usually, our descriptions become more complex when we have
to account for more detailed data, especially temporal data (Holme and Saramäki
2013; Holme 2015; Masuda and Lambiotte 2016). I take the fact that more data sim-
plified this particular problem to mean that we were on to something: in this case,
the fundamental structures constitute a quite natural representation of the network.
This way of representing the network also provided a way to understand the tem-
poral evolution of the network. By simply matching up cross-sections of the fundamental
structures across time-slices, we could then construct the full fundamental structures
(the individual communication events) for this class of network. We called the result
gatherings—the temporal representation of a meeting between a group of individ-
uals (see Fig. 2.3 for examples of gatherings in two real-world networks). Gath-
erings represent the fundamental structure of many-to-many, synchronous net-
works. Studying the properties of gatherings allowed us to estimate the relevant time-
scales and spatial behaviors of the fundamental structures in these systems, e.g. how

Fig. 2.2 Three views of the contact network. Left (green), all interactions aggregated over 24 h.
Middle (orange), interactions during one hour. Right (blue), interactions in a 5-min window

Fig. 2.3 An example of fundamental structures in real-time, many-to-many networks. In both panels, time runs from left to right, and each horizontal colored band represents a fundamental
structure in that network (a sequence of cliques matched up over time). Therefore, each horizontal
colored band is basically a representation of a group of people meeting, with the width of each band
proportional to the number of participants at that time. Here we show these fundamental structures
in two social settings: a a Workplace network (Génois et al. 2015), b the SensibleDTU data
(Stopczynski et al. 2014)

individual nodes interact with the gatherings. Turning our attention to time-scales
of weeks and months, we could study the patterns of meetings (gatherings) among
the same people beyond single meetings. Thus, we could model the network dynam-
ics as sequences of—and relationships between—such gatherings. This provided a
dramatic simplification allowing us, for example, to make predictions about the tem-
poral trajectories of individual nodes through the social network (Sekara et al. 2016).
We have since developed more sophisticated methods for identifying communities
in this class of networks (Aslak et al. 2017).
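A bare-bones version of the matching step described above, assuming groups (approximate cliques) have already been extracted for each time-slice, could look like the following sketch: groups in consecutive slices are chained into a gathering whenever their overlap exceeds a threshold. The Jaccard criterion and threshold are illustrative choices; the published pipeline (Sekara et al. 2016) is considerably more careful.

# Chain per-slice groups into "gatherings": sequences of overlapping groups in
# consecutive time-slices. `slices` is a list (one entry per slice) of lists of
# node-sets; the Jaccard threshold is an illustrative choice.
def jaccard(a, b):
    return len(a & b) / len(a | b)

def build_gatherings(slices, threshold=0.5):
    gatherings = []                       # each gathering: list of (slice_index, group)
    open_g = []                           # gatherings still active in the previous slice
    for s_idx, groups in enumerate(slices):
        matched_prev, next_open = set(), []
        for group in groups:
            best, best_j = None, threshold
            for k, g in enumerate(open_g):
                j = jaccard(g[-1][1], group)
                if k not in matched_prev and j >= best_j:
                    best, best_j = k, j
            if best is None:              # start a new gathering
                g = [(s_idx, group)]
                gatherings.append(g)
            else:                          # continue an existing gathering
                matched_prev.add(best)
                g = open_g[best]
                g.append((s_idx, group))
            next_open.append(g)
        open_g = next_open
    return gatherings

if __name__ == "__main__":
    slices = [[{"a", "b", "c"}], [{"a", "b", "c", "d"}, {"x", "y"}], [{"x", "y"}]]
    for g in build_gatherings(slices):
        print([(t, sorted(nodes)) for t, nodes in g])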
I include this example here to showcase the potential of the fundamental structures
to organize our modeling of a certain network, and I hope that it will be possible
to make similar progress for the remaining five dynamic classes. Connecting to
the more general point of within-class versus between-class comparisons, it is also
important to emphasize, that while the descriptions and algorithms developed in
Sekara et al. (2016), Aslak et al. (2017) are excellent when analyzing networks in the
synchronous, many-to-many class, they are not suited for describing networks in
the remaining network classes (because they assume an underlying many-to-many,
synchronous network structure).

2.3 Frequently Asked Questions

In this Frequently Asked Questions (FAQ) section, I go over a few questions that
have come up frequently when I have discussed the ideas in the paper with other
researchers.

2.3.1 What Do You Mean ‘Framework’!?

It is important to point out that the dynamic classes and associated fundamental
structures are emphatically not a mathematical framework (for example, the classes
are neither disjoint nor complete). Instead, what I aim to do here is to point out new,
meaningful structures in dynamic networks. These structures are organized around
the idea of communication events, which in turn can be roughly classified into six
prototypical forms of communication. In this sense, aspects of the framework are
qualitative, focusing on providing a useful taxonomy of classes of networks in the real
world.
Nevertheless, as I argue in detail below, the fundamental structures impose a set
of important constraints on dynamics for networks belonging to each class (with
different constraints in different classes). These constraints impact many aspects of
how we currently model and analyze temporal networks, and therein lies the value
of the framework. Much more on this in the epilogue.

2.3.2 Is the Framework All Done and Ready to Use?

A very important point to make in this FAQ section is to admit that there is still
a big piece of the framework missing. Specifically, while in the case of the
synchronous classes, understanding the temporal evolution of single communica-
tion events is relatively straightforward (as witnessed by our progress in the case of
synchronous, many-to-many networks; Sekara et al. 2016), the temporal structure
of fundamental structures of networks from the asynchronous classes is non-trivial,
since the identification of (and method of analysis for) individual acts of communication
is less clear.
In these cases, for example, while there is a well-defined end-time for each funda-
mental structure (when the active node ceases to be available), structures themselves
can still cease to show any link-activity much before that, for example an old Face-
book post which it is technically possible to comment on, but which nobody will ever
find again. Or a book, which nobody will ever read again, but which is still available on
many bookshelves. Further, in the many-to-many, asynchronous classes (which
includes many important online social networks, such as Twitter and Facebook),
there seems to be almost a spectrum running from one-to-many to many-to-many,
depending on the amount of discussion associated with a post: posts without activity
resembling trees, while vigorous discussions result in more clique-like structures.

2.3.3 Is It Just for Communication Networks?

While we focus here on modeling communication networks, it is likely that the distinctions, concepts, and methods developed for each of the classes summarized in
Table 2.1 are valid in domains outside human communication, for example dynamics
of signaling networks in biology such as protein-protein interaction networks (Rual
et al. 2005), gene regulatory networks (Thieffry et al. 1998), and metabolic networks
(Jeong et al. 2000). I also expect that the results developed in this project can be
extended to networks of computer-to-computer communication (Schwartz 1977).

2.3.4 Isn’t All This Obvious?

The distinctions pointed out in Table 2.1 may appear so self-evident that a reader
might ask why they are currently not a part of modeling temporal networks. I believe
that the reason the network classes have remained unnoticed in the context of net-
work science is that time aggregation has obscured the fundamental differences in
generating processes between networks with distinct fundamental structures.
As noted above, at the level of aggregation used in the literature, the many dis-
tinct networks (face-to-face, phone calls, text messages, emails, Twitter, Facebook,
Snapchat, Instagram, discussion forums, etc.) that we participate in have common
properties (see Fig. 2.4). These common properties are due to the simple fact that
all these networks reflect the same underlying object: the social network of relation-
ships between human beings. But as the cross sections of fundamental structures
displayed in Fig. 2.2 shows, these networks are fundamentally different from each
other on short time-scales. These differences are due to the characteristics of (and
design-choices behind) each communication platform, which inevitably encodes one
of the prototypical forms of communication in Table 2.1.
There are many traces of the fundamental structures in the recent literature. My
group’s work on communities in face-to-face networks (Sekara et al. 2016; Aslak et al.

Fig. 2.4 Three networks defined on the same set of approximately 500 nodes from the SensibleDTU
project (Stopczynski et al. 2014), with links aggregated over one week; node-positions are the
same in all three panels. From left to right the three panels show networks of physical proximity,
telecommunication, and Facebook interactions. Data from Mones et al. (2018). While from different
dynamic classes, in aggregated form, all three networks have similar topological properties

2017) discussed above proposes a new way of analyzing the class of synchronous,
many-to-many networks, but does not realize its place in a larger framework. Else-
where, recent work focusing on simplicial complexes explores the same class, both
in terms of network structures (Petri and Barrat 2018) and implications for spreading
processes (Iacopini et al. 2018), again without noting that these networks are not
necessarily representative of temporal networks generally; without explicitly point-
ing out that networks from different classes need different methods of analysis. From
another angle, it has been pointed out by many authors, see for example Krings et al.
(2012), Ribeiro et al. (2013), that time integrating techniques can introduce biases
in understanding spreading processes as we will discuss later.
In the next section, we explore the consequences of the presence of the six classes
on selected topics within temporal network analysis. Because each structure severely
constrains possible network configurations, the fundamental motifs have a profound
impact on the current state-of-the-art in temporal networks research.

2.4 Consequences for Analysis and Modeling

An immediate and important realization that flows from constraints imposed by the
fundamental structures is that many important high-order network structures are
strongly influenced by their network class.
I include an overview of five key topics below to illustrate the implications for
existing temporal network theory. This list is not exhaustive, but simply intended to
give the reader some examples of where I think the dynamic classes could be useful
for developing new descriptions of temporal networks.

2.4.1 Randomization

A common approach to understand the effect of temporal structure in networks is to use randomization techniques to probe the impact of a structural feature of the
network.
A simple example from static network theory to explain the logic of randomiza-
tion: In their seminal paper, Watts and Strogatz (1998) argued
that real world networks are ‘small worlds’, characterized by high clustering and
short path lengths. But what does ‘high’ and ‘short’ mean in the sentence above?
To make their point, Watts and Strogatz created ‘random’ counterparts to their real-
world networks which contained the same number of nodes and links as the empirical
networks, but with links placed randomly among nodes. They found that the empir-
ical networks had both clustering and path-lengths that were orders of magnitude
different (higher and lower, respectively) from their random counterparts. In static
networks, the degree distribution is also often conserved (Maslov and Sneppen 2002).

The purpose of randomization is similar in temporal networks, but the possible randomization schemes are much richer (Holme and Saramäki 2012; Gauvin et al.
2018). The idea is still: We want to estimate the effect of a specific temporal network
property and remove that property (through randomization) to measure the effect.
One may shuffle time-stamps (to understand the importance of ordering), replace
time-stamps with random times drawn from a uniform distribution (to understand
the importance of circadian patterns), shuffle links (in order to destroy topological
structures), reverse time (to understand importance of causal sequences), etc. The
idea is then to simulate a process of interest on the temporal network and compare
the dynamics of that process with the same process run on ensembles of networks
that are increasingly randomized relative to the original network (Holme 2015).
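For concreteness, two of the simplest contact-level reference models mentioned above can be written in a few lines each, operating on a list of instantaneous contacts (u, v, t); the data format is an assumption for illustration.

import random

# Two standard contact-level reference models: (1) shuffle time-stamps among
# contacts, destroying ordering and temporal correlations while keeping each
# link's number of contacts; (2) draw completely new time-stamps uniformly,
# destroying daily rhythms as well.
def shuffle_timestamps(contacts, seed=0):
    rng = random.Random(seed)
    times = [t for _, _, t in contacts]
    rng.shuffle(times)
    return [(u, v, t) for (u, v, _), t in zip(contacts, times)]

def uniform_times(contacts, t_min, t_max, seed=0):
    rng = random.Random(seed)
    return [(u, v, rng.uniform(t_min, t_max)) for u, v, _ in contacts]

if __name__ == "__main__":
    toy = [("a", "b", 1.0), ("b", "c", 2.0), ("a", "b", 3.0)]
    print(shuffle_timestamps(toy))
    print(uniform_times(toy, 0.0, 10.0))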
Because the fundamental structures (as I have argued above) correspond to indi-
vidual communication events, it is not always meaningful to randomize the networks
according to the strategies mentioned above—this generally results in configurations
of links that could not possibly appear in real communication networks.3
Thus, a fruitful area for future research is to develop randomization schemes which
respect the fundamental structures and understand how the fundamental structures
impact the existing work on network randomizations (Gauvin et al. 2018). A frame-
work for randomization that respects the network classes would be analogous to the
way that most randomizations in static networks respect degree distributions (Maslov
and Sneppen 2002) (or higher-order structures Mahadevan et al. 2006; Orsini et al.
2015), the key topological feature in these networks.

2.4.2 Generative Models

Closely related to randomization is the idea of using the fundamental structures to build new synthetic networks.
The idea of using simple models that reproduce some properties of the system
under study and its dynamics has been another important method for understand-
ing complex dynamical systems (Miller and Page 2009). Realistic synthetic data is
important because we can use such synthetic temporal networks to study dynamic
processes. The synthetic networks provide access to arbitrary amounts of data where
we (a) understand the network’s temporal changes (because we have created them)
and (b) create ensembles of networks to study variability in outcome given a partic-
ular dynamic (contrary to the case of real-networks, we typically only have a single
instance).
Thus, a plethora of models that generate temporal networks have also been inves-
tigated. The simplest approach is probably the ‘graph sequence approach’, which—
time-slice by time-slice—selects nodes according to a heavy-tailed probability dis-

3 Another, related issue is that the communication events (fundamental structures) themselves, often
are the very thing that spread information/opinions. They are not always (as many modeling papers
assume) an underlying infrastructure on which the spreading occurs.

tribution and connect them to a fixed number of neighbors (Perra et al. 2012). Due
to its simplicity, this model has been the subject of much analytical work (Perra
et al. 2012; Liu et al. 2013; Karsai et al. 2014; Liu et al. 2014; Starnini and Pastor-
Satorras 2014; Sun et al. 2015) and extended in a number of ways (Cui et al. 2013;
Mantzaris and Higham 2013; Laurent et al. 2015; Moinet et al. 2015; Sunny et al.
2015). Another simple approach is to generate a static network using an algorithm for
generating static networks (e.g. the configuration model; Newman 2010) and define
activation patterns for links (Holme 2013; Rocha and Blondel 2013). Yet another
approach is the work on simplicial complexes (Iacopini et al. 2018; Petri and Barrat
2018), discussed above. Networks can also be generated based on an ensemble of
two-dimensional random walkers with links forming when walkers are nearby each
other (Starnini et al. 2013; Zhang et al. 2015). Other interesting approaches ‘grow’
network topologies according to local rules (Bagrow and Brockmann 2013; Vester-
gaard et al. 2014). Although they focus on larger (meso-scale) structures, we can even
think of generative models for communities as models for networks (Peixoto 2013;
Gauvin et al. 2014; Peixoto 2015b, a; Valles-Catala et al. 2016; Matias and Miele
2017). Adding temporal correlations, based on the notion that there is a positive cor-
relation between inter-event times in empirical data, is the motivation behind using
Hawkes processes (Masuda et al. 2013; Cho et al. 2013), an approach which has also
been used for predicting, for example, retweet dynamics (Kobayashi and Lambiotte
2016).4
In the case of all these existing generative models, analyses based on synthetic
datasets may have little relevance for real-world problems because the models do not
incorporate the constraints on dynamics imposed by the fundamental structures.
The framework of dynamic classes, however, offers a completely new way of
generating synthetic temporal networks. Since the fundamental structures are a man-
ifestation of each network’s real-world generative process, we can create network
models by simply creating time-sequences of realistic fundamental structures for a
given class. The usefulness of such models can be tested using statistical methods (Clegg
et al. 2016).
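As a toy example of this idea for the synchronous, many-to-many class, the sketch below generates a synthetic contact sequence as a time-sequence of gatherings: randomly sized groups that stay fully connected for a random duration. All distributions and parameter values are illustrative placeholders, not fitted to data.

import random
from itertools import combinations

# Toy generative model for a synchronous, many-to-many network: the network is
# a time-sequence of fundamental structures (gatherings), i.e. groups that are
# fully connected for some duration.
def generate_gatherings(n_nodes=100, n_events=200, t_max=10_000.0, seed=0):
    rng = random.Random(seed)
    contacts = []                                   # (u, v, t_start, t_end)
    for _ in range(n_events):
        size = rng.randint(2, 6)                    # gathering size
        group = rng.sample(range(n_nodes), size)
        t0 = rng.uniform(0.0, t_max)
        duration = rng.expovariate(1.0 / 600.0)     # mean duration of 600 time units
        for u, v in combinations(group, 2):         # clique for the whole duration
            contacts.append((u, v, t0, t0 + duration))
    return sorted(contacts, key=lambda c: c[2])

if __name__ == "__main__":
    print(len(generate_gatherings()), "dyadic contact intervals generated")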

2.4.3 Link Prediction and Link Activity

Another dynamic network property strongly influenced by network class is the pattern
of how links are active/non-active, and activity correlations between sets of links
in a network (Eckmann et al. 2004; Karsai et al. 2012). In face-to-face networks,
these patterns are typically dominated by long-duration meetings between groups of
individuals (Sekara et al. 2016), whereas in text message networks back-and-forth
dynamics are common (Saramaki and Moro 2015).

4These latter models are closely related to the inter-fundamental structure activity in asyn-
chronous, many-to-many networks.

Closely related to the link-activities is temporal link prediction (Liben-Nowell and


Kleinberg 2007). Here, the objective is to model patterns of link occurrences and use
machine-learning to predict subsequent occurrences of links in the network based on
local/global features of nodes/links. In static network theory, link prediction (espe-
cially within computer science) is a large topic (Lü and Zhou 2011), which focuses
on predicting the presence of links that have been artificially removed or removed
due to noise of some kind. In temporal networks, the objective is rephrased to—for
example—predict all or some links in the next time-step (Dhote et al. 2013).
Based on our understanding of the differences in link-activities in different classes,
it is clear that the fundamental structures offer a way to understand why features for
link-prediction can vary strongly from network to network. There is simply a mas-
sive difference between predicting future links in a synchronous, many-to-many
network, where temporal cross-sections are cliques and structures typically persist
for hours, relative to e.g. text chat networks (asynchronous, one-to-one), where
individuals can be in multiple ongoing conversations and text-snippets are short. In
turn, this means that link prediction algorithms (Liben-Nowell and Kleinberg 2007;
Lü and Zhou 2011) trained on one class of networks will fare poorly on networks
belonging to other classes, since features will change dramatically depending on
network class. These caveats become especially important when link-prediction is
used to infer values for missing data (Clauset et al. 2008; Guimerà and Sales-Pardo
2009; Kim and Leskovec 2011).
Another consequence for link prediction is that current performance estimations
may be misleading. This is because, depending on the dynamic class of the network, not all links can actually be realized.
When performing a link prediction task, we feed the classifier examples of removed links (‘true’ examples) and examples of links that never existed (‘false’ examples); we then evaluate whether the classifier can tell which links exist and which do not. What we learn from the dynamic classes is that there are, in fact, two types of non-links: actual false examples and ‘impossible’ links, i.e. links that cannot occur because they are forbidden by the constraints imposed by the fundamental structures of that network. This problem is important in one-to-many networks,
where message recipients cannot communicate amongst each other, and there are
many such impossible links. Link prediction algorithms should only consider actual
false examples and not the impossible links, see Fig. 2.5 for an illustration of this
problem in a one-to-many network.
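To make this concrete, here is a minimal sketch of how an evaluation could separate the two kinds of non-links. It is only an illustration: the scoring function, the feasibility test `is_possible`, and the AUC-style metric are placeholders introduced for the example, not part of any method discussed above.

```python
from itertools import combinations

def evaluate_link_prediction(score, true_links, nodes, is_possible):
    """Compare a 'padded' evaluation (all non-links used as false examples)
    with one restricted to links that are actually possible in the
    network's dynamic class."""
    all_pairs = {frozenset(p) for p in combinations(nodes, 2)}
    true = {frozenset(p) for p in true_links}
    padded_negatives = all_pairs - true
    restricted_negatives = {p for p in padded_negatives if is_possible(*p)}

    def auc(negatives):
        # Probability that a true example is scored strictly above a false one.
        pairs = [(score(*t), score(*f)) for t in true for f in negatives]
        return sum(st > sf for st, sf in pairs) / len(pairs)

    return {"padded": auc(padded_negatives),
            "restricted": auc(restricted_negatives)}

# Toy one-to-many star: node 0 broadcasts to 1, 2, 3; only links involving 0 can occur.
result = evaluate_link_prediction(
    score=lambda u, v: 1.0 if 0 in (u, v) else 0.0,   # a score that only knows "is this a hub link?"
    true_links=[(0, 1), (0, 2)],
    nodes=[0, 1, 2, 3],
    is_possible=lambda u, v: 0 in (u, v))
print(result)
```

On this toy star, a score that merely recognizes hub links looks reasonably good under the padded evaluation, but is revealed as uninformative once impossible links are excluded.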

2.4.4 Spreading Processes

Spreading processes are profoundly impacted by the fundamental structures. Let us


begin the discussion on spreading by considering epidemic spreading. Perhaps the
most studied type of dynamical system on temporal networks is epidemic spreading, realizing compartmental models, such as SIS (susceptible-infected-susceptible), SIR (susceptible-infected-recovered), etc., on the temporal network (Holme and Saramäki

Fig. 2.5 Impossible links and link prediction performance. In this one-to-many scenario, the only possible links connect the central node to its three neighbors. In the left panel we see the ground truth network. In the middle panel, we see the links that are, in fact, relevant to consider when evaluating the performance of link prediction. In the rightmost panel, we show the ‘padded’ network, on which most current algorithms base their performance metrics. The padded task, however, includes a number of links that could not possibly occur. We are not interested in the classifier’s performance on these links, and therefore an algorithm’s ability to predict or not predict their presence should not be a part of the performance evaluation

2012; Holme 2015). In terms of disease spreading, the key quantity is the fraction of
available Susceptible-Infected links at any given time. This fraction varies strongly
depending on the network class (Mones et al. 2018), which in turn means that we
can expect spreading dynamics to unfold differently within different classes.
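As a concrete (and deliberately simplified) illustration of this quantity, the sketch below runs a discrete-time SI process over a time-stamped contact list and records, at each step, the fraction of that step's contacts that join a susceptible and an infected node. The contact-list format, parameter names, and the per-contact transmission probability `beta` are assumptions made for this example.

```python
import random
from collections import defaultdict

def si_on_contacts(contacts, seed, beta=0.5, rng=None):
    """contacts: iterable of (t, u, v) contact events.
    Returns the fraction of susceptible-infected (SI) contacts per time step,
    plus the final set of infected nodes."""
    rng = rng or random.Random(0)
    infected = {seed}
    by_time = defaultdict(list)
    for t, u, v in contacts:
        by_time[t].append((u, v))
    si_fraction = {}
    for t in sorted(by_time):
        events = by_time[t]
        # SI contacts: exactly one endpoint is currently infected
        si = [(u, v) for u, v in events if (u in infected) != (v in infected)]
        si_fraction[t] = len(si) / len(events)
        for u, v in si:
            if rng.random() < beta:      # transmission attempt
                infected.update((u, v))
    return si_fraction, infected

contacts = [(1, "a", "b"), (1, "c", "d"), (2, "b", "c"), (3, "c", "d")]
print(si_on_contacts(contacts, seed="a"))
```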
For example, a central finding when simulating epidemics on temporal networks
is that adding the temporal dimension has a strong impact on disease spreading in
nearly all networks, relative to simulating the disease on a static network. In some
cases the disease speeds up (relative to null models) and in others it slows down,
depending on a complex interplay between structure and topology (see Holme 2015
for a discussion). This raises the intriguing possibility that perhaps some classes
(e.g. one-to-one networks) might have slower epidemics than their randomized
counterparts, while other classes (e.g. many-to-many networks) might have more
rapidly spreading epidemics than their randomized versions.
If we look beyond epidemic spreading, there is experimental evidence that there
are subtle differences in spreading processes across various domains (Centola 2010;
Backstrom et al. 2006; Romero et al. 2011; Weng et al. 2013) and that opinions,
behaviors, and information spread in different ways than diseases (Centola and Macy
2007; Romero et al. 2011). When multiple sources of exposure to an innovation are
required for a transmission to take place, we call the process complex contagion. In
terms of modeling complex contagion processes on temporal networks, a key frame-
work is threshold models, (Granovetter 1978) where infection probability increases
as a function of the fraction of infected neighbors. These have been generalized to
temporal networks (Karimi and Holme 2013a, b; Takaguchi et al. 2013; Backlund
et al. 2014; Michalski et al. 2014).
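A minimal sketch of such a temporal threshold rule, evaluated on a sequence of network snapshots, could look as follows. The snapshot representation (one adjacency dict per time step) and the memoryless adoption rule are simplifying assumptions made for illustration; the cited generalizations differ in exactly these choices.

```python
def temporal_threshold_model(snapshots, seeds, threshold=0.5):
    """snapshots: list of adjacency dicts (node -> set of neighbours active at
    that time step). A node adopts once the fraction of its currently visible
    neighbours that have already adopted reaches `threshold`."""
    adopted = set(seeds)
    sizes = [len(adopted)]
    for adj in snapshots:
        newly = set()
        for node, neighbours in adj.items():
            if node in adopted or not neighbours:
                continue
            if len(neighbours & adopted) / len(neighbours) >= threshold:
                newly.add(node)
        adopted |= newly
        sizes.append(len(adopted))
    return sizes

# face-to-face style snapshot: a full group meets at once
clique = {n: {"a", "b", "c", "d"} - {n} for n in "abcd"}
# phone-call style snapshots: one dyad at a time
calls = [{"a": {"b"}, "b": {"a"}}, {"b": {"c"}, "c": {"b"}}, {"c": {"d"}, "d": {"c"}}]
print(temporal_threshold_model([clique], seeds={"a", "b"}))
print(temporal_threshold_model(calls, seeds={"a", "b"}))
```

In the face-to-face snapshot every node sees the whole group at once, so the threshold can be evaluated against many neighbors simultaneously; in the one-to-one snapshots each node sees at most one neighbor per step, so the same rule effectively degenerates, and the outcome hinges on how (and over what memory window) past contacts are accumulated, which is precisely where the cited temporal generalizations differ.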
The class of network has an even more profound impact on complex contagion
processes than on simple disease spreading. Consider, for example, a threshold model
(Granovetter 1978), where the probability of infection depends on the fraction of a node’s neighbors that are infected. Compared to phone-call networks, for example,


Fig. 2.6 A cartoon illustrating why complex contagion (e.g. the threshold model) behaves differently in different classes of networks. a Shows the meetings in a many-to-many realtime network (face-to-face meetings). b Shows phone calls (one-to-one synchronous network) among the same nodes at some point in time. Imagine that blue nodes are infected. In the many-to-many network, simultaneous information about a large set of neighbors is available for extended periods of time, allowing for an accurate overview of opinions in the network. In the phone network, nodes might need to wait a long time to access the state of some neighbors, making it much more difficult to establish an accurate picture of the neighborhood’s state

threshold models have fundamentally different outcomes in face-to-face networks,


where large groups of individuals routinely gather (Iacopini et al. 2018). In the phone
call network, forming connections to a large fraction of one’s network might take
several months. See Fig. 2.6 for an illustration of this discussion. Thus, if we want
to understand contagion on a specific network, we must first understand the class of
fundamental structures to which the network belongs.

2.4.5 Communities

Communities in static networks are groups of nodes with a high density of internal
connections. Community detection in static networks never settled on a common
definition of the term community (there is a strong analogy to clustering in machine learning, Kleinberg 2003). Thus, generalizations to temporal networks also allow for substantial variability in approaches. The simplest strategy for identifying temporal communities is to first separate the list of time-stamped edges into a sequence of static
snapshots, independently cluster each layer, and then match the communities across
the layers to find the temporal communities (Sekara et al. 2016; Palla et al. 2007;
Tantipathananandh et al. 2007; Pietilänen and Diot 2012; Kauffman et al. 2014; He
and Chen 2015). A number of approaches can directly cluster the entire stack of
temporal layers; these include three-way matrix factorization (Gauvin et al. 2014),

time-node graphs (Speidel et al. 2015), and stochastic block models (Gauvin et al.
2014; Peixoto 2015a; Matias and Miele 2017).5
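The snapshot-then-match strategy can be sketched in a few lines; here, communities found independently in each layer (by any static method) are chained across consecutive layers by Jaccard overlap. Function and parameter names are mine, and the greedy matching is only one of many possible choices.

```python
def jaccard(a, b):
    return len(a & b) / len(a | b)

def match_communities(layers, min_overlap=0.3):
    """layers: one list of communities (node sets) per snapshot, as produced by
    any static community detection run independently on each layer.
    Returns temporal communities as chains of (layer index, community index)."""
    chains = [[(0, i)] for i in range(len(layers[0]))]
    for t in range(1, len(layers)):
        for j, community in enumerate(layers[t]):
            best, best_sim = None, min_overlap
            for chain in chains:
                last_t, last_i = chain[-1]
                if last_t != t - 1:          # chain already ended or already extended
                    continue
                sim = jaccard(layers[last_t][last_i], community)
                if sim > best_sim:
                    best, best_sim = chain, sim
            if best is not None:
                best.append((t, j))
            else:
                chains.append([(t, j)])      # a new community is born
    return chains

layers = [[{"a", "b", "c"}, {"x", "y"}],
          [{"a", "b", "d"}, {"x", "y", "z"}],
          [{"x", "z"}]]
print(match_communities(layers))
```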
From the perspective of the temporal structures, the central issue with commu-
nity detection is that the appropriate community detection method varies strongly
depending on a network’s dynamic class. In synchronous, many-to-many net-
works, temporal continuity is a key feature of communities. And as we have already
discussed, communities in face-to-face networks (dynamic class: synchronous,
many-to-many) form more or less instantaneously as a group of fully connected
nodes that connect at a certain time (Sekara et al. 2016) and form gatherings that can
be easily tracked over time. In this sense, communities in face-to-face networks are
straightforward to identify—they are literally the fundamental structures of such
systems.
Identifying communities in other dynamic classes is a completely different exercise. For example, in the phone call (dynamic class: synchronous, one-to-one) or
Facebook networks (dynamic class: asynchronous, many-to-many), communities
become gradually observable as calls or messages aggregate over time. In the lat-
ter case, communities have to do with network properties other than the temporal
sequence. Here, my interactions are driven by the order in which posts were published
rather than organized by social context (as is the case in the synchronous networks).
To give a concrete example, I might retweet a work-related post about p-values, then
‘like’ a post about the Finnish heavy metal band Nightwish, published by a friend,
and finally comment on a political statement from a family member. Thus, in most
asynchronous systems, activity aggregates around active nodes (posts) rather than
social contexts. This means that interactions within communities are not necessarily
correlated in time, a fact which must be taken into consideration when we construct methods for detecting communities. At the same time, we know from the literature
that communities do exist in these networks (Palla et al. 2007; Porter et al. 2009;
Ugander et al. 2012).
As noted in the FAQ, the temporal evolution of the fundamental structures within the asynchronous classes is under-determined in the framework as it currently stands.
Similarly, exactly how to identify communities in these dynamical classes is not
clear to me. The central point I wish to make related to communities, however, is that
methods related to identifying communities in temporal networks will likely need to
be different depending on the network’s dynamical class.

5 These methods, however, do not incorporate explicit dependencies between layers. To take into
account interdependencies, some methods cluster multilayer networks using interlayer links that
represent specific causal or correlational dependencies between the layers (De Bacco et al. 2017;
Mucha et al. 2010; Chen et al. 2013; De Domenico et al. 2015; Bazzi et al. 2016; Larremore et al.
2013; Stanley et al. 2016). These interlayer dependencies are of key importance in the context of
modeling fundamental structures since the temporal aspect of the fundamental structures imposes
important (and class dependent) interlayer correlations among nodes belonging to the same fun-
damental structure over time. Using the Infomap framework (Rosvall and Bergstrom 2008), we
have recently developed a model for interlayer dependencies for synchronous, many-to-many
networks (Aslak et al. 2017).

2.5 Conclusion

The lesson that I hope arises across the five key examples above is that networks within each of the dynamic classes must be analyzed and modeled separately; that comparisons of statistics between networks are only meaningful for networks belonging to the same class. This is because the class itself (and not just the actual systems that are represented through the temporal network) strongly impacts almost all known
temporal network metrics.
Zooming out further, three key points emerge from the full discussion of the
dynamic classes and their fundamental structures.
1. Firstly, I argued that it is meaningful to divide all communication networks into
six dynamic classes (Fig. 2.1). This distinction originates from communication
studies (Jensen and Helles 2011) but is not yet recognized within network science.
2. Secondly, I pointed out that a network’s class strongly influences its temporal
evolution and alters dynamic processes on that network. This implies that we
cannot meaningfully compare results for networks belonging to different classes.
3. Thirdly, I tried to motivate the idea that the dynamic classes provide a promising
new framework for modeling temporal communication networks. This is because
every communication network can be seen as sequences of individual commu-
nication events. Thus, we can model every such network as generated by many
instances of a single fundamental structure. In this sense, the six classes provide
us with the foundation for a new framework for both measuring and modeling temporal
networks.
These three key take-homes lead me to consider the role that I hope the dynamical
classes will play in the field of temporal networks. An important element that is
currently missing from the field of temporal network theory is a set of topological
properties to measure and devise statistics for. This lack of agreed-upon structures is
eloquently pointed out by Petter Holme in his excellent review of temporal network
science (Holme 2015), where he writes:
In the history of static network theory, measuring network structure has been driving
the field. For example, after Barabási and coworkers discovered how common scale-free
(i.e. power-law-like) degree distributions are (…), there was a huge effort both to measure
degree distribution and to model their emergence.
For temporal networks, similar ubiquitous structures are yet to be discovered, perhaps
they do not even exist. This has led the research in temporal networks down a slightly
different path, where the focus is more on dynamic systems on the network and how they
are affected by structure, and less on discovering common patterns or classifying networks.
[my emphases]

Now, allow me to speculate wildly for a bit. I do not think that it is impossible
that the fundamental structures could be analogues to the ‘ubiquitous structures’
mentioned in the quote for the case of temporal networks. Perhaps the six dynamic
classes will allow us to think about structure in temporal networks in a new and more
principled way.

Finding such structures is important because, in static networks, a deeper understanding of network structure has allowed us to reason in principled ways about network function—and for most applications outside pure science, function is what
we care about. As the quote illustrates, temporal network science has had to follow
a different path, focused more on simulation, for example observing how dynamical
processes unfold. As a consequence, we still do not have a coherent picture of the
key mechanisms in temporal networks. While still unproven at this point, I think that
the fundamental structures carry the promise of being the ubiquitous structures that
Holme posits are ‘yet to be discovered’. Therefore I hope that the new perspective
provided by the dynamic classes will give rise to new statistical models, algorithms
and research questions.

Epilogue: More FAQs

There are a couple more questions that have come up frequently in discussions of the framework, but which slowed down the flow of the paper, so I have moved them here, to the epilogue, for readers who might share these particular questions.

What About Mathematical Completeness?

A graph-theory inclined reader may ask: ‘In what sense is this a mathematical framework?’ With follow-ups such as ‘Are the classes disjoint? Can a dynamic network belong to multiple classes? Can a network’s class change over time?’ They might proceed to ask ‘Are the classes complete? Can all possible networks be divided into one of the six classes? Is it possible to construct networks that fall outside the taxonomy in Table 2.1?’
Here, the answer is that this is not a framework/theory in a graph theoretical sense.
I think of the six classes as a model in the physics sense of the word.
Let me explain by way of an analogy. In the early days of quantum mechanics, Geiger and Marsden (directed by Rutherford) decided to shoot α-particles into a thin sheet of gold foil (Geiger and Marsden 1909; Geiger 1910). They noticed that the vast majority of the particles went straight through the gold foil, but that a small fraction were scattered at a wide range of angles. This was a highly unexpected and very non-classical behavior. To explain these strange experimental observations, Rutherford proposed a new model, qualitative at first, in which atoms have a tiny and heavy nucleus surrounded by a cloud of electrons (departing from the then popular ‘plum pudding model’6 of the atom, proposed in 1904 by J.J. Thomson). Based on
Rutherford’s model for the atom’s structure, other scientists were able to develop

6 Yes, that was a real thing.



better descriptions, eventually leading to the quantum mechanical framework that


we teach undergrads today.
I think of the framework presented here as a model in the same sense as Ruther-
ford’s (no comparison otherwise). Just like the model of a dense core with mostly
empty space around it was a way to organize subsequent observations and provide
structure to the theories/models to follow it, the dynamical classes are a way to orga-
nize our study of networks and to provide constraints/structure for the next steps of
theory-building.7

But How Is This Different from Temporal Motifs?

Motifs are a structural characteristic closely connected to fundamental structures,


and have been the focus of much research. This area features multiple general-
izations of the motifs in static networks (Milo et al. 2002)—small subgraphs that
occur more or less frequently than one might expect in an appropriate null model.
Typically, the strategy is to count the temporal subgraphs occurring within some
interval Δt (Zhao et al. 2010; Kovanen et al. 2013). Findings suggest that certain tit-
for-tat motifs and triangles are over-represented in phone networks (dynamic class:
synchronous, one-to-one networks) and may shape processes such as spreading
(Mantzaris and Higham 2013; Rocha and Blondel 2013; Saramäki and Holme 2015;
Delvenne et al. 2015). Recently, highly efficient methods have resulted in accurate
motif-counts for very large networks (Paranjape et al. 2017). Other motif-like struc-
tures have been explored, for example graphlets, which are equivalence classes of
Δt-causal subgraphs (Hulovatyy et al. 2015). Of particular relevance to the frame-
work presented here is work on structure prediction (Lahiri and Berger-Wolf 2007)
and related algorithms for efficiently counting isomorphic temporal subgraphs (Red-
mond and Cunningham 2013).
From the perspective of fundamental structures, there are two issues with temporal
sub-graph counting approaches. The key issue is that current methods do not mea-
sure individual communication events. The sliding window based approach, which
identifies the network structures that arise within some time Δt, does not recognize
that the fundamental structures have a natural beginning and end. As a consequence,
these methods do not identify and aggregate statistics for the fundamental structures,
rather ending up with aggregate statistics for smaller structures which are incidental
to the fundamental structures. To be concrete: In the example we discussed above
(Fig. 2.2), a motif-based method would find many triangles in the face-to-face net-
works, but not realize that the network consists of disjoint cliques, which could be
matched up over time. A second problem in some of the large comparative studies is
that the notion of network classes is not considered. This can lead to non-meaningful

7 By the way, as far as I can tell, the classes are not disjoint and not complete. Further, real networks
are not necessarily a perfect fit to their classes. But as I hope to have convinced the reader by way of the analogy above, that’s not the point.

comparisons of motif-counts between networks belonging to separate classes (see


Paranjape et al. 2017 for an example).

Acknowledgements I would like to thank Arkadiusz ‘Arek’ Stopczynski, Enys Mones, Hjalmar
Bang Carlsen, Laura Alessandretti, James Bagrow, Petter Holme, Piotr Sapiezynski, Sebastiano
Piccolo, Ulf Aslak Jensen, and Yong-Yeol Ahn for fruitful discussions and generous comments on
the manuscript text (list sorted alphabetically by first name). Special thanks to Piotr for the link
prediction example. This work was supported by the Independent Research Fund Denmark.

References

U. Aslak, M. Rosvall, S. Lehmann, Constrained information flows in temporal networks reveal


intermittent communities (2017). arXiv:1711.07649
V.P. Backlund, J. Saramäki, R.K. Pan, Effects of temporal correlations on cascades: threshold models
on temporal networks. Phys. Rev. E 89(6), 062,815 (2014)
L. Backstrom, D. Huttenlocher, J. Kleinberg, X. Lan, Group formation in large social networks:
membership, growth, and evolution, in Proceedings of the 12th ACM SIGKDD International
Conference (2006), pp. 44–54
J.P. Bagrow, D. Brockmann, Natural emergence of clusters and bursts in network evolution. Phys.
Rev. X 3(2), 021,016 (2013)
M. Bazzi, M.A. Porter, S. Williams, M. McDonald, D.J. Fenn, S.D. Howison, Community detection
in temporal multilayer networks, with an application to correlation networks. Multiscale Model.
Simul. 14(1), 1–41 (2016)
K. Carlberg, J. Crowcroft, Building shared trees using a one-to-many joining mechanism. ACM
SIGCOMM Comput. Commun. Rev. 27(1), 5–11 (1997)
D. Centola, The spread of behavior in an online social network experiment. Science 329(5996),
1194–1197 (2010)
D. Centola, M. Macy, Complex contagions and the weakness of long ties. Am. J. Sociol. 113(3),
702–734 (2007)
Y. Chen, V. Kawadia, R. Urgaonkar, Detecting overlapping temporal community structure in time-
evolving networks (2013). arXiv:1303.7226
Y.S. Cho, A. Galstyan, P.J. Brantingham, G. Tita, Latent self-exciting point process model for
spatial-temporal networks (2013). arXiv:1302.2671
A. Clauset, C. Moore, M.E. Newman, Hierarchical structure and the prediction of missing links in
networks. Nature 453(7191), 98 (2008)
R.G. Clegg, B. Parker, M. Rio, Likelihood-based assessment of dynamic networks. J. Compl. Netw.
4(4), 517–533 (2016)
J. Cui, Y.Q. Zhang, X. Li, On the clustering coefficients of temporal networks and epidemic dynam-
ics, in 2013 IEEE International Symposium on Circuits and Systems (ISCAS) (IEEE, 2013), pp
2299–2302
C. De Bacco, E.A. Power, D.B. Larremore, C. Moore, Community detection, link prediction, and
layer interdependence in multilayer networks. Phys. Rev. E 95(4), 042,317 (2017)
M. De Domenico, A. Lancichinetti, A. Arenas, M. Rosvall, Identifying modular flows on multilayer
networks reveals highly overlapping organization in interconnected systems. Phys. Rev. X 5(1),
011,027 (2015)
J.C. Delvenne, R. Lambiotte, L.E. Rocha, Diffusion on networked systems is a question of time or
structure. Nat. Commun. 6, 7366 (2015)
Y. Dhote, N. Mishra, S. Sharma, Survey and analysis of temporal link prediction in online social
networks, in 2013 International Conference on Advances in Computing, Communications and
Informatics (ICACCI) (IEEE, 2013), pp. 1178–1183

N. Eagle, A. Pentland, D. Lazer, Inferring social network structure using mobile phone data. Proc.
Natl. Acad. Sci. 106, 15274–15278 (2007)
J.P. Eckmann, E. Moses, D. Sergi, Entropy of dialogues creates coherent structures in e-mail traffic.
Proc. Natl. Acad. Sci. USA 101(40), 14333–14337 (2004)
L. Gauvin, M. Génois, M. Karsai, M. Kivelä, T. Takaguchi, E. Valdano, C.L. Vestergaard, Random-
ized reference models for temporal networks (2018). arXiv:1806.04032
L. Gauvin, A. Panisson, C. Cattuto, Detecting the community structure and activity patterns of
temporal networks: a non-negative tensor factorization approach. PloS one 9(1), e86,028 (2014)
H. Geiger, E. Marsden, On a diffuse reflection of the α-particles. Proc. Roy. Soc. Lond. Ser. A,
Contain. Pap. Math. Phys. Character 82(557), 495–500 (1909)
H. Geiger, The scattering of α-particles by matter. Proc. Roy. Soc. Lond. Ser. A, Contain. Pap.
Math. Phys. Character 83(565), 492–504 (1910)
M. Génois, C.L. Vestergaard, J. Fournet, A. Panisson, I. Bonmarin, A. Barrat, Data on face-to-face
contacts in an office building suggest a low-cost vaccination strategy based on community linkers.
Netw. Sci. 3(3), 326–347 (2015)
M. Granovetter, Threshold models of collective behavior. Am. J. Sociol. 83(6), 1420 (1978)
R. Guimera, L. Danon, A. Diaz-Guilera, F. Giralt, A. Arenas, Self-similar community structure in
a network of human interactions. Phys. Rev. E 68(6), 065,103 (2003)
R. Guimerà, M. Sales-Pardo, Missing and spurious interactions and the reconstruction of complex
networks. Proc. Natl. Acad. Sci. 106(52), 22073–22078 (2009)
E. Gummesson, From one-to-one to many-to-many marketing, in Service Excellence in Manage-
ment: Interdisciplinary Contributions, Proceedings from the QUIS 9 Symposium, Karlstad Uni-
versity Karlstad, Sweden Citeseer (2004), pp. 16–25
J. He, D. Chen, A fast algorithm for community detection in temporal network. Phys. A 429, 87–94
(2015)
P. Holme, Epidemiologically optimal static networks from temporal network data. PLOS Comput.
Biol. 9(7), e1003,142 (2013)
P. Holme, Modern temporal network theory: a colloquium. Eur. Phys. J. B 88(9), 234 (2015)
P. Holme, J. Saramäki, Temporal networks. Phys. Rep. 519, 97–125 (2012)
P. Holme, J. Saramäki, Temporal Networks (Springer, 2013)
Y. Hulovatyy, H. Chen, T. Milenković, Exploring the structure and function of temporal networks
with dynamic graphlets. Bioinformatics 31(12), i171–i180 (2015)
I. Iacopini, G. Petri, A. Barrat, V. Latora, Simplicial models of social contagion (2018).
arXiv:1810.07031
K.B. Jensen, R. Helles, The internet as a cultural forum: implications for research. New Media Soc.
13(4), 517–533 (2011)
H. Jeong, B. Tombor, R. Albert, Z.N. Oltvai, A.L. Barabási, The large-scale organization of
metabolic networks. Nature 407(6804), 651 (2000)
T. Jewett, Database Design with UML and SQL, 3 edn. Online (2011). http://www.tomjewett.com/
dbdesign
J. Jo, J. Kim, Synchronized one-to-many media streaming with adaptive playout control, in Mul-
timedia Systems and Applications V, vol. 4861. International Society for Optics and Photonics
(2002), pp. 71–83
F. Karimi, P. Holme, A temporal network version of watts’s cascade model, in Temporal Networks
(Springer, 2013a), pp. 315–329
F. Karimi, P. Holme, Threshold model of cascades in empirical temporal networks. Phys. A 392(16),
3476–3483 (2013b)
M. Karsai, K. Kaski, J. Kertész, Correlated dynamics in egocentric communication networks. Plos
one 7(7), e40,612 (2012)
M. Karsai, N. Perra, A. Vespignani, Time varying networks and the weakness of strong ties. Sci.
Rep. 4, 4001 (2014)
J. Kauffman, A. Kittas, L. Bennett, S. Tsoka, Dyconet: a gephi plugin for community detection in
dynamic complex networks. PloS one 9(7), e101,357 (2014)

M. Kim, J. Leskovec, The network completion problem: Inferring missing nodes and edges in
networks, in Proceedings of the 2011 SIAM International Conference on Data Mining (SIAM,
2011), pp. 47–58
J.M. Kleinberg, An impossibility theorem for clustering, in Advances in Neural Information Pro-
cessing Systems (2003), pp. 463–470
R. Kobayashi, R. Lambiotte, Tideh: Time-dependent hawkes process for predicting retweet dynam-
ics, in ICWSM (2016), pp. 191–200
L. Kovanen, K. Kaski, J. Kertész, J. Saramäki, Temporal motifs reveal homophily, gender-specific
patterns, and group talk in call sequences. Proc. Natl. Acad. Sci. 110(45), 18070–18075 (2013)
G. Krings, M. Karsai, S. Bernhardsson, V.D. Blondel, J. Saramäki, Effects of time window size
and placement on the structure of an aggregated communication network. EPJ Data Sci. 1(1), 4
(2012)
M. Lahiri, T.Y. Berger-Wolf, Structure prediction in temporal networks using frequent subgraphs,
in IEEE Symposium on Computational Intelligence and Data Mining, 2007. CIDM 2007 (IEEE,
2007), pp. 35–42
D.B. Larremore, A. Clauset, C.O. Buckee, A network approach to analyzing highly recombinant
malaria parasite genes. PLoS Comput. Biol. 9(10), e1003,268 (2013)
G. Laurent, J. Saramäki, M. Karsai, From calls to communities: a model for time-varying social
networks. Eur. Phys. J. B 88(11), 301 (2015)
D. Liben-Nowell, J. Kleinberg, The link-prediction problem for social networks. J. Assoc. Inf. Sci.
Technol. 58(7), 1019–1031 (2007)
S. Liu, N. Perra, M. Karsai, A. Vespignani, Controlling contagion processes in activity driven
networks. Phys. Rev. Lett. 112(11), 118,702 (2014)
S.Y. Liu, A. Baronchelli, N. Perra, Contagion dynamics in time-varying metapopulation networks.
Physical Review E 87(3), 032,805 (2013)
L. Lü, T. Zhou, Link prediction in complex networks: a survey. Phys. A 390(6), 1150–1170 (2011)
P. Mahadevan, D. Krioukov, K. Fall, A. Vahdat, Systematic topology analysis and generation using
degree correlations. in ACM SIGCOMM Computer Communication Review, vol. 36 (ACM, 2006),
pp. 135–146
A.V. Mantzaris, D.J. Higham, Infering and calibrating triadic closure in a dynamic network, in
Temporal Networks (Springer, 2013), pp. 265–282
S. Maslov, K. Sneppen, Specificity and stability in topology of protein networks. Science 296(5569),
910–913 (2002)
N. Masuda, R. Lambiotte, A Guidance to Temporal Networks (World Scientific, 2016)
N. Masuda, T. Takaguchi, N. Sato, K. Yano, Self-exciting point process modeling of conversation
event sequences, in Temporal Networks (Springer, 2013), pp. 245–264
C. Matias, V. Miele, Statistical clustering of temporal networks through a dynamic stochastic block
model. J. Roy. Stat. Soc.: Ser. B (Stat. Methodol.) 79(4), 1119–1141 (2017)
R. Michalski, T. Kajdanowicz, P. Bródka, P. Kazienko, Seed selection for spread of influence in
social networks: temporal vs. static approach. New Generation Comput. 32(3-4), 213–235 (2014)
J.H. Miller, S.E. Page, Complex Adaptive Systems: An Introduction to Computational Models of
Social Life (Princeton University Press, 2009)
R. Milo, S. Shen-Orr, S. Itzkovitz, N. Kashtan, D. Chklovskii, U. Alon, Network motifs: simple
building blocks of complex networks. Science 298(5594), 824–827 (2002)
A. Moinet, M. Starnini, R. Pastor-Satorras, Burstiness and aging in social temporal networks. Phys.
Rev. Lett. 114(10), 108,701 (2015)
E. Mones, A. Stopczynski, N. Hupert, S. Lehmann et al., Optimizing targeted vaccination across
cyber–physical networks: an empirically based mathematical simulation study. J. Roy. Soc. Inter-
face 15(138), 20170,783 (2018)
P. Mucha, T. Richardson, K. Macon, M. Porter, J.P. Onnela, Community structure in time-dependent,
multiscale, and multiplex networks. Science 328(5980), 876–878 (2010)

S.A. Myers, A. Sharma, P. Gupta, J. Lin, Information network or social network?: the structure of
the twitter follow graph, in Proceedings of the 23rd International Conference on World Wide Web
(ACM, 2014), pp. 493–498
M. Newman, Networks: An Introduction (Oxford University Press, 2010)
J.P. Onnela, J. Saramäki, J. Hyvönen, G. Szabó, D. Lazer, K. Kaski, J. Kertész, A.L. Barabási,
Structure and tie strengths in mobile communication networks. Proc. Natl. Acad. Sci. 104(18),
7332–7336 (2007)
C. Orsini, M.M. Dankulov, P. Colomer-de Simón, A. Jamakovic, P. Mahadevan, A. Vahdat, K.E.
Bassler, Z. Toroczkai, M. Boguñá, G. Caldarelli et al., Quantifying randomness in real networks.
Nat. Commun. 6, 8627 (2015)
G. Palla, A. Barabási, T. Vicsek, Quantifying social group evolution. Nature 446, 664–667 (2007)
A. Paranjape, A.R. Benson, J. Leskovec, Motifs in temporal networks, in Proceedings of the Tenth
ACM International Conference on Web Search and Data Mining (ACM, 2017), pp. 601–610
T.P. Peixoto, Parsimonious module inference in large networks. Phys. Rev. Lett. 110(14), 148,701
(2013)
T.P. Peixoto, Inferring the mesoscale structure of layered, edge-valued, and time-varying networks.
Phys. Rev. E 92(4), 042,807 (2015a)
T.P. Peixoto, Model selection and hypothesis testing for large-scale network models with overlap-
ping groups. Phys. Rev. X 5(1), 011,033 (2015b)
N. Perra, A. Baronchelli, D. Mocanu, B. Gonçalves, R. Pastor-Satorras, A. Vespignani, Random
walks and search in time-varying networks. Phys. Rev. Lett. 109(23), 238,701 (2012)
N. Perra, B. Gonçalves, R. Pastor-Satorras, A. Vespignani, Activity driven modeling of time varying
networks. Sci. Rep. 2, 469 (2012)
G. Petri, A. Barrat, Simplicial activity driven model. Phys. Rev. Lett. 121, 228,301 (2018)
A.K. Pietilänen, C. Diot, Dissemination in opportunistic social networks: the role of temporal
communities, in Proceedings of the thirteenth ACM international symposium on Mobile Ad Hoc
Networking and Computing (ACM, 2012), pp. 165–174
M.A. Porter, J.P. Onnela, P.J. Mucha, Communities in networks. Not. AMS 56(9), 1082–1097
(2009)
I. Rahwan, R. Kowalczyk, H.H. Pham, Intelligent agents for automated one-to-many e-commerce
negotiation, in Australian Computer Science Communications, vol. 24. (Australian Computer
Society, Inc., 2002), , pp. 197–204
U. Redmond, P. Cunningham, Temporal subgraph isomorphism, in Proceedings of the 2013
IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining.
(ACM, 2013), pp. 1451–1452
B. Ribeiro, N. Perra, A. Baronchelli, Quantifying the effect of temporal resolution on time-varying
networks. Sci. Rep. 3, 3006 (2013)
L.E. Rocha, V.D. Blondel, Bursts of vertex activation and epidemics in evolving networks. PLoS
Comput. Biol. 9(3), e1002,974 (2013)
L.E. Rocha, V.D. Blondel, Flow motifs reveal limitations of the static framework to represent human
interactions. Phys. Rev. E 87(4), 042,814 (2013)
D.M. Romero, B. Meeder, J. Kleinberg, Differences in the mechanics of information diffusion
across topics: idioms, political hashtags, and complex contagion on twitter, in Proceedings of the
20th International Conference on World Wide Web (ACM, 2011), pp. 695–704
M. Rosvall, C. Bergstrom, Maps of random walks on complex networks reveal community structure.
Proc. Natl. Acad. Sci. 105(4), 1118–1123 (2008)
J.F. Rual, K. Venkatesan, T. Hao, T. Hirozane-Kishikawa, A. Dricot, N. Li, G.F. Berriz, F.D. Gibbons,
M. Dreze, N. Ayivi-Guedehoussou et al., Towards a proteome-scale map of the human protein-
protein interaction network. Nature 437(7062), 1173 (2005)
J. Saramäki, P. Holme, Exploring temporal networks with greedy walks. Eur. Phys. J. B 88(12), 334
(2015)
J. Saramaki, E. Moro, From seconds to months: multi-scale dynamics of mobile telephone calls.
Eur. Phys. J. B 88, 1 (2015)

M. Schwartz, Computer-Communication Network Design and Analysis, vol. 25 (Prentice-hall


Englewood Cliffs, NJ, 1977)
V. Sekara, S. Lehmann, The strength of friendship ties in proximity sensor data. PloS One 9(7),
e100,915 (2014)
V. Sekara, A. Stopczynski, S. Lehmann, Fundamental structures of dynamic social networks. Proc.
Natl. Acad. Sci. 113(36), 9977–9982 (2016)
L. Speidel, T. Takaguchi, N. Masuda, Community detection in directed acyclic graphs. Eur. Phys.
J. B 88(8), 203 (2015)
N. Stanley, S. Shai, D. Taylor, P.J. Mucha, Clustering network layers with the strata multilayer
stochastic block model. IEEE Trans. Netw. Sci. Eng. 3(2), 95–105 (2016)
M. Starnini, A. Baronchelli, R. Pastor-Satorras, Modeling human dynamics of face-to-face interac-
tion networks. Phys. Rev. Lett. 110(16), 168,701 (2013)
M. Starnini, R. Pastor-Satorras, Temporal percolation in activity-driven networks. Phys. Rev. E
89(3), 032,807 (2014)
A. Stopczynski, V. Sekara, P. Sapiezynski, A. Cuttone, J.E. Larsen, S. Lehmann, Measuring large-
scale social networks with high resolution. PLOS One 9(4), e95,978 (2014)
K. Sun, A. Baronchelli, N. Perra, Contrasting effects of strong ties on SIR and SIS processes in
temporal networks. Eur. Phys. J. B 88(12), 326 (2015)
A. Sunny, B. Kotnis, J. Kuri, Dynamics of history-dependent epidemics in temporal networks. Phys.
Rev. E 92(2), 022,811 (2015)
T. Takaguchi, N. Masuda, P. Holme, Bursty communication patterns facilitate spreading in a
threshold-based epidemic dynamics. PloS One 8(7), e68,629 (2013)
C. Tantipathananandh, T. Berger-Wolf, D. Kempe, A framework for community identification in
dynamic social networks, in Proceedings of the 13th ACM SIGKDD International Conference
on Knowledge Discovery and Data Mining (ACM, 2007), pp. 717–726
D. Thieffry, A.M. Huerta, E. Pérez-Rueda, J. Collado-Vides, From specific gene regulation to
genomic networks: a global analysis of transcriptional regulation in escherichia coli. BioEssays
20(5), 433–440 (1998)
J. Ugander, L. Backstrom, C. Marlow, J. Kleinberg, Structural diversity in social contagion. Proc.
Natl. Acad. Sci. 109(16), 5962–5966 (2012)
J. Ugander, B. Karrer, L. Backstrom, C. Marlow, The anatomy of the facebook social graph (2011).
arXiv:1111.4503
T. Valles-Catala, F.A. Massucci, R. Guimera, M. Sales-Pardo, Multilayer stochastic block models
reveal the multilayer structure of complex networks. Phys. Rev. X 6(1), 011,036 (2016)
C.L. Vestergaard, M. Génois, A. Barrat, How memory generates heterogeneous dynamics in tem-
poral networks. Phys. Rev. E 90(4), 042,805 (2014)
D. Watts, S. Strogatz, Collective dynamics of ‘small-world’ networks. Nature 393, 440 (1998)
L. Weng, F. Menczer, Y.Y. Ahn, Virality prediction and community structure in social networks.
Sci. Rep. 3 (2013)
Y.Q. Zhang, X. Li, D. Liang, J. Cui, Characterizing bursts of aggregate pairs with individual pois-
sonian activity and preferential mobility. IEEE Commun. Lett. 19(7), 1225–1228 (2015)
Q. Zhao, Y. Tian, Q. He, N. Oliver, R. Jin, W.C. Lee, Communication motifs: a tool to characterize
social communications, in Proceedings of the 19th ACM International Conference on Information
and Knowledge Management (ACM, 2010), pp. 1645–1648
Chapter 3
Weighted, Bipartite, or Directed Stream
Graphs for the Modeling of Temporal
Networks

Matthieu Latapy, Clémence Magnien, and Tiphaine Viard

Abstract We recently introduced a formalism for the modeling of temporal networks, which we call stream graphs. It emphasizes the streaming nature of data and
allows rigorous definitions of many important concepts generalizing classical graphs.
This includes in particular size, density, clique, neighborhood, degree, clustering
coefficient, and transitivity. In this contribution, we show that, like graphs, stream
graphs may be extended to cope with bipartite structures, with node and link weights,
or with link directions. We review the main bipartite, weighted or directed graph con-
cepts proposed in the literature, we generalize them to the cases of bipartite, weighted,
or directed stream graphs, and we show that the obtained concepts are consistent with
graph and stream graph ones. This provides a formal ground for an accurate modeling
of the many temporal networks that have one or several of these features.

Keywords Temporal networks · Stream graphs

3.1 Introduction

Graph theory is one of the main formalisms behind network science. It provides
concepts and methods for the study of networks, and it is fueled by questions and
challenges raised by them. Its core principle is to model networks as sets of nodes and
links between them. Then, a graph G is defined by a set of nodes V and a set of links
E ⊆ V ⊗ V where each link is an unordered pair of nodes.1 In many cases, though,
this does not capture key features of the modeled network. In particular, links may be
weighted or directed, nodes may be of different kinds, etc. One key strength of graph
theory is that it easily copes with such situations by defining natural extensions of

1 Given any two sets X and Y , we denote by X × Y the cartesian product of X and Y , i.e. the set
of all ordered pairs (x, y) such that x ∈ X and y ∈ Y . We denote by X ⊗ Y the set of all unordered
pairs composed of x ∈ X and y ∈ Y, with x ≠ y, that we denote by xy = yx.

M. Latapy (B) · C. Magnien · T. Viard


Laboratoire d’Informatique de Paris 6 (LIP6), CNRS, 75005 Paris, France
e-mail: Matthieu.Latapy@lip6.fr


Fig. 3.1 An example of stream graph: S = (T, V, W, E) with T = [0, 10] ⊆ R, V =
{a, b, c, d}, W = [0, 10] × {a} ∪ ([0, 4] ∪ [5, 10]) × {b} ∪ [4, 9] × {c} ∪ [1, 3] × {d}, and E =
([1, 3] ∪ [7, 8]) × {ab} ∪ [4.5, 7.5] × {ac} ∪ [6, 9] × {bc} ∪ [2, 3] × {bd}. In other words, Ta =
[0, 10], Tb = [0, 4] ∪ [5, 10], Tc = [4, 9], Td = [1, 3], Tab = [1, 3] ∪ [7, 8], Tac = [4.5, 7.5],
Tbc = [6, 9], Tbd = [2, 3], and Tad = Tcd = ∅

basic graphs, typically weighted, bipartite, or directed graphs. Classical concepts on


graphs are then extended to these more complex cases.
Stream graphs were recently introduced as a formal framework for temporal
networks (Latapy et al. 2018), similar to what graph theory is to networks. A
stream graph S is defined by a time set T , a node set V , a set of temporal nodes
W ⊆ T × V and a set of temporal links E ⊆ T × V ⊗ V . See Fig. 3.1 for an illus-
tration. Each node v ∈ V has a set of presence times Tv = {t, (t, v) ∈ W }. Like-
wise, Tuv = {t, (t, uv) ∈ E} is the set of presence times of link uv. Conversely,
Vt = {v, (t, v) ∈ W} and Et = {uv, (t, uv) ∈ E} are the sets of nodes and links present at time t, leading to the graph at time t: Gt = (Vt, Et). The graph induced by S is G(S) = ({v, Tv ≠ ∅}, {uv, Tuv ≠ ∅}).
Stream graphs encode the same information as Time Varying Graphs (TVG)
(Casteigts et al. 2012), Relational Event Models (REM) (Butts 2008; Stadtfeld and
Block 2017), Multi-Aspect Graphs (MAG) (Wehmuth et al. 2016, 2015), or other
models of temporal networks. Stream graphs emphasize the streaming nature of data,
but all stream graph concepts may easily be translated to these other points of view.
A wide range of graph concepts have been extended to stream graphs (Latapy et al. 2018). The most basic ones are probably the number of nodes $n = \sum_{v \in V} \frac{|T_v|}{|T|}$ and the number of links $m = \sum_{uv \in V \otimes V} \frac{|T_{uv}|}{|T|}$. Then, the neighborhood of node $v$ is $N(v) = \{(t, u),\ (t, uv) \in E\}$ and its degree is $d(v) = \frac{|N(v)|}{|T|}$. The average degree of $S$ is the average degree of all nodes weighted by their presence time: $d(S) = \sum_{v \in V} \frac{|T_v|}{|W|}\, d(v)$.
Going further, the density of $S$ is $\delta(S) = \frac{\sum_{uv \in V \otimes V} |T_{uv}|}{\sum_{uv \in V \otimes V} |T_u \cap T_v|}$. It is the probability, when one chooses at random a time instant and two nodes present at that time, that these two nodes are linked together at that time. Then, a clique is a subset $C$ of $W$ such that for all $(t, u)$ and $(t, v)$ in $C$, $u$ and $v$ are linked together at time $t$ in $S$, i.e. $(t, uv) \in E$. Equivalently, a subset of $W$ is a clique of $S$ if the substream it induces has density 1.
This leads to the definition of the clustering coefficient in stream graphs: like in graphs, $cc(v)$ is the density of the neighborhood of $v$. Equivalently, $cc(v) = \frac{\sum_{uw \in V \otimes V} |T_{vu} \cap T_{vw} \cap T_{uw}|}{\sum_{uw \in V \otimes V} |T_{vu} \cap T_{vw}|}$. Likewise, the transitivity of $S$ is the fraction of all 4-uplets $(t, u, v, w)$ with $(t, uv)$ and $(t, vw)$ in $E$ such that $(t, uw)$ is also in $E$.
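These definitions translate almost literally into code when node and link presence times are stored as unions of intervals. The sketch below recomputes n, m, the degrees, and the density for the stream graph of Fig. 3.1; the interval-list representation and helper names are choices made for this illustration.

```python
def length(intervals):
    return sum(e - s for s, e in intervals)

def intersect(a, b):
    """Intersection of two unions of intervals."""
    return [(max(s1, s2), min(e1, e2))
            for s1, e1 in a for s2, e2 in b if max(s1, s2) < min(e1, e2)]

T = 10.0  # |T| for T = [0, 10]
node_T = {"a": [(0, 10)], "b": [(0, 4), (5, 10)], "c": [(4, 9)], "d": [(1, 3)]}
link_T = {("a", "b"): [(1, 3), (7, 8)], ("a", "c"): [(4.5, 7.5)],
          ("b", "c"): [(6, 9)], ("b", "d"): [(2, 3)]}

n = sum(length(iv) for iv in node_T.values()) / T        # number of nodes
m = sum(length(iv) for iv in link_T.values()) / T        # number of links
degree = {v: sum(length(iv) for (x, y), iv in link_T.items() if v in (x, y)) / T
          for v in node_T}
pairs = [(u, v) for i, u in enumerate(sorted(node_T)) for v in sorted(node_T)[i + 1:]]
density = (sum(length(iv) for iv in link_T.values())
           / sum(length(intersect(node_T[u], node_T[v])) for u, v in pairs))
print(n, m, degree, density)
```

On this example one recovers n = 2.6, m = 1.0, and a density of 10/22 ≈ 0.45.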


Fig. 3.2 The global positioning of this contribution with respect to state-of-the-art. Left: weighted,
bipartite and directed extensions of graph properties are available in the scientific literature. Top: a
generalization of graphs to stream graphs was proposed in Latapy et al. (2018). Dotted rectangle: in
this contribution, we extend weighted, bipartite and directed graph concepts to weighted, bipartite,
and directed stream graphs, as well as standard stream graph concepts to the weighted, bipartite
and directed cases, in a consistent way

These concepts generalize graph concepts in the following sense. A stream S is


called graph-equivalent if it has no dynamics: G t = G(S) for all t. In this case,
each stream property of S is equal to the corresponding graph property of G(S). For
instance, the density of S is equal to the one of G(S). Graphs may therefore be seen
as special cases of stream graphs (the ones with no dynamics).
Like for graphs, the stream graph formalism was designed to be readily extendable
to weighted, directed, or bipartite cases. However, these extensions remain to be done,
and this is the goal of the present contribution, summarized in Fig. 3.2.
Before entering the core of this contribution, notice that the set of available graph concepts is huge, much larger than what may be considered here. We therefore focus on the set of key properties succinctly summarized above. In particular, we
do not consider path-related concepts, which would deserve a dedicated work of their
own. Rather than being exhaustive, our aim is to illustrate how weighted, bipartite,
or directed graph concepts may be generalized to stream graphs in a consistent way,
and to provide a ground for further generalizations.

3.2 Weighted Stream Graphs

A weighted graph is a graph G = (V, E) equipped with a weight function ω gener-


ally defined over E, and sometimes on V too. Then, ω(v) is the weight of node v and
ω(uv) the one of link uv. Link weights may represent tie strength (in a friendship
or collaboration network for instance) Newman (2001b), Serrano et al. (2009), link
capacity (in a road or computer network, for instance) Conte et al. (2016), Serrano
52 M. Latapy et al.

et al. (2009), or a level of similarity (in document or image networks, or gene net-
works for instance) Kalna and Higham (2007), Zhang and Horvath (2005). Node
weights may represent reliability, availability, size, etc. As a consequence, weighted
graphs are very important and they are used to model a wide variety of networks.
In most cases, though, nodes are considered unweighted. We will therefore only
consider weighted links in the following, except where specified otherwise.
Even when one considers a weighted graph G = (V, E) equipped with the weight
function ω, the properties of G itself (without weights) are of crucial interest. In addi-
tion, one may consider thresholded versions of G, defined as G τ = (Vτ , E τ ) where
Vτ = {v ∈ V, ω(v) ≥ τ } and E τ = {uv ∈ E, ω(uv) ≥ τ }, for various thresholds τ .
This actually is a widely used way to deal with graph weights, formalized in a sys-
tematic way as early as 1969 Doreian (1969). However, one often needs to truly take
weights into account, without removing any information. In particular, the impor-
tance of weak links is missed by thresholding approaches. In addition, determining
appropriate thresholds is a challenge in itself (Esfahlani and Sayama 2018; Serrano
et al. 2009; Smith et al. 2015).
As a consequence, several extensions of classical graph concepts have been intro-
duced to incorporate weight information and deal directly with it. The most basic
ones are the maximal, minimal, and average weights, denoted respectively by $\omega_{max}$, $\omega_{min}$ and $\langle\omega\rangle$. The minimal weight $\omega_{min}$ is often implicitly considered as equal to
0, and weights are sometimes normalized in order to ensure that ωmax = 1 (Onnela
et al. 2005; Zhang and Horvath 2005; Grindrod 2002; Ahnert et al. 2007).
In addition to these trivial metrics, one of the most classical properties probably is the weighted version of node degree, known as node strength (Barrat et al. 2004; Newman 2004; Antoniou and Tsompa 2008): $s(v) = \sum_{u \in N(v)} \omega(uv)$. Notice that the average node strength is equal to the product of the average node degree and the average link weight.
Strength is generally used jointly with classical degree, in particular to investigate
correlations between degree and strength: if weights represent a kind of activity
(like travels or communications) then correlations give information on how activity
is distributed over the structure (Barrat et al. 2004; Panzarasa et al. 2009). One
may also combine node degree and strength in order to obtain a measure of node importance. For instance, Opsahl et al. (2010) use a tuning parameter $\alpha$ and compute $d(v) \cdot \left(\frac{s(v)}{d(v)}\right)^{\alpha}$, but more advanced approaches exist (Conte et al. 2016).
Generalizing density, i.e. the number of present links divided by the total number of possible links, raises subtle questions. Indeed, it seems natural to replace the number of present links by the sum of all weights $\sum_{uv \in E} \omega(uv)$, like for strength, but several variants for the total sum of possible weights make sense. For instance, the literature on rich clubs (Alstott et al. 2014; Opsahl et al. Oct 2008) considers that all present links may have the maximal weight, leading to $\sum_{uv \in E} \omega_{max}$, or that all links may be present and have the maximal weight, leading to $\sum_{uv \in V \otimes V} \omega_{max}$. In the special case where weights represent a level of certainty between 0 and 1 for link presence (1 if it is present for sure, 0 if it is absent for sure), then the weighted density may be defined as $\frac{\sum_{uv \in E} \omega(uv)}{|V \otimes V|}$ (Zou 2013).


Fig. 3.3 An example of weighted stream graph. In this example, nodes are unweighted but links
are weighted. Instead of just a straight horizontal line indicating link presence over time, we plot
the weight value (assuming that 0 is indicated by the horizontal line)

Various definitions of weighted clustering coefficients have been proposed, and (Saramäki et al. 2007; Antoniou and Tsompa 2008; Wang et al. 2017) review many of them in detail. The most classical one was proposed in Barrat et al. (2004): $cc(v) = \frac{1}{s(v)(d(v)-1)} \sum_{i,j \in N(v),\, ij \in E} \frac{\omega(vi) + \omega(vj)}{2}$. A general approach was also proposed in Opsahl and Panzarasa (2009). Given a node $v$, it assigns a value to each triplet of distinct nodes $(i, v, j)$ such that $iv$ and $jv$ are in $E$, and to each such triplet such that $ij$ is also in $E$. Then, $cc(v)$ is defined as the ratio between the sum of values of triplets in the second category and the one of triplets in the first category. In Opsahl and Panzarasa (2009), considered values are the arithmetic mean, geometric mean, maximal value or minimal value of the weights of the involved links, depending on the application. One may also consider the product of weights, leading to $cc(v) = \frac{\sum_{i,j \in N(v),\, ij \in E} \omega(vi) \cdot \omega(vj) \cdot \omega(ij)}{\sum_{i \neq j \in N(v)} \omega(vi) \cdot \omega(vj)}$ as proposed in Zhang and Horvath (2005), Kalna and Higham (2007), Ahnert et al. (2007) with normalized weights.
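For comparison, the two formulas above can be evaluated directly on a small weighted graph; the sketch below implements both, with the adjacency stored as a dict of neighbour-to-weight dicts (a representation chosen only for this example). In this implementation the Barrat sum runs over ordered neighbour pairs, which makes it reduce to the ordinary clustering coefficient when all weights equal 1.

```python
def barrat_cc(adj, v):
    """Barrat et al. (2004) weighted clustering of node v; `adj` maps each
    node to a dict of neighbour -> weight. The sum runs over ordered pairs
    of neighbours that are themselves linked."""
    nbrs = list(adj[v])
    d, s = len(nbrs), sum(adj[v].values())
    if d < 2:
        return 0.0
    total = sum((adj[v][i] + adj[v][j]) / 2
                for i in nbrs for j in nbrs if i != j and j in adj[i])
    return total / (s * (d - 1))

def zhang_horvath_cc(adj, v):
    """Zhang & Horvath (2005) clustering of node v (weights assumed in [0, 1])."""
    nbrs = list(adj[v])
    num = sum(adj[v][i] * adj[v][j] * adj[i].get(j, 0.0)
              for a, i in enumerate(nbrs) for j in nbrs[a + 1:])
    den = sum(adj[v][i] * adj[v][j]
              for a, i in enumerate(nbrs) for j in nbrs[a + 1:])
    return num / den if den else 0.0

# toy weighted graph: triangle a-b-c plus a pendant link a-d
adj = {"a": {"b": 1.0, "c": 0.5, "d": 0.8}, "b": {"a": 1.0, "c": 0.5},
       "c": {"a": 0.5, "b": 0.5}, "d": {"a": 0.8}}
print(barrat_cc(adj, "a"), zhang_horvath_cc(adj, "a"))
```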
Transitivity is generalized in a very similar way (Opsahl and Panzarasa 2009) by considering all triplets of distinct nodes $(i, j, k)$ such that $ij$ and $jk$ are in $E$, and each such triplet such that $ik$ is also in $E$, instead of only the ones centered on a specific node $v$. If the associated value is the product of weights, this leads to $tr = \frac{\sum_{(i,j,k)} \omega(ij) \cdot \omega(jk) \cdot \omega(ik)}{\sum_{(i,j,k)} \omega(ij) \cdot \omega(jk)}$. If all weights are equal to 1 (i.e. the graph is unweighted), this is nothing but the transitivity in $G$.
Various other concepts have been generalized to weighted graphs, like for instance
assortativity (Barrat et al. 2004), and specific weighted graph concepts, like closeness
and betweenness centralities (Opsahl et al. 2010), connectability (Amano et al. 2018),
eigenvector centrality (Newman 2004), or rich club coefficient (Alstott et al. 2014;
Opsahl et al. Oct 2008; Zlatic et al. 2009). We do not consider them here as our focus
is on the most basic properties.
We define a weighted stream graph as a stream graph S = (T, V, W, E)
equipped with a weight function ω defined over W and E: if (t, v) ∈ W then ω(t, v)
is the weight of node v at time t, and if (t, uv) ∈ E then ω(t, uv) is the weight of
link uv at time t. See Fig. 3.3 for an illustration.
If a stream graph represents money transfers, then node weights may represent
available credit and link weights may represent transfer amounts; if a stream rep-
resents travels, then node weights may represent available fuel, and link weights
may represent speed; if a stream represents contacts between mobile devices, node
54 M. Latapy et al.

weights may represent battery charge and link weights may represent signal strength
or link capacity; if a stream represents data transfers between computers then link
weights may represent throughput or error rates; like for weighted graphs, countless
situations may benefit from a weighted stream graph modeling.
As we will see in Sect. 3.3, weighted stream graphs also widely appear within bipartite graph studies. In addition, as explained in Latapy et al. (2018, Sect. 19), one often resorts to $\Delta$-analysis for stream graph studies. Given a stream graph $S = (T, V, W, E)$ with $T = [x, y]$ and a parameter $\Delta$, its most simple form consists in transforming $S$ into $S_\Delta = (T_\Delta, V, W_\Delta, E_\Delta)$ such that $T_\Delta = [x + \frac{\Delta}{2}, y - \frac{\Delta}{2}]$, the presence times of node $v$ become $T_\Delta \cap \{t, \exists t' \in [t - \frac{\Delta}{2}, t + \frac{\Delta}{2}], t' \in T_v\}$, and the presence times of link $uv$ become $T_\Delta \cap \{t, \exists t' \in [t - \frac{\Delta}{2}, t + \frac{\Delta}{2}], t' \in T_{uv}\}$. Then, one may capture the amount of information in $S$ leading to node and link presences in $S_\Delta$ with weights: $\omega(t, v) = |\{t' \in [t - \frac{\Delta}{2}, t + \frac{\Delta}{2}], t' \in T_v\}|$ and $\omega(t, uv) = |\{t' \in [t - \frac{\Delta}{2}, t + \frac{\Delta}{2}], t' \in T_{uv}\}|$.
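A rough sketch of this construction, for a stream whose node and link presences are given as lists of instantaneous event times, could look as follows (the unit-spaced evaluation grid and the variable names are assumptions made for the example).

```python
def delta_analysis(events, delta, t_min, t_max, step=1.0):
    """events: dict mapping a node or link identifier to a list of event times.
    For each identifier, returns (t, weight) pairs on a grid covering the reduced
    interval [t_min + delta/2, t_max - delta/2], where the weight counts the
    events falling in the window [t - delta/2, t + delta/2]."""
    half = delta / 2.0
    grid, t = [], t_min + half
    while t <= t_max - half:
        grid.append(t)
        t += step
    return {key: [(t, sum(1 for s in times if t - half <= s <= t + half))
                  for t in grid]
            for key, times in events.items()}

# toy example: instantaneous presences of node a and of link (a, b)
events = {"a": [0.5, 1.0, 1.2, 4.0], ("a", "b"): [1.0, 1.1, 3.5]}
print(delta_analysis(events, delta=2.0, t_min=0.0, t_max=5.0))
```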
Like for weighted graphs, in addition to S itself (without weights), one may
consider the thresholded (unweighted) stream graphs Sτ = (T, V, Wτ , E τ ) where
Wτ = {(t, v) ∈ W, ω(t, v) ≥ τ } and E τ = {(t, uv) ∈ E, ω(t, uv) ≥ τ }, for various
thresholds τ . One may then study how the properties of Sτ evolve with τ .
The graph obtained from S at time t, G t = (Vt , E t ), is naturally weighted by the
function ωt (v) = ω(t, v) and ωt (uv) = ω(t, uv). Likewise, the induced graph G(S)
is weighted by $\omega(v) = \frac{1}{|T|} \int_{t \in T_v} \omega(t, v)\, dt$ and $\omega(uv) = \frac{1}{|T|} \int_{t \in T_{uv}} \omega(t, uv)\, dt$. These
definitions correspond to the average weight over time of each node and link. One
may define similarly the minimal and maximal node and link weights, and go further
by studying S through the weighted graph G(S) and the time-evolution of weighted
graph properties of G t .
These approaches aim to take both the weight and the temporal aspect into account:
from a weighted stream graph, the first one provides a family of (unweighted) stream
graphs, one for each considered value of the threshold; the second one provides a
series of (static) weighted graphs, one for each instant considered. In both cases, the
actual combination of weight and time information is poorly captured. We will there-
fore define concepts that jointly deal with both weights and time. Like with weighted
graphs, we simplify the presentation by assuming that only links are weighted (nodes
are not). 
Since the degree of node $v$ in a stream graph $S$ is $d(v) = \sum_{u \in V} \frac{|T_{uv}|}{|T|}$ and since the strength of node $v$ in a weighted graph $G$ is $s(v) = \sum_{u \in N(v)} \omega(uv)$, we define the strength of node $v$ in a weighted stream graph $S$ as $s(v) = \sum_{u \in V} \int_{t \in T_{uv}} \frac{\omega(t, uv)}{|T|}\, dt$. It is the degree of $v$ where each neighbor is counted with respect to the weight of its links with $v$ at the times when it is linked to $v$. It is related to the strength $s_t(v)$ of $v$ in $G_t$ as follows: $s(v) = \frac{1}{|T|} \int_{t \in T_v} s_t(v)\, dt$; it is the average strength of $v$ over time.
With this definition, one may study correlations between degree and strength in stream graphs, as with graphs, and even directly use their combinations, like $d(v) \cdot \left(\frac{s(v)}{d(v)}\right)^{\alpha}$ where $\alpha$ is a parameter.
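Assuming piecewise-constant link weights stored as (start, end, weight) triples, the integral in the definition of s(v) reduces to a finite sum, as in this minimal sketch (the data layout and names are mine).

```python
def stream_strength(weighted_links, T_length, v):
    """weighted_links: dict mapping an unordered link (u, w) to a list of
    (start, end, weight) triples with piecewise-constant weight.
    Computes s(v) = (1/|T|) * sum, over links involving v, of the integral
    of the link weight over the link's presence times."""
    total = 0.0
    for (a, b), pieces in weighted_links.items():
        if v in (a, b):
            total += sum((end - start) * w for start, end, w in pieces)
    return total / T_length

links = {("a", "b"): [(1, 3, 2.0), (7, 8, 1.0)], ("a", "c"): [(4.5, 7.5, 0.5)]}
print(stream_strength(links, 10.0, "a"))   # (2*2.0 + 1*1.0 + 3*0.5) / 10 = 0.65
```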
Unsurprisingly, generalizing density to weighted stream graphs raises the same difficulties as for weighted graphs. Still, proposed definitions for weighted graphs easily apply to weighted stream graphs. Indeed, the sum of all weights becomes $\sum_{uv \in V \otimes V} \int_{t \in T_{uv}} \omega(t, uv)\, dt$, and the maximal weight of possible links may be defined as $\sum_{uv \in V \otimes V} \int_{t \in T_{uv}} \omega_{max}\, dt = \omega_{max} \cdot |E|$ or $\sum_{uv \in V \otimes V} \int_{t \in T} \omega_{max}\, dt = \omega_{max} \cdot |T \times V \otimes V|$. Like with weighted graphs, if all weights are in $[0, 1]$ then the weighted density may be defined by $\frac{\sum_{uv \in V \otimes V} \int_{t \in T_{uv}} \omega(t, uv)\, dt}{|T \times V \otimes V|}$.
Then, one may define the clustering coefficient of v as the weighted density
(according to one of the definitions above or another one) of the neighborhood of v.
One may also consider the time-evolution of one of the weighted clustering coefficients in $G_t$, according to the previously proposed definitions surveyed above. Interestingly, one may also generalize the classical definition (Barrat et al. 2004) as follows: $cc(v) = \frac{1}{2\, s(v)(d(v)-1)} \int_{t \in T_v} \sum_{i,j \in N_t(v),\, (t,ij) \in E} \omega(t, vi) + \omega(t, vj)\; dt$, which sums the link weights between $v$ and its pairs of neighbors at the times when these neighbors are linked together.
The general approach of Opsahl and Panzarasa (2009) also extends: one has to
assign a value to each quadruplet (t, i, j, k) with i, j, and k distinct such that (t, i j),
(t, jk) are in E, and to each such quadruplet such that (t, ik) also is in E. As in
the weighted graph case, the weighted stream graph clustering coefficient of node
v, cc(v) is then the ratio between the sum of values of quadruplets in the second
category such that j = v and the one of quadruplets in the first category such that
$j = v$ too. If the value of a quadruplet is the product of the weights of the involved links, we obtain $cc(v) = \frac{\int_{t \in T_v} \sum_{i,j \in N_t(v),\, (t,ij) \in E} \omega(t, vi) \cdot \omega(t, vj) \cdot \omega(t, ij)\, dt}{\int_{t \in T_v} \sum_{i \neq j \in N_t(v)} \omega(t, vi) \cdot \omega(t, vj)\, dt}$.
Likewise, we define the weighted stream graph transitivity as the ratio between
the sum of values of all quadruplets in the second category above and the one of
quadruplets in the first category. If the value of quadruplets   is defined as the product
ω(t,i j)·ω(t, jk)·ω(t,ik) dt
of weights of involved links, this leads to tr = t (i, j,k)
. If all
t (i, j,k) ω(t,i j)·ω(t, jk) dt
weights are equal to 1 (i.e. the stream is unweighted), this is nothing but the stream
graph transitivity defined in Latapy et al. (2018).
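As an illustration of the quadruplet-based formulas, here is a minimal sketch (not from the chapter) that evaluates this weighted stream transitivity, assuming time is discretized into instants so that the integrals become sums, and assuming the weight of every link is stored for both node orderings; the data layout is an illustrative choice.

```python
# Sketch only: discrete instants; links stores both orientations of every weighted link.
from collections import defaultdict
from itertools import permutations

def weighted_stream_transitivity(links):
    """links: dict mapping (t, u, v) -> weight, with (t, u, v) and (t, v, u) both present."""
    neigh = defaultdict(set)                       # (t, j) -> neighbours of j at instant t
    for (t, u, v), w in links.items():
        neigh[(t, u)].add(v)
    num = den = 0.0
    for (t, j), nbrs in neigh.items():             # j plays the role of the centre of the quadruplet
        for i, k in permutations(nbrs, 2):         # ordered pairs of distinct neighbours of j at t
            w_ij, w_jk = links[(t, i, j)], links[(t, j, k)]
            den += w_ij * w_jk                              # first category: (t, ij) and (t, jk) in E
            num += w_ij * w_jk * links.get((t, i, k), 0.0)  # second category: (t, ik) in E too
    return num / den if den > 0 else 0.0

raw = {(0, "a", "b"): 1.0, (0, "b", "c"): 2.0, (0, "a", "c"): 0.5,
       (1, "a", "b"): 1.0, (1, "b", "c"): 2.0}
links = {**raw, **{(t, v, u): w for (t, u, v), w in raw.items()}}
print(weighted_stream_transitivity(links))
```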
If S is a graph-equivalent stream weighted by a constant function over time, i.e.
$\omega(t, v) = \omega(t', v)$ and $\omega(t, uv) = \omega(t', uv)$ for all $t$ and $t'$, then it is equivalent to
the weighted graph G(S) weighted by ω(v) = ω(t, v) and ω(uv) = ω(t, uv) for any
t; we call it a weighted graph-equivalent weighted stream. The strength of v in S is
equal to its strength in G(S) if S is a weighted graph-equivalent weighted stream.
The same is true for the different notions of density or clustering coefficient: the
density of S is equal to the density of G(S) and the clustering coefficient of a vertex
v or the transitivity are equal to the clustering coefficient or transitivity in G(S).

3.3 Bipartite Stream Graphs

A bipartite graph $G = (\top\cup\bot, E)$ is defined by a set of top nodes $\top$, a set of bottom nodes $\bot$ with $\top\cap\bot = \emptyset$, and a set of links $E \subseteq \top\otimes\bot$: there are two different kinds of nodes and links may exist only between nodes of different kinds. Like weighted graphs, but maybe less known, bipartite graphs are pervasive and model many real-world data (Guillaume and Latapy 2004; Latapy et al. 2008). Typical examples include relations between clients and products (Bernardes et al. 2014), between company boards and their members (Robins and Alexander 2004; Battiston and Catanzaro 2004), and between items and their key features like movie-actor networks (Watts and Strogatz 1998; Newman et al. 2001) or publication-author networks (Newman 2001a, 2001b), to cite only a few.
Bipartite graphs are often studied through their top or bottom projections (Breiger 1974) $G_\top = (\top, E_\top)$ and $G_\bot = (\bot, E_\bot)$, defined by $E_\top = \cup_{v\in\bot} N(v)\otimes N(v)$ and $E_\bot = \cup_{v\in\top} N(v)\otimes N(v)$. In other words, in $G_\top$ two (top) nodes are linked together if they have (at least) a (bottom) neighbor in common in $G$, and $G_\bot$ is defined symmetrically. Notice that, if $v\in\top$ (resp. $v\in\bot$), then $N(v)$ always is a (not necessarily maximal) clique in $G_\bot$ (resp. $G_\top$).
Projections induce important information losses: the existence of a link or a clique in the projection may come from very different causes in the original bipartite graph. To improve this situation, one often considers weighted projections: each link $uv$ in the projection is weighted by the number $\omega(uv) = |N(u)\cap N(v)|$ of neighbors $u$ and $v$ have in common in the original bipartite graph. One may then use weighted graph tools to study the weighted projections (Guillaume et al. 2005; Barrat et al. 2004; Newman 2004), but information losses remain important. In addition, projections are often much larger than the original bipartite graphs, which raises serious computational issues (Latapy et al. 2008).
As a consequence, many classical graph concepts have been extended to deal
directly with the bipartite case, see Latapy et al. (2008), Borgatti and Everett
(1997), Faust (1997), Breiger (1974), Bonacich (1972). The most basic properties are
$n_\top = |\top|$ and $n_\bot = |\bot|$, the number of top and bottom nodes. The definition of the number of links $m = |E|$ is the same as in classical graphs. The definitions of node neighborhoods and degrees are also unchanged. The average top and bottom degrees $d_\top$ and $d_\bot$ of $G$ are the average degrees of top and bottom nodes, respectively.
With these notations, the bipartite density of $G$ is naturally defined as $\delta(G) = \frac{m}{n_\top\cdot n_\bot}$: it is the probability, when one takes two nodes that may be linked together, that they indeed are. Then, a bipartite clique in $G$ is a set $C_\top\cup C_\bot$ with $C_\top\subseteq\top$ and $C_\bot\subseteq\bot$ such that $C_\top\times C_\bot\subseteq E$. In other words, all possible links between nodes in a bipartite clique are present in the bipartite graph.
Several bipartite generalizations of the clustering coefficient have been proposed (Latapy et al. 2008; Lind et al. 2005; Zhang et al. 2008; Opsahl 2013; Lioma et al. 2016). In particular:
• Latapy et al. (2008) and Lind et al. (2005) rely on the Jaccard coefficient defined over node pairs both in $\top$ or both in $\bot$: $cc(uv) = \frac{|N(u)\cap N(v)|}{|N(u)\cup N(v)|}$, or close variants. One then obtains the clustering coefficient of node $v$ by averaging its Jaccard coefficient with all neighbors of its neighbors: $cc(v) = \frac{\sum_{u\in N(N(v)),\,u\neq v} cc(uv)}{|N(N(v))|}$.
• for each node $v$, Latapy et al. (2008) and Lioma et al. (2016) consider all triplets $(u, v, w)$ of distinct nodes with $u$ and $w$ in $N(v)$ and define the redundancy $rc(v)$ as the fraction of such triplets such that there exists another node in $N(u)\cap N(w)$.


Fig. 3.4 Left: a bipartite stream graph $S = (T, \top\cup\bot, W, E)$ with $T = [0, 10]$, $\top = \{u, v\}$, $\bot = \{a, b, c\}$, $W = T\times(\top\cup\bot)$, and $E = ([0, 2]\cup[3, 9])\times\{ua\} \cup ([4, 5]\cup[8, 10])\times\{ub\} \cup [1, 5]\times\{uc\} \cup [2, 7]\times\{vb\} \cup [0, 8]\times\{vc\}$. Right: its $\bot$-projection $S_\bot$. For instance, $a$ and $c$ are linked together from time 3 to 5 because they both have $u$ in their neighborhood for this time period in $S$.

• similarly, Opsahl (2013) proposes to consider all quintuplets $(a, b, v, c, d)$ of distinct nodes with $b, c \in N(v)$, $a\in N(b)$ and $d\in N(c)$ and defines $cc^*(v)$ as the fraction of them such that there exists another node in $N(a)\cap N(d)$.
Transitivity is usually extended to bipartite graphs (Robins and Alexander 2004; Latapy et al. 2008; Opsahl 2013) by considering the set $N$ of quadruplets of nodes $(a, b, c, d)$ such that $ab$, $bc$ and $cd$ are in $E$, and the set of such quadruplets with in addition $ad$ in $E$. The transitivity $tr(G)$ is then the ratio between the sizes of the second and the first sets. Like above, Opsahl (2013) also proposes to consider the fraction of all quintuplets $(a, b, c, d, e)$ of distinct nodes with $ab$, $bc$, $cd$, $de$ in $E$ such that there exists another node $f$ with $af$ and $ef$ in $E$.
We define a bipartite stream graph $S = (T, \top\cup\bot, W, E)$ from a time interval $T$, a set of top nodes $\top$, a set of bottom nodes $\bot$ with $\top\cap\bot = \emptyset$, and two sets $W\subseteq T\times(\top\cup\bot)$ and $E\subseteq T\times\top\otimes\bot$ such that $(t, uv)\in E$ implies $(t, u)\in W$ and $(t, v)\in W$. See Fig. 3.4 (left) for an illustration. Each instantaneous graph $G_t$, as well as the induced graph $G(S)$, are bipartite graphs with the same top and bottom nodes.
Bipartite stream graphs naturally model many situations, like for instance presence
of people in rooms or other kinds of locations, purchases of products by clients, access
to on-line services, bus presence at stations (Curzel et al. 2019), traffic between a set
of computers and the rest of the internet (Viard et al. 2018), or contribution of people
to projects, such as software.
The classical definition of projections is easily extended, leading to $S_\top = (T, \top, W_\top, E_\top)$ and $S_\bot = (T, \bot, W_\bot, E_\bot)$, where $W_\top = W\cap(T\times\top)$, $W_\bot = W\cap(T\times\bot)$, $E_\top = \cup_{(t,v)\in W_\bot}\{(t, uw)\ \mathrm{s.t.}\ (t, uv)\in E\ \mathrm{and}\ (t, wv)\in E\}$ and $E_\bot = \cup_{(t,v)\in W_\top}\{(t, uw)\ \mathrm{s.t.}\ (t, uv)\in E\ \mathrm{and}\ (t, vw)\in E\}$. In other words, in $S_\top$ two (top) nodes are linked together at a given time instant if they have (at least) a (bottom) neighbor in common in $S$ at this time, and $S_\bot$ is defined symmetrically. See Fig. 3.4 for an illustration. Notice that, if $v\in\top$ (resp. $v\in\bot$) then $N(v)$ always is a (not necessarily maximal) clique in $S_\bot$ (resp. $S_\top$).
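A minimal sketch (not from the chapter) of the $\bot$-projection follows, assuming time has been discretized into instants; the function name and data layout are illustrative, and the $\top$-projection is obtained by the symmetric loop.

```python
# Sketch only: bottom projection of a bipartite stream with discrete time instants.
from collections import defaultdict
from itertools import combinations

def bottom_projection(E):
    """E: set of (t, u, v) triples with u a top node and v a bottom node, meaning the
    link uv is present at instant t. Returns the links (t, v, w) of the bottom projection:
    v and w are linked at instant t whenever they share a top neighbour at that instant."""
    nbrs = defaultdict(set)             # (t, u) for a top node u -> its bottom neighbours at t
    for t, u, v in E:
        nbrs[(t, u)].add(v)
    E_bot = set()
    for (t, u), bottoms in nbrs.items():
        for v, w in combinations(sorted(bottoms), 2):
            E_bot.add((t, v, w))
    return E_bot

# toy example in the spirit of Fig. 3.4: u is linked to both a and c at instants 3 and 4,
# so a and c are linked in the bottom projection at those instants
E = {(3, "u", "a"), (4, "u", "a"), (3, "u", "c"), (4, "u", "c"), (5, "v", "b")}
print(sorted(bottom_projection(E)))     # [(3, 'a', 'c'), (4, 'a', 'c')]
```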
One may also generalize weighted projections by considering the number ω(t, uv)
= |Nt (u) ∩ Nt (v)| of neighbors u and v have in common at time t in the original
bipartite stream graph. One then obtains weighted stream graphs, and may use the
definitions proposed in Sect. 3.2 to study them. Still, this induces much information
loss, which calls for the generalization of bipartite properties themselves.
The most immediate definitions are the numbers of top and bottom nodes: $n_\top = \sum_{v\in\top}\frac{|T_v|}{|T|} = \frac{|W\cap(T\times\top)|}{|T|}$ and $n_\bot = \sum_{v\in\bot}\frac{|T_v|}{|T|} = \frac{|W\cap(T\times\bot)|}{|T|}$, respectively. We then have $n = n_\top + n_\bot$. Like with graphs, the number of links, neighborhood of nodes, and their degree do not call for specific bipartite definitions. We however define the average top and bottom degrees $d_\top$ and $d_\bot$ of $S$ as the average degrees of top and bottom nodes, respectively, weighted by their presence time in the stream: $d_\top(S) = \sum_{v\in\top}\frac{|T_v|}{|W_\top|}\, d(v)$ and $d_\bot(S) = \sum_{v\in\bot}\frac{|T_v|}{|W_\bot|}\, d(v)$.
We define the bipartite density of $S$ as $\delta(S) = \frac{m}{\sum_{u\in\top,\,v\in\bot}|T_u\cap T_v|}$: it is the probability, when one takes two nodes at a time when they may be linked together, that they indeed are. A subset $C = C_\top\cup C_\bot$ of $W$, with $C_\top\subseteq T\times\top$ and $C_\bot\subseteq T\times\bot$, is a clique in $S$ if all possible links exist between nodes when they are involved in $C$: for all $t$, if $(t, u)$ is in $C_\top$ and $(t, v)$ is in $C_\bot$ then $(t, uv)$ is in $E$.
Like with graphs, defining a bipartite stream graph clustering coefficient is difficult
and leaves us with several reasonable choices:
• Extending the Jaccard coefficient to node pairs in a stream graph leads to an instantaneous definition: $cc_t(uv) = \frac{|N_t(u)\cap N_t(v)|}{|N_t(u)\cup N_t(v)|}$, which is nothing but $cc(uv)$ in $G_t$. It also leads to a global definition: $cc(uv) = \frac{|N(u)\cap N(v)|}{|N(u)\cup N(v)|} = \frac{\sum_{w\in\top\cup\bot}|T_{uw}\cap T_{vw}|}{\sum_{w\in\top\cup\bot}|T_{uw}\cup T_{vw}|}$. We may then define the bipartite clustering coefficient of node $v$ by averaging its Jaccard coefficient with all neighbors of its neighbors, weighted by their co-presence time: $\frac{1}{|N(N(v))|}\sum_{u\in\top\cup\bot}\frac{|T_u\cap T_v|}{|T|}\, cc(uv)$. Notice that $cc(uv) = 0$ if the neighborhoods of $u$ and $v$ do not intersect. This means that the sum actually is over nodes that are at some time neighbor of a neighbor of $v$, which is consistent with the bipartite graph definition.
• Redundancy is easier to generalize to stream graphs: for each node $v$, we consider all quadruplets $(t, u, v, w)$ composed of a time instant $t$, a neighbor $u$ of $v$ at time $t$, $v$ itself, and another neighbor $w$ of $v$ at time $t$, and we define $rc(v)$ as the fraction of such quadruplets such that there exists another node linked to $u$ and $w$ at time $t$ (see the sketch after this list). In other words, $rc(v) = \frac{|\{(t,u,v,w),\ u\neq w,\ u,w\in N_t(v),\ \exists x\in N_t(u)\cap N_t(w),\ x\neq v\}|}{|\{(t,u,v,w),\ u\neq w,\ u,w\in N_t(v)\}|}$.
• Similarly, we propose a stream graph generalization of $cc^*(v)$ as the fraction of sextuplets $(t, a, b, v, c, d)$ with $a$, $b$, $v$, $c$, and $d$ all different from each other, $b\in N_t(v)$, $c\in N_t(v)$, $a\in N_t(b)$ and $d\in N_t(c)$ for which in addition there exists another node in $N_t(a)\cap N_t(d)$.
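The following minimal sketch (not from the chapter) computes the stream redundancy $rc(v)$ of the second item above, again assuming discrete time instants; the data layout and names are illustrative.

```python
# Sketch only: redundancy rc(v) in a bipartite stream with discrete instants.
from collections import defaultdict
from itertools import permutations

def redundancy(E, v):
    """E: set of (t, u, w) triples meaning the link uw is present at instant t."""
    nbrs = defaultdict(set)                          # (t, node) -> neighbours of node at instant t
    for t, u, w in E:
        nbrs[(t, u)].add(w)
        nbrs[(t, w)].add(u)
    closed = total = 0
    for t in {t for t, _, _ in E}:
        for u, w in permutations(nbrs[(t, v)], 2):   # ordered pairs of distinct neighbours of v at t
            total += 1
            if (nbrs[(t, u)] & nbrs[(t, w)]) - {v}:  # another node linked to both u and w at t
                closed += 1
    return closed / total if total else 0.0

E = {(0, "u", "a"), (0, "u", "b"), (0, "v", "a"), (0, "v", "b"), (1, "u", "a"), (1, "u", "b")}
print(redundancy(E, "u"))  # 0.5: at t = 0 node v also links a and b, at t = 1 no other node does
```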
Finally, if we denote by $N$ the set of quintuplets $(t, a, b, c, d)$ such that $(t, ab)$, $(t, bc)$ and $(t, cd)$ are in $E$, and consider the set of such quintuplets with in addition $(t, ad)$ in $E$, then the bipartite transitivity for stream graphs is, as before, the ratio between the sizes of the second and the first sets. Like for bipartite graphs, one may also consider the fraction of all sextuplets $(t, a, b, c, d, e)$ where $a$, $b$, $c$, $d$, and $e$ are distinct nodes with $(t, ab)$, $(t, bc)$, $(t, cd)$, $(t, de)$ in $E$ such that there exists another node $f$ with $(t, af)$ and $(t, ef)$ in $E$.
If $S$ is a graph-equivalent bipartite stream, then its projections $S_\top$ and $S_\bot$ are also graph-equivalent streams, and their corresponding graphs are the projections of the graph corresponding to $S$: $G(S_\bot) = G(S)_\bot$ and $G(S_\top) = G(S)_\top$. In addition, the bipartite properties of $S$ are equivalent to the bipartite properties of its corresponding bipartite graph: the density, Jaccard coefficient, redundancy, and $cc^*$, as well as transitivity values, are all equal to their graph counterpart in $G(S)$ if $S$ is a graph-equivalent bipartite stream.

3.4 Directed Stream Graphs

A directed graph $G = (V, E)$ is defined by its set $V$ of nodes and its set $E\subseteq V\times V$ of links: while links of undirected graphs are unordered pairs of distinct nodes, links in directed graphs are ordered pairs of nodes, not necessarily distinct: $(u, v)\neq(v, u)$, and $(v, v)$ is allowed and called a loop. Then, $(u, v)$ is a link from $u$ to $v$, and if both $(u, v)$ and $(v, u)$ are in $E$ then the link is said to be symmetric.
Directed graphs naturally model the many situations where link asymmetry is
important, like for instance dependencies between companies or species, citations
between papers or web pages (Meusel et al. 2015), friendship relations in many
on-line social networks (Mislove et al. 2007), or hierarchical relations of various
kinds.
Directed graphs are often studied as undirected graphs by ignoring link directions.
However, this is not satisfactory in many cases: having a link to a node is very different
from having a link from a node and, for instance, having links to many nodes is very
different from having links from many nodes. As a consequence, many directed
extensions of standard graph properties have been proposed to take direction into
account, see for instance Wasserman and Faust (1994, Sect. 4.3) and Fagiolo (2007),
Hakimi (1965).
First, a node $v$ in a directed graph has two neighborhoods: its out-going neighborhood $N^+(v) = \{u, (v, u)\in E\}$ and its in-coming neighborhood $N^-(v) = \{u, (u, v)\in E\}$. This leads to its out- and in-degrees $d^+(v) = |N^+(v)|$ and $d^-(v) = |N^-(v)|$. These definitions make it possible to study the role of nodes with high in- and out-degrees, as well as correlations between these metrics (Mislove et al. 2007; Guillaume et al. 2004). Notice that $\sum_{v\in V} d^+(v) = \sum_{v\in V} d^-(v) = m$ is the total number of links, and so the average in- and out-degrees are equal to $\frac{m}{n}$.
The directed density of $G$ is $\frac{m}{n^2}$ since in a directed graph (with loops) there are $n^2$ possible links. Then, a directed clique is nothing but a set of nodes all pairwise linked together with symmetrical links, which, except for loops, is equivalent to an undirected clique. One may also consider the fraction of loops present in the graph $\frac{|\{(v,v)\in E\}|}{n}$, as well as the fraction of symmetric links $\frac{|\{(u,v)\in E\ \mathrm{s.t.}\ (v,u)\in E\}|}{m}$.
With this definition of density, one may define the in- and out-clustering coeffi-
cient of node v as the density of its in-coming and out-going neighborhood. However,
this misses the diversity of ways v may be linked to its neighbors and these neigh-
bors may be linked together (Fagiolo 2007; Wasserman and Faust 1994). Table 1 in
Fagiolo (2007) summarizes all possibilities and corresponding extensions of clus-
tering coefficient. We focus here on two of these cases, that received more attention

Fig. 3.5 An example of directed stream graph $S = (T, V, W, E)$ with $T = [0, 10]\subseteq\mathbb{R}$, $V = \{a, b, c, d\}$, $W = [0, 10]\times\{a\}\cup([0, 4]\cup[5, 10])\times\{b\}\cup[4, 9]\times\{c\}\cup[1, 3]\times\{d\}$, and $E = [1, 3]\times\{(a, b)\}\cup[2.5, 3.5]\times\{(b, a)\}\cup[4.5, 7.5]\times\{(c, a)\}\cup[6, 9]\times\{(b, c)\}\cup[2, 3]\times\{(d, b)\}$. Notice that $T_{a,b} = [1, 3]\neq T_{b,a} = [2.5, 3.5]$, and that links between $a$ and $b$ are symmetrical from time 2.5 to time 3.

because they capture the presence of small cycles and the transitivity of relations.
Given a node v, the cyclic clustering coefficient is the fraction of its pairs of distinct
neighbors u and w with (u, v) and (v, w) in E, such that (w, u) also is in E, i.e. u, v
and w form a cycle. The transitive coefficient of v consists in the fraction of its pairs
of distinct neighbors u and w with (u, v) and (v, w) in E, such that (u, w) also is in
E, i.e. the relation is transitive.
Finally, these extensions of the node clustering coefficient directly translate to extensions of the graph transitivity ratio. In the two cases explained above, this leads to the fraction of all triplets of distinct nodes $(u, v, w)$ with $(u, v)$ and $(v, w)$ in $E$, such that in addition $(w, u)$ is in $E$, or $(u, w)$ is in $E$, respectively.
We define a directed stream graph S = (T, V, W, E) from a time interval T ,
a set of nodes V , a set of temporal nodes W ⊆ T × V , and a set of temporal links
E ⊆ T × V × V : (t, u, v) in E means that there is a link from u to v at time t,
which is different from (t, v, u). We therefore make a distinction between Tu,v the
set {t, (t, u, v) ∈ E} and Tv,u the set {t, (t, v, u) ∈ E}. A loop is a triplet (t, v, v)
in E. If both (t, u, v) and (t, v, u) are in E, then we say that this temporal link is
symmetric. See Fig. 3.5 for an illustration.
Directed stream graphs model the many situations where directed links occur over time and their asymmetry is important, like for instance money transfers, network traffic, phone calls, flights, moves from one place to another, and many others. In all these cases, both time information and link direction are crucial, as well as their interplay. For instance, a large number of computers sending packets to a given computer in a very short period of time is a typical signature of a denial of service attack (Mazel et al. 2015). Instead, a computer sending packets to a large number of other computers in a short period of time is typical of a streaming server.
The directed stream graph S = (T, V, W, E) may be studied through the stan-
dard stream graph (T, V, W, {(t, uv), (t, u, v) ∈ E or (t, v, u) ∈ E}) obtained by
considering each directed link as undirected. Likewise, S may be studied through
its induced directed graph G(S) = ({v, ∃(t, v) ∈ W }, {(u, v), ∃(t, u, v) ∈ E}) and/or
the sequence of its instantaneous directed graphs G t = ({v, (t, v) ∈ W },
{(u, v), (t, u, v) ∈ E}). However, these approaches induce much information loss,

and make it impossible to make subtle distinctions like the one described above for
network traffic. This calls for generalizations of available concepts to this richer case.
We define the out-going neighborhood of node $v$ as $N^+(v) = \{(t, u), (t, v, u)\in E\}$ and its in-coming neighborhood as $N^-(v) = \{(t, u), (t, u, v)\in E\}$. Its out- and in-degrees are $d^+(v) = \frac{|N^+(v)|}{|T|} = \sum_{u\in V}\frac{|T_{v,u}|}{|T|}$ and $d^-(v) = \frac{|N^-(v)|}{|T|} = \sum_{u\in V}\frac{|T_{u,v}|}{|T|}$. Like with directed graphs, the total number of links $m$ is equal to $\sum_{v\in V} d^+(v)$ as well as to $\sum_{v\in V} d^-(v)$.
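For illustration, a minimal sketch (not from the chapter) of these in- and out-degrees follows, assuming discrete unit time instants so that $|T_{u,v}|$ is simply a number of instants; names and data layout are illustrative.

```python
# Sketch only: in- and out-degrees of a directed stream graph with discrete instants.
from collections import defaultdict

def in_out_degrees(E, T_len):
    """E: set of (t, u, v) triples meaning a link from u to v at instant t; T_len = |T|."""
    d_out, d_in = defaultdict(float), defaultdict(float)
    for t, u, v in E:
        d_out[u] += 1.0 / T_len   # each instant contributes 1/|T| to the out-degree of u
        d_in[v] += 1.0 / T_len    # and 1/|T| to the in-degree of v
    return dict(d_out), dict(d_in)

# roughly the links of Fig. 3.5, discretized to integer instants
E = {(t, "a", "b") for t in (1, 2, 3)} | {(3, "b", "a")} | {(t, "c", "a") for t in (5, 6, 7)}
print(in_out_degrees(E, T_len=10))
```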
We extend the standard stream graph density $\frac{m}{\sum_{uv\in V\otimes V}|T_u\cap T_v|}$ into the directed stream graph density $\frac{m}{\sum_{(u,v)\in V\times V}|T_u\cap T_v|}$: it is the fraction of possible links that indeed exist.
Then, a clique in a directed stream graph is a subset C of W such that for all (t, u)
and (t, v) in C, both (t, u, v) and (t, v, u) are in E. These definitions are immediate
extensions of standard stream graph concepts.
One may then define the in- and out-clustering coefficient of a node $v$ as the density of its in- and out-neighborhoods in the directed stream graph. However, like for directed graphs, there are many possible kinds of interactions between neighbors of a node, which lead to various generalizations of the clustering coefficient to directed stream graphs. We illustrate this on the two definitions detailed above for a given node $v$. First, let us consider the set of quadruplets $(t, u, v, w)$ with $u$, $v$ and $w$ distinct, such that $(t, u, v)$ and $(t, v, w)$ are in $E$. Then one may measure the cyclic clustering coefficient as the fraction of these quadruplets such that in addition $(t, w, u)$ is in $E$, and the transitive clustering coefficient as the fraction of these quadruplets such that in addition $(t, u, w)$ is in $E$.
Like with directed graphs, this leads to directed stream graph extensions of transitivity. In the two cases above, we define it as the fraction of quadruplets $(t, u, v, w)$ with $u$, $v$ and $w$ distinct and with $(t, u, v)$ and $(t, v, w)$ in $E$, such that in addition $(t, w, u)$ is in $E$, or $(t, u, w)$ is in $E$, respectively.
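A minimal sketch (not from the chapter) of these two transitivity variants follows, assuming discrete time instants; the data layout and names are illustrative.

```python
# Sketch only: cyclic and transitive stream transitivity with discrete instants.
from collections import defaultdict

def directed_stream_transitivity(E):
    """E: set of (t, u, v) triples, a link from u to v at instant t."""
    out = defaultdict(set)                       # (t, u) -> successors of u at instant t
    for t, u, v in E:
        out[(t, u)].add(v)
    total = cyclic = transitive = 0
    for (t, u), succs in out.items():
        for v in succs:
            for w in out.get((t, v), ()):        # quadruplets (t, u, v, w) with (t,u,v), (t,v,w) in E
                if w == u or w == v or u == v:
                    continue
                total += 1
                cyclic += (t, w, u) in E         # closes a cycle
                transitive += (t, u, w) in E     # closes a transitive relation
    if total == 0:
        return 0.0, 0.0
    return cyclic / total, transitive / total

E = {(0, "a", "b"), (0, "b", "c"), (0, "c", "a"), (1, "a", "b"), (1, "b", "c"), (1, "a", "c")}
print(directed_stream_transitivity(E))           # (0.75, 0.25) on this toy stream
```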
If S is a directed graph-equivalent directed stream graph, i.e. G t = G(S) for all t,
then all the properties of S defined above are equal to their directed graph counterpart
in G(S).

3.5 Conclusion

Previous works extended many graph concepts to deal with weighted, bipartite or
directed graphs. In addition, graphs were generalized recently to stream graphs in
order to model temporal networks in a way consistent with graph theory. In this contri-
bution, we show that weighted, bipartite or directed graphs concepts may themselves
be generalized to weighted, bipartite or directed stream graphs, in a way consistent
with both their graph counterparts and the stream graph formalism. This opens the
way to a much richer modeling of temporal networks, and more precise case studies,
in a unified framework.
Such case studies may benefit from improved modeling with either weights,
different sorts of nodes, or directed links, but may also combine these extensions
together. For instance, money transfers between clients and sellers are best modeled by weighted bipartite directed stream graphs. The concepts we discussed then have to be extended even further, like what has already been done for graphs, for instance with the directed strength $s^+(v) = \sum_{u\in N^+(v)}\omega(u, v)$ (Opsahl et al. 2010), the directed weighted clustering coefficient (Clemente and Grassi 2018; Fagiolo 2007), or for the study of weighted bipartite graphs (Guillaume et al. 2004).
One may also consider other kinds of graph extensions, like multigraphs, labelled
graphs, hypergraphs, or multi-layer graphs for instance, which model important
features of real-world data and already received much attention. We focused on
weighted, bipartite and directed cases because they seem to be the most frequent in
applications.
Likewise, we selected only a few key weighted, bipartite or directed properties in
order to extend them to stream graphs. Many others remain to be generalized, in particular
path-related concepts like reachability, closeness, or betweenness, to cite only a few
(Opsahl et al. 2010).

Acknowledgements This work is funded in part by the European Commission H2020 FET-
PROACT 2016–2017 program under grant 732942 (ODYCCEUS), by the ANR (French National
Agency of Research) under grants ANR-15-CE38-0001 (AlgoDiv), by the Ile-de-France Region
and its program FUI21 under grant 16010629 (iTRAC).

References

S.E. Ahnert, D. Garlaschelli, T.M.A. Fink, G. Caldarelli, Ensemble approach to the analysis of
weighted networks. Phys. Rev. E 76, 016101 (2007)
J. Alstott, P. Panzarasa, M. Rubinov, E.T. Bullmore, P.E. Vértes, A unifying framework for measuring
weighted rich clubs. Sci. Rep. 4, 7258 (2014)
S.-i. Amano, K.-i. Ogawa, Y. Miyake, Node property of weighted networks considering connectability to nodes within two degrees of separation. Sci. Rep. 8464 (2018)
I.E. Antoniou, E.T. Tsompa, Statistical analysis of weighted networks. Discret. Dyn. Nat. Soc.
(2008)
A. Barrat, M. Barthélemy, R. Pastor-Satorras, A. Vespignani, The architecture of complex weighted
networks. Proc. Natl. Acad. Sci. 101(11), 3747–3752 (2004)
S. Battiston, M. Catanzaro, Statistical properties of corporate board and director networks. Eur.
Phys. J. B 38, 345–352 (2004)
D. Bernardes, M. Diaby, R. Fournier, F. Fogelman-Soulié, E. Viennet, A social formalism and
survey for recommender systems. SIGKDD Explorations 16(2), 20–37 (2014)
P. Bonacich, Technique for analyzing overlapping memberships. Sociol. Methodol. 4, 176–185
(1972)
S.P. Borgatti, M.G. Everett, Network analysis of 2-mode data. Soc. Netw. 19(3), 243–269 (1997)
R.L. Breiger, The duality of persons and groups. Soc. Forces 53(2), 181–190 (1974)
C.T. Butts, A relational event framework for social action. Sociol. Methodol. 38(1), 155–200 (2008)
A. Casteigts, P. Flocchini, W. Quattrociocchi, N. Santoro, Time-varying graphs and dynamic net-
works. IJPEDS 27(5), 387–408 (2012)
G.P. Clemente, R. Grassi, Directed clustering in weighted networks: a new perspective. Chaos,
Solitons & Fractals 107, 26–38 (2018)

A. Conte, L. Candeloro, L. Savini, A new weighted degree centrality measure: the application in
an animal disease epidemic. PLoS ONE 11 (2016)
J.L. Curzel, R. Lüders, K.V.O. Fonseca, M. de Oliveira Rosa, Temporal performance analysis of
bus transportation using link streams. Math. Prob. Eng. (2019)
P. Doreian, A note on the detection of cliques in valued graphs. Sociometry 32, 237–242 (1969)
F.Z. Esfahlani, H. Sayama, A percolation-based thresholding method with applications in functional
connectivity analysis, in Complex Networks IX, ed. by S. Cornelius, K. Coronges, B. Gonçalves,
R. Sinatra, A. Vespignani (2018), pp. 221–231
G. Fagiolo, Clustering in complex directed networks. Phys. Rev. E 76, 026107 (2007)
K. Faust, Centrality in affiliation networks. Soc. Netw. 19, 157–191 (1997)
P. Grindrod, Range-dependent random graphs and their application to modeling large small-world
proteome datasets. Phys. Rev. E 66, 066702 (2002)
J-L. Guillaume, M. Latapy, Bipartite structure of all complex networks. Inf. Process. Lett. (IPL)
90(5), 215–221 (2004)
J-L. Guillaume, S. Le Blond, M. Latapy, Clustering in P2P exchanges and consequences on perfor-
mances, in Lecture Notes in Computer Sciences (LNCS), Proceedings of the 4-th International
Workshop on Peer-to-Peer Systems (IPTPS) (2005)
J-L. Guillaume, S. Le Blond, M. Latapy, Statistical analysis of a P2P query graph based on degrees
and their time-evolution, in Lecture Notes in Computer Sciences (LNCS), Proceedings of the 6-th
International Workshop on Distributed Computing (IWDC) (2004)
S.L. Hakimi, On the degrees of the vertices of a directed graph. J. Frankl. Inst. 279(4), 290–308
(1965)
G. Kalna, D.J. Higham, A clustering coefficient for weighted networks, with application to gene
expression data. AI Commun. 20, 263–271 (2007)
M. Latapy, C. Magnien, N. Del Vecchio, Basic notions for the analysis of large two-mode networks.
Soc. Netw. 31–48 (2008)
M. Latapy, T. Viard, C. Magnien, Stream graphs and link streams for the modeling of interactions
over time. Social Netw. Analys. Mining 8(1), 1–61 (2018)
P.G. Lind, M.C. González, H.J. Herrmann, Cycles and clustering in bipartite networks. Phys. Rev.
E 72, 056127 (2005)
C. Lioma, F. Tarissan, J.G. Simonsen, C. Petersen, B. Larsen, Exploiting the bipartite structure of
entity grids for document coherence and retrieval, in Proceedings of the 2016 ACM International
Conference on the Theory of Information Retrieval, ICTIR ’16 (ACM 2016), pp. 11–20
J. Mazel, P. Casas, R. Fontugne, K. Fukuda, P. Owezarski, Hunting attacks in the dark: clustering
and correlation analysis for unsupervised anomaly detection. Int. J. Netw. Manag. 25(5), 283–305
(2015)
R. Meusel, S. Vigna, O. Lehmberg, C. Bizer, The graph structure in the web - analyzed on different
aggregation levels. J. Web Sci. 1, 33–47 (2015)
A. Mislove, M. Marcon, K.P. Gummadi, P. Druschel, B. Bhattacharjee, Measurement and analysis
of online social networks, in Proceedings of the 7th ACM SIGCOMM Conference on Internet
Measurement (2007), pp. 29–42
M.E.J. Newman, Scientific collaboration networks: I. Network construction and fundamental
results. Phys. Rev. E 64, 016131 (2001a)
M.E.J. Newman, Scientific collaboration networks: II. Shortest paths, weighted networks, and centrality. Phys. Rev. E 64, 016132 (2001b)
M.E.J. Newman, Analysis of weighted networks. Phys. Rev. E 70, 056131 (2004)
M.E.J. Newman, S.H. Strogatz, D.J. Watts, Random graphs with arbitrary degree distributions and
their applications, Phys. Rev. E 64, 026118 (2001)
T. Opsahl, V. Colizza, P. Panzarasa, J.J. Ramasco, Prominence and control: the weighted rich-club
effect. Phys. Rev. Lett. 101, 168702 (2008)
T. Opsahl, Triadic closure in two-mode networks: redefining the global and local clustering coeffi-
cients. Soc. Netw. 35(2), 159–167 (2013)

J-P. Onnela, J. Saramäki, J. Kertész, K. Kaski, Intensity and coherence of motifs in weighted complex
networks. Phys. Rev. E 71, 065103 (2005)
T. Opsahl, F. Agneessens, J. Skvoretz, Node centrality in weighted networks: generalizing degree
and shortest paths. Soc. Netw. 32(3), 245–251 (2010)
T. Opsahl, P. Panzarasa, Clustering in weighted networks. Soc. Netw. 31(2), 155–163 (2009)
P. Panzarasa, T. Opsahl, K.M. Carley, Patterns and dynamics of users’ behavior and interaction:
network analysis of an online community. J. Am. Soc. Inf. Sci. Technol. 60(5), 911–932 (2009)
M.Á. Serrano, M. Boguñá, A. Vespignani, Extracting the multiscale backbone of complex weighted
networks. Proc. Natl. Acad. Sci. 106(16), 6483–6488 (2009)
G. Robins, M. Alexander, Small worlds among interlocking directors: network structure and distance
in bipartite graphs. Comput. Math. Organ. Theory 10(1), 69–94 (2004)
J. Saramäki, M. Kivelä, J.P. Onnela, K. Kaski, J. Kertesz, Generalizations of the clustering coefficient
to weighted complex networks. Phys. Rev. E 75(2), 027105 (2007)
K. Smith, H. Azami, M.A. Parra, J.M. Starr, J. Escudero, Cluster-span threshold: An unbiased
threshold for binarising weighted complete networks in functional connectivity analysis, in 2015
37th Annual International Conference of the IEEE Engineering in Medicine and Biology Society
(EMBC) (2015), pp. 2840–3
C. Stadtfeld, P. Block, Interactions, actors, and time: dynamic network actor models for relational
events. Soc. Sci. 318–352 (2017)
T. Viard, R. Fournier-S’niehotta, C. Magnien, M. Latapy, Discovering patterns of interest in IP
traffic using cliques in bipartite link streams, in Proceedings of Complex Networks IX (2018), pp.
233–241
Yu. Wang, E. Ghumare, R. Vandenberghe, P. Dupont, Comparison of different generalizations of
clustering coefficient and local efficiency for weighted undirected graphs. Neural Comput. 29,
313–331 (2017)
S. Wasserman, K. Faust, Social Network Analysis: Methods and Applications (Cambridge University
Press, 1994)
D. Watts, S. Strogatz, Collective dynamics of small-world networks. Nature 393, 440–442 (1998)
K. Wehmuth, A. Ziviani, E. Fleury, A unifying model for representing time-varying graphs, in 2015
IEEE International Conference on Data Science and Advanced Analytics, DSAA 2015 (Campus
des Cordeliers, Paris, France, 2015), pp. 1–10. Accessed from 19–21 Oct 2015
K. Wehmuth, E. Fleury, A. Ziviani, On multiaspect graphs. Theor. Comput. Sci. 651, 50–61 (2016)
B. Zhang, S. Horvath, A general framework for weighted gene co-expression network analysis.
Stat. Appl. Genetics Mol. Biol. 4, 1544–6115 (2005)
P. Zhang, J. Wang, X. Li, M. Li, Z. Di, Y. Fan, Clustering coefficient and community structure of
bipartite networks. Phys. A 387(27), 6869–6875 (2008)
V. Zlatic, G. Bianconi, A. Díaz-Guilera, D. Garlaschelli, F. Rao, G. Caldarelli, On the rich-club
effect in dense and weighted networks. Eur. Phys. J. B 67(3), 271–275 (2009)
Z. Zou, Polynomial-time algorithm for finding densest subgraphs in uncertain graphs, in Proceedings of MLG Workshop (2013)
Chapter 4
Modelling Temporal Networks
with Markov Chains, Community
Structures and Change Points

Tiago P. Peixoto and Martin Rosvall

Abstract While temporal networks contain crucial information about the evolv-
ing systems they represent, only recently have researchers shown how to incorporate higher-order Markov chains, community structure and abrupt transitions to
describe them. However, each approach can only capture one aspect of temporal
networks, which often are multifaceted with dynamics taking place concurrently at
small and large structural scales and also at short and long timescales. Therefore, these
approaches must be combined for more realistic descriptions of empirical systems.
Here we present two data-driven approaches developed to capture multiple aspects
of temporal network dynamics. Both approaches capture short timescales and small
structural scales with Markov chains. Whereas one approach also captures large
structural scales with communities, the other instead captures long timescales with
change points. Using a nonparametric Bayesian inference framework, we illustrate
how the multi-aspect approaches better describe evolving systems by combining dif-
ferent scales, because the most plausible models combine short timescales and small
structural scales with large-scale structural and dynamical modular patterns or many
change points.

Keywords Temporal networks · Higher-order Markov chains · Community structure · Change points · Bayesian inference

With new introduction and conclusion sections, this chapter combines and reuses
text and figures from Peixoto, T. P. & Rosvall, M. Modelling sequences and temporal
networks with dynamic community structures. Nature Communications 8, 582 (2017)
and Peixoto, T. P. & Gauvin, L. Change points, memory and epidemic spreading
in temporal networks. Scientific Reports 8, 15511 (2018), both licensed under a

T. P. Peixoto
Department of Network and Data Science, Central European University, Vienna, Austria
e-mail: t.peixoto@bath.ac.uk
ISI Foundation, Via Chisola 5, 10126 Torino, Italy
M. Rosvall (B)
Integrated Science Lab, Department of Physics, Umeå University, 901 87 Umeå, Sweden
e-mail: martin.rosvall@umu.se


Creative Commons Attribution 4.0 International License, http://creativecommons.org/licenses/by/4.0/.

4.1 Introduction

Recent advances in temporal network theory include modelling of the time-varying network structure (Ho et al. 2011; Perra et al. 2012) as well as processes that take
place on the dynamic structure, such as epidemic spreading (Rocha et al. 2011; Val-
dano et al. 2015; Génois et al. 2015; Ren and Wang 2014). However, most approaches
rely on a characteristic time scale at which they describe the dynamics. These can
be roughly divided into approaches that model temporal correlations of interactions
through Markov chains with short-term memory (Scholtes 2014; Peixoto and Rosvall
2017), and those that model the dynamics at longer times, usually through network
snapshots (Xu and Iii 2013; Gauvin et al. 2014; Peixoto 2015; Stanley et al. 2016;
Ghasemian et al. 2016; Zhang et al. 2017) or discrete change points when the dynam-
ics changes abruptly (Peel and Clauset 2015; De Ridder et al. 2016; Corneli et al.
2017). For example, when the network evolution is represented as a static Markov
chain, such that new edges are placed based on the placement of the last few edges
with fixed transition probabilities, the system eventually reaches equilibrium and
cannot maintain any long-term memory. Conversely, when the network evolution is
represented by a sequence of snapshots, there is no attempt to model any short-term
memory. Consequently, focus on one timescale blurs the other whereas in reality most
systems exhibit dynamics in a wide range of timescales. Moreover, most systems also
exhibit dynamics at multiple structural scales, including large-scale dynamic com-
munities (Gauvin et al. 2014; Peixoto and Rosvall 2017).
In this chapter, we review two approaches that attempt to capture multiple aspects
of temporal network dynamics with Markov chains inferred from data. The first
approach identifies large-scale structures by parametrising the transition probabilities
as well as the nodes into communities. The second approach describes non-stationary
dynamics based on abrupt change points. Both approaches can detect short-term
dynamical memory through arbitrary-order Markov chains and use a nonparamet-
ric Bayesian inference framework that prevents overfitting and yields efficient and
effective algorithms.
We employ the multi-aspect approaches on empirical data and show that the most
plausible models tend to combine short-term memory with large-scale structural and
dynamical modular patterns and also many change points. The communities and
change points work synergistically with the Markov chains, typically uncovering
higher-order memory that is washed out by less flexible models. This effect suggests
that a full dynamical description of large-scale modular structures that combines
community structure and change points would further our understanding of temporal
networks.
4 Modelling Temporal Networks with Markov Chains, Community … 67

4.2 Temporal Networks as Markov Chains

We consider dynamical networks that can be represented as a sequence of edges that are observed in time. This includes, for example, proximity events where an edge denotes when two individuals come in close contact at a specific point in time. More formally, we describe a sequence of discrete tokens, $s = \{x_t\}$, with $t\in\{1, \dots, E\}$ being the relative time ordering of the tokens, and $x_t\in\{1, \dots, D\}$ the set of unique tokens with cardinality $D$. This description is applied to dynamical networks by considering each token as an edge, i.e., $x_t = (i, j)_t$, and hence $D = \binom{N}{2}$ is the total number of possible edges in a network with $N$ nodes (see Fig. 4.1 for an illustration).
This sequential representation lends itself naturally to be modelled as a stationary Markov chain of order $n$, i.e., the sequence $s$ occurs with probability

$P(s|p, n) = \prod_t p_{x_t,\boldsymbol{x}_{t-1}} = \prod_{x,\boldsymbol{x}} p_{x,\boldsymbol{x}}^{a_{x,\boldsymbol{x}}}$,    (4.1)

where $p$ corresponds to the transition matrix, with elements $p_{x_t,\boldsymbol{x}_{t-1}}$ being the probability of observing token $x_t$ given the previous $n$ tokens $\boldsymbol{x}_{t-1} = \{x_{t-1}, \dots, x_{t-n}\}$ in the sequence, and $a_{x,\boldsymbol{x}}$ is the number of observed transitions from memory $\boldsymbol{x}$ to token $x$. Despite its simplicity, this model is able to reproduce arbitrary edge frequencies, determined by the steady-state distribution of the tokens $x$, and temporal correlations between edges. This means that the model should be able to reproduce properties of the data that can be attributed to the distribution of the number of contacts per edge, which are believed to be important, e.g. for epidemic spreading (Gauvin et al. 2013; Vestergaard et al. 2014). However, due to its Markovian nature, the dynamics will eventually forget its history and converge to the limiting distribution (assuming the chain is ergodic and aperiodic). This latter property means that the model should be able to capture nontrivial statistics of waiting times only at a short timescale, comparable to the Markov order.


Fig. 4.1 The modelling of dynamical networks as Markov chains describes the transitions between “tokens” in a sequence, where the tokens are observed edges in the network. For example, a illustrates the transitions in the sequence “It was the best of times” with letters as tokens, and b illustrates the sequence {(1, 2), (4, 3), (5, 2), (10, 8), (7, 2), (9, 3), (3, 4)}.

Given the above model, the simplest way to proceed would be to infer transition probabilities from data using maximum likelihood, i.e., maximizing Eq. 4.1 under the normalization constraint $\sum_x p_{x,\boldsymbol{x}} = 1$. This yields

$\hat{p}_{x,\boldsymbol{x}} = \frac{a_{x,\boldsymbol{x}}}{k_{\boldsymbol{x}}}$,    (4.2)

where $k_{\boldsymbol{x}} = \sum_x a_{x,\boldsymbol{x}}$ is the number of transitions originating from $\boldsymbol{x}$. However, if
we want to determine the most appropriate Markov order n that fits the data, the
maximum likelihood approach cannot be used, as it will overfit, i.e., the likelihood
of Eq. 4.1 will increase monotonically with n, favouring the most complicated model
possible, and thus confounding statistical fluctuations with actual structure. Instead,
the most appropriate way to proceed is to consider the Bayesian posterior distribution

$P(n|s) = \frac{P(s|n)P(n)}{P(s)}$,    (4.3)

which involves the integrated marginal likelihood (Strelioff et al. 2007)



$P(s|n) = \int P(s|p, n)P(p|n)\, dp$,    (4.4)

where the prior probability P( p|n) encodes the amount of knowledge we have on
the transitions p before we observe the data. If we possess no information, we can
be agnostic by choosing a uniform prior
   
$P(p|n) = \prod_{\boldsymbol{x}} (D - 1)!\,\delta\!\left(1 - \sum_x p_{x,\boldsymbol{x}}\right)$,    (4.5)

where D is again the total number of tokens, and δ(x) is the Dirac delta function,
which assumes that all transition probabilities are equally likely. Inserting Eq. 4.1
and 4.5 in Eq. 4.4, and calculating the integral we obtain
$P(s|n) = \prod_{\boldsymbol{x}} \frac{(D - 1)!}{(k_{\boldsymbol{x}} + D - 1)!} \prod_x a_{x,\boldsymbol{x}}!$.    (4.6)

The remaining prior, P(n), that represents our a priori preference to the Markov
order, can also be chosen in an agnostic fashion in a range [0, N ], i.e.,

$P(n) = \frac{1}{N + 1}$.    (4.7)

Since this prior is a constant, the upper bound N has no effect on the posterior of
Eq. 4.3, provided it is sufficiently large to include most of the distribution.
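As an illustration, the following minimal sketch (not the authors' code) evaluates the log of the marginal likelihood of Eq. 4.6 for a toy token sequence and several Markov orders $n$; with the flat priors of Eqs. 4.5 and 4.7, comparing these values across $n$ amounts to comparing the posterior of Eq. 4.3. The alphabet size $D$ is here taken to be the number of distinct observed tokens, which is a simplification.

```python
# Sketch only: log marginal likelihood ln P(s|n) of Eq. 4.6 for several Markov orders.
from collections import Counter, defaultdict
from math import lgamma

def log_marginal_likelihood(s, n, D):
    """s: sequence of hashable tokens; n: Markov order; D: number of possible tokens."""
    a = Counter()                      # a[(memory, token)]: observed transition counts
    k = defaultdict(int)               # k[memory]: transitions leaving that memory
    for t in range(n, len(s)):
        mem = tuple(s[t - n:t])
        a[(mem, s[t])] += 1
        k[mem] += 1
    logL = 0.0
    for mem, kx in k.items():          # memories never observed contribute a factor of 1
        logL += lgamma(D) - lgamma(kx + D)    # (D-1)! / (k_x + D - 1)!
    for count in a.values():
        logL += lgamma(count + 1)             # a_{x,x}!
    return logL

s = list("abababababababababababcc")          # toy token sequence
D = len(set(s))
for n in range(4):
    print(n, round(log_marginal_likelihood(s, n, D), 2))
```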
Differently from the maximum-likelihood approach described previously, the pos-
terior distribution of Eq. 4.3 will select the size of the model to match the statistical

Fig. 4.2 Posterior distribution of the Markov order P(n|x) (Eq. 4.3) for a temporal network between students in a high school (Fournet and Barrat 2014). [Plot: ln P(n|x) as a function of the Markov order n ∈ {0, 1, 2, 3}]

significance available, and will favour a more complicated model only if the data can-
not be suitably explained by a simpler one, i.e., it corresponds to an implementation
of Occam’s razor that prevents overfitting.
Although elegant, this modelling approach turns out to be limited in capturing
the dynamics observed in real systems. For example, when applied to a proximity
network between students in a high school (Fournet and Barrat 2014), it favours a
“zero-order” n = 0 Markov chain, indicating that the edges occur independently at
random in the sequence, as can be seen in Fig. 4.2. Rather than concluding that this
uncovers a lack of temporal structure in the data, it is in fact a lot more plausible
that this description is too simplistic and ill-suited to capture the actual underlying
dynamics. One way of seeing this is by comparing properties of the data with artifi-
cial sequences generated by the fitted model. For example, if we generate temporal
networks using the maximum-likelihood transition probabilities p̂x,x = ax,x /k x , and
simulate an epidemic spreading dynamic on them, the observed behaviour is quite
different from the same dynamics simulated on the empirical time-series, as can be
seen in Fig. 4.3. Importantly, independently of the Markov order n—even those val-
ues that overfit—the simulated dynamics lacks the abrupt changes in the infection
rate observed with the empirical data. This discrepancy exposes the inadequacy of
simple Markov assumption considered above. However, as we will show below, this
model can nevertheless be used as the basis for higher-order models that are in fact
capable of capturing important aspects of the underlying dynamics. In the following,
we present two ways in which this can be done: (1) by parametrising the Markov
chain using community structures (Peixoto and Rosvall 2017), and (2) by introducing
non-stationary Markov chains with latent change points (Peixoto and Gauvin 2018).


Fig. 4.3 Number of infected nodes over time X (t) for a temporal network between students in a
high-school (Fournet and Barrat 2014) (N = 126), considering both the original data and artificial
time-series generated from the fitted Markov model of a given order n, using a SIR and b SIS
epidemic models. In all cases, the values were averaged over 100 independent realizations of the
network model (for the artificial datasets) and dynamics. The shaded areas indicate the standard
deviation of the mean. The values of the infection and recovery rates were chosen so that the
spreading dynamics spans the entire time range of the dataset

4.3 Markov Chains with Communities

One of the main limitations of the Markov chain model considered previously is its
relatively large number of parameters. For a network of N nodes, the transition matrix
p has O(N 2(n+1) ) entries, which means that even for n = 1 we would still need to
infer O(N 4 ) parameters from data, requiring the length of the observed sequence
to be compatible in size. If the size of the data is insufficient given the parameter
space, overfitting is unavoidable, and the Bayesian posterior of Eq. 4.3 appropriately
prefers a smaller model, e.g. with n = 0 for which the number of parameters is much
smaller, O(N 2 ). This situation can be improved by imposing latent structure in the
transition matrix p, which can be used to adaptively match the dimension of the
model, for any given Markov order n, according to what is compatible with the data.
The alternative we propose is to assume that both memories and tokens are distributed
in disjoint groups (see Fig. 4.4). That is, $b_x \in [1, 2, \dots, B_N]$ and $b_{\boldsymbol{x}} \in [B_N + 1, B_N + 2, \dots, B_N + B_M]$ are the group memberships of the tokens and memories uniquely


Fig. 4.4 Schematic representation of the Markov model with communities. The token sequence $\{x_t\}$ = “It was the best of times” is represented with nodes for memories (top row) and tokens (bottom row), and with directed edges for transitions in different variations of the model. a A partition of the tokens and memories for an n = 1 model. b A unified formulation of an n = 1 model, where the tokens and memories have the same partition, and hence can be represented as a single set of nodes. c A partition of the tokens and memories for an n = 2 model

assigned in BN and BM groups, respectively, such that the transition probabilities can
be parametrised as
$p_{x,\boldsymbol{x}} = \theta_x \lambda_{b_x b_{\boldsymbol{x}}}$.    (4.8)

Here $\theta_x$ is the relative probability at which token $x$ is selected among those that belong to the same group, and $\lambda_{rs}$ is the overall transition probability from memory group $s = b_{\boldsymbol{x}}$ to token group $r = b_x$. Note that, in principle, the number of parameters
in this model can be larger than the previous model, however this depends on the
numbers of groups BN and BM , which is something that needs to be chosen to match
the data, as will be done in our method.
In the case of Markov order n = 1 each token appears twice in the model, both as
token and memory. An alternative and often useful approach for n = 1 is to consider
a single unified partition for both tokens and memories, as shown in Fig. 4.4b and
described in more detail in Peixoto and Rosvall (2017). In any case, the maximum
likelihood parameter estimates are

$\hat\lambda_{rs} = \frac{e_{rs}}{e_s}, \qquad \hat\theta_x = \frac{k_x}{e_{b_x}}$,    (4.9)

where $e_{rs}$ is the number of observed transitions from group $s$ to $r$, $e_s = \sum_t e_{ts}$ is the
total outgoing transitions from group s if s is a memory group, or the total incoming
transition if it is a token group. The labels r and s are used indistinguishably to denote
memory and token groups, since it is only their numerical value that determines their

kind. Finally, k x is the total number of occurrences of token x. Putting this back in
the likelihood of Eq. 4.1, we have
$\ln \hat P(s|b, \lambda, \theta, n) = \sum_{r<s} e_{rs} \ln\frac{e_{rs}}{e_r e_s} + \sum_x k_x \ln k_x$.    (4.10)
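To see what Eq. 4.10 computes, here is a minimal sketch (not the authors' code) that evaluates it for an n = 1 chain given a partition of tokens and memories; the toy partition and variable names are illustrative, and the token and memory groups use disjoint label ranges as in the text.

```python
# Sketch only: maximum-likelihood log-probability of Eq. 4.10 for an n = 1 chain.
from collections import Counter
from math import log

def grouped_log_likelihood(s, b_token, b_memory):
    """s: token sequence; b_token[x] / b_memory[x]: group label of x as token / as memory."""
    e = Counter()                      # e[(r, s)]: transitions from memory group s to token group r
    k = Counter(s[1:])                 # k[x]: occurrences of x as a target token
    for prev, nxt in zip(s, s[1:]):
        e[(b_token[nxt], b_memory[prev])] += 1
    e_token, e_memory = Counter(), Counter()
    for (r, grp), cnt in e.items():
        e_token[r] += cnt              # e_r: incoming transitions of token group r
        e_memory[grp] += cnt           # e_s: outgoing transitions of memory group s
    logL = sum(cnt * log(cnt / (e_token[r] * e_memory[grp])) for (r, grp), cnt in e.items())
    logL += sum(kx * log(kx) for kx in k.values())
    return logL

s = list("abababcdcdcd")
b_token = {"a": 0, "b": 0, "c": 1, "d": 1}              # token groups
b_memory = {x: g + 2 for x, g in b_token.items()}       # memory groups, disjoint labels
print(grouped_log_likelihood(s, b_token, b_memory))
```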

As before, this maximum likelihood approach cannot be used if we do not know the order of the Markov chain, otherwise it will overfit. In fact, this problem is now aggravated by the potentially larger number of model parameters. Therefore, we
employ a Bayesian formulation and construct a generative process for the model
parameters themselves. We do this by introducing prior probability densities for
the parameters P(θ |α) and P(λ|β) for tokens and memories, respectively, with
hyperparameter sets α and β, and computing the integrated likelihood

$P(s|b, n, \alpha, \beta) = \int P(s|b, \lambda, \theta)P(\theta|\alpha)P(\lambda|\beta)\, d\theta\, d\lambda$,    (4.11)

where we used $b$ as a shorthand for $\{b_x\}$ and $\{b_{\boldsymbol{x}}\}$. Now, instead of inferring the
hyperparameters, we can make a noninformative choice for α and β that reflects our
a priori lack of preference towards any particular model (Jaynes 2003). Doing so in
this case yields a likelihood (for details, see Peixoto and Rosvall 2017),
 
$P(s|b, n) = \prod_r \frac{(n_r - 1)!}{(e_r + n_r - 1)!} \prod_s \frac{(B_N - 1)!\, \prod_r e_{rs}!}{(e_s + B_N - 1)!} \prod_x k_x!$,    (4.12)

where $n_r = \sum_x \delta_{b_x,r}$ is the number of tokens in group $r$. The Markov order and partitions can now be inferred from the posterior distribution

$P(b, n|s) = \frac{P(s|b, n)P(b)P(n)}{P(s)}$,    (4.13)

where $P(s) = \sum_{b,n} P(s|b, n)P(b)P(n)$ is an intractable constant, that fortunately
does not need to be determined when either maximizing or sampling from the pos-
terior distribution.
Before proceeding further with this model, we note that it still treats each pos-
sible edge as an individual token. However, this can be suboptimal if the networks
are sparse, that is, if only a relatively small subset of all possible edges occur, and
thus there are insufficient data to reliably fit the model. Furthermore, although it puts
the tokens and memories into groups, it does not take into account any community
structure present in the network, i.e., groups of nodes with similar connection pat-
terns. In order to alleviate both problems, we adapt the model above by including
an additional generative layer between the Markov chain and the observed edges.
We do so by partitioning the nodes of the network into groups, that is, ci ∈ [1, C]
determines the membership of node i in one of C groups, such that each edge (i, j)

is associated with a label (ci , c j ). Then we define a Markov chain for the sequence
of edge labels and sample the actual edges conditioned only on the labels. Since
this reduces the number of possible tokens from O(N 2 ) to O(C 2 ), it has a more
controllable number of parameters that can better match the sparsity of the data. We
further assume that, given the node partitions, the edges themselves are sampled in
a degree-corrected manner, conditioned on the edge labels,

$P((i, j)|(r, s), \kappa, c) = \begin{cases} \delta_{c_i,r}\delta_{c_j,s}\,\kappa_i\kappa_j & \text{if } r \neq s\\ 2\,\delta_{c_i,r}\delta_{c_j,s}\,\kappa_i\kappa_j & \text{if } r = s, \end{cases}$    (4.14)

where $\kappa_i$ determines the probability of a node being selected inside a group, with $\sum_{i\in r}\kappa_i = 1$. The total likelihood conditioned on the label sequence becomes

$P(\{(i, j)_t\}|\{(r, s)_t\}, \kappa, c) = \prod_t P((i, j)_t|(r, s)_t, \kappa)$.    (4.15)

Since we want to avoid overfitting the model, we once more use noninformative
priors, but this time on {κi }, and integrate over them,

$P(\{(i, j)_t\}|\{(r, s)_t\}, c) = \int P(\{(i, j)_t\}|\{(r, s)_t\}, \kappa, c)P(\kappa)\, d\kappa$,    (4.16)

$= \frac{\prod_i d_i!\; \prod_r 2^{m_{rr}}}{\prod_r m_r!} \prod_r \binom{n_r + m_r - 1}{m_r}^{-1}$,    (4.17)

where $d_i$ is the degree of node $i$, $m_{rs}$ is the total number of edges between groups $r$ and $s$, and $m_r = \sum_s m_{rs}$.
Combining this result with Eq. 4.12, we have the complete likelihood of the tem-
poral network,

P(s|c, b) = P({(i, j)t }|{(r, s)t }, c)P({(r, s)t }|b), (4.18)

conditioned only on the partitions, and remembering that s = {(i, j)t }. Finally, the
full posterior distribution involves both kinds of partitions considered,

$P(c, b, n|s) = \frac{P(s|c, b)P(c)P(b)P(n)}{P(s)}$.    (4.19)

For details on the priors P(c) and P(b) and the algorithmic approach to maximise
the posterior distribution, we refer to Peixoto and Rosvall (2017).
We employ this model in a variety of dynamic network datasets from different
domains (for details, see Table 4.1 and Peixoto and Rosvall 2017). In all cases, we
infer models with n > 0 that identify many groups for the tokens and memories,
meaning that the model succeeds in capturing temporal structures. In most cases,
models with n = 1 best describe the data, implying that there is not sufficient evidence
Table 4.1 Summary of inference results for empirical temporal networks. Description length $\Sigma = -\log_2 P(\{(i, j)_t\}, c, b)$ in bits and inferred number of node groups C, token groups BN, and memory groups BM for different data sets and different Markov orders n. Values in grey correspond to the minimum of each column.

High school proximity (N = 327, E = 5,818)
  n   C    BN   BM   Σ
  0   10   –    –    89,577
  1   10   9    9    82,635
  2   10   6    6    86,249
  3   9    6    6    103,453

Enron email (N = 87,273, E = 1,148,072)
  n   C      BN     BM     Σ
  0   1,447  –      –      19,701,405
  1   1,596  2,219  2,201  13,107,399
  2   324    366    313    16,247,904
  3   363    333    289    26,230,928

APS citations (N = 425,760, E = 4,262,443)
  n   C      BN     BM     Σ
  0   3,774  –      –      131,931,579
  1   4,426  6,853  6,982  94,523,280
  2   4,268  710    631    144,887,083
  3   4,268  454    332    228,379,667

prosper.com loans (N = 89,269, E = 3,394,979)
  n   C    BN    BM    Σ
  0   318  –     –     96,200,002
  1   267  1039  1041  59,787,374
  2   205  619   367   109,041,487
  3   260  273   165   175,269,743

Hospital contacts (N = 75, E = 32,424)
  n   C    BN   BM   Σ
  0   68   –    –    484,121
  1   60   58   58   245,479
  2   62   29   26   366,351
  3   50   11   7    644,083

Infectious Sociopatterns (N = 10,972, E = 415,912)
  n   C     BN    BM    Σ
  0   4695  –     –     8,253,351
  1   5572  2084  2084  4,525,629
  2   5431  3947  3947  7,503,859
  3   1899  829   783   12,527,730

Internet AS (N = 53,387, E = 500,106)
  n   C    BN   BM   Σ
  0   187  –    –    19,701,403
  1   185  131  131  10,589,136
  2   132  75   43   14,199,548
  3   180  87   79   22,821,016

Chess moves (N = 76, E = 3,130,166)
  n   C   BN   BM   Σ
  0   72  –    –    66,172,128
  1   72  339  339  58,350,128
  2   72  230  266  58,073,342
  3   72  200  205  76,465,862


Fig. 4.5 Inferred temporal model for a high school proximity network (Mastrandrea et al. 2015). a The static part of the model divides the high school students into C = 10 groups (square nodes) that almost match the known classes (text labels). b The dynamic part of the model divides the directed multigraph group pairs in a into BN = BM = 9 groups (grey circles). The model corresponds to an n = 1 unified Markov chain on the edge labels, where the memory and tokens have identical partitions, as described in detail in Peixoto and Rosvall (2017)

for higher-order memory, with the exception of the network of chess moves, which is
best described by a model with n = 2. We note that this only occurs when we use
the intermediary layer where the Markov chain generates edge types instead of the
edges. If we fit the original model without this modification, we indeed get much
larger description lengths and we often fail to detect any Markov structure with n > 0
(not shown).
To illustrate how the model characterizes the temporal structure of these systems,
we focus on the proximity network of high school students, which corresponds to the
voluntary tracking of 327 students for a period of 5 days (Mastrandrea et al. 2015).
Whenever the distance between two students fell below a threshold, an edge between
them was recorded at that time. In the best-fitting model for these data, the inferred
groups for the aggregated network correspond exactly to the known division into 9
classes, except for the PC class, which was divided into two groups (Fig. 4.5). The
groups show a clear assortative structure, where most connections occur within each
class. The clustering of the edge labels in the second part of the model reveals the
temporal dynamics. We observe that the edges connecting nodes of the same group
cluster either in single-node or small groups, with a high incidence of self-loops.
This means that if an edge that connects two students of the same class appears
in the sequence, the next edge is most likely also inside the same class, indicating
that the students of the same class are clustered in space and time. The remaining
edges between students of different classes are separated into two large groups. This
division indicates that the different classes meet each other at different times. Indeed,
the classes are located in different parts of the school building and they typically go
to lunch separately (Mastrandrea et al. 2015). Accordingly, our method can uncover
the associated dynamical pattern from the data alone.

4.4 Markov Chains with Change Points

The model considered previously succeeds in capturing both temporal and structural patterns in empirical dynamic networks, but it is still based on a Markov chain with stationary transition probabilities. As discussed before, this means long-term correlations are not used to inform the model, which can only aggregate temporal heterogeneities into an effective model that averages out possible changes in the transition probabilities over time. In this section we describe how to model changes in the dynamics by considering non-stationary transition probabilities $p_{x,\boldsymbol{x}}$ that change abruptly at a given change point, but otherwise remain constant between change points. The occurrence of change points is governed by the probability $q$ that one is inserted at any given time. The existence of $M$ change points divides the time series into $M + 1$ temporal segments indexed by $l\in\{0, \dots, M\}$. The variable $l_t$ indicates to which temporal segment a given time $t$ belongs among the $M + 1$ segments. Thus, the conditional probability of observing a token $x$ at time $t$ in segment $l_t$ is given by

$P(x_t, l_t|\boldsymbol{x}_{t-1}, l_{t-1}) = p^{l_t}_{x_t,\boldsymbol{x}_{t-1}}\,\big[q(1 - \delta_{l_t,l_{t-1}}) + (1 - q)\delta_{l_t,l_{t-1}}\big]$,    (4.20)

where $p^{l_t}_{x,\boldsymbol{x}}$ is the transition probability inside segment $l_t$ and $q$ is the probability to
transit from segment l to l + 1. The probability of a whole sequence s = {xt } and
l = {lt } being generated is then
 ax,x
l
P(s, l| p, q) = q M (1 − q) E−M plx,x (4.21)
l,x,x

l
where ax,x is the number of transitions from memory x to token x in the segment l.
Note that we recover the stationary model of Eq. 4.1 by setting q = 0. The maximum-
likelihood estimates of the parameters are
l
ax,x M
p̂lx,x = , q̂ = (4.22)
k lx E
 l
where k lx = x ax,x is the number of transitions originating from x in a segment l.
But once more, we want to infer the segments $\boldsymbol{l}$ in a Bayesian way, via the posterior distribution

$$P(\boldsymbol{l} \mid s, n) = \frac{P(s, \boldsymbol{l} \mid n)}{P(s \mid n)}, \qquad (4.23)$$

where the numerator is the integrated likelihood

$$P(s, \boldsymbol{l} \mid n) = \int P(s, \boldsymbol{l} \mid p, q, n)\, P(p \mid n)\, P(q)\; \mathrm{d}p\, \mathrm{d}q, \qquad (4.24)$$

using uniform priors $P(q) = 1$, and



$$P(p \mid n) = \prod_{l} P(p^{l} \mid d_l, n)\, P(d_l), \qquad (4.25)$$

with the uniform prior

$$P(p^{l} \mid d_l, n) = \prod_{x'} (D_l - 1)!\; \delta\!\left(\sum_{x} p^{l}_{x,x'} - 1\right), \qquad (4.26)$$

and

$$P(d_l) = 2^{-D} \qquad (4.27)$$

being the prior for the alphabet $d_l$ of size $D_l$ inside segment $l$, sampled uniformly from all possible subsets of the overall alphabet of size $D$. Performing the above integral, we obtain

$$P(\boldsymbol{x}, \boldsymbol{l} \mid n) = \frac{M!\,(E-M)!}{(E+1)!}\; 2^{-D(M+1)} \prod_{l} \prod_{x'} \frac{(D_l - 1)!}{(k^{l}_{x'} + D_l - 1)!} \prod_{x} a^{l}_{x,x'}!\,. \qquad (4.28)$$

Like with the previous stationary model, both the order and the positions of the change points can be inferred from the joint posterior distribution

$$P(\boldsymbol{l}, n \mid \boldsymbol{x}) = \frac{P(\boldsymbol{x}, \boldsymbol{l} \mid n)\, P(n)}{P(\boldsymbol{x})}, \qquad (4.29)$$

in a manner that intrinsically prevents overfitting. This constitutes a robust and elegant way of extracting this information from data, which contrasts with non-Bayesian methods of detecting change points using Markov chains that tend to be more cumbersome (Polansky 2007), and is more versatile than approaches that assume a fixed Markov order (Arnesen et al. 2016).
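To make the computation concrete, the following is a minimal Python sketch (illustrative only, not the authors' implementation) that accumulates the per-segment transition counts $a^{l}_{x,x'}$ from a token sequence with given change-point positions, and evaluates the logarithm of the integrated likelihood of Eq. 4.28 from those counts. Factorials are handled through lgamma, and, as a simplifying assumption, the segment alphabet size $D_l$ is approximated by the number of distinct tokens observed in segment $l$.

```python
# Illustrative sketch (not the authors' code): per-segment transition counts
# a^l_{x,x'} for a Markov chain of order n with given change points, and the
# log of the integrated likelihood of Eq. 4.28 computed from those counts.
from collections import defaultdict
from math import lgamma, log

def segment_counts(tokens, change_points, n=1):
    counts = defaultdict(lambda: defaultdict(int))   # counts[l][(memory, token)]
    cps, segment = sorted(change_points), 0
    for t in range(n, len(tokens)):
        while segment < len(cps) and t >= cps[segment]:
            segment += 1                             # enter the next segment
        counts[segment][(tuple(tokens[t - n:t]), tokens[t])] += 1
    return counts

def log_evidence(tokens, change_points, D, n=1):
    counts = segment_counts(tokens, change_points, n)
    M, E = len(change_points), len(tokens) - n       # change points, transitions
    logL = lgamma(M + 1) + lgamma(E - M + 1) - lgamma(E + 2)  # M!(E-M)!/(E+1)!
    logL += -(M + 1) * D * log(2)                             # factor 2^{-D(M+1)}
    for seg in counts.values():
        D_l = len({token for (_, token) in seg})      # alphabet seen in segment l
        k = defaultdict(int)                          # k^l_{x'} = sum_x a^l_{x,x'}
        for (memory, _), a in seg.items():
            k[memory] += a
        for k_mem in k.values():
            logL += lgamma(D_l) - lgamma(k_mem + D_l)  # (D_l-1)!/(k+D_l-1)!
        for a in seg.values():
            logL += lgamma(a + 1)                      # a^l_{x,x'}!
    return logL

# Example: a binary token sequence with a candidate change point at position 8
print(log_evidence(list("abababababbbabbbab"), change_points=[8], D=2, n=1))
```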
The exact computation of the posterior of Eq. 4.23 would require the marginalization of the above distribution over all possible segmentations $\boldsymbol{l}$, yielding the denominator $P(\boldsymbol{x} \mid n)$, which is infeasible for all but the smallest time series. However, it is not necessary to compute this value if we sample $\boldsymbol{l}$ from the posterior using Monte Carlo. We do so by making move proposals $\boldsymbol{l} \to \boldsymbol{l}'$ with probability $P(\boldsymbol{l}' \mid \boldsymbol{l})$, and accepting them with probability $a$ according to the Metropolis-Hastings criterion (Metropolis et al. 1953; Hastings 1970)

$$a = \min\left(1,\; \frac{P(\boldsymbol{l}' \mid \boldsymbol{x}, n)\, P(\boldsymbol{l} \mid \boldsymbol{l}')}{P(\boldsymbol{l} \mid \boldsymbol{x}, n)\, P(\boldsymbol{l}' \mid \boldsymbol{l})}\right), \qquad (4.30)$$

which does not require the computation of $P(\boldsymbol{x} \mid n)$, as it cancels out in the ratio. If the move proposals are ergodic, i.e., they allow every possible partition $\boldsymbol{l}$ to be visited eventually, this algorithm will asymptotically sample from the desired posterior. We refer to Peixoto and Gauvin (2018) for more details about the algorithm, including the move proposals used.
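The following is a schematic version of such a sampler, given here only as a simplified sketch and not as the algorithm of Peixoto and Gauvin (2018), which uses a richer move set including the insertion and removal of change points. It assumes a user-supplied function `log_posterior(tokens, change_points)` returning $\ln P(\boldsymbol{x}, \boldsymbol{l} \mid n)$, such as the log-evidence sketch above, and uses a symmetric proposal that shifts a single change point, so that the proposal probabilities cancel in Eq. 4.30.

```python
# Schematic Metropolis-Hastings sampling of change-point positions (Eq. 4.30).
# Assumes `log_posterior(tokens, change_points)` returns ln P(x, l | n);
# the shift proposal is symmetric, so P(l'|l)/P(l|l') = 1 and cancels.
import random
from math import exp

def mh_change_points(tokens, change_points, log_posterior,
                     n_sweeps=10000, max_shift=5):
    l = sorted(change_points)
    logp = log_posterior(tokens, l)
    for _ in range(n_sweeps):
        if not l:
            break
        i = random.randrange(len(l))                        # pick one change point
        shift = random.randint(-max_shift, max_shift)
        proposal = sorted(l[:i] + [l[i] + shift] + l[i + 1:])
        if proposal[0] <= 0 or proposal[-1] >= len(tokens):
            continue                                        # discard out-of-range moves
        logp_new = log_posterior(tokens, proposal)
        if logp_new >= logp or random.random() < exp(logp_new - logp):
            l, logp = proposal, logp_new                    # accept (Eq. 4.30)
    return l
```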

Fig. 4.6 Integrated joint likelihood $P(\boldsymbol{x}, \boldsymbol{l} \mid n)$ (Eq. 4.28), shown as $\ln P(\boldsymbol{x}, \boldsymbol{l} \mid n)$ as a function of the Markov order $n$, for a temporal network between students in a high school (Fournet and Barrat 2014), for the stationary (i.e., zero change points) and nonstationary models. For all values of $n$ the likelihoods are higher for the nonstationary model (yielding a posterior odds ratio $\Lambda > 1$)

Once a fit is obtained, we can compare the above model with the stationary one by computing the posterior odds ratio

$$\Lambda = \frac{P(\boldsymbol{l}, n \mid \boldsymbol{x})}{P(\boldsymbol{l}_0, n_0 \mid \boldsymbol{x})} = \frac{P(\boldsymbol{x}, \boldsymbol{l} \mid n)}{P(\boldsymbol{x}, \boldsymbol{l}_0 \mid n_0)}, \qquad (4.31)$$

where $\boldsymbol{l}_0$ is the partition into a single interval (which is equivalent to the stationary model). A value $\Lambda > 1$ [i.e., $P(\boldsymbol{x}, \boldsymbol{l} \mid n) > P(\boldsymbol{x}, \boldsymbol{l}_0 \mid n_0)$] indicates larger evidence for the nonstationary model. As can be seen in Fig. 4.6, we indeed observe larger evidence for the nonstationary model for all Markov orders. In addition, using this general model we identify $n = 1$ as the most plausible Markov order, in contrast to the $n = 0$ obtained with the stationary model. Therefore, identifying change points allows us not only to uncover patterns at longer timescales; the separation into temporal segments also enables the identification of statistically significant patterns at short timescales, which would otherwise remain obscured with the stationary model, even though it is designed to capture only these kinds of correlations.
The improved quality of this model is also evident when we investigate the epi-
demic dynamics, as shown in Fig. 4.7. In order to obtain an estimate of the number
of infected based on the model, we generated different sequences of edges using the
fitted segments and transition probabilities $\hat{p}^{l}_{x,x'} = a^{l}_{x,x'}/k^{l}_{x'}$ in each of the segments, estimated with Markov orders going from 0 to 3. We simulated SIR and SIS processes on top of the generated networks and averaged the number of infected over many instances. Looking at Fig. 4.7, we see that the inferred positions of the change points tend to coincide with the abrupt changes in infection rates, and the empirical and generated time series are in very good agreement. For higher Markov
order, the agreement improves, although the improvement seen for n > 1 is probably
due to overfitting, given the results of Fig. 4.6. The fact that n = 0 provides the worst
fit to the data shows that it is not only the existence of change points, but also the
inferred Markov dynamics that contribute to the quality of the model in reproducing
the epidemic spreading.


Fig. 4.7 Number of infected nodes over time $X(t)$ for a temporal network between students in a high school (Fournet and Barrat 2014) ($N = 126$), considering both the original data and artificial time series generated from the fitted nonstationary Markov model of a given order $n$, using (a) SIR ($\beta = 0.41$, $\gamma = 0.005$) and (b) SIS ($\beta = 0.61$, $\gamma = 0.03$) epidemic models. The vertical lines mark the positions of the inferred change points. In all cases, the values were averaged over 100 independent realizations of the network model (for the artificial datasets) and dynamics. The shaded areas indicate the standard deviation of the mean

In order to examine the link between the structure of the network and the
change points, we fitted a layered hierarchical degree-corrected stochastic block
model (Peixoto 2015, 2017) to the data, considering each segment as a separate
edge layer. Figure 4.8 shows that the density of connections between node groups varies substantially across segments, suggesting that each change point marks an abrupt transition in the typical kind of encounters between students, representing breaks between classes, meal times, etc. This yields an insight as to why these changes in pattern may slow down or speed up epidemic spreading: if students are confined to their classrooms, contagion across classrooms is inhibited, but as soon as they are free to move around the school grounds, so can the epidemic.

Fig. 4.8 Network structure inside the first nine segments of a temporal network between students
in a high-school (Fournet and Barrat 2014). The segments were captured by a layered hierarchical
degree-corrected stochastic block model (Peixoto 2015) using the frequency of interactions as edge
covariates (Peixoto 2017) (indicated by colors), where each segment is considered as a different
layer. The values of the infection and recovery rates were chosen so that the spreading dynamics
spans the entire time range of the dataset

4.5 Conclusion

In this chapter, we reviewed two data-driven approaches to model temporal networks


based on arbitrary-order Markov chains, while at the same time incorporating two
kinds of higher-order structures: (1) the division of the transition probabilities and
the nodes in the network into communities, and (2) the abrupt transition of the
Markov transition probabilities at specific change points. In each case, we have
described a Bayesian framework that allows the inference of communities, change
points and Markov order from data in a manner that prevents overfitting, and enables
the selection of competing models.

We have applied our approach to a variety of empirical dynamical networks,


and we have evaluated the inferred models based on their capacity to compress the
data and to reproduce the epidemic spreading observed with the original data. We
have seen that the model with communities uncovers modular structure both in the
network itself and in its dynamical patterns. In turn, the nonstationary model with
change points accurately reproduces the highly-variable nature of the infection rate,
with changes correlating strongly with the inferred change points.
Both modelling approaches can be extended to data with continuous time and with
bursty dynamics by introducing waiting times as additional covariates, as described
in Peixoto and Rosvall (2017).
There is no reason why community structure and change points cannot be con-
sidered together, which would allow, among other things, for a fully dynamical
description of large-scale modular structures. In fact, there are approaches not based
on Markov chains that do just that (Peel and Clauset 2015). However, combining
both models described here seems at first challenging, as the most direct approach
yields a cumbersome model that is difficult to operate algorithmically. An elegant
approach that combines both aspects simultaneously—thus allowing for the syner-
gistic combination of multiple timescales and dynamic community structure—is a
desirable area for future work.
Finally, it would be interesting to investigate how the approaches presented here
can be extended from dynamics represented as a sequence of edges to scenarios
where edges are allowed both to appear and disappear from the network.

Acknowledgements M.R. was supported by the Swedish Research Council grant 2016-00796.

References

P. Arnesen, T. Holsclaw, P. Smyth, Bayesian detection of changepoints in finite-state Markov chains


for multiple sequences. Technometrics 58, 205–213 (2016)
M. Corneli, P. Latouche, F. Rossi, Multiple change points detection and clustering in dynamic
network. Stat. Comput. (2017)
S. De Ridder, B. Vandermarliere, J. Ryckebusch, Detection and localization of change points in
temporal networks with the aid of stochastic block models. J. Stat. Mech: Theory Exp. 2016,
113302 (2016)
J. Fournet, A. Barrat, Contact Patterns among High School Students. PLoS ONE 9, e107878 (2014)
L. Gauvin, A. Panisson, C. Cattuto, A. Barrat, Activity clocks: spreading dynamics on temporal networks of human contact. Sci. Rep. 3 (2013)
L. Gauvin, A. Panisson, C. Cattuto, Detecting the community structure and activity patterns of
temporal networks: a non-negative tensor factorization approach. PLoS ONE 9, e86028 (2014)
M. Génois, C.L. Vestergaard, C. Cattuto, A. Barrat, Compensating for population sampling in
simulations of epidemic spread on temporal contact networks. Nat. Commun. 6 (2015)
A. Ghasemian, P. Zhang, A. Clauset, C. Moore, L. Peel, Detectability thresholds and optimal
algorithms for community structure in dynamic networks. Phys. Rev. X 6, 031005 (2016)
W.K. Hastings, Monte Carlo sampling methods using Markov chains and their applications.
Biometrika 57, 97–109 (1970)

Q. Ho, L. Song, E.P. Xing, Evolving cluster mixed-membership blockmodel for time-varying net-
works. J. Mach. Learn. Res.: Workshop Conf. Proc. 15, 342–350 (2011)
E.T. Jaynes, Probability Theory: The Logic of Science (Cambridge University Press, Cambridge,
UK, New York, NY, 2003)
R. Mastrandrea, J. Fournet, A. Barrat, Contact patterns in a high school: a comparison between
data collected using wearable sensors, contact diaries and friendship surveys. PLoS ONE 10,
e0136497 (2015)
N. Metropolis, A.W. Rosenbluth, M.N. Rosenbluth, A.H. Teller, E. Teller, Equation of state calcu-
lations by fast computing machines. J. Chem. Phys. 21, 1087 (1953)
L. Peel, A. Clauset, Detecting change points in the large-scale structure of evolving networks, in
Twenty-Ninth AAAI Conference on Artificial Intelligence (2015)
T.P. Peixoto, Inferring the mesoscale structure of layered, edge-valued, and time-varying networks.
Phys. Rev. E 92, 042807 (2015)
T.P. Peixoto, Nonparametric Bayesian inference of the microcanonical stochastic block model.
Phys. Rev. E 95, 012317 (2017)
T.P. Peixoto, L. Gauvin, Change points, memory and epidemic spreading in temporal networks.
Sci. Rep. 8, 15511 (2018)
T.P. Peixoto, M. Rosvall, Modelling sequences and temporal networks with dynamic community
structures. Nat. Commun. 8, 582 (2017)
N. Perra, B. Gonçalves, R. Pastor-Satorras, A. Vespignani, Activity driven modeling of time varying networks. Sci. Rep. 2 (2012)
A.M. Polansky, Detecting change-points in Markov chains. Comput. Stat. Data Anal. 51, 6013–6026
(2007)
G. Ren, X. Wang, Epidemic spreading in time-varying community networks. Chaos: Interdiscip. J.
Nonlinear Sci. 24, 023116 (2014)
L.E.C. Rocha, F. Liljeros, P. Holme, Simulated epidemics in an empirical spatiotemporal network
of 50,185 sexual contacts. PLoS Comput. Biol. 7, e1001109 (2011)
I. Scholtes et al., Causality-driven slow-down and speed-up of diffusion in non-Markovian temporal
networks. Nat. Commun. 5 (2014)
N. Stanley, S. Shai, D. Taylor, P.J. Mucha, Clustering network layers with the strata multilayer
stochastic block model. IEEE Trans. Netw. Sci. Eng. 3, 95–105 (2016)
C.C. Strelioff, J.P. Crutchfield, A.W. Hübler, Inferring Markov chains: Bayesian estimation, model
comparison, entropy rate, and out-of-class modeling. Phys. Rev. E 76, 011106 (2007)
E. Valdano, L. Ferreri, C. Poletto, V. Colizza, Analytical computation of the epidemic threshold on
temporal networks. Phys. Rev. X 5, 021005 (2015)
C.L. Vestergaard, M. Génois, A. Barrat, How memory generates heterogeneous dynamics in tem-
poral networks. Phys. Rev. E 90, 042805 (2014)
K.S. Xu, A.O. Hero III, Dynamic stochastic blockmodels: statistical models for time-evolving networks, in Social Computing, Behavioral-Cultural Modeling and Prediction, ed. by A.M. Greenberg, W.G. Kennedy, N.D. Bos, no. 7812 in Lecture Notes in Computer Science (Springer, Berlin, Heidelberg, 2013), pp. 201–210
X. Zhang, C. Moore, M.E.J. Newman, Random graph models for dynamic networks. Eur. Phys. J.
B 90, 200 (2017)
Chapter 5
Visualisation of Structure and Processes
on Temporal Networks

Claudio D. G. Linhares, Jean R. Ponciano, Jose Gustavo S. Paiva,


Bruno A. N. Travençolo, and Luis E. C. Rocha

Abstract The temporal dimension increases the complexity of network models but also provides more detailed information about the sequence of connections between nodes, allowing a finer mapping of processes taking place on the network. The visualisation of such evolving structures thus permits faster identification of non-trivial activity patterns and provides insights about the mechanisms driving the dynamics on and of networks. In this chapter, we introduce key concepts and discuss visualisation methods of temporal networks based on 2D layouts where nodes correspond to horizontal lines, with circles representing active nodes and vertical edges connecting those active nodes at given times. We introduce and discuss algorithms
to re-arrange nodes and edges to reduce visual clutter, layouts to highlight node and
edge activity, and visualise dynamic processes on temporal networks. We illustrate
the methods using real-world temporal network data of face-to-face human contacts
and simulated random walk and infection dynamics.

Keywords Temporal networks · visualization

C. D. G. Linhares · J. R. Ponciano · J. G. S. Paiva · B. A. N. Travençolo


Faculty of Computing, Federal University of Uberlândia, Av. João Naves de Ávila, 2121,
38400-902 Uberlândia, Brazil
e-mail: claudiodgl@gmail.com
J. R. Ponciano
e-mail: jeanrobertop@gmail.com
J. G. S. Paiva
e-mail: gustavo@ufu.br
B. A. N. Travençolo
e-mail: travencolo@gmail.com
L. E. C. Rocha (B)
Department of Economics, Ghent University, Sint-Pietersplein 5, 9000 Ghent, Belgium
e-mail: luis.rocha@ugent.be
Department of Physics and Astronomy, Ghent University, Sint-Pietersplein 5, 9000 Ghent,
Belgium

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023


P. Holme and J. Saramäki (eds.), Temporal Network Theory, Computational Social
Sciences, https://doi.org/10.1007/978-3-031-30399-9_5

5.1 Introduction

Networks have been extensively used to model the connections and inter-
dependencies between the parts of a system by using nodes and edges. Net-
work models and measures are used to reproduce and identify patterns, or extract
information, from the structure of connections in a multitude of disciplines, ranging
from the social sciences to technologies, passing through biology, medicine, eco-
nomics, business and engineering (Newman 2010; da Fontoura Costa et al. 2011;
Holme and Saramäki 2012). Network visualisation complements the network sci-
ence toolbox by providing means to translate mathematical abstraction to visual
patterns. It helps to understand the network structure, the network evolution, and
processes taking place on the network (Card et al. 1999). The challenge however is
to make visualisations that are informative and insightful.1 Since Ramon Llull first
introduced his ideas of connecting concepts to derive consequences by lines back
in the 13th century (Sales 2011), we have been coping well with small and ordered
networks (or graphs) such as those on road maps, companies’ flight route maps, the
London tube (arguably) or simple lattices (Fig. 5.1a). However, visual information
scales with network size and density, and sociograms used to provide an intuitive
graphic representation of social relations (Moreno 1934; Lima 2011) can quickly
turn into ridiculograms that are a bunch of crossing lines resembling a cat’s hairball2
(Fig. 5.1b). This effect is called visual clutter (i.e. excessive information or items in
an image) in information visualisation (Ellis and Dix 2007). Several methods exist
to improve network visualisation to avoid or at least minimise the hairball curse and
other issues (Tamassia 2013). Nevertheless, visualisation per se has its own limita-
tions and is at risk of manipulation or unconscious bias. Simply changing the order of
drawing nodes and edges, or making a 3D projection, may highlight or hide relevant
features such as node clustering, the strength of connections, or the distance between
nodes.
The increasing availability of high quality time-evolving network data revealed
that structure may change faster than previously acknowledged. Such findings
encouraged researchers to study more in-depth the temporal evolution of these com-
plex networks. Such networks are characterised by the timings that nodes and edges
are active. Although some networks are rather stable (e.g. road networks) or vary
slowly in time (e.g. flight networks Rocha 2017), a number of social networks are
highly dynamic, with an inflow and outflow of nodes and edges, periods of inac-
tivity and bursts of interaction at small temporal scales with respect to the observa-
tion time (Barabási 2005; Holme and Saramäki 2012; Rocha et al. 2017; Karsai
et al. 2018). The temporal dimension adds an extra layer of complexity in such
already hard-to-visualise patterns. The most rudimentary yet useful approach is to
simply draw a sequence of snapshots containing the network structure at subsequent

1 Art-conscious researchers may also call for aesthetic visualisations.


2 To our knowledge, the oldest available online record suggests that the term ridiculogram was coined by Marc Vidal as early as 2007 (www.cs.unm.edu/~aaron/blog/archives/2007/05/ipam_random_and.htm).

Fig. 5.1 The diagrams correspond to (a) a meaningful sociogram, where some information about the structure can be retrieved, and (b) a ridiculogram, where no meaningful information can be retrieved

times (Gleicher et al. 2011; Beck et al. 2014). This method highlights the structure of
connections at given times and gives insights about the activation of specific edges
(and nodes) over time. It is particularly informative if nodes are (artificially or empir-
ically) fixed in space. However, if layout strategies (Tamassia 2013) are applied at
each time step, patterns become less intuitive. Furthermore, the same issues of static
network drawings, such as node and edge overlaps, are observed. In addition, the
mental map of the historical evolution of the network is lost due to the form of pre-
senting the network. For example, in a movie generated using temporally ordered
network snapshots, the screen updates at each time step and a viewer has to rely on his
or her memory to recall previous network patterns. The use of transparency and three
dimensional visualisation have been suggested to improve such methods (Sazama
2015). This snapshot method is perhaps most appropriate to study networks with weak temporal activity, e.g. when structural changes occur at a slower pace or when edges and nodes are persistently active, but it may also help to identify parts of the network that are more active at certain times (e.g. flashing broadcast) irrespective of
the speed of structural changes.
Not only the quality but also the size of network data has increased. Some strate-
gies have been proposed to handle larger evolving networks and include reducing
their complexity by pre-processing the networks before visualisation. For example,
one can observe the movement of nodes between network communities through allu-
vial diagrams where groups of nodes are clustered in communities, represented by temporally ordered blocks of varying sizes connected by stream fields representing the nodes moving between communities over time (Rosvall and Bergstrom 2010). A number of variations of the alluvial diagrams have been used to visualise evolving
communities (Rossetti and Cazabet 2018). Another methodological attempt focuses
on visualising the evolution of the adjacency matrix of the network. The so-called
matrix cube method draws a 2D grid (i.e. the adjacency matrix) where occupied cells
represent an active edge between two nodes and then stacks up subsequent chess-
boards in the third dimension to show the evolving activity as blocks (i.e. cubes,

instead of 2D occupied cells) (Bach et al. 2014). The shortcoming of this (juxtaposed) snapshot method is that it becomes hard to visualise the historical evolution of the network, since internal blocks are frequently hidden in sparse networks. An intuitive approach to
overcome this limitation is to create a 2D visualisation where all active nodes and
edges are shown simultaneously with different nodes in different rows (as horizontal
lines) and temporal edges (as vertical lines) connecting these nodes in the respective
time steps that they are active. Although information about the connectivity is partly
lost due to overlap of edges, it is then possible to get an overview of the node activity
in time (Jerding and Stasko 1998; van den Elzen et al. 2014).
In this chapter, we first define the temporal networks (Sect. 5.2) and then introduce
key concepts of information visualisation in the context of temporal networks with
particular emphasis on algorithms to reduce edge overlap and methods to highlight
the node activity over time (Sect. 5.3). These methods are then applied to real-world
temporal networks to identify non-trivial activity patterns and are also used to visu-
alise trajectories of random walkers and transmission paths of infection dynamics
taking place on these evolving networks (Sect. 5.4). Key visual and computational
scalability issues are discussed in Sect. 5.5 and conclusions and future research direc-
tions in Sect. 5.6.

5.2 Temporal Networks

This section introduces the concept of temporal networks, including the definition
of essential parameters and measures that will be used throughout the chapter. A
temporal network with N nodes and E edges is defined by a set of nodes i connected
by edges active at times t, i.e. (i, j, t) (Holme and Saramäki 2012; Masuda and
Lambiotte 2016). Since t is discrete, an edge activation at time t means that the
activation occurred in the interval [t, t + δ) where δ is the temporal resolution. Each
of these intervals is called a snapshot, or time step τ, and ϒ = T/δ is the total number of
time steps.3 To simplify, here we remove multiple edges and self-edges per snapshot.
Multiple edges could be summed and assigned as weight to a single edge at each
snapshot. The times t = 0 and t = T indicate respectively the start and end of the
observation period. The adjacency matrix is dynamic with ai j (t) = 1 if there is an
edge between i and j at time t, and ai j (t) = 0 otherwise. The set of nodes j connected
to i forms the neighbourhood of i. The degree of node i at time t is given by ki (t),
the accumulated degree is ki and the strength is si :

3 The snapshot τ coincides with time t if δ = 1.




$$k_i(t) = \sum_{j=0}^{N} a_{ij}(t) \qquad (5.1)$$

$$k_i = \sum_{j=0}^{N} a_{ij}, \quad \text{where } a_{ij} = \bigcup_{t=0}^{T} a_{ij}(t) \qquad (5.2)$$

$$s_i = \sum_{t=0}^{T} \sum_{j=0}^{N} a_{ij}(t) \qquad (5.3)$$
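As a concrete illustration of Eqs. 5.1–5.3, the short sketch below (an illustrative example, not code from the chapter) computes the instantaneous degree $k_i(t)$, the accumulated degree $k_i$ and the strength $s_i$ directly from a list of contacts $(i, j, t)$, assuming undirected contacts with multiple and self edges already removed.

```python
# Sketch of Eqs. 5.1-5.3 computed from a contact list (i, j, t); contacts are
# assumed undirected and already free of multiple and self edges per snapshot.
from collections import defaultdict

def temporal_measures(contacts):
    k_t = defaultdict(int)          # k_i(t): degree of node i at time t  (Eq. 5.1)
    strength = defaultdict(int)     # s_i: total number of contacts       (Eq. 5.3)
    neighbours = defaultdict(set)   # union over snapshots for k_i        (Eq. 5.2)
    for i, j, t in contacts:
        k_t[(i, t)] += 1
        k_t[(j, t)] += 1
        strength[i] += 1
        strength[j] += 1
        neighbours[i].add(j)
        neighbours[j].add(i)
    k_acc = {i: len(nbrs) for i, nbrs in neighbours.items()}
    return k_t, k_acc, strength

# Example with a small contact list (the same events as shown later in Fig. 5.2)
contacts = [("A", "B", 0), ("C", "D", 0), ("A", "C", 1), ("A", "B", 2), ("B", "D", 2)]
k_t, k_acc, s = temporal_measures(contacts)
print(k_t[("B", 2)], k_acc["A"], s["A"])   # -> 2 2 3
```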

5.3 Visualisation on and of Temporal Networks

This section introduces the concept of layout and the use of Gestalt theory in the
context of visualisation of temporal networks. It also discusses the main issue of
visual clutter on network visualisation and directions to quantify and reduce this
recurrent problem, including node re-ordering techniques, removal of edges and
colouring of nodes.

5.3.1 Layouts

The layout is a fundamental visualisation concept that defines the way in which the
parts of something are arranged in the visual space. In the case of a network, the
parts would be nodes and edges. An ordered list of contacts is a common way of
representing temporal network data (Fig. 5.2a). Though convenient and optimal to
store information and to serve as input for statistical analysis, this tabular format is unhelpful if
one wishes to visually identify temporal patterns for qualitative analysis. A number
of more appropriate layouts, such as space-time cubes (Bach Mar 2016), circular
methods (van den Elzen et al. 2014), structural (the classic node-edge diagrams, see

Time 0 1 2
Node 1 Node 2 Time
A
Nodes

A B 0 A A
C D 0 [1] D
[0,2] B B
A C 1
[0]
A B 2 C C
B D 2 B
[2] C
D D
(a) (b) (c)
Fig. 5.2 Methods for representation of temporal networks: (a) tabular or ordered list; (b) node-edge diagram, the standard layout for static networks, with time stamps; (c) temporal or sequence view layout

Fig. 5.2b) and sequence views (Linhares et al. 2017; Battista et al. 1994) (Fig. 5.2c),
have been designed to facilitate visual analysis of temporal network data. There is
no consensus on the best layout because each one has advantages and disadvantages
on particular tasks. For example, matrix-based techniques are more suitable than
node-edge diagrams for low-level tasks such as the estimation of the network den-
sity (Behrisch et al. 2016). The structural and temporal layouts on the other hand are
typically used to analyse the distribution of connections.
The structural layout is the conventional network representation in which the
nodes (called instances in the discipline of information visualisation) are spatially
placed on the layout with edges connecting them (Fig. 5.2b). Edges may contain a list
of numbers representing the times they are active. This layout is recommended to get
a global picture of the network since it facilitates identifying multiple structures at
the same time. If the network has several nodes and edges, visual clutter may affect
the perception of patterns. Visual clutter refers to excessive items or information
close together due to edge and node overlap (Ellis and Dix 2007) (see Sect. 5.3.2). In
network visualisation, it indicates excessive overlap of edges and nodes on the layout.
To reduce clutter, a number of strategies can be employed to change nodes’ positions.
Popular methods include force-based and circular algorithms (Six and Tollis March
2006; Battista et al. 1994; Mi et al. 2016). Such techniques are generally hopeless for
temporal networks even if combined with animated graphs because it is difficult to
maintain the mental map representation using the structural layout with networks that
change in time (Archambault and Purchase 2016). In this layout, it is recommended
to use nodes with fixed positions.
One effective layout for temporal networks places different nodes on non-
overlapping horizontal lines with vertical lines (i.e. edges) linking different nodes.
Nodes only appear as circles (or squares) at times they are active. The temporal
evolution of the network structure follows the x-axis (Fig. 5.2c). In this case, the
positions of nodes are fixed on space and are time-invariant to maintain the mental
map necessary to identify relevant patterns. Originally introduced by Jerding and
Stasko (1998), this method is sometimes called massive sequence view (MSV) (Cor-
nelissen et al. 2007). In this temporal layout, the edge length varies according to the
distance between the two nodes in the layout (compare for example nodes A and B,
and B and D in Fig. 5.2c). The varying edge length may mislead the interpretation
of the visualisation since simple edge overlap frequently suggests regions with high
connectivity where in fact only a few edges may be overlapping.
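A bare-bones version of this layout can be produced with a few lines of plotting code. The sketch below (illustrative only, assuming matplotlib is available) draws one horizontal line per node, a marker at every node activation and a vertical segment for every temporal edge; any node ordering, including the re-ordering strategies of Sect. 5.3.2, can be passed in.

```python
# Minimal sketch of the temporal (sequence view) layout: horizontal node lines,
# markers on active nodes and vertical segments for temporal edges (illustrative).
import matplotlib.pyplot as plt

def plot_sequence_view(contacts, node_order):
    y = {node: pos for pos, node in enumerate(node_order)}
    t_max = max(t for _, _, t in contacts)
    fig, ax = plt.subplots(figsize=(8, 3))
    for pos in y.values():                                   # node lines
        ax.hlines(pos, 0, t_max, color="lightgray", lw=0.5)
    for i, j, t in contacts:                                 # temporal edges
        ax.vlines(t, min(y[i], y[j]), max(y[i], y[j]), color="gray", alpha=0.6)
        ax.plot([t, t], [y[i], y[j]], "o", color="black", ms=4)  # active nodes
    ax.set_yticks(range(len(node_order)))
    ax.set_yticklabels(node_order)
    ax.set_xlabel("Time")
    ax.set_ylabel("Nodes")
    plt.show()

plot_sequence_view([("A", "B", 0), ("C", "D", 0), ("A", "C", 1),
                    ("A", "B", 2), ("B", "D", 2)], ["A", "B", "C", "D"])
```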
The Gestalt theory from psychology attempts to explain the human perception
of the world and is a fundamental tool for optimal visual design and consequently
information visualisation (Ware 2013). The most relevant principles of Gestalt theory
that apply to the temporal layout are proximity, closure and similarity (Fig. 5.3). The
proximity principle suggests that things that are close to one another lead us to
perceive them as grouped together (Ware 2013), as illustrated by nodes A and C in
the static layout of Fig. 5.3a and the gray blocks in the temporal layout of Fig. 5.3b.
However, the closure principle states that “a closed contour tends to be seen as
an object” (Ware 2013). The same nodes A and C partially overlap but the brain
visually separates them as distinct objects because of the closure of the contours

Fig. 5.3 Gestalt principles applied to structural and temporal complex network layouts: (a) structural layout with node overlap and multiple node colours; (b) temporal layout with multiple colours, positions and lengths of edges

(Fig. 5.3a). Similarly, the spatially close gray lines in the temporal layout are seen as
separated blocks distinguishable due to white lines (Fig. 5.3b). Complementarily, the
similarity principle states that “similar elements tend to be grouped together” (Ware
2013). Even though nodes B and E are spatially distant, the human brain is able to
distinguish them from the other nodes due to the common colour (Fig. 5.3a). For the
same reason, the vertical lines in the temporal layout are also segregated by colour
and easily distinguishable (Fig. 5.3b).
Redundant coding is typically used to improve the visualisation (Ware 2013).
In the structural layout, for example, it can be implemented by combining the edge
thickness with a gray-scale such that thicker lines indicate more connections between
a pair of nodes and darker gray indicates higher edge overlap. Colours can also be used
to highlight node features such as the network structure (e.g. degree) or node labels
(e.g. political parties Mucha et al. 2010). In the temporal layout, the colour gradient
technique is used to visualise the relationships among nodes within the same group
(if nodes are labelled) or within the same network community, and to indicate the
level of edge overlap at each time step. Furthermore, the state of nodes (e.g. occupied
or not) or trajectories in the network can be highlighted through the use of different
colours in the respective nodes and edges.

5.3.2 Visual Clutter

The main limitation of the temporal (and also of the structural) layout is the often
excessive overlap of edges that creates visual clutter. Visual clutter is the exces-
sive amount of information within a small area of an image. As pointed out in the
previous section, visual clutter due to edge similarity and edge overlap limits any
meaningful visualisation of information and visual analysis. This section describes
algorithms to highlight visual information taking into account Gestalt principles to
reduce clutter. There are 3 major classes of methods in the temporal layout: i. node

re-ordering (Fig. 5.4b), ii. smart sampling or filtering (partial removal) of edges or
nodes (Fig. 5.4c), and iii. complete removal of edges (Fig. 5.4d).
The baseline reference strategy is to place nodes in rows uniformly chosen at
random. To highlight the order of appearance, nodes are sorted (e.g. from top to
bottom) according to the timings of first connection (Fig. 5.4a). Similarly, nodes can
be sorted according to the last time they are observed. These ordering strategies help
to identify the birth, death and lifetime of nodes and edges (Holme and Liljeros
2014), and to identify if node and edge activities are concentrated in time or spread
all over the observation period (Linhares et al. 2017). Another useful strategy is
to sort nodes according to labels or values (called lexicographic ordering van den
Elzen et al. 2014; Linhares et al. 2017). Labels may come from meta-data, e.g. same
ward patients, classmates, age or income bands, or from the network structure, e.g.
accumulated degree/strength (Sect. 5.2) (van den Elzen et al. 2014), centrality or
community structure (Linhares et al. 2019). The main advantage of such approach is
to cluster nodes with similar features to facilitate cross-comparisons. For example,
two nodes with similar (cumulative) degree may have different activity patterns, e.g.
one node may concentrate activity within short periods (relatively lower persistence)
whereas another may have a more uniform distributed pattern of connections (higher
persistence). These strategies explore activity patterns for ordering without taking
into account visual clutter.
Advanced strategies aim to reduce edge overlap to maximise the visual structural
information. One strategy is to place nodes that are frequently in contact, or whose
connections are recurrent in time, spatially close in the layout. This is named recurrent
neighbours strategy (RN). In the RN strategy, the node i with highest strength si is
initially placed in the centre and then the most connected neighbour j to i is placed
right above i followed by placing the second most connected neighbour k to i right
below i and so on until all neighbours of i are added on the layout. The next node j
with highest strength is now selected and the same routine is repeated. The algorithm
proceeds until all nodes are positioned on the layout. This iterative process keeps the
highest density of edges in the central part of the layout,4 minimising the average
length of edges and consequently the visual clutter (Linhares et al. 2017). In this
solution, least important (i.e. least frequent) edges end up with relatively longer
lengths that may be interpreted as more important. An inverted recurrent neighbour
solution could solve this issue but would then add substantial edge overlap, which should be avoided. An alternative hierarchical strategy (HS) for node ordering, aiming to minimise both edge block overlap and average edge length, has also been proposed in the literature (van den Elzen et al. 2014). In this case, the algorithm first uses a simulated annealing optimisation process to find the shortest combination of edge lengths, based on the standard deviation of the lengths, and then minimises the block overlapping, i.e. the groups of overlapping edges (vertical lines) (van den
Elzen et al. 2014). Finally, edge overlap can be also minimised by filtering, i.e. by
removing or hiding selected edges. Naive strategies include random removal of a

4Somewhat similar to the force-directed graph drawing algorithm for the structural layout (Tamassia
2013), except that in our case nodes’ positions are fixed.

Fig. 5.4 Strategies to reduce visual clutter in the temporal layout: (a) original temporal layout; (b) node re-ordering; (c) sampling edges; (d) temporal activity map with all edges removed. The darker square indicates that node B has a higher degree (i.e. $k_B(2) = 2$) than the other nodes and time steps (i.e. $k_i(t) = 1$ for $i \neq B$ or $t \neq 2$)

fraction of edges to depopulate the layout yet keeping some structure (Fig. 5.4c) or
simply reducing the temporal resolution δ to collapse edges (Rocha et al. 2017). In
fact, small variations of the temporal resolution do not substantially affect the network structure (Ribeiro et al. 2013; Rocha et al. 2017) but can substantially decrease visual clutter. Another strategy consists in creating a sample of edges that maintains the edge
distribution over time (Zhao et al. 2018).
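The recurrent neighbours (RN) ordering described above can be prototyped in a few lines. The sketch below is a simplified reading of the strategy (not the implementation of Linhares et al. 2017): the strongest node is placed first and its neighbours are inserted alternately just above and just below it in decreasing order of contact frequency, after which the procedure is repeated for the next strongest unplaced node.

```python
# Simplified sketch of the recurrent neighbours (RN) node ordering
# (illustrative reading of the strategy, not the original implementation).
from collections import Counter, defaultdict

def rn_ordering(contacts):
    strength = Counter()
    pair_weight = defaultdict(Counter)      # pair_weight[i][j]: contacts between i and j
    for i, j, _ in contacts:
        strength[i] += 1
        strength[j] += 1
        pair_weight[i][j] += 1
        pair_weight[j][i] += 1
    order, placed = [], set()
    for node, _ in strength.most_common():  # nodes by decreasing strength
        if node in placed:
            continue
        order.append(node)                  # unplaced hubs are appended at the end
        placed.add(node)
        above = True
        for nbr, _ in pair_weight[node].most_common():
            if nbr in placed:
                continue
            pos = order.index(node)
            order.insert(pos if above else pos + 1, nbr)   # alternate above/below
            placed.add(nbr)
            above = not above
    return order

print(rn_ordering([("A", "B", 0), ("C", "D", 0), ("A", "C", 1),
                   ("A", "B", 2), ("B", "D", 2)]))          # e.g. ['B', 'A', 'C', 'D']
```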
Since edges bring little information in the temporal layout, they can sometimes be
completely removed to decrease the density of elements in a layout called temporal
activity map (TAM)5 (Linhares et al. 2017). To improve the visualisation, nodes are
drawn using squares instead of circles to provide a sense of continuity (Ware 2013)
and colours are used to show structural or dynamic information, as for example the degree $k_i(t)$ (Fig. 5.4d). The network structure is then used in the re-ordering algorithms, such as those discussed above, or to assign values to nodes (e.g. network
centrality) presented by colour gradients. This layout emphasises node activity since
high activity is readily identified through the frequency of appearances (inactive
nodes are not shown in the respective time steps). Activity can be further highlighted
by assigning colours to nodes according to their level of connectivity (e.g. their

5 This method is similar to heatmap grids (Wilke 2019).



degree ki (t) or strength si ). This method and variations are particularly suitable to
visualise communities in temporal networks (Rossetti and Cazabet 2018).
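A temporal activity map of this kind is straightforward to render once the node/time activity matrix is built. The sketch below (illustrative only, assuming numpy and matplotlib are available) colours each node/time cell by the degree $k_i(\tau)$ and leaves inactive cells blank.

```python
# Minimal sketch of a temporal activity map (TAM): cells coloured by k_i(tau),
# inactive node/time cells masked out (illustrative example).
import numpy as np
import matplotlib.pyplot as plt

def plot_tam(contacts, node_order, n_steps):
    row = {node: r for r, node in enumerate(node_order)}
    activity = np.zeros((len(node_order), n_steps))
    for i, j, t in contacts:
        activity[row[i], t] += 1              # degree of i at time t
        activity[row[j], t] += 1
    masked = np.ma.masked_equal(activity, 0)  # hide inactive cells
    fig, ax = plt.subplots(figsize=(8, 3))
    ax.imshow(masked, aspect="auto", cmap="Greys", interpolation="nearest")
    ax.set_xlabel("Time step")
    ax.set_yticks(range(len(node_order)))
    ax.set_yticklabels(node_order)
    plt.show()

plot_tam([("A", "B", 0), ("C", "D", 0), ("A", "C", 1), ("A", "B", 2),
          ("B", "D", 2)], ["A", "B", "C", "D"], n_steps=3)
```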

5.3.3 Estimating Clutter on Temporal Networks

To quantify the performance of algorithms employed to reduce visual clutter in


temporal networks, one can measure different image features like the number of
overlapping edges, the edge length, and the number of edge intersections for one re-
ordering algorithm and compare to the random case, i.e. when nodes are randomly
placed in rows. The most intuitive measure is to count the number of overlapping edges $\theta(\tau)$ in each time step $\tau$ and take the average $\langle\theta\rangle$ over all $\Upsilon$ time steps.6 If two edges overlap more than once in the same time step, only one overlap is counted. The maximum overlap per time step is thus equal to the number of edges at that time. This measure does not take into account that edges have different lengths and that longer edges populate the layout more than shorter ones. To capture this feature, we define the length of edge $(i, j)$ as $l_{ij} = n + 1$, where $n$ is the number of nodes in-between the connected nodes $i$ and $j$, and estimate the average edge length $\langle l \rangle$ and the standard deviation of edge lengths $\sigma_l$. Nevertheless, the edge length does not indicate whether a region of the image is visually dense. The average number of intersections $\langle\gamma\rangle$ is then used to count the number of times $\gamma(\tau)$ that two edges cross each other at each time step $\tau$ (in each horizontal line):

$$\langle\theta\rangle = \frac{1}{\Upsilon}\sum_{\tau=0}^{\Upsilon} \theta(\tau) \qquad (5.4)$$

$$\langle l \rangle = \frac{1}{E}\sum_{(i,j)} l_{ij} \qquad (5.5)$$

$$\sigma_l = \sqrt{\frac{1}{E}\sum_{(i,j)} \left(l_{ij} - \langle l \rangle\right)^2} \qquad (5.6)$$

$$\langle\gamma\rangle = \frac{1}{\Upsilon}\sum_{\tau=0}^{\Upsilon} \gamma(\tau) \qquad (5.7)$$

Two edges with several intersections result in more visual clutter, and thus larger and denser networks are expected to have a higher number of intersections and consequently higher visual clutter (Fig. 5.5).
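The sketch below gives one concrete, simplified reading of these measures (not the code used for Table 5.1): $\theta(\tau)$ counts the edges that share at least one row segment with another edge in the same snapshot, and $\gamma(\tau)$ counts, row by row, how many pairs of edges pass through the same horizontal line.

```python
# Sketch of the clutter measures of Eqs. 5.4-5.7 for a given node ordering
# (one simplified reading of the overlap and intersection counts).
from itertools import combinations
from math import sqrt

def clutter_measures(contacts, node_order, n_steps):
    pos = {node: p for p, node in enumerate(node_order)}
    lengths = []
    spans = [[] for _ in range(n_steps)]            # edge spans per snapshot
    for i, j, t in contacts:
        lo, hi = sorted((pos[i], pos[j]))
        lengths.append(hi - lo)                     # l_ij = nodes in between + 1
        spans[t].append((lo, hi))
    theta, gamma = [0] * n_steps, [0] * n_steps
    for t, edges in enumerate(spans):
        overlapping = set()
        for a, b in combinations(range(len(edges)), 2):
            (lo1, hi1), (lo2, hi2) = edges[a], edges[b]
            shared = min(hi1, hi2) - max(lo1, lo2)  # shared row segments
            if shared > 0:
                overlapping.update((a, b))          # each edge counted once
                gamma[t] += shared                  # crossings per horizontal line
        theta[t] = len(overlapping)
    l_mean = sum(lengths) / len(lengths)
    sigma_l = sqrt(sum((l - l_mean) ** 2 for l in lengths) / len(lengths))
    return sum(theta) / n_steps, l_mean, sigma_l, sum(gamma) / n_steps
```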

6We use the notation of snapshots τ rather than time t to emphasise that measures take into account
snapshots, see Sect. 5.2 for definitions.

Fig. 5.5 Measuring the performance of re-ordering algorithms. In this sample temporal network, the number of snapshots is $\Upsilon = 5$. The average overlap is $\langle\theta\rangle = (0 + 2 + 0 + 0 + 4)/5 = 1.2$. The average edge length and standard deviation are respectively $\langle l \rangle = 18/10 = 1.8$ and $\sigma_l = 0.75$. The average number of intersections is $\langle\gamma\rangle = (0 + 2 + 0 + 0 + 7)/5 = 1.8$

5.4 Visual Insights

The visualisation techniques discussed in the previous sections can be applied to


real-world network data to support visual analysis for insights about the evolution of
the network structure and to identify patterns to guide further quantitative statistical
analysis. The visualisation of the temporal network is particularly helpful when
combined with the visualisation of node activity and dynamic processes taking place
on the network. This section shows a few case studies to illustrate the visualisation
techniques based on three layouts containing (i) both nodes and edges (focus on
structure), (ii) nodes attributes (focus on temporal activity), and (iii) nodes states
linked to dynamic processes (focus on processes).

5.4.1 Network Data

Two different network data sets of social contacts will be used to illustrate the applica-
tion of the methods. They correspond to face-to-face spatial proximity (within 1.5 m)
interactions between two people wearing RFID badges and include typical real-world
temporal structures. In both cases, contacts are scanned every 20 seconds (that is the
maximum temporal resolution δ). The first network data set contains E = 188,508 contacts between N = 327 students, collected during 5 days (between the 2nd and 6th of December 2013) in a high school in Marseille, France (Mastrandrea et al. 2015). The second network data set contains E = 6,980 contacts between N = 72 people visiting the Science Gallery in Dublin, Ireland, during 1 day in 2009 (Isella et al. 2011).

Fig. 5.6 Temporal layout for (a) random and (b) recurrent neighbour ordering of nodes for the museum data set. Periods of high, medium and low activity, and of inactivity, are indicated. The re-ordering of nodes removes noise and helps to emphasise periods with different levels of activity over time. The temporal resolution is δ = 1 min

5.4.2 Temporal Structure

The temporal or sequence view layout is a projection of the network on the plane
where nodes are located in fixed horizontal lines and then linked by vertical edges
at certain times. Therefore, edges at each time (i.e. at each temporal snapshot) necessarily overlap. Due to spatial constraints, a few active nodes per time step are already sufficient to produce a layout with high clutter due to edge overlap. Edges between spatially distant nodes are also costly in such a layout since they occupy much space. The colour intensity correlates with the level of edge over-
lap such that lighter edges indicate relatively fewer connections at that time step
in comparison to darker edges. For example, in the museum network, for random
node location, the first part of the observation window shows dense information
and suggests relatively higher activity followed by some periods of medium activity
(Fig. 5.6a). An effective strategy to reduce information while keeping all edges on
the layout is to simultaneously reduce overlap and the length of recurring edges.
This is done by putting together nodes that interact often (i.e. recurrent neighbour—
RN—strategy, see Sect. 5.3.2). The RN algorithm highlights the frequent interaction
by positioning the most active nodes in the centre of the layout whereas less active
nodes stay peripheral (Fig. 5.6b). It produces a cleaner layout that facilitates the iden-
tification of clusters of temporal activity. Since the longer the edge, the less frequent
is the contact, one can readily see that activity in earlier times is less intense and
involve less nodes than suggested by the random ordering. It is also visible that only
a few pairs of nodes have intense activity (i.e. repeated connections) over time (e.g.
before the period of high activity, there is a pair of nodes between the bottom and the
mid-part of the layout, and after the period of high activity, there is a pair of nodes
at the mid-part of the layout with recurrent contacts, Fig. 5.6b).

Table 5.1 Statistical measures of visual clutter for the museum and high-school data sets using different temporal resolutions δ. We measure the performance of the RN re-ordering algorithm in comparison to the random case, $\Delta = 100\,(x_{RN} - x_{Random})/x_{Random}$, where $x_{RN}$ represents one measure for RN and $x_{Random}$ the same measure for the random placement of nodes, averaged over 10 realisations with ± indicating the standard deviation. The values of Δ are rounded

Museum data set
Measure | δ = 20 s: Random / RN / Δ(%) | δ = 1 min: Random / RN / Δ(%) | δ = 5 min: Random / RN / Δ(%)
⟨θ⟩ | 5.04 ± 0.03 / 4.43 / −12 | 10.83 ± 0.04 / 10.33 / −5 | 29.99 ± 0.03 / 29.73 / −1
⟨l⟩ | 24 ± 1 / 10 / −57 | 23 ± 2 / 13 / −45 | 25 ± 1 / 14 / −43
σ_l | 17 ± 1 / 14 / −14 | 17 ± 1 / 16 / −5 | 18 ± 1 / 16 / −9
⟨γ⟩ | 147.04 ± 10.09 / 48.49 / −67 | 629.55 ± 74.21 / 355.40 / −44 | 5639.75 ± 543.05 / 3005.26 / −47

High-school data set
Measure | δ = 20 s: Random / RN / Δ(%) | δ = 3 min: Random / RN / Δ(%) | δ = 10 min: Random / RN / Δ(%)
⟨θ⟩ | 0.998 ± 0.002 / 0.92 / −8 | 2.86 ± 0 / 2.82 / −1 | 56.97 ± 0 / 56.84 / −0.2
⟨l⟩ | 109 ± 5 / 33 / −69 | 108 ± 3 / 38 / −65 | 110 ± 2 / 33 / −70
σ_l | 77 ± 3 / 64 / −17 | 76 ± 2 / 71 / −7 | 77 ± 2 / 60 / −22
⟨γ⟩ | 356.4 ± 24.3 / 47.0 / −87 | 3381.8 ± 168.7 / 640.8 / −81 | 151320 ± 3395 / 19784 / −87

Reducing visual clutter (and thus layout information) is an essential step in net-
work visualisation. To quantify the improvement of the RN algorithm, we apply mea-
sures of visual clutter (Sect. 5.3.3) in both data sets (museum and high-school) for
various temporal resolutions δ (Table 5.1). In some cases, RN provides an improve-
ment of up to 87% (and frequently above 40%) for both data sets. In the museum data,
performance tends to decrease for lower temporal resolutions, whereas in the school network performance is generally maintained for most measures for various δ. This is likely a consequence of the persistence of edge activity in the high school case, where all students arrive and leave at the same time and subsequent contacts are common (a lower resolution simply collapses similar edges), whereas museum visitors, at least in this particular case, tend to come in groups, moving around (subsequent contacts are not so common and collapsing subsequent edges generates different visual patterns)
and spending roughly the same time in the museum but starting at different times.

5.4.3 Temporal Activity

Another strategy to reduce information due to edge overlap is to completely remove


the edges and focus on the nodes. In the temporal activity map strategy, the position
of nodes can be defined by their connectivity through node re-ordering algorithms
(e.g. degree or centrality) but edges are removed before showing the network. This
layout reveals several relevant patterns during the network evolution, as for example,
periods of activity versus inactivity. The main difference to the previous layout is
that node activity is highlighted here. Figure 5.7 shows the interactions between stu-

dents (high-school data set) during 3 subsequent days. Sharp transitions are observed
when students start (or go home after) their studies with no interactions before or
after classes. Interactions after school time could indicate strong friendship ties exist-
ing beyond the school environment.7 Defining the start/end times is not straightfor-
ward in real networks because one cannot easily identify when the first (or last)
contact happened (this is called censoring in statistics Miller 1997). Visualisation
may improve by using the re-ordering strategy based on the first appearance of the
edge (Sect. 5.3.2). Figure 5.7 also shows clusters of high activity, capturing the fact
that students are divided into various classrooms and thus interactions mostly occur
among classmates at certain times. Nevertheless, a relatively higher level of social interaction is observed a few hours after the start of activity on each day, likely related to the mandatory morning break for playing and socialising. In day 2
and day 3, the visual analysis suggests that a group of students (not the same in
each day) is missing in afternoon sessions, possibly due to self-study time or other
activities in which the badges were temporarily removed. A gradient colour scale,
linking temporal connectivity (ki (τ )) to node colour, helps to identify highly active
nodes. Figure 5.8 shows for example that during a period of high connectivity in the
museum (several active nodes), a few nodes interacted relatively more than others
(see gradients of gray colour), even though less active nodes at those times were
active at several other times (their activity is spread over time).

5.4.4 Dynamic Processes

A useful application of temporal visualisation is the analysis of dynamic processes


on networks (Masuda and Lambiotte 2016). For example, the evolution of a diffusion
process can be monitored by colouring nodes according to their dynamic state or by
colouring edges to highlight specific paths. Two fundamental dynamic processes of
theoretical and practical interest are the random walk and the infection (information)
spread dynamics. In the random walk model, a node i can be in one out of two
states at each time step τ , i.e. occupied φi (τ ) = O or empty φi (τ ) = E. All network
nodes but one randomly chosen node i 0 start empty at time t0 . At each τ , the walker
decides whether to move via an existing active edge to a neighbour with probability
(1 − p) or to remain in the current node with probability p. If there are no active
neighbours at τ , the walker simply remains in the node (Starnini et al. 2012; Rocha
and Masuda 2014). The process unfolds until t f = T . In the infection dynamics, a
node i can be in one out of three states at each time step τ , susceptible φi (τ ) = S,
infected φi (τ ) = I , or recovered φi (τ ) = R. In this case, all nodes start susceptible
and one random node is chosen to be initially infected i 0 at time t0 (patient zero or
seed). In case of active neighbours at time t, an infected node i infects a neighbour
j with probability β. An infected node remains infected for $t_r$ time steps and

7 In this particular face-to-face experiment, badges were not allowed outside the school.

[In Fig. 5.7, annotations mark the morning break, a group of missing students, day and night periods, and the 2nd, 3rd and 4th days.]


Fig. 5.7 Temporal activity map for high-school students over 3 days using the RN re-ordering strategy. A
coloured square indicates that the respective student had a social contact at that time (with the colour
gradient indicating the degree ki (τ )) whereas a white square indicates absence of contact with any
other students at the respective time. For all three days, contacts are typically more concentrated in
the mornings and no contacts occur at nights. The temporal resolution is δ = 3 min


Fig. 5.8 Temporal activity map for museum visitors during 1 day using appearance re-ordering
strategy. A coloured square indicates that the respective visitor had a social contact at that time
(with the colour gradient indicating the degree ki (τ )) whereas a white square indicates absence of
contact with any other visitors at the respective time. The red bar in the x-axis indicates the period of
most network activity (most nodes are interacting). The red bar in the y-axis indicates 2 very active
nodes that have relatively less intense activity during the period of high network activity (lighter
gray). The temporal resolution is δ = 2 min

Fig. 5.9 Random walk trajectories for various stay probabilities (a) p = 1, (b) p = 0.75, (c) p = 0.5 and (d) p = 0, for the museum contact network with δ = 2 min. The x-axis represents the temporal
dimension. The colour red indicates the nodes occupied by the walker and the edges used to move
between nodes at given times. Light gray indicates unoccupied nodes and white indicates absence
of activity. Nodes are sorted by order of appearance. The seed is the same for all cases

then recovers. Recovered nodes cannot be re-infected or turn back to susceptible


state (Barrat et al. 2008; Rocha and Blondel 2013).
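The two processes just described are simple to simulate once the contact list is grouped into snapshots. The sketch below (a simplified illustration, not the simulation code behind Figs. 5.9–5.11) implements a random walker that stays put with probability p and an SIR process with infection probability β and a fixed infectious period $t_r$.

```python
# Simplified sketches of a random walk and an SIR process on a temporal
# network given as a contact list (i, j, t); illustrative only.
import random
from collections import defaultdict

def snapshots(contacts, n_steps):
    snap = defaultdict(list)
    for i, j, t in contacts:
        snap[t].append((i, j))
    return [snap[t] for t in range(n_steps)]

def random_walk(contacts, n_steps, seed, p=0.5):
    position, trajectory = seed, [seed]
    for edges in snapshots(contacts, n_steps):
        nbrs = [j for i, j in edges if i == position] + \
               [i for i, j in edges if j == position]
        if nbrs and random.random() > p:          # hop with probability 1 - p
            position = random.choice(nbrs)
        trajectory.append(position)
    return trajectory

def sir(contacts, n_steps, seed, beta=0.5, t_r=20):
    state = defaultdict(lambda: "S")
    state[seed] = "I"
    infected_since = {seed: 0}
    n_infected = []
    for t, edges in enumerate(snapshots(contacts, n_steps)):
        for i, j in edges:                        # transmission along active edges
            for u, v in ((i, j), (j, i)):
                if state[u] == "I" and state[v] == "S" and random.random() < beta:
                    state[v] = "I"
                    infected_since[v] = t
        for node, t0 in list(infected_since.items()):
            if t - t0 >= t_r:                     # recover after t_r time steps
                state[node] = "R"
                del infected_since[node]
        n_infected.append(sum(1 for s in state.values() if s == "I"))
    return n_infected
```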
Figure 5.9 shows various trajectories of a random walker starting from the same
node (seed i 0 ) but using various probabilities p of staying in the node. In the trivial
case, p = 1 and the walker simply remains in the initial node i 0 indefinitely. The
occupancy of this node by the walker is thus seen until the last activation of the hosting node, which happens well before the end of the observation time T (Fig. 5.9a). However, for p ≠ 1, the walker hops between nodes through active edges and a richer diffusion dynamics unfolds in time (Fig. 5.9b–d). Note that p = 0 implies that the walker hops as soon as an active edge becomes available, and less hopping (and simpler trajectories) is observed for larger p, as expected. In these particular realizations, the walker is able to reach longer times for p ≠ 1 in comparison to p = 1, and ends up in nodes that entered the network later (compare the last appearance of the walker for different p in Fig. 5.9). For any p ≠ 1, the walker remains trapped between
two nodes for relatively long periods, an unlikely scenario in static networks where
walkers explore larger regions of the network (Starnini et al. 2012). Since nodes
are ordered by time of first activation, one can also identify potential correlations
between lifespan and frequency of walker visits. Similar visualisation ideas could
be applied to trace higher-order random walks on temporal networks (Scholtes et al.
2014).
In more elaborate dynamic processes, such as the infection dynamics, a visualisation may focus on the state of nodes, on transmission paths, or on both, though the latter likely causes cognitive overload (Huang et al. 2009). The TAM layout emphasises the node

Fig. 5.10 Infection dynamics for (a) β = 0 and tr = 20, (b) β = 1 and tr = ∞, (c) β = 0.01 and tr = 10, (d) β = 1 and tr = 10, (e) β = 0.01 and tr = 20, (f) β = 1 and tr = 20, (g) β = 0.01 and tr = 100 and (h) β = 1 and tr = 100, for the museum contact network with δ = 2 min. Node colours indicate susceptible, infected and recovered states. Nodes are sorted by order of appearance. The infection seed is the same for all cases

state and is particularly helpful to identify the timings of infection events and how
groups of same-state nodes evolve. Figure 5.10 shows the evolution of the state of all
network nodes (for the museum data set) in the SIR infection dynamics for various
values of β and tr . In the trivial case (β = 0), the infection seed becomes active one
more time before turning to the recovery (yellow) state (Fig. 5.10a). In the worst case
scenario (β = 1 and tr = ∞, Fig. 5.10b), some nodes remain susceptible for a while
after joining the network; however, everyone eventually becomes infected in this par-
ticular network configuration. Overall, this layout facilitates a global understanding
of the impact of certain parameters in the dynamics. One may explore variations in
the infection period tr when β is fixed (Fig. 5.10c, e, g or Fig. 5.10d, f, h), or fix tr and

Fig. 5.11 Infection dynamics for (a) β = 0.01 and tr = 10, (b) β = 1 and tr = 10, (c) β = 0.01 and tr = 20, (d) β = 1 and tr = 20, for the museum contact network with δ = 2 min. Nodes are sorted by order of appearance. The infection seed is the same for all cases, and the same as for Fig. 5.10

study the infection probability β (Fig. 5.10c, d; Fig. 5.10e, f; or Fig. 5.10g, h). For
example, a small increase in β or tr generated a temporal cluster of active infected
nodes (see bottom arrows in Fig. 5.10e, f). In the case of β = 1, one node (see top
arrows in Fig. 5.10d, f) was active and susceptible before the epidemic outbreak and
remained active and susceptible after the end of the outbreak.
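The SIR dynamics shown in Figs. 5.10 and 5.11 can be reproduced in outline with a short simulation over the time-stamped contact list. The sketch below is a minimal Python illustration, assuming undirected contacts (i, j, t), a transmission probability β per contact, and a fixed infectious period tr; it is not the code used to generate the figures.

import random

def sir_on_contacts(contacts, seed_node, beta, t_r, rng_seed=1):
    # SIR over time-stamped undirected contacts (i, j, t): each contact between
    # an infected and a susceptible node transmits with probability beta;
    # infected nodes recover t_r time units after their infection time.
    rng = random.Random(rng_seed)
    infected_at = {seed_node: 0}
    recovered_at = {}
    for i, j, t in sorted(contacts, key=lambda c: c[2]):
        for node in (i, j):                       # recoveries due before time t
            if node in infected_at and node not in recovered_at \
                    and t - infected_at[node] >= t_r:
                recovered_at[node] = infected_at[node] + t_r
        for src, dst in ((i, j), (j, i)):         # undirected: try both directions
            if (src in infected_at and src not in recovered_at
                    and dst not in infected_at and rng.random() < beta):
                infected_at[dst] = t
    return infected_at, recovered_at

# beta = 1 with t_r = float("inf") corresponds to the worst-case scenario of Fig. 5.10b.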
A limitation of the previous layout is that information on the transmission paths,
i.e. the edges through which infection events occurred, is unavailable. Those edges
can be included to create a layout of transmission trees, i.e. the sequence of nodes
and edges in which the infection (or information) propagates from the seed, with
transparency used on non-infected nodes to reduce information load yet allowing a
global view of the nodes’ states (see Fig. 5.11 and compare with Fig. 5.10c–f using
the same parameters). This new layout helps to identify who infected whom and the
timings of these infection events. It is particularly useful to visualise the importance
of certain nodes and edges in regulating the infection spread, for example, by visualising
the transmission trees before and after vaccination, i.e. the removal of those nodes or
edges. The intensity of edge colour is also used to identify edge overlap, but since
infection events are much rarer than contacts in a given time step,
overlap (and thus visual clutter) is less of a problem than if all active edges were
shown. The layout can be further optimised by ordering nodes according to timings of
infection, with earlier infected nodes positioned on the central part of the layout and
those infected later at more peripheral positions. This is similar to the RN algorithm
(see Sect. 5.3.2), but using the timings of infection events to order the nodes.

5.5 Visual and Computational Limitations

In the temporal layout, node re-ordering methods optimise the distribution of edges,
reducing edge overlap (visual clutter) and improving the visualisation. Neverthe-
less, two-dimensional spatial constraints also create lengthy edges crossing several
in-between nodes and obscuring relevant structural information. If activity is rela-
tively high (e.g. a couple of edges active at the same time step) and the network is
large (some hundreds of nodes), re-ordering techniques may be insufficient to pro-
vide meaningful visualisation (Keim 2001). Alternative solutions under development
include identifying and removing specific edges, e.g. those edges between or within
network communities. Moving to 3D layouts (with 2 dimensions for space and 1
dimension for time) may also improve visualisation by disentangling overlapping
edges, at the cost of displaying more information at once. The temporal activity map
solution is more scalable with network size and density since it removes the issue
of edge overlap. Combined with node re-ordering techniques to improve the place-
ment of nodes, this solution may highlight relevant activity and structural patterns
with thousands of nodes and any edge density (without showing the edges). The
main advantage is the possibility to embed information about the network structure,
dynamic processes or meta-data in each node through a colouring scheme with low
computational cost.
A crucial step when studying temporal networks is the choice of the observation
period (T ) and the temporal resolution (δ). Both parameters affect the visualisation
since the viewer is effectively looking at a static network on screen. The obser-
vation period is in principle less critical since one can zoom in/out or move the
network around but since some re-ordering algorithms are based on the cumulative
network measures, this period may affect the location of nodes (e.g. longer periods
may imply on more edges and thus higher strength Rocha et al. 2017). On the other
hand, variations in the temporal resolution may substantially change the network
structure (Ribeiro et al. 2013; Rocha et al. 2017) and thus the information (net-
work structure) being viewed. Although node re-ordering algorithms work for any
resolution and easily accommodate all these cases, the viewer has to keep in mind
the potential limitations or variations in the structure and activity when performing
qualitative visual network analysis.
Since the visualisation per se is static (though it can be interactive in software, see
below), the computational bottleneck lies in the algorithms that calculate node re-ordering,
edge removal, or the dynamic process, which are run in a pre-processing stage. There-
fore, computational scalability depends more on the specific choice of algorithms
than on the visualisation stage. For example, the recurrent neighbours strategy is
faster for small-to-medium size networks than the optimised MSV because the former
is deterministic while the latter runs over several configurations (simulated anneal-
ing) to find the optimal solution (van den Elzen et al. 2014). For larger networks
(with thousands of nodes), both methods require intensive calculations and naive
solutions (e.g. appearance or lexicographic ordering) may work relatively fast. Such
computational limitations may also hamper the applicability of these methods to the
online visualisation of real-time data.
A free software tool developed in Java™ implements all methods discussed in this
chapter and is available online (www.dynetvis.com). It is multiplatform and runs over
the JGraphX library (available at https://github.com/jgraph/jgraphx; no separate
installation is needed to run DyNetVis). The DyNetVis system allows the user to perform
interactive visual analysis of temporal networks using either the structural (node-edge
diagram) or the temporal layout, and supports node re-ordering, changes in edge and
node colours, and other practical functions via interaction tools.

5.6 Conclusion

Network visualisation provides qualitative visual insights about the network structure
to support identifying non-trivial connectivity patterns and developing new statis-
tical measures. The visualisation of dynamic processes unfolding on the network
also helps to trace trajectories, transmission paths and the evolution of the states
of nodes. The visualisation of temporal networks is particularly helpful since pat-
terns of node and edge activity are typically highly irregular in time. Various visual
layouts have been proposed to view temporal networks but all have limitations due
to their multiple degrees of freedom. The main challenge is that visual information
scales with network size and edge density causing visual clutter due to edge overlap,
node proximity and fine temporal resolution. In this chapter we explored a 2D layout
where active nodes appear as circles or squares in horizontal lines and vertical lines
represent active edges at specific time steps. To avoid edge overlap and to highlight
connectivity patterns, colour gradients and node re-ordering strategies were imple-
mented. An effective strategy called recurrent neighbours places highly connected
nodes more centrally in the 2D layout and the less active nodes in peripheral loca-
tions. Another strategy consists on complete removal of edges and visualisation of
node properties (e.g. node activity, structure or states) using colours and gradients on
nodes. This layout named temporal activity map is useful to identify activity patterns
such as temporal clusters of activity and periods of inactivity. The temporal activ-
ity map was also used to simultaneously visualise temporal networks and dynamic
processes. This method revealed non-trivial random walker trajectories (e.g. being
trapped between two nodes over time), allowed mapping of infection transmission
paths and the identification of timings and directions of infection events. The tem-
poral layouts discussed in this chapter are naturally limited to a few thousand nodes,
hence alternative scalable strategies are necessary to handle Big network data. Nev-
ertheless, these solutions have a range of applications on mid-size network data such
as those frequently used in social systems (friendship, proximity contacts, opinion
dynamics), economics (inter-bank loan networks, transportation, cascade failures),
business (enterprise partnerships, corporate board directors), public health (contact
tracing, impact of vaccination/intervention, epidemics), or biology (neuronal activity,
signaling), to name a few possibilities. Some ideas developed in this chapter may
also help to visualise multiplex networks that can be seen as temporal networks con-
taining a few snapshots. Future research efforts, however, aim to improve methods to
filter edges to highlight particular temporal structures, to improve the visual analysis
of larger network data sets (scalability issue) and to handle streaming networks, in
which the distribution of incoming edges is unknown. The analysis of such cases
may require innovative multidimensional layouts involving grouping of nodes and
the complete removal of edges.

References

D. Archambault, H.C. Purchase, Can animation support the visualisation of dynamic graphs? Inf.
Sci. 330, 495–509 (2016)
B. Bach, Unfolding dynamic networks for visual exploration. IEEE Comput. Graph. Appl. 36,
74–82 (2016)
B. Bach, E. Pietriga, J.-D. Fekete, Visualizing dynamic networks with matrixcubes, in Proceedings
of the 2014 Annual Conference on Human Factors in Computing Systems (CHI2014) (ACM,
2014), pp. 877–886
A.-L. Barabási, The origin of bursts and heavy tails in human dynamics. Nature 435, 207–211
(2005)
A. Barrat, M. Barthélemy, A. Vespignani, Dynamical Processes on Complex Networks (Cambridge
University Press, 2008)
G.D. Battista, P. Eades, R. Tamassia, I.G. Tollis, Algorithms for drawing graphs: an annotated
bibliography. Comput. Geom. 4(5), 235–282 (1994)
F. Beck, M. Burch, S. Diehl, D. Weiskopf, The state of the art in visualizing dynamic graphs, in
Eurographics Conference on Visualization (EuroVis) (2014)
M. Behrisch, B. Bach, N. Henry Riche, T. Schreck, J.-D. Fekete, Matrix reordering methods for
table and network visualization, in Computer Graphics Forum, vol. 35 (Wiley Online Library,
2016), pp. 693–716
S. Card, J. Mackinlay, B. Shneiderman, Readings in Information Visualization: Using Vision to
Think (Morgan Kaufmann, 1999)
B. Cornelissen, D. Holten, A. Zaidman, L. Moonen, J.J. van Wijk, A. van Deursen, Understanding
execution traces using massive sequence and circular bundle views, in ICPC (IEEE Computer
Society, 2007), pp. 49–58
L. da Fontoura Costa, O.N.O. Jr., G. Travieso, F.A. Rodrigues, P.R.V. Boas, L. Antiqueira, M.P.
Viana, L.E.C. Rocha, Analyzing and modeling real-world phenomena with complex networks: a
survey of applications. Adv. Phys. 60(3), 329–412 (2011)
G. Ellis, A. Dix, A taxonomy of clutter reduction for information visualisation. IEEE Trans. Vis.
Comput. Graph. 13(6), 1216–1223 (2007)
M. Gleicher, D. Albers, R. Walker, I. Jusufi, C.D. Hansen, J.C. Roberts, Visual comparison for
information visualization. Inf. Vis. 10(4), 289–309 (2011)
P. Holme, J. Saramäki, Temporal networks. Phys. Rep. 519, 97–125 (2012)
P. Holme, F. Liljeros, Birth and death of links control disease spreading in empirical contact networks.
Sci. Rep. 4, 4999 (2014)
W. Huang, P. Eades, S.-H. Hong, Measuring effectiveness of graph visualizations: A cognitive
load perspective. Inf. Vis. 8(3), 139–152 (2009)
L. Isella, J. Stehlé, A. Barrat, C. Cattuto, J.-F. Pinton, W.V. den Broeck, What’s in a crowd? analysis
of face-to-face behavioral networks. J. Theor. Biol. 271, 166–180 (2011)

D.F. Jerding, J.T. Stasko, The information mural: a technique for displaying and navigating large
information spaces. IEEE Trans. Vis. Comput. Graph. 4(3), 257–271 (1998)
M. Karsai, H.-H. Jo, K. Kaski, Bursty Human Dynamics (Springer, 2018)
D. Keim, Visual exploration of large data sets. Commun. ACM 44(8), 38–44 (2001)
M. Lima, Visual Complexity: Mapping Patterns of Information (Princeton Architectural Press, 2011)
C.D.G. Linhares, J.R. Ponciano, F.S.F. Pereira, L.E.C. Rocha, J.G.S. Paiva, B.A.N. Travençolo, A
scalable node ordering strategy based on community structure for enhanced temporal network
visualization, unpublished (2019)
C.D.G. Linhares, B.A.N. Travençolo, J.G.S. Paiva, L.E.C. Rocha, DyNetVis: a system for visual-
ization of dynamic networks, in Proceedings of the Symposium on Applied Computing, SAC ’17,
(Marrakech, Morocco) (ACM, 2017), pp. 187–194
R. Mastrandrea, J. Fournet, A. Barrat, Contact patterns in a high school: a comparison between
data collected using wearable sensors, contact diaries and friendship surveys. PLOS ONE 10(9)
(2015)
N. Masuda, R. Lambiotte, A Guide to Temporal Networks (World Scientific, 2016)
P. Mi, M. Sun, M. Masiane, Y. Cao, C. North, Interactive graph layout of a million nodes. Informatics
3, 23 (2016)
R.G. Miller, Survival Analysis (Wiley, 1997)
J.L. Moreno, Who Shall Survive? A New Approach to The Problem of Human Interrelations (Beacon
House Lima, 1934)
P.J. Mucha, T. Richardson, K. Macon, M.A. Porter, J.-P. Onnela, Community structure in time-
dependent, multiscale, and multiplex networks. Science 328(5980), 876–878 (2010)
M. Newman, Networks: An Introduction (OUP Oxford, 2010)
B. Ribeiro, N. Perra, A. Baronchelli, Quantifying the effect of temporal resolution on time-varying
networks. Sci. Rep. 3, 3006 (2013)
L.E.C. Rocha, Dynamics of air transport networks: a review from a complex systems perspective.
Chin. J. Aeronaut. 30, 469–478 (2017)
L.E.C. Rocha, V.D. Blondel, Bursts of vertex activation and epidemics in evolving networks. PLOS
Comput. Biol. 9, e1002974 (2013)
L.E.C. Rocha, N. Masuda, Random walk centrality for temporal networks. New J. Phys. 16, 063023
(2014)
L.E.C. Rocha, N. Masuda, P. Holme, Sampling of temporal networks: methods and biases. Phys.
Rev. E 96(5), 052302 (2017)
G. Rossetti, R. Cazabet, Community discovery in dynamic networks: A survey. ACM Comput.
Surv. 51(2), 35 (2018)
M. Rosvall, C.T. Bergstrom, Mapping change in large networks. PLoS ONE 5(1), e8694 (2010)
T. Sales, Llull as computer scientist, or why llull was one of us, in Ramon Llull: From the Ars
Magna to artificial intelligence, ed. by C. Sierra, A. Fidora, Chap. 2 (Artificial Intelligence
Research Institute, Barcelona, Spain, 2011), pp. 25–38
P.J. Sazama, An overview of visualizing dynamic graphs, Unpublished (2015)
I. Scholtes, N. Wider, R. Pfitzner, A. Garas, C.J. Tessone, F. Schweitzer, Causality-driven slow-
down and speed-up of diffusion in non-markovian temporal networks. Nat. Commun. 5, 5024
(2014)
J.M. Six, I.G. Tollis, A framework and algorithms for circular drawings of graphs. J. Discret. Alg.
4, 25–50 (2006)
M. Starnini, A. Baronchelli, A. Barrat, R. Pastor-Satorras, Random walks on temporal networks.
Phys. Rev. E 85, 056115 (2012)
R. Tamassia, Handbook of Graph Drawing and Visualization (Chapman and Hall/CRC, 2013)
S. van den Elzen, D. Holten, J. Blaas, J.J. van Wijk, Dynamic network visualization with extended
massive sequence views. IEEE Trans. Vis. Comput. Graph. 20(8), 1087–1099 (2014)
C. Ware, Information Visualization: Perception for Design, vol. 3 (Morgan Kaufmann Publishers
Inc, 2013)

C. Wilke, Fundamentals of Data Visualization: A Primer on Making Informative and Compelling Figures (O'Reilly, 2019)
Y. Zhao, Y. She, W. Chen, Y. Lu, J. Xia, W. Chen, J. Liu, F. Zhou, EOD edge sampling for visualizing
dynamic network via massive sequence view. IEEE Access 6, 53006–53018 (2018)
Chapter 6
Weighted Temporal Event Graphs
and Temporal-Network Connectivity

Jari Saramäki, Arash Badie-Modiri, Abbas K. Rizi, Mikko Kivelä, and Márton Karsai

Abstract Correlations between the times of events in a temporal network carry infor-
mation on the function of the network and constrain how dynamical processes taking
place on the network can unfold. Various techniques for extracting information from
correlated event times have been developed, from the analysis of time-respecting
paths to temporal motif statistics. In this chapter, we discuss a recently-introduced,
general framework that maps the connectivity structure contained in a temporal net-
work’s event sequence onto static, weighted graphs. This mapping retains all infor-
mation on time-respecting paths and the time differences between their events. The
weighted temporal event graph framework builds on directed, acyclic graphs (DAGs)
that contain a superposition of all temporal paths of the network. We introduce the
reader to the mapping from temporal networks onto DAGs and the associated compu-
tational methods and illustrate the power of this framework by applying it to temporal
motifs and to temporal-network percolation.

Keywords Temporal networks · Event graphs · Network connectivity

J. Saramäki (B) · A. Badie-Modiri · A. K. Rizi · M. Kivelä


Department of Computer Science, Aalto University School of Science, P.O. Box 15400, 00076
Espoo, Finland
e-mail: jari.saramaki@aalto.fi
A. Badie-Modiri
e-mail: arash.badie-modiri@aalto.fi
A. K. Rizi
e-mail: abbas.k.rizi@aalto.fi
M. Kivelä
e-mail: mikko.kivela@aalto.fi
M. Karsai
Department of Network and Data Science, Central European University, A-1100 Vienna, Austria
e-mail: karsaim@ceu.edu

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023
P. Holme and J. Saramäki (eds.), Temporal Network Theory, Computational Social
Sciences, https://doi.org/10.1007/978-3-031-30399-9_6

6.1 Introduction

There are two key reasons behind the success of the temporal networks frame-
work (Holme and Saramäki 2012; Holme 2015). They both have to do with the rich
additional information brought by the knowledge of the specific times of interactions
between nodes. First, the times of interaction events and their correlations contain
detailed information about the dynamics of the entities that form the network. Con-
sider, for example, studies in computational social science that build on data on human
communication: the time stamps of communication events carry far more informa-
tion on human behaviour than any static network mapping would (see, e.g., Karsai
et al. (2011); Jo et al. (2012); Miritello et al. (2013); Kovanen et al. (2013b); Ale-
davood et al. (2015); Navarro et al. (2017)). Second, the times of events and their
correlations can strongly influence dynamical processes taking place on a network.
The effects of time-domain heterogeneity can under certain conditions be strong
enough to render the static-network picture invalid for some dynamical processes, as
is the case for spreading dynamics in networks with bursty event sequences (Karsai
et al. 2011; Iribarren and Moro 2009; Horváth and Kertész 2014). Therefore, more
often than not, the times of interactions simply must be taken into account if one
wants to obtain an accurate understanding of the dynamics of processes that unfold
on temporal networks.
For both of the above goals—extracting information from the network itself and
understanding how the network affects the dynamic processes it supports—new kinds
of mathematical and computational tools are required. While some static-network
concepts can be extended to temporal networks, the additional degree of freedom
provided by the temporal dimension makes things more complicated. Developing
useful measures for temporal networks often requires approaches that are very dif-
ferent from the static case, and such measures may be less than straightforward to
define. Consider, e.g., the shortest paths between nodes: in a static, unweighted net-
work, the only attribute of a shortest path—if it exists—is its length, and this length
is readily discovered by a breadth-first search. In temporal networks, when consid-
ering shortest paths, one has to first define what “short” means—does it mean the
fastest path, or the one with the smallest number of events, or maybe the one corre-
sponding to the shortest static-network path? Then, additionally, one has to choose
the time frame that one is interested in, as paths are ever changing entities that are
only brought into existence by their constituent, temporary events: even if there is
a path now, there may be none a second later. And then, finally, one has to devise a
computational method applicable to empirical data that extracts the shortest temporal
paths in a reasonable amount of time.
Even if static-network concepts are not directly applicable, it would be convenient
to repurpose computational and theoretical methods developed for static networks for
temporal-network analysis, as there is an abundance of such methods. This would be
made possible, e.g., by casting temporal networks as static entities so that the proper-
ties of those static entities would correspond to the properties of the original temporal
networks (though not necessarily in a one-to-one manner). Generally speaking, this is

not straightforward, although some approaches have been introduced in the literature
that typically focus on some chosen subset of the properties of temporal networks
(see, e.g., Nicosia et al. (2012)).
In this chapter, we discuss an approach that maps temporal-network structure onto
a weighted static event graph that is directed and acyclic and whose weights encode
the time differences between events (Kivelä et al. 2018; Badie-Modiri et al. 2020,
2022a, b). This mapping is done in a way that preserves the time-respecting paths
of the original network. Temporal-network event graphs are analogous to the line
graph representation of static networks (Mellor 2017), where the nodes of the line
graph represent links of the original network. In the event graphs discussed here,
nodes represent the events of the original temporal network, while directed links that
follow the direction of time connect events that share a node in the temporal network.
The link weights of these directed links indicate the time difference between the two
events that the link connects. As a concrete example, if in a temporal network of
mobile telephone calls person A calls B who then calls C, the weighted event graph
has two corresponding nodes (the AB call and the BC call), so that the AB node is
connected to the BC node with a directed link whose weight is the waiting time from
the AB call to the BC call.
The key strength of this approach is that it encodes temporal information
as a static network structure. This static network allows fast and straightforward
extraction of temporal-network structures that are constrained by the time differences
Δt between successive events, from temporal motifs to time-respecting paths whose
events have to follow one another within some time limit and to temporal components
defined by connectivity through such time-respecting paths. Additionally, one can
use known static-graph-based methods and find these structures in a way that is
computationally extremely efficient as compared to brute-force methods applied to
the original temporal network (Badie-Modiri et al. 2020). For example, one can use
the method developed for percolation studies (Newman and Ziff 2001) where one
performs sweeps of activating one connection at a time, in this case in the order of
increasing time difference Δt. This method is computationally much more efficient
than brute-force breadth-first-search approaches, which have been used in temporal-
network studies. Such approaches were also used in conventional percolation studies
before more efficient methods were discovered (Leath 1976).
Being able to quickly obtain time-respecting paths and temporal-network compo-
nents by sweeping through a range of constraints is particularly useful for studying
spreading and transportation processes, where the agent has to be transmitted from
a node within some time limit Δt or the process stops at the node. Such processes
include typical variants of the common models of contagion, such as the Susceptible-
Infectious-Recovered (SIR) and Susceptible-Infectious-Susceptible (SIS) models,
where the recovery/infectiousness time is assumed to be constant or has a clear
upper bound. Other types of dynamics include social contagion where information
or rumours have a finite lifetime, ad-hoc message passing by mobile agents that keep
the message only for a limited time, and routing of passengers in transport networks
where both lower and upper limits on the possible or acceptable transfer time may
exist.

Once the weighted event graph has been constructed from the temporal network,
one can quickly extract subnetworks that correspond to chosen values of Δt; we shall
discuss how this is done below. These subgraphs, being static, can then be approached
using static-network methods and algorithms. There can be additional computational
advantages from these subnetworks being directed and acyclic because there are
fast algorithms developed for directed acyclic graphs. Further, as discussed above,
because an event graph encodes all time differences between events in the origi-
nal temporal network, one can quickly sweep through a range of time differences
to see how the maximum (or minimum) difference affects the outcome. For exam-
ple, the size distribution of temporally connected components depends on the time
differences between events.
This chapter is structured as follows. We begin by providing definitions for con-
cepts related to temporal adjacency and connectivity that are required for constructing
the weighted event graphs. We then continue by discussing how temporal networks
can be mapped to weighted event graphs, both in theory and in practice. Follow-
ing this, we discuss how to interpret the structural properties of weighted event
graphs—how their topological features, such as directed paths or weakly connected
components, map back onto features of the original temporal networks. This dis-
cussion is followed by two examples of applications of this framework. First, we
discuss temporal motifs and their analysis. Second, we show how the weighted event
graph approach can be used in temporal-network percolation studies, where it has
been used both theoretically and empirically to show that temporal-network perco-
lation belongs to the directed percolation universality class. Finally, we present our
conclusions and discuss possible future directions.

6.2 Mapping Temporal Networks Onto Weighted Event Graphs

6.2.1 Definitions: Vertices, Events, Temporal Network

Let us consider a temporal network G = (VG , E G , T ), where VG is the set of vertices
and E G ⊂ VG × VG × [0, T ] is the finite set of interaction events between the
vertices with known times, so that the interactions take place within some limited
period of observation [0, T ] that can also be considered to be the lifetime of G. We
denote an interaction event—simply an “event” from now on—between vertices i
and j at time t with e(i, j, t). Please note that in the following, we require that one
node is only allowed to participate in one event at any given point in time.
Note that depending on the context, events may be considered as directed (e.g.,
representing an email from i to j in email data) or undirected (e.g., representing a
face-to-face conversation between two persons). This choice has consequences on
dynamical processes taking place on top of the temporal network: in the case of social
contagion, for example, one email or one text message carries information one way

only, while a face-to-face conversation can carry information both ways. Therefore,
the choice of whether to use directed or undirected events is influenced by the model
or phenomenon that one wants to study.
There are also cases where the events have a non-zero duration that has to be
taken into account in temporal-network studies. Examples include the flights in a
passenger’s route in an air transport network and phone calls in a communication
network—in both cases, a new event (the connecting flight, the next phone call)
cannot begin before the first event is finished. When event duration needs to be taken
into account, events are defined as quadruples, e(i, j, t, τ ), where τ indicates the
duration of the event.
Depending on the type of events in a temporal network, the time difference is
defined in a slightly different way:
Definition 1 (Time difference between events) Given two events e1 and e2 , their time
difference δt (e1 , e2 ) is for events without duration the difference in times δt (e1 , e2 ) =
t2 − t1 , and for events with duration the time from the end of the first event to the
beginning of the second one δt (e1 , e2 ) = t2 − t1 − τ1 .
Note that these two definitions become the same if the events have zero duration.

6.2.2 Definitions: Adjacency and Δt-Adjacency

Our goal is to investigate larger temporal network structures, from mesoscopic to
macroscopic entities, that arise from the topological and temporal correlations of the
network’s constituent events. We begin by defining criteria for events being topologi-
cally and temporally close to one another and then move on to defining larger entities
based on these criteria. The following concepts of temporal adjacency, temporal con-
nectivity, and temporal subgraphs are the building blocks for temporal-network event
graphs as well as their substructures from components to temporal motifs. Moreover,
as we shall see below, the concept of temporal adjacency straightforwardly leads to
the notion of time-respecting paths.
Definition 2 (Temporal adjacency) Two events e1 (i, j, t1 ) and e2 (k, l, t2 ) are tem-
porally adjacent, denoted e1 → e2 , if they share (at least) a node, |{i, j} ∩ {k, l}| > 0
and they are consecutive (but not simultaneous) in time, i.e., δt (e1 , e2 ) > 0.
Definition 3 (Δt-adjacency) Two events e1 and e2 are Δt-adjacent, denoted e1 →Δt e2,
if they are temporally adjacent and the time difference between them is δt (e1 , e2 ) ≤ Δt.
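A minimal Python sketch of Definitions 1–3 follows, assuming instantaneous events stored as (i, j, t) tuples and events with duration as (i, j, t, τ); the helper names are ours and purely illustrative.

def time_difference(e1, e2):
    # delta_t(e1, e2) for events (i, j, t) or (i, j, t, tau), with e1 the earlier event
    t1, tau1 = e1[2], (e1[3] if len(e1) > 3 else 0.0)
    return e2[2] - t1 - tau1

def temporally_adjacent(e1, e2):
    # e1 -> e2: the events share at least one node and e2 strictly follows e1
    share_node = len({e1[0], e1[1]} & {e2[0], e2[1]}) > 0
    return share_node and time_difference(e1, e2) > 0

def dt_adjacent(e1, e2, dt):
    # Delta t-adjacency: temporal adjacency with time difference at most dt
    return temporally_adjacent(e1, e2) and time_difference(e1, e2) <= dt

# A calls B at t = 1, B calls C at t = 3: adjacent, but not 1-adjacent.
e_ab, e_bc = ("a", "b", 1), ("b", "c", 3)
print(temporally_adjacent(e_ab, e_bc), dt_adjacent(e_ab, e_bc, dt=1))  # True False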
Temporal adjacency and Δt-adjacency are always directed regardless of whether
the events themselves are directed or not, and their direction follows the direction of
time, from the event that took place first to the event that took place next. Please note
that here we use a directed definition of adjacency unlike in Kovanen et al. (2011,
2013a, b) for reasons that will become apparent later.

Depending on the problem at hand, one may wish to include additional constraints
in the definition of temporal adjacency. If the original events are directed and this
directionality is important, e.g., because it affects information flows, this can be
directly incorporated into the definition of temporal adjacency, so that e(i, j, t) and
e( j, k, t + 1) are considered adjacent, while e(i, j, t) and e(k, j, t + 1) are not (see
Definition 2). This will also affect time-respecting paths defined according to Defini-
tion 10. It is possible to introduce further constraints, e.g., by ignoring return events
and considering non-backtracking events only, which might be useful for modelling
certain types of spreading dynamics. In this case, the pair e(i, j, t) and e( j, i, t + 1)
should not be considered adjacent.
One can also consider allowing simultaneous interactions of the same node by
introducing hyper-events and an adjacency relationship between them for the defi-
nition of a hyper-event graph. In this extension, events happening at the same time
and sharing nodes may be grouped into a hyper-event, which then represents a set
of simultaneous events with a single object. Two hyper-events taking place at times
t and t′ are then adjacent if they are consecutive (t < t′) and share at least one node
from the set of nodes they involve (Karsai et al. 2019).

6.2.3 Definitions: Temporal Connectivity and Temporal Subgraphs

To study the mesoscopic building blocks of temporal structures, we need to use their
local connectivity patterns for identifying meaningful temporal subgraphs in their
fabric. Following the approach of Kovanen et al. (2011), we’ll use the concept of
temporal adjacency defined above to introduce temporal connectivity and temporal
subgraphs.
Definition 4 (Weak temporal connectivity) Two events ei and e j are temporally
weakly connected, if without considering the directionality of adjacency, there is a
sequence of temporally adjacent events between them.
Definition 5 (Weak Δt-connectivity) Two events ei and e j are weakly Δt-connected
if, without considering the directionality of adjacency, there is a sequence of Δt-
adjacent events between them.
The above definitions of temporal connectivity are weak in the sense that the
directions of adjacency do not matter. Their motivation is to ensure that temporal
subgraphs—as defined below—are connected both topologically and temporally.
Definition 6 (Connected temporal subgraphs) A connected temporal subgraph con-
sists of a set of events where all pairs of events are weakly temporally connected.
Note that in the above definition we have left out the word “weak” as there cannot
be strong connectivity between all pairs of events in a temporal subgraph, because
there cannot be any loops in time.

Definition 7 (Δt-connected temporal subgraphs) A Δt-connected temporal subgraph
consists of a set of events where all pairs of events are weakly Δt-connected.
The subgraph is called valid if no events are skipped when constructing the subgraph;
i.e., for each node’s time span in the subgraph, all Δt-adjacent events that can be
included are included.
Definition 8 (Maximal valid connected subgraphs) A maximal valid connected tem-
poral subgraph is a connected temporal subgraph that contains all events that can be
added to it such that all its event pairs are weakly temporally connected.
Definition 9 (Maximal valid Δt-connected subgraphs) A maximal valid
Δt-connected temporal subgraph is a Δt-connected temporal subgraph that contains
all events that can be added to it such that all its event pairs are weakly Δt-connected.
Note that by definition, maximal valid Δt-connected subgraphs are themselves sub-
graphs of maximal valid temporal subgraphs.

6.2.4 Definitions: Time-Respecting Path and Δt-Constrained Time-Respecting Path

As the final building block for weighted event graphs, we will next focus on temporal-
network paths that define which nodes can reach one another and when. Similarly
to static-network paths that are sequences of nodes joined by edges, the events of
temporal networks form paths in time that connect nodes. For a temporal-network
path to be meaningful, it has to respect the direction of time:
Definition 10 (Time-respecting path) An alternating sequence of nodes and undi-
rected events P = [v1 , e1 (i 1 , j1 , t1 ), . . . , en (i n , jn , tn ), vn+1 ] is a time-respecting
path if the events are consecutive in time and each consecutive pair of events
is temporally adjacent, ek → ek+1 for all k < n, and vk , vk+1 ∈ {i k , jk }, such that
vk ≠ vk+1 . If the events are directed, then additionally each event's target node must
be the source node of the next event on the path, jk = i k+1 . For notational conve-
nience, we can omit the nodes, defining a time-respecting path through its events:
Pe = [e1 (i 1 , j1 , t1 ), . . . , en (i n , jn , tn )].
As an example, for events with zero duration, the sequence of events Pe =
[e1 (i, j, t1 ), e2 ( j, h, t2 ), e3 (h, l, t3 )] is a time-respecting path if t1 < t2 < t3 , in other
words, if δt = tk+1 − tk > 0 ∀k = 1, 2, ek ∈ Pe . The inequality follows from the
requirement that vertices only participate in at most one event at a time. For events
with durations, the next event on the path cannot begin before the first event
is finished: Pe = [e1 (i, j, t1 , τ1 ), e2 ( j, h, t2 , τ2 ), e3 (h, l, t3 , τ3 )] is a time-respecting
path when t1 + τ1 < t2 and t2 + τ2 < t3 . That is, if δt = tk+1 − tk − τk > 0 ∀k =
1, 2, ek ∈ Pe . Note that time-respecting paths are always directed, regardless of
whether the events themselves are directed or not, and the direction of a time-respecting
path follows the arrow of time.

Finally, as a special case of time-respecting paths, we define a subset of them
where the events have to follow one another within some specific time limit Δt.

Definition 11 (Δt-constrained time-respecting path) A time-respecting path is Δt-constrained
if all its consecutive pairs of events are Δt-adjacent, i.e., all consecutive
events follow one another with a time difference of no more than Δt: δt (ek , ek+1 ) ≤
Δt ∀k < n.
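As an illustration of Definitions 10 and 11, the following Python sketch checks whether a sequence of undirected, instantaneous events (i, j, t) forms a (Δt-constrained) time-respecting path; it is a didactic helper under those assumptions, not a general-purpose implementation.

def is_time_respecting_path(events, dt=None):
    # events: list of (i, j, t), a candidate path in temporal order
    if not events:
        return False
    possible = {events[0][0], events[0][1]}   # node the path may occupy after e1
    for e1, e2 in zip(events, events[1:]):
        delta = e2[2] - e1[2]
        if delta <= 0 or (dt is not None and delta > dt):
            return False
        carried = possible & {e2[0], e2[1]}   # the path must enter e2 through a shared node
        if not carried:
            return False
        # after e2 the path continues from the opposite endpoint (v_k != v_{k+1})
        possible = {e2[1] if c == e2[0] else e2[0] for c in carried}
    return True

path = [("i", "j", 1), ("j", "h", 4), ("h", "l", 5)]
print(is_time_respecting_path(path))         # True
print(is_time_respecting_path(path, dt=2))   # False: the first gap is 3 > dt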

6.2.5 The Weighted Event Graph and Its Thresholded and Reduced Versions

Armed with the above definitions, our aim is now to map the original temporal
network onto a static representation that retains information of the time-respecting
paths of the network (Definition 10) as well as the time differences δt between events
on such paths. For a temporal network G = (VG , E G , T ), let A G = {(ei , e j )|ei →
e j ; ei , e j ∈ E G } ⊂ E G × E G be the set of all temporal adjacency relations between
the events E G of G (see Definition 2). We are now ready to define the weighted event
graph.

Definition 12 (Weighted event graph) The weighted event graph of a temporal net-
work G is a weighted graph D = (VD , L D , w), where the nodes VD = E G , links
L D = A G , and the weights of the edges are given by w(ei , e j ) = δt (ei , e j ).

In other words, the weighted event graph D is a directed graph whose vertices
map to the events of G, whose directed links L D map to the adjacency relations
ei → e j between G’s events, and whose link weights W indicate for each adjacency
relation the time difference δt between the two events. For a schematic example of
how D is constructed, see Fig. 6.1.
Because the links of D represent the temporal adjacency relationships of the
original graph G that are directed, D is also directed, with the direction of its links
following the direction of time. Consequently, because there cannot be any loops in
time, D is also acyclic and therefore a DAG (Directed, Acyclic Graph). This provides
certain computational advantages.
Note that in this mapping, isolated events, that is, single events connecting pairs
of nodes that have no other events in G, become isolated zero-degree nodes of the
event graph D, and it may be convenient to entirely disregard such nodes.
A key strength of the weighted event graph approach is that the event graph D can
be quickly thresholded so that the resulting graph DΔt only contains directed links
between events that follow one another within a time Δt in the original temporal
network G. Formally, DΔt is defined as follows: let G = (VG , E G , T ) be a temporal
network and A G,Δt ⊂ E G × E G the set of all Δt-adjacency relations between its
events E G . We can now define the thresholded event graph DΔt .

Fig. 6.1 Constructing the weighted event graph D. Panel a shows the timeline representation of
the original temporal network G with vertices v1 , v2 , v3 , and v4 , and events e1 . . . e5 . Panel b shows
the weighted event graph D that corresponds to G. Panel c displays the thresholded version DΔt
with Δt = 1. In this case, the thresholded event graph is also identical with the reduced event graph

Definition 13 (Thresholded event graph DΔt ) The thresholded event graph DΔt of G
is the graph DΔt = (VDΔt , L DΔt , w) with vertices VDΔt , directed links L DΔt , and link
weights W DΔt , so that VDΔt = E G , L DΔt = A G,Δt , and w(ei , e j ) = δt (ei , e j ) ≤ Δt.

In other words, DΔt 's nodes are again the events of G, its directed links are Δt-
adjacency relations between the events of G, and its link weights are the time differ-
ences δt between Δt-adjacent events where by definition δt ≤ Δt. Therefore, DΔt
is a subgraph of D that only contains links between events that follow one another
within δt ≤ Δt.
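Operationally, thresholding is a simple filter. The short Python sketch below assumes the event graph is stored as a list of weighted directed links (a, b, w), where a and b index events of G and w = δt; the representation is illustrative only.

def threshold_event_graph(links, dt):
    # D_dt: keep only event-graph links whose time difference w does not exceed dt
    return [(a, b, w) for a, b, w in links if w <= dt]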
If one is only interested in reachability in the original temporal graph, one may
apply transitive reduction to the event graph, so that its number of directed links is
reduced while the overall connectivity remains—if event ei is connected to event e j
through several directed paths in D, after the reduction only one of them remains.
The reduced event graph is formally defined as follows (Mellor 2017; Badie-Modiri
et al. 2022a, b):

Definition 14 (Reduced event graph D̂) The reduced event graph D̂ is a subgraph of
D that only contains a reduced set of adjacency relationships, denoted here with ⇒ and
defined as follows: let e1 ∩ e2 ⊆ VG denote the set of nodes of G that events e1 and
e2 share, so that |e1 ∩ e2 | ∈ {0, 1, 2}. Now let |ei ∩ e j | > 0. Then ei (ti ) ⇒ e j (t j ) iff
t j = min tk ∀ek (tk ) where tk > ti , |ei ∩ e j ∩ ek | > 0.

In other words, for the event ei , one only takes into account the next events in which
its endpoint nodes in G participate. Therefore, the maximum number of outgoing
adjacency relationships is 2; this number is 1 if the next event shares both vertices
with ei . Hence the maximum out-degree of D̂ is also 2.
The above definition can be directly applied to a thresholded event graph DΔt to
yield D̂Δt .

6.2.6 Computational Considerations

The weighted event graph presentation D of an empirical temporal network G can be
constructed computationally by iterating through the timeline of events of each of its
nodes separately. To save memory, we recommend setting a maximum allowed value
for the time difference between two events, Δtmax , above which two events will not
be connected in D. More often than not, the problem at hand yields a natural time
scale for this. For example, when studying processes of contagion it is not necessary
to connect events whose time difference indicates that there is a vanishingly low
probability of transmission because the shared node almost certainly recovers before
the second event. However, if memory consumption is not a problem, one can use
the entire available time range and set Δtmax = T , where T is the largest time in the
data set.
As the temporal adjacency of two events requires that the events share at least one
temporal network node, it is convenient to compute the adjacencies around each node
separately. For instantaneous events (that is, events that have no duration), one can
construct a time-ordered sequence of events containing node i: {ei1 , ei2 , . . . , eik }.
Then, it is straightforward to iterate over this sequence: begin at each event eil and
scan forward until the event ein where the time limit Δtmax is met, that is, tin − til >
Δtmax . While scanning, connect each intermediate event eim with the focal event
with the weight wil,im = δt (eil , eim ) = tim − til , unless they are already connected
by a previous sweep, which is possible for repeated events between the same pair of
nodes. Somewhat similar but slightly more complicated algorithms can be used for
temporal networks with events that have a duration or even higher-order events of
more than two nodes.
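A minimal Python sketch of this per-node scan for instantaneous events follows; it returns the event graph as a sorted list of weighted directed links (index of earlier event, index of later event, δt). The data layout and names are illustrative, not a reference implementation.

from collections import defaultdict

def build_event_graph(events, dt_max):
    # events: list of instantaneous events (i, j, t); returns links (a, b, w)
    # of the weighted event graph D with 0 < w <= dt_max.
    indexed = sorted(enumerate(events), key=lambda x: x[1][2])
    timelines = defaultdict(list)              # node -> time-ordered event indices
    for idx, (i, j, t) in indexed:
        timelines[i].append((t, idx))
        timelines[j].append((t, idx))
    links = set()                              # a set removes duplicate links from repeated pairs
    for timeline in timelines.values():
        for pos, (t_a, a) in enumerate(timeline):
            for t_b, b in timeline[pos + 1:]:  # scan forward in time
                w = t_b - t_a
                if w > dt_max:
                    break
                if w > 0:                      # skip simultaneous events
                    links.add((a, b, w))
    return sorted(links)

events = [("v1", "v2", 1), ("v2", "v3", 2), ("v3", "v4", 3), ("v1", "v2", 5)]
print(build_event_graph(events, dt_max=2))     # [(0, 1, 1), (1, 2, 1)]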
Constructing and sorting these sequences of events can be done in O(|E G | log |E G |)
time. Because each step of the algorithms yields one connection in D (note
that some links may be visited twice), the total runtime of the algorithm is
O(|E G | log |E G | + |E D |). However, even though the computation of event graphs is
quite economic, this representation can have significantly higher memory complex-
ity than the original temporal network representation. While a temporal network can
be represented as an event sequence that requires O(|E G |) of memory, event graphs

can occupy considerably more memory than such sequences. In the worst case, their memory
complexity is O(|E G |a), where a is the largest number of events a node participates
in.
While the thresholded event graph DΔt can be constructed directly from G using
Δt-adjacency relations, this is not the fastest approach if one wants to vary Δt.
Rather, it is much faster to first construct D up to the maximum Δtmax and then
threshold it to DΔt by discarding all links with weights above the chosen value of
Δt.
It is also often useful to sweep through a range of values of Δt, as in the percolation
studies discussed later in this chapter. When varying Δt so that all links of D with
weights below Δt are retained, one saves a lot of computational time by the following
procedure: (1) order the links of D in increasing order of weight, (2) begin with an
empty network, (3) add links one by one, (4) after each link addition, mark down
the current threshold value Δt and compute the quantities of interest, such as the
sizes of components in the network. For weakly connected components, one can
easily and quickly keep track of component sizes by initially assigning each node to
its own component and then always checking if the newly entered link merges two
components or not. Using the disjoint-sets forest data structure, one does not even
need to construct the network: it is enough to keep track of the component merging
and size. This procedure is similar to the ones used for analysing connectivity of
static networks in percolation studies (Newman and Ziff 2001; Onnela et al. 2007).
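The sweep can be sketched in a few lines of Python with a disjoint-set forest, assuming the event graph is given as links (a, b, w) over n_events event nodes (for instance as produced by a construction routine such as the one above); this is a schematic illustration of the procedure, not an optimised implementation.

def largest_component_sweep(n_events, links):
    # Add event-graph links (a, b, w) in order of increasing weight and record,
    # after each addition, the size of the largest weakly connected component.
    parent = list(range(n_events))
    size = [1] * n_events

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]      # path halving
            x = parent[x]
        return x

    largest, curve = 1, []
    for a, b, w in sorted(links, key=lambda l: l[2]):
        ra, rb = find(a), find(b)
        if ra != rb:                           # this link merges two components
            if size[ra] < size[rb]:
                ra, rb = rb, ra
            parent[rb] = ra
            size[ra] += size[rb]
            largest = max(largest, size[ra])
        curve.append((w, largest))             # largest component for threshold dt = w
    return curve

Each recorded pair (w, s) gives the number of events in the largest maximal Δt-connected subgraph when the threshold is set to Δt = w.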
However, if one is only interested in connectivity, that is, in knowing whether there
is a path between two events regardless of the rest of the paths, then it is possible to
use the directed and acyclic nature of the event graph D as an advantage when doing
the computations. As the directed connectivity in a DAG is a transitive relationship,
one can always remove edges whose source and target nodes are connected by some
other path without affecting the overall connectivity (weak or strong). Such redundant
edges can be removed, if instead of D one constructs the reduced event graph D̂ of
Definition 14. Computationally, a convenient way to do this is to modify the above-
described algorithm that sweeps through the events of each node of G so that it stops
after the first iteration for each node (Mellor 2017). Note that this works only for
weighted temporal event graphs built with undirected events. Because the maximum
out-degree of D̂ is 2, the time and memory complexity of event graph construction
is dramatically reduced.
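A sketch of this reduced construction for undirected, instantaneous events (i, j, t) is given below in Python: each event is linked only to the next event on the timeline of each of its endpoints, so the out-degree is at most two. Again, the data layout is only illustrative.

from collections import defaultdict

def build_reduced_event_graph(events, dt_max=float("inf")):
    # Reduced event graph: keep, for each event and each of its two endpoint
    # nodes, only the link to that node's next event (within dt_max).
    timelines = defaultdict(list)
    for idx, (i, j, t) in sorted(enumerate(events), key=lambda x: x[1][2]):
        timelines[i].append((t, idx))
        timelines[j].append((t, idx))
    links = set()
    for timeline in timelines.values():
        for (t_a, a), (t_b, b) in zip(timeline, timeline[1:]):
            w = t_b - t_a
            if 0 < w <= dt_max:
                links.add((a, b, w))
    return sorted(links)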
Finally, if one is interested in out-component sizes, that is, the number of unique
temporal network nodes or events reachable from any given event by following the
directed links of DΔt , there is a very fast approximate algorithm that aggregates
out-component sizes for each event by iterating through the nodes of DΔt backward
in time (Badie-Modiri et al. 2020).

6.3 How to Interpret and Use Weighted Event Graphs

6.3.1 How the Basic Features of D and DΔt Map Onto Features of G

Let us begin dissecting the weighted event graphs by mapping out simple correspon-
dences between some features of D and features of G. In the following, for the sake
of simplicity, we shall consider the original temporal network G’s events as undi-
rected and instantaneous. Further, we assume that the weighted event graph D has
been constructed using the whole available time range, that is, with time differences
up to Δtmax = T . By definition, the thresholded version of the event graph DΔt only
contains links between Δt-adjacent events, that is, events with time differences less
than Δt.
First, as already evident, the elements of D map to the elements of G so that the
nodes of D are events in G, the links of D are temporal adjacency relations between
the events of G, and the link weights of D indicate the times between adjacent events
in G. The in- and out-degrees of a node of D indicate the numbers of temporal
adjacency relations between the corresponding event of G and previous/future events
of the two nodes that the event connects: the in-degree of node ei of D (event ei of
G) is the number of events that took place earlier than ei and involved either or both
of the connected nodes. The out-degree is the number of similar future events. For
DΔt , the in- and out-degrees of nodes correspond to the numbers of past and future
events of the event’s endpoint nodes in G within a time Δt. This latter property
could be useful, e.g., for studying temporal threshold models (see, e.g., Karimi and
Holme (2013); Takaguchi et al. (2013); Backlund et al. (2014)) where the process
of contagion is triggered by infection from multiple sources within some short time
range.
Due to D’s construction, a directed path in D is a time-respecting path in G, and
vice versa. If we define (without loss of generality) a vertex path Pv in a graph
as a sequence of vertices joined by edges, then we can formalize this relationship:
Theorem 1 (Path equivalence) A path P is a vertex path in D if and only if P is a
time-respecting event path in G.
Put in another way, if Pv (D) is the set of all vertex paths in the graph D, and Pe (G)
is the set of all time-respecting event paths in G, then Pv (D) = Pe (G).
For DΔt , the corresponding time-respecting path in G is also Δt-constrained and
so the time difference between its consecutive events is always less than Δt (see
Definition 11).
Theorem 2 (Constrained path equivalence) A path P is a vertex path in DΔt if and
only if P is a Δt-constrained time-respecting event path in G.
Now, if in addition we denote by Pe (G, Δt) the set of all Δt-constrained time-
respecting event paths in G, then Pv (DΔt ) = Pe (G, Δt).

Table 6.1 Correspondence between features of the weighted event graph D and the original temporal graph G

Feature in D | Feature in G
Node vD | Event eG
Link lD | Temporally adjacent pair of events e1 → e2
Link weight w | Time difference δt between adjacent events
In-degree kin | # of previous events of the event's endpoint nodes
Out-degree kout | # of future events of the event's endpoint nodes
Directed vertex path Pv | Time-respecting event path Pe
Sum of weights on path P | Duration of time-resp. path P (if events instantaneous)
Set of downstream nodes for vD | Set of events reachable from eG ("future light-cone")
Set of upstream nodes for vD | Set of events that can influence event eG ("past light-cone")
Weakly connected component | Maximal valid temporal subgraph

If the events are instantaneous, then, additionally, the sum of link weights of a
path in D equals the latency or temporal distance of the corresponding path in G,
i.e., its duration. For non-instantaneous events, the total weight of the links of a path
in D equals the total waiting time at the nodes of the corresponding path in G; e.g.,
for a trip through an air transport network, the total weight would be equivalent to
the total layover time at airports.
Moreover, the set of downstream nodes in D reached by following the directed
links of D from its node ei equals the reachable set of event ei in G, in other words,
the set of all events in G that can be reached from ei through time-respecting paths
(its “future light-cone”). This set is also called ei ’s out-component. Likewise, the set
of upstream nodes that can be reached by following D’s links in reverse direction
equals the set of all events in G that can lead to ei through time-respecting paths: the
set of events that may influence ei (“past light-cone”), also called ei ’s in-component.
For DΔt , the sets of upstream/downstream nodes come with the additional constraint
that they must be reachable through Δt-constrained time-respecting paths.
Finally, the weakly connected components of D (more on components later)
correspond by definition to maximal valid temporal subgraphs in G (Definition 8); for
DΔt , the weakly connected components correspond to maximal valid Δt-connected
subgraphs (Definition 9).
All the above correspondences are summarized in Table 6.1 for D and in Table 6.2
for DΔt .

Table 6.2 Correspondence between features of the Δt-thresholded event graph DΔt and the original temporal graph G

Feature in DΔt | Feature in G
Node vD | Event eG
Link lD | Δt-adjacent pair of events e1 →Δt e2
Link weight w | Time difference δt between adjacent events
In-degree kin | # of previous events of the event's endpoint nodes within Δt
Out-degree kout | # of future events of the event's endpoint nodes within Δt
Directed vertex path Pv | Δt-constrained time-respecting path Pe
Sum of weights on path P | Duration of time-resp. path P (if events instantaneous)
Set of downstream nodes for vD | Set of events reachable from eG through Δt-constrained time-respecting paths
Set of upstream nodes for vD | Set of events that can influence event eG through Δt-constrained time-respecting paths
Weakly connected component | Maximal valid Δt-connected subgraph

6.3.2 Temporal Motifs and DΔt

The concepts of Δt-adjacency, Δt-connectivity and temporal subgraphs are intimately
related to temporal motifs (Kovanen et al. 2011, 2013a, b). The concept of
network motifs was originally introduced for static networks by Milo et al. (2002)
in 2002. They defined network motifs as classes of isomorphic induced subgraphs
with cardinality that is higher in empirical data than in a randomized reference sys-
tem, usually the configuration model. Milo et al. showed that similar networks had
similar characteristic network motifs, suggesting that motif statistics are informative
of the function of the system and could be used to define universality classes of
networks (Milo 2004).
Similarly to static-network motifs, temporal motifs are one way of looking at
frequent, characteristic patterns in networks. In this case, the patterns are defined in
terms of both topology and time. For temporal motifs, a natural starting point is to
use the definition of Δt-connected subgraphs (Definition 7), and to look at temporal-
network entities where a sequence of interaction events unfolds in the same way. As
an example, the sequence where A calls B calls C calls A forms a triangular Δt-
connected subgraph if all calls follow one another with a time difference of no more
than Δt. Note that here we consider the events to be directed, but using undirected
events is also possible.
Such temporal-topological patterns reflect the dynamics of the system in question.
Therefore, their characterization can improve our understanding of various complex
systems, e.g., of temporal networks whose structure reflects the nature of human

social interactions and information processing by groups of people. As an example,
in Kovanen et al. (2013b) it was shown that there is a tendency of similar individuals
to participate in temporal communication patterns beyond what would be expected
on the basis of their average interaction frequencies or static-network structure and
that the temporal patterns differed between dense and sparse regions of the network.
These observations relied on the timings of the communication events, reflected in
their Δt-connectivity.
Temporal motifs are formally defined as equivalence classes of isomorphic Δt-
connected, valid temporal subgraphs (Definition 7), where the isomorphism takes
into account both the topology of the subgraph and the temporal order of events.
With this definition, the two-call sequences A calls B calls C and D calls E calls F
both belong to the same two-event equivalence class (if the Δt-adjacency condition
is met).
The temporal-topological isomorphism problem can be solved using a trick that
combines the event graph approach presented in this chapter with the topology of
the subgraph in the original network: a “virtual” node is added onto each (event)
link, analogous to an event node in D. This virtual node is then connected with a
directed arrow to the event that immediately follows it (Kovanen et al. 2011, 2013a);
this is a reduced version of temporal adjacency, as only the next event is considered.
The directed arrows between the virtual nodes determine the order of events in the
subgraph. The virtual nodes are then assigned a “color” different than the original
nodes, and the isomorphism problem is finally solved using static-network algorithms
for directed, coloured graphs, such as Bliss (2007).
The procedure for obtaining temporal-motif statistics from empirical temporal
networks with time-stamped events is as follows (Kovanen et al. 2011), for a given
value of Δt and a chosen motif size s measured in events:

i. Find all maximal Δt-connected subgraphs E max of G.
ii. Find all valid temporal subgraphs E ∗ ⊂ E max of size s.
iii. Solve the isomorphism problem to find equivalence classes for all E ∗ .
iv. Count the number of motif instances in each equivalence class, and compare
against a chosen null model.
For details including pseudocode for the required algorithms, we refer the reader
to Kovanen et al. (2011, 2013a).
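As a rough, self-contained illustration of step (i), the sketch below (ours, not the pseudocode of Kovanen et al.) groups time-stamped events (u, v, t) into maximal Δt-connected subgraphs with a union-find structure. It relies on the observation that, for weak connectivity and assuming at most one event per node per time stamp, it suffices to join consecutive events on each node's timeline whose time difference is positive and at most Δt.

```python
from collections import defaultdict

def maximal_dt_components(events, dt):
    """events: list of (u, v, t). Returns lists of event indices, one list per
    maximal Δt-connected (weakly connected) subgraph."""
    parent = list(range(len(events)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    def union(i, j):
        parent[find(i)] = find(j)

    # Collect, for every node, the events it participates in, sorted by time.
    by_node = defaultdict(list)
    for idx, (u, v, t) in enumerate(events):
        by_node[u].append((t, idx))
        by_node[v].append((t, idx))

    # Joining consecutive events on each node's timeline (0 < gap <= dt) is
    # enough to recover the weakly connected components of the event graph,
    # assuming at most one event per node per time stamp.
    for timeline in by_node.values():
        timeline.sort()
        for (t1, i), (t2, j) in zip(timeline, timeline[1:]):
            if 0 < t2 - t1 <= dt:
                union(i, j)

    comps = defaultdict(list)
    for idx in range(len(events)):
        comps[find(idx)].append(idx)
    return list(comps.values())

events = [("A", "B", 0), ("B", "C", 5), ("C", "A", 9), ("D", "E", 100)]
print(maximal_dt_components(events, dt=10))   # [[0, 1, 2], [3]]
```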
Here, if one wants to compare motif statistics for a range of values of Δt, as
is often the case, the weighted event graph approach helps to substantially reduce
computational time for step (i) of the above procedure. While it is possible to generate
the maximal Δt-connected subgraphs for each value of Δt separately from G’s events
using brute force, the threshold sweep approach outlined in Sect. 6.2.6 is a much
better solution.
With this approach, one simply needs to generate the weighted event graph D
and then threshold it by discarding all edges with weights above each Δt. If one
wants to compute motif statistics for, say Δt1 < Δt2 < · · · < Δtmax , the fastest way
is to begin with an empty network and sort links by increasing weight. Then, one
adds links up to link weight Δt1 and either stores DΔt1 or computes the quantities of
interest, then adds more links up to Δt2 and does the same, repeating up to Δtmax. Note that here, one does not initially need to construct the whole D, which might cause memory problems: building it up to δt = Δtmax is sufficient.
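A minimal sketch of such a sweep is given below (same event format and assumption as in the previous sketch; names are ours): the weighted event-graph links are built once, sorted by weight, and added incrementally while a union-find structure tracks the size of the largest component at each requested Δt.

```python
from collections import defaultdict

def sweep_largest_component(events, thresholds):
    """For each Δt in `thresholds` (taken in ascending order), return the size,
    in events, of the largest weakly connected component of the thresholded
    event graph. events: list of (u, v, t)."""
    # Weighted event-graph links: consecutive events on each node's timeline,
    # weighted by their time difference.
    by_node = defaultdict(list)
    for idx, (u, v, t) in enumerate(events):
        by_node[u].append((t, idx))
        by_node[v].append((t, idx))
    links = []
    for timeline in by_node.values():
        timeline.sort()
        for (t1, i), (t2, j) in zip(timeline, timeline[1:]):
            if t2 > t1:                       # strictly later event
                links.append((t2 - t1, i, j))
    links.sort()                              # ascending weight = ascending Δt

    parent = list(range(len(events)))
    size = [1] * len(events)

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    def union(i, j):
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[ri] = rj
            size[rj] += size[ri]

    results, pos = [], 0
    for dt in sorted(thresholds):
        while pos < len(links) and links[pos][0] <= dt:
            _, i, j = links[pos]
            union(i, j)
            pos += 1
        results.append(max(size[find(i)] for i in range(len(events))))
    return results

events = [("A", "B", 0), ("B", "C", 5), ("C", "A", 9), ("D", "E", 100), ("E", "D", 104)]
print(sweep_largest_component(events, thresholds=[1, 6, 10]))   # [1, 3, 3]
```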

6.3.3 Components of D and Temporal-Network Percolation

6.3.3.1 Measuring Component Size

Let us next discuss the components of D (and DΔt ) in more detail. First, because
the event graph D is directed, the usual complications of defining components in
directed networks apply. However, because D is also acyclic, there cannot be any
strongly connected components, where all nodes are reachable from all other nodes.
Therefore, connected components of D (DΔt ) can only be weakly connected by
definition.
For the purposes of our interest, we can focus on three types of components: (i)
maximal weakly connected components of D, where all nodes of D are joined by
a path if the directions of D’s links are ignored and no more nodes can be added,
corresponding to maximal valid temporal subgraphs of G (Definition 6); (ii) maximal
out-components, uniquely defined for each node v D of D, so that all nodes in the out-
component can be reached from the focal node v D , and (iii) maximal in-components,
again defined uniquely for each v D , so that the focal node can be reached from all
of the component’s nodes. These definitions do not change if we use DΔt (however,
the thresholded DΔt is of course expected to have a different component structure,
generally with more components than D). In the following, we will for simplicity
talk about D only, but everything holds for DΔt as well. For an approximate way of
extracting in- and out-component sizes from empirical data, see Badie-Modiri et al.
(2020).
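If the event graph is stored as a networkx DiGraph, with one node per event and a directed link for each Δt-adjacency, the three component types can be read off with standard routines, as in the hedged sketch below; the event labels are placeholders of ours.

```python
import networkx as nx

# A small event graph D: nodes are events, arrows point from earlier to later
# Δt-adjacent events, so D is directed and acyclic by construction.
D = nx.DiGraph([("e1", "e2"), ("e2", "e3"), ("e1", "e4")])

# (i) maximal weakly connected components (link directions ignored)
weak = list(nx.weakly_connected_components(D))

# (ii) maximal out-component of a focal event: everything reachable from it
out_comp = {"e1"} | nx.descendants(D, "e1")

# (iii) maximal in-component: everything from which the focal event is reachable
in_comp = {"e3"} | nx.ancestors(D, "e3")

print(weak, out_comp, in_comp)
```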
Let us next discuss the properties—in particular, the concept of size—for com-
ponents of D defined using any of the above definitions.
First, the most straightforward way to define component size is to count the number
S E (C) of the event graph D’s nodes that belong to the component C. This is equal
to the number of events in the original temporal network G that belong to the same
component, and S E (C) ∈ [0, |E G |]. For a schematic illustration, see Fig. 6.2, panel a.
Second, one can map the nodes of D in component C back to the events of
the temporal network G and count the number of vertices involved in the events,
SV (C) ∈ [0, |VG |]. This is the “spacelike” definition of size (see Fig. 6.2, panel b).
Third, because the event nodes in D come with time stamps—the events take
place at specified times—one may think of a “timelike” size: the duration (that is,
the lifetime) of the component St (C) ∈ [0, T ], the time difference between C’s last
and first event. This is illustrated in Fig. 6.2, panel c.
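Given one component expressed as a set of events of the original network G, the three sizes follow directly; a minimal sketch (the data layout, triples (u, v, t), is an assumption of ours):

```python
def component_sizes(component_events):
    """component_events: iterable of (u, v, t) belonging to one component of D.
    Returns (S_E, S_V, S_t): number of events, number of involved vertices of G,
    and the component lifetime."""
    events = list(component_events)
    s_e = len(events)                                        # events of G / nodes of D
    s_v = len({n for (u, v, _) in events for n in (u, v)})   # vertices of G involved
    times = [t for (_, _, t) in events]
    s_t = max(times) - min(times)                            # lifetime of the component
    return s_e, s_v, s_t

print(component_sizes([("A", "B", 0), ("B", "C", 5), ("C", "A", 9)]))  # (3, 3, 9)
```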
Note that these measures of size may or may not be correlated in a temporal
network. In a random, Erdős-Rényi-like temporal network they on average are (see
Kivelä et al. (2018)). In this case, one can think of a single “giant” temporal compo-
Fig. 6.2 Panel a: The shaded area indicates the size S E of a component of D, measured in the
number of nodes of D (events of G) involved in the component. Panel b: the size SV of the same
component, measured as the number of involved vertices in the original graph G as indicated by the
shaded area. Panel c: The third way of measuring component size, the lifetime St of the component
measured as the time difference between its last and first events

nent that encompasses most of the events in D and nodes in G and that lives for the
entire observation period of the temporal network. However, this is a special case, and
one can equally well think of networks where the different types of “giant” compo-
nents are separated. As an example, there can be short-lived, “spacelike” components
that span most of the nodes in G but contain only a small fraction of the nodes in
D because of their short lifetime. Several such components may appear during the
lifetime of the network. Further, one can also envision a persistent component that
spans the whole time range but involves only a small number of nodes that repeatedly
and frequently interact: this component is large in St but vanishingly small in S E and
SV . Again, multiple such components may coexist.

6.3.3.2 Temporal-Network Percolation Analysis with DΔt

When the event graph D is thresholded to DΔt , its component structure depends on
the threshold weight. Δt can then be viewed as the control parameter of a percola-
tion problem. The value of the control parameter Δt determines the event graph’s
component structure, in particular its largest component, similarly to the edge weight
threshold used in percolation studies on static, weighted networks (see, e.g., Onnela
et al. (2007)). Alternatively, as the effects of varying Δt depend on the characteristic
timescale of a particular network, for better theoretical comparability, one can use
the excess out-degree of DΔt as the control parameter (Badie-Modiri et al. 2022a).
The excess out-degree behaves qualitatively in the same way as a control parameter because it is a monotonically increasing function of Δt.
In static network percolation, there is a critical value of the control parameter
that separates the connected and disconnected phases of the network. When the
control parameter reaches this value, connectivity suddenly emerges, reflected in
the appearance of a giant connected component that spans a finite fraction of the
network. This is measured either as the fraction of nodes or links included in the
largest component; whichever measure is used, it is called the order parameter of
the percolation problem.
Here, since the control parameter Δt operates on the event graph D, the most
obvious choice for the order parameter would be the relative size of D’s largest com-
ponent (the choices for component definition being weakly or strongly connected or
out-component, see Badie-Modiri et al. (2022a, b)). As discussed above, the compo-
nent’s size regardless of its type can be measured as the number S E of its constituent
event nodes in D, so that the corresponding order parameter

ρ_E(Δt) = (1/|E|) max S_E ,                                        (6.1)

where |E| is the number of (event) nodes in D and we’ve made the dependence on
Δt explicit. As such, this definition works in a straightforward way and ρ E (Δt) can
be expected to behave as a typical order parameter would.
The size of the components other than the largest component is often used in
percolation studies to detect the critical point. Using the above definition of size S E ,
one can define the susceptibility

χ_E = (1/|E|) ∑_{S_E < max S_E} n_{S_E} S_E² ,                      (6.2)

where n SE is the number of components of size S E and the sum is over all components
except the largest. The susceptibility diverges at the critical point that separates the
connected and fragmented phases as the small components are absorbed by the
emerging giant component.
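Both quantities follow directly from the list of component sizes; the minimal sketch below (function and variable names are ours) computes ρ_E and χ_E for one value of Δt.

```python
from collections import Counter

def percolation_observables(sizes, n_events):
    """sizes: S_E of every component of the thresholded event graph D_Δt;
    n_events: |E|, the total number of events.
    Returns (rho_E, chi_E) following Eqs. (6.1) and (6.2)."""
    largest = max(sizes)
    rho_e = largest / n_events
    counts = Counter(s for s in sizes if s < largest)    # exclude the largest size
    chi_e = sum(n_s * s ** 2 for s, n_s in counts.items()) / n_events
    return rho_e, chi_e

print(percolation_observables([5, 3, 3, 1, 1, 1], n_events=14))  # (0.357..., 1.5)
```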
However, as discussed above, one can measure the size of a component of D in two
other ways. The “spacelike” way is to count the number of nodes SV of the original
network G that are involved in the component through D’s event nodes. Using this
definition of component size, we arrive at the order parameter that measures what
fraction of G is associated with D’s component:

ρ_V(Δt) = (1/|V|) max S_V ,                                        (6.3)
where |V| is the number of nodes in G. For this order parameter, one could naïvely define the corresponding susceptibility-like measure as

χ_V = (1/|V|) ∑_{S_V < max S_V} n_{S_V} S_V² ,                      (6.4)

but this measure may behave in hard-to-predict ways at the critical point, if something that can be called a critical point even exists here. This is because a node v ∈ VG that partic-
ipates in multiple connected events will appear multiple times in the corresponding
component of D. The nodes of the original network G may, for similar reasons, also
belong to multiple components occurring at different times. In other words, the sum ∑_{S_V < max S_V} n_{S_V} is not a conserved quantity.
The third component size definition captures the time length of components in D,
leading to an order parameter:

ρ_t(Δt) = (1/T) max S_t ,                                          (6.5)
where T is the observation period, i.e., the lifetime of G. While one could in principle
again define a susceptibility-like measure for this control parameter, as in Eq. (6.4),
this measure would not be too useful. This is because multiple components of D
can easily co-exist, overlapping in time, and there can be a number of long-lived
simultaneous components.

6.3.3.3 Temporal-Network Percolation: Empirical Examples

To illustrate the behaviour of the order parameter and the susceptibility as a function
of the event-graph threshold Δt, we will next review some of the results originally
published in Kivelä et al. (2018), obtained for three data sets and for weakly connected
components: a large dataset of time-stamped mobile-telephone calls (Karsai et al.
2011), a dataset on sexual interactions from a study of prostitution (Rocha et al. 2011),
and an air transportation network in the US (Bureau of Transportation Statistics
2017). See Kivelä et al. (2018) for more details on the datasets.
Two versions of the relative largest component size (order parameter) and the
susceptibility are shown for all datasets in Fig. 6.3. The first version is based on the
event-graph component size S E and the second is based on the number of involved
nodes in the original graph, SV . For these data sets, the critical points indicated by
the diverging susceptibility are fairly similar for both measures, with the exception
of a small difference for the air transport network where χV peaks slightly earlier
than χ E . For these datasets, the “timelike” order parameter of Eq. 6.5 (not shown)
does not produce meaningful results; it does not behave like an order parameter for
reasons discussed in Sect. 6.3.3.2.
The identified critical points are related to characteristic time scales in the systems
in question; as an example, they indicate lower bounds on how long a spreading process

Fig. 6.3 The behaviour of the relative largest component size ρ and the susceptibility χ as a
function of Δt, for three data sets and two variants of the measures. ρ E and χ E are for component
size measured in event-graph nodes and ρV and χV for size measured in the number of temporal-
network nodes involved in the component’s events. Panel a: mobile telephone calls, displaying a
critical point at around 4 h 20 min. Panel b: sexual interactions, with a critical point at around 7 days
(followed by a second peak at ∼16 days). Panel c: US air transport, with a critical point at ∼20
min. Figure adapted from the original in Kivelä et al. (2018)

would typically need to survive in order to eventually reach most of the network
(lower bounds because they are for weakly connected components, see below). In
the case of mobile communication networks, if we imagine, e.g. a rumour spreading
through the phone calls, the cascade will die out unless the rumour is still relevant
and worth spreading for each node after 4 h and 20 min have passed since the node
received it. For the sexual contact network, a sexually transmitted disease can become an epidemic and spread through the network if an infected individual remains infectious for longer than 7
days after being infected. For the air transport network, the identified characteristic
time of approximately 20 min is related to the synchronization of connecting flights
at airports.
As the above results are for weakly connected components, in the context of
spreading processes, the component size is an upper bound for the number of nodes
that can be infected by the process if it begins inside the component. For processes
constrained from above so that the spreading agent has to move forward from a
node within δt, the observed critical Δt is a lower bound: one can only be certain
that the spreading process would not percolate below this threshold. For temporal-
network percolation results computed with out-components, see Badie-Modiri et al.
(2022a, b).

6.3.3.4 The Directed Percolation Universality Class

Let us conclude the section by addressing the universality class of percolation taking
place in temporal networks and subsequently in weighted event graphs. Universality
classes of phase transitions are, generally speaking, categories that are defined by the
underlying symmetries of the problem. They are manifested through various critical
exponents and their relationships that can be extracted from theory or from careful
analysis of empirical data.
In the case of temporal networks, the main difference from typical static net-
works is that there is a direction determined by the arrow of time. However, there are
analogous static networks with time ordering—directed static lattices—whose per-
colation properties are well known. Percolation taking place on such lattices belongs
to the directed percolation universality class (Hinrichsen 2000), where there are two
diverging correlation lengths, one parallel and one perpendicular to the lattice. In
temporal networks, they correspond to the temporal and spatial dimensions. In fact,
the three types of component sizes (events, space-like, time-like) have their exact
counterparts in such lattices, if one considers the perpendicular components of the
lattice to represent nodes at a certain time instance.
In light of the above, it seems natural that temporal-network percolation would
correspond to directed percolation. This was indeed proven theoretically for random
temporal networks in Badie-Modiri et al. (2022a), where the mean field rate equation
defining component sizes in the reduced event graph D̂ was shown to be equivalent
to the rate equation for directed lattices, indicating that the critical exponents for
temporal-network percolation are those of the directed percolation universality class.
This seems to hold for correlated empirical temporal networks that deviate from
randomness as well (Badie-Modiri et al. 2022b).

6.4 Discussion and Conclusions

Because temporal networks carry information on the times of the interactions
between the network’s nodes, they allow for the detection of patterns that would
be lost if the networks’ events were aggregated into static structures. This has led to
an increased understanding of the dynamics of network structures and processes that
unfold on top of networks. The downside of this framework is that network analysis
becomes more complicated because of the additional degrees of freedom brought by
the temporal dimension. Temporal networks are, in a way, mixtures of graphs and
time series: therefore, if one is not satisfied with studying one of these aspects only,
entirely new ways of looking at their structure are required.
In this chapter, we have presented an approach that projects important features
of temporal network dynamics into a static line graph structure: the weighted event
graph. Weighted event graphs can be used both to understand the structural features of
the temporal networks they encode as well as to investigate dynamic processes taking
place on temporal networks. The weighted event graph framework maps temporal-
topological structures onto weighted, directed, acyclic graphs. This representation
preserves the time-respecting paths of the original network as well as the timing
differences between consecutive events on those paths. Weighted event graphs are
particularly useful for studying paths, structures, and processes where one wants to
set constraints to the times between successive events (Δt-connectivity), in other
words, where the events have to follow one another quickly enough.

Beyond the examples discussed in this chapter (temporal motifs, temporal-
network percolation), one can envision many uses for weighted event graphs. In
theory, any method or approach which has been developed around the concepts of
paths or walks could benefit from being viewed as a topological problem in weighted
event graphs instead of a dynamical problem in temporal networks. Looking beyond
the surface, it is clear that many important topics and measures in network science are
at least partly based on the path structure of networks, including several approaches
in dynamic models on networks, community detection, and centrality measures. As
is evident from the cases of percolation analysis and temporal motifs, weighted tem-
poral event graphs can be useful for both defining concepts and measures that are
intuitive and transparent as well as providing access to computationally efficient
methods for solving temporal network problems.
There is one rather obvious use of weighted event graphs that we have not dis-
cussed yet: the issue of centrality measures. The computation of various temporal-
network centralities should greatly benefit from weighted event graphs as they encap-
sulate the whole set of time-respecting paths (or their Δt-constrained subset). Such
centralities could straightforwardly be computed using definitions and algorithms
developed for static networks but in this case, applied to the event graphs instead
(perhaps using fast approximate algorithms such as in Badie-Modiri et al. (2020)).
As a bonus, because of the event graph’s construction, these measures would be
computed for events of the original network instead of its vertices. It can be argued
that this is—at least in some cases—more meaningful than computing quantities
for the nodes. Any centrality measure for a node should come with an additional
constraint on its valid time range. As an example, should the “temporal betweenness
centrality” of a node characterize the node’s centrality over some time range (up to
the entire range of observation of the temporal network), or at some specific point in
time, building on the paths that pass through the node at that point (so that it would
be computed for a node-time instead of a node)? However, with events, the defini-
tion is more straightforward: temporal betweenness centrality should depend on the
number of (fastest) temporal paths passing through the event. Therefore, at least for
instantaneous events, it can be directly and simply calculated from the event graph’s
directed path structure. This reflects the central position of events as the building
blocks of temporal networks—fundamentally, temporal networks are networks of
events.
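As a purely illustrative sketch of this idea, one could hand the directed event graph to a standard static betweenness routine, so that the resulting score is attached to events rather than to vertices. This is a naïve shortest-path reading of such an event betweenness, not an established definition, and the small event graph below is made up.

```python
import networkx as nx

# Directed, acyclic event graph: one node per event, links for Δt-adjacency.
D = nx.DiGraph([("e1", "e2"), ("e2", "e4"), ("e1", "e3"), ("e3", "e4"), ("e4", "e5")])

# A naïve "event betweenness": static betweenness centrality computed on D,
# counting shortest event-graph paths that pass through each event.
event_betweenness = nx.betweenness_centrality(D)
print(event_betweenness)
```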

Acknowledgements JS acknowledges funding from the Strategic Research Council at the Academy
of Finland (NetResilience consortium, grant numbers 345188 and 345183).
References

T. Aledavood, E. López, S.G.B. Roberts, F. Reed-Tsochas, E. Moro, R.I.M. Dunbar, J. Saramäki,
Daily rhythms in mobile telephone communication. PLoS One 10, e0138098 (2015)
V.-P. Backlund, J. Saramäki, R.K. Pan, Effects of temporal correlations on cascades: threshold
models on temporal networks. Phys. Rev. E 89, 062815 (2014)
A. Badie-Modiri, M. Karsai, M. Kivelä, Efficient limited-time reachability estimation in temporal
networks. Phys. Rev. E 101, 052303 (2020)
A. Badie-Modiri, A.K. Rizi, M. Karsai, M. Kivelä, Directed percolation in temporal networks. Phys.
Rev. X 4, L022047 (2022)
A. Badie-Modiri, A.K. Rizi, M. Karsai, M. Kivelä, Directed percolation in random temporal network
models with heterogeneities. Phys. Rev. E 105, 054313 (2022)
Bureau of Transportation Statistics (2017). www.bts.gov
H. Hinrichsen, Non-equilibrium critical phenomena and phase transitions into absorbing states.
Adv. Phys. 49, 815 (2000)
P. Holme, Modern temporal network theory: a colloquium. Eur. Phys. J. B 88(9), 234 (2015)
P. Holme, J. Saramäki, Temporal networks. Phys. Rep. 519(3), 97–125 (2012)
D.X. Horváth, J. Kertész, Spreading dynamics on networks: the role of burstiness, topology and
non-stationarity. New J. Phys. 16, 073037 (2014)
J.L. Iribarren, E. Moro, Impact of human activity patterns on the dynamics of information diffusion.
Phys. Rev. Lett. 103, 038702 (2009)
H.-H. Jo, M. Karsai, J. Kertész, K. Kaski, Circadian pattern and burstiness in human communication
activity. New J. Phys. 14, 013055 (2012)
T. Junttila, P. Kaski, Engineering an efficient canonical labeling tool for large and sparse graphs,
in Proceedings of ALENEX 2007, ed. by D. Applegate, G.S. Brodal, D. Panario, R. Sedgewick
(SIAM, 2007), p. 135
F. Karimi, P. Holme, Threshold model of cascades in temporal networks. Phys. A 392, 3476 (2013)
M. Karsai, A. Noiret, A. Brovelli, Work in progress (2019)
M. Karsai, M. Kivelä, R.K. Pan, K. Kaski, J. Kertész, A.-L. Barabási, J. Saramäki, Small but slow
world: how network topology and burstiness slow down spreading. Phys. Rev. E 83, 025102
(2011)
M. Kivelä, J. Cambe, J. Saramäki, M. Karsai, Mapping temporal-network percolation to weighted,
static event graphs. Sci. Rep. 8, 12357 (2018)
L. Kovanen, M. Karsai, K. Kaski, J. Kertész, J. Saramäki, Temporal motifs in time-dependent networks. J. Stat. Mech. Theor. Exp. 2011, P11005 (2011)
L. Kovanen, M. Karsai, K. Kaski, J. Kertész, J. Saramäki, Temporal motifs. In: Temporal Networks,
ed. by P. Holme, J. Saramäki (Springer, Heidelberg, 2013), pp. 119–134
L. Kovanen, K. Kaski, J. Kertész, J. Saramäki, Temporal motifs reveal homophily, gender-specific
patterns, and group talk in call sequences. Proc. Natl. Acad. Sci. (USA) 110(45), 18070–18075
(2013)
P.L. Leath, Cluster size and boundary distribution near percolation threshold. Phys. Rev. B 14, 5046
(1976)
A. Mellor, The temporal event graph. J. Complex Netw. 6, 639–659 (2017)
R. Milo, Superfamilies of evolved and designed networks. Science 303(5663), 1538–1542 (2004)
R. Milo, S. Shen-Orr, S. Itzkovitz, N. Kashtan, D. Chklovskii, U. Alon, Network motifs: simple
building blocks of complex networks. Science 298(5594), 824–827 (2002)
G. Miritello, R. Lara, M. Cebrian, E. Moro, Limited communication capacity unveils strategies for
human interaction. Sci. Rep. 3, 1950 (2013)
H. Navarro, G. Miritello, A. Canales, E. Moro, Temporal patterns behind the strength of persistent
ties. EPJ Data Sci. 6, 31 (2017)
M.E.J. Newman, R.M. Ziff, Fast Monte Carlo algorithm for site or bond percolation. Phys. Rev. E
64, 016706 (2001)
V. Nicosia, M. Musolesi, G. Russo, C. Mascolo, V. Latora, Components in time-varying graphs.
Chaos 22, 023101 (2012)
J.-P. Onnela, J. Saramäki, J. Hyvönen, G. Szábo, D. Lazer, K. Kaski, J. Kertész, A.-L. Barabási,
Structure and tie strengths in mobile communication networks. Proc. Natl. Acad. Sci. USA 104,
7332 (2007)
L.E. Rocha, F. Liljeros, P. Holme, Simulated epidemics in an empirical spatiotemporal network of
50,185 sexual contacts. PLoS Comp. Biol. 7, e1001109 (2011)
T. Takaguchi, N. Masuda, P. Holme, Bursty communication patterns facilitate spreading in a
threshold-based epidemic dynamics. PLoS One 8, e68629 (2013)
Chapter 7
Exploring Concurrency and Reachability
in the Presence of High Temporal
Resolution

Eun Lee, James Moody, and Peter J. Mucha

Abstract Network properties govern the rate and extent of spreading processes
on networks, from simple contagions to complex cascades. Recent advances have
extended the study of spreading processes from static networks to temporal networks,
where nodes and links appear and disappear. We review previous studies on the effects
of temporal connectivity for understanding the spreading rate and outbreak size of
model infection processes. We focus on the effects of “accessibility”, whether there is
a temporally consistent path from one node to another, and “reachability”, the density
of the corresponding “accessibility graph” representation of the temporal network.
We study reachability in terms of the overall level of temporal concurrency between
edges, quantifying the overlap of edges in time. We explore the role of temporal
resolution of contacts by calculating reachability with the full temporal information
as well as with a simplified interval representation approximation that demands less
computation. We demonstrate the extent to which the computed reachability changes
due to this simplified interval representation.

Keywords Temporal networks · Concurrency · Accessibility · Reachability ·
Temporal contacts · Structural cohesion · Disease spread · Epidemic potential ·
STD

E. Lee
Pukyong National University, Busan, Republic of Korea
e-mail: eunlee@pknu.ac.kr
J. Moody
Duke University, Durham, NC, USA
e-mail: jmoody77@duke.edu
P. J. Mucha (B)
Dartmouth College, Hanover, NH, USA
e-mail: peter.j.mucha@dartmouth.edu


7.1 Introduction

Variation in epidemic spreading stems in part from the diversity of temporal contact
patterns between subjects, whether such changes are a direct result of individual
state changes (as in, e.g., Daley and Kendall (1964); May and Anderson (1987,
1988)) or more general temporal variation (see, e.g., Holme and Saramäki (2012);
Masuda and Lambiotte (2016)). For example, the distribution of the lifespan of edges
can significantly affect the speed and eventual spread of an infection (Masuda et al.
2013; Holme and Liljeros 2015; Li et al. 2018). The increased availability of detailed,
digitized temporal contact patterns supports and accelerates new investigations about
the effects of temporal details, including analysis of properties such as fat-tailed inter-
event time distributions (Vazquez et al. 2007; Karsai et al. 2011; Rocha et al. 2011;
Gernat et al. 2018). Indeed, the ‘burstiness’ of inter-event times can either slow
down (Karsai et al. 2011) or speed up dynamics (Rocha et al. 2011; Gernat et al.
2018). Meanwhile, such apparently contradictory effects provide a clue that there
may be other elements controlling the dynamics beyond the bursty inter-event times.
Holme and Liljeros (2015) investigated the changes to the observed outbreak sizes
from various selected shifts to the contact histories: “beginning intervals neutralized”
(BIN) shifts all contact pairs to first appear at the same time, “end intervals neutral-
ized” (EIN) shifts the last contact between all pairs to the same time, and “interevent
intervals neutralized” (IIN) replaces the heterogeneous intervals of contact between
a pair to a uniform step size in time (keeping start and end times the same). For 12
empirical temporal networks, they found that BIN and EIN resulted in more sig-
nificant differences in the outbreak size compared with differences obtained from
IIN. A possible explanation for the relatively larger effect of these BIN and EIN
modifications could be in the resulting changes in the concurrency of contacts. That
is, by shifting all contacts to start (BIN) or end (EIN) at the same time, there is pre-
sumably greater temporal overlap between different contact intervals, augmenting
the temporally consistent paths in the network over which the infection may spread.
Further supporting this possible interpretation, Li et al. (2018) analyzed the tran-
sient behavior of reference models with randomly permuted times that either pre-
serve the lifetimes of edges or of the nodes, focusing on changes in spreading speed
according to the selected reference model. Their results demonstrate the dependence
between the ability of an edge to help spread the infection and the time interval
of its lifetime. Together, these results highlight the importance of the overall time
interval over which a given pair is in contact, as opposed to the detailed timings of
the contacts in those intervals.
Such studies point to the crucial function of the concurrency of edges in infec-
tion dynamics. In this chapter, we summarize previous studies related to issues of
concurrency and the overall reachability constrained by the network timing details.
We then explore the impact of concurrency on reachability by rescaling the start
times in a set of empirical temporal networks. Using these empirical networks we
then demonstrate the accuracy with which reachability is correctly calculated using
a simplified interval representation for each edge that ignores the detailed timings of
interevent contacts.

7.2 Previous Studies on Concurrency and Reachability

Although there exist various definitions for concurrency, its essence is clear: the
extent of temporal overlap among the contacts. The significance of concurrency in a
temporal network is immediately obvious for governing the reachable extent of any
information or infection. Consider for example a simple situation with only three
actors {A, B, C} with B and C connected by an edge at some early time and A
and B connected at a later time. If the temporal extent of these two edges do not
overlap, then there is no way for any infection or information that spreads from A
to B to continue on to C. However, if the two edges temporally overlap, then C is
indeed “accessible” or “reachable” from A. The reachable extent allowed by the edge
timings in a temporal network immediately impacts the real spread and modeling of
an infectious disease, independent of the details of the dynamical process (e.g., SI,
SIR, SEIR, complex contagion, etc.). The expected size of the maximally reachable
set can be quantified by “reachability”, defined as the fraction of ordered node pairs
with at least one temporally consistent path from the source node to the target node.
Such ordered node pairs are "accessible."
Because the reachability is an underlying property of a temporal network, inde-
pendent of the spreading process taking place on that network, and since it naturally
constrains all spreading processes on the temporal network, reachability has been
used in multiple previous studies (Moody 2002; Holme 2005; Lentz et al. 2013;
Moody and Benton 2016; Armbruster et al. 2017; Onaga et al. 2017). For example,
Holme (2005) numerically investigated two types of reachability called reachability
time and reachability ratio, to categorize the effectiveness of the real-world con-
tact networks in terms of time and spreading size. Lentz et al. (2013) also explored
accessibility in empirical networks, proposing the use of causal fidelity, defined as
the fraction of network paths that can be followed by a sequence of events of strictly
increasing times. That is, if all of the temporal contact information is agglomerated
into a static network (collecting all edges that ever exist in the data but ignoring their
timings), causal fidelity is the fraction of paths in this agglomerated static network
that are also available in the full temporal network, thereby quantifying how well the
static network representation might approximate the full dynamics.
In a related line of inquiry, the effect of concurrent partnerships has been
of key interest for the spread of sexually transmitted diseases (STDs) such as
HIV/AIDS (Morris and Kretzschmar 1995; Kretzschmar and Morris 1996; Moody
2002). Moody (2002) emphasized the substantial effect of concurrency on the reach-
ability in an adolescent romantic network, in that reachability plays the role of an
upper bound on the expected outbreak size of an infection spreading on the network.
An array of studies have assessed the effect of concurrent relationships for modeling
infectious spreading on synthetic networks (Eames and Keeling 2004; Doherty et al.
2006; Miller and Slim 2017; Onaga et al. 2017). The merit in studying synthetic net-
works is that it enables researchers to control the network’s structural properties and
the extent of the concurrent partnerships, which are obviously impossible to control
in real-life networks. Despite considerable emphasis by different investigators about
the role of temporally overlapping contacts, we still lack a general definition of con-
currency in that slightly different definitions have been used across these studies. For
example, Gursky and Hoffman (2016) assumed concurrency based on the lifetime
of an edge, following the definition in Watts and May (1992). Doherty et al. (2006)
defined concurrency as the proportion of subjects engaging in concurrent relation-
ships within a population. Onaga et al. (2017) set concurrency as a fixed number of
connections of an individual in time.
Onaga et al. (2017) proposed a theoretical framework for the epidemic threshold
induced by concurrency. In general, a low epidemic threshold can indirectly indi-
cate a high probability of infection prevalence. Further, the relationship between
concurrency and the epidemic threshold can help explain the relationship between
concurrency and reachability. Onaga et al. defined the concurrency as a fixed number
of links emanating from a node in unit time, and the activation of the links is decided
by a node’s activity level. The activity level is drawn from uniform and power-law
probability distributions. Then, they applied the analytically tractable activity-driven
model. Given the star-like network in unit time, the authors derived differential
equations for a SIS model to estimate the epidemic threshold. They compared the
analytically derived threshold to the numerically estimated threshold, confirming a
close match. From the results, the authors found that the transition of the epidemic
threshold depended strongly on the extent of the concurrent connections. The results,
again, stress the importance of concurrency in adjusting infectious potential.
In the present work, we are motivated by the framework investigated by Moody and
Benton (2016), which focused on the roles of concurrency and structural cohesion.
They performed numerical experiments simulating edge start times and durations
on network structures sampled with a four-step random walk from a collaboration
network. Moody and Benton thus obtained 100 sample networks with which they
explored different levels of structural cohesion, defined as the average number of
node-independent paths between node pairs (Moody and White 2003). The authors
controlled the concurrency—quantified by the fraction of connected edges whose
temporal intervals overlap in time—by adjusting the distributions of the start times
and durations of the edges. Given the sample networks with random start and duration
times on each edge, they then measured reachability as a function of concurrency
and modeled the relationships with general linear regression models. Moody and
Benton showed that the concurrency and structural cohesion both affect reachability
in their examples, finding that the role of concurrency is particularly important in
low structural cohesion networks because a slight increase in concurrency sharply
increases the number of accessible node pairs (that is, ordered node pairs connected
via temporally consistent paths). When one considers that networks of low structural
cohesion are common in many sexual contact networks, Moody and Benton’s findings
stress the importance of concurrency for STD transmission.
Recently, Lee et al. (2019) developed tree-like model approximations for the
relationship between concurrency and reachability, to further elucidate numerical
results like those of Moody and Benton (2016). Lee et al. compared their approxi-
mations with numerically-computed reachability in temporal networks obtained with
simulated edge timings on various network structures: balanced and unbalanced trees,
Erdős-Rényi (ER) networks, exponential degree distribution networks, and four of
the sampled networks highlighted in Moody and Benton (2016). Because of the
nature of their tree-like assumptions, these models well approximate the relationship
between reachability and concurrency for small values of structural cohesion, doing
particularly well also at small values of concurrency. But their existing models do
not do as well in the presence of larger numbers of available alternate paths between
nodes. Nevertheless, this study further demonstrates how the overall level of reach-
ability emerges through an interplay between concurrency and structural cohesion.

7.3 Effects of Concurrency: Empirical Examples

In the remainder of this chapter, we focus on a set of empirical examples to fur-
ther explore reachability and concurrency, complementing the results in Moody and
Benton (2016) and Lee et al. (2019). As part of our exploration, we transform the
detailed contact time information of the edges into an interval representation wherein
the distinct contacts between each connected node pair are instead represented simply
by a start time (the first observed contact) and an end time (the last contact). That is,
we treat each edge as if it was present for the entirety of the interval between the first
and last observed contacts. We then measure concurrency and reachability on this
simpler interval representation. To perform numerical experiments under different
values of concurrency, we modify these time intervals by rescaling the total range of
the start times in the temporal network while keeping the duration of each edge con-
stant. By doing so, we investigate how concurrency affects reachability and examine
whether reachability on the simpler interval representation matches that measured
from the original contact times.
The basic characteristics of the four example empirical networks used in this study
are described below and in Table 7.1. In the following subsections, we then describe
the transformation to the interval representation, the measurement of concurrency
and our method for modifying it in our present simulations, and the calculation
of reachability. Using the empirical examples, we then demonstrate the impact of
concurrency on reachability as well as the relative accuracy of computing reachability
with the simplified interval representation in these examples.

Table 7.1 Four empirical temporal networks used in the present work. The networks are of size N
(number of nodes) with Mc distinct temporal contacts between Md different node pairs (the number
of edges). The resolution of the temporal contacts is denoted by Δ
Name of the data    N      Mc      Md     Δ
High school         180    45,047  2,220  20 s
Conference          113    20,818  2,196  20 s
DNC email           1,890  39,263  4,463  1 s
Brazil              6,576  8,497   8,056  5 days

7.3.1 Data

We used four empirical networks in the present study. The first data set (denoted “High
School” here) contains the temporal network of contacts between students in a high
school in Marseilles, France, including contacts between the students in five classes
during a seven day period (from a Monday to the Tuesday of the following week) in
November 2012 (Fournet and Barrat 2014). The network includes N = 180 nodes
and Mc = 45,047 distinct contacts between Md = 2,220 different node pairs (that
is, yielding Md different edges in the interval representation). The time resolution of
the measured contacts is Δ = 20 s.
The second data set (“Conference”) corresponds to the contacts among attendees
of the ACM Hypertext 2009 conference (Isella et al. 2011). The conference spanned
2.5 days, with the network sampled every Δ = 20 s. The network consists of N = 113
attendees and Mc = 20,818 contacts between Md = 2,196 node pairs.
The third data set (“DNC Email”) is the Democratic National Committee email
network, as hacked from the DNC in 2016 (data available online at http://konect.cc/
networks/dnc-temporalGraph). Nodes in the network correspond to persons, with
each contact along an edge representing an email sent to another person. Although
the data are originally directed, we treat edges here as undirected for simplicity. The
network includes N = 1,891 nodes and Mc = 39,264 email contacts, connecting
Md = 4,465 node pairs.
The fourth data set (“Brazil”) is a sexual contact data set obtained from a Brazilian
web forum exchanging information about sex sellers between anonymous, hetero-
sexual, male sex buyers between September 2002 and October 2008 (Rocha et al.
2010). In this web forum, male members grade and categorize their sexual encounters
with female escorts in posts using anonymous nicknames. From the posts, Rocha
et al. (2010) constructed a network connecting every community member (sex buyer)
to an escort. The time information of the posts are used here as the temporal contact
between a seller and buyer. The entire network’s size is 16,730. However, to save
computational cost, we ignored temporal contacts that occurred during the first 1,000
days of the data. Additionally, whereas the original data is resolved at the level of
days, we down-sampled the resolution of the contacts to Δ = 5 days. As a result,
the data we consider includes N = 6,576 nodes and Mc = 8,497 distinct contacts
along Md = 8,056 edges.

7.3.2 Change to the Interval Representation

The empirical networks include detailed temporal contact patterns like that repre-
sented in Fig. 7.1a: an edge representing the connection between nodes i and j has
potentially several time stamps that represent the distinct contacts between i and j.
The detailed transmission of any infection occurs during these contacts. However,
instantaneous contacts are not necessarily the best way to think about concurrency in

Fig. 7.1 A toy temporal-network example represented by a distinct contacts and b the correspond-
ing interval representation. The interval of each edge starts with the first observed contact. To
account for temporal resolution, we set a strict inequality (open interval) end time equal to the last
observed contact plus the temporal resolution Δ

these relationships. Consider the motivation to study the spread of STDs: the appro-
priate notion of concurrency isn’t that the contacts occur at precisely the same time,
only that they are interleaved in time.
As such, in our present investigation of concurrency and reachability we employ
a simplification obtained by transforming the temporal details in the contacts into an
interval representation, keeping only the start- (ts ) and end time (te ) of each edge,
as shown in Fig. 7.1b. In panel (b), the contacts of the edge between A and D—
which include contacts at times {3,4,6} (see Fig. 7.1a)—are converted to the
time interval [3, 6 + Δ). In this transformation, we explicitly add the time resolution
Δ to the last contact time so that every edge includes a non-zero time interval even
if it represents only a single contact at time t [for example, the edge (A, B) in panel
(a)]. Consistent with this addition, in our convention the edge only persists for times
strictly less than the end time of the interval (open interval on the right).
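A minimal sketch of this transformation (the function name and the contact format, triples (u, v, t), are ours): contacts are grouped by node pair and each edge is replaced by the half-open interval from its first contact to its last contact plus the resolution Δ.

```python
from collections import defaultdict

def to_interval_representation(contacts, delta):
    """contacts: iterable of (u, v, t) time-stamped contacts (u, v unordered).
    Returns {edge: (t_s, t_e)} where the edge is present during [t_s, t_e)."""
    times = defaultdict(list)
    for u, v, t in contacts:
        times[frozenset((u, v))].append(t)
    return {edge: (min(ts), max(ts) + delta) for edge, ts in times.items()}

contacts = [("A", "B", 1), ("A", "D", 3), ("A", "D", 4), ("A", "D", 6)]
print(to_interval_representation(contacts, delta=1))
# A-B -> (1, 2); A-D -> (3, 7), i.e. the interval [3, 6+Δ) of Fig. 7.1b
```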

7.3.3 Measuring and Controlling Concurrency

In the present work, we measure concurrency as the fraction of edge pairs that overlap
in time. In so doing, we first emphasize that the key mechanism through which
concurrency plays out is at the level of connected edges (that is, two edges that share
a common node). However, in simulations where edge timings are independent and
identically distributed, such as those in Moody and Benton (2016) and Lee et al.
(2019), the expected measurement of concurrency over all edge pairs is equivalent
to that over the subset of connected edge pairs. In practice, in the real world, whether
one more naturally defines concurrency over all edge pairs or only connected edge
pairs may be directly determined by the nature of surveyed information. For example,
if distributions of start times and durations of edges are measured, then the resulting
estimate is effectively over all pairs. In contrast, if participants are directly queried
about their numbers of concurrent relationships, then the restriction to connected
edge pairs may be more natural. For the purposes of the present article, we measure

Fig. 7.2 Controlling concurrency in our toy example. a The very first link to appear, l0 , sets the
minimum starting time in the empirical data, ts,min = ts,l0 , and the temporal axis in panel a is
expressed in time since ts,min . A later link, l1 , has an empirical start and end time: here ts,l1 = 2
and te,l1 = 6, that is, with duration td,l1 = 4. The grey line l′1 represents the new l1 after rescaling the start time distribution by r = 0.5, with new start time ts,l′1 obtained by pulling the original start
time forward by a factor of r . In this rescaling of the start times, the duration of the edge remains
the same. Under this rule, the original intervals in panel b with time resolution Δ = 1 change under
the rescaling r = 0.5 to those in panel c. The concurrency of the intervals in the two panels are
Cr =1.0 = 3/6 = 0.5 and Cr =0.5 = 4/6 = 0.67. Further increases in r increase C here until C = 1
for r < 1/3

concurrency as the fraction of all edge pairs that overlap in time. This definition
enables us to more easily analyze the effect of concurrency on the reachability,
particularly in developing models for the effect as in Lee et al. (2019).
In our numerical experiments, we control the concurrency by rescaling the edge
start times without changing their durations. We identify the minimum start time ts,li
of each edge—that is, each pair of nodes that are ever in contact—where li indicates
the ith edge, i ∈ [0, 1, . . . , L − 1], and L is the total number of edges. (Connecting
the notation of this section to our data analysis, we note that L = Md .) For notational
convenience, we identify the very first start time ts,min = min ts,li among all edges
and define τi = ts,li − ts,min . We can then rescale the distribution of these start times
with the parameter r by ts,li = ts,min + r τi , as depicted in Fig. 7.2a. Meanwhile, we
maintain the duration of each edge with te,li = tsl i + td,i where td,i = te,i − ts,i is
the edge duration. For example, when we set r = 0.5, the interval representation
of the original timings depicted in Figs. 7.1b and 7.2b (corresponding to r = 1)
shift to the intervals in Fig. 7.2c. In particular, note that the edges in panel (c) overlap
each other more than the original edges in panel (b).
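The sketch below (function names are ours) implements both operations on the interval representation: concurrency as the fraction of edge pairs whose half-open intervals overlap, and the rescaling of start times by a factor r with durations held fixed. On the toy intervals of Figs. 7.1b and 7.2 (with Δ = 1) it reproduces the values quoted in the caption of Fig. 7.2.

```python
from itertools import combinations

def concurrency(intervals):
    """intervals: list of (t_s, t_e) half-open edge intervals.
    Fraction of all edge pairs that overlap in time."""
    pairs = list(combinations(intervals, 2))
    overlapping = sum(1 for (s1, e1), (s2, e2) in pairs if s1 < e2 and s2 < e1)
    return overlapping / len(pairs)

def rescale_start_times(intervals, r):
    """Pull start times towards the earliest one by a factor r, keeping durations."""
    t_min = min(s for s, _ in intervals)
    return [(t_min + r * (s - t_min), t_min + r * (s - t_min) + (e - s))
            for s, e in intervals]

intervals = [(1, 2), (4, 8), (2, 6), (3, 7)]             # toy example with Δ = 1
print(concurrency(intervals))                            # 3/6 = 0.5
print(concurrency(rescale_start_times(intervals, 0.5)))  # 4/6 ≈ 0.67
```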

7.3.4 Measuring Reachability

Given the interval representation of each edge, we can evaluate the average reach-
ability of the network corresponding to that representation. Consider the (different)
toy example in Fig. 7.3a. In general, all direct contacts such as those connecting (A,
B) yield accessible node pairs. Additional pairs like (A, C) and (D, C) in Fig. 7.3c are
accessible because of the temporal ordering of the edges. For example, an infection
starting from D at t = 1 can reach C by either moving first to A and then on to B,
or moving directly to B, and then having B infect C at a later time. However, an
infection seeded at C cannot ever reach A or D because of an absence of available
connections after the appearance of the (B, C) link at t = 4. That is, neither A nor
D is accessible from C. (We again emphasize our convention of open intervals on
the right, so in the example here the edge between B and D disappears immediately
before the edge between B and C starts.) We identify the accessible ordered node
pairs with the elements of the accessibility matrix R, with R(i, j) = 1 if and only if
j is accessible from i, as shown in Fig. 7.3d.
To quantify the overall average accessibility across the whole network, we calcu-
late the reachability R as the density of the accessibility graph (i.e., the density of
the off-diagonal elements of the accessibility matrix),

R = (1/(N(N − 1))) ∑_{i≠j} R(i, j) .                                (7.1)

[Fig. 7.3: a Temporal Network; b Aggregated Static Network; c Accessible Network; d Accessibility Matrix:
     A  B  C  D
  A  -  1  1  1
  B  1  -  1  1
  C  0  1  -  0
  D  1  1  1  -  ]

Fig. 7.3 Schematic representations for establishing accessibility. a Each edge in the network of
nodes {A, B, C, D} is denoted by a start and end time, e.g., the contact between A and D starts at
t = 1 and continues until (just before) t = 2. b The static network representation aggregates all
contacts that ever appear in the temporal network. c The corresponding directed graph of acces-
sibility demonstrates that asymmetric accessibilities (red arrows) are possible. d The accessibility
matrix encodes whether node j is accessible from node i

To calculate the accessibility matrix R and reachability, we follow the same steps
as in Lee et al. (2019), generating temporal layers corresponding to the moment
immediately before the end of each edge:
1. Sort edges by their end times te,lw . Here, w ∈ [0, L − 1] indexes edges by their
end time in ascending order and L is the total number of edges. (We again note
L = Md here.) For example, l0 is the edge with the earliest end time and l L−1 is the
last edge to end. (Breaking ties is unimportant for calculating reachability, except
it can be used to reduce the number of calculations here, under an appropriate
change of notation.)
2. Construct the adjacency matrix Tw for the wth temporal layer by including edge
lw and all other edges lw′ with w′ > w (that is, that end after the wth edge) that are also present just before the end time of the wth edge. That is, Tw includes lw and all lw′ satisfying both ts,lw′ < te,lw and te,lw′ ≥ te,lw .
3. By repeating step 2, the full set of temporal layer matrices T0 , T1 , . . . , TL−1 may be prepared.
4. Multiply the matrix exponentials of each temporal matrix: R = ∏_{w=0}^{L−1} e^{T_w} .
5. Binarize R: for all Rij > 0, set Rij = 1.
6. By using Eq. 7.1, evaluate the average reachability R.
The matrix exponentials in step 4 provide a simply-expressed formula to indicate
connected components within each temporal layer. Multiplying the matrix exponen-
tials from consecutive layers yields (after binarizing) the reachable network asso-
ciated with that set of layers. While the matrix exponential works conveniently for
small data sets, for larger networks a more computationally tractable procedure is
to instead directly calculate the connected components of Tw and replace the matrix
exponential in step 4 with the binary indicator matrix whose elements specify whether
the corresponding pair of nodes are together in the same component at that time. In
practice, steps 3 and 4 can be trivially combined to separately consider each temporal
layer in isolation from the others. That is, with this procedure the calculation can be
performed without forming and holding the entire multilayer representation at one
time (cf. breadth-first search on the full multilayer network). For even larger networks
whose adjacency matrices must be represented as sparse matrices in order to even fit
in memory, the corresponding accessibility graph could instead be constructed one
row at a time, updating the running average of the density R to calculate the overall
reachability.
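The sketch below (names and data layout are ours) follows this layer-by-layer logic with connected components instead of matrix exponentials, maintaining for each node the set of nodes it can reach; on the edge intervals of the toy example in Fig. 7.3a (as we read them) it reproduces the accessibility matrix of panel d.

```python
import networkx as nx

def reachability(intervals):
    """intervals: dict {(u, v): (t_s, t_e)}, the interval representation.
    Returns (accessible, R): accessible[i] is the set of nodes reachable from i,
    and R is the density of the accessibility graph, Eq. (7.1)."""
    nodes = {n for edge in intervals for n in edge}
    accessible = {n: {n} for n in nodes}          # every node trivially reaches itself

    edges = sorted(intervals.items(), key=lambda kv: kv[1][1])   # sort by end time
    for w, (_, (_, t_end)) in enumerate(edges):
        # temporal layer: edges that are still present just before t_end
        layer = nx.Graph([uv for uv, (s, _) in edges[w:] if s < t_end])
        for comp in nx.connected_components(layer):
            for i in nodes:
                if accessible[i] & comp:          # i already reaches into this component
                    accessible[i] |= comp         # so it reaches the whole component
    n = len(nodes)
    density = sum(len(s) - 1 for s in accessible.values()) / (n * (n - 1))
    return accessible, density

intervals = {("A", "B"): (0, 3), ("A", "D"): (1, 2), ("B", "D"): (2, 4), ("B", "C"): (4, 5)}
acc, R = reachability(intervals)
print(acc["D"], round(R, 3))   # D reaches A, B and C; R = 10/12 ≈ 0.833
```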
The above-described procedure for calculating reachability for the interval rep-
resentation can be used for the full temporal contact information with only minor
modification. Instead of sorting edges by their end times, the reachability due to
detailed temporal contacts proceeds by taking each possible contact interval as a
separate temporal layer. The adjacency matrix and its exponential (or component
indicator) are computed for each temporal layer, and these are multiplied together as in step 4, except that the index runs over all unique contact times.

7.3.5 Reachability with Concurrency

After transforming the temporal contact information into the interval representation
(as described in Sect. 7.3.2), we measure reachability versus concurrency in the
empirical network data sets. We use the parameter r to rescale the start times in
order to vary concurrency. In general, reachability increases with increasing concur-
rency (see Fig. 7.4) although the details of this relationship between reachability and
concurrency depend on the topology of the underlying network.
The High School and Conference examples display only a slight increase in reach-
ability with increasing concurrency for the simple reason that reachability is already
so high at zero concurrency. There are so many alternative paths between node pairs
in these two networks that almost all pairs have at least one temporally consistent
path, even for very small concurrency, and so reachability is almost always close to
1 in these cases (see the inset of Fig. 7.4). The interval representation of the original
edge timings—that is, before we rescale the distribution of start times with the r
parameter described above—corresponds to C = 0.25 (High School) and C = 0.4
(Conference).
In contrast, the DNC email network has larger concurrency in its original edge
timings (C = 0.6) but much smaller reachability, as seen in Fig. 7.4. Even as the start
time distribution is compressed (small r ), to make the concurrency approach 1, the
reachability only approaches 0.94 (not 1). This apparent discrepancy is because the

Fig. 7.4 Reachability of the empirical networks as a function of concurrency. Different levels of
concurrency have been obtained here by rescaling the distribution of start times in the original data
sets. The inset zooms in on the small deviations of reachability from 1 for the High School and
Conference examples. Note that the lines here are only to connect the data points; the lines do not
represent a functional relationship
data set includes separate connected components. That is, increasing concurrency all
the way to 1 reduces the question of accessibility to connected components in the
temporally-aggregated network, with reachability then equal to the fraction of node
pairs in the same connected component. In the temporally-aggregated DNC email
network, the largest connected component is of size 1, 833, with another component
of size 58.
Similarly, the relatively small value of reachability for the Brazil network as
C → 1 is because the largest connected component includes 5,193 (of the 8,056
total) nodes. At r = 1, the Brazil data has concurrency 0.0172. As such, we can see
that increasing the level of concurrency (that is, r < 1) can dramatically increase the
reachability for this network.
We note in particular the behavior of the High School and the Conference data
sets in having reachability values near 1 for all values of concurrency. We point the
interested reader back to Moody and Benton (2016) and Lee et al. (2019), where the
important role of structural cohesion in the temporally-aggregated graph is demon-
strated. We note that the structural cohesion calculated (White and Newman 2001)
for these two networks are 18.3 (High School) and 28.5 (Conference), quantifying
the large number of node-independent paths typically available in these networks. In
contrast, the structural cohesion of the DNC email network is 1.28, directly quanti-
fying that it is much more tree-like, and as such there are typically few (or in many
cases no) available detours between nodes. Similarly, the structural cohesion of the
Brazil network is 1.21. Given the particularly large values of structural cohesion
for the High School and Conference networks, reachability values near 1 are not
surprising, even as concurrency approaches zero.
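
To make the notion of structural cohesion concrete, the following sketch (a minimal illustration, not the code used for the analyses in this chapter) computes the average number of node-independent paths over all node pairs of a temporally-aggregated graph with NetworkX; the contact list is a small invented example.

    import networkx as nx

    # Toy temporally-aggregated graph: node pairs with at least one contact.
    # The contact list here is invented purely for illustration.
    contacts = [(0, 1), (1, 2), (2, 0), (0, 3), (3, 4), (1, 3), (2, 4)]
    G = nx.Graph()
    G.add_edges_from(contacts)

    # Average, over all node pairs, of the number of node-independent paths
    # between them (the quantity referred to above as structural cohesion).
    cohesion = nx.average_node_connectivity(G)
    print(f"average pairwise node connectivity: {cohesion:.2f}")

For networks of the size considered above, an approximation such as the one of White and Newman (2001) would be preferable, since the exact pairwise computation scales poorly with network size.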

7.3.6 Accuracy of Reachability from the Interval Representation

To further explore reachability and its dependence on the temporal details of the
contacts, we calculate reachability in the four empirical temporal networks, tracing
the change in reachability over time in the original data sets (i.e., without modifying
start times). Figure 7.5 demonstrates the different increasing trends of reachability
with time t across these networks, setting t = 0 in the figure at the appearance of the
very first contact. Figure 7.5 also visualizes this increase in reachability relative to
the number of edges m(t) that have appeared by that time (i.e., the number of distinct
node pairs that have had contact by that time). The figure includes calculations using
the original contact times as well as those from the interval representation wherein
each edge is assumed to be present for the full duration from its first appearance to
its last. We use subscripts to distinguish between the calculations using the distinct
temporal contacts (c) versus the interval representation (d, indicating each edge is
assumed to be present for its total duration).

Fig. 7.5 Temporal traces of reachability (R), the size of the largest component (S) and the nor-
malized edge count (m/M) in four empirical networks as calculated from the contacts (subscripted
with c, plotted as dashed lines) and the interval representation (subscripted with d, solid lines). In
many cases, the dashed lines are not distinguishable from the corresponding solid lines

In addition to reachability (Rd , Rc ), Fig. 7.5 includes the largest connected com-
ponent size (Sd , Sc ) and normalized edge count (m d /Md , m c /Mc ) as a function of
time, considering all edges that have appeared since the very first contact. For ease
of comparing different time scales, we re-plot these results for reachability and the
size of the largest connected component versus the edge densities in Fig. 7.6. As
observed in the figures, the differences between the calculated values based on full
contacts versus the interval representation are relatively small, and in many cases
barely distinguishable.
Of course, any error in computing the accessibility of an ordered node pair in
the interval representation can only overestimate reachability. That is, an ordered
node pair identified as accessible in the full contact representation is necessarily
also accessible in the interval representation. However, it is possible that particular
paths that appear to be temporally consistent in the interval representation do not

Fig. 7.6 Temporal traces of reachability (R) and the size of the largest component (S) in the four
empirical networks, previously plotted in Fig. 7.5 are re-plotted here versus the number of edges
that have appeared to that point in time. Once again, some of the dashed lines corresponding to
calculations with the contacts are indistinguishable from the solid lines obtained from the interval
representations

actually have an allowed set of distinct contacts. That said, because our reachability
calculation in the interval representation only computes results at the end times of
edges, a new edge that appears (i.e., the node pair has its first contact) at time t
does not get accounted for in the interval representation until the first end time that
occurs after t. (At that time, this new edge is accounted for, even if its end time
is much later.) By showing the results of both calculations, we demonstrate how
accurately the interval representation describes reachability in these examples, with
good agreement throughout Figs. 7.5 and 7.6.
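
As a sketch of the two calculations being compared, the functions below estimate reachability (the fraction of ordered node pairs connected by a time-respecting path) from a list of instantaneous contacts and from the interval representation. The toy data, function names, and the generic earliest-arrival formulation are ours and do not reproduce the exact end-time procedure described above; contacts are assumed undirected and instantaneous.

    from math import inf

    def reachability_from_contacts(contacts, nodes):
        """Fraction of ordered node pairs (i, j), i != j, such that j can be
        reached from i through a time-respecting chain of contacts.
        `contacts` is a list of (u, v, t) tuples; ties in time are handled
        in input order."""
        events = sorted(contacts, key=lambda c: c[2])
        reached = 0
        for src in nodes:
            arrival = {src: -inf}                    # earliest arrival time per node
            for u, v, t in events:
                for a, b in ((u, v), (v, u)):        # contacts treated as undirected
                    if a in arrival and arrival[a] <= t and t < arrival.get(b, inf):
                        arrival[b] = t
            reached += len(arrival) - 1              # exclude the source itself
        n = len(nodes)
        return reached / (n * (n - 1))

    def reachability_from_intervals(intervals, nodes):
        """Same quantity, but with each edge assumed active over the whole
        interval [s, e] between its first and last contact; `intervals` is a
        list of (u, v, s, e) tuples.  A generic earliest-arrival relaxation."""
        reached = 0
        for src in nodes:
            arrival, updated = {src: -inf}, True
            while updated:                           # relax until no improvement
                updated = False
                for u, v, s, e in intervals:
                    for a, b in ((u, v), (v, u)):
                        if a in arrival and arrival[a] <= e:
                            t_new = max(arrival[a], s)
                            if t_new < arrival.get(b, inf):
                                arrival[b], updated = t_new, True
            reached += len(arrival) - 1
        n = len(nodes)
        return reached / (n * (n - 1))

    # Toy data in which the interval representation overestimates reachability.
    nodes = [0, 1, 2, 3]
    contacts = [(0, 1, 2), (1, 2, 1), (1, 2, 10), (2, 3, 5)]
    intervals = [(0, 1, 2, 2), (1, 2, 1, 10), (2, 3, 5, 5)]
    print(reachability_from_contacts(contacts, nodes))    # 10/12 ≈ 0.83
    print(reachability_from_intervals(intervals, nodes))  # 11/12 ≈ 0.92

On this toy example the interval representation reports one extra reachable pair, illustrating the one-sided overestimation discussed in the text.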
In line with the very high structural cohesion of the High School and Confer-
ence networks, we observe very sharp increases in reachability at early times, with
reachability values only slightly behind the fraction of nodes in the largest connected
component. In contrast, we observe in the figure that the reachability of the DNC

email and Brazil networks increases more slowly with time, even after redisplay-
ing reachability versus normalized edge count. Remarkably, reachability calculated
from the interval representation deviates only slightly from the full calculation using
the complete temporal contact details. The most notable difference between the
two calculations apparent in Fig. 7.6 is in the High School data, with the interval
representation slightly overestimating reachability through its increase over time.
A smaller overestimate is also apparent in the panels for the Conference and DNC
email networks.
Considering the importance of reachability as the average of the maximum possi-
ble outbreak size (averaging over “patient zero” source nodes), these results provide
hope that reachability can be well estimated from the simpler interval representation
in most cases, even though the detailed dynamics of a spreading infection surely
varies between the true contacts and the interval representation.

7.4 Final Remarks

The details of edge timings in a temporal network can affect the speed and extent of
the spread of diffusive dynamics such as infections or information propagation on the
network. But because including temporal details greatly increases the complexity of
the system, there has been a much greater amount of study and successful modeling
of spreading processes on static networks. With ever greater emphasis on temporal
network data, focusing on the role of concurrency appears to be one productive way to
accurately summarize the population-level effects of the edge timing details. We here
collected references to some previous studies related to the impact of concurrency
on spreading processes, including in particular the relationship between concurrency
and the average reachability in the temporal network. We have further demonstrated
this relationship by calculation of reachability on empirical examples, rescaling the
start time distributions in the original edge timing data to consider different levels of
concurrency and reachability.
In so doing, we also compare the calculation of reachability on the full contact
information against that using a simplified interval representation that treats each
edge as present for the entire interval between the appearance of its first contact and
its last. We demonstrate with these examples that the level of reachability calculated in
the interval representation is nearly identical to that calculated with the full temporal
contact information. We note that this result is similar at least in spirit to the findings
of Holme and Liljeros (2015) where the detailed inter-event timings did not affect
the results in their model simulations as much as the start times and end times.
In terms of the temporal trace of reachability, the High School and Conference
networks show simultaneous increase of reachability with the size of the largest con-
nected component at early times. In contrast, the DNC email and Brazil networks
display a much slower increase in reachability, lagging behind the connected compo-
nent size, and the reachability in these networks remains relatively low. We confirm

with these empirical examples that the effect of concurrency can be quite large in
some networks, as seen for the DNC email and Brazil networks.
The importance of concurrency was first identified in the context of the spread of
HIV (Morris and Kretzschmar 1995). Conflicting observational works at the national
and individual levels have since raised questions about the value of concurrency in
the public health context (see, e.g., Lurie and Rosenthal (2010); Mah and Halperin
(2010); Morris et al. (2010); Epstein and Morris (2011)), but most of this work misun-
derstands the necessary relation between reachability and diffusion risk highlighted
here (and in, e.g., Moody and Benton (2016); Lee et al. (2019)). Whereas increased
concurrency increases temporal path accessibility, and this increased reachability
must increase diffusion potential, the amount of increase in reachability depends on
other network factors, as we have demonstrated. While we have no data to speak
directly to these questions about the value of concurrency in the public health con-
text, our results suggest that one contributing factor might be high variance in the
levels of structural cohesion in the underlying networks. As such, by analyzing
the extent of concurrency in a temporal network and its impact on reachability given
the structural properties of the underlying network, one might be able to better choose
between different intervention strategies to best mitigate the spread of an infectious
disease or enhance the extent of positive behaviors. We hope this chapter serves to
gather relevant previous studies and motivate future work.

Acknowledgements We thank Petter Holme and Jari Saramäki for the invitation to write this
chapter. Research reported in this publication was supported by the Eunice Kennedy Shriver National
Institute of Child Health & Human Development of the National Institutes of Health under Award
Number R01HD075712. Additional support was provided by the James S. McDonnell Foundation
21st Century Science Initiative—Complex Systems Scholar Award (grant #220020315) and by the
Army Research Office (MURI award W911NF-18-1-0244). The content is solely the responsibility
of the authors and does not necessarily represent the official views of any supporting agency.

References

B. Armbruster, L. Wang, M. Morris, Forward reachable sets: analytically derived properties of connected components for dynamic networks. Netw. Sci. 5(3), 328–354 (2017)
D.J. Daley, D.G. Kendall, Epidemics and rumours. Nature 204(4963), 1118 (1964)
I.A. Doherty, S. Shiboski, J.M. Ellen, A.A. Adimora, N.S. Padian, Sexual bridging socially and over time: a simulation model exploring the relative effects of mixing and concurrency on viral sexually transmitted infection transmission. Sex. Transm. Dis. 33(6), 368–373 (2006)
K.T.D. Eames, M.J. Keeling, Monogamous networks and the spread of sexually transmitted diseases.
Math. Biosci. 189(2), 115–130 (2004)
H. Epstein, M. Morris, Concurrent partnerships and HIV: an inconvenient truth. J. Int. AIDS Soc.
14(1), 13–13 (2011)
J. Fournet, A. Barrat, Contact patterns among high school students. PLoS One 9(9), 1–17 (2014)
T. Gernat, V.D. Rao, M. Middendorf, H. Dankowicz, N. Goldenfeld, G.E. Robinson, Automated
monitoring of behavior reveals bursty interaction patterns and rapid spreading dynamics in hon-
eybee social networks. Proc. Natl. Acad. Sci. USA 115(7), 1433–1438 (2018)
K. Gurski, K. Hoffman, Influence of concurrency, partner choice, and viral suppression on racial
disparity in the prevalence of HIV infected women. Math. Biosci. 282, 91–108 (2016)

P. Holme, Network reachability of real-world contact sequences. Phys. Rev. E 71, 046119 (2005)
P. Holme, F. Liljeros, Birth and death of links control disease spreading in empirical contact net-
works. Sci. Rep. 4(1), 4999 (2015)
P. Holme, J. Saramäki, Temporal networks. Phys. Rep. 519(3), 97–125 (2012)
L. Isella, J. Stehlé, A. Barrat, C. Cattuto, J.F. Pinton, W.V. den Broeck, What's in a crowd? Analysis
of face-to-face behavioral networks. J. Theor. Biol. 271(1), 166–180 (2011)
M. Karsai, M. Kivelä, R.K. Pan, K. Kaski, J. Kertész, A.L. Barabási, J. Saramäki, Small but slow
world: how network topology and burstiness slow down spreading. Phys. Rev. E 83, 025102
(2011)
M. Kretzschmar, M. Morris, Measures of concurrency in networks and the spread of infectious
disease. Math. Biosci. 133(2), 165–195 (1996)
E. Lee, S. Emmons, R. Gibson, J. Moody, P.J. Mucha, Concurrency and reachability in treelike
temporal networks. Phys. Rev. E 100, 062305 (2019)
H.H.K. Lentz, T. Selhorst, I.M. Sokolov, Unfolding accessibility provides a macroscopic approach
to temporal networks. Phys. Rev. Lett. 110, 118701 (2013)
M. Li, V.D. Rao, T. Gernat, H. Dankowicz, Lifetime-preserving reference models for characterizing
spreading dynamics on temporal networks. Sci. Rep. 8(1), 709 (2018)
M.N. Lurie, S. Rosenthal, The concurrency hypothesis in sub-Saharan Africa: convincing empirical
evidence is still lacking. Response to Mah and Halperin, Epstein, and Morris. AIDS Behav. 14(1),
34–37 (2010)
T.L. Mah, D.T. Halperin, The evidence for the role of concurrent partnerships in Africa’s HIV
epidemics: a response to Lurie and Rosenthal. AIDS Behav. 14(1), 25–28 (2010)
N. Masuda, R. Lambiotte, A guide to temporal networks. World Sci. (2016)
N. Masuda, K. Klemm, V.M. Eguíluz, Temporal networks: slowing down diffusion by long lasting
interactions. Phys. Rev. Lett. 111, 188701 (2013)
R.M. May, R.M. Anderson, Transmission dynamics of HIV infection. Nature 326, 137–142 (1987)
R.M. May, R.M. Anderson, The transmission dynamics of human immunodeficiency virus (HIV).
Philos. Trans. R. Soc. Lond. B 321, 565–607 (1988)
J.C. Miller, A.C. Slim, Saturation effects and the concurrency hypothesis: insights from an analytic
model. PLoS One 12(11), e0187938 (2017)
J. Moody, The importance of relationship timing for diffusion: indirect connectivity and STD
infection risk. Soc. Forces 81(1), 25–56 (2002)
J. Moody, R.A. Benton, Interdependent effects of cohesion and concurrency for epidemic potential.
Ann. Epidemiol. 26(4), 241–248 (2016)
J. Moody, D.R. White, Structural cohesion and embeddedness: a hierarchical concept of social
groups. Am. Sociol. Rev. 68(1), 103–127 (2003)
M. Morris, M. Kretzschmar, Concurrent partnerships and transmission dynamics in networks. Soc.
Netw. 17(3), 299–318 (1995)
M. Morris, H. Epstein, M. Wawer, Timing is everything: international variations in historical sexual
partnership concurrency and HIV prevalence. PLoS One 5(11), e14092 (2010)
T. Onaga, J.P. Gleeson, N. Masuda, Concurrency-induced transitions in epidemic dynamics on
temporal networks. Phys. Rev. Lett. 119, 108301 (2017)
L.E.C. Rocha, F. Liljeros, P. Holme, Information dynamics shape the sexual networks of internet-
mediated prostitution. Proc. Natl. Acad. Sci. 107(13), 5706–5711 (2010)
L.E.C. Rocha, F. Liljeros, P. Holme, Simulated epidemics in an empirical spatiotemporal network
of 50,185 sexual contacts. PLoS Comput. Biol. 7(3), 1–9 (2011)
A. Vazquez, B. Rácz, A. Lukács, A.L. Barabási, Impact of non-poissonian activity patterns on
spreading processes. Phys. Rev. Lett. 98, 158702 (2007)
C.H. Watts, R.M. May, The influence of concurrent partnerships on the dynamics of HIV/AIDS.
Math. Biosci. 108(1), 89–104 (1992)
D.R. White, M. Newman, Fast approximation algorithms for finding node-independent paths in
networks. SSRN Electron. J. (2001)
Chapter 8
Metrics for Temporal Text Networks

Davide Vega and Matteo Magnani

Abstract Human communication, either online or offline, is characterized by when information is shared from one actor to the other and by what specific information
is exchanged. Using text as a way to represent the exchanged information, we can
represent human communication systems with a temporal text network model where
actors and messages coexist in a dynamic multilayer network. In this model, actors
and messages are represented in separate layers, connected by inter-layer tempo-
ral edges representing the communication acts—who communicates what information, and when. In this chapter we revisit some measures specifically developed for tem-
poral networks, and extend them to the case of temporal text networks. In particular,
we focus on defining measures relevant for the analysis of information propagation,
including the concepts of walk, path, temporal precedence and path distance mea-
sures. We conclude by discussing how to use the proposed measures in practice by
conducting a comparative analysis in a sample communication network based on
Twitter mentions.

Keywords Human communication · Communication system · Information · Temporal text network · Multilayer · Text · Distance measure · Precedence · Path

8.1 Introduction

The concept of communication is fundamental to the study of modern and contemporary societies (Luhmann 1995), and it is particularly important in social network analysis:
many of the existing network-based models of social systems directly or indirectly
represent communication processes. For example, if we focus on temporal social
networks, empirical studies include conversations on social media (Magnani et al.
2012; Wang et al. 2021; Mathew et al. 2019), mobile telephone calls (Karsai et al. 2011), and face-to-face interactions (Stehlé et al. 2011; Sapiezynski et al. 2019).
Even when we consider static models of social networks, such as friendship graphs
without any associated temporal information, many of the metrics used to analyze
them are still based on the assumption that some information is shared through the
network. For example, we can measure the ability of actors or groups of actors to
efficiently spread information (closeness, diameter, Page-Rank centrality), or we can
identify actors with the ability to influence existing information flows (betweenness
centrality). In summary, the most typical application of social network models and
in particular temporal social networks is to study systems of communication.
Despite the central role of information in communication systems, the information
exchanged through the social ties has often been neglected in network analysis. The
most popular methods for the analysis of social networks are defined on simple graph
models, only including actors and their relationships, and hence temporal network
analysis methods often only rely on the additional availability of time annotations.
Studies on information diffusion processes often acknowledge the importance of
considering the actors propagating the information, the times of the propagation,
and the content. In practice, however, the content (e.g., text posted online) is used
only to define how actors are connected with each other based on, for example, the
order of links between blog posts (Leskovec et al. 2007) and private messaging (Cae-
tano et al. 2019), who re-shared the same content in social media (Gomez Rodriguez
et al. 2010) or how actors interact with messages shared across multiple social net-
works (Salehi et al. 2015; Roth and Cointet 2010a). Tamine et al. (2016) use the
concept of polyadic conversation, a model where chains of Twitter user interactions
(replies, mentions and retweets) during a time interval are first grouped into conver-
sation trees, and then aggregated into a static weighted graph of interactions between
authors. This type of graph aggregation has recurrently appeared in the literature
of network modeling (Aragón et al. 2017) and information retrieval (Magnani et al.
2012), but there is no consensus on either what is the best method to build such
model (e.g., how to compute the length of a conversation in terms of time and/or
tree’s depth) or how the textual content affects the grouping of actors. These are
important limitations, because studying communication networks without consider-
ing what is communicated can only allow a partial understanding of the underlying
social system (Deri et al. 2018).
This chapter is based on a model for temporal text networks designed to enable a
more accurate representation of human communication (Vega and Magnani 2018).
Temporal text networks describe communication events among actors, including the
actors exchanging information, the textual representation of this information and the
times when the communication happens. While this model is still limited to textual
information, text is a very common way to communicate (for example by email, or
via social media posts) and can also be used to represent other forms of expression,
for instance oral communication that can be translated to text either manually or
semi-automatically through speech-to-text algorithms, and also images, that can be
turned into a set of keywords describing them (Vadicamo et al. 2017; Magnani and
Segerberg 2021).

While mathematically temporal text networks can be seen as extensions of temporal networks, which are themselves extensions of simple networks, there are two
important differences that require the introduction of specific analysis methods. The
most intuitive difference is of course the presence of text. An additional and more
subtle difference lies in the semantics of the temporal annotations.
In the literature on temporal social networks the time on edges is typically used
to indicate when an edge exists, e.g., that during that time the two actors are in
contact and can exchange information. An implicit assumption in existing works is
that information can be exchanged at any time when an edge is active, and that the
exchange of information is instantaneous.
When we explicitly model communication networks, we should distinguish
between edges representing the possibility of communicating and edges representing
the actual production and consumption of information. In many cases the first type
of edges exist between all actors; for example, we can always send an email to an
existing email address. Therefore, in this chapter we focus on edges representing
communication acts, that is, the actual exchange of information. These acts may
have a non-negligible duration, therefore the time annotations in a temporal text
network indicate when the transmission of a (text) message starts and when it finishes.
Examples where this is important are messages exchanged through networks where
the communication channel has a physical delay, and asynchronous communication
such as by email and via social media, where the text is sent at some time but in
general only received at a later time.
This different semantics of the temporal edges in temporal networks and in tem-
poral text networks requires the re-definition of some central concepts, such as time-
consistent paths, which in turn leads to the definition of new specific metrics.
Finally, it is worth mentioning that Natural Language Processing (NLP) meth-
ods such as sentiment analysis (O’Connor et al. 2023; Dodds and Danforth 2010)
have been used in the past to study the evolution of tweets, songs, blogs, and presidential speeches, without requiring information about the underlying communication
structure (who exchanges these data sources and how), using only data from time-
annotated documents and time series information (Lavrenko et al. 2000). The tempo-
ral text network model does not only allow researchers to use NLP methods during
the analysis, but it provides specific metrics to combine them with other measures
from temporal networks.
This chapter introduces the concept of path in temporal text networks and various
metrics to characterize them. In Sect. 8.2 we introduce the temporal text network
model to encode communication networks. In Sect. 8.3 we introduce the concepts
of walk and path in temporal text networks, and in Sect. 8.4 we define alternative
ways of summarizing a path, based either on the times when the communication acts
happen or on the text exchanged through a path. Finally, in Sect. 8.5 we conclude
with an empirical comparison of some of the measures introduced in this chapter in
a sample network formed by the Twitter interactions between Swedish politicians.

8.2 Representing Temporal Text Networks

From a mathematical point of view, a temporal text network (Vega and Magnani
2018) can be represented as a triple (G, x, t) where G = (A, M, E) is a directed
bipartite graph representing the communication network, x : M → X is a mapping
between the messages in M and a set of sequences of characters (text) in X and t :
E → T represents the time associated to each edge, where T is an ordered set of time
annotations. Edge directionality indicates the flow of the communication: (ai , m k ) ∈
E indicates that actor ai has produced text m k , while (m k , a j ) ∈ E indicates that actor
a j is the recipient of message m k . Actors with out-degree larger than 0 are information
producers, actors with in-degree greater than 0 are information consumers, and actors
with both positive in- and out-degrees are information prosumers. For the sake of
readability, we will sometimes use a compact notation, e.g., (a, m, t, x) to indicate
an edge (a, m) ∈ E where t (a, m) = t and x(m) = x.
Figure 8.1 describes a working example we will use during the remainder of this
chapter, representing a temporal text network with |A| = 8 actors, |M| = 6 messages
and |E| = 15 edges. It is important to observe that, in most cases, the edges to/from
a message have different time attributes; the only restriction imposed by the model
is that (ai , m), (m, a j ) ∈ E ⇒ t (ai , m) ≤ t (m, a j ). In other words, a message can
be consumed at different times by each actor (e.g., different social media users can
check their notifications at different times), but can never be received before it has
been generated (e.g., a user cannot access information that has not been shared yet).
This simple model can be used to differentiate between so-called unicast (mes-
sages m 2 and m 3 in the figure) and multicast (messages m 1 , m 4 and m 5 ) communica-
tion. The model can also be used to represent a variety of communication platforms
such as email and Twitter mention networks, and can be easily extended adding edges
between actors or between messages to represent additional relationships such as a
follower/followee network. Unless we explicitly mention it, in the remainder of the
chapter we will ignore these extensions.
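
As an illustration of how the triple (G, x, t) can be stored in practice, the sketch below (toy data and naming are ours, not part of the model specification) keeps producer and consumer edges in a single list and checks the constraint that a message is never consumed before it is produced:

    from dataclasses import dataclass

    @dataclass(frozen=True)
    class Edge:
        src: str      # an actor id (producer edge) or message id (consumer edge)
        dst: str      # a message id (producer edge) or actor id (consumer edge)
        t: float      # time annotation of the edge

    # A temporal text network (G, x, t): actors A, messages M, edges E,
    # the text mapping x, and the times stored on the edges themselves.
    actors = {"a1", "a2", "a3"}
    messages = {"m1", "m2"}
    text = {"m1": "budget vote tomorrow", "m2": "thanks, see you there"}  # x: M -> X

    edges = [
        Edge("a1", "m1", 1.0),   # a1 produces m1 at t=1
        Edge("m1", "a2", 2.0),   # a2 consumes m1 at t=2 (multicast is possible)
        Edge("m1", "a3", 3.0),
        Edge("a2", "m2", 4.0),   # a2 replies with m2
        Edge("m2", "a1", 5.0),
    ]

    # Model constraint: a message can never be consumed before it is produced.
    produced_at = {e.dst: e.t for e in edges if e.src in actors}
    assert all(e.t >= produced_at[e.src] for e in edges if e.src in messages)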
A similar model to represent temporal interactions is the contact sequence (Holme
and Saramäki 2012; Gauvin et al. 2013) model, which expresses temporal networks as

Fig. 8.1 A temporal text network model. Circles represent actors, squares represent messages
and the edges between them represent the production and reception of the messages by the actors.
Edges are also annotated with a time attribute ti ∈ T

a set of directed edges (called contacts) during a finite span of time. While this model
has been successfully used to study spreading processes of information (Lambiotte
et al. 2013; Cheng et al. 2016; Caetano et al. 2019) or the structural evolution of
social networks (Paranjape et al. 2017; Viard et al. 2016; Kim and Diesner 2017),
the model ignores the role of the content of the messages.
A natural alternative to represent time in networks is to use a sequence of time-
annotated graphs, forming a multilayer network (Dickison et al. 2016; Kivelä et al.
2014). In time-sliced models (Mucha and Porter 2010), for example, each one of the
aggregated networks represents a fixed interval of time, and an edge ei j exists in a slice
if at least one contact has been registered between nodes i and j in the corresponding
time interval. The aggregated graphs are sometimes weighted, in which case the edges
have an assigned weight attribute wi j proportional to the number of original edges,
their frequency or another relevant time summarization function. In longitudinal
networks, instead, the relations between the same or similar actors are detected at
different points of time (Snijders 2005, 2014). From the modeling point of view there
is not much difference between the two models, apart from the fact that in time-sliced
networks the time intervals of two adjacent slices are typically contiguous, which is
not necessarily true for longitudinal networks.

8.3 Path-Based Metrics

Metrics for simple networks are based on basic concepts in graph theory, such as
adjacency and incidence, and on counting discrete objects such as edges. Temporal
networks extend simple networks with time. This requires the extension of some
basic concepts in graph theory, and as time is often represented as a real number or
interval, then temporal measures also require some additional simple arithmetical
operations, such as time difference.
Temporal text networks also contain a text attribute. Text is a much more complex
type of data, with a large number of possible operations. For example, the comparison
of two texts can be done using different models (edit distance, word overlapping,
vector representation, etc.), applying different preprocessing operators (stemming,
stop word removal, dictionary based word replacement) or mapping the text to other
domains (for example sentiments or topics). While these choices are very important
in practice, hard-coding all these details in the metrics would make the model very
complex.
Therefore, as discussed by Vega and Magnani (2018), when dealing with temporal
text networks we assume to have at least one of the following two types of text
functions. The first type is used in a so-called continuous analysis approach, based
on the idea of having different grades of similarity between messages. In this case we
assume to have a distance function d : M × M → [0, ∞), indicating how similar
two messages are; if d = 0, the two messages are considered indistinguishable (for
example because they contain the same text), and higher values of d indicate that the
two messages are less similar. Notice that one can then plug specific functions into

the model based on the text operations described above. An example of a message
distance function is the cosine of the angle between vector representations of the two
texts.
The second type of functions is used in a so-called discrete analysis approach,
where each message is assigned to 0, 1 or more classes. For each class i we have
a function ci : M → {0, 1}, which returns 1 if the message belongs to class i, 0
otherwise. One example is a topic modelling function with k topics, where ci (m) = 1
if m belongs to topic i. Notice that starting from a discretization function we can also
define a text distance function, for example based on how many common topics are
shared between the two input messages.
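
The sketch below shows one possible pair of such text functions, assuming a very crude bag-of-words preprocessing: a cosine-based distance d for the continuous approach and a keyword-membership class function c_i for the discrete approach. Both are illustrative stand-ins, not the functions used later in the chapter.

    import math
    from collections import Counter

    def cosine_distance(text_a, text_b):
        """d: M x M -> [0, inf); here 1 - cosine similarity of bag-of-words
        vectors, so identical texts get distance 0."""
        va, vb = Counter(text_a.lower().split()), Counter(text_b.lower().split())
        dot = sum(va[w] * vb[w] for w in set(va) & set(vb))
        norm = math.sqrt(sum(c * c for c in va.values())) * \
               math.sqrt(sum(c * c for c in vb.values()))
        return 1.0 if norm == 0 else 1.0 - dot / norm

    def make_class_function(keywords):
        """c_i: M -> {0, 1}; a message belongs to class i if it mentions any
        of the class keywords (a crude stand-in for topic modelling)."""
        keywords = {k.lower() for k in keywords}
        return lambda text: int(bool(keywords & set(text.lower().split())))

    c_budget = make_class_function({"budget", "tax"})
    print(cosine_distance("budget vote tomorrow", "tomorrow we vote the budget"))
    print(c_budget("budget vote tomorrow"))   # -> 1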

8.3.1 Incidence and Adjacency

In digraphs two vertices are adjacent if there is an edge between them, and two edges
are adjacent if the tail of the first is the head of the second. In temporal text networks
two vertices are adjacent at time t if there is an edge between them at that time.
The concept of adjacency has also been extended to edges (also known as events or
contacts): an edge entering a vertex is adjacent to an edge leaving the same vertex
at a later time. This enables the definition of Δt-adjacency between edges, which
is satisfied when they are adjacent and the time between them is less than or equal to
Δt. Note that this terminology is not completely consistent with the one in digraphs,
where only vertices can be adjacent.
Temporal text networks differ from the previous cases in two regards. First, we do
not need to extend the concept of adjacency to edges: we have two types of vertices
(actors and messages), so for example the concept of adjacency between edges in
temporal networks corresponds to adjacency between messages. This also means
that we can retain the concept of incident edges from the theory of digraphs. Second,
the idea of filtering those pairs of vertices that are close enough in time can also
be extended to actors. In summary, all the concepts discussed above can be reduced
to the following definitions, where vi , v j can be either actors or messages, with u k
being a node of the other type.
Definition 1 (Edge incidence) Let e1 = (vi, uk, t1) and e2 = (uk, vj, t2) be two edges in a temporal text network. We say that e1 is incident to e2 if t1 ≤ t2.

Definition 2 (Adjacency) Let e1 = (vi, uk, t1) and e2 = (uk, vj, t2) be two edges in a temporal text network. Then:
1. vi is adjacent to uk at time t1.
2. vi is Δt-temporally adjacent to vj if t2 − t1 ≤ Δt.
3. vi is Δx-textually adjacent to vj if vi, vj ∈ M and d(vi, vj) ≤ Δx.

Notice that the definitions of incidence and adjacency hold independently of the
type of vertices (vi , u k and v j ) involved. If vi , v j ∈ A are actors, then their temporal

adjacency is defined by the delay between the production and consumption of the
message u k ∈ M. We call an edge from an actor a to a message m a producer edge
(e p ), while an edge from a message m to an actor a is called a consumer edge (ec ).
If vi , v j ∈ M are messages, then their temporal adjacency is defined by the delay
between when the intermediate actor consumes (e.g., receives) the first message and
the time when it produces (e.g., sends) the second. For example, the producer edge
e4 = (al , m 4 ) in Fig. 8.1 is incident to the consumer edge e10 = (m 4 , an ), therefore
actor al is Δt-adjacent to actor an for all Δt ≥ t9 − t4 .
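
A direct transcription of Definitions 1 and 2 into code might look as follows, with edges stored as (source, target, time) triples; the example edge times loosely mirror the Fig. 8.1 discussion but are otherwise arbitrary.

    def incident(e1, e2):
        """Definition 1: e1 = (v_i, u_k, t1) is incident to e2 = (u_k, v_j, t2)
        if they share the middle vertex and t1 <= t2."""
        (_, u1, t1), (u2, _, t2) = e1, e2
        return u1 == u2 and t1 <= t2

    def dt_adjacent(e1, e2, dt):
        """Definition 2.2: v_i is Δt-temporally adjacent to v_j if the two
        incident edges are at most dt apart in time."""
        return incident(e1, e2) and (e2[2] - e1[2]) <= dt

    # Producer edge (a_l, m_4) at t=4 and consumer edge (m_4, a_n) at t=9;
    # the times are illustrative, not read off the actual figure.
    e_p, e_c = ("a_l", "m_4", 4), ("m_4", "a_n", 9)
    print(dt_adjacent(e_p, e_c, dt=5))   # True: a_l is Δt-adjacent to a_n for Δt >= 5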

8.3.2 Walks and Paths

Definition 3 (Walk) A walk in a temporal text network (also called a temporal walk)
is a sequence of edges e1 , e2 , . . . , el where ei is incident to ei+1 for all i from 1 to
l − 1.
In the following we will write a ∈ w to indicate that a vertex (actor or message)
is present in walk w.
Notice that the definition above does not constrain the starting and ending vertices
of a path to be actors or messages. However, we will often be interested in walks
starting from an actor, because every message has a single producer in the model
used in this chapter.
Definition 4 (Path) A path in a temporal text network (also called a temporal path)
is a walk where no vertex (message or actor) is traversed twice.
Each path establishes a precedence relation between actors indicating that the
network allows a flow of information between them. Similarly, we have a precedence
relation between messages indicating that the two messages can be part of the same
flow of information.
Definition 5 (Temporal precedence) An actor ai temporally precedes another actor
a j if there is a path from ai to a j . A message m i temporally precedes another message
m j if there is a path from m i to m j .
Figure 8.2 represents the temporal text network of Fig. 8.1 as a temporal sequence
of edges between actors and messages. In this example, w1 = [e4 , e7 , e8 , e9 ] and
w2 = [e4 , e10 , e11 , e12 , e14 ] are two walks of 4 and 5 edges.1 The second walk is also
a path, starting at an actor and ending in a message m 6 , but the first walk is not a path
because its last edge e9 = (m2, al, t9) visits the actor al for a second time. Finally, notice that in this example al precedes actor ak in path p1 = [e4, e7] and vice versa in path p2 = [e8, e9], while m3 precedes m6 but not the other way around.
In some cases we may want to consider only those paths with a limited delay
and with a limited textual difference between adjacent messages. We can thus use

1 To simplify the notation, in this chapter we are assuming that i ≤ j ⇒ ti ≤ t j .



Fig. 8.2 Temporal text network represented as a sequence of edges. The horizontal lines repre-
sent the actors (gray color) and messages (green color) and vertical lines represent the transmission
or consumption of a message. The shaded lines indicate all existing paths beginning at actor al at
the exact time t = 4

the definitions of Δ-adjacency introduced above to select specific paths where suf-
ficiently similar messages are exchanged often enough with respect to some user-
defined thresholds.
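
A simple way to enumerate such paths is a depth-first search over incident edges that respects time ordering and never revisits a vertex. The sketch below (our own minimal implementation, with an optional Δt threshold and invented toy data) also derives the temporal precedence relation of Definition 5.

    from collections import defaultdict

    def temporal_paths(edges, start, dt=float("inf")):
        """Enumerate temporal paths (Definitions 3-4) starting at `start`:
        sequences of incident edges (shared middle vertex, non-decreasing
        times) in which no vertex is traversed twice.  `edges` is a list of
        (src, dst, t) tuples; `dt` optionally bounds the delay between
        consecutive edges (Δt-adjacency)."""
        out = defaultdict(list)
        for e in edges:
            out[e[0]].append(e)

        paths = []

        def grow(path, visited):
            last = path[-1]
            for e in out[last[1]]:
                if last[2] <= e[2] <= last[2] + dt and e[1] not in visited:
                    new_path = path + [e]
                    paths.append(new_path)
                    grow(new_path, visited | {e[1]})

        for e in out[start]:
            paths.append([e])
            grow([e], {start, e[1]})
        return paths

    def precedes(edges, a_i, a_j):
        """Temporal precedence (Definition 5): a_i precedes a_j if some
        temporal path leads from a_i to a_j."""
        return any(p[-1][1] == a_j for p in temporal_paths(edges, a_i))

    # Toy data: a1 writes m1, a2 reads it and replies with m2, read by a3.
    E = [("a1", "m1", 1), ("m1", "a2", 2), ("a2", "m2", 3), ("m2", "a3", 4)]
    print(precedes(E, "a1", "a3"))   # True
    print(precedes(E, "a3", "a1"))   # False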

8.4 Path Lengths

From now on we will focus on paths starting at an actor and ending at an actor.
While a path can also start or end at a message, paths from and to actors are the
ones providing the most accurate description of an information flow, because for
every message there must always be an actor producing it, and messages that are
not consumed by anyone (as message m 6 in our example) do not correspond to any
exchange of information.

The length of a path in a temporal text network can be defined based on topology,
on time, and on text.
The topological length is an unambiguous measure in simple and temporal net-
works, which are only made of vertices and edges. In a temporal text network a path
contains actors, edges and messages, and the definition of length that is compatible
with the one used in temporal networks corresponds to the number of messages in
the path. This is because when a temporal network is translated into a temporal text
network every edge is transformed into a message.
The temporal length, instead, defines the overall duration of the communication
and is computed as the difference between the time of the last consumer edge and
the time of the first producer edge in the path.
The topological and temporal length measures we have just described can be
used to characterize the several paths that traverse our graph. In Fig. 8.2 we have
highlighted all the existing paths starting at actor al at exactly t = 4, including those
ending in a message. For example if we compare the path p1 = [e4 , e10 , e11 , e13 ] with
the path p2 = [e4 , e10 , e11 , e15 ] we can see that both have the same topological length
of 2 messages. However, while both paths start at the same time e4 = (al , m 4 , t4 ),
the time of the last consumer edge is different and so their temporal length: t (e13 ) =
t (m 5 , a p , t10 ) ≤ t (e15 ) = t (m 5 , am , t15 ).
Interestingly, in temporal text networks the temporal length of a path measures
two different types of delays. On the one hand it measures the transmission time (δt)
as the difference between the time of the consumer edge t (ec ) and the time when the
content has been produced t (e p ). On the other hand, it indicates the idle time (τ ) of
the actors involved in the communication between two consecutive edges.
Definition 6 (Transmission time) Let e1 = (ai, m, t1) and e2 = (m, aj, t2) be two incident edges, with m ∈ M. Then the quantity t2 − t1 is called transmission time.

Definition 7 (Idle time) Let e1 = (mi, a, t1) and e2 = (a, mj, t2) be two incident edges, with a ∈ A. Then the quantity t2 − t1 is called idle time.

Once one has defined transmission and idle times, one can also compute the sum
of all transmission times in a path, the sum of all idle times in a path, and the ratio
between these values and the temporal length of the path. Back to our previous
example, we can observe that the total transmission time of the messages in the first
path δ1 = (t9 − t4) + (t10 − t9) = 6 is three units smaller than in the second path
δ2 = (t9 − t4) + (t13 − t9) = 9, while their idle time is the same, τ = t9 − t9 = 0;
this explains why the first path has a smaller temporal length.
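
The sketch below computes the length measures discussed so far for a path given as a list of (source, target, time) edges; distinguishing producer from consumer edges by whether the source is an actor is our own convention, and the toy path is invented rather than taken from Fig. 8.2.

    def path_lengths(path, actors):
        """Length measures for a temporal path given as a list of (src, dst, t)
        edges that starts with a producer edge and ends with a consumer edge.
        `actors` is the set of actor identifiers, used to tell producer edges
        (actor -> message) from consumer edges (message -> actor)."""
        topological = sum(1 for src, _, _ in path if src in actors)   # messages in the path
        temporal = path[-1][2] - path[0][2]
        transmission = sum(e2[2] - e1[2] for e1, e2 in zip(path, path[1:])
                           if e1[0] in actors)        # delay across a message
        idle = sum(e2[2] - e1[2] for e1, e2 in zip(path, path[1:])
                   if e1[0] not in actors)            # delay at an intermediate actor
        return topological, temporal, transmission, idle

    A = {"a1", "a2", "a3"}
    p = [("a1", "m1", 1), ("m1", "a2", 4), ("a2", "m2", 6), ("m2", "a3", 9)]
    print(path_lengths(p, A))   # (2, 8, 6, 2): 2 messages, temporal length 8,
                                # transmission 3+3, idle 2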
The last type of length concerns the textual content in the path. Every time a
message is exchanged, this increases the temporal length by the corresponding amount
of time. Similarly, every time a new text is included in the path, this increases the
textual information in it.
Definition 8 (Textual length) Given a text distance function, the textual length of a
communication path is defined as the sum of the distances between the texts of all
pairs of adjacent messages in the path.

This definition quantifies the variations between adjacent messages. At the same
time, it is possible that the texts of the messages keep being updated when transmitted
through the path, but never significantly deviate from the original message. In this
case, an alternative definition of length can be used to compute the maximum distance
between any pair of messages.
In the case of discrete text analysis, where each message can belong to some
classes (for example topics), this idea of estimating how homogeneous the text is
across the path can be computed using a classical measure of entropy, for example
the Shannon index:
Definition 9 (Entropy) Let c1, . . . , cn be text discretization functions mapping text into one of n classes. Given a path p, we define

$$\rho_i(p) = \frac{\sum_{m \in p} c_i(x(m))}{M_p},$$

where M_p is the number of messages in p. The textual entropy of path p is then defined as:

$$H(p) = -\sum_{i=1}^{n} \rho_i(p) \ln \rho_i(p) \qquad (8.1)$$

According to this definition, if all messages that are part of a path belong to the
same class (e.g., to the same topic), then the textual entropy will be 0, indicating a
homogeneous path when we look at its text. Higher values of entropy would indicate
that multiple classes (e.g., topics) are included in the path. This information can
be useful in various analysis tasks, including the identification of information flows
(when the same textual content is transferred through the network) or community
detection, where one wants a community to be homogeneous not only with respect
to the topology but also with the exchanged messages.
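
A direct implementation of Eq. 8.1 could look as follows, reusing simple keyword-based class functions as hypothetical topics; the convention 0 ln 0 = 0 is made explicit, and the example texts are invented.

    import math

    def textual_entropy(path_messages, class_functions):
        """Textual entropy of a path (Eq. 8.1): `path_messages` is the list of
        message texts along the path, `class_functions` a list of c_i functions
        mapping a text to 0/1 membership in class i."""
        m_p = len(path_messages)
        h = 0.0
        for c in class_functions:
            rho = sum(c(text) for text in path_messages) / m_p
            if rho > 0:                     # by convention 0 * ln 0 = 0
                h -= rho * math.log(rho)
        return h

    # Two toy classes defined by keyword membership (hypothetical topics).
    classes = [lambda t: int("budget" in t.lower()),
               lambda t: int("climate" in t.lower())]
    print(textual_entropy(["budget vote", "budget amendment"], classes))   # 0.0
    print(textual_entropy(["budget vote", "climate bill"], classes))       # > 0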
Once we decide which definition of length to use, this defines what the shortest
paths between any pair of actors are, which implies that we can compute all the
existing network measures based on shortest paths, including closeness centrality,
betweenness centrality, eccentricity, diameter, etc. For the definitions of these metrics
we refer the reader to any basic book on network analysis.

8.5 Empirical Study

In this section, we show an empirical comparison of the measures introduced in this chapter in a real communication network. Our sample dataset consists of all the
public Twitter mentions (messages including another Twitter @username) written
by Swedish politicians during January, 2019. The period of observation takes place
four months after the Swedish general elections in 2018, and includes the time when
the new government coalition was formed.2 Our final network consists of |A| = 886
actors, including 26 politicians (8 information producers and 18 prosumers) and 860

2 We considered only politicians who were either members of the parliament before the elections
or were part of an electoral ballot.

Fig. 8.3 Temporal length. Summary of the temporal length distribution for all shortest paths found in the Swedish politicians network, grouped by their topological path length. All topological paths involve an even number of hops because we are measuring only pairs of reachable actors

mentioned users (all of them consumers), |M| = 1,707 Twitter messages with their corresponding text and |E| = 4,882 edges between actors and messages. Modelling the reception time is more difficult, because many social media platforms like Twitter do not provide information about when and by whom a piece of information was consumed.
In our experiments we assumed that the consumption time of all messages is the
same as the production time, which is not true in general (e.g., users are not always
connected to all their social media and, even if they are, the tweet might be lost in
the myriad of information provided by the user’s wall).
Figure 8.3 shows, for each one of the 6,773 temporally reachable pairs of actors, a
comparison of their topological and temporal shortest path length. It includes 5,787
(85.4%) paths with only two edges, representing two Δ0-textually and temporally
adjacent actors who have been in direct communication. The average temporal path
length of the remaining paths increases with the number of hops (topological length)
while its statistical dispersion is reduced, as we usually observe in other types of tem-
poral networks (e.g., contact networks). For example, the 56 pairs of actors connected
through 3 messages (6 hops) have an average communication time (shortest temporal
length) of approximately 14 days. The order of magnitude of these numbers can be
explained by the skewed distribution of roles (producer, consumer and prosumer) of
the actors in the data and the small sample of the original social network.
Another important component to understand communication networks is the spe-
cific content their members intend to share with each other. For example, in a con-
versation within a group of close people the content (text) of the messages will probably differ between communications, while news spreading processes will
probably have a more similar topic distribution. The consistency of the topics in an
information cascade, therefore, can be a good metric to describe the dynamics of a
complex system.

Fig. 8.4 Empirical cumulative distribution function (ECDF) of the textual length

Using the concepts described in Sect. 8.4, we have first identified the topics of the
messages exchanged in our sample network and then computed the textual length
using the Shannon index described in Eq. 8.1 to identify the shortest paths of each
pair of temporally reachable actors. While identifying the topics, we have used the
hashtags as proxies, which is a simple and sometimes acceptable solution but, as we will see, one that is problematic in practice. As we mentioned in the previous section, the
definition of textual length assumes that there is a discretization function mapping the
text into at least one topic. Hence, because many tweets do not contain any hashtag,
their topic assignment is empty.
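
A minimal version of such a hashtag-based discretization, which is only a rough stand-in for the pipeline used in the study, could look as follows:

    import re

    def hashtags(tweet_text):
        """Topic proxy: the set of hashtags in a tweet (lower-cased); tweets
        without hashtags get an empty topic set."""
        return {tag.lower() for tag in re.findall(r"#(\w+)", tweet_text)}

    print(hashtags("New proposal on #climate and #budget2019 today"))
    # -> {'climate', 'budget2019'}
    print(hashtags("Thanks for the support!"))   # -> set(): undefined topic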
Figure 8.4 shows the empirical cumulative distribution function (ECDF) of the
textual shortest path in our sample network. In this particular example, only 420
observations of 6,773 were computed, as many paths have an unidentified length,
either because none of the messages have a topic assigned or because they contain
only one message.
We can observe that more than 75% of the textual paths computed have 0 entropy,
indicating that there is one single topic in the messages of the path. A closer look
does not indicate any correlation of these results with the topological or temporal
length of the paths. The minimum textual length paths include, for example, all the
paths with 5 messages (10 hops) and 85.45% of the paths with 4 messages, but less
than 50% of the paths with 3 messages.

8.6 Final Remarks

In this chapter, we have revisited some of the fundamental graph measures for tem-
poral networks and extended them to be compatible with the temporal text network
model for communication systems. We have shown that using the proposed model
we can directly represent, in a simple but extensible way, all the elements necessary
to study communication (time, text and topology), without requiring complex graph
transformations. While mathematically temporal text networks are not much differ-
ent from time-varying graphs, the semantics of their interactions and the presence of
textual information in the model require the introduction of specific analysis methods.
In particular, in this chapter we have focused on redefining the idea of connectivity
and most of its related measures such as incidence, adjacency, paths and distance,
providing alternative metrics for actors and messages when we found it was relevant
and necessary. Finally, we have shown how the different distance measures can be
used in practice to discover patterns of connectivity.
Temporal text networks can be seen as extensions of simpler network models,
motivated by the need to capture more information that is not easily represented in
the form of simple relationships between entities. This is just one example of a larger
trend in network science, that has been present for a long time in social network
analysis where there is often a clear need to obtain a non-binary understanding of
social relations. Interdonato et al. review different types of features (beyond time
and text) that have been considered in the literature (Interdonato et al. 2019), and
temporal text networks are just one of the many approaches using complex network
concepts to study text data (Oliva et al. 2021). Some of the approaches to study text
using networks focus on the modelling of concepts or topics, that can be related to
the discrete classes computed on temporal text networks to reduce the complexity of
the individual text messages (Taskin et al. 2020; Camilleri and Miah 2021). These
approaches are of particular relevance for temporal text network analysis when the
concepts or topics can be related to actors (St-Onge et al. 2022) and the analysis has
a temporal perspective (Roth and Cointet 2010a).
In Sect. 8.5 we have provided a simple empirical application of the presented
measures, with the aim of exemplifying them. In the literature, networks including
text and time have been used to study online political communication (Hanteer et al.
2018), online communities (Ustek-Spilda et al. 2021), and information spreading
(Pereira 2021). While our example is about social media data, which is certainly a
main field of application for temporal text networks, other types of data can also be
modeled, including data from historical archives (Milonia and Mazzamurro 2022)
and biomedical texts (Chai et al. 2020) but also non-textual data, where for example
images can be discretized or compared using computer vision tools such as convo-
lutional neural networks (Magnani and Segerberg 2021).
Each of the concepts and measures described in this chapter only focuses on
some aspects of the data. As an example, a path is defined in a conservative way
with respect to graph theory, not allowing the multiple appearance of the same actor.
However, in some cases it can be useful to consider walks where the same actors

appear multiple times, as long as they exchange different messages, for example as part of a longer discussion between two actors. Similarly, Δ-adjacency and entropy only consider, respectively, consecutive interactions in a path and the unordered set
of all interactions, while in some cases we may be interested just in the difference
between the first and the last message. In summary, these functions should be con-
sidered as a non-exhaustive set of fundamental building blocks, to be extended and
expanded. Beyond their direct application to different analysis tasks, the basic mea-
sures described in this chapter can also be used to redefine other network measures
so that they can capture more information from communication systems. Examples
include centrality measures and community detection algorithms.

Acknowledgements We would like to thank Prof. Christian Rohner for his comments and
suggestions.
This work was partially supported by the European Community through the project “Values and
ethics in Innovation for Responsible Technology in Europe” (Virt-EU) funded under Horizon 2020
ICT-35-RIA call Enabling Responsible ICT-related Research and Innovation, and by eSSENCE, an
e-Science collaboration funded as a strategic research area of Sweden.

References

P. Aragón, V. Gómez, D. García, A. Kaltenbrunner, Generative models of online discussion threads: state of the art and research challenges. J. Internet Serv. Appl. 8(1), 1–17 (2017)
J.A. Caetano, G. Magno, M. Gonçalves, J. Almeida, H.T. Marques-Neto, V. Almeida, Characterizing
attention cascades in WhatsApp groups, in Proceedings of the 10th ACM Conference on Web
Science (2019), pp. 27–36
E. Camilleri, S.J. Miah, Evaluating latent content within unstructured text: an analytical methodol-
ogy based on a temporal network of associated topics. J. Big Data 8(1), 124 (2021)
L.R. Chai, D. Zhou, D.S. Bassett, Evolution of semantic networks in biomedical texts. J. Complex
Netw. 8(1), cnz023 (2020). https://doi.org/10.1093/comnet/cnz023
J. Cheng, L.A. Adamic, J.M. Kleinberg, J. Leskovec, Do cascades recur? in Proceedings of the
25th International Conference on World Wide Web (International WWW Conferences Steering
Committee, 2016), pp. 671–681
S. Deri, J. Rappaz, L.M. Aiello, D. Quercia, Coloring in the links: capturing social ties as they are
perceived. Proc. ACM Hum. Comput. Interact. 2(CSCW), 43:1–43:18 (2018)
M. Dickison, M. Magnani, L. Rossi, Multilayer Social Networks (Cambridge University Press,
2016)
P.S. Dodds, C.M. Danforth, Measuring the happiness of large-scale written expression: songs, blogs,
and presidents. J. Happiness Stud. 11(4), 441–456 (2010)
L. Gauvin, A. Panisson, C. Cattuto, A. Barrat, Activity clocks: spreading dynamics on temporal
networks of human contact. Sci. Rep. 3, 3099 (2013)
M. Gomez Rodriguez, J. Leskovec, A. Krause, Inferring networks of diffusion and influence, in
Proceedings of the 16th ACM SIGKDD International Conference on Knowledge Discovery and
Data Mining, KDD ’10 (ACM, New York, NY, USA, 2010), pp. 1019–1028
O. Hanteer, L. Rossi, D.V. D’Aurelio, M. Magnani, From interaction to participation: the role
of the imagined audience in social media community detection and an application to political
communication on twitter, in 2018 IEEE/ACM International Conference on Advances in Social
Networks Analysis and Mining (ASONAM) (2018), pp. 531–534
P. Holme, J. Saramäki, Temporal networks. Phys. Rep. 519(3), 97–125 (2012)

R. Interdonato, M. Atzmueller, S. Gaito, R. Kanawati, C. Largeron, A. Sala, Feature-rich networks: going beyond complex network topologies. Appl. Netw. Sci. 4(1), 4 (2019)
M. Karsai, M. Kivelä, R.K. Pan, K. Kaski, J. Kertész, A.L. Barabási, J. Saramäki, Small but slow world: how network topology and burstiness slow down spreading. Phys. Rev. E 83, 025102 (2011)
J. Kim, J. Diesner, Over-time measurement of triadic closure in coauthorship networks. Soc. Netw.
Anal. Min. 7(1), 9 (2017)
M. Kivelä, A. Arenas, M. Barthelemy, J.P. Gleeson, Y. Moreno, M.A. Porter, Multilayer networks.
J. Complex Netw. 2(3), 203–271 (2014)
R. Lambiotte, L. Tabourier, J.C. Delvenne, Burstiness and spreading on temporal networks. Eur.
Phys. J. B 86(7), 320 (2013)
V. Lavrenko, M. Schmill, D. Lawrie, P. Ogilvie, D. Jensen, J. Allan, Mining of concurrent text and
time series, in SIGKDD Workshop on Text Mining (2000), pp. 37–44
J. Leskovec, A. Krause, C. Guestrin, C. Faloutsos, J. VanBriesen, N. Glance, Cost-effective outbreak
detection in networks, in International conference on Knowledge Discovery and Data Mining
(KDD) (2007), p. 420
N. Luhmann, Social Systems (Stanford University Press, 1995)
M. Magnani, D. Montesi, L. Rossi, Conversation retrieval from microblogging sites. Inf. Retr. J.
15(3–4) (2012)
M. Magnani, A. Segerberg, On the conditions for integrating deep learning into the study of visual
politics, in 13th ACM Web Science Conference (2021)
B. Mathew, R. Dutt, P. Goyal, A. Mukherjee, Spread of hate speech in online social media, in
Proceedings of the 10th ACM Conference on Web Science, WebSci ’19 (Association for Computing
Machinery, New York, NY, USA, 2019), pp. 173–182
S. Milonia, M. Mazzamurro, Temporal networks of ‘Contrafacta’ in the first three troubadour
generations. Digit. Sch. Humanities fqac018 (2022)
P.J. Mucha, M.A. Porter, Communities in multislice voting networks. Chaos: Interdiscip. J. Non-
linear Sci. 20(4) (2010)
B. O’Connor, R. Balasubramanyan, B.R. Routledge, N.A. Smith, From tweets to polls: linking text
sentiment to public opinion time series, in Proceedings of the Eleventh International Conference
on Web and Social Media, ed. by W.W. Cohen, S. Gosling (The AAAI Press)
S.Z. Oliva, L. Oliveira-Ciabati, D.G. Dezembro, M.S.A. Júnior, M. de Carvalho Silva, H.C. Pes-
sotti, J.T. Pollettini, Text structuring methods based on complex network: a systematic review.
Scientometrics 126(2), 1471–1493 (2021)
A. Paranjape, A.R. Benson, J. Leskovec, Motifs in temporal networks, in Proceedings of the 10th
ACM International Conference on Web Search and Data Mining, WSDM ’17 (ACM, New York,
NY, USA, 2017), pp. 601–610
F.S.F. Pereira, Caracterização da propagação de rumores no twitter utilizando redes textuais tempo-
rais, in Anais do Brazilian Workshop on Social Network Analysis and Mining (BraSNAM) (SBC,
2021), pp. 25–31
C. Roth, J.P. Cointet, Social and semantic coevolution in knowledge networks. Soc. Netw. 32(1),
16–29 (2010)
M. Salehi, R. Sharma, M. Marzolla, M. Magnani, P. Siyari, D. Montesi, Spreading processes in
multilayer networks. IEEE Trans. Netw. Sci. Eng. 2(2), 65–83 (2015)
P. Sapiezynski, A. Stopczynski, D.D. Lassen, S. Lehmann, Interaction data from the Copenhagen Networks Study. Sci. Data 6(1), 1–10 (2019)
T.A.B. Snijders, Models for longitudinal network data, in Models and Methods in Social Network
Analysis, Structural Analysis in the Social Sciences, ed. by P.J. Carrington, J. Scott, S. Wasserman
(Cambridge University Press, 2005), pp. 215–247
T.A.B. Snijders, Siena: statistical modeling of longitudinal network data, in Encyclopedia of Social
Network Analysis and Mining (Springer New York, New York, NY, 2014), pp. 1718–1725
J. St-Onge, L. Renaud-Desjardins, P. Mongeau, J. Saint-Charles, Socio-semantic networks as mutu-
alistic networks. Sci. Rep. 12(1), 1889 (2022). Number: 1 Publisher: Nature Publishing Group
J. Stehlé, N. Voirin, A. Barrat, C. Cattuto, L. Isella, J.F. Pinton, P. Vanhems, High-resolution mea-
surements of face-to-face contact patterns in a primary school. PLoS One 6(8) (2011)
L. Tamine, L. Soulier, L. Jabeur, F. Amblard, C. Hanachi, G. Hubert, C. Roth, Social media-based
collaborative information access: analysis of online crisis-related twitter conversations, in HT
2016 - Proceedings of the 27th ACM Conference on Hypertext and Social Media (2016), pp.
159–168
Y. Taskin, T. Hecking, H.U. Hoppe, ESA-T2N: a novel approach to network-text analysis, in Com-
plex Networks and Their Applications VIII, Studies in Computational Intelligence. ed. by H.
Cherifi, S. Gaito, J.F. Mendes, E. Moro, L.M. Rocha (Springer International Publishing, Cham,
2020), pp.129–139
F. Ustek-Spilda, D. Vega, M. Magnani, L. Rossi, I. Shklovski, S. Lehuede, A. Powell, A twitter-based
study of the European internet of things. Inf. Syst. Front. 23(1), 135–149 (2021)
L. Vadicamo, F. Carrara, A. Cimino, S. Cresci, F. Dell’Orletta, F. Falchi, M. Tesconi, Cross-media
learning for image sentiment analysis in the wild, in 2017 IEEE International Conference on
Computer Vision Workshops (ICCVW) (2017), pp. 308–317
D. Vega, M. Magnani, Foundations of temporal text networks. Appl. Netw. Sci. 3(1), 26 (2018)
T. Viard, M. Latapy, C. Magnien, Computing maximal cliques in link streams. Theoret. Comput.
Sci. 609(1), 245–252 (2016)
L. Wang, A. Yang, K. Thorson, Serial participants of social media climate discussion as a community
of practice: a longitudinal network analysis. Inf., Commun. Soc. 24(7), 941–959 (2021)
Chapter 9
Bursty Time Series Analysis
for Temporal Networks

Hang-Hyun Jo and Takayuki Hiraoka

Abstract Characterizing bursty temporal interaction patterns of temporal networks is crucial to investigate the evolution of temporal networks as well as various col-
lective dynamics taking place in them. The temporal interaction patterns have been
described by a series of interaction events or event sequences, often showing non-
Poissonian or bursty nature. Such bursty event sequences can be understood not only
by heterogeneous interevent times (IETs) but also by correlations between IETs.
The heterogeneities of IETs have been extensively studied in recent years, while
the correlations between IETs are far from being fully explored. In this chapter, we
introduce various measures for bursty time series analysis, such as the IET distri-
bution, the burstiness parameter, the memory coefficient, the bursty train sizes, and
the autocorrelation function, to discuss the relation between those measures. Then
we show that the correlations between IETs can affect the speed of spreading taking
place in temporal networks. Finally, we discuss possible research topics regarding
bursty time series analysis for temporal networks.

9.1 Introduction

Characterizing the interaction structure between constituents of complex systems is


crucial to understand not only the dynamics of those systems but also the dynamical
processes taking place in them. The topological structure of interaction has been
modeled by a network, where nodes and links denote the constituents and their
pairwise interactions, respectively (Albert and Barabási 2002; Newman 2010). When
the interaction is temporal, one can adopt a framework of temporal networks (Holme
and Saramäki 2012; Masuda and Lambiotte 2016; Gauvin et al. 2018), where links
are considered to exist or to be activated only at the moments of interaction. The

H.-H. Jo (B)
The Catholic University of Korea, Bucheon 14662, Republic of Korea
e-mail: h2jo@catholic.ac.kr
H.-H. Jo · T. Hiraoka
Aalto University, 00076 Espoo, Finland
e-mail: takayuki.hiraoka@aalto.fi

temporal interaction pattern of each link can be described by a series of interaction events or an event sequence. Many empirical event sequences are known to be non-
Poissonian or bursty (Barabási 2005; Karsai et al. 2012a, 2018), e.g., as shown in
human communication patterns (Eckmann et al. 2004; Malmgren et al. 2009; Cattuto
et al. 2010; Jo et al. 2012; Rybski et al. 2012; Jiang et al. 2013; Stopczynski et al.
2014; Panzarasa and Bonaventura 2015), where bursts denote a number of events
occurring in short active periods separated by long inactive periods. Such bursty
event sequences can be fully understood both by heterogeneous interevent times
(IETs) and by correlations between IETs (Goh and Barabási 2008; Jo 2017). Here
the IET, denoted by τ, is defined as the time interval between two consecutive events.
The heterogeneities of IETs have been extensively studied in terms of heavy-tailed
or power-law IET distributions (Karsai et al. 2018), while the correlations between
IETs have been far from being fully explored.
In this chapter, we introduce various measures for bursty time series analysis, such
as the IET distribution, the burstiness parameter, the memory coefficient, the bursty
train sizes, and the autocorrelation function, to discuss the relation between those
measures. Then we show that the correlations between IETs can affect the speed of
spreading taking place in temporal networks. Finally, we discuss possible research
topics regarding bursty time series analysis for temporal networks.

9.2 Bursty Time Series Analysis

9.2.1 Measures and Characterizations

Non-Poissonian, bursty time series or event sequences have been observed not only
in the human communication patterns (Karsai et al. 2018), but also in other natural
and biological phenomena, including solar flares (Wheatland et al. 1998), earth-
quakes (Corral 2004; de Arcangelis et al. 2006), neuronal firings (Kemuriyama et al.
2010), and animal behaviors (Sorribes et al. 2011; Boyer et al. 2012). Temporal cor-
relations in such event sequences have been characterized by various measures and
quantities (Karsai et al. 2018), such as the IET distribution, the burstiness parameter,
the memory coefficient, the bursty train sizes, and the autocorrelation function. Each
of these five measures captures a different aspect of the bursty time series, while they
are not independent of each other. Here we discuss the relation between these five
measures, which is conceptually illustrated in Fig. 9.1.
(i) The autocorrelation function for an event sequence x(t) is defined with delay
time td as follows:
$$A(t_d) \equiv \frac{\langle x(t)\,x(t+t_d)\rangle_t - \langle x(t)\rangle_t^2}{\langle x(t)^2\rangle_t - \langle x(t)\rangle_t^2}, \qquad (9.1)$$

where ⟨·⟩_t denotes a time average. The event sequence x(t) can be considered to have the value of 1 at the moments when events occur and 0 otherwise. For the event sequences
[Fig. 9.1 diagram: A(t_d) ∼ t_d^{−γ}, P(τ) ∼ τ^{−α}, Q_Δt(b) ∼ b^{−β}, and M ∼ ⟨τ_i τ_{i+1}⟩]
Fig. 9.1 Conceptual diagram for the relation between the autocorrelation function A(td ), the
interevent time distribution P(τ ), and the burst size distribution for a given time window Q Δt (b),
together with the burstiness parameter B and the memory coefficient M. The relation between these
five measures is discussed in Sect. 9.2: In particular, for the dependence of γ on α and β, refer to
Sect. 9.2.3, and for the relation between M and Q Δt (b), refer to Sect. 9.2.4

with long-term memory effects, one may find a power-law decaying behavior with
a decaying exponent γ :
$$A(t_d) \sim t_d^{-\gamma}. \qquad (9.2)$$

Temporal correlations measured by A(td ) can be understood not only by the hetero-
geneous IETs but also by correlations between them.
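As an illustration of how A(t_d) can be estimated in practice, the following minimal Python sketch computes Eq. (9.1) for a binary event sequence; the function name and the synthetic Poisson-like example are ours, not part of the original analysis.

```python
import numpy as np

def autocorrelation(x, t_d):
    """Simple estimator of A(t_d) in Eq. (9.1) for a binary event sequence x(t)."""
    x = np.asarray(x, dtype=float)
    x0, x1 = x[:len(x) - t_d], x[t_d:]          # overlapping parts of x(t) and x(t + t_d)
    num = np.mean(x0 * x1) - np.mean(x) ** 2
    den = np.mean(x ** 2) - np.mean(x) ** 2
    return num / den

# toy example: an uncorrelated (Poisson-like) sequence, so A(t_d) fluctuates around 0
rng = np.random.default_rng(0)
x = (rng.random(100_000) < 0.05).astype(int)
print([round(autocorrelation(x, td), 4) for td in (1, 10, 100)])
```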
(ii) The heterogeneous properties of IETs have often been characterized by the
heavy-tailed or power-law IET distribution P(τ ) with a power-law exponent α:

$$P(\tau) \sim \tau^{-\alpha}, \qquad (9.3)$$

which may already imply clustered short IETs even with no correlations between
IETs. In the case when IETs are fully uncorrelated with each other, i.e., for renewal
processes (Mainardi et al. 2007), the power spectral density was analytically calcu-
lated from power-law IET distributions (Lowen and Teich 1993). Using this result,
one can straightforwardly derive the scaling relation between α and γ :

$$\alpha + \gamma = 2 \ \ \text{for } 1 < \alpha \leq 2, \qquad \alpha - \gamma = 2 \ \ \text{for } 2 < \alpha \leq 3. \qquad (9.4)$$

This relation was also derived in the study of priority queueing models (Vajna et al.
2013). The relation α + γ = 2 for 1 < α ≤ 2 has been derived in the context of
earthquakes (Abe and Suzuki 2009) as well as of the hierarchical burst model (Lee
et al. 2018).
(iii) The degree of burstiness in the event sequence can be measured by a single
value derived from the IET distribution, namely, the burstiness parameter B, which
is defined as Goh and Barabási (2008)

$$B \equiv \frac{\sigma - \langle\tau\rangle}{\sigma + \langle\tau\rangle}, \qquad (9.5)$$

where σ and ⟨τ⟩ are the standard deviation and mean of IETs, respectively. For the regular event sequence, all IETs are the same, leading to B = −1, while for the totally random, Poisson process, since σ = ⟨τ⟩, one gets B = 0. In the extremely bursty case, characterized by σ ≫ ⟨τ⟩, one finds B → 1. However, when analyzing empirical event sequences of finite size, the value of σ is typically limited by the number of events n, such that the maximum value of σ turns out to be σ_max = ⟨τ⟩√(n − 1), allowing one to propose an alternative burstiness measure (Kim and Jo 2016):
$$B_n \equiv \frac{\sqrt{n+1}\,\sigma - \sqrt{n-1}\,\langle\tau\rangle}{(\sqrt{n+1}-2)\,\sigma + \sqrt{n-1}\,\langle\tau\rangle}, \qquad (9.6)$$

which can have the value of 1 (0) in the extremely bursty case (in the Poisson process)
for any n.
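A minimal sketch of how B in Eq. (9.5) and B_n in Eq. (9.6) can be computed from a sequence of IETs; the function names and the toy IET samples are illustrative assumptions.

```python
import numpy as np

def burstiness(iets):
    """Burstiness parameter B of Eq. (9.5)."""
    iets = np.asarray(iets, dtype=float)
    s, m = iets.std(), iets.mean()
    return (s - m) / (s + m)

def burstiness_finite(iets):
    """Finite-size burstiness parameter B_n of Eq. (9.6) (Kim and Jo 2016)."""
    iets = np.asarray(iets, dtype=float)
    n = len(iets)
    s, m = iets.std(), iets.mean()
    a, b = np.sqrt(n + 1), np.sqrt(n - 1)
    return (a * s - b * m) / ((a - 2) * s + b * m)

rng = np.random.default_rng(1)
poisson_iets = rng.exponential(1.0, 10_000)   # B close to 0
bursty_iets = rng.pareto(1.5, 10_000) + 1.0   # heavy-tailed IETs, B well above 0
print(burstiness(poisson_iets), burstiness_finite(poisson_iets))
print(burstiness(bursty_iets), burstiness_finite(bursty_iets))
```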
(iv) The correlations between IETs have been characterized by several mea-
sures (Karsai et al. 2018). Among them, we focus on the memory coefficient and
bursty train sizes. The memory coefficient M is defined as the Pearson correlation
coefficient between two consecutive IETs, whose value for a sequence of n IETs,
i.e., {τi }i=1,...,n , can be estimated by Goh and Barabási (2008)

$$M \equiv \frac{1}{n-1}\sum_{i=1}^{n-1}\frac{(\tau_i - \langle\tau\rangle_1)(\tau_{i+1} - \langle\tau\rangle_2)}{\sigma_1\,\sigma_2}, \qquad (9.7)$$

where ⟨τ⟩_1 (⟨τ⟩_2) and σ_1 (σ_2) are the average and the standard deviation of the first
(last) n − 1 IETs, respectively. Positive M implies that the small (large) IETs tend to
be followed by small (large) IETs. Negative M implies the opposite tendency, while
M = 0 is for the uncorrelated IETs. In many empirical analyses, positive M has been
observed (Goh and Barabási 2008; Wang et al. 2015; Guo et al. 2017; Böttcher et al.
2017).
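The memory coefficient of Eq. (9.7) can be estimated along the following lines; this is a minimal sketch, and the function name and toy example are ours.

```python
import numpy as np

def memory_coefficient(iets):
    """Memory coefficient M of Eq. (9.7): Pearson correlation between
    consecutive interevent times (tau_i, tau_{i+1})."""
    iets = np.asarray(iets, dtype=float)
    x, y = iets[:-1], iets[1:]                 # first and last n - 1 IETs
    return np.mean((x - x.mean()) * (y - y.mean())) / (x.std() * y.std())

rng = np.random.default_rng(2)
print(memory_coefficient(rng.exponential(1.0, 10_000)))   # ~0 for uncorrelated IETs
```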
(v) Another notion for measuring the correlations between IETs is the bursty
trains (Karsai et al. 2012a). A bursty train is defined as a set of consecutive events
such that IETs between any two consecutive events in the bursty train are less than
or equal to a given time window Δt, while those between events in different bursty
trains are larger than Δt. The number of events in the bursty train is called bursty
train size or burst size, and it is denoted by b. The distribution of b would follow an
exponential function if the IETs are fully uncorrelated with each other. However, b
has been empirically found to be power-law distributed, i.e.,
$$Q_{\Delta t}(b) \sim b^{-\beta} \qquad (9.8)$$

for a wide range of Δt, e.g., in earthquakes, neuronal activities, and human commu-
nication patterns (Karsai et al. 2012a, b; Yasseri et al. 2012; Wang et al. 2015). This
indicates the presence of higher-order correlations between IETs beyond the corre-
lations measured by M.¹ We note that the exponential distributions of Q Δt (b) have
also been reported for mobile phone calls of individual users in another work (Jiang
et al. 2016).
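A possible way to detect bursty trains and collect burst sizes for a given time window Δt is sketched below; the function name and the toy example are ours, and an uncorrelated sequence is used only to illustrate the expected roughly exponential Q_Δt(b).

```python
import numpy as np
from collections import Counter

def burst_sizes(iets, dt):
    """Sizes of bursty trains for a time window dt: consecutive events whose
    separating IETs are all <= dt belong to the same train."""
    sizes, current = [], 1                     # the first event opens a train
    for tau in iets:
        if tau <= dt:
            current += 1                       # next event joins the current train
        else:
            sizes.append(current)              # an IET > dt closes the train
            current = 1
    sizes.append(current)
    return np.array(sizes)

rng = np.random.default_rng(3)
sizes = burst_sizes(rng.exponential(1.0, 100_000), dt=1.0)
print(Counter(sizes.tolist()))                 # roughly exponential burst size distribution
```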
We show that the statistics of IETs and burst sizes are interrelated to each other (Jo
2017). Let us consider an event sequence with n + 1 events and n IETs, denoted by
T ≡ {τ_1, . . . , τ_n}. For a given Δt one can detect m bursty trains whose sizes are denoted by B ≡ {b_1, . . . , b_m}. The sum of burst sizes must be the number of events, i.e., Σ_{j=1}^{m} b_j = n + 1. With ⟨b⟩ denoting the average burst size, we can write

$$m\langle b\rangle = n + 1 \simeq n, \qquad (9.9)$$

where n ≫ 1 is assumed. The number of bursty trains is related to the number of IETs larger than Δt, i.e.,

$$m = |\{\tau_i \mid \tau_i > \Delta t\}| + 1. \qquad (9.10)$$

It is because each burst size, say b, requires b − 1 consecutive IETs less than or equal
to Δt and one IET larger than Δt. In the case with n, m ≫ 1, we get

$$m \simeq n\,F(\Delta t), \qquad (9.11)$$

where F(Δt) ≡ ∫_{Δt}^{∞} P(τ) dτ denotes the complementary cumulative distribution function of P(τ). By combining Eqs. (9.9) and (9.11), we obtain a general relation as

$$\langle b\rangle\, F(\Delta t) \simeq 1, \qquad (9.12)$$

which holds for arbitrary functional forms of P(τ ) and Q Δt (b) (Jo 2017).
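The relation of Eq. (9.12) can also be checked numerically, for instance as follows; this sketch reuses the burst_sizes() helper from the sketch above, and the heavy-tailed toy IET sample is our own choice.

```python
import numpy as np

# Numerical check of Eq. (9.12): <b> F(dt) should stay close to 1 for any dt.
rng = np.random.default_rng(4)
iets = rng.pareto(1.5, 200_000) + 1.0          # heavy-tailed toy IETs (alpha = 2.5)
for dt in (2.0, 5.0, 20.0):
    mean_b = burst_sizes(iets, dt).mean()      # helper from the sketch above
    F = np.mean(iets > dt)                     # empirical complementary CDF at dt
    print(dt, round(mean_b * F, 3))
```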

9.2.2 Correlation Structure and the Bursty-Get-Burstier Mechanism

We pay special attention to the empirical observation that the tail parts of burst size
distributions are characterized by the same power-law exponent for a wide range
of time windows, e.g., ranging from a few minutes to the order of one hour in

1 The generalized memory coefficient has also been suggested as the Pearson correlation coefficient
between two IETs separated by k IETs (Goh and Barabási 2008). The case with k = 0 corresponds
to the M in Eq. (9.7). The relation between the generalized memory coefficients and burst size
distributions can be studied for better understanding the correlation structure between IETs.
mobile phone communication patterns (Karsai et al. 2012a). To better understand this observation, let us begin with a simple example in Fig. 9.2. For each given time
window Δtl with “level” l = 0, 1, 2, one can obtain the corresponding set of burst
sizes, denoted by {b(l) }. Here we observe that several bursty trains at the level l are
merged to make one bursty train at the level l + 1. In other words, one burst size in
{b(l+1) } is typically written as a sum of several burst sizes in {b(l) }. By characterizing
this merging pattern one can get insight into the correlation structure between IETs.
In particular, we raise a question: In order to find the power-law tail as Q Δtl (b(l) ) ∼
b(l)−β for every l, which burst sizes in {b(l) } should be merged to make one burst
size in {b(l+1) }? One possible answer to this question has recently been suggested,
which is called the bursty-get-burstier (BGB) mechanism (Jo 2017), indicating that
the bigger (smaller) bursty trains tend to follow the bigger (smaller) ones.
We introduce one implementation method of the BGB mechanism following Jo
(2017), where P(τ ) and Q Δt0 (b(0) ) are assumed to be given. Although this method
has been suggested for arbitrary forms of P(τ ) and Q Δt0 (b(0) ), we focus on the case
with power-law tails for both distributions. Precisely, we consider a power-law P(τ )
with a power-law exponent α > 1 and a lower bound of IET τmin , i.e.,
$$P(\tau) = (\alpha - 1)\,\tau_{\min}^{\alpha-1}\,\tau^{-\alpha}\,\theta(\tau - \tau_{\min}), \qquad (9.13)$$

and a power-law distribution of burst sizes at the zeroth level (l = 0) as

$$Q_{\Delta t_0}(b^{(0)}) = \zeta(\beta)^{-1}\, b^{(0)\,-\beta} \quad \text{for } b^{(0)} = 1, 2, \ldots, \qquad (9.14)$$

where θ (·) denotes the Heaviside step function and ζ (·) does the Riemann zeta
function.
We first prepare a set of n IETs, T = {τ1 , . . . , τn }, that are independently drawn
from P(τ ) in Eq. (9.13). This T is partitioned into several subsets, denoted by Tl ,
at different timescales or levels l = 0, 1, . . . , L:

[Fig. 9.2 schematic: burst sizes per level — Δt_0: b^(0) = {3, 1, 2, 4, 1, 2, 2}; Δt_1: b^(1) = {4, 6, 1, 4}; Δt_2: b^(2) = {4, 11}]

Fig. 9.2 Schematic diagram for the hierarchical organization of bursty trains at various timescales
with 15 events, denoted by vertical lines. These events are clustered using time windows Δtl with
l = 0, 1, 2, and the sizes of bursty trains or burst sizes are denoted by b(l) , e.g., {b(1) } = {4, 6, 1, 4}
$$T_0 \equiv \{\tau_i \mid \tau_{\min} \le \tau_i \le \Delta t_0\}, \qquad T_l \equiv \{\tau_i \mid \Delta t_{l-1} < \tau_i \le \Delta t_l\} \ \ \text{for } l = 1, \ldots, L-1, \qquad T_L \equiv \{\tau_i \mid \tau_i > \Delta t_{L-1}\}, \qquad (9.15)$$

where Δt_l < Δt_{l+1} for all l. For example, one can use Δt_l = τ_min c s^l with constants c, s > 1. This partition readily determines the number of bursty trains at each level, denoted by m_l, similarly to Eq. (9.10):

$$m_l = |\{\tau_i \mid \tau_i > \Delta t_l\}| + 1. \qquad (9.16)$$

Then the sizes of bursty trains for a given Δtl are denoted by Bl ≡ {b(l) }, with
m l = |Bl |. To generate B0 , m 0 burst sizes are independently drawn from Q Δt0 (b(0) )
in Eq. (9.14). Partitioning B0 into subsets and summing up the burst sizes in each
subset leads to B1 . Precisely, for each l, Bl is sorted, e.g., in a descending order,
then it is sequentially partitioned into m l+1 subsets of the (almost) same size. The
sum of b(l) s in each subset leads to one b(l+1) , implying that the bigger bursty trains
are merged together, so do the smaller ones. This procedure is repeated until the
level L is reached. Using the information on which burst sizes at the level l are
merged to get each of burst sizes at the level l + 1, one can construct the sequence of
IETs by permuting IETs in T and finally get the event sequence. See Jo (2017) for
details. Numerical simulations have shown that the generated event sequences show
Q Δtl (b(l) ) ∼ b(l)−β at all levels (Jo 2017).
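A minimal sketch of the merging step of the BGB mechanism is given below: burst sizes at level l are sorted in descending order, cut into m_{l+1} contiguous groups, and summed groupwise, so that big bursts are merged with big ones. The function name and the toy example are ours, and the full method of Jo (2017) additionally permutes the IETs to build the actual event sequence, which is omitted here.

```python
import numpy as np

def bgb_merge(sizes_l, m_next):
    """One BGB merging step: sort the level-l burst sizes in descending order,
    split the sorted list into m_next contiguous groups of (almost) equal length,
    and sum each group to obtain the level-(l+1) burst sizes."""
    sizes = np.sort(np.asarray(sizes_l))[::-1]
    return np.array([g.sum() for g in np.array_split(sizes, m_next)])

# toy example: power-law burst sizes at level 0 merged into 30 trains at level 1
rng = np.random.default_rng(5)
b0 = rng.zipf(2.5, 200)                        # burst sizes distributed ~ b^{-2.5}
b1 = bgb_merge(b0, 30)
print(sorted(b1, reverse=True)[:10])
```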
We remark that the above method lacks some realistic features observed in the
empirical analyses. For example, by the above method the number of burst sizes in
each partition at the level l is almost the same, being either ⌊m_l/m_{l+1}⌋ or ⌊m_l/m_{l+1}⌋ + 1,
which is not always the case. Therefore more realistic merging processes for the
correlation structure between IETs could be investigated as a future work.

9.2.3 Temporal Scaling Behaviors

The scaling relation between α and γ for the uncorrelated IETs in Eq. (9.4) implies
that the autocorrelation function is solely determined by the IET distribution. We can
consider a more general case that the IETs are correlated with each other, in particular,
in terms of the power-law burst size distributions. Then the temporal correlations
measured by the autocorrelation function A(td ) can be understood by means of the
statistical properties of IETs, P(τ ), together with those of the correlations between
IETs, Q Δt (b). In terms of scaling behaviors, one can study the dependence of γ on
α and β.
The dependence of γ on α and β has been investigated by dynamically generating
event sequences showing temporal correlations, described by the power-law distri-
butions of IETs and burst sizes in Eqs. (9.3) and (9.8). These generative approaches
have been based on a two-state Markov chain (Karsai et al. 2012a) or self-exciting
[Fig. 9.3: panels (a) and (b) showing the estimated γ as a function of β for several values of α]
Fig. 9.3 The values of γ estimated from the numerically obtained autocorrelation functions, for
various values of α and β, with horizontal dashed lines corresponding to those for the uncorrelated
cases. Reprinted figure with permission from Jo (2017) Copyright (2017) by the American Physical
Society

point processes (Jo et al. 2015). One can also take an alternative approach by shuf-
fling or permuting a given set of IETs according to the BGB mechanism described in
Sect. 9.2.2, where power-law distributions of IETs and burst sizes are inputs rather
than outputs of the model. Then one can explicitly tune the degree of correlations
between IETs to test whether the scaling relation in Eq. (9.4) will be violated due to
the correlations between IETs.
For this, the event sequences are generated using the BGB mechanism for the
power-law distributions of IETs and burst sizes, which are then analyzed by mea-
suring autocorrelation functions A(t_d) for various values of α and β. The decaying exponent γ of A(t_d) is estimated based on the simple scaling form of A(t_d) ∼ t_d^{−γ}.
The estimated values of γ for various values of α and β are presented in Fig. 9.3. When
α ≤ 2, it is numerically found that the autocorrelation functions for β < 3 deviate
from the uncorrelated case, implying the violation of scaling relation between α and
γ in Eq. (9.4). Precisely, the smaller β leads to the larger γ , implying that the stronger
correlations between IETs may induce the faster decaying of autocorrelation. On the
other hand, in the case with α > 2, the estimated γ deviates significantly from that
for the uncorrelated case for the almost entire range of β, although γ approaches the
uncorrelated case as β increases as expected.
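For illustration, the decay exponent γ can be extracted from a measured autocorrelation function by a least-squares fit in log-log space, as in the following sketch; it assumes the autocorrelation() helper from the sketch in Sect. 9.2.1, and the function name and fitting details are simplifying assumptions of ours.

```python
import numpy as np

def estimate_gamma(x, t_d_values):
    """Estimate gamma in A(t_d) ~ t_d^{-gamma} by a linear fit of
    log A(t_d) versus log t_d."""
    td = np.asarray(t_d_values)
    A = np.array([autocorrelation(x, int(t)) for t in td])
    mask = A > 0                               # keep only positive values for the log
    slope, _ = np.polyfit(np.log(td[mask]), np.log(A[mask]), 1)
    return -slope

# usage sketch: x is a binary event sequence, e.g., generated by the BGB method
# gamma_hat = estimate_gamma(x, np.unique(np.logspace(0, 3, 30).astype(int)))
```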
One can argue that the deviation (or the violation of α + γ = 2) observed for
β < 3 is due to the fact that the variance of b diverges for β < 3. This argument
seems to explain why α + γ = 2 is observed even when β = 3, for event sequences
generated using a two-state Markov chain (Karsai et al. 2012a).
For better understanding the above results, more rigorous studies need to be done.
As the analytical calculation of γ as a function of β is a very challenging task, one can
tackle a simplified problem. For example, the effects of correlations only between
two consecutive IETs on the autocorrelation function have been analytically studied
to find the M-dependence of γ (Jo 2019).
9.2.4 Limits of the Memory Coefficient in Measuring Correlations

The memory coefficient, measuring the correlations only between two consecutive
IETs, has been used to analyze event sequences in natural phenomena and human
activities as well as to test models for bursty dynamics (Goh and Barabási 2008; Wang
et al. 2015; Böttcher et al. 2017; Jo et al. 2015). It has been found that M ≈ 0.2 for
earthquakes in Japan, while M is close to 0 or less than 0.1 for various human
activities (Goh and Barabási 2008). In another work on emergency call records in a
Chinese city, individual callers are found to show diverse values of M, i.e., a broad
distribution of M ranging from −0.2 to 0.5 but peaked at M = 0 (Wang et al. 2015).
Based on these empirical observations, one might conclude that most of human
activities do not show strong correlations between IETs. On the other hand, the
empirical value of β for the burst size distributions varies from 2.5 for earthquakes
in Japan to 2.8–3.0 for Wikipedia editing patterns (Yasseri et al. 2012) and 3.9–4.2
for mobile phone communication patterns (Karsai et al. 2012a, b), while it is found
that β ≈ 2.21 in the emergency call dataset (Wang et al. 2015). Since the power-law
behaviors of burst size distributions for a wide range of time windows imply the
complex, higher-order correlations between IETs, this seems to be inconsistent with
the weak correlation implied by the observation M ≈ 0 in human activities.
This puzzling issue has been resolved by deriving the analytical form of M as a
function of parameters describing P(τ ) and Q Δt (b) (Jo and Hiraoka 2018). Here we
introduce the derivation of M following Jo and Hiraoka (2018). By considering bursty
trains detected using one time window or timescale Δt, we divide T = {τ1 , . . . , τn }
into two subsets as

T0 ≡ {τi |τi ≤ Δt}, (9.17)


T1 ≡ {τi |τi > Δt}. (9.18)

The set of all pairs of two consecutive IETs, {(τi , τi+1 )}, can be divided into four
subsets as follows:

Tμν ≡ {(τi , τi+1 )|τi ∈ Tμ , τi+1 ∈ Tν }, (9.19)

where μ, ν ∈ {0, 1}. By denoting the fraction of IET pairs in each Tμν by tμν ≡
|Tμν |/(n − 1), the term τi τi+1  in Eq. (9.7) can be written as

τi τi+1  = tμν τ (μ) τ (ν) , (9.20)
μ,ν∈{0,1}

where  Δt ∞
τ P(τ )dτ τ P(τ )dτ
τ (0)
≡  Δt
0
, τ (1)
≡ Δt∞ . (9.21)
0 P(τ )dτ Δt P(τ )dτ
174 H.-H. Jo and T. Hiraoka

Here we have assumed that the information on the correlation between τi and τi+1
is carried only by tμν , while such consecutive IETs are independent of each other
under the condition that τi ∈ Tμ and τi+1 ∈ Tν . This assumption of conditional
independence is based on the fact that the correlation between τi and τi+1 with
τi ∈ Tμ and τi+1 ∈ Tν is no longer relevant to the burst size statistics, because the
bursty trains are determined depending only on whether each IET is larger than Δt
or not. Then M in Eq. (9.7) reads in the asymptotic limit with n ≫ 1
$$M \simeq \frac{\sum_{\mu,\nu\in\{0,1\}} t_{\mu\nu}\,\langle\tau\rangle^{(\mu)}\langle\tau\rangle^{(\nu)} - \langle\tau\rangle^2}{\sigma^2}. \qquad (9.22)$$

Here we have approximated as ⟨τ⟩_1 ≃ ⟨τ⟩_2 ≃ ⟨τ⟩ and σ_1 ≃ σ_2 ≃ σ, with ⟨τ⟩ and σ denoting the average and standard deviation of IETs, respectively. Note that ⟨τ⟩^(0) and ⟨τ⟩^(1) are related as follows:
$$\left(1 - \frac{1}{\langle b\rangle}\right)\langle\tau\rangle^{(0)} + \frac{1}{\langle b\rangle}\,\langle\tau\rangle^{(1)} \simeq \langle\tau\rangle. \qquad (9.23)$$

For deriving M in Eq. (9.22), tμν s need to be calculated. Since each pair of IETs
in T11 implies a bursty train of size 1, the average size of T11 is m Q Δt (1), with m
denoting the number of bursty trains detected using Δt. Thus, the average fraction
of IET pairs in T11 becomes

$$t_{11} \equiv \frac{|T_{11}|}{n-1} \simeq \frac{Q_{\Delta t}(1)}{\langle b\rangle}, \qquad (9.24)$$

where Eq. (9.9) has been used. The pair of IETs in T10 (T01 ) is found whenever a
bursty train of size larger than 1 begins (ends). Hence, the average fraction of T10 ,
equivalent to that of T01 , must be

$$t_{10} \equiv \frac{|T_{10}|}{n-1} \simeq \frac{1}{\langle b\rangle}\sum_{b=2}^{\infty} Q_{\Delta t}(b) = \frac{1 - Q_{\Delta t}(1)}{\langle b\rangle}, \qquad (9.25)$$

which is the same as t01 ≡ |T01 |/(n − 1). Finally, for each bursty train of size
larger than 2, we find b − 2 pairs of IETs belonging to T00 , indicating that the
average fraction of T00 is

$$t_{00} \equiv \frac{|T_{00}|}{n-1} \simeq \frac{1}{\langle b\rangle}\sum_{b=3}^{\infty} (b-2)\,Q_{\Delta t}(b) = \frac{\langle b\rangle - 2 + Q_{\Delta t}(1)}{\langle b\rangle}. \qquad (9.26)$$

Note that t_00 + t_01 + t_10 + t_11 ≃ 1. Then by using Eqs. (9.21) and (9.23) one obtains

$$\sum_{\mu,\nu\in\{0,1\}} t_{\mu\nu}\,\langle\tau\rangle^{(\mu)}\langle\tau\rangle^{(\nu)} = \left[\langle b\rangle\, Q_{\Delta t}(1) - 1\right]\left(\langle\tau\rangle - \langle\tau\rangle^{(0)}\right)^2 + \langle\tau\rangle^2, \qquad (9.27)$$
finally leading to
$$M \simeq \frac{\left[\langle b\rangle\, Q_{\Delta t}(1) - 1\right]\left(\langle\tau\rangle - \langle\tau\rangle^{(0)}\right)^2}{\sigma^2}. \qquad (9.28)$$
This solution has been derived for arbitrary forms of P(τ ) and Q Δt (b).
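For illustration, Eq. (9.28) can be evaluated directly from an empirical event sequence by plugging in sample estimates of ⟨b⟩, Q_Δt(1), ⟨τ⟩, ⟨τ⟩^(0), and σ²; the following sketch assumes the burst_sizes() helper from the sketch in Sect. 9.2.1, and the function name is ours.

```python
import numpy as np

def memory_from_bursts(iets, dt):
    """Evaluate the right-hand side of Eq. (9.28) from an IET sequence, using
    sample estimates of <b>, Q_dt(1), <tau>, <tau>^(0), and sigma^2."""
    iets = np.asarray(iets, dtype=float)
    sizes = burst_sizes(iets, dt)              # helper from the Sect. 9.2.1 sketch
    mean_b = sizes.mean()
    q1 = np.mean(sizes == 1)                   # Q_dt(1)
    tau0 = iets[iets <= dt].mean()             # <tau>^(0)
    return (mean_b * q1 - 1) * (iets.mean() - tau0) ** 2 / iets.var()

# compare with the direct estimate of Eq. (9.7), e.g., memory_coefficient(iets)
```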
We investigate the dependence of M on Q Δt (b), while keeping the same P(τ ).
As for the burst size distribution, we consider a power-law distribution as follows:

$$Q_{\Delta t}(b) = \zeta(\beta)^{-1}\, b^{-\beta} \quad \text{for } b = 1, 2, \ldots. \qquad (9.29)$$

We assume that β > 2 for the existence of ⟨b⟩, i.e., ⟨b⟩ = ζ(β − 1)/ζ(β). As for the
IET distribution, a power-law distribution with an exponential cutoff is considered:

$$P(\tau) = \frac{\tau_c^{\alpha-1}}{\Gamma(1-\alpha,\,\tau_{\min}/\tau_c)}\,\tau^{-\alpha}\,e^{-\tau/\tau_c}\,\theta(\tau - \tau_{\min}), \qquad (9.30)$$

where τmin and τc denote the lower bound and the exponential cutoff of τ , respec-
tively. Here Γ(·, ·) denotes the upper incomplete Gamma function. Figure 9.4 shows
how M varies according to the power-law exponent β for a given α for both cases
with diverging and finite τc , respectively. For the numerical simulations, the event
sequences were generated using the implementation method of the BGB mechanism
in Sect. 9.2.2, but using Eq. (9.30). We confirm the tendency that the larger posi-
tive value of M is associated with the smaller value of β, i.e., the heavier tail. This
tendency can be understood by the intuition that the smaller β implies the stronger
correlations between IETs, possibly leading to the larger M. We also find that M ≈ 0
for β ≈ 4, whether τc is finite or infinite. This implies that the apparently conflicting
observations in human activities are indeed compatible. Hence, we raise an important
question regarding the effectiveness or limits of M in measuring higher-order corre-
lations between IETs. Although the definition of M is straightforward and intuitive,
it may not properly characterize the complex correlation structure between IETs in
some cases.

9.3 Effects of Correlations Between IETs on Dynamical Processes

The dynamical processes, such as spreading, diffusion, and cascades, taking place
in a temporal network of individuals are known to be strongly affected by bursty
interaction patterns between individuals (Vazquez 2007; Karsai et al. 2011; Miritello
et al. 2011; Iribarren and Moro 2009; Rocha et al. 2011; Rocha and Blondel 2013;
Takaguchi et al. 2013; Masuda and Holme 2013; Jo et al. 2014; Perotti et al. 2014;
Delvenne et al. 2015; Pastor-Satorras et al. 2015; Artime et al. 2017; Hiraoka and
Jo 2018): In particular, spreading processes in temporal networks have been exten-
sively studied. An important question is what features of temporal networks are most
[Fig. 9.4: M versus β for α = 5.0, 4.0, 3.5, 3.1 (panel a) and α = 1.5, 2.1, 2.5, 2.9 (panel b)]
Fig. 9.4 The analytical solution of M in Eq. (9.28) as a function of β in Eq. (9.14) for several
values of α in Eq. (9.30) (solid lines), compared with corresponding numerical results (symbols
with error bars). In panel (a) we use the pure power-law distribution of P(τ ) in Eq. (9.30), with
infinite exponential cutoff, i.e., τc → ∞, while the general form of P(τ ) with τc = 103 τmin is used
in panel (b). The inset shows the same result as in panel (b), but in a semi-log scale. Each point and
its standard deviation are obtained from 50 event sequences of size n = 5 × 105 . Reprinted figure
with permission from Jo and Hiraoka (2018) Copyright (2018) by the American Physical Society

relevant to predict the speed of propagation, e.g., of disease or information. One of the crucial and widely studied features is the heterogeneities of IETs in the temporal
interaction patterns. It was shown that the bursty interaction patterns can slow down
the early-stage spreading by comparing the simulated spreading behaviors in some
empirical networks and in their randomized versions (Vazquez 2007; Karsai et al.
2011; Perotti et al. 2014). The opposite tendency was also reported using another
empirical network or model networks (Rocha et al. 2011; Rocha and Blondel 2013;
Jo et al. 2014).
In contrast to the effects of heterogeneous IETs on the spreading, yet little is known
about the effects of correlations between IETs on the spreading, except for few recent
works (Artime et al. 2017; Masuda and Rocha 2018). This could be partly because the
contagion dynamics studied in many previous works, e.g., susceptible-infected (SI)
dynamics (Pastor-Satorras et al. 2015), has focused on an immediate infection upon
the first contact between susceptible and infected nodes, hence without the need to
consider correlated IETs. In another work (Gueuning et al. 2015), probabilistic con-
tagion dynamics, which naturally involves multiple consecutive IETs, was studied
by assuming heterogeneous but uncorrelated IETs. Therefore, the effects of hetero-
geneous and correlated IETs on the spreading need to be systematically studied for
better understanding the dynamical processes in complex systems.
To study the spreading dynamics, one can consider one of the extensively stud-
ied epidemic processes, i.e., susceptible-infected (SI) dynamics (Pastor-Satorras
et al. 2015): A state of each node in a network is either susceptible or infected,
and an infected node can infect a susceptible node by the contact with it. Here
we assume that the contact is instantaneous. One can study a probabilistic SI
dynamics, in which an infected node can infect a susceptible node with probability
[Fig. 9.5: panels (a)–(c) sketching nodes u and v, their contact times, the residual time r_0, the IETs τ_i and τ_{i+1}, and the transmission time r]
Fig. 9.5 Schematic diagrams for a the probabilistic susceptible-infected (SI) dynamics, b the one-
step deterministic SI dynamics, and c the two-step deterministic SI dynamics. For each node, the
susceptible or intermediate state is represented by a dashed horizontal line, while the infected state
is by a solid horizontal line. In each panel, a node u gets infected in the time denoted by an upper
vertical arrow, then it tries to infect its susceptible neighbor v whenever they make contact (vertical
lines). The successful infection of v by u is marked by a lower vertical arrow. The time interval
between the infection of u and that of v (striped band) defines the transmission time r . For the
definitions of r0 and τ s, see the text. Figure in Hiraoka and Jo (2018) by Takayuki Hiraoka and
Hang-Hyun Jo is licensed under CC BY 4.0

η (0 < η < 1) per contact, as depicted in Fig. 9.5a. Due to the stochastic nature of
infection, multiple IETs can be involved in the contagion, hence the correlations
between IETs in the contact patterns can influence the spreading behavior. The case
with η = 1 corresponds to the deterministic version of SI dynamics: A susceptible
node is immediately infected after its first contact with an infected node, see Fig. 9.5b.
Finally, for studying the effect of correlations between IETs on the spreading in a
simpler setup, we introduce two-step deterministic SI (“2DSI” in short) dynam-
ics (Hiraoka and Jo 2018) as a variation of generalized epidemic processes (Janssen
et al. 2004; Dodds and Watts 2004; Bizhani et al. 2012; Chung et al. 2014), see
Fig. 9.5c. Here a susceptible node first changes its state to an intermediate state upon
its first contact with an infected node; it then becomes infected after the second con-
tact with the same or another infected node. Below we only introduce the results for
2DSI dynamics from Hiraoka and Jo (2018).
[Fig. 9.6: panel (a) ⟨I(t)⟩ versus t for α = 1.5, k = 4 and various M; panels (b)–(d) growth rate a versus M for α = 1.5, 2, 3; panels (e)–(g) relative growth rate a/a_0 versus M for k = 3, 4, 5, 6]
Fig. 9.6 Two-step deterministic SI dynamics in Bethe lattices: a Average numbers of infected
nodes as a function of time, I (t), in Bethe lattices with k = 4 for the same IET distribution with
power-law exponent α = 1.5 in Eq. (9.30), but with various values of memory coefficient M. For
each value of M, the average (dashed curve) and its standard error (shaded area) were obtained
from 10³ runs with different initial conditions. b–g Estimated exponential growth rates a, defined
in Eq. (9.31) (top panels) and their relative growth rates a/a0 with a0 ≡ a(M = 0) (bottom panels)
are plotted for various values of k, α, and M. The lines are guides to the eye. Figure in Hiraoka
and Jo (2018) by Takayuki Hiraoka and Hang-Hyun Jo is licensed under CC BY 4.0

For modeling the interaction structure in a population, we focus on Bethe lattices as networks of infinite size, where each node has k neighbors. As for the temporal contact
patterns, we assume that the contacts between a pair of nodes or on a link connecting
these nodes are instantaneous and undirected. Moreover, the contact pattern on each
link is assumed to be independent of the states of two end nodes as well as of contact
patterns on other links. The contact pattern on each link is modeled by a statistically
identical event sequence with heterogeneous and correlated IETs. For this, the shape
of IET distribution P(τ ) and the value of memory coefficient M are given as inputs of
the model. As for the IET distribution, we adopt P(τ ) in Eq. (9.30). We fix τmin = 1
without loss of generality and set τc = 10³ in our work. Based on the empirical
findings for α (Karsai et al. 2018), we consider the case with 1.5 ≤ α ≤ 3. Secondly,
only the positive memory coefficient M is considered, precisely, 0 ≤ M < 0.4, based
on the empirical observations (Goh and Barabási 2008; Wang et al. 2015; Guo et al.
2017; Böttcher et al. 2017).
Precisely, for each link, we draw n random values from P(τ ) to make an IET
sequence T = {τ1 , . . . , τn }, for sufficiently large n. Using Eq. (9.7), we measure the
memory coefficient from T , denoted by M̃. Two IETs are randomly chosen in T
and swapped only when this swapping makes M̃ closer to M, i.e., the target value.
By repeating the swapping, we obtain the IET sequence whose M̃ is close enough to
M, and from this IET sequence we get the sequence of contact timings for each link.2

² Another algorithm for generating bursty time series using the copula has recently been suggested (Jo et al. 2019).
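A minimal sketch of the swapping procedure described above is given below; the function name, acceptance tolerance, and trial budget are our own choices, the memory_coefficient() helper from the sketch in Sect. 9.2.1 is assumed, and no attempt is made at the incremental updates an optimized implementation would use.

```python
import numpy as np

def shuffle_to_target_memory(iets, target_m, n_trials=200_000, tol=1e-3, seed=0):
    """Randomly swap pairs of IETs, accepting a swap only if it brings the
    memory coefficient of Eq. (9.7) closer to the target value."""
    rng = np.random.default_rng(seed)
    iets = np.asarray(iets, dtype=float).copy()
    m_now = memory_coefficient(iets)           # helper from the Sect. 9.2.1 sketch
    for _ in range(n_trials):
        i, j = rng.integers(len(iets), size=2)
        iets[i], iets[j] = iets[j], iets[i]
        m_new = memory_coefficient(iets)
        if abs(m_new - target_m) < abs(m_now - target_m):
            m_now = m_new                      # keep the swap
        else:
            iets[i], iets[j] = iets[j], iets[i]   # undo the swap
        if abs(m_now - target_m) < tol:
            break
    return iets
```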
Then the temporal network can be fully described by a set of contact timings for all
links. Each simulation begins with one node infected at random in time, which we set
as t = 0, while all other nodes are susceptible at this moment. For each simulation,
we measure the number of infected nodes as a function of time, I(t). The average number of infected nodes ⟨I(t)⟩ is found to exponentially increase with time, e.g., as shown in Fig. 9.6a:

$$\langle I(t)\rangle \sim e^{at}, \qquad (9.31)$$

where a = a(k, α, M) denotes the exponential growth rate, known as the Malthusian
parameter (Kimmel and Axelrod 2002). a turns out to be a decreasing function of
M, indicating the slowdown of spreading due to the positive correlation between
IETs, see Fig. 9.6b–d. The slowdown can be more clearly presented in terms of the
relative growth rate a/a0 with a0 ≡ a(M = 0) for all cases of k and α, as shown in
Fig. 9.6e–g. We summarize the main observations from the numerical simulations as
follows:
1. a decreases with M.
2. a increases with α.
3. a increases with k.
4. The deviation of a/a0 from 1 tends to be larger for smaller α.
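For completeness, one simple way to extract the growth rate a of Eq. (9.31) from a simulated ⟨I(t)⟩ curve is a linear fit of log⟨I(t)⟩ versus t over the early exponential-growth window, as in the following sketch; the function name and the fitting window are assumptions on our part.

```python
import numpy as np

def growth_rate(times, avg_infected, t_max=None):
    """Estimate the Malthusian parameter a of Eq. (9.31) by fitting
    log <I(t)> against t over the early exponential-growth window."""
    t = np.asarray(times, dtype=float)
    I = np.asarray(avg_infected, dtype=float)
    if t_max is not None:
        keep = t <= t_max
        t, I = t[keep], I[keep]
    a, _ = np.polyfit(t, np.log(I), 1)
    return a
```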
For understanding these observations, we provide an analytical solution for the
transmission time in a single link setup. Let us consider a link connecting nodes u
and v, see Fig. 9.5. If u gets infected from its neighbor other than v in time tu , and
later it infects v in time tv , the time interval between tu and tv defines the transmission
time r ≡ tv − tu . Here we assume that v is not affected by any other neighbors than
u, for the sake of simplicity. In order for the infected u to infect the susceptible v, u
must wait at least for the next contact with v. This waiting or residual time is denoted
by r0 . For the 2DSI dynamics, the transmission process involves two consecutive
IETs. If the infection of u occurs during the IET of τi , then the transmission time is
written as
r = r0 + τi+1 , (9.32)

with τi+1 denoting the IET following τi . Information on the correlations between τi
and τi+1 is carried by the joint distribution P(τi , τi+1 ) or the conditional distribution
P(τi+1 |τi ). Using P(τi+1 |τi ) with τi+1 = r − r0 , the transmission time distribution
is written as

$$R(r) = \frac{1}{\langle\tau\rangle}\int_0^r dr_0 \int_{r_0}^{\infty} d\tau_i\, P(\tau_i)\, P(r - r_0 \mid \tau_i), \qquad (9.33)$$

where it is obvious from Eq. (9.32) that τi ≥ r0 and 0 ≤ r0 ≤ r . The average trans-
mission time is calculated as
$$\langle r\rangle \equiv \int_0^{\infty} dr\, r\, R(r) = \frac{1}{2}\left(\langle\tau\rangle + \frac{\sigma^2}{\langle\tau\rangle}\right) + \frac{1}{\langle\tau\rangle}\,\langle\tau_i\tau_{i+1}\rangle, \qquad (9.34)$$
where

$$\langle\tau_i\tau_{i+1}\rangle \equiv \int_0^{\infty} d\tau_i \int_0^{\infty} d\tau_{i+1}\, \tau_i\,\tau_{i+1}\, P(\tau_i, \tau_{i+1}). \qquad (9.35)$$

In order to relate this result to the memory coefficient in Eq. (9.7), we define a
parameter as
$$M \equiv \frac{\langle\tau_i\tau_{i+1}\rangle - \langle\tau\rangle^2}{\sigma^2} \qquad (9.36)$$
to finally obtain the analytical result of the average transmission time:
$$\langle r\rangle = \frac{3}{2}\langle\tau\rangle + \left(\frac{1}{2} + M\right)\frac{\sigma^2}{\langle\tau\rangle}. \qquad (9.37)$$

We remark that our result in Eq. (9.37) is valid for arbitrary functional forms of
IET distributions as long as their mean and variance are finite. M is coupled with
σ²/⟨τ⟩, implying that the impact of correlations between IETs becomes larger with
broader IET distributions. More importantly, we find that a stronger positive corre-
lation between consecutive IETs leads to a larger average transmission time. This
can be understood in terms of the role of the variance of IETs in the average trans-
mission time. That is, the variance of the sum of two consecutive IETs is amplified
by the positive correlation between those IETs. Based on the result of the single-link
analysis, we can understand the numerical results in Fig. 9.6: The decreasing a with
M is expected from Eq. (9.37), so is the increasing a with α as both ⟨τ⟩ and σ²/⟨τ⟩
decrease with α. The observation that the deviation of a/a0 from 1 tends to be larger
for smaller α implies that the effect of M becomes larger for smaller α, which can be
roughly understood by a larger value of σ²/⟨τ⟩ coupled to M in Eq. (9.37). Finally,
the increasing a with the degree k is trivial, while the analytical approach to this
dependency is not trivial, calling for more rigorous investigation.
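The single-link result of Eq. (9.37) can be checked by a direct Monte Carlo estimate of the transmission time: the infection of u falls into IET τ_i with probability proportional to τ_i, r_0 is drawn uniformly within that IET, and r = r_0 + τ_{i+1}. The following sketch implements this check; the function name and sampling choices are ours.

```python
import numpy as np

def mean_transmission_time_2dsi(iets, n_samples=100_000, seed=0):
    """Monte Carlo estimate of <r> for the 2DSI dynamics on a single link:
    the infection of u falls into IET tau_i with probability proportional to
    tau_i, r0 is uniform within that IET, and r = r0 + tau_{i+1}."""
    rng = np.random.default_rng(seed)
    iets = np.asarray(iets, dtype=float)
    probs = iets[:-1] / iets[:-1].sum()        # size-biased choice of tau_i
    idx = rng.choice(len(iets) - 1, size=n_samples, p=probs)
    r0 = rng.random(n_samples) * iets[idx]     # residual time within tau_i
    return np.mean(r0 + iets[idx + 1])

# compare with Eq. (9.37): 1.5 * mean + (0.5 + M) * var / mean of the IETs
```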

9.4 Discussion

In this chapter, we have introduced various measures and characterizations for bursty
time series analysis and showed how they can be related to each other. Yet more
rigorous studies need to be done for understanding such relation comprehensively.
In the context of temporal networks, the superposition of event sequences of links
incident to a node can result in the event sequence of the node. Then bursty behaviors
of a node can be understood in terms of those of links incident to the node. For
analyzing the relation between bursty behaviors in nodes and links, one can adopt
the notion of contextual bursts by which the scaling behaviors of IET distributions of
nodes and links can be systematically understood (Jo et al. 2013). Researchers can
also study how the correlations between IETs in one node or link are related to those
in other nodes or links, how such inter-correlations can be properly characterized, and how they can affect the dynamical processes taking place in temporal networks.

Acknowledgements The authors were supported by Basic Science Research Program through
the National Research Foundation of Korea (NRF) funded by the Ministry of Education (NRF-
2018R1D1A1A09081919).

References

S. Abe, N. Suzuki, Violation of the scaling relation and non-Markovian nature of earthquake after-
shocks. Phys. A 388(9), 1917–1920 (2009). (May)
R. Albert, A.-L. Barabási, Statistical mechanics of complex networks. Rev. Mod. Phys. 74(1), 47–97
(2002). (January)
O. Artime, J.J. Ramasco, M. San Miguel, Dynamics on networks: competition of temporal and
topological correlations. Sci. Rep. 7, 41627 (2017)
A.-L. Barabási, The origin of bursts and heavy tails in human dynamics. Nature 435(7039), 207–211
(2005). (May)
G. Bizhani, M. Paczuski, P. Grassberger, Discontinuous percolation transitions in epidemic pro-
cesses, surface depinning in random media, and Hamiltonian random graphs. Phys. Rev. E 86(1),
011128 (2012). (July)
L. Böttcher, O. Woolley-Meza, D. Brockmann, Temporal dynamics of online petitions. PLOS One
12(5), e0178062 (2017)
D. Boyer, M.C. Crofoot, P.D. Walsh, Non-random walks in monkeys and humans. J. R. Soc. Interface
9(70), 842–847 (2012). (May)
C. Cattuto, W. Van den Broeck, A. Barrat, V. Colizza, J.-F. Pinton, A. Vespignani, Dynamics of
person-to-person interactions from distributed RFID sensor networks. PLoS One 5(7), e11596
(2010). (July)
K. Chung, Y. Baek, D. Kim, M. Ha, H. Jeong, Generalized epidemic process on modular networks.
Phys. Rev. E 89(5), 052811 (2014). (May)
Á. Corral, Long-term clustering, scaling, and universality in the temporal occurrence of earthquakes.
Phys. Rev. Lett. 92, 108501 (2004). (March)
L. de Arcangelis, C. Godano, E. Lippiello, M. Nicodemi, Universality in solar flare and earthquake
occurrence. Phys. Rev. Lett. 96(5), 051102 (2006). (February)
J.C. Delvenne, R. Lambiotte, L.E.C. Rocha, Diffusion on networked systems is a question of time
or structure. Nat. Commun. 6, 7366 (2015)
P.S. Dodds, D.J. Watts, Universal behavior in a generalized model of contagion. Phys. Rev. Lett.
92(21), 218701 (2004). (May)
J.-P. Eckmann, E. Moses, D. Sergi, Entropy of dialogues creates coherent structures in e-mail traffic.
Proc. Natl. Acad. Sci. 101(40), 14333–14337 (2004). (October)
L. Gauvin, M. Génois, M. Karsai, M. Kivelä, T. Takaguchi, E. Valdano, C.L. Vestergaard, Random-
ized reference models for temporal networks (2018). arXiv:1806.04032
K.I. Goh, A.L. Barabási, Burstiness and memory in complex systems. EPL (Eur. Lett.) 81, 48002
(2008)
M. Gueuning, J.-C. Delvenne, R. Lambiotte, Imperfect spreading on temporal networks. Eur. Phys.
J. B 88(11), 282 (2015). (November)
F. Guo, D. Yang, Z. Yang, Z.-D. Zhao, T. Zhou, Bounds of memory strength for power-law series.
Phys. Rev. E 95(5), 052314 (2017). (May)
T. Hiraoka, H.-H. Jo, Correlated bursts in temporal networks slow down spreading. Sci. Rep. 8(1),
15321 (2018). (October)
P. Holme, J. Saramäki, Temporal networks. Phys. Rep. 519(3), 97–125 (2012). (October)
J.L. Iribarren, E. Moro, Impact of human activity patterns on the dynamics of information diffusion.
Phys. Rev. Lett. 103(3), 038702 (2009). (July)
H.K. Janssen, M. Müller, O. Stenull, Generalized epidemic process and tricritical dynamic perco-
lation. Phys. Rev. E 70(2), 026114 (2004)
Z.Q. Jiang, W.J. Xie, M.X. Li, B. Podobnik, W.X. Zhou, H.E. Stanley, Calling patterns in human
communication dynamics. Proc. Natl. Acad. Sci. 110(5), 1600–1605 (2013)
Z.-Q. Jiang, W.-J. Xie, M.-X. Li, W.-X. Zhou, D. Sornette, Two-state Markov-chain Poisson nature
of individual cellphone call statistics. J. Stat. Mech: Theory Exp. 2016(7), 073210 (2016). (July)
H.H. Jo, Analytically solvable autocorrelation function for correlated interevent times (2019).
arXiv:1901.00982
H.H. Jo, M. Karsai, J. Kertész, K. Kaski, Circadian pattern and burstiness in mobile phone com-
munication. New J. Phys.14(1), 013055 (2012)
H.H. Jo, B.H. Lee, T. Hiraoka, W.S. Jung, Copula-based algorithm for generating bursty time series
(2019). arXiv:1904.08795
H.-H. Jo, Modeling correlated bursts by the bursty-get-burstier mechanism. Phys. Rev. E 96(6),
062131 (2017). (December)
H.-H. Jo, T. Hiraoka, Limits of the memory coefficient in measuring correlated bursts. Phys. Rev.
E 97(3), 032121 (2018). (March)
H.-H. Jo, R.K. Pan, J.I. Perotti, K. Kaski, Contextual analysis framework for bursty dynamics. Phys.
Rev. E 87, 062131 (2013). (June)
H.-H. Jo, J.I. Perotti, K. Kaski, J. Kertész, Analytically solvable model of spreading dynamics with
non-Poissonian processes. Phys. Rev. X 4(1), 011041 (2014). (March)
H.-H. Jo, J.I. Perotti, K. Kaski, J. Kertész, Correlated bursts and the role of memory range. Phys.
Rev. E 92(2), 022814 (2015). (August)
M. Karsai, K. Kaski, A.L. Barabási, J. Kertész, Universal features of correlated bursty behaviour.
Sci. Rep. 2, 397 (2012)
M. Karsai, M. Kivelä, R.K. Pan, K. Kaski, J. Kertész, A.-L. Barabási, J. Saramäki, Small
but slow world: how network topology and burstiness slow down spreading. Phys. Rev. E 83(2),
025102 (2011)
M. Karsai, K. Kaski, J. Kertész, Correlated dynamics in egocentric communication networks. PLoS
One 7(7), e40612 (2012). (July)
M. Karsai, H.-H. Jo, K. Kaski, Bursty Human Dynamics (Springer International Publishing, Cham,
Switzerland, 2018). (January)
T. Kemuriyama, H. Ohta, Y. Sato, S. Maruyama, M. Tandai-Hiruma, K. Kato, Y. Nishida, A
power-law distribution of inter-spike intervals in renal sympathetic nerve activity in salt-sensitive
hypertension-induced chronic heart failure. Biosystems 101(2), 144–147 (2010). (August)
E.-K. Kim, H.-H. Jo, Measuring burstiness for finite event sequences. Phys. Rev. E 94, 032311
(2016). (September)
M. Kimmel, D.E. Axelrod, Branching Processes in Biology, vol. 19 (Springer, New York, New
York, NY, 2002)
B.-H. Lee, W.-S. Jung, H.-H. Jo, Hierarchical burst model for complex bursty dynamics. Phys. Rev.
E 98(2), 022316 (2018). (August)
S.B. Lowen, M.C. Teich, Fractal renewal processes generate 1/f noise. Phys. Rev. E 47, 992–1001
(1993). (February)
F. Mainardi, R. Gorenflo, A. Vivoli, Beyond the Poisson renewal process: a tutorial survey. J.
Comput. Appl. Math. 205(2), 725–735 (2007). (August)
R.D. Malmgren, D.B. Stouffer, A.S.L.O. Campanharo, L.A. Amaral, On universality in human
correspondence activity. Science 325(5948), 1696–1700 (2009)
N. Masuda, P. Holme, Predicting and controlling infectious disease epidemics using temporal net-
works. F1000Prime Rep. 5, 6 (2013)
N. Masuda, L.E.C. Rocha, A Gillespie algorithm for non-Markovian stochastic processes. SIAM
Rev. 60(1), 95–115 (2018)
N. Masuda, R. Lambiotte, A Guide to Temporal Networks, Series on Complexity Science (World Scientific, New Jersey, 2016)
G. Miritello, E. Moro, R. Lara, Dynamical strength of social ties in information spreading. Phys.
Rev. E 83(4), 045102 (2011). (April)
M.E.J. Newman, Networks: An Introduction, 1st edn. (Oxford University Press, 2010)
P. Panzarasa, M. Bonaventura, Emergence of long-range correlations and bursty activity patterns in
online communication. Phys. Rev. E 92(6), 062821 (2015). (December)
R. Pastor-Satorras, C. Castellano, P. Van Mieghem, A. Vespignani, Epidemic processes in complex
networks. Rev. Mod. Phys. 87(3), 925–979 (2015). (August)
J.I. Perotti, H.H. Jo, P. Holme, J. Saramäki, Temporal network sparsity and the slowing down of
spreading (2014). arXiv:1411.5553
L.E.C. Rocha, V.D. Blondel, Bursts of vertex activation and epidemics in evolving networks. PLOS
Comput. Biol. 9(3), e1002974 (2013)
L.E.C. Rocha, F. Liljeros, P. Holme, Simulated epidemics in an empirical spatiotemporal network
of 50,185 sexual contacts. PLOS Comput. Biol. 7(3), e1001109 (2011)
D. Rybski, S.V. Buldyrev, S. Havlin, F. Liljeros, H.A. Makse, Communication activity in a social
network: relation between long-term correlations and inter-event clustering. Sci. Rep. 2, 560
(2012). (August)
A. Sorribes, B.G. Armendariz, D. Lopez-Pigozzi, C. Murga, G.G. de Polavieja, The origin of
behavioral bursts in decision-making circuitry. PLoS Comput. Biol. 7(6), e1002075 (2011). (June)
A. Stopczynski, V. Sekara, P. Sapiezynski, A. Cuttone, M.M. Madsen, J.E. Larsen, S. Lehmann,
Measuring large-scale social networks with high resolution. PLoS One 9(4), e95978 (2014).
(April)
T. Takaguchi, N. Masuda, P. Holme, Bursty communication patterns facilitate spreading in a
threshold-based epidemic dynamics. PLoS One 8(7), e68629 (2013). (July)
S. Vajna, B. Tóth, J. Kertész, Modelling bursty time series. New J. Phys. 15(10), 103023 (2013)
A. Vazquez, Impact of memory on human dynamics. Phys. A 373, 747–752 (2007). (January)
W. Wang, N. Yuan, L. Pan, P. Jiao, W. Dai, G. Xue, D. Liu, Temporal patterns of emergency calls
of a metropolitan city in China. Phys. A 436, 846–855 (2015). (October)
M.S. Wheatland, P.A. Sturrock, J.M. McTiernan, The waiting-time distribution of solar flare hard
X-Ray bursts. Astrophys. J. 509, 448–455 (1998). (December)
T. Yasseri, R. Sumi, A. Rung, A. Kornai, J. Kertész, Dynamics of conflicts in Wikipedia. PLoS One
7(6), e38869 (2012). (June)
Chapter 10
Challenges in Community Discovery
on Temporal Networks

Remy Cazabet and Giulio Rossetti

Abstract Community discovery is one of the most studied problems in network science. In recent years, many works have focused on discovering communities in
temporal networks, thus identifying dynamic communities. Interestingly, dynamic
communities are not mere sequences of static ones; new challenges arise from their
dynamic nature. Despite the large number of algorithms introduced in the literature,
some of these challenges have been overlooked or little studied until recently. In this
chapter, we will discuss some of these challenges and recent propositions to tackle
them. We will, among other topics, discuss community events in gradually evolving
networks, on the notion of identity through change and the ship of Theseus paradox,
on dynamic communities in different types of networks including link streams, on
the smoothness of dynamic communities, and on the different types of complexity of
algorithms for their discovery. We will also list available tools and libraries adapted
to work with this problem.

Keywords Temporal networks · Community detection

10.1 Introduction

The modular nature of networks is one of the most studied aspects of network science.
In most real-world networks, a mesoscale organization exists, with nodes belonging
to one or several modules or clusters (Newman 2006): think of groups in social
networks (groups of friends, families, organizations, countries, etc.), or biological
networks such as brain networks (Meunier et al. 2010). The term community is
commonly used in the network science literature to describe a set of nodes that are
grouped for topological reasons (e.g., they are strongly connected together and more

R. Cazabet (B)
Univ Lyon, UCBL, CNRS, LIRIS UMR 5205, F-69621 Lyon, France
e-mail: remy.cazabet@gmail.com
G. Rossetti
Knowledge Discovery and Data Mining Lab, ISTI-CNR, Pisa, Italy

weakly connected to the rest of the network. Other topological criteria exist, such
as having a high internal clustering, similar connection patterns, etc. See Sect. 10.3
for more on this topic). The literature is large and diverse, covering not only automatic community discovery but also community evaluation,
analysis, or even generation of networks with realistic community structure. In the
last ten years, many works have focused on adapting those problems to temporal
networks (Rossetti and Cazabet 2018). In this chapter, we present an overview of the
active topics of research on dynamic communities. For each of these topics, when
relevant, we highlight some current challenges.
The chapter is organized into five parts. In the first one, we discuss the definition
of dynamic clusters in temporal networks, and how to represent them. In the second
section, we concentrate on the specificity of dynamic communities, in particular
focusing on smoothness, identity and algorithmic complexity. Section 10.4 focuses
on the differences between communities in different types of dynamic networks such
as link streams or snapshot sequences. In Sect. 10.5, we discuss the evaluation of
dynamic communities, using internal and external evaluation, requiring appropriate
synthetic benchmarks. Finally, in Sect. 10.6, we briefly introduce existing tools to
work with dynamic communities.

10.2 Representing Dynamic Communities

The first question to answer when dealing with communities is: what is a good
community? There is no universal consensus on this topic in the literature; thus, in
this article, we adopt a definition as large as possible:
Definition 10.1 (Community) A (static) community in a graph G = (V, E) is i) a
cluster (i.e., a set) of nodes C ⊆ V ii) having relevant topological characteristics as
defined by a community detection algorithm.
The second part of this definition will be discussed in Sect. 10.3, and is concerned by
the question of the quality of a set of nodes as a community, based on a topological
criterion. On the contrary, this section discusses the transposition of the first part
of this definition to temporal networks, i.e., the definition of dynamic node clusters
themselves, independently of any quality criteria. We use the term cluster in its data
analysis meaning, i.e., clusters are groups of items defined such as those items are
more similar (in some sense) to each other than to those in other groups (clusters).
To define dynamic clusters, we first need to define what is a temporal network.
This question will be discussed in detail in Sect. 10.4. For now, let’s adopt a generic
definition provided in Latapy et al. (2017), representing in an abstract way any type
of temporal network:
Definition 10.2 (Temporal Network) A temporal network, or stream graph, is
defined as S = (T, V, W, E), with V a set of nodes, T a set of time instants (contin-
uous or discrete), W ⊆ T × V , and E ⊆ T × V ⊗ V .
10.2.1 Fixed Membership Cluster in Temporal Networks

The first possible transposition of static clusters to temporal networks is to consider memberships as fixed.
Definition 10.3 (Fixed Membership Cluster) A fixed membership cluster is defined
on a temporal network S = (T, V, W, E) as a cluster of nodes C ⊆ V .
In fixed membership clusters, nodes cannot change community over time. Communities identified using this definition in a temporal network are usually considered relevant when the clustering they induce would be considered relevant according to a static definition of communities (e.g., modularity) at most times t of the temporal network. Those communities differ from static ones found in the aggregated graph in that they take into account the temporal order of edges. Note that in some algorithms such as stochastic block models, in which communities are defined not only by sets of nodes but also by properties of the relations between communities, those properties might evolve while the memberships themselves stay unchanged (e.g., Matias et al. 2015). This approach can also be combined with change point detection to find periods of the graph with stable community structures (Peel and Clauset 2014).

10.2.2 Evolving-Membership Clusters in Temporal Networks

In this second transposition of the definition of cluster, nodes can change membership over time. Note that, for methods based on crisp communities, each node must belong to one (and only one) community at each step, while less constrained methods allow nodes to belong to no community (or, conversely, to several communities) in some or all steps.

Definition 10.4 (Evolving-Membership Cluster) An evolving-membership cluster is defined on a temporal network S = (T, V, W, E) as a cluster of (time, node) pairs C = {(t, v)} ⊆ W.

A dynamic community using this type of cluster is usually considered relevant when (i) the clusters it defines at each time t would be considered relevant according to a static definition of communities (e.g., modularity), and (ii) the clusters it defines at time t are relatively similar to those belonging to the same dynamic cluster at t − 1 and t + 1. This is related to the notion of dynamic community smoothness discussed in Sect. 10.3.1.

10.2.2.1 Persistent-Labels Formalism

The usual way to implement this definition is by using what we call the persistent labels formalism: community identifiers—labels—are associated with some nodes over some periods. There is, therefore, no notion of being an ancestor/descendant of another community: two nodes can either share a common label, and therefore be part of the same dynamic community, or not. This representation is the most widespread, used for instance in Mucha et al. (2010) and Falkowski et al. (2006).
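In code, this formalism boils down to a label attached to node-time pairs. The following is a minimal Python sketch, with hypothetical toy data and helper names (not taken from any specific library), of how persistent labels can be stored and queried:

```python
from collections import defaultdict

# Hypothetical toy data: a persistent label attached to (node, time) pairs.
memberships = {
    ("a", 0): "c1", ("b", 0): "c1", ("c", 0): "c2",
    ("a", 1): "c1", ("b", 1): "c2", ("c", 1): "c2",
}

# Invert the mapping: a dynamic community is the set of pairs sharing a label.
communities = defaultdict(set)
for (node, t), label in memberships.items():
    communities[label].add((node, t))

def same_dynamic_community(n1, t1, n2, t2):
    """Two node-time pairs belong to the same dynamic community
    iff they carry the same persistent label."""
    l1, l2 = memberships.get((n1, t1)), memberships.get((n2, t2))
    return l1 is not None and l1 == l2

print(dict(communities))
print(same_dynamic_community("a", 0, "b", 1))  # False: b switched to c2 at t = 1
```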

10.2.3 Evolving-Membership Clusters with Events

One of the most interesting features of dynamic communities is that they can undergo
events. Their first formal categorization was introduced in Palla et al. (2007), which
listed six of them (birth, death, growth, contraction, merge, and split). A seventh
operation, continue, is sometimes added to these. In Cazabet and Amblard (2014),
an eighth operation was proposed (resurgence). These events, illustrated in Fig. 10.1,
are the following:
• Birth: The first appearance of a new community composed of any number of
nodes.
• Death: The vanishing of a community: all nodes belonging to the vanished com-
munity lose this membership.
• Growth: New nodes increase the size of a community.
• Contraction: Some nodes are lost by a community, thus reducing its size.
• Merge: Two communities or more merge into a single one.
• Split: A community, as a consequence of node/edge vanishing, splits into two or
more components.
• Continue: A community remains unchanged in consecutive time steps.
• Resurgence: A community vanishes for a period, then comes back without perturbations as if it had never stopped existing. This event can be seen as a fake death-birth pair involving the same node set over a lagged period (e.g., seasonal behaviors).
Not all operations are necessarily handled by a generic Dynamic Community Detec-
tion algorithm.
Let's consider a situation in which two communities merge at time t. Using the persistent-labels formalism introduced previously, this event can be represented in two ways: either both clusters disappear at time t and a new one—the result of the merge—is created, or one of the clusters becomes the merged one from time t, and the other—considered absorbed—disappears. In both cases, important information is lost. A third definition of evolving membership can be used to solve this problem:
Definition 10.5 (Evolving-membership clusters with events) Evolving-membership clusters with events are defined on a temporal network S = (T, V, W, E) as a set of fixed-membership clusters defined at each time t (or as a set of evolving-membership clusters), together with a set of community events F. Those events can involve several clusters (merge, split) or a single one (birth, death, shrink, etc.).

Fig. 10.1 Different types of community events: birth, death, growth, contraction, merge, split, continue, and resurgence (each illustrated between times t and t + n)



10.2.3.1 Event-Graph Formalism

In practice, most algorithms that do detect events record them in an ad-hoc manner (e.g., the same event can be recorded as "a split event occurred to community c1 at time t, yielding communities c1 and c2" or as "community c2 was born at time t, spawned from c1"). Different representations might even be semantically different. A few works, notably Greene et al. (2010), have used an alternative way to represent dynamic communities and events, using what we call here an event-graph. We define it as follows:

Definition 10.6 (Event Graph) An event graph is an oriented graph representing the dynamic communities of the temporal network S = (T, V, W, E), in which each node corresponds to a pair (C, t), with C ⊆ V and t ∈ T, and each directed edge represents a relation of continuity between two communities, directed from the earlier to the later one.

Using this representation, some events can be characterized using nodes in/out
degrees:
• In-degree=0 represents new-born communities
• In-degree≥2 represents merge events
• Out-degree=0 represents death events
• Out-degree≥2 represents split events
Events represented by an event graph can be much more complex than simple merge/split events since, for instance, a node-community can have multiple outgoing links towards node-communities that themselves have multiple incoming ones.
Both representations, event-graph and persistent labels, have advantages and drawbacks. The former can represent any event or relation between different communities at different times, while the latter can identify which community is the same as which other one at a different time.
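As an illustration, the sketch below builds a toy event graph with networkx and classifies simple events from node in/out degrees, following the rules listed above; the community snapshots and continuity edges are hypothetical toy data, not the output of any particular algorithm:

```python
import networkx as nx

# Toy event graph: nodes are (community, time) pairs, and directed edges encode
# continuity from the earlier community to the later one (hypothetical data).
E = nx.DiGraph()
E.add_edges_from([
    ((frozenset("abc"), 0), (frozenset("abcde"), 1)),    # growth of {a,b,c} ...
    ((frozenset("de"), 0), (frozenset("abcde"), 1)),      # ... and merge with {d,e}
    ((frozenset("abcde"), 1), (frozenset("abc"), 2)),     # split back into two parts
    ((frozenset("abcde"), 1), (frozenset("de"), 2)),
])

def simple_events(event_graph, node):
    """Characterise simple events from in/out degrees, as described above."""
    events = []
    if event_graph.in_degree(node) == 0:
        events.append("birth")
    if event_graph.in_degree(node) >= 2:
        events.append("merge")
    if event_graph.out_degree(node) == 0:
        events.append("death")
    if event_graph.out_degree(node) >= 2:
        events.append("split")
    return events or ["continue"]

for community, t in sorted(E.nodes, key=lambda n: n[1]):
    print(sorted(community), f"t={t}", simple_events(E, (community, t)))
```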

10.2.4 Community Life-Cycle

Identified events allow one to describe, for each cluster, the life-cycle of its corresponding community:

Definition 10.7 (Community Life-Cycle) Given a community C, its life-cycle (which univocally identifies C's complete evolution history) is the directed acyclic graph (DAG) such that (i) the roots are the birth events of C and of its potential predecessors, if C has been implicated in merge events; (ii) the leaves are death events, corresponding to the deaths of C and of its successors, if C has been implicated in split events; and (iii) the central nodes are the remaining actions of C, its successors, and predecessors. The edges of the DAG represent transitions between subsequent actions in C's life.

Challenges

Usual events such as birth, merge or shrink were designed to describe a few steps of evolution in the context of snapshot graphs, but they are not well suited to describe complex dynamics in networks studied at a fine temporal granularity. In real scenarios, communities are susceptible to evolve gradually. A shrink event might correspond to different scenarios, such as a node switching to another community, a node leaving the system (disappearing), or the community spawning a newborn community composed of a subset of its nodes (and maybe of other nodes). The usual representation with only labels, even with the addition of some simple events, might be too limited to represent the full range of possible community life-cycles. Defining a complete framework to formally represent complex community evolution scenarios therefore remains a challenge for researchers in the field.

10.3 Detecting Dynamic Communities

Defining what good communities in networks are is already a challenge in itself. Community discovery is often used as an umbrella term for several related problems that do not share the same formal objective. It stems from earlier, well-defined problems, in particular graph partitioning, which consists, for a given graph and given properties of a partition (number and size of clusters), in finding affiliations of nodes that minimize the number of inter-cluster edges. This problem is well-defined, in that its objective can be expressed unequivocally in mathematical terms, and it has no trivial solution. But having to provide the number and size of communities was considered too constraining when working with real networks having unknown properties. New methods were therefore introduced, based on ideas such as the modularity (Newman and Girvan 2004), compression of random walks (Rosvall and Bergstrom 2008), stochastic block models (SBM) and minimal description length (MDL) (Peixoto 2014), intrinsic properties of communities, and so on. While some of them—e.g., modularity—are based on the same principle of keeping the number of inter-community edges (exceptionally) low, other techniques search for completely different things, such as methods based on the Stochastic Block Model framework, in which blocks are groups of nodes sharing a similar pattern of connections with nodes belonging to other groups. Furthermore, community detection methods are often categorized into overlapping—one node can belong to several communities—and non-overlapping (crisp) clustering methods. In this chapter, we abstract away from those differences: each algorithm has its own definition of what good static communities are, and we focus on the challenges introduced when going from static to dynamic communities, in particular the notions of temporal smoothness and identity preservation, and finally the problem of scalability of existing algorithms.

10.3.1 Different Approaches of Temporal Smoothness

In the process of searching for communities over an evolving topology, one of the main questions that needs to be answered is: how can the stability of the identified solution be ensured? In static contexts, it has been shown that a generic algorithm executed on the same network after a few topological variations—or even none, in the case of stochastic algorithms—might lead to different results (Aynaud and Guillaume 2010). The way Dynamic Community Discovery (henceforth, DCD) algorithms take this problem into account plays a crucial role in the degree of stability of the solutions they can identify, i.e., in their smoothness. In Rossetti and Cazabet (2018), DCD algorithms were grouped into three main categories, depending on the degree of smoothness they aim for:
• Instant Optimal: it assumes that communities existing at time t only depend on the
current state of the network at t. Matching communities found at different steps
might involve looking at communities found in previous steps, or considering all
steps, but communities found at t are considered optimal concerning the topology
of the network at t. By definition, algorithms falling in this family are not tem-
porally smoothed. Examples of Instant Optimal algorithms are Palla et al. (2007),
Rosvall and Bergstrom (2010), Takaffoli et al. (2011), Chen et al. (2010).
• Temporal Trade-off: it assumes that communities defined at time t depend not only on the topology of the network at t but also on the past topology, past identified partitions, or both. Communities at t are therefore defined as a trade-off between an optimal solution at t and the known past. They do not depend on future topological perturbations. In contrast with Instant Optimal approaches, Temporal Trade-off ones are incrementally temporally smoothed. Examples of Temporal Trade-off algorithms are Görke et al. (2010), Cazabet et al. (2010), Rossetti et al. (2017), Folino and Pizzuti (2010).
• Cross-Time: algorithms of this class focus on searching for communities that are relevant when considering the whole network evolution. Methods of this class search for a single temporal partition that encompasses the entire topological evolution of the observed network: communities identified at time t depend on both past and future network structures. Methods in this class produce communities that are completely temporally smoothed. Examples of Cross-Time algorithms are Aynaud and Guillaume (2011), Matias and Miele (2017), Ghasemian et al. (2016), Jdidia et al. (2007), Viard et al. (2016).
All three classes of approaches have advantages and drawbacks; none is superior to the others since they model different DCD problem definitions. Nevertheless, we can observe that each one of them is more suitable for some specific use cases. For instance, if the final goal is to provide on-the-fly community detection on a network that will evolve in the future, Instant Optimal and Temporal Trade-off approaches are the most suitable fit since they do not require knowing in advance the full topological history of the analyzed network. Moreover, if the context requires working with a fine temporal granularity, thus modeling the observed phenomena with link streams instead of snapshots, it is advisable to avoid methods of the first class, which are usually designed to handle well defined—stable—topologies.
Temporal smoothness and partition quality often play conflicting roles. We can observe, for instance, that usually:
• Instant Optimal approaches are the best choice when the final goal is to provide communities that are as good as possible at each step of the evolution of the network;
• Cross-Time approaches are the best choice when the final goal is to provide communities that are coherent in time, particularly over the long term;
• Temporal Trade-off approaches represent a trade-off between these other two classes: they are the best choice in the case of continuous monitoring, rapidly evolving data, and, in some cases, limited memory applications. However, they can be subject to "avalanche" effects due to the limited temporal information they leverage to identify communities (i.e., partitions evolve based on locally temporal-optimal solutions that, in the long run, may degenerate).

10.3.2 Preservation of Identity: The Ship of Theseus Paradox

The smoothness problem affects the way nodes are split into communities at each time. A different notion is the question of identity preservation over time, which arises in particular in the case of a continued, slow evolution of communities. It is well illustrated by the paradox of the ship of Theseus, an ancient thought experiment reported by Plutarch about the identity of an object evolving through time. It can be formulated as follows.
Let's consider a famous ship, the ship of Theseus, composed of planks and kept in a harbor as a historical artifact. As time passes, some planks deteriorate and need to be replaced by new ones. After a long enough period, all the original planks of the ship have been replaced. Can we consider the ship in the harbor to still be the same ship of Theseus? If not, at which point exactly did it cease to be the same ship?
Another aspect of the problem arises if we add a second part to the story. Let's consider that the removed planks were stored in a warehouse, cleaned, and that a new ship, identical to the original one, is built with them. Should this newly built ship be considered as the real ship of Theseus, because it is composed of the same elements?
Let's call the original ship A, the ship that is in the harbor after all replacements B, and the one reconstructed from the original pieces C. In terms of dynamic community detection, this scenario can be modeled by a slowly evolving community c1 (c1 = A), from which nodes are removed one after the other, until all of them have been replaced (c1 = B), and a new community c2 appearing after that, composed of the same nodes as the original community (c2 = C). See Fig. 10.2 for an illustration. A static algorithm analyzing the state of the network at every step would be able to discover
that there is, at each step, a community (c1, slowly evolving) and, at the end of the experiment, two communities (c1 and c2).

Fig. 10.2 Illustration of the ship of Theseus paradox. Each horizontal line represents a node. The same color represents nodes belonging to the same community according to a topological criterion (e.g., SBM). The community A is progressively modified until reaching state B. Community C is composed of the same nodes as the other community at its start. Which cluster (B or C) has the same identity as A? What if all details of the evolution are not known?

But the whole point of dynamic community
detection is to yield a longitudinal description, and therefore, to decide when two
ships at different points in time are the same or not.
This problem has barely been considered explicitly in the literature. However, each algorithm implicitly has to make a choice about which ship is the true ship of Theseus. For instance, methods that are based on a successive matching of communities, such as Greene et al. (2010), consider that A and B are the same ship, but not A and C. On the contrary, a method that matches similar clusters without the constraint of them being consecutive, such as Falkowski et al. (2006), considers that C is more likely than B to be the same ship as A. Finally, methods such as Mucha et al. (2010) allow one to set the influence of time on similarity, and therefore to choose between those two extreme solutions.

Challenges

The question of identity preservation in dynamic communities has been little discussed and experimented with in the literature. For the sake of simplicity, most proposed methods use a mechanism of iterative matching or updating of communities and therefore ignore the similarity between ships A and C. However, this situation is probably very common in real networks, for instance when confronted with seasonal or other cyclical patterns, where groups can disband and reform later. Developing new methods that are aware of the choice made in terms of identity preservation is, therefore, a challenge for the community.

10.3.3 Scalability and Computational Complexity

Early methods for community detection in static graphs had a high computational complexity (e.g., Girvan and Newman 2002), and thus were not scalable to large graphs. Part of the success of methods such as Louvain (Blondel et al. 2008) or Infomap (Rosvall and Bergstrom 2008) is that they can handle networks of thousands of nodes and millions of edges.
Dynamic graphs represent a new challenge in terms of complexity. Among existing algorithms, we can distinguish different categories:
• Those whose complexity depends on the average size of the graph
• Those whose complexity depends on the number of graph changes.
Let's consider the example of a (large) graph composed of n nodes and m edges at time t, which evolves at the speed of k changes per step, for s steps. Algorithms in the first category, such as identify & match methods, need to first compute communities at every step; thus their complexity is proportional to s·O_CD(n, m) + (s − 1)·O_→(n), with O_CD(n, m) the complexity of the algorithm used at each step and O_→(n) the complexity of the matching process for communities found on the n nodes.
Conversely, the complexity of an algorithm that updates communities at each step, such as Cazabet et al. (2010), is roughly proportional (after the initial detection) to s·O_+=(k), with O_+=(k) the complexity of updating the community structure according to k changes. As a consequence, the first category is more efficient in situations where k is large and n/m are small, while the second is more efficient when n/m are large and k small. The complexity is not necessarily imposed by the adopted definition of community. For instance, the algorithms proposed in Palla et al. (2007) and Boudebza et al. (2018) yield rigorously the same dynamic communities, but they belong respectively to the first and second categories, as studied in Boudebza et al. (2018).
Another aspect to consider is parallelization. Although the computation of O_CD over many steps might seem expensive, this task can straightforwardly be processed in parallel. On the contrary, methods involving smoothing, or updating the structure in order, cannot be parallelized, as they need to know the communities at time t to compute the communities at time t + 1. One must, therefore, consider the properties of a temporal network to know which method will or will not be computationally efficient on it.
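To make this trade-off concrete, the following is a minimal, hypothetical identify & match sketch in Python: the per-snapshot detection (here networkx's greedy modularity routine, standing in for any static algorithm O_CD) is run in parallel, while the matching step is necessarily sequential. Function names and the matching threshold are illustrative assumptions, not a published algorithm.

```python
from multiprocessing import Pool

import networkx as nx
from networkx.algorithms.community import greedy_modularity_communities

def detect(snapshot):
    """'Identify' step: any static community detection algorithm can be plugged in."""
    return [frozenset(c) for c in greedy_modularity_communities(snapshot)]

def jaccard(a, b):
    return len(a & b) / len(a | b)

def match(previous, current, threshold=0.3):
    """'Match' step (sequential): link each community to its best predecessor."""
    links = []
    for community in current:
        best = max(previous, key=lambda p: jaccard(p, community), default=None)
        if best is not None and jaccard(best, community) >= threshold:
            links.append((best, community))
    return links

if __name__ == "__main__":
    # Toy snapshot sequence with planted communities (4 blocks of 12 nodes each).
    snapshots = [nx.planted_partition_graph(4, 12, 0.6, 0.02, seed=s) for s in range(4)]
    with Pool() as pool:
        partitions = pool.map(detect, snapshots)   # the identify step parallelizes
    for t in range(1, len(partitions)):            # the match step is sequential
        print(f"t={t}: {len(match(partitions[t - 1], partitions[t]))} continued communities")
```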

Challenges

The complexity of DCD algorithms has barely been explored and represents an important challenge to consider in future works. It is important to note that when dynamic networks are considered at a fine temporal resolution, as in link streams, the number of edges (interactions) can be much larger than the number of nodes. For instance, in the SocioPatterns Primary School dataset (Stehlé et al. 2011), more than 77 000 interactions are observed in a period spanning two days, despite there being only 242 nodes. Algorithms developed for static networks exploit the sparsity of networks to improve their efficiency, but such an approach might be less rewarding in temporal networks. Analyzing the complexities of existing algorithms and developing new ones adapted to fine temporal resolutions is, therefore, a challenge for researchers of the field.

10.4 Handling Different Types of Temporal Networks

Temporal networks can be modeled in different ways. Among the most common frameworks, we can cite:
• Snapshot sequences, in which the dynamics is represented as an ordered series of graphs
• Interval graphs (or series of changes) (Holme and Saramäki 2012), in which intervals of time are associated with edges, and sometimes nodes
• Link streams (Latapy et al. 2017), in which edges are associated with a finite set of transient interaction times.
Each DCD algorithm is designed to work on a particular type of network representation. For instance, Identify & Match approaches consist of first identifying communities in each snapshot, and then matching similar communities across snapshots. Such a method is therefore designed to work (only) with snapshot sequences. However, as has been done in several articles, datasets can be transformed from one representation to the other, for instance by aggregating link streams into snapshots (e.g., Mucha et al. 2010) or into interval graphs (e.g., Cazabet et al. 2012); thus the representation of the dynamic graph does not necessarily limit our capacity to use a particular algorithm on a particular dataset.
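As a minimal sketch of such a transformation, assuming the link stream is given as a list of (u, v, t) contacts, the snippet below aggregates it into a sequence of weighted networkx snapshots with a tunable window length (the parameter whose choice is discussed next):

```python
import networkx as nx

def stream_to_snapshots(contacts, t_start, t_end, window):
    """Aggregate a link stream [(u, v, t), ...] into weighted snapshots,
    one per time window of length `window`."""
    snapshots = []
    t = t_start
    while t < t_end:
        g = nx.Graph()
        for u, v, ts in contacts:
            if t <= ts < t + window:
                weight = g[u][v]["weight"] + 1 if g.has_edge(u, v) else 1
                g.add_edge(u, v, weight=weight)
        snapshots.append(g)
        t += window
    return snapshots

# Toy link stream: repeated contacts between 1 and 2, sporadic contacts elsewhere.
contacts = [(1, 2, 0), (1, 2, 3), (2, 3, 4), (1, 3, 7), (1, 2, 9)]
for i, g in enumerate(stream_to_snapshots(contacts, 0, 10, window=5)):
    print(f"window {i}:", dict(nx.get_edge_attributes(g, "weight")))
```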
We think, however, that one aspect of the problem, related to representation, has not yet been considered in the literature. Methods working with snapshots and with interval graphs make the implicit assumption that the graph at any point in time is well defined, i.e., that each snapshot—or the graph defined by all nodes and edges present at any time t—is not empty, has a well-defined community structure, and is somewhat similar to neighboring snapshots. Said differently, those methods expect progressively evolving graphs. To the best of our knowledge, this assumption has not been studied in the literature. A practitioner creating a snapshot sequence from a link stream using a too-short sliding window (e.g., a window of one hour in a dataset of email exchanges) might obtain a well-formed dynamic graph on which an Identify & Match method can be applied, but the results would be inconsistent, as the community structure does not persist at such scales. The same dataset analyzed using longer sliding windows might provide insightful results. The problem is particularly acute for interval graphs, which can represent real situations of a very different nature.
For instance, an interval graph could represent relations (friend/follower relations in social networks) as well as interactions (phone calls, face-to-face interactions, etc.). It is clear that these two types of networks should not be processed in the same way.

Challenges

A challenge in the field will be to better define the conditions of applicability of the different methods, and the theoretical grounds for deciding when a network needs to be transformed to become suitable for analysis by a given method.

10.5 Evaluation of Dynamic Communities

We have seen in previous sections that several approaches and methods exist to
discover communities in temporal networks. In this section, we first discuss the eval-
uation of community quality. This process often requires the generation of dynamic
networks with community structures, the topic of the second part of the section.

10.5.1 Evaluation Methods and Scores

As already discussed, there is not a single, universal definition of what a good community is and, consequently, no unique and universal way to evaluate community quality. Nevertheless, for static communities, many functions have been proposed to evaluate them either (i) intrinsically (internal evaluation), by means of quality functions (e.g., modularity, conductance), or (ii) relatively to a reference partition (external evaluation), using a similarity function (e.g., NMI, aNMI). Both approaches have pros and cons that have been thoroughly discussed in the literature (Peel et al. 2017; Yang and Leskovec 2015). Little work has been done to extend those functions to the dynamic case.

10.5.1.1 Internal Evaluation

In most works, static quality functions are optimized at each step, often adding a trade-off of similarity with temporally adjacent partitions to improve community smoothness (see Sect. 10.3.1). Some works are based on a longitudinal adaptation of the modularity (Mucha et al. 2010; Aynaud and Guillaume 2010), but they require creating a new graph with added inter-snapshot edges, and therefore cannot be used to evaluate algorithms based on different principles. Works based on the Stochastic Block Model (Matias and Miele 2017; Yang et al. 2009) also optimize a custom longitudinal quality function.

10.5.1.2 External Evaluation

Articles performing external evaluation require a reference partition. Since few annotated datasets exist, a synthetic generator is typically used (see Sect. 10.5.2). The comparison often uses the average of a static measure (e.g., NMI) computed at each temporal step (Bazzi et al. 2016), possibly weighted to take into account the evolution of network properties (Rossetti 2017). A notable exception is found in Granell et al. (2015), where windowed versions of similarity functions (Jaccard, NMI, NVI) are introduced, by computing their contingency table on two successive snapshots at the same time.
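As a minimal sketch of the most common practice, assuming node-aligned label vectors for the detected and reference partitions at each step, the snippet below computes the per-step NMI with scikit-learn and averages it over time (the toy label vectors are hypothetical):

```python
import numpy as np
from sklearn.metrics import normalized_mutual_info_score

# Hypothetical detected vs. reference memberships, one label vector per time step
# (nodes are assumed to be aligned between the two partitions at each step).
detected  = [[0, 0, 1, 1, 1], [0, 0, 0, 1, 1], [0, 1, 1, 1, 2]]
reference = [[0, 0, 1, 1, 1], [0, 0, 1, 1, 1], [0, 0, 1, 1, 2]]

per_step = [normalized_mutual_info_score(r, d) for r, d in zip(reference, detected)]
print("per-step NMI:", np.round(per_step, 3))
print("average NMI :", round(float(np.mean(per_step)), 3))  # the usual aggregate score
```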

Challenges

The evaluation of the quality of dynamic communities, both internally and externally, certainly represents a challenge for future works in dynamic community detection. Methods directly adapted from the static case do not consider the specificities of dynamic communities, in particular the problems of smoothness and community events. This question is of utmost importance since, despite the large variety of methods already proposed, their performance on real networks beyond the ones they have been designed for is still mostly unknown.

10.5.2 Generating Dynamic Graphs with Communities

Complex network modeling studies gave birth to a new field of research: synthetic network generators. Generators allow scientists to evaluate their algorithms on synthetic data whose characteristics resemble those observed in real-world networks. The main reason behind the adoption of network generators when analyzing the performance of a dynamic community detection (DCD) algorithm is the ability to produce benchmark datasets that enable (i) controlled environment testing, e.g., in terms of network size, dynamics, structural properties, etc., and (ii) comparison with a planted ground truth.
Two families of network generators have been described to provide benchmarks for DCD algorithms: generators that produce static graph-partition pairs and generators that describe dynamic graph-partition pairs. Static graphs are used to evaluate the quality of the detection at a single time t, and cannot inform about the smoothness of communities. The best known are the GN benchmark (Girvan and Newman 2002), the LFR benchmark (Lancichinetti and Fortunato 2009) and planted partitions according to the stochastic block model.
Several methods have been proposed to generate dynamic networks with communities. The network can be composed of a sequence of snapshots, as in Bazzi et al. (2016), in which, at each step, the community structure (based on an SBM) drifts according to a user-defined inter-layer dependency. Another approach consists in having an initial partition yielded by a static algorithm (LFR in Greene et al. (2010), GN in Lin et al. (2008)), and making it evolve randomly (Greene et al. 2010) or until reaching a target network with a different community structure (Lin et al. 2008).
Finally, another class of methods generates slowly evolving networks whose changes are driven by community events—merge, split, etc.—that can be tuned with parameters such as the probability of event occurrences. One of these methods is RDyn (Rossetti 2017), whose communities are based on a principle similar to LFR. Another method has been proposed in Sengupta et al. (2017), which has the particularity of generating overlapping community structures.
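The sketch below is a minimal, hypothetical example of this idea, not a reproduction of any of the published generators: it builds a snapshot sequence from networkx's stochastic block model in which two of the three planted communities merge halfway through the sequence.

```python
import networkx as nx

def sbm_snapshot(sizes, p_in=0.3, p_out=0.02, seed=0):
    """One snapshot with planted blocks: dense within blocks, sparse between them."""
    k = len(sizes)
    probs = [[p_in if i == j else p_out for j in range(k)] for i in range(k)]
    return nx.stochastic_block_model(sizes, probs, seed=seed)

# Three planted communities for the first steps, two of which then merge.
snapshots = [sbm_snapshot([20, 20, 20], seed=t) for t in range(3)]
snapshots += [sbm_snapshot([40, 20], seed=t) for t in range(3, 6)]

for t, g in enumerate(snapshots):
    planted = g.graph["partition"]  # ground-truth blocks stored by networkx
    print(f"t={t}: {len(planted)} planted communities, {g.number_of_edges()} edges")
```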

Challenges

As we have seen, various methods already exist to generate dynamic graphs with slowly evolving communities. They have different properties, such as community events, stable edges, or overlapping communities. Active challenges remain open in this domain, among them (i) the generation of link streams with community structures, (ii) the empirical comparison of various DCD methods on those benchmarks, and (iii) an assessment of the realism of the communities generated by such benchmarks, compared with how empirical dynamic communities behave.

10.6 Libraries and Standard Formats to Work with Dynamic Communities

In recent years, many tools and software packages have been developed to manipulate and process network data. Many of those tools implement community detection algorithms. Among the best known, we can cite networkx (Hagberg et al. 2008), iGraph (Csardi and Nepusz 2006) and snap (Leskovec and Sosič 2016), which propose a wide variety of network analysis tools, among them community detection algorithms and related quality functions and scores. Some libraries are even designed specifically for community detection, such as CDlib.1 However, none of them can deal with dynamic networks. Very recently, a few libraries have been introduced to work with dynamic networks, such as tacoma2 and pathpy (Scholtes 2017), but they do not include community detection algorithms.
Furthermore, no standard format has yet emerged to represent dynamic communities and their evolution, which is a particular problem when comparing solutions yielded by different methods. This lack of common tools and standard representations certainly represents an obstacle, and a challenge to overcome for the DCD research community.

1 https://github.com/GiulioRossetti/cdlib/tree/master/cdlib.
2 https://github.com/benmaier/tacoma.

10.7 Conclusion

In this chapter, we have introduced the theoretical aspects of dynamic community detection and highlighted some of the most interesting challenges in the field. Among them, we think that a better formalism to represent the evolution of dynamic clusters and their events, in particular in the context of gradually evolving communities, would facilitate the comparison and the evaluation of communities and detection methods. The scalability of existing approaches is also a concern, again in the context of link streams or other temporal networks studied at fine temporal scales. Finally, a recently introduced technique, graph embedding, has attracted a lot of attention in various domains. Applications to temporal networks exist, although, to the best of our knowledge, no work has focused on the dynamic community detection problem yet. Using this new technique to propose scalable methods could be another challenge worthy of investigation.

References

T. Aynaud, J.L. Guillaume, Static community detection algorithms for evolving networks, in Pro-
ceedings of the 8th International Symposium on Modeling and Optimization in Mobile, Ad Hoc
and Wireless Networks (WiOpt) (IEEE, 2010), pp. 513–519
T. Aynaud, J.L. Guillaume, Multi-step community detection and hierarchical time segmentation in
evolving networks, in Proceedings of the 5th SNA-KDD Workshop (2011)
M. Bazzi, L.G. Jeub, A. Arenas, S.D. Howison, M.A. Porter, Generative benchmark models for
mesoscale structure in multilayer networks (2016). arXiv:1608.06196
V.D. Blondel, J.L. Guillaume, R. Lambiotte, E. Lefebvre, Fast unfolding of communities in large
networks. J. Stat. Mech. Theory Exp. 2008(10), 10,008 (2008)
S. Boudebza, R. Cazabet, F. Azouaou, O. Nouali, Olcpm: an online framework for detecting over-
lapping communities in dynamic social networks. Comput. Commun. 123, 36–51 (2018)
R. Cazabet, F. Amblard, Dynamic community detection, in Encyclopedia of Social Network Analysis
and Mining (Springer, 2014), pp. 404–414
R. Cazabet, F. Amblard, C. Hanachi, Detection of overlapping communities in dynamical social
networks, in 2010 IEEE Second International Conference on Social Computing (IEEE, 2010),
pp. 309–314
R. Cazabet, H. Takeda, M. Hamasaki, F. Amblard, Using dynamic community detection to identify
trends in user-generated content. Soc. Netw. Anal. Mining 2(4), 361–371 (2012)
Z. Chen, K.A. Wilson, Y. Jin, W. Hendrix, N.F. Samatova, Detecting and tracking community
dynamics in evolutionary networks, in 2010 IEEE International Conference on Data Mining
Workshops (IEEE, 2010), pp. 318–327
G. Csardi, T. Nepusz, The igraph software package for complex network research. Int. J. Complex
Syst. 1695 (2006). http://igraph.org
T. Falkowski, J. Bartelheimer, M. Spiliopoulou, Mining and visualizing the evolution of subgroups
in social networks, in IEEE/WIC/ACM International Conference on Web Intelligence (WI) (IEEE,
2006), pp. 52–58
F. Folino, C. Pizzuti, Multiobjective evolutionary community detection for dynamic networks, in
GECCO (2010), pp. 535–536
A. Ghasemian, P. Zhang, A. Clauset, C. Moore, L. Peel, Detectability thresholds and optimal
algorithms for community structure in dynamic networks. Phys. Rev. X 6(3), 031,005 (2016)

M. Girvan, M.E. Newman, Community structure in social and biological networks. Proc. Natl.
Acad. Sci. 99(12), 7821–7826 (2002)
R. Görke, P. Maillard, C. Staudt, D. Wagner, Modularity-driven clustering of dynamic graphs, in
International Symposium on Experimental Algorithms (Springer, 2010), pp. 436–448
C. Granell, R.K. Darst, A. Arenas, S. Fortunato, S. Gómez, Benchmark model to assess community
structure in evolving networks. Phys. Rev. E 92(1), 012,805 (2015)
D. Greene, D. Doyle, P. Cunningham, Tracking the evolution of communities in dynamic social
networks, in International Conference on Advances in Social Networks Analysis and Mining
(ASONAM) (IEEE, 2010), pp. 176–183
A. Hagberg, P. Swart, D.S. Chult, Exploring network structure, dynamics, and function using net-
workx. Technical Report, Los Alamos National Lab.(LANL), Los Alamos, NM (United States)
(2008)
P. Holme, J. Saramäki, Temporal networks. Phys. Rep. 519(3), 97–125 (2012)
M.B. Jdidia, C. Robardet, E. Fleury, Communities detection and analysis of their dynamics in col-
laborative networks, in 2007 2nd International Conference on Digital Information Management,
vol. 2 (IEEE, 2007), pp. 744–749
A. Lancichinetti, S. Fortunato, Benchmarks for testing community detection algorithms on directed
and weighted graphs with overlapping communities. Phys. Rev. E 80(1), 016,118 (2009)
M. Latapy, T. Viard, C. Magnien, Stream graphs and link streams for the modeling of interactions
over time (2017). CoRR arXiv.org/abs/1710.04073
J. Leskovec, R. Sosič, Snap: a general-purpose network analysis and graph-mining library. ACM
Trans. Intell. Syst. Technol. (TIST) 8(1), 1 (2016)
Y.R. Lin, Y. Chi, S. Zhu, H. Sundaram, B.L. Tseng, Facetnet: a framework for analyzing communities
and their evolutions in dynamic networks, in Proceedings of the 17th International Conference
on World Wide Web (WWW) (ACM, 2008), pp. 685–694
C. Matias, V. Miele, Statistical clustering of temporal networks through a dynamic stochastic block
model. J. R. Stat. Soc. Ser. B (Stat. Methodol.) 79(4), 1119–1141 (2017)
C. Matias, T. Rebafka, F. Villers, Estimation and clustering in a semiparametric poisson process
stochastic block model for longitudinal networks (2015)
D. Meunier, R. Lambiotte, E.T. Bullmore, Modular and hierarchically modular organization of brain
networks. Front. Neurosci. 4, 200 (2010)
P.J. Mucha, T. Richardson, K. Macon, M.A. Porter, J.P. Onnela, Community structure in time-
dependent, multiscale, and multiplex networks. Science 328(5980), 876–878 (2010)
M.E. Newman, Modularity and community structure in networks. Proc. Natl. Acad. Sci. 103(23),
8577–8582 (2006)
M.E. Newman, M. Girvan, Finding and evaluating community structure in networks. Phys. Rev. E
69(2), 026,113 (2004)
G. Palla, A.L. Barabási, T. Vicsek, Quantifying social group evolution. Nature 446(7136), 664–667
(2007)
L. Peel, A. Clauset, Detecting change points in the large-scale structure of evolving networks (2014).
CoRR arXiv.org/abs/1403.0989
L. Peel, D.B. Larremore, A. Clauset, The ground truth about metadata and community detection in
networks. Sci. Adv. 3(5), e1602,548 (2017)
T.P. Peixoto, Hierarchical block structures and high-resolution model selection in large networks.
Phys. Rev. X 4(1), 011,047 (2014)
G. Rossetti, Rdyn: graph benchmark handling community dynamics. J. Complex Netw. (2017).
https://doi.org/10.1093/comnet/cnx016
G. Rossetti, R. Cazabet, Community discovery in dynamic networks: a survey. ACM Comput.
Surveys (CSUR) 51(2), 35 (2018)
G. Rossetti, L. Pappalardo, D. Pedreschi, F. Giannotti, Tiles: an online algorithm for community
discovery in dynamic social networks. Mach. Learn. 106(8), 1213–1241 (2017)
M. Rosvall, C.T. Bergstrom, Maps of random walks on complex networks reveal community struc-
ture. Proc. Natl. Acad. Sci. 105(4), 1118–1123 (2008)

M. Rosvall, C.T. Bergstrom, Mapping change in large networks. PloS one 5(1), e8694 (2010)
I. Scholtes, When is a network a network?: Multi-order graphical model selection in pathways
and temporal networks, in Proceedings of the 23rd ACM SIGKDD International Conference on
Knowledge Discovery and Data Mining (ACM, 2017), pp. 1037–1046
N. Sengupta, M. Hamann, D. Wagner, Benchmark generator for dynamic overlapping communities
in networks, in 2017 IEEE International Conference on Data Mining (ICDM) (IEEE, 2017), pp.
415–424
J. Stehlé, N. Voirin, A. Barrat, C. Cattuto, L. Isella, J. Pinton, M. Quaggiotto, W. Van den Broeck, C.
Régis, B. Lina, P. Vanhems, High-resolution measurements of face-to-face contact patterns in a
primary school. PLOS ONE 6(8), e23,176 (2011). https://doi.org/10.1371/journal.pone.0023176
M. Takaffoli, F. Sangi, J. Fagnan, O.R. Zaïane, Modec-modeling and detecting evolutions of com-
munities, in 5th International Conference on Weblogs and Social Media (ICWSM) (AAAI, 2011),
pp. 30–41
T. Viard, M. Latapy, C. Magnien, Computing maximal cliques in link streams. Theoret. Comput.
Sci. 609, 245–252 (2016)
J. Yang, J. Leskovec, Defining and evaluating network communities based on ground-truth. Knowl.
Inf. Syst. 42(1), 181–213 (2015)
T. Yang, Y. Chi, S. Zhu, Y. Gong, R. Jin, A bayesian approach toward finding communities and
their evolutions in dynamic social networks, in Proceedings of the International Conference on
Data Mining (SIAM, 2009), pp. 990–1001
Chapter 11
Information Diffusion Backbone
From the Union of Shortest Paths to the Union of Fastest Path Trees

Huijuan Wang and Xiu-Xiu Zhan

Abstract Information diffusion on a network, either static or temporal (time evolving), has been modelled by, e.g., shortest path routing and epidemic spreading processes. Information is assumed to diffuse along the shortest or fastest path/trajectory. Not all the links contribute to the information diffusion, i.e., appear in a diffusion trajectory. For example, a link with a large weight in a static network seldom appears in the shortest path between any node pair. We address the question of which kind of links are more likely to appear in a diffusion trajectory in two scenarios: information diffuses along the shortest paths on a static weighted network, and through the fastest path trees governed by the Susceptible-Infected epidemic spreading process on a temporal network. We construct the information diffusion backbone to record the probability for each (static or temporal) link to appear in an information diffusion trajectory. Our theory and simulations show how network link weights influence the backbone. Our findings about which links, in terms of their local topological properties, contribute more to the actual information diffusion are crucial to tackle optimization problems such as which node pairs should be stimulated to link, and at what time, in order to maximize the speed or prevalence of information spreading.

Keywords Information diffusion · Temporal network · Backbone · Weighted network · Shortest path · Fastest path · Betweenness

11.1 Introduction

Many types of complex networks are designed to serve the diffusion of infor-
mation, traffic, epidemic or behavior, ranging from telecommunications networks,
social networks to railway transportation networks. How information (traffic or epi-
demic) propagates on a network has been modeled by various processes whereas the underlying network can be static or evolving over time i.e. temporal. For example,
transport on e.g. the Internet and transportation networks such as railway, airline
and roadway networks is mainly carried along the shortest paths. In a weighted
network, the shortest path between two nodes is the path that minimizes the sum of the weights along the path, where the weight of a link may represent the time
delay, distance or cost when information traverses the link. We consider the Internet
and transportation networks static since the evolution of their topology is far slower
than the information diffusion process. Social networks can be considered to be
static where nodes represent the individuals and where links indicate the relationship
between nodes such as friendship (Barabási 2016). The information diffusion process
on a social network has been modeled by e.g., independent cascade models (Watts
2002), threshold models (Granovetter 1978) and epidemic spreading models (Liu
et al. 2015; Pastor-Satorras et al. 2015a; Zhang et al. 2016; Wang et al. 2013; Pastor-
Satorras et al. 2015b; Qu and Wang 2017b). Take the Susceptible-Infected (SI) model
as an example. Each individual can be in one of the two states: susceptible (S) or
infected (I) where the infected state means that the node possesses the information.
An infected node could spread the information to a susceptible neighbor with a rate
β. An infected individual remains infected forever. Recently, the temporal nature of contact networks has been taken into account in the spreading processes, i.e.,
the contacts or connection between a node pair occur at specific time stamps (the
link between nodes is time dependent) and information could possibly propagate
only through contacts (or temporal links) (Holme 2015; Holme and Saramäki 2012;
Scholtes et al. 2014; Valdano et al. 2015; Zhang et al. 2017). In the SI spreading
process on a temporal network (Pastor-Satorras et al. 2015a; Zhang et al. 2016), a
susceptible node could get infected with an infection probability β via each contact
with an infected node.
Significant progress has been made in understanding how the network topology
affects emergent properties such as the distribution of the shortest path length (Chen
et al. 2006), the prevalence of (percentage of nodes reached by) an epidemic or
information (Pastor-Satorras et al. 2015a; Qu and Wang 2017b; Qu and Wang 2017a).
In this work, we explore another, less-studied problem: which links (i.e. the static
or temporal connection of which node pairs) are likely to contribute to the actual
diffusion of information, i.e., appear in a diffusion trajectory? To solve this problem,
we will first introduce how to construct the information diffusion backbone, a weighted network representing the probability for each link to appear in an information diffusion trajectory. Our research question is then equivalent to asking how the probability that a link appears in a diffusion trajectory (the weight of the link in the backbone) is related to the local network properties of the link or of its two end nodes.
To address this question, we consider two scenarios as examples: (i) the shortest
path transport on a static network and (ii) the SI spreading process on a temporal
network. These scenarios correspond to a deterministic process on a static network
and a stochastic process on a temporal network. Information is assumed to diffuse
along the shortest path and the fastest path trees respectively.

11.2 Network Representation

The topology of a static network G can be represented by an adjacency matrix A consisting of elements A(i, j) that are either one or zero depending on whether node i is connected to j or not. If the network is weighted, the adjacency matrix is also weighted. In this case, each element A(i, j) is equal to the weight of the link if i is connected to j, or is equal to zero when i is not connected to j.
A temporal network G = (N , L ) records the contacts between each node pair
at each time step within a given observation time window [0, T ]. N is the set of
nodes, whose size, i.e., the number of nodes is N = |N |, and L = {l( j, k, t), t ∈
[0, T ], j, k ∈ N } is the set of contacts, where the element l( j, k, t) indicates that
the nodes j and k have a contact at time step t. A temporal network can also be
described by a three-dimensional binary adjacency matrix A N ×N ×T , where the ele-
ments A ( j, k, t) = 1 and A ( j, k, t) = 0 represent, respectively, that there is a con-
tact or no contact between the nodes j and k at time step t.
An integrated weighted network G W = (N , LW ) can be derived from a tempo-
ral network G by aggregating the contacts between each node pair over the entire
observation time window T . In other words, two nodes are connected in G W if there
is at least one contact between them in G . Each link l( j, k) in LW is associated with
a weight w jk counting the total number of contacts between node j and k in G . The
integrated weighted network G W can therefore be described by a weighted adjacency
matrix A N ×N , with its element


A( j, k) = Σ_{t=1}^{T} A ( j, k, t)    (11.1)

counting the number of contacts between a node pair. An example of a temporal network G and its integrated weighted network G_W is given in Fig. 11.1a and b, respectively.
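As a minimal numerical sketch of Eq. (11.1), assuming the temporal network is stored as a binary N × N × T numpy array (the toy contacts below are made up), the aggregation into the integrated weighted network is simply a sum over the time axis:

```python
import numpy as np

N, T = 5, 8
rng = np.random.default_rng(0)

# Toy temporal network: A_temp[j, k, t] = 1 if nodes j and k have a contact at time t.
A_temp = np.zeros((N, N, T), dtype=int)
for t in range(T):
    j, k = rng.choice(N, size=2, replace=False)
    A_temp[j, k, t] = A_temp[k, j, t] = 1

# Eq. (11.1): the integrated weighted network counts the contacts per node pair.
A = A_temp.sum(axis=2)
print(A)
print("total number of contacts:", A.sum() // 2)
```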

11.3 Shortest Paths in Static Networks

11.3.1 Construction of the Backbone

We construct the information diffusion backbone as the union of links that possibly
appear in an information diffusion trajectory where the weight of each link in the
backbone represents the probability that the link appears in an information diffusion
trajectory. In case of the shortest path transport on a weighted network G, the infor-
mation diffusion trajectory between a node pair is the shortest path, the path that
minimizes the sum of the weights over all the links in that path. The topology of the
information diffusion backbone G_B is the union of the shortest paths between all possible node pairs.

Fig. 11.1 a A temporal network G with N = 5 nodes and T = 8 time steps. b The integrated weighted network G_W, in which a link exists between a node pair in G_W as long as there is at least one contact between them in G. The weight of a link in G_W is the number of contacts between the two nodes in G. c Diffusion or fastest path tree T_i(β), where node i is the seed and the infection rate is β = 1. d Diffusion backbone G_B(1), where the infection probability β = 1 in the SI diffusion process. The weight of a node pair represents the number of times it appears in all the diffusion path trees

The betweenness of a link in the underlying network is the total
number of shortest paths between all node pairs that traverse this link (Goh et al.
2002; Wang et al. 2008). Hence, the link weight in the backbone is the betweenness
of that link in the underlying network normalized by the total number of node pairs
(N (N − 1))/2.
If the underlying network is unweighted, the topology of the backbone is the
same as the topology of the the underlying network, because every link in the static
network is the shortest path between its two ending nodes. If the underlying network
is weighted, the topology of the backbone is possibly a sub-graph of the underlying
network. Links in the underlying network that have a zero betweenness do not appear
in the backbone.
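In practice, for a weighted networkx graph whose link weights are traversal costs, the backbone link weights of this scenario can be sketched with the built-in edge betweenness routine (its normalized variant divides by the number of node pairs, matching the definition above); the toy graph below is hypothetical:

```python
import networkx as nx

# Hypothetical weighted graph; the weight of a link is its traversal cost.
G = nx.Graph()
G.add_weighted_edges_from([
    (1, 2, 0.1), (2, 3, 0.1), (1, 3, 0.9),
    (3, 4, 0.2), (2, 4, 0.8),
])

# Backbone link weight = link betweenness / number of node pairs; networkx's
# normalized edge betweenness already divides by N(N - 1)/2 for undirected graphs.
backbone_weight = nx.edge_betweenness_centrality(G, weight="weight", normalized=True)

for edge, w in sorted(backbone_weight.items(), key=lambda item: -item[1]):
    print(edge, round(w, 3))
# The heavy links (1, 3) and (2, 4) get weight 0: they lie on no shortest path
# and therefore do not appear in the backbone.
```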

11.3.2 Network with i.i.d. Polynomial Link Weights

As an example, we consider the weight w of each link in G to be an independent, identically distributed (i.i.d.) random variable that follows the polynomial distribution

F_w(x) = Pr[w ≤ x] = x^α 1_{x∈[0,1)} + 1_{x∈[1,∞)},  α > 0,    (11.2)



where the indicator function 1_z is one if z is true else it is zero. The corresponding probability density is f_w(x) = α x^{α−1}, 0 < x < 1. The exponent

α = lim_{x↓0} log F_w(x) / log x

is called the extreme value index of the probability distribution. One motivation to consider the polynomial distribution is that link weights around zero primarily influence shortest paths. The polynomial distribution can capture distinct behavior for small values by tuning α. If α → ∞, we obtain w = 1 almost surely for all links according to Eq. (11.2); we can then consider the network as unweighted. When α → 0, all link weights will be close to 0, but, relatively, they differ significantly from each other. When α = 1, the polynomial distribution becomes a uniform distribution.
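Link weights following Eq. (11.2) can be sampled by inverse-transform sampling, since the inverse CDF is F_w^{-1}(u) = u^{1/α}. The following minimal numpy sketch (the function name is ours) illustrates how the distribution concentrates near 0 for small α and near 1 for large α:

```python
import numpy as np

def polynomial_weights(n_links, alpha, seed=0):
    """Sample i.i.d. link weights with CDF F_w(x) = x**alpha on [0, 1]
    via inverse-transform sampling: w = U**(1/alpha) with U uniform."""
    rng = np.random.default_rng(seed)
    return rng.random(n_links) ** (1.0 / alpha)

for alpha in (0.1, 1.0, 16.0):
    w = polynomial_weights(10_000, alpha)
    print(f"alpha = {alpha:5.1f}: mean = {w.mean():.3f}, "
          f"fraction below 0.01 = {np.mean(w < 0.01):.3f}")
# Small alpha: weights pile up near 0 (strong disorder); large alpha: weights near 1.
```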
The Minimum Spanning Tree (MST) on a weighted network is a tree that spans all the nodes and for which the sum of the weights of all its links is minimal. If the weight of each link is independently identically distributed and if the link weight distribution is sufficiently broad that it reaches the strong disorder region, the backbone topology equals the MST on the underlying network G. Van Mieghem and Magdalena (2005) have found that, by tuning the extreme value index α of the polynomial link weight distribution, a phase transition occurs around a critical extreme value αc. The critical extreme value αc is defined through the relation H(αc) = 1/2, where H(α) = Pr[G_∪spt(α) = MST] and G_B = G_∪spt(α) is the backbone topology, i.e., the union of all the shortest path trees. When α > αc, the backbone contains more than N − 1 links, whereas for α < αc the backbone equals the MST, consisting of N − 1 links.
Van Mieghem and Wang found the same phase transition curve H (α) = Pr [G B =
MST] versus α/αc , in diverse types of networks (see Fig. 11.2) (Van Mieghem and
Wang 2009). As α increases, the transport is more likely to traverse over more links
and the backbone G_∪spt(α) will less likely become a tree. This finding strengthens the belief that the curve F_T(α) ≈ (α/(2αc))^2 is universal for all networks that are not trees.
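The phase transition can be probed numerically. The sketch below (a rough illustration on small Erdős-Rényi graphs, not a reproduction of the cited experiments) assigns polynomial link weights, builds the union of all shortest path trees with Dijkstra, and checks whether it coincides with the MST, giving a crude estimate of H(α):

```python
import networkx as nx
import numpy as np

def spt_union_equals_mst(G, alpha, seed=0):
    """Assign i.i.d. polynomial link weights w = U**(1/alpha), then test whether the
    union of all shortest path trees (the backbone) equals the minimum spanning tree."""
    rng = np.random.default_rng(seed)
    for u, v in G.edges():
        G[u][v]["weight"] = rng.random() ** (1.0 / alpha)
    mst = {frozenset(e) for e in nx.minimum_spanning_tree(G, weight="weight").edges()}
    backbone = set()
    for source in G.nodes():
        for path in nx.single_source_dijkstra_path(G, source, weight="weight").values():
            backbone.update(frozenset(e) for e in zip(path, path[1:]))
    return backbone == mst

for alpha in (0.05, 0.2, 1.0):
    runs = 50
    hits = sum(
        spt_union_equals_mst(nx.gnp_random_graph(30, 0.3, seed=s), alpha, seed=s)
        for s in range(runs)
    )
    print(f"alpha = {alpha}: estimated H(alpha) ≈ {hits / runs:.2f}")
```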
Which kind of links in G tend to have a high link betweenness, equivalently a high weight in the backbone? Does a low link weight imply a high link betweenness in G? Table 11.1 shows the linear correlation coefficient between the weight and the betweenness of a link in different network models for different index values α (Wang et al. 2008). The critical extreme value index is αc ≤ 0.1 in the 2D-lattice with N = 100 nodes and the 3D-lattice with N = 125 nodes (Van Mieghem and Wang 2009), and αc = 0.2 in Erdős-Rényi random graphs (Van Mieghem and Magdalena 2005). When link weights are relatively weakly disordered, i.e., α ≥ 1, the link weight and betweenness are clearly negatively correlated, implying that a link with a low weight tends to have a high betweenness, i.e., a high weight in the backbone. When link weights are in relatively strong disorder, e.g., when α = 0.2, the correlation strength is relatively weak, because the weight of a path, i.e., the sum of the link weights over all the links in that path, is dominated by the largest link weight (Braunstein et al. 2007).
Furthermore, the correlation is shown to be dependent on the underlying graph as well as on the extreme value index α.

Fig. 11.2 The probability H(α) = Pr[G_B = MST] that the backbone topology equals the MST as a function of α/αc. Three classes of networks are considered: complete graphs K_N with N nodes, 2D- and 3D-lattices, and Havel-Hakimi power law networks (Chartrand and Lesniak 1996; Chartrand and Oellermann 1992)

Table 11.1 The correlation coefficient between weight and betweenness of a link

                                 α = 0.2   α = 1.0   α = 2.0   α = 4.0   α = 8.0   α = 16.0
G∪spt on G_0.4(100)              −0.06     −0.61     −0.70     −0.78     −0.84     −0.84
G∪spt on 2D-lattice, N = 100     −0.22     −0.53     −0.54     −0.53     −0.53     −0.53
G∪spt on 3D-lattice, N = 125     −0.18     −0.60     −0.66     −0.67     −0.68     −0.68
G∪spt on BA, N = 100, m = 3      −0.12     −0.53     −0.66     −0.60     −0.50     −0.49

For homogeneous network models such as
the Erdős-Rényi random graphs, the correlation strength increases monotonically
as α increases. When α → ∞, all links have the same weight and almost the same
betweenness. As α decreases, or equivalently as the strength of the disorder in link
weight increases, a link with a low link weight tends to have a high betweenness.
However, in a non-homogeneous topology such as Barabási-Albert (BA) power law networks, the betweenness of a link depends on both the weight and the connectivity of the link. That is why the correlation strength decreases, after a maximum has been reached, as α increases. When the underlying network is unweighted, i.e., α → ∞, how the betweenness of a link is related to the local network features, e.g., of its two end nodes, is still far from well understood (Goh et al. 2002; Wang et al. 2008).
Besides the number of links and the link weights in the backbone that we focus
on in this chapter, other network properties of the backbone such as the degree distri-
bution, the spectrum and the path length of a shortest path have been explored (Chen
et al. 2006; Van Mieghem and Wang 2009).

11.3.3 Link Weight Scaling

We assume information diffuses along the shortest path, where the weight of a link may represent the distance, delay, monetary cost, etc. In functional brain networks, the correlation between the activities measured at any two brain regions by, e.g., magnetoencephalography (MEG) can be defined and derived (Wang et al. 2010). To compute the shortest paths, the weight of the link between two regions/nodes can be defined as, e.g., one minus the correlation value or the reciprocal of the correlation value (Rubinov and Sporns 2010). Different choices of the link weight, including the correlation, may lead to different features of the backbone. We illustrate the effect of the choice of the link weight with the following theorem.
Let w > 0 be the weight of an arbitrary link in a given weighted network G(N, L) with N nodes and L links. We construct a new graph G_α1(N, L) by scaling the weight w of each link as w^{1/α_1}, where α_1 > 0, without changing the network topology. The backbone of G_α1(N, L) is denoted as G_B(α_1). Similarly, we can obtain a weighted network G_α2(N, L) by scaling each link weight as w^{1/α_2}, whose backbone is G_B(α_2), with α_2 > 0. The following theorem shows that the backbone with a smaller value of α is always included in the backbone with a larger value of α. In other words, all links in G_B(α_2) belong to G_B(α_1) if α_1 ≥ α_2 > 0.
Theorem 11.1 If α1 ≥ α2 > 0, then G_B(α2) ⊂ G_B(α1).

Proof We order the original set of link weights in G(N, L) as w_(1) ≤ w_(2) ≤ . . . ≤ w_(L),
where w_(i) denotes the i-th smallest link weight and 1 ≤ i ≤ L. The ordering of the
link weights in G_α1 after the link weight transformation with parameter α1 is
unchanged, w_(1)^{1/α1} ≤ w_(2)^{1/α1} ≤ . . . ≤ w_(L)^{1/α1}. The same holds for G_α2: the ordering of the
weights of all the links is independent of α > 0. Our proof is by contradiction. Assume
that there exists a link with rank k that belongs to G_B(α2) but does not belong to
G_B(α1). This link connects the nodes A and B in both G_α1 and G_α2. The fact that
k ∉ G_B(α1) means that there exists a path P_AB between nodes A and B such that

    w_(k)^{1/α1} > Σ_{i∈P_AB; i<k} w_(i)^{1/α1}                                   (11.3)

where, importantly, the rank condition i < k for a link implies that each link in
P_AB has a smaller weight than the link with rank k. Since
w_(i)^{1/α1} = w_(i)^{1/α2} w_(i)^{(α2−α1)/(α1 α2)}, inequality (11.3) can be rewritten as

    w_(k)^{1/α1} = w_(k)^{1/α2} w_(k)^{(α2−α1)/(α1 α2)} > Σ_{i∈P_AB; i<k} w_(i)^{1/α2} w_(i)^{(α2−α1)/(α1 α2)}.

Since i < k and α1 ≥ α2, it holds that w_(i)^{(α2−α1)/(α1 α2)} ≥ w_(k)^{(α2−α1)/(α1 α2)} and

    Σ_{i∈P_AB; i<k} w_(i)^{1/α2} w_(i)^{(α2−α1)/(α1 α2)} ≥ w_(k)^{(α2−α1)/(α1 α2)} Σ_{i∈P_AB; i<k} w_(i)^{1/α2}.

Hence,

    w_(k)^{1/α2} > Σ_{i∈P_AB; i<k} w_(i)^{1/α2},

which contradicts the hypothesis that the link with rank k belongs to G_B(α2). □
This inclusion theorem illustrates the effect of the choice of the link weights,
specifically the scaling of the link weights, on the link density of the backbone. This
finding is in line with what we have observed before: link weights in strong disorder
or with a high variance tend to lead to a sparse backbone or a heterogeneous traffic
distribution.
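The inclusion can also be checked numerically. The following minimal sketch (not part of the original study; it assumes i.i.d. uniform link weights on an Erdős–Rényi graph and uses networkx) builds the backbone as the union of shortest path trees after the scaling w → w^{1/α}, and verifies that the backbone obtained with a smaller α is contained in the backbone obtained with a larger α.

import networkx as nx
import numpy as np

def backbone(G, alpha):
    """Union of the shortest path trees of G after scaling every weight w -> w**(1/alpha)."""
    H = G.copy()
    for u, v in H.edges():
        H[u][v]["weight"] = H[u][v]["weight"] ** (1.0 / alpha)
    links = set()
    for s in H.nodes():
        for path in nx.single_source_dijkstra_path(H, s, weight="weight").values():
            links.update(frozenset(e) for e in zip(path[:-1], path[1:]))
    return links

rng = np.random.default_rng(1)
G = nx.erdos_renyi_graph(60, 0.2, seed=1)
for u, v in G.edges():
    G[u][v]["weight"] = rng.random()          # i.i.d. uniform link weights

B_small, B_large = backbone(G, alpha=0.5), backbone(G, alpha=4.0)
print(B_small <= B_large, len(B_small), len(B_large))  # True: G_B(0.5) is a subset of G_B(4.0)

With continuous random weights the shortest paths are almost surely unique, so the inclusion of Theorem 11.1 holds exactly in such experiments.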

11.4 SI Spreading Process on Temporal Networks

We explore further the other extreme scenario, where a stochastic process, e.g., the
SI spreading process unfolds on a temporal network (Zhan et al. 2019).

11.4.1 Construction of the Backbone

We first construct the backbone when the infection probability of the SI spreading
process is β = 1. At time step t = 0, the seed node i is infected and all the other
nodes are susceptible. The trajectory of the SI diffusion on G started from root i
can be recorded by a diffusion path tree Ti (β), also called the fastest path tree.
The diffusion path tree Ti (β) records the union of contacts, via which information
reaches each of the rest of the N − 1 nodes in the earliest time. A diffusion tree, composed
of maximally N − 1 contacts, is actually the union of the fastest paths to reach the rest
of the N − 1 nodes. We define the diffusion backbone G_B(β) = (N, L_B(β)) as the union
∪_{i=1}^{N} T_i(β) of all diffusion/fastest path trees, where each node i serves in turn as the seed
node. The node set of G_B(β) equals the node set N of the underlying temporal
network G = (N, L). Nodes are connected in G_B(β) if they are connected in
any diffusion path tree. Each link in L_B(β) is associated with a weight w^B_jk, which
denotes the number of times link (j, k) (i.e., a contact between j and k) appears in all
diffusion path trees. An example of how to construct the diffusion backbone is given
in Fig. 11.1c and d for β = 1. The ratio w^B_jk/N indicates the probability that link (j, k)
appears in a diffusion trajectory starting from an arbitrary seed node.
When 0 < β < 1, the diffusion process is stochastic. In this case, we construct
the backbone as the average over h realizations of the backbone construction. In each
realization, we perform the SI process starting from each node serving as the seed for
information diffusion, obtain the diffusion path trees and construct one realization
of the diffusion backbone. The weight w^B_jk of a link in G_B(β) is the average weight
of this link over the h realizations. The computational complexity of constructing
G_B(β) is O(N³Th), where T is the length of the observation time window of the
temporal network in number of time steps.
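As an illustration, the following sketch (a simplified implementation, not the authors' code) constructs G_B(β) from a contact list. It assumes that the temporal network is given as a dictionary mapping each time step to the list of undirected contacts active at that step, that the seed is infected at t = 0, and that contacts within a time step are processed in the order in which they are listed.

import random
from collections import defaultdict

def diffusion_backbone(contacts, nodes, T, beta=1.0, runs=1, seed=0):
    """Diffusion backbone G_B(beta): union of the diffusion (fastest path) trees obtained
    by running an SI process from every node; weights count how often each contact
    appears in the trees, averaged over `runs` realizations when beta < 1."""
    rng = random.Random(seed)
    weight = defaultdict(float)
    for _ in range(runs):
        for root in nodes:
            infected = {root}
            tree = []                              # contacts through which nodes get infected
            for t in range(T):
                for i, j in contacts.get(t, []):
                    for a, b in ((i, j), (j, i)):
                        if a in infected and b not in infected and rng.random() <= beta:
                            infected.add(b)
                            tree.append((a, b))
            for a, b in tree:
                weight[frozenset((a, b))] += 1.0 / runs
    return dict(weight)

# toy temporal network: contacts[t] is the list of undirected contacts active at step t
contacts = {1: [(0, 1)], 2: [(1, 2), (0, 3)], 3: [(2, 3)]}
print(diffusion_backbone(contacts, nodes=range(4), T=4, beta=1.0))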

11.4.2 Real-World Temporal Networks

Description and basic features

We consider a large number of empirical temporal networks that capture two types
of contacts, i.e., (face-to-face) proximity and electronic communication (mailing and
messaging) contacts. These temporal networks and their basic statistical properties
are provided in Table 11.2. The networks are measured at discrete time steps, whereas
the duration of a time step differs among the datasets. We have removed the time steps
without any contact in order to consider the steps that are relevant for information
diffusion and to avoid the periods that have no contact due to technical errors in
measurements.

Table 11.2 Basic properties of a list of empirical networks: the number of nodes (N), the original
length of the observation time window (T in number of steps), the total number of contacts (|C|),
the number of links (|L_W|) in G_W and the contact type

Reality mining (RM): N = 96, T = 33,452, |C| = 1,086,404, |L_W| = 2,539, Proximity (Reality mining network dataset–KONECT 2023; Eagle and Pentland 2006)
Hypertext 2009 (HT2009): N = 113, T = 5,246, |C| = 20,818, |L_W| = 2,196, Proximity (Hypertext 2009 network dataset–KONECT 2023; Isella et al. 2011)
High school 2011 (HS2011): N = 126, T = 5,609, |C| = 28,561, |L_W| = 1,710, Proximity (Fournet and Barrat 2014)
High school 2012 (HS2012): N = 180, T = 11,273, |C| = 45,047, |L_W| = 2,220, Proximity (Fournet and Barrat 2014)
High school 2013 (HS2013): N = 327, T = 7,375, |C| = 188,508, |L_W| = 5,818, Proximity (Mastrandrea et al. 2015)
Primary school (PS): N = 242, T = 3,100, |C| = 125,773, |L_W| = 8,317, Proximity (Stehlé et al. 2011)
Workplace (WP): N = 92, T = 7,104, |C| = 9,827, |L_W| = 755, Proximity (Génois et al. 2015)
Manufacturing email (ME): N = 167, T = 57,791, |C| = 82,876, |L_W| = 3,250, Electronic communication (Manufacturing emails network dataset–KONECT 2023; Michalski et al. 2011)
Email Eu (EEU): N = 986, T = 207,880, |C| = 332,334, |L_W| = 16,064, Electronic communication (Leskovec et al. 2007)
Haggle: N = 274, T = 15,662, |C| = 28,244, |L_W| = 2,124, Proximity (Haggle network dataset–KONECT 2023; Chaintreau et al. 2007)
Infectious: N = 410, T = 1,392, |C| = 17,298, |L_W| = 2,765, Proximity (Isella et al. 2011)
DNC email (DNC): N = 1866, T = 18,682, |C| = 37,421, |L_W| = 4,384, Electronic communication (Dnc emails network dataset–KONECT 2023)
Collegemsg: N = 1899, T = 58,911, |C| = 59,835, |L_W| = 13,838, Electronic communication (Panzarasa et al. 2009)

Observation time windows

We aim to understand which node pairs are likely to be connected in the backbone,
and thus contribute to a diffusion process, and how such a connection in the backbone is
related to the node pair's local temporal connection properties. However, real-world
temporal networks are measured over time windows of different lengths T, as shown
in Table 11.2. If a diffusion process has a relatively high spreading probability and
the temporal network has a relatively long observation time window, most nodes can
be reached within a short time. Temporal contacts that happen afterwards will not
contribute to the diffusion process. Hence, we will select the time windows such
that all contacts within each selected time window could possibly contribute, or
equivalently, are relevant to a diffusion process. On the other hand, we will consider
several time windows for each temporal network. This will allow us to understand
how the time window of a temporal network may influence the relation between the
backbones of different spreading probabilities and the relation between a node pair's local
connection properties and its connection in the backbone. We select the observation
time windows for each measured temporal network within its original time window
[0, T ] as follows. On each measured temporal network with its original observation
time window [0, T ], we conduct the SI diffusion process with β = 1 by setting each
node as the seed of the information diffusion process and plot the average prevalence
ρ as a function of time, which is the time step t normalized by the original length
of observation window T (see Fig. 11.3). The average prevalence at the end of the
observation is recorded as ρ(t/T = 1). The time to reach the steady state varies
significantly across the empirical networks. The prevalence curves ρ of the last four
networks in the list (i.e., Haggle, Infectious, DNC and Collegemsg) increase slowly
and approximately linearly over the whole period. In the other networks, however, the
diffusion finishes or stops earlier, and contacts that happen afterwards are not relevant
for the diffusion process.
For each real-world temporal network with its original length of observation time
window T , we consider the following lengths of observation time windows: the
time T p% when the average prevalence reaches p%, where p ∈ {10, 20, . . . , 90} and
p% < ρ(t = T ). For a given empirical temporal network G = (N , L ), we consider
maximally nine observation time windows. For each length T p% , we construct a
sub-temporal network, G p% = (N , L p% ), in which L p% include contacts in L
that occur earlier than T p% . The lengths of observation time window T p% for the
empirical networks are given in Zhan et al. (2019). For a network like RM, we can
get nine sub-networks, whereas for a network like Infectious we can only obtain five
sub-networks. In total, we obtain 106 sub-networks. Contacts in all these sub-networks
are possibly relevant for an SI diffusion process with any spreading probability β.
Without loss of generality, we will consider all these sub-networks with diverse
lengths of observation time windows and temporal network features to study the
relationship between diffusion backbones and temporal connection features.
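The selection of the observation windows can be scripted along the following lines (a hedged sketch using the same contact-list representation as the previous snippet; the resulting lengths T_p% for the empirical networks are tabulated in Zhan et al. (2019) and are not reproduced here).

def average_prevalence(contacts, nodes, T):
    """Average SI prevalence (beta = 1) over all seed nodes, per time step."""
    nodes = list(nodes)
    rho = [0.0] * T
    for root in nodes:
        infected = {root}
        for t in range(T):
            for i, j in contacts.get(t, []):
                if i in infected or j in infected:
                    infected.update((i, j))
            rho[t] += len(infected) / (len(nodes) ** 2)   # average of |infected|/N over the N seeds
    return rho

def observation_window(rho, p):
    """Smallest time step at which the average prevalence reaches p (e.g. p = 0.3), or None."""
    return next((t for t, r in enumerate(rho) if r >= p), None)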

11.4.3 Relationship Between Diffusion Backbones

What are the relationships among the backbones G B (β) with different spreading
probabilities β ∈ [0, 1] on the same temporal network? When the infection proba-

Fig. 11.3 Average prevalence ρ of the SI spreading process with β = 1 on each original empirical
temporal network over time. The time steps are normalized by the corresponding observation time
window T of each network

bility β → 0, the backbone G_B(β → 0) approaches the integrated weighted network
G_W if the network is finite with regard to its size and number of contacts, as proved in
Zhan et al. (2019).
We denote G_B(β → 0) ≡ G_B(β = 0) = G_W, up to a rescaling of the weight of each node
pair in the two networks. When the infection probability β is small, node
pairs with more contacts are more likely to appear in the backbone. The backbone
G B (β) varies from G B (0) = G W when β → 0 to G B (1) when β = 1. G B (β = 0)
well approximates the backbones with a small β. Similarly, G B (1) well approximates
the backbones with a large β. When the observation time window of a temporal
network is small, the backbones with different β are relatively similar in topology.

Degree of a Node in Different Backbones

From now on, we focus on the two extreme backbones G B (0) = G W and G B (1). A
node pair that has at least one contact may not necessarily contribute to a diffusion
process. Hence, the degree of a node in G_B(0) is larger than or equal to its degree in
G_B(1). A universal finding is that the degrees of a node in these two backbones tend
to be linearly and positively correlated in all the empirical networks, where the linear
correlation coefficient between the degrees of a node in these two backbones is above
0.7 for all networks (Zhan et al. 2019). Since G_B(1) is a sub-graph of G_W, the degrees
of a node in these two backbones tend to be correlated if these two backbones have a
similar number of links. However, the two backbones may differ much in number of

Fig. 11.4 Scatter plots of each node’s degree in G W and in G B (1) for networks PS and Infectious
with the longest observation window respectively

links in many temporal networks, especially those with a long observation window.
Figure 11.4 shows the scatter plot of the degree of each node in G_W and in G_B(1)
for the network with the longest observation window from two of the datasets.
In both cases, the backbones G_W and G_B(1) differ much in the number
of links. Our observation suggests that a node that has contacts with many others
tends to be able to propagate the information directly to many others.

11.4.4 Identifying the Diffusion Backbone G B (1)

In this section, we investigate how to identify the (large weight) links in the backbone
G B (1) based on local and/or temporal connection properties of each node pair. The
key objective is to understand how local and temporal connection features of a node
pair are related to its role in the diffusion backbone G_B(1). Our investigation may
also allow us to approximate the backbone, whose computational complexity is high
(O(N³T)), based on local temporal features whose computational complexity is low.
We propose to consider systematically a set of local temporal features of a node
pair and examine whether node pairs having a higher value of each feature/metric
tend to be connected in the backbone G B (1). Some of these properties are derived
from the integrated network G W whereas the feature Time-scaled Weight that we
will propose also encodes the time stamps of the contacts between a node pair. These
node pair properties or metrics, illustrated by a short computational sketch after the list, are:
• Time-scaled Weight of a node pair (j, k) is defined as

    φ_jk(α) = Σ_{m=1}^{n} (1 / t_jk^(m))^α                                   (11.4)

where n is the total number of contacts between j and k over the given observation
window, t_jk^(m) is the time stamp at which the m-th contact occurs and α is the scaling
parameter that controls the contribution of the temporal information. For the node pairs that
have no contact, the time-scaled weights are assumed to be zero. The motivation to
consider this metric is that, when each node is set as the seed of a diffusion process
at time t = 0, contacts that happen earlier have a higher probability to contribute
to the actual diffusion, i.e., to appear in G_B(1). When α = 0, φ_jk(0) = w^B_jk(β = 0)
degenerates to the weight of the node pair in G_W. When α is larger, node pairs with
early contacts have a relatively higher time-scaled weight.
• Degree Product of a node pair ( j, k) is the product d j (β = 0) · dk (β = 0) of
the degrees of j and k in the integrated network G W . Two nodes that are not con-
nected in G_W have a degree product of zero. Given the degree of each node in G_B(1)
and if the links are randomly placed as in the configuration model (Newman et al.
2001), the probability that a node pair (j, k) is connected in G_B(1) is proportional to
d_j(β = 1) · d_k(β = 1), which approximates d_j(β = 0) · d_k(β = 0), since the degrees
of a node in G_W and G_B(1) are found to be strongly and positively correlated and
only node pairs connected in G_W can possibly be connected in G_B(1).
• Strength Product of a node pair (j, k) is defined as the product s_j(β = 0) ·
s_k(β = 0) of the node strengths of j and k in G_W, where the strength
s_j(β = 0) = Σ_{i∈N} A(j, i) of a node in G_W equals the total weight of all the links incident to
this node (Wang et al. 2010; Grady et al. 2012). Two nodes that are not connected in
G_W are considered to have a Strength Product of zero.

• Betweenness of a link in G_W counts the number of shortest paths between all
node pairs that traverse the link. The distance of each link, based on which the shortest
paths are computed, is taken to be 1/w^B_jk(β = 0), i.e., inversely proportional to its link weight
in G_W, because a node pair with more contacts tends to propagate information faster
(Newman 2001; Wang et al. 2008). If two nodes are not connected in G_W, they have
a zero betweenness. Although betweenness is not a local topological property, we
consider it as a benchmark property that has been widely studied.
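The local metrics above can be computed directly from the contact list, as in the following sketch (an illustrative implementation, assuming time steps start at t = 1 so that 1/t is well defined; the betweenness benchmark is not shown here and could, for instance, be obtained with networkx's edge_betweenness_centrality on G_W using 1/w as the distance).

from collections import defaultdict

def node_pair_metrics(contacts, alpha=1.0):
    """Time-scaled weight, degree product and strength product for every node pair,
    given a dict mapping time step t (t >= 1) to a list of contacts (i, j)."""
    phi = defaultdict(float)    # time-scaled weight, Eq. (11.4)
    weight = defaultdict(int)   # number of contacts, i.e. the link weight in G_W
    for t, contact_list in contacts.items():
        for i, j in contact_list:
            pair = frozenset((i, j))
            phi[pair] += (1.0 / t) ** alpha
            weight[pair] += 1

    degree, strength = defaultdict(int), defaultdict(int)
    for pair, w in weight.items():
        for node in pair:
            degree[node] += 1
            strength[node] += w

    metrics = {}
    for pair in weight:
        j, k = tuple(pair)
        metrics[pair] = {
            "time_scaled_weight": phi[pair],
            "degree_product": degree[j] * degree[k],
            "strength_product": strength[j] * strength[k],
        }
    return metrics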
We explore further whether these node pair features could well identify the links
in G B (1). According to the definition of the aforementioned metrics, a higher value
of a metric may suggest the connection of the corresponding node pair in G B (1).
According to each metric, we rank the node pairs and the |L B (1)| node pairs with
the highest values are identified as the links in G B (1). The identification quality of
a metric, e.g. the time-scaled weight φ jk (α), is quantified as the overlap r between
the identified link set L B (φ jk (α)) and the link set L B (1) in G B (1)

    r = r(L_B(φ_jk(α)), L_B(1)) = |L_B(φ_jk(α)) ∩ L_B(1)| / |L_B(1)|.          (11.5)
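Given the metric values for all node pairs and the link set of G_B(1), the overlap of Eq. (11.5) can be evaluated with a few lines (a sketch; ties in the ranking are broken arbitrarily):

def identification_quality(metric, backbone_links, top=None):
    """Overlap r between the highest-ranked node pairs according to `metric`
    (a dict: node pair -> value) and the link set of G_B(1)."""
    k = len(backbone_links) if top is None else top
    ranked = sorted(metric, key=metric.get, reverse=True)[:k]
    return len(set(ranked) & set(backbone_links)) / k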

We focus first on the time-scaled weight φ jk (α) and explore how the quality of
identifying links in G B (1) by using this metric is influenced by the scaling parameter
α. As shown in Fig. 11.5, the quality differs mostly when 0 ≤ α ≤ 2 and remains
relatively the same when α ≥ 2 in all temporal networks. Hence, we will confine
ourselves to the range 0 ≤ α ≤ 2.
The identification quality r by using each metric versus the ratio |L_B(1)|/|L_W| of the
number of links in G_B(1) to that in G_W is plotted in Fig. 11.6 for all the empirical
temporal networks, with different lengths of the observation time windows. The
diagonal curve r = |L_B(1)|/|L_W| corresponds to the quality of the random identification,
where |L_B(1)| links are randomly selected from the links in G_W as the identified links
in G_B(1). Degree product, strength product and betweenness perform, in general,
worse than or similarly to the random identification. Even if the connections in G_B(1)
were random given the degree of each node in G_B(1), the quality r of identifying
links in G_B(1) by using the degree product can be close to the quality of the random
identification, if the distribution of the degree product is relatively homogeneous or if
the ratio |L_B(1)|/|L_W| is large. The degree distribution in G_B(1) is indeed relatively homogeneous
and |L_B(1)|/|L_W| is large in most empirical networks. This explains why the degree product
performs similarly to the random identification. The link weight in G_W, equivalently
φ_jk(α = 0), outperforms the random identification, whereas the time-scaled weight
φ_jk(α) with a larger α performs better. Node pairs with many contacts that occur early
tend to contribute to the actual information propagation, i.e., be connected in G_B(1).
This observation suggests that the temporal information is essential in determining
the role of nodes in a spreading process.

Fig. 11.5 The quality of identifying links in G B (1) by using the time-scaled weight φ jk (α) in
relation to α in temporal networks derived from datasets a RM, b HT2009, c HS2011 and d HS2012

We investigate further whether these metrics can identify the links with the highest
weights in G B (1). We choose the top f |L B (1)| node pairs according to each metric
as the identification of the top f fraction of links in G B (1) with the highest weights.
The quality r of identifying the top f fraction of links with the highest weight in
G B (1) is plotted in Fig. 11.7 for the networks with the longest observation window
from each dataset. The line r = f |L_B(1)|/|L_W| corresponds to the quality of the random
identification. Similar to the identification of all the links in G B (1), the time-scaled
weight φ jk (α) with a large α performs the best in identifying highly weighted links in
G B (1), emphasizing again the important role of the temporal information of contacts.

Fig. 11.6 The quality of identifying links in G_B(1) by using each metric for a all the networks
with all possible observation windows; b all the networks with the longest observation windows.
The time-scaled weight with different α values is considered

Fig. 11.7 The quality r of identifying top f fraction weighted links in G B (1) by using each metric
for each network with the longest observation window in each dataset. We consider the time-scaled
weight with α ∈ [0, 2]

11.5 Conclusion and Discussions

In this chapter, we address a generic question, namely, which links (either static
or temporal) are more likely to appear in an information diffusion trajectory. We
construct the backbone as the union of links that possibly appear in a diffusion
trajectory. Each link in the backbone is associated with a weight representing the
probability that the link appears in a diffusion trajectory. When information diffuses
along the shortest path on a static weighted network, the backbone is the union of
shortest paths. Both numerical simulations and our theory point out the importance
of the link weight in determining a link's probability to appear in a diffusion trajectory.
When information propagates through the fastest path trees governed by the SI
spreading model, the backbone is the union of the fastest path trees rooted at every
possible seed node. The temporal information of the contacts between a node pair
turns out to be crucial in determining the node pair’s weight in the backbone. A node
pair with many early contacts tends to have a high weight in the backbone. Still,

using such local topological properties to predict the links that tend to appear in the
backbone or have a high weight in the backbone is far from accurate.
The backbones can be defined or constructed differently to capture various roles
of links in diverse types of diffusion trajectories. Grady et al. (2014) considered the
shortest path routing on a weighted network. They defined the salience of a link as the
probability that it appears in a shortest path tree rooted at an arbitrary node. The dif-
ference between link salience and link betweenness has been discussed in Shekhtman
et al. (2014). Considering unweighted static networks, Zhang et al. (2018) defined
the link transmission centrality, which can be used to identify weak ties in social net-
works and is shown to be correlated with link betweenness. How link salience and
transmission centrality are related to local topological properties remains a non-trivial
question. Our question of which links are more likely to appear in an information diffusion
trajectory is challenging and interesting for the Susceptible-Infected-Susceptible
epidemic spreading process on a static network. An epidemic spreads through a link
only when one end node of the link is susceptible whereas the other is infected. The
relation between the infection probability of a node and its local topological proper-
ties is still far from well understood. The correlation between the states (infected or
not) of two neighboring nodes introduces extra complexity.

References

Dnc emails network dataset–KONECT. http://konect.uni-koblenz.de/networks/dnc-temporalGraph


Haggle network dataset–KONECT. http://konect.uni-koblenz.de/networks/contact
Hypertext 2009 network dataset–KONECT. http://konect.uni-koblenz.de/networks/sociopatterns-
hypertext
Manufacturing emails network dataset–KONECT. http://konect.uni-koblenz.de/networks/
radoslaw_email
Reality mining network dataset–KONECT. http://konect.uni-koblenz.de/networks/mit
A.L. Barabási, Network Science (Cambridge University Press, 2016)
L. Braunstein, Z. Wu, Y. Chen, S. Buldyrev, T. Kalisky, S. Sreenivasan, R. Cohen, E. López, S.
Havlin, H. Stanley, Optimal path and minimal spanning trees in random weighted networks. I. J.
Bifurc. Chaos 17, 2215–2255 (2007). https://doi.org/10.1142/S0218127407018361
A. Chaintreau, P. Hui, J. Crowcroft, C. Diot, R. Gass, J. Scott, Impact of human mobility on
opportunistic forwarding algorithms. IEEE Trans. Mob. Comput. 6(6), 606–620 (2007)
G. Chartrand, L. Lesniak, Graphs and Digraphs (Chapman and Hall/CRC, 1996)
G. Chartrand, O.R. Oellermann, Applied and Algorithmic Graph Theory (Mcgraw-Hill College,
1992)
Y. Chen, E. López, S. Havlin, H.E. Stanley, Universal behavior of optimal paths in weighted networks
with general disorder. Phys. Rev. Lett. 96, 068,702 (2006). https://doi.org/10.1103/PhysRevLett.
96.068702. https://link.aps.org/doi/10.1103/PhysRevLett.96.068702
N. Eagle, A. (Sandy) Pentland, Reality mining: sensing complex social systems. Pers. Ubiquitous
Comput. 10(4), 255–268 (2006)
J. Fournet, A. Barrat, Contact patterns among high school students. PloS One 9(9), e107,878 (2014)
M. Génois, C.L. Vestergaard, J. Fournet, A. Panisson, I. Bonmarin, A. Barrat, Data on face-to-face
contacts in an office building suggest a low-cost vaccination strategy based on community linkers.
Netw. Sci. 3(3), 326–347 (2015)

K.I. Goh, E. Oh, H. Jeong, B. Kahng, D. Kim, Classification of scale-free networks. Proc. Natl.
Acad. Sci. 99(20), 12583–12588 (2002). https://doi.org/10.1073/pnas.202301299
D. Grady, C. Thiemann, D. Brockmann, Robust classification of salient links in complex networks.
Nat. Commun. 3, 864 (2012)
M. Granovetter, Threshold models of collective behavior. Am. J. Sociol. 83(6), 1420–1443 (1978)
P. Holme, Modern temporal network theory: a colloquium. Eur. Phys. J. B 88(9), 234 (2015)
P. Holme, J. Saramäki, Temporal networks. Phys. Rep. 519(3), 97–125 (2012)
L. Isella, J. Stehlé, A. Barrat, C. Cattuto, J.F. Pinton, W. Van den Broeck, What’s in a crowd?
Analysis of face-to-face behavioral networks. J. Theor. Biol. 271(1), 166–180 (2011)
J. Leskovec, J. Kleinberg, C. Faloutsos, Graph evolution: densification and shrinking diameters.
ACM Trans. Knowl. Discov. Data 1(1), 2 (2007)
C. Liu, X.X. Zhan, Z.K. Zhang, G.Q. Sun, P.M. Hui, How events determine spreading patterns:
information transmission via internal and external influences on social networks. New J. Phys.
17(11), 113,045 (2015)
R. Mastrandrea, J. Fournet, A. Barrat, Contact patterns in a high school: a comparison between
data collected using wearable sensors, contact diaries and friendship surveys. PloS One 10(9),
e0136,497 (2015)
R. Michalski, S. Palus, P. Kazienko, Matching organizational structure and social network extracted
from email communication, in Lecture Notes in Business Information Processing, vol. 87.
(Springer Berlin Heidelberg, 2011), pp. 197–206
M.E. Newman, Scientific collaboration networks. II. Shortest paths, weighted networks, and cen-
trality. Phys. Rev. E 64(1), 016,132 (2001)
M.E.J. Newman, S.H. Strogatz, D.J. Watts, Random graphs with arbitrary degree distributions and
their applications. Phys. Rev. E 64(2), 026,118 (2001)
P. Panzarasa, T. Opsahl, K.M. Carley, Patterns and dynamics of users’ behavior and interaction:
network analysis of an online community. J. Assoc. Inf. Sci. Technol. 60(5), 911–932 (2009)
R. Pastor-Satorras, C. Castellano, P. Van Mieghem, A. Vespignani, Epidemic processes in complex
networks. Rev. Mod. Phys. 87, 925–979 (2015)
B. Qu, H. Wang, Sis epidemic spreading with correlated heterogeneous infection rates. Phys. A
Stat. Mech. Appl. 472, 13–24 (2017)
B. Qu, H. Wang, Sis epidemic spreading with heterogeneous infection rates. IEEE Trans. Netw.
Sci. Eng. 4, 177–186 (2017)
M. Rubinov, O. Sporns, Complex network measures of brain connectivity: uses and interpretations.
Neuroimage 52(3), 1059–1069 (2010). Computational Models of the Brain
I. Scholtes, N. Wider, R. Pfitzner, A. Garas, C.J. Tessone, F. Schweitzer, Causality-driven slow-
down and speed-up of diffusion in non-Markovian temporal networks. Nat. Commun. 5, 5024
(2014)
L.M. Shekhtman, J.P. Bagrow, D. Brockmann, Robustness of skeletons and salient features in
networks. J. Complex Netw. 2(2), 110–120 (2014)
J. Stehlé, N. Voirin, A. Barrat, C. Cattuto, L. Isella, J.F. Pinton, M. Quaggiotto, W. Van den Broeck,
C. Régis, B. Lina, et al., High-resolution measurements of face-to-face contact patterns in a
primary school. PloS One 6(8), e23,176 (2011)
E. Valdano, L. Ferreri, C. Poletto, V. Colizza, Analytical computation of the epidemic threshold on
temporal networks. Phys. Rev. X 5(2), 021,005 (2015)
P. Van Mieghem, S.M. Magdalena, A phase transition in the link weight structure of networks. Phys.
Rev. E 72, 056,138 (2005)
P. Van Mieghem, H. Wang, The observable part of a network. IEEE/ACM Trans. Netw. 17(1),
93–105 (2009). https://doi.org/10.1109/TNET.2008.925089

H. Wang, L. Douw, J.M. Hernández, J.C. Reijneveld, C.J. Stam, P. Van Mieghem, Effect of tumor
resection on the characteristics of functional brain networks. Phys. Rev. E 82, 021,924 (2010)
H. Wang, J.M. Hernandez, P. Van Mieghem, Betweenness centrality in a weighted network. Phys.
Rev. E 77, 046,105 (2008)
H. Wang, Q. Li, G. D’Agostino, S. Havlin, H.E. Stanley, P. Van Mieghem, Effect of the intercon-
nected network structure on the epidemic threshold. Phys. Rev. E 88, 022,801 (2013)
D.J. Watts, A simple model of global cascades on random networks. Proc. Natl. Acad. Sci. USA
99(9), 5766–5771 (2002)
X.X. Zhan, A. Hanjalic, H. Wang, Information diffusion backbones in temporal networks. Sci. Rep.
9(1), 6798 (2019)
Q. Zhang, M. Karsai, A. Vespignani, Link transmission centrality in large-scale social networks.
EPJ Data Sci. 7(1), 33 (2018)
Y.Q. Zhang, X. Li, A.V. Vasilakos, Spectral analysis of epidemic thresholds of temporal networks.
IEEE Trans. Cybern. (2017)
Z.K. Zhang, C. Liu, X.X. Zhan, X. Lu, C.X. Zhang, Y.C. Zhang, Dynamics of information diffusion
and its applications on complex networks. Phys. Rep. 651, 1–34 (2016)
Chapter 12
Continuous-Time Random Walks
and Temporal Networks

Renaud Lambiotte

Abstract Real-world networks often exhibit complex temporal patterns that affect
their dynamics and function. In this chapter, we focus on the mathematical modelling
of diffusion on temporal networks, and on its connection with continuous-time ran-
dom walks. In that case, it is important to distinguish active walkers, whose motion
triggers the activity of the network, from passive walkers, whose motion is restricted
by the activity of the network. One can then develop renewal processes for the dynam-
ics of the walker and for the dynamics of the network respectively, and identify how
the shape of the temporal distribution affects spreading. As we show, the system
exhibits non-Markovian features when the renewal process departs from a Poisson
process, and different mechanisms tend to slow down the exploration of the network
when the temporal distribution presents a fat tail. We further highlight how some of
these ideas could be generalised, for instance to the case of more general spreading
processes.

Keywords Temporal networks · Random walks

12.1 Introduction

Random walks are a paradigmatic model for stochastic processes, finding applica-
tions in a variety of scientific domains (Balescu 1997; Klafter and Sokolov 2011), and
helping to understand how the random motion of particles leads to diffusive processes
at the macroscopic scale. Classically defined on infinitely large regular lattices or on
continuous media, random walks have long been studied on non-trivial topologies,
as different parts of the system, adjacent or not, are connected through a predefined
transition probability. In a finite and discrete setting, random walks are equivalent
to Markov chains (Lovász et al. 1993), whose behaviour is entirely encoded in their
transition matrix. The matrix allows one to describe the succession of states visited by the walker,

R. Lambiotte (B)
Mathematical Institute, University of Oxford, Oxford, UK
e-mail: renaud.lambiotte@maths.ox.ac.uk


whose dynamics is seen as a discrete-time process and where time is measured by the
number of jumps experienced by the walker. However, the complete description of a
trajectory requires an additional input about the statistical properties of the timings
at which the jumps take place, usually under the form of a waiting-time distribution
for the walker. Taken together, the modelling of whereto and when the next step will
be forms the core of the theory of continuous-time random walks.
Random walks also play a central role within the field of network science, and
provide a simple framework for understanding the relation between their structure
and dynamics. Random walks on networks have been used to model diffusion of ideas
in social networks, or diffusion of people in location networks, for instance (Masuda
et al. 2017). In their dual form, they are also used for the modelling of decentralised
consensus (Blondel et al. 2005). In addition, random walks have been exploited to
extract non-local information from the underlying network structure. Take Pagerank
for instance, defined as the density of walkers on a node at stationarity (Brin and Page
1998); or random walk-based kernels defining a similarity measure between nodes,
and embedding nodes in low-dimensional space (Fouss et al. 2016); or community
detection where clusters are defined in terms of their tendency to trap a walker for long
times (Rosvall and Bergstrom 2008; Delvenne et al. 2010; Lambiotte et al. 2014).
In each of these examples, the properties of the process at a slow time scale are
essentially used to capture large-scale information in the system. Importantly, these
works usually rely on discrete-time random walks or on basic Poisson processes in
their continuous counterpart.
The structure of networks has been the subject of intense investigation since the
early works on small-world or scale-free networks (Newman 2010). This activity
has originally been driven by the availability of large relational datasets in a variety
of disciplines, leading to the design of new methods to uncover their properties
and of models to reproduce the empirical findings. Yet, a vast majority of datasets
have only provided static snapshots of networks, or information about their growth
in certain conditions. It is only more recently that the availability of fine-grained
longitudinal data has motivated the study of the temporal properties of networks
(Holme and Saramäki 2012; Holme 2015; Masuda and Lambiotte 1996). Several
works have shown that the dynamics of real-world networks are non-trivial and
exhibit a combination of temporal correlations and non-stationarity.
In this chapter, we will focus on a particular aspect that has attracted much attention
in the literature, the presence of burstiness in the temporal series of network activity
(Barabasi 2005). Take a specific node, or a specific edge, and look at the sequence
of events associated to that object. The resulting distribution of inter-event times
has been shown to differ significantly from an exponential, even after discarding
confounding factors (Malmgren et al. 2008). After a short introduction on relevant
concepts, and a clarification of the differences between active and passive diffusion,
we will use the language of continuous-time random walks to identify how the shape
of temporal distributions affects diffusion. In analogy with static networks, where
deviations from a binomial degree distribution play a central role, we will focus on
renewal processes whose inter-event time distribution differs from an exponential.

Finally, we will widen the scope and discuss possible generalisations of the models,
for instance in the case of non-conservative spreading processes.

12.2 Models of Graphs and of Temporal Sequences

Random models play an important role when analysing real-world data. The main
purpose of this section is to introduce simple random models for graphs and for
temporal sequences. Both sub-sections are organised in mirror to highlight the sim-
ilarities between the models.

12.2.1 Random Graphs

The most fundamental model of random graph is the Erdős–Rényi model. Usually
denoted by G(N , q), it takes as parameters the number of nodes N , and the probability
q that two distinct nodes are connected by a link. By construction, each pair of nodes
is a Bernoulli process, whose realisation is independent from that of other pairs in the
graph. The Erdős–Rényi model, as any random graph model, has to be considered
as an ensemble of graphs, whose probability of being realised depends on the model
parameters. Due to the independence between the processes defined on each edge,
several properties of the model can be computed exactly. This includes the degree
distribution, expected to take the form of the binomial distribution
 
    p(k) = (N − 1 choose k) q^k (1 − q)^{N−1−k},                              (12.1)

but also the number of cliques of any size, or the percolation threshold. The under-
lying assumptions of a Erdős–Rényi model are often violated in empirical data.
Take connections in a social network, where triadic closure induces correlations
between neighbouring edges for instance. Yet, the model’s simplicity and analytical
tractability naturally make it a baseline model, and more realistic network models
can be developed systematically by relaxing its assumptions. Well-known examples
include:
• The configuration model. Real-life networks tend to present a strong heterogeneity
in their degrees, associated to fat tailed distributions very different from a bino-
mial. The configuration model is defined as a random graph in which all possible
configurations appear with the same probability, with the constraint that each node
i has a prescribed degree ki , that is with a tuneable degree distribution.
• Stochastic block models. Real-life networks are not homogeneous and their nodes
tend to be organised in groups revealing their function in the system. These
groups may take the form of assortative or disassortative communities, and may be
reproduced by stochastic block models where the nodes are divided into k classes,
and the probability p_ij for two nodes i, j, belonging to classes c_i and c_j, to be
connected is encoded in an affinity matrix indexed by c_i and c_j.
Note that both models can be combined to form so-called degree-corrected stochastic
block models (Karrer and Newman 2011). In essence, both models assume that
the processes on each edge are independent but break the assumption that they are
identical. Certain pairs of nodes are more or less likely to be connected within this
framework. Models questioning the independence of different edges include
• The preferential attachment model. In this mechanistic model for growing net-
works, nodes are added one at a time and tend to connect with a higher probability
to high degree nodes. The resulting networks naturally produce fat-tailed degree
distributions and exhibit correlations building in the course of the process. Varia-
tions of the model include divergence-duplication models (Ispolatov et al. 2005),
or copying models (Lambiotte et al. 2016), whose correlations produce a high den-
sity of triangles, and cliques of all size, that are negligible in the afore-mentioned
models.
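For concreteness, the three families of models discussed above can be generated in a few lines with networkx (a minimal sketch; the sizes and probabilities are arbitrary choices):

import networkx as nx

n = 1000
er  = nx.gnp_random_graph(n, 0.01, seed=42)                      # Erdős–Rényi G(N, q)
cm  = nx.configuration_model([3] * n, seed=42)                   # configuration model, prescribed degrees
sbm = nx.stochastic_block_model([n // 2, n // 2],                # two classes of nodes
                                [[0.02, 0.002], [0.002, 0.02]],  # affinity matrix
                                seed=42)
print(er.number_of_edges(), cm.number_of_edges(), sbm.number_of_edges())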

12.2.2 Poisson and Renewal Processes

Let us now turn our attention to the modelling of temporal sequences of events.
Examples include the sequences of retweets of an original tweet, of meeting a cat
in the street, or of nuclei to disintegrate in a radioactive material. As a first order
approximation, these systems can be modelled by a Poisson process, assuming that
events are independent of each other and that their rate is constant over time. As
in the case of the Erdős–Rényi model, these assumptions are often unrealistic, but the
resulting simplicity allows us to derive analytically its statistical properties. For
Poisson processes, the inter-event time between two consecutive events, usually
denoted by τ , is exponentially distributed according to

ψ(τ ) = λe−λτ , (12.2)

where λ is the rate at which events occur. Likewise, the distribution of the number
of events observed within a given time window is readily found to be

    p(n, t) = (λt)^n / n! · e^{−λt}                                            (12.3)
for any n ≥ 0. Deviations from these distributions in empirical data indicate that
the assumptions of a Poisson process are not verified and that a more complicated
process is at play. Generalised models relaxing some of these assumptions include:
• Renewal processes. Empirical data often show fat-tailed inter-event time distribu-
tions, which can be captured by renewal processes. In a renewal process, inter-event

times are independent of each other and drawn from the same distribution ψ(τ ).
When ψ(τ ) = λe−λτ , we recover a Poisson process. The properties of renewal
processes are usually best analysed in the frequency domain. After defining the
Laplace transform

    ψ̂(s) = ∫_0^∞ ψ(τ) e^{−sτ} dτ                                              (12.4)

and noting that a convolution in time translates into a product in the Laplace
domain, one readily finds the probability of having observed n events at time t

    p̂(n, s) = [ψ̂(s)]^n (1 − ψ̂(s)) / s.                                        (12.5)
As we will see below, this quantity is critical, as it relates two ways to count time:
one in terms of the number of events, n, and the other in terms of the physical
time, t.
• Non-homogeneous Poisson processes. Real-life time series are often non-
stationary, which can be incorporated in a Poisson process with a time-dependent
event rate λ(t). For a non-homogeneous Poisson process, (12.3) is extended as

    p(n, t) = Λ(t)^n / n! · e^{−Λ(t)},                                         (12.6)

where

    Λ(t) = ∫_0^t λ(t′) dt′.                                                    (12.7)

Similarly, the distribution of inter-event times is given by

    ψ(τ) = λ(τ) e^{−Λ(τ)},                                                     (12.8)

and leads to time-dependent, non-exponential distributions in general.


The previous two generalisations assume that successive inter-event times are
independent processes, which is expected to be invalid in many situations. Take
discussion threads between individuals for instance, or cascades of events in social
media. This type of situation can instead be modelled by
• Self-exciting processes. The underlying idea is that an event induces an additional
event rate for future events. This property is at the core of Hawkes processes
(Hawkes 1971; Masuda et al. 2013) where, in their simplest form, the event rate
at time t is given by

    λ(t) = λ_0 + Σ_{ℓ: t_ℓ ≤ t} χ(t − t_ℓ),                                    (12.9)

where t_ℓ is the time of the ℓ-th event. The model incorporates a baseline rate λ_0
independent of self excitation and a memory kernel χ (t) describing the additional

rate incurred by an event. It is generally assumed that χ(t) peaks at t = 0 and
monotonically decays towards zero as t increases.
As in the case of random graph models, different generalisations can be combined to
form more realistic models. This is the case of TideH, for instance, a model for retweet
dynamics combining ingredients from Hawkes processes and non-homogeneous
Poisson processes (Kobayashi and Lambiotte 2016).
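The difference between a Poisson process and a renewal process with a fat-tailed inter-event time distribution is easy to explore numerically (a small sketch; the Pareto parameters are an arbitrary choice yielding a finite mean and infinite variance):

import numpy as np

rng = np.random.default_rng(0)

def renewal_event_times(sample_tau, t_max):
    """Event times in [0, t_max] of a renewal process with inter-event times from sample_tau()."""
    times, t = [], sample_tau()
    while t <= t_max:
        times.append(t)
        t += sample_tau()
    return np.array(times)

poisson  = renewal_event_times(lambda: rng.exponential(1.0), t_max=1000.0)    # Poisson process
fat_tail = renewal_event_times(lambda: rng.pareto(1.5) + 1.0, t_max=1000.0)   # Pareto inter-event times

print(len(poisson), len(fat_tail))   # number of events observed in the same time window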

12.3 Trajectories on Networks

12.3.1 Discrete-Time Dynamics

Let us now turn to the case of static networks and the description of trajectories
on their nodes. A canonical example could be one-dollar bill transiting between
individuals forming a large social network (Brockmann et al. 2006). For the sake of
simplicity, we will assume that social interactions are undirected and unweighted,
and that the whole network forms a connected component. The structure of the
network is encoded through its adjacency matrix A, whose element Ai j determines
the presence of an edge between nodes i and j. As a first step, we consider the case
when the random walk process takes place at the discrete times n. A trajectory on
the network is thus characterised by a sequence X 0 , X 1 , . . ., X n , . . ., where X n is a
random variable denoting the node visited by the walker at time n. In general, the
state X n+1 may depend on all preceding states of the dynamics and the probability
of visiting a certain node i requires information about the full history of the process

p(X n+1 = i n+1 |X n = i n , . . . , X 1 = i 1 , X 0 = i 0 ). (12.10)

The process simplifies drastically in situations when the system is stationary and the
conditional probability depends only on the state at time n. The process then takes
the form of a Markov chain, and is fully described by its initial state and the N × N
transition matrix
    p(X_{n+1} = j | X_n = i) ≡ T_ij = A_ij / k_i,                              (12.11)

where ki is, as before, the degree of node i. For instance, the probability that state i
is visited at time n, denoted by Pi (n), evolves according to


    P_j(n + 1) = Σ_{i=1}^{N} P_i(n) T_ij                                       (12.12)

or, in matrix notations,


P(n + 1) = P(n)T , (12.13)

yielding the formal solution


    P(n) = P(0) T^n.                                                           (12.14)

When the underlying network is connected and the corresponding Markov chain is
ergodic, it can be shown that any initial condition converges to the stationary density
Pi∗ = ki /2m solution of
P∗ = P∗T . (12.15)
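A direct numerical check of this convergence (a minimal sketch on a standard example graph; not from the chapter):

import networkx as nx
import numpy as np

G = nx.karate_club_graph()                 # small connected, non-bipartite example
A = nx.to_numpy_array(G)
k = A.sum(axis=1)
T = A / k[:, None]                         # transition matrix T_ij = A_ij / k_i

P = np.zeros(len(G)); P[0] = 1.0           # walker initially on node 0
for _ in range(1000):                      # iterate P(n+1) = P(n) T
    P = P @ T

print(np.allclose(P, k / k.sum()))         # converges to the stationary density k_i / (2m)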

12.3.2 Fourier Modes

The solution, Eq. (12.14), involves products of matrices, which can be simplified
by rewriting the system in the basis formed by the eigenvectors of the transition
matrix. This operation is sometimes called the graph Fourier transform (Perraudin
and Vandergheynst 2017), and it allows to replace the matrix products by algebraic
products for amplitudes associated to the different dynamical modes. To show this,
we consider the symmetric matrix

    Ã_ij = A_ij / √(k_i k_j),                                                  (12.16)

and its spectral decomposition


    Ã_ij = Σ_{ℓ=1}^{N} λ_ℓ (u_ℓ)_i (u_ℓ)_j,                                    (12.17)

where u_ℓ is the normalised eigenvector of eigenvalue λ_ℓ, and where we have assumed
that all eigenvalues are distinct to avoid unnecessary complications. By construction,
the eigenvectors verify ⟨u_ℓ, u_m⟩ = δ_ℓm and form a proper basis for signals defined
on the nodes of the network. One should also note that the eigenvalues are in the
interval [−1, 1], that the multiplicity of the dominant eigenvalue 1 gives the number of
connected components in the graph, and that an eigenvalue equal to −1 indicates that
the graph is bipartite. It is straightforward to show that the left and right eigenvectors
of T are given by
 
    u_ℓ^L = ((u_ℓ)_1 √k_1, . . . , (u_ℓ)_N √k_N),                              (12.18)

    u_ℓ^R = ((u_ℓ)_1/√k_1, . . . , (u_ℓ)_N/√k_N)                               (12.19)

respectively which implies, after some algebra, that the state of the random walk
after n steps is given by a linear combination of the eigenmodes


    P_i(n) = Σ_{ℓ=1}^{N} a_ℓ(n) (u_ℓ^L)_i,                                     (12.20)

where the amplitude of the modes evolves as

    a_ℓ(n) = λ_ℓ^n a_ℓ(0),                                                     (12.21)

and a_ℓ(0) ≡ ⟨P(0), u_ℓ^R⟩ is given by the initial condition.


The spectral decomposition (12.20) helps to understand how the structure of a
network affects the diffusion of a random walker. The stationary density corresponds
to the mode with λ = 1, assumed to be unique as the network is connected. In
addition, in situations when the network is non-bipartite, and no eigenvalue is equal
to −1, all the other modes asymptotically decay to 0, each one with a time-scale
associated to its eigenvalue. The long-time relaxation to the stationary density is
determined by the eigenmode associated to λ2nd , the second largest eigenvalue, which
is related to the spectral gap 1 − λ2nd . A small spectral gap entails slow relaxation
and the presence of a bottleneck between communities in the network, as shown by
the Cheeger inequality (Chung and Graham 1997).
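The decomposition into modes can also be verified numerically (a sketch reusing the example graph of the previous snippet; numpy's eigh returns the eigenvalues of the symmetric matrix Ã in increasing order):

import networkx as nx
import numpy as np

G = nx.karate_club_graph()
A = nx.to_numpy_array(G)
k = A.sum(axis=1)
Atilde = A / np.sqrt(np.outer(k, k))       # symmetric matrix of Eq. (12.16)
lam, U = np.linalg.eigh(Atilde)            # eigenvalues lambda_l and eigenvectors u_l (columns)
print("spectral gap:", 1.0 - lam[-2])      # 1 - lambda_2nd

uL = U * np.sqrt(k)[:, None]               # left eigenvectors of T, Eq. (12.18)
uR = U / np.sqrt(k)[:, None]               # right eigenvectors of T, Eq. (12.19)

T = A / k[:, None]
P0 = np.zeros(len(G)); P0[0] = 1.0
n = 5
a0 = P0 @ uR                               # mode amplitudes a_l(0) = <P(0), u_l^R>
P_modes = (a0 * lam**n) @ uL.T             # Eqs. (12.20)-(12.21)
P_direct = P0 @ np.linalg.matrix_power(T, n)
print(np.allclose(P_modes, P_direct))      # the two computations agree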

12.3.3 Continuous-Time Dynamics

We have described the trajectory of the walker in terms of the number of jumps so far.
We now turn to the situation when the jumps take place in continuous time, which
motivates the use of continuous-time random walk processes. The passage from dis-
crete to continuous time is usually done by modelling the sequence of jumps of the
walker as a renewal process: the walker waits between two jumps for a duration τ
given by the probability density function ψ(τ) before performing a transition according
to the Markov chain. We have assumed here that the waiting time distribution
is identical on each node. The position of the walker at time t is given by


    P_i(t) = Σ_{n=0}^{∞} P_i(n) p(n, t),                                       (12.22)

where we have used the fact that arriving at node i in n steps, and performing n steps
in time t are independent processes.
By going into the Laplace domain and combining (12.5) and (12.14), we find

    P̂(s) = (1 − ψ̂(s))/s · P(0) [I − T ψ̂(s)]^{−1},                             (12.23)

whose inverse Laplace transform provides the probability Pi (t) for the walker to be
on node i at time t. The latter usually involves convolutions in time, reflecting the

lack of Markovianity of the process for a general renewal process. Equation (12.23)
also takes the equivalent form

    (1/ψ̂(s) − 1) P̂(s) = (1/ψ̂(s) − 1) P(0)/s + P̂(s) L                          (12.24)

where L = T − I is the normalised Laplacian of the network. This expression
simplifies drastically in the case when the renewal process is a Poisson process, and
ψ(τ ) = λe−λτ , leading to the standard rate equation

    dP(t)/dt = P(t) L.                                                         (12.25)
It is important to emphasise that (12.24) provides an exact solution to the problem and
that it departs from (12.25) through its causal operator (1/ψ̂(s) − 1), which translates
the input from the neighbours of a node into a change of its state, different from a
usual d/dt. Note that this operator asymptotically behaves like a fractional derivative
in situations when the waiting time distribution has a power-law tail (De Nigris et al.
2016).
This solution also helps us to clarify the impact of the shape of the waiting time
distribution on the speed of diffusion. In the basis of the eigenvectors of the transition
matrix, it is straightforward to generalise (12.21) to obtain

    P̂(s) = (1 − ψ̂(s))/s · Σ_{ℓ=1}^{N} a_ℓ(0)/(1 − λ_ℓ ψ̂(s)) · u_ℓ^L.          (12.26)

In other words, the time evolution of the amplitude of each mode is given by

    â_ℓ(s) = (1 − ψ̂(s)) / [s (1 − λ_ℓ ψ̂(s))] a_ℓ(0),                           (12.27)

which is, in general, different from an exponential decay. The asymptotic behaviour
can be obtained by performing a small s expansion

    ψ̂(s) = 1 − ⟨τ⟩ s + (1/2) ⟨τ²⟩ s² + o(s²)                                   (12.28)
and assuming a finite mean and variance, yielding the dominant terms
 
    â_ℓ(s) = ⟨τ⟩/(1 − λ_ℓ) · [1 − s (λ_ℓ⟨τ⟩/(1 − λ_ℓ) + ⟨τ²⟩/(2⟨τ⟩))] a_ℓ(0)    (12.29)

and thus a characteristic time t_ℓ,


 
    t_ℓ = ⟨τ⟩ (1/Δ_ℓ + β),                                                     (12.30)

where Δ_ℓ = 1 − λ_ℓ is an eigenvalue of the normalised Laplacian and

    β = (σ_τ² − ⟨τ⟩²) / (2⟨τ⟩²)                                                (12.31)

measures the relative variance of τ. Poisson processes yield β = 0 and this expression clearly shows
that negative values of β tend to accelerate the relaxation of each mode, while larger
values slow them down. The former happens in the case of discrete-time dynamics
for instance, when ψ(τ) is a delta distribution. The latter happens when the distribution has
a fat tail. Importantly, (12.30) shows that a combination of structural and temporal
information determines the temporal properties of the process.
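Equation (12.30) can be turned into a small helper for comparing the relaxation of a given mode under different waiting-time statistics (a sketch; the spectral gap is an assumed input, for instance obtained as in the snippet of Sect. 12.3.2):

def mode_timescale(mean_tau, var_tau, gap):
    """Characteristic time t_l = <tau> (1/Delta_l + beta) of Eq. (12.30)."""
    beta = (var_tau - mean_tau**2) / (2 * mean_tau**2)
    return mean_tau * (1.0 / gap + beta)

gap = 0.2                                  # assumed spectral gap 1 - lambda_2nd
print(mode_timescale(1.0, 0.0, gap))       # deterministic waiting times (beta = -1/2): faster
print(mode_timescale(1.0, 1.0, gap))       # Poisson waiting times (variance = mean^2, beta = 0)
print(mode_timescale(1.0, 10.0, gap))      # broad waiting-time distribution: slower relaxation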

12.4 Diffusion on Temporal Networks

12.4.1 Active Versus Passive Walks

What about temporal networks? The results derived in the previous section focus
on random walks on static networks. They are nonetheless instructive to model and
understand diffusion on temporal networks. As a first step, we should emphasise
that the distinction between the dynamics on the network and the dynamics of the
network is not always clear-cut (Hoffmann et al. 2012; Speidel et al. 2015). The
temporal nature of a network usually comes from time series of events taking place
on nodes or edges. There are situations when it is the diffusive process itself that
defines the temporal network. Take the action of sending an email or an SMS to a
friend, and the modelling of information diffusion in the resulting network. In this
case of active diffusion, the motion of the random walker is defining the temporal
patterns of activity on links existing, as transition trigger the activation of a potential
edge. The model of Sect. 12.3.3 is then a good candidate to explore the interplay
between structure and dynamics in the resulting temporal network. Note that even
the Poisson model described by (12.25) can then be seen as generating a temporal
network, even if it is not a very interesting one.
There are other situations, however, when the motion of the walker does not trig-
ger the connections between nodes, but it is instead constrained by their temporal
patterns. A good example would be that of a person randomly walking on a public transportation
network, or of a disease spreading in a contact network. In that case, the temporal
sequence of edges restrains the time-respecting paths that are available for the walker
and one talks of passive diffusion. As we will see, passive random walks can also
be mapped to continuous-time random walks to some extent, but this operation must
be performed more carefully. As a simple model of temporal networks supporting

diffusion, let us consider a stochastic temporal network. The network is modelled as
a set of potential edges between nodes, each one evolving as an independent renewal
process, with a distribution of inter-event times φ(τ ). A random walker located on a
node i performs a jump as soon as an edge appears, for instance to node j, where it
waits until the next available edge.

12.4.2 Bus Paradox and Backtracking Transitions

When considering passive random walks, it is critical to clearly distinguish the
waiting-time distribution ψ(τ) from the inter-event time distribution φ(τ). The for-
mer characterises the times that a walker has to wait on a node before its next move.
The latter gives the time between edge activations in the renewal process defining the
stochastic temporal network. The inter-event time distribution is a parameter of the
network model but the motion of the walker is directly affected by the waiting-time
distribution, and it should thus be estimated. To do so, let us first focus on the case
of a walker arriving at a node j from i and calculating the time before an edge to
another node k appears. Assuming the independence between the act of arriving at
node j and the appearance of the edge to k, one finds that both distributions are
related as

    ψ(τ) = (1/⟨τ⟩_φ) ∫_τ^∞ φ(τ′) dτ′,                                           (12.32)

where ⟨τ⟩_φ is the average inter-event time. Most strikingly, the average
waiting time is

    ⟨τ⟩_ψ = ⟨τ²⟩_φ / (2⟨τ⟩_φ)                                                   (12.33)

and it depends on the variance of the inter-activation time. At a fixed value of ⟨τ⟩_φ,
the waiting time can thus be arbitrarily large if the variance of φ(τ) is sufficiently
large. The waiting-time paradox is a standard result in queuing theory (Allen 1990)
and is an example of length-biased sampling. It is often called the bus paradox, after
the observation that the average waiting time at a bus stop tends to be longer than
half of the average interval between two buses expected from the timetable.
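The waiting-time paradox is easily reproduced numerically (a sketch; the observer arrives at a uniformly random instant within a long realisation of the renewal process):

import numpy as np

rng = np.random.default_rng(0)

def mean_waiting_time(sample_tau, n_events=200_000, n_probes=50_000):
    """Empirical mean time until the next event for an observer arriving at a random instant."""
    times = np.cumsum(sample_tau(n_events))
    probes = rng.uniform(0.0, times[-1], n_probes)
    idx = np.searchsorted(times, probes)          # first event after each arrival instant
    return float(np.mean(times[idx] - probes))

print(mean_waiting_time(lambda n: rng.exponential(1.0, n)))   # ~1.0 = <tau^2>/(2<tau>), not 0.5
print(mean_waiting_time(lambda n: rng.pareto(2.5, n) + 1.0))  # heavier tail: even longer waits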
As a second step, it is important to note that the approximation behind the deriva-
tion of (12.32) is, in general, not respected if the walker passes several times by the
same edge, as information about the previous passage time may help to predict the
next activation time. This effect is most apparent in (but is not limited to) the case of
undirected networks. Consider a walker taking an edge from node i to node j. The
time before the next activation of the edge back to i is clearly not given by ψ(τ ) in
(12.32), but simply by φ(τ ). The waiting time for a backtracking edge is therefore
typically different, and shorter for fat-tailed distributions, than the waiting time of
a non-backtracking edge. This difference is particularly critical because edges are
in competition with each other (Hoffmann et al. 2012). When a walker waits on a

node, the model specifies that it takes the first edge to appear. For this reason, the
prevalence of an edge over another may bias the trajectory of the walker, and make
the process non-Markovian (Speidel et al. 2015).
Let us quantify this effect when the walker arrives on a node j with degree k j , and
consider the probability of the time t at which the walker takes a specific edge. As
before, the inter-event times of the links are identically and independently distributed
according to φ(τ ). Starting from the time when the walker arrived on j, the time
of the next activation for a backtracking step is simply φ(τ ). The time for another
edge to activate is instead determined by the waiting-time distribution ψ(τ ), where
we assume that the approximation (12.32) is valid. For the walker to take an edge at
time t, no other edge can have appeared in [0, t]. Therefore, we obtain
    f_back(t) ≈ φ(t) [∫_t^∞ ψ(τ) dτ]^{k_j − 1}                                  (12.34)

    f_non-back(t) ≈ ψ(t) [∫_t^∞ ψ(τ) dτ]^{k_j − 2} ∫_t^∞ φ(τ) dτ                (12.35)

As we discussed, if ψ has a fat tail, the waiting time is larger than the inter-event
time on average and the walker preferentially backtracks, thereby leading to non-
Markovian trajectories. Here, we should clarify the distinction between two types of
non-Markovianity that can be induced on temporal networks. In (12.24), the sequence
of nodes visited by the walker is described by a Markov chain, but the statistical
properties of the timings induce long-term memory effects. In (12.34), instead, the
sequence of nodes visited by the walker cannot be produced by a first-order Markov
process.
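To make the backtracking bias tangible, the sketch below (an illustrative calculation added here, not from the original text; the Pareto form of φ, its parameters and the degree k are assumptions, chosen so that the mean inter-event time greatly exceeds its minimum) numerically integrates Eqs. (12.34) and (12.35) over all times and compares the total probability of a backtracking step with that of a particular non-backtracking step.

```python
import numpy as np
from scipy.integrate import quad

# Illustrative parameters (assumptions for this example): Pareto inter-event times
# with minimum 1 and tail exponent alpha close to 1, so that the mean greatly
# exceeds the minimum, and a walker arriving at a node of degree k.
alpha, k = 1.2, 4
mean_tau = alpha / (alpha - 1.0)                  # <tau>_phi

def phi(t):                                       # inter-event time density
    return alpha * t**(-alpha - 1.0) if t >= 1.0 else 0.0

def phi_surv(t):                                  # P(inter-event time > t)
    return t**(-alpha) if t >= 1.0 else 1.0

def psi(t):                                       # waiting-time density, Eq. (12.32)
    return phi_surv(t) / mean_tau

def psi_surv(t):                                  # P(waiting time > t), integrated analytically
    return t**(1.0 - alpha) / alpha if t >= 1.0 else ((1.0 - t) * (alpha - 1.0) + 1.0) / alpha

# Probability that the walker's next move is the backtracking step: the integral of
# f_back(t), Eq. (12.34), over all times (and analogously for Eq. (12.35)).
p_back = quad(lambda t: phi(t) * psi_surv(t)**(k - 1), 1.0, np.inf)[0]
g = lambda t: psi(t) * psi_surv(t)**(k - 2) * phi_surv(t)
p_non = quad(g, 0.0, 1.0)[0] + quad(g, 1.0, np.inf)[0]

print("P(backtracking step)            :", round(p_back, 3))
print("P(a given non-backtracking step):", round(p_non, 3))
print("unbiased choice would give      :", round(1.0 / k, 3))
print("consistency check (should be ~1):", round(p_back + (k - 1) * p_non, 3))
```

For these parameters the backtracking probability clearly exceeds 1/k, in line with the argument above.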
To summarise, when considering passive diffusion on a stochastic temporal net-
work, the rate at which the random walker explores the network is slowed down in
three ways when the inter-event time distribution has a fat tail, namely through:

• the bus paradox, because the waiting time of the walker on the nodes tends to be
longer on average. As the speed of diffusion is controlled by the sum of the waiting
times of the walker, this effect leads to a slow down of diffusion.
• the backtracking bias. The random walker tends to backtrack, which hinders its
exploration of the network and can be shown to increase the mixing time of the
process (Saramäki and Holme 2015; Gueuning et al. 2017).
• the variance of the waiting-time distribution. On top of the slowdown due to
the bus paradox and the increase of the average waiting time, the variance of the
waiting time also slows down diffusion through the same mechanism as for active
random walks, cf. (12.30). A short simulation after this list illustrates the combined
effect of these three mechanisms.
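The sketch below (an illustrative simulation written for this summary, not taken from the chapter; the graph, the two inter-event distributions and the initialisation of every renewal process at time zero, which ignores stationarity, are simplifying assumptions) lets a passive walker take the first incident edge to activate and compares how many distinct nodes it visits under exponential and heavy-tailed inter-event times with the same mean.

```python
import numpy as np
import networkx as nx

rng = np.random.default_rng(5)

def activation_times(draw, horizon):
    """Activation times of one edge: a renewal process with inter-event times drawn from `draw`."""
    times, t = [], draw()
    while t < horizon:
        times.append(t)
        t += draw()
    return np.array(times)

def passive_walk_coverage(G, draw, horizon):
    """Distinct nodes visited by a passive walker that always takes the first incident edge to activate."""
    acts = {tuple(sorted(e)): activation_times(draw, horizon) for e in G.edges()}
    node, t, visited = 0, 0.0, {0}
    while True:
        nxt = []
        for u in G.neighbors(node):
            times = acts[tuple(sorted((node, u)))]
            i = np.searchsorted(times, t, side="right")
            if i < times.size:
                nxt.append((times[i], u))
        if not nxt:
            return len(visited)               # no further activation before the horizon
        t, node = min(nxt)
        visited.add(node)

G = nx.erdos_renyi_graph(200, 0.03, seed=5)
draws = {
    "exponential (mean 1)": lambda: rng.exponential(1.0),
    "Pareto, fat tail (mean 1)": lambda: (1.0 / 3.0) * (1.0 + rng.pareto(1.5)),
}
for name, draw in draws.items():
    coverage = np.mean([passive_walk_coverage(G, draw, horizon=200.0) for _ in range(10)])
    print(f"{name:26s}: {coverage:.1f} distinct nodes visited on average")
```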

12.5 Perspectives

The main purpose of this chapter was to provide an overview of theoretical results for
diffusion on temporal networks. As we discussed, the problem may be understood
through the lens of continuous-time random walks, after carefully distinguishing
between waiting time and inter-event time, on the one hand, and active versus passive
walks, on the other hand. An analytical approach allows us to identify unambiguously
the mechanisms accelerating or slowing down the diffusion, and also helps to
warn against pitfalls that may be encountered in numerical simulations. A good example
concerns the use of null models to determine how the temporal nature of a real-world
network affects diffusion. A popular solution consists in comparing numerical simulations
of diffusion on the original data and on different versions of randomised
data (Karsai et al. 2011). The results from Sect. 12.4.2 show that randomised null
models in which temporal correlations are removed still have a tendency towards
backtracking, and thus still slow down the exploration of the network.
The results presented in this chapter also open different perspectives for future
research. As we discussed in Sect. 12.2.1, different types of random graph models
have been proposed for static networks. A model like the stochastic temporal network
incorporates temporal activity on a given network structure, which opens the question
of how to properly define generalisations of the configuration model or stochastic
block model in this context. Answers may be found by clarifying the connections with
activity-driven models, where the dynamics can also be generated by general renewal
processes (Moinet et al. 2015, 2019). The temporal networks presented in this chapter
also suffer from limitations that may hinder their applicability. These include their
stationarity, the absence of correlations between edges and the instantaneity of the
interactions. To address the last two limitations, we point the reader to the possibility
of using higher-order Markov models for the data (Scholtes et al. 2014; Lambiotte et al.
2019), and to recent generalisations based on continuous-time random walks allowing
for interactions with a finite duration (Petit et al. 2018). The latter emphasises a
critical aspect of temporal networks, which are characterised by different processes
and their corresponding time scales. These time scales include one associated with the
motion of the random walker, one with the time between successive activity periods
of the edges, and another with the duration of the activity periods. Depending on the
model parameters, one process may dominate the others and lead to mathematical
simplifications for the dynamics.
As a final comment, we would also like to come back to our observation that
temporal networks may be generated by active diffusive processes. This chapter was
limited to conservative spreading processes, where the number of diffusing entities
is preserved in time. There are many practical situations, however, when this is
not the case. Take the spreading of viruses in human populations or of hashtags in
online social networks for instance. In that case, other spreading models should be
considered to generate more realistic temporal networks. A promising candidate is
multivariate Hawkes processes, generalising (12.9) to interacting entities, and whose
equation of evolution takes a form very similar to (12.24):

$$\lambda(s) = \frac{\lambda_0}{s} + \chi(s)\,\lambda(s)\,A, \qquad (12.36)$$

where the vector λ(s) is the Laplace transform of the average rate of activity on
each node, χ(s) is the Laplace transform of the memory kernel, and A is the adjacency matrix of the network.
The presence of the adjacency matrix instead of the Laplacian is a clear sign of the
epidemic nature of the spreading. Note also that a heterogeneous mean-field, à la
configuration model, version of the model has been considered for the modelling of
retweet cascades (Zhao et al. 2015). Alternatively, the related family of Bellman-
Harris branching processes could be used, where a node i remains infected for a
duration t, determined by a distribution ρ(t), before infecting its neighbours, leading
to
$$\lambda(s) = \frac{1-\rho(s)}{s} + \rho(s)\,\lambda(s)\,A. \qquad (12.37)$$
s
In each case, the active process allows for the formation of a growing cascade of
infections, and includes a non-trivial causal operator.
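As an illustration of how such an active process generates a temporal network, the following sketch (a minimal example added here, not taken from the chapter; the random graph, the baseline rate, the exponential kernel and the Euler-type discretisation are all simplifying assumptions) simulates a multivariate Hawkes process on a small network: each node fires with probability λ_i(t)Δt per step, and every event excites the rates of the firing node's neighbours.

```python
import numpy as np
import networkx as nx

rng = np.random.default_rng(1)

# Assumed ingredients for this sketch: a small random graph, a baseline rate
# lam0, and an exponential memory kernel chi(t) = kappa * exp(-t / t_mem).
G = nx.erdos_renyi_graph(50, 0.1, seed=1)
A = nx.to_numpy_array(G)
N = A.shape[0]
lam0, kappa, t_mem = 0.05, 0.1, 1.0   # stability roughly requires kappa * t_mem * lambda_1(A) < 1
dt, T = 0.01, 200.0

excitation = np.zeros(N)              # accumulated kernel contributions on each node
events = []                           # the generated temporal network: (time, node) activations

for step in range(int(T / dt)):
    t = step * dt
    rate = lam0 + excitation          # lambda_i(t) = lambda_0 + excitation from past neighbour events
    fired = rng.random(N) < rate * dt # Bernoulli approximation of the point process in one step
    events.extend((t, i) for i in np.flatnonzero(fired))
    excitation += kappa * (A @ fired.astype(float))  # each event excites the firing node's neighbours
    excitation *= np.exp(-dt / t_mem)                # all excitations decay with time scale t_mem
print(f"{len(events)} node activations generated on {N} nodes")
```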

Acknowledgements I would like to thank my many collaborators without whom none of this work
would have been done and, in particular, Naoki Masuda for co-writing (Masuda and Lambiotte 1996)
that was a great inspiration for this chapter.

References

A.O. Allen, Probability, Statistics, and Queueing Theory: With Computer Science Applications,
2nd edn. (Academic, Boston, 1990)
R. Balescu, Statistical Dynamics: Matter Out of Equilibrium (Imperial College London, 1997)
A.-L. Barabási, The origin of bursts and heavy tails in human dynamics. Nature 435(7039), 207
(2005)
V.D. Blondel, J.M. Hendrickx, A. Olshevsky, J.N. Tsitsiklis, Convergence in multiagent coordi-
nation, consensus, and flocking, in Proceedings of the 44th IEEE Conference on Decision and
Control (IEEE, 2005), pp. 2996–3000
S. Brin, L. Page, Anatomy of a large-scale hypertextual web search engine, in Proceedings of the
Seventh International World Wide Web Conference (1998), pp. 107–117
D. Brockmann, L. Hufnagel, T. Geisel, The scaling laws of human travel. Nature 439(7075), 462
(2006)
F.R.K. Chung, F.C. Graham, Spectral Graph Theory. Number 92 (American Mathematical Society,
1997)
S. De Nigris, A. Hastir, R. Lambiotte, Burstiness and fractional diffusion on complex networks.
Eur. Phys. J. B 89(5), 114 (2016)
J.C. Delvenne, S.N. Yaliraki, M. Barahona, Stability of graph communities across time scales. Proc.
Natl. Acad. Sci. U.S.A. 107, 12755–12760 (2010)
F. Fouss, M. Saerens, M. Shimbo, Algorithms and Models for Network Data and Link Analysis
(Cambridge University Press, 2016)
M. Gueuning, R. Lambiotte, J.-C. Delvenne, Backtracking and mixing rate of diffusion on uncor-
related temporal networks. Entropy 19(10), 542 (2017)
A.G. Hawkes, Point spectra of some mutually exciting point processes. J. R. Stat. Soc. B 33, 438–443
(1971)

T. Hoffmann, M.A. Porter, R. Lambiotte, Generalized master equations for non-Poisson dynamics
on networks. Phys. Rev. E 86, 046102 (2012)
P. Holme, Modern temporal network theory: a colloquium. Eur. Phys. J. B 88(9), 1–30 (2015)
P. Holme, J. Saramäki, Temporal networks. Phys. Rep. 519(3), 97–125 (2012)
I. Ispolatov, P.L. Krapivsky, A. Yuryev, Duplication-divergence model of protein interaction net-
work. Phys. Rev. E 71(6), 061911 (2005)
B. Karrer, M.E.J. Newman, Stochastic blockmodels and community structure in networks. Phys.
Rev. E 83(1), 016107 (2011)
M. Karsai, M. Kivelä, R.K. Pan, K. Kaski, J. Kertész, A.-L. Barabási, J. Saramäki, Small but slow
world: how network topology and burstiness slow down spreading. Phys. Rev. E 83(2), 025102
(2011)
J. Klafter, I.M. Sokolov, First Steps in Random Walks: From Tools to Applications (Oxford University
Press, New York, 2011)
R. Kobayashi, R. Lambiotte, Tideh: time-dependent hawkes process for predicting retweet dynam-
ics, in Tenth International AAAI Conference on Web and Social Media (2016)
R. Lambiotte, M. Rosvall, I. Scholtes, From networks to optimal higher-order models of complex
systems. Nat. Phys. 1 (2019)
R. Lambiotte, J.C. Delvenne, M. Barahona, Random walks, Markov processes and the multiscale
modular organization of complex networks. IEEE Trans. Netw. Sci. Eng. 1, 76–90 (2014)
R. Lambiotte, P.L. Krapivsky, U. Bhat, S. Redner, Structural transitions in densifying networks.
Phys. Rev. Lett. 117(21), 218301 (2016)
L. Lovász et al., Random walks on graphs: a survey, in Combinatorics, Paul erdos is Eighty, vol. 2,
no. 1 (1993), pp. 1–46
R.D. Malmgren, D.B. Stouffer, A.E. Motter, L.A.N. Amaral, A Poissonian explanation for heavy
tails in e-mail communication. Proc. Natl. Acad. Sci. 105(47), 18153–18158 (2008)
N. Masuda, T. Takaguchi, N. Sato, K. Yano, Self-exciting point process modeling of conversation
event sequences, in Temporal Networks (Springer, 2013), pp. 245–264
N. Masuda, R. Lambiotte, A guide to temporal networks (World Scientific, London, 1996)
N. Masuda, M.A. Porter, R. Lambiotte, Random walks and diffusion on networks. Phys. Rep. 716,
1–58 (2017)
A. Moinet, M. Starnini, R. Pastor-Satorras, Random walks in non-Poissonian activity driven tem-
poral networks (2019). arXiv:1904.10749
A. Moinet, M. Starnini, R. Pastor-Satorras, Burstiness and aging in social temporal networks. Phys.
Rev. Lett. 114(10), 108701 (2015)
M. Newman, Networks: An Introduction (Oxford University Press, 2010)
N. Perraudin, P. Vandergheynst, Stationary signal processing on graphs. IEEE Trans. Signal Process.
65(13), 3462–3477 (2017)
J. Petit, M. Gueuning, T. Carletti, B. Lauwens, R. Lambiotte, Random walk on temporal networks
with lasting edges. Phys. Rev. E 98(5), 052307 (2018)
M. Rosvall, C.T. Bergstrom, Maps of random walks on complex networks reveal community struc-
ture. Proc. Natl. Acad. Sci. U.S.A. 105, 1118–1123 (2008)
J. Saramäki, P. Holme, Exploring temporal networks with greedy walks. Eur. Phys. J. B 88(12), 334
(2015)
I. Scholtes, N. Wider, R. Pfitzner, A. Garas, C.J. Tessone, F. Schweitzer, Causality-driven slow-
down and speed-up of diffusion in non-Markovian temporal networks. Nat. Commun. 5, 5024
(2014)
L. Speidel, R. Lambiotte, K. Aihara, N. Masuda, Steady state and mean recurrence time for random
walks on stochastic temporal networks. Phys. Rev. E 91, 012806 (2015)
Q. Zhao, M.A. Erdogdu, H.Y. He, A. Rajaraman, J. Leskovec, Seismic: a self-exciting point process
model for predicting tweet popularity, in Proceedings of the 21th ACM SIGKDD International
Conference on Knowledge Discovery and Data Mining (ACM, 2015), pp. 1513–1522
Chapter 13
Spreading of Infection on Temporal
Networks: An Edge-Centered,
Contact-Based Perspective

Andreas Koher, James P. Gleeson, and Philipp Hövel

Abstract We discuss a continuous-time extension of the contact-based (CB) model,


as proposed in [Koher et al., Phys. Rev. X 9, 031017 (2019)], for infections with
permanent immunity on temporal networks. At the core of our methodology is a
fundamental change to an edge-centered perspective, which allows for an accurate
model on temporal networks, where the underlying time-aggregated graph has a tree
structure. From the continuous-time CB model, we derive the infection propagator
for the low prevalence limit and propose a novel spectral criterion to estimate the
epidemic threshold. In addition, we explore the relation between the continuous-time
CB model and the previously proposed edge-based compartmental model, as well as
the message-passing framework.

Keywords Epidemic spreading · Temporal networks · Epidemic threshold ·


Infection propagator · Spectral radius · Non-backtracking matrix

13.1 Introduction

The foundation of modern theoretical epidemiology was established at the beginning


of the 20th century, mainly by health physicians such as Ross, Hamer, McKendrick,
and Kermack who introduced the compartment model (Hamer 1906; Ross 1910;
Kermack and McKendrick 1927). This approach separates individuals within a pop-
ulation into epidemic categories or compartments, depending on their health status
such as susceptible, infected, and recovered. Since the early years, development in
the field of mathematical epidemiology has accelerated, not least due to the seminal

A. Koher · J. P. Gleeson
MACSI, Department of Mathematics and Statistics, University of Limerick, Limerick, Ireland
e-mail: James.Gleeson@ul.ie
P. Hövel (B)
Theoretical Physics and Center for Biophysics, Universität des Saarlandes, Campus E2 6, 66123
Saarbrücken, Germany
e-mail: philipp.hoevel@uni-saarland.de


works of Bailey (1957), Anderson and May (1992) and Hethcote (2000). Modern
models include stochasticity (van Kampen 1981; Bailey 1975; Simon et al. 2011; Van
Mieghem et al. 2009), non-Markovian dynamics (Kiss et al. 2015b; Sherborne et al.
2018; Karrer and Newman 2010; Gonçalves et al. 2011; Van Mieghem and van de
Bovenkamp 2013), demographic structures, vaccinations, disease vectors and quar-
antine (see references in Hethcote 2000 for details). Thus, the field of research ranges
from simple explanatory models that reveal hidden patterns and reproduce funda-
mental observations to elaborate numerical models that provide realistic predictions
(Keeling and Rohani 2008).
In recent years, we witnessed a second golden age (Pastor-Satorras et al. 2015) of
epidemiological modeling. The driving forces behind this development are increasing
computing power and an unprecedented amount of mobility data. The combination
of both allows scientists to simulate the behavior of entire populations at the level
of single individuals (Balcan et al. 2009; Eubank et al. 2004; Ferguson et al. 2005;
Halloran et al. 2008; Chao et al. 2010; Longini et al. 2005; Merler and Ajelli 2011)
and thus to advise policy makers by means of quantitative models.
One of the cornerstones of network-based disease models is the individual-based
(IB) approach. It is a drastic simplification of the exact description using a master
equation, because it assumes that the epidemiological states of neighboring nodes are
statistically independent. Under this approximation, one can define a set of dynamic
equations for the marginal probability to find a node in a given disease state (Van
Mieghem et al. 2009; Wang et al. 2003; Valdano et al. 2015; Rocha and Masuda
2016; Chakrabarti et al. 2008; Ganesh et al. 2005; Gómez et al. 2010; Youssef and
Scoglio 2011). This method is widely employed, because it offers an intuitive and
analytically tractable approach to integrate the underlying contact network. As a
particularly important result, we mention that the largest eigenvalue of the adjacency
matrix, which represents the topology of the network, determines the critical disease
parameters that separate local and global outbreaks (Chakrabarti et al. 2008; Valdano
et al. 2015).
Karrer and Newman substantially improved previous models of uni-directional
diseases, such as the generic susceptible-infected-recovered (SIR) model, using the
message-passing framework (Karrer and Newman 2010). This approach dates back
to the computer scientist Pearl (1982), who formulated an exact inference algorithm
on trees. Karrer and Newman proposed an integro-differential equation as a model
for non-Markovian disease dynamics and improved previous estimates of the criti-
cal disease parameters on static networks (Karrer et al. 2014). A crucial conceptual
difference to earlier works is that edges instead of nodes appear as central elements
of the model. This idea has influenced considerably further research on network epi-
demiology (Sherborne et al. 2018; Miller et al. 2012; Lokhov et al. 2014; Wilkinson
et al. 2017; Koher et al. 2019b).
The dynamic message-passing model (Lokhov et al. 2014) is a particularly
application-oriented variant for Markovian SIR dynamics and has been extended
recently to networks with time-varying topologies (Frasca and Sharkey 2016; Koher
et al. 2019b; Humphries et al. 2021). This novel approach for epidemics on temporal
networks, termed the contact-based (CB) model, focuses on edge-based quantities

that are updated in discrete time and thus allows for a seamless integration of tem-
poral networks that are sampled at a constant rate. Importantly, the authors in Koher
et al. (2019b) derive a critical condition that improves previous estimates of the epi-
demic threshold (Valdano et al. 2015), which is a valuable risk measure for public
health institutions.
Another important research branch in theoretical epidemiology focuses not on
a single realization of a graph but on an ensemble of random networks. A partic-
ularly accurate and compact model of epidemic spreading on this class of random
networks has been proposed in Miller et al. (2012) and is termed edge-based com-
partmental (EBC) model. The original work focused not only on the configura-
tion model for static networks, but also on different classes of random graphs with
time-varying topologies. Since then, several extensions have been proposed, such
as non-Markovian recovery dynamics (Sherborne et al. 2018) and arbitrary initial
conditions (Miller 2014). Moreover, studies in Miller and Volz (2013); Miller and
Kiss (2014) investigated links to other existing models such as pair-approximations
(Eames and Keeling 2002), the effective degree model (Lindquist et al. 2011) and
message-passing (Karrer and Newman 2010).
This chapter is a revised version of our contribution to the 1st edition (Koher et al.
2019a). While the topic and structure remain the same, the chapter has been carefully
revised to reflect more recent developments such as the temporal pair-based framework
(Humphries et al. 2021), which is closely related. Note that the temporal pair-based
model will be discussed in a separate chapter of this 2nd edition.
As an overview of this chapter, we will derive a continuous-time formulation of
the CB model for temporal networks, analyze the low-prevalence limit and explore
links to previously proposed models. To this end, we will briefly summarize in
Sect. 13.2 the discrete-time version proposed in Koher et al. (2019b). Then, we extend
the dynamic equations to the continuous-time case in Sect. 13.3 and determine in
Sect. 13.4 the epidemic threshold from a stability analysis of the disease-free fixed
point. Moreover, we will link the continuous-time results to existing models and in
particular to the edge-based compartmental model in Sect. 13.5 and the message-
passing framework in Sect. 13.6.

13.2 Discrete-Time Description

The dynamic equations of the contact-based model appeared first in Lokhov et al.
(2014) and Koher et al. (2019b) for static and temporal networks, respectively. In the
following, we will briefly re-derive the discrete-time model, which will then serve as
the starting point for a continuous-time formulation in the main part of this chapter.
We begin by introducing our notational convention and consider a network G(t),
whose topology can change at any time t ∈ [0, T ]. Next we sample Ts snapshots of
the graph with a constant interval Δt. The resulting sequence [G 0 , G 1 , ..., G Ts −1 ]
is an approximation of the continuous-time network, which approaches the exact
representation in the limit Δt → 0.

Let us denote with N and C ⊂ T × N × N the set of all nodes (|N | = N )


and time-resolved contacts, respectively, where T = {0, 1, . . . , Ts − 1} represents
the set of sampling times. To emphasize the difference between temporal and static
elements, we will refer to edges as static links (k, l) ∈ E ⊂ N × N of the time-
aggregated graph and denote the number of edges with E = |ℰ|. In other words, an
edge exists if and only if at least one contact was recorded between the corresponding
nodes. We assume a directed network and hence represent a potential undirected
contact through two reciprocal elements. Finally, it is helpful to define an indicator
function that returns whether or not a contact from k to l exists at sampling time t:

$$a_{k\to l}(t) = \begin{cases} 1, & \text{if } (t, k, l) \in \mathcal{C} \\ 0, & \text{otherwise.} \end{cases} \qquad (13.1)$$

Here we use the notation k → l to denote quantities defined on the set of edges E ,
thus preventing potential confusion with node-based elements.
As a model for disease spreading, we consider the generic susceptible-infected-
recovered (SIR) dynamic as a paradigmatic model for infections that lead to per-
manent immunity. In this modeling framework, a susceptible node (S) contracts
the disease from an infected neighbor (I) with a constant and uniform rate β. The
transition to the recovered state (R) follows with a likewise universal rate μ.
After this formal introduction we can now start with the actual modeling and to
this end, we begin with the marginal probability PlS (t) that node l is susceptible at
time t. We observe that l is susceptible if it has been so already at the beginning t = 0
(with a corresponding probability of zl ) and hence did not contract the infection from
any of its neighbors up to the observation time t. We denote the probability of the
latter event with Φl (t), which leads to the following relation:

PlS (t) = zl Φl (t). (13.2)

In order to determine the central quantity Φl (t), we make the simplifying assump-
tion that the undirected, time-aggregated graph has a tree structure. In other words,
ignoring the directionality, the static backbone does not contain loops and hence all
branches emanating from the root node l can be considered independently of each
other.
In order to factorize the probability Φl (t), we also need to introduce a concept
that is sometimes referred to as test node (Miller et al. 2012), cut-vertex (Kiss et al.
2015a), or cavity state (Karrer and Newman 2010; Lokhov et al. 2014). To understand
why this concept is helpful, imagine the case that a disease appears in one branch
and hence may spread into another branch via the root node l. As a consequence, the
probability that l will be infected by either of the two infected neighbors is clearly
correlated and therefore cannot factorize. However, this case requires l to be already
infected and hence appears as an artifact. In order to exclude this event we remove
(virtually) all edges emanating from l, which prevents a disease transmission from
one branch to another. We refer to vertex l as being in the cavity state or simply a

cavity node. This intervention does not change the dynamics of l, as the node can still
be infected and once it is, it recovers regardless of the network structure. Furthermore,
we call this modification virtual because we restore the topological modification as
soon as we focus on the dynamics of another node. This method ensures that Φl (t)
factorizes and thus we arrive at

$$P_l^S(t) = z_l \prod_{k\in\mathcal{N}_l} \theta_{k\to l}(t). \qquad (13.3)$$

Here, the product iterates through all neighbors k ∈ Nl of node l and with θk→l (t)
we denote the probability that cavity node l has not contracted the disease from k up
to the observation time.
The conceptual change from a node-based to an edge-based modeling approach
requires new auxiliary dynamic variables. Besides θk→l (t), we will introduce addi-
tional quantities that are defined on the set of edges E and following our convention,
we use the index notation source → target. In order to avoid repetition, we also note
that in all cases the target node is considered to be in the cavity state.
To set up a dynamic equation for θk→l (t), we observe that the value can only
decrease precisely when (i) a contact indicated by ak→l (t) exists and (ii) the source
node k is infected and has not yet transmitted the disease to l. We denote the corre-
sponding probability for event (ii) with Ik→l (t). Together with the probability βΔt
to contract the disease within the time step Δt, we obtain our first, discrete-time
dynamic equation:

θk→l (t + Δt) = θk→l (t) − βΔtak→l (t)Ik→l (t). (13.4)

As the initial condition we choose θk→l (0) = 1 for all edges (k, l) ∈ E .
Next, we investigate Ik→l (t) and observe that the value can change due to three
independent events: (i) Node k recovers, with probability μΔt; (ii) Node l contracts
the disease from k upon a contact, with probability βΔt, whereby both events, i.e.
(i) and (ii), can also occur simultaneously with probability βμ(Δt)2 ; (iii) Source
node k is newly infected by one of its incident neighbors, excluding the cavity node l
with probability −ΔSk→l (t), where ΔSk→l (t) = Sk→l (t + Δt) − Sk→l (t). Here, Sk→l (t) denotes the
probability to find k in the susceptible state. Balancing all probabilities, we obtain
the following dynamic equation:

Ik→l (t + Δt) = (1 − μΔt)[1 − βΔtak→l (t)]Ik→l (t) − ΔSk→l (t). (13.5)

The initial condition is given by Ik→l (0) = 1 − z k for all edges.


We determine the probability Sk→l (t) in the same manner as Eq. (13.2), i.e., we
find that k is susceptible if (i) it has been initially with probability z k and (ii) with
probability Φk→l no pathogens were transmitted from one of its neighbors j ∈ Nk \
{l}, excluding the cavity node l. Hence, we find Sk→l (t) = z k Φk→l . Moreover, the
authors in Lokhov (2014) demonstrated that Φk→l factorizes under the assumption
of a tree topology and thus, similar to Eq. (13.3), we obtain:

$$S_{k\to l}(t) = z_k \prod_{j\in\mathcal{N}_k\setminus\{l\}} \theta_{j\to k}(t). \qquad (13.6)$$

We can now substitute Eq. (13.6) into Eq. (13.5) and together with Eq. (13.4) we
thus obtain a closed system of 2E dynamical equations that determine the disease
progression.
Finally, we return to node-centric quantities. To this end, we follow Karrer and
Newman (2010) and note first that Eq. (13.3) determines already the probability
PlS (t) that node l is susceptible at the time t. Then, we obtain the corresponding
probability P I (t) for the infected state from the conservation condition, i.e., a node
can assume only one of the three possible states X ∈ {S, I, R}:

PlI (t) = 1 − PlS (t) − PlR (t). (13.7)

The remaining marginal probability P R (t) can only increase due to a transition from
the infected to the recovered state, which is given by μΔt P I (t). Hence, the third
node-centric equation reads

PlR (t + Δt) = PlR (t) + μΔt PlI (t). (13.8)

After this brief review of the discrete-time case that has been first derived in Koher
et al. (2019b), we will elaborate on a continuous-time version next.
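For readers who prefer code to equations, the following sketch (an illustrative implementation written for this text, not the authors' reference code; the toy temporal tree, the parameters and the initial condition are assumptions) iterates the discrete-time updates (13.4), (13.5), (13.6) and the node-level relations (13.3), (13.7), (13.8) and prints the marginal infection probabilities over time.

```python
import numpy as np
from collections import defaultdict

# Toy temporal network on a tree (an assumption for this sketch): path 0 - 1 - 2 - 3.
# contacts[t] lists the directed contacts (k, l) active at sampling time t.
N = 4
edges = [(0, 1), (1, 0), (1, 2), (2, 1), (2, 3), (3, 2)]        # directed aggregated edges
contacts = {0: {(0, 1), (1, 0)}, 1: {(1, 2), (2, 1)}, 2: {(2, 3), (3, 2)},
            3: {(0, 1), (1, 0), (1, 2), (2, 1)}}

beta, mu, dt = 0.5, 0.1, 1.0          # infection rate, recovery rate, sampling interval
z = np.full(N, 1.0); z[0] = 0.0       # z_l = P(node l initially susceptible); node 0 starts infected

nbrs_in = defaultdict(list)           # in-neighbours: j -> k for every directed edge (j, k)
for j, k in edges:
    nbrs_in[k].append(j)

theta = {e: 1.0 for e in edges}                    # theta_{k->l}(0) = 1
I_edge = {(k, l): 1.0 - z[k] for (k, l) in edges}  # I_{k->l}(0) = 1 - z_k
P_R = np.zeros(N)

def S_edge(k, l):
    """S_{k->l}(t) = z_k * prod over in-neighbours j of k (excluding l) of theta_{j->k}, Eq. (13.6)."""
    out = z[k]
    for j in nbrs_in[k]:
        if j != l:
            out *= theta[(j, k)]
    return out

for t in sorted(contacts):
    # node marginals at time t, Eqs. (13.3), (13.7), then the update (13.8)
    P_S = np.array([z[l] * np.prod([theta[(k, l)] for k in nbrs_in[l]]) for l in range(N)])
    P_I = 1.0 - P_S - P_R
    print(f"t={t}: P^I = {np.round(P_I, 3)}")
    P_R = P_R + mu * dt * P_I
    a = contacts[t]
    S_old = {e: S_edge(*e) for e in edges}
    for (k, l) in edges:                           # Eq. (13.4)
        if (k, l) in a:
            theta[(k, l)] -= beta * dt * I_edge[(k, l)]
    S_new = {e: S_edge(*e) for e in edges}
    for (k, l) in edges:                           # Eq. (13.5)
        trans = beta * dt if (k, l) in a else 0.0
        I_edge[(k, l)] = (1 - mu * dt) * (1 - trans) * I_edge[(k, l)] - (S_new[(k, l)] - S_old[(k, l)])
```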

13.3 Continuous-Time Description

In the continuous-time limit Δt → 0, Eq. (13.4) leads to

$$\frac{d}{dt}\theta_{k\to l}(t) = -\beta\, a_{k\to l}(t)\, I_{k\to l}(t). \qquad (13.9)$$

We focus on Sk→l (t + Δt) from Eq. (13.6) and using the definition of θ j→k (t + Δt)
(cf. Eq. (13.4)), we obtain:

$$S_{k\to l}(t+\Delta t) = z_k \prod_{j\in\mathcal{N}_k\setminus\{l\}} \theta_{j\to k}(t+\Delta t) \qquad (13.10a)$$
$$= z_k \prod_{j\in\mathcal{N}_k\setminus\{l\}} \left[\theta_{j\to k}(t) - \beta\Delta t\, a_{j\to k}(t)\, I_{j\to k}(t)\right]. \qquad (13.10b)$$

For a sufficiently small sampling interval Δt, such that θ j→k (t) ≫ βΔt a j→k (t) I j→k (t), we can linearize Eq. (13.10b) and thus arrive at:
$$S_{k\to l}(t+\Delta t) = z_k \prod_{j\in\mathcal{N}_k\setminus\{l\}} \theta_{j\to k}(t) \cdot \left[1 - \sum_{j'\in\mathcal{N}_k\setminus\{l\}} \frac{\beta\Delta t\, a_{j'\to k}(t)\, I_{j'\to k}(t)}{\theta_{j'\to k}(t)}\right] \qquad (13.11a)$$
$$S_{k\to l}(t+\Delta t) = S_{k\to l}(t)\left[1 - \beta\Delta t \sum_{j'\in\mathcal{N}_k\setminus\{l\}} a_{j'\to k}(t)\, \frac{I_{j'\to k}(t)}{\theta_{j'\to k}(t)}\right] \qquad (13.11b)$$

In Eq. (13.11b) we inserted the definition of Sk→l (t) from Eq. (13.6) and this leads
directly to our second continuous-time dynamic equation

$$\frac{d}{dt} S_{k\to l}(t) = -\beta\, S_{k\to l}(t) \sum_{j\in\mathcal{N}_k\setminus\{l\}} a_{j\to k}(t)\, \frac{I_{j\to k}(t)}{\theta_{j\to k}(t)}. \qquad (13.12)$$

The quotient I j→k (t)/θ j→k (t) can be interpreted as the conditional probability that
j is infected given that cavity node k has not yet contracted the disease from j. It is
worth noting that Eq. (13.12) is well defined because we start from the initial condition
θ j→k (0) = 1 for all edges j → k and Eq. (13.9) asserts positivity of θ j→k (t) for all
finite observation times t. The remaining discrete-time Eq. (13.5) can be immediately
written down in terms of a difference quotient ΔX (t) = [X (t + Δt) − X (t)]/Δt:

$$\frac{\Delta I_{k\to l}(t)}{\Delta t} = \left[-\mu - \beta a_{k\to l}(t) + \mu\beta\Delta t\, a_{k\to l}(t)\right] I_{k\to l}(t) - \frac{\Delta S_{k\to l}(t)}{\Delta t}. \qquad (13.13)$$

In the continuous-time limit, the higher order term μβΔtak→l (t) vanishes, leading
to
$$\frac{d}{dt} I_{k\to l}(t) = \left[-\mu - \beta a_{k\to l}(t)\right] I_{k\to l}(t) - \frac{d}{dt} S_{k\to l}(t). \qquad (13.14)$$

At last it is also instructive to formulate the dynamic equation for Rk→l (t), i.e.,
the probability that node k has recovered at time t without transmitting the disease to
cavity node l. The value of Rk→l (t) can only increase over time and the corresponding
in-flux at time t is given by (i) the probability Ik→l (t) that node k is in state I and has
not infected its neighbor l together with (ii) the probability μΔt to recover within
the time step Δt. With this, we obtain:

Rk→l (t + Δt) = Rk→l (t) + μΔt Ik→l (t). (13.15)

The corresponding continuous-time equation thus reads

$$\frac{d}{dt} R_{k\to l}(t) = \mu\, I_{k\to l}(t). \qquad (13.16)$$
Unlike the discrete-time model, it is now obvious that the dynamic Eqs. (13.9),
(13.12), (13.14), and (13.16) satisfy the conservation condition

θk→l (t) = Sk→l (t) + Ik→l (t) + Rk→l (t) (13.17)

at every time t. Moreover, we can rescale time according to μt → t and rewrite


the continuous-time contact-based model in terms of the dimensionless epidemic
parameter γ = β/μ:

$$\frac{d}{dt}\theta_{k\to l}(t) = -\gamma\, a_{k\to l}(t)\, I_{k\to l}(t) \qquad (13.18a)$$
$$\frac{d}{dt} S_{k\to l}(t) = -\gamma\, S_{k\to l}(t) \sum_{j\in\mathcal{N}_k\setminus\{l\}} a_{j\to k}(t)\, \frac{I_{j\to k}(t)}{\theta_{j\to k}(t)} \qquad (13.18b)$$
$$\frac{d}{dt} I_{k\to l}(t) = -\left[1 + \gamma a_{k\to l}(t)\right] I_{k\to l}(t) - \frac{d}{dt} S_{k\to l}(t) \qquad (13.18c)$$
$$\frac{d}{dt} R_{k\to l}(t) = I_{k\to l}(t). \qquad (13.18d)$$
We can further reduce the set of dynamic equations using the conservation con-
dition in Eq. (13.17). To this end, we first substitute Sk→l (t) in Eq. (13.17) with the
definition from Eq. (13.6):

$$I_{k\to l}(t) = \theta_{k\to l}(t) - z_k \prod_{j\in\mathcal{N}_k\setminus\{l\}} \theta_{j\to k}(t) - R_{k\to l}(t). \qquad (13.19)$$

With this, we replace Ik→l (t) in Eqs. (13.18a) and (13.18d) and thus we obtain a
closed set of 2E dynamic equations that determine the disease progression of the
continuous-time CB model.
Returning to node-centric quantities, i.e., the probability that a given node l is
susceptible, infected or recovered, the continuous-time equivalent formulation to
Eqs. (13.3), (13.7), and (13.8) reads

$$P_l^S(t) = z_l \prod_{k\in\mathcal{N}_l} \theta_{k\to l}(t) \qquad (13.20a)$$
$$P_l^I(t) = 1 - P_l^S(t) - P_l^R(t) \qquad (13.20b)$$
$$\frac{d}{dt} P_l^R(t) = P_l^I(t). \qquad (13.20c)$$

13.4 Spectral Properties of the Continuous-Time Model

In this section, we evaluate the low prevalence limit of Eqs. (13.18) in order to
derive a spectral criterion that determines the epidemic threshold. To this end, we
assume θk→l (t) = 1 − δk→l (t), where δk→l (t) ≪ 1 as well as Ik→l ≪ 1. With this,
we linearize Eq. (13.18b) and obtain
$$\frac{d}{dt} S_{k\to l}(t) = -\gamma \left[\prod_{j\in\mathcal{N}_k\setminus\{l\}} \left(1 - \delta_{j\to k}(t)\right)\right] \cdot \sum_{j\in\mathcal{N}_k\setminus\{l\}} a_{j\to k}(t)\, \frac{I_{j\to k}(t)}{1 - \delta_{j\to k}(t)} \qquad (13.21a)$$
$$= -\gamma \sum_{j\in\mathcal{N}_k\setminus\{l\}} a_{j\to k}(t)\, I_{j\to k}(t). \qquad (13.21b)$$

In Eq. (13.21b), we keep only linear terms in δk→l (t) and I j→k (t). This allows us
to decouple the set of dynamic equations and express Eq. (13.18c) only in terms of
Ik→l (t):

$$\frac{d}{dt} I_{k\to l}(t) = \left[-1 - \gamma a_{k\to l}(t)\right] I_{k\to l}(t) + \gamma \sum_{j\in\mathcal{N}_k\setminus\{l\}} a_{j\to k}(t)\, I_{j\to k}(t). \qquad (13.22)$$

Next, we vectorize Eq. (13.22) and to this end, we define the vectors I(t) and
a(t) with elements Ik→l (t) and ak→l (t), respectively. In order to rewrite the sum
$\sum_{j\in\mathcal{N}_k\setminus\{l\}} a_{j\to k}(t)\, I_{j\to k}(t)$ from Eq. (13.22) in terms of a matrix that acts on the state vector
I(t), we introduce the time-dependent non-backtracking operator B(t) as in Koher
et al. (2019b):
$$B_{k\to l,\, j\to k'}(t) = \begin{cases} a_{j\to k'}(t), & \text{if } k' = k \text{ and } j \neq l \\ 0, & \text{otherwise.} \end{cases} \qquad (13.23)$$

Expressed in words, we find Bk→l, j→k′ (t) = 1 if the contact (t, j, k′) is incident
on the edge (k, l), implying k′ = k, and additionally j ≠ l. The latter constraint
prevents a probability flow back to the initially infected node and constitutes the non-
backtracking property. In all other cases, we find Bk→l, j→k′ (t) = 0. Unlike the static
definition in Krzakala et al. (2013) and Karrer et al. (2014), we have to differentiate
between the first and second index of the E × E dimensional matrix B: The first
index, i.e. (k, l) ∈ E , corresponds to an edge in the aggregated graph, thus reflecting
a potential path for future infections. The second index (t, j, k  ) ∈ C , however, is a
(temporal) contact from node j to k  at time t.
Moreover, we define the diagonal matrix diag(a(t)) with elements ak→l (t) for
all edges (k, l) ∈ E on the diagonal and, additionally, we denote with 1 the identity
matrix. Similar to the discrete-time derivation in Koher et al. (2019b), we thus obtain:

$$\frac{d}{dt} I(t) = \left[-\mathbb{1} - \gamma\, \mathrm{diag}(a(t)) + \gamma B(t)\right] I(t). \qquad (13.24)$$
The only structural difference to the discrete-time result in Koher et al. (2019b) is
that the higher order term βak→l μ does not appear, because the simultaneous event
of infection and recovery does not need to be accounted for in the continuous-time
formulation.

Within the open interval [tn , tn+1 ) where the boundaries tn and tn+1 , respectively,
mark subsequent change points of the network topology, we integrate Eq. (13.24)
and obtain

$$I(t_{n+1}) = M_n(\gamma)\, I(t_n) \qquad (13.25a)$$
$$M_n(\gamma) = \exp\left\{\int_{t_n}^{t_{n+1}} d\tau \left[-\mathbb{1} - \gamma\, \mathrm{diag}(a(\tau)) + \gamma B(\tau)\right]\right\}. \qquad (13.25b)$$

Using the initial condition I(0), we can formally state the explicit solution as follows:

$$I(T) = \prod_{n=0}^{N_G-1} M_n(\gamma)\, I(0). \qquad (13.26)$$

Here, NG is the total number of discrete changing points of the network topology.
Following Valdano et al. (2018) we can state the propagator $M(\gamma) = \prod_{n=0}^{N_G-1} M_n(\gamma)$
in a compact notation using Dyson’s time ordering operator $\mathcal{T}\, B(\tau_1) B(\tau_2) = B(\tau_1) B(\tau_2)\,\Theta(\tau_1 - \tau_2) + B(\tau_2) B(\tau_1)\,\Theta(\tau_2 - \tau_1)$, where Θ(x) denotes the Heaviside function:
$$M(\gamma) = \mathcal{T} \exp\left\{\int_0^t d\tau \left[-\mathbb{1} - \gamma\, \mathrm{diag}(a(\tau)) + \gamma B(\tau)\right]\right\}. \qquad (13.27)$$

Any small initial perturbation will decrease exponentially if the largest eigenvalue λ1 ,
i.e., the spectral radius satisfies λ1 [M(γ )] < 1. This result corresponds to Valdano
et al. (2018) where the epidemic propagator M(γ ) has been derived within the IB
framework and reads

$$M(\gamma) = \mathcal{T} \exp\left\{\int_0^t d\tau \left[-\mathbb{1} + \gamma A(\tau)\right]\right\}. \qquad (13.28)$$

In Eq. (13.28), we denote with A(τ ) the time-dependent adjacency matrix and here,
1 is the N × N dimensional identity matrix.
In many cases, the temporal network is sampled with equidistant time steps Δt
and in this case, we can simplify the propagator Eq. (13.27) to
$$M(\gamma) = \prod_{n=0}^{N_G-1} \exp\left\{\Delta t \left[-\mathbb{1} - \gamma\, \mathrm{diag}(a(t_n)) + \gamma B(t_n)\right]\right\}. \qquad (13.29)$$

The CB result in Eq. (13.29) is akin to the IB formulation that was first derived in
Speidel et al. (2016). In the quenched limit, when the disease evolves on a much faster
time scale than the temporal network, we can assume a static underlying topology
and thus identify diag(a(t)) ≡ 1 and B(t) ≡ B. Then, the linearized result
in Eq. (13.24) simplifies to

$$\frac{d}{dt} I(t) = \left[-(1+\gamma)\mathbb{1} + \gamma B\right] I(t). \qquad (13.30)$$
Finally, Eq. (13.30) is asymptotically stable if the largest eigenvalue λ1 of the
infection operator M(γ ) = −(1 + γ )1 + γ B is negative. Hence, we recover the
continuous-time threshold as previously derived within the more general message-
passing framework on static networks (Karrer and Newman 2010; Karrer et al. 2014):
$$\frac{\gamma}{\gamma+1} = \frac{1}{\lambda_1(B)}. \qquad (13.31)$$

For non-Markovian infection and recovery processes the generalized criticality
condition reads T = 1/λ1 (B) (see Karrer and Newman 2010; Karrer et al. 2014),
where the transmissibility T is given by
$$T = \int_0^\infty s(\tau) \int_\tau^\infty r(\tau')\, d\tau'\, d\tau. \qquad (13.32)$$

Intuitively, T can be interpreted as the probability that a newly infected node transmits
the disease to a given neighbor prior to recovery (Karrer and Newman 2010; Newman
2002). Within this general formulation s(τ )dτ is the probability that an infected node
passes the disease to a neighbor within a time interval [τ, τ + dτ ] after contracting
the infection. Similarly, we define the probability r (τ )dτ that a node recovers in the
interval [τ, τ + dτ ] after it has been infected. For a constant infection and recovery
rate, i.e., for the Markovian dynamics that we assumed in this article, we find s(τ ) =
β exp(−βτ ) and r(τ ) = μ exp(−μτ ). This particularly simple and widely studied
choice then leads to T = γ /(γ + 1) and thus to Eq. (13.31).
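For completeness, the following worked evaluation of Eq. (13.32) with the exponential densities above spells out this step:
$$T = \int_0^\infty \beta e^{-\beta\tau} \int_\tau^\infty \mu e^{-\mu\tau'}\, d\tau'\, d\tau = \int_0^\infty \beta e^{-(\beta+\mu)\tau}\, d\tau = \frac{\beta}{\beta+\mu} = \frac{\gamma}{\gamma+1}.$$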
For temporal networks, we cannot separate in general the transmissibility T from
the network topology in order to find a similarly elegant result like Eq. (13.31). The
reason is that the probability to infect a given neighbor depends on the timing of
contacts and as a consequence the transmissibility T would have to be both edge-
and time-dependent even in the Markovian case.

13.5 Relation to the Edge-Based Compartmental Model

An important branch in theoretical epidemiology focuses on random graphs, i.e.,


an ensemble of networks derived from a generating model, instead of a single real-
ization. In this context, the edge-based compartmental (EBC) model (Miller et al.
2012) is a particularly compact and accurate approach to model infections with per-
manent immunity. In this section, we will explore the relation between the CB model
presented in Sect. 13.3 and the EBC framework for static random graphs.
To this end, we will focus on random networks with unweighted and undirected
edges that are derived from the configuration model (Molloy and Reed 1995; Newman
252 A. Koher et al.

et al. 2001). This widely used generative model allows to study the effect of the degree
distribution on the spread of infections (Newman 2002). For this, we have to create an
ensemble of networks with the same degree distribution that are otherwise maximally
random. This can be done according to the Bender-Canfield algorithm (Molloy and
Reed 1995), which begins with a set of N vertices. To each node we assign a number
k of (undirected) stubs, i.e., edges with no target node, where k is drawn independently
from the given degree distribution p(k). In the next step, we connect two randomly
chosen stubs, which then form a proper edge between the corresponding nodes. The
step is repeated until no more stubs are available; if the total number of stubs is initially
found to be odd, we redraw the degree of one node repeatedly until the sum is even.
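A compact stub-matching sketch of this procedure (an illustration written for this text; the Poisson degree distribution is an arbitrary choice) could look as follows:

```python
import numpy as np
import networkx as nx

rng = np.random.default_rng(3)

def configuration_model(N, draw_degree):
    """Bender-Canfield stub matching: draw a degree per node, then pair stubs uniformly at random."""
    degrees = np.array([draw_degree() for _ in range(N)])
    while degrees.sum() % 2 == 1:                 # the total number of stubs must be even
        degrees[rng.integers(N)] = draw_degree()  # redraw one node's degree
    stubs = np.repeat(np.arange(N), degrees)      # one entry per stub, labelled by its node
    rng.shuffle(stubs)
    G = nx.MultiGraph()                           # stub matching may create self-loops / multi-edges
    G.add_nodes_from(range(N))
    G.add_edges_from(zip(stubs[0::2], stubs[1::2]))
    return G

# Example: Poisson degree distribution with mean 3 (an assumption for this sketch).
G = configuration_model(10_000, lambda: rng.poisson(3))
print("mean degree:", 2 * G.number_of_edges() / G.number_of_nodes())
```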
Before we proceed with the ensemble average, we restate for convenience the
relevant dynamic equations of the continuous-time model, i.e., Eqs. (13.18a) and
(13.18d), for a network with a static topology. In this case, ak→l ≡ 1 for all edges
(k, l) ∈ E and thus we obtain

$$\frac{d}{dt}\theta_{k\to l}(t) = -\gamma\, I_{k\to l}(t) \qquad (13.33a)$$
$$\frac{d}{dt} R_{k\to l}(t) = I_{k\to l}(t) \qquad (13.33b)$$
and close the set of equations with the conservation condition from Eq. (13.19):

$$I_{k\to l}(t) = \theta_{k\to l}(t) - z_k \prod_{j\in\mathcal{N}_k\setminus\{l\}} \theta_{j\to k}(t) - R_{k\to l}(t). \qquad (13.34)$$

For static networks, we can further simplify the dynamic equations by substituting
Ik→l (t) in Eq. (13.33a) with Eq. (13.33b) and integrating the result:

$$\frac{d}{dt} R_{k\to l}(t) = -\frac{1}{\gamma}\frac{d}{dt}\theta_{k\to l}(t) \qquad (13.35a)$$
$$R_{k\to l}(t) = \frac{1}{\gamma}\left(1 - \theta_{k\to l}(t)\right). \qquad (13.35b)$$

From Eqs. (13.33a), (13.34), and (13.35b), we obtain a coupled set of E dynamic
equations that determine the progression of an SIR epidemic on a static graph:

$$\frac{d}{dt}\theta_{k\to l}(t) = 1 - (1+\gamma)\,\theta_{k\to l}(t) + \gamma z_k \prod_{j\in\mathcal{N}_k\setminus\{l\}} \theta_{j\to k}(t). \qquad (13.36)$$

The result in Eq. (13.36) constitutes a message-passing equation as derived in Karrer


and Newman (2010). We will explore the connection to the more general message-
passing framework for epidemics with non-Markovian dynamics in Sect. 13.6. Here,
we continue instead with the ensemble average over random networks, thereby fol-
lowing closely the approach outlined in Karrer and Newman (2010). We start with the

following crucial observation: The state of a given edge k → l in a single realization


of a graph displays a characteristic trajectory in state space, i.e., a time-dependent
curve given by θk→l (t) from Eq. (13.36). As we perform an average over the ensem-
ble of graphs, our selected edge k → l will assume every position within a network.
As a consequence the averaged state trajectory is identical to the one that we would
obtain if we had started with a different edge initially and then performed the average.
In other words, it is sufficient to determine the ensemble averaged probabilities for
one representative edge:

$$\frac{d}{dt}\theta(t) \equiv \left\langle \frac{d}{dt}\theta_{k\to l}(t)\right\rangle \qquad (13.37a)$$
$$= 1 - (1+\gamma)\,\theta(t) + \gamma \left\langle z_k \prod_{j\in\mathcal{N}_k\setminus\{l\}} \theta_{j\to k}(t)\right\rangle. \qquad (13.37b)$$

Next, we focus on the second term in Eq. (13.37b). A crucial property of large net-
works that are generated by the configuration model is that they are locally tree-like
in the sense that the average length of the smallest cycle diverges with increasing net-
work size. Hence, we can assume in the limit N → ∞ that different branches emerg-
ing from k can be treated independently. The average over the product thus equals
the product over averages. Moreover, we remember that ensemble averaged dynamic
quantities are equal for all edges and in particular θ j→k (t) ≡ θ(t) for all
j ∈ Nk \ {l}. With this the product in Eq. (13.37b) simplifies to [θ(t)]^ke, where ke
is the average number of next nearest neighbors, or equally, the excess degree (Karrer
and Newman 2010). From a given degree distribution pn in the configuration model,
we can derive the excess degree distribution qn according to qn = (n + 1) pn+1 /⟨k⟩
(Newman et al. 2001), where ⟨k⟩ = ⟨n⟩ denotes the average degree. Finally, we make
use of the corresponding generating function G1(x) = Σn qn xⁿ and thus the second
term in Eq. (13.37b) simplifies to
 
 
N
zk θ j→k (t) = z qn [θ (t)]n (13.38a)
j∈N k \{l} n=0

= zG 1 (θ (t)). (13.38b)

Here, z = ⟨zk⟩ denotes the probability that a randomly chosen node is initially suscep-
tible. With Eqs. (13.37b) and (13.38b), we obtain the following ensemble averaged
dynamic equation for θ :

$$\frac{d}{dt}\theta(t) = 1 - (1+\gamma)\,\theta(t) + \gamma z\, G_1(\theta(t)). \qquad (13.39)$$
This compact result captures the disease dynamic with high accuracy as demonstrated
in Karrer and Newman (2010) within the message-passing framework and later in
Miller et al. (2012) as a special case of the edge-based compartmental model. The

authors in Miller et al. (2012) also investigated alternative random graph models with
time-varying topologies.
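To see Eq. (13.39) in action, the short sketch below (an illustration written for this text; the Poisson degree distribution, mean degree, value of γ and initial seeding are assumptions) integrates the ensemble-averaged equation and reports the final epidemic size, using the standard configuration-model relation that the susceptible fraction equals z G_0(θ) with G_0 the degree generating function (for Poisson degrees G_0 = G_1, an identity not derived in this chapter).

```python
import numpy as np
from scipy.integrate import solve_ivp

# Assumptions for this sketch: Poisson degree distribution with mean c, so that
# G_1(x) = G_0(x) = exp(c (x - 1)); gamma above threshold; a small initially
# infected fraction (z is the probability of being initially susceptible).
c, gamma, z = 3.0, 1.0, 0.999

def G1(x):
    return np.exp(c * (x - 1.0))

def rhs(t, theta):
    # Eq. (13.39): d theta / dt = 1 - (1 + gamma) theta + gamma z G_1(theta)
    return 1.0 - (1.0 + gamma) * theta + gamma * z * G1(theta)

sol = solve_ivp(rhs, (0.0, 50.0), [1.0], rtol=1e-8)
theta_inf = sol.y[0, -1]
print("theta(infinity)       ≈", round(float(theta_inf), 4))
print("final epidemic size   ≈", round(float(1.0 - z * G1(theta_inf)), 4))
```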
We close the section with a linear stability analysis of Eq. (13.39). Similar to the
derivation in Sect. 13.4, we start with a small initial perturbation: θ (t) = 1 − δ(t)
with δ(t) ≪ 1 and z = 1. We then expand the generating function G1(1 − δ(t)) to
the first order in δ(t):
$$G_1(1-\delta(t)) = \sum_n q_n\,(1-\delta(t))^n \qquad (13.40a)$$
$$= 1 - \langle n\rangle_q\, \delta(t) + O(\delta(t)^2). \qquad (13.40b)$$
Here, we used two properties of the generating function, namely G1(1) = Σn qn = 1
and G1′(1) = Σn n qn = ⟨n⟩q, where ⟨n⟩q = ke denotes the mean excess degree. With
this, the linearization of Eq. (13.39) around the disease-free stable fixed point reads:

$$\frac{d}{dt}\delta(t) = -\left(1 + \gamma - \gamma \langle n\rangle_q\right)\delta(t). \qquad (13.41)$$
From Eq. (13.41) we can easily see that a transition occurs from local to global out-
breaks if 1 + γ − γ⟨n⟩q < 0. Commonly, ⟨n⟩q is expressed in terms of the first and
second moment of the degree distribution, i.e. ⟨n⟩ = Σn n pn and ⟨n²⟩ = Σn n² pn,
respectively. For that we take the definition ⟨n⟩q = Σn n qn and substitute the relation
qn = (n + 1) pn+1 /⟨n⟩ (see Newman et al. 2001 for details). With this, we recover
the well-known criticality condition from Newman (2002) and Miller (2007):

$$\frac{\gamma}{\gamma+1} = \frac{\langle n\rangle}{\langle n^2\rangle - \langle n\rangle}. \qquad (13.42)$$

This result is related to the epidemic threshold in Eq. (13.31), where we studied
a single realization of a static graph and hence expressed the right hand side of
Eq. (13.42) through the spectral radius λ1 (B) of the non-backtracking matrix B.
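A quick numerical cross-check of this relation (an illustration added here; the Poisson degree distribution and the network size are assumptions) compares the moment-based right-hand side of Eq. (13.42) with 1/λ1(B) for the non-backtracking matrix of a single configuration-model realization:

```python
import numpy as np
import networkx as nx
from scipy.sparse import lil_matrix
from scipy.sparse.linalg import eigs

rng = np.random.default_rng(4)

# Assumed example: a configuration-model graph with Poisson degrees of mean 3.
N = 2000
deg = rng.poisson(3, size=N)
if deg.sum() % 2:                                  # stub counts must sum to an even number
    deg[0] += 1
G = nx.Graph(nx.configuration_model([int(d) for d in deg], seed=4))  # drop parallel edges
G.remove_edges_from(nx.selfloop_edges(G))

# Moment-based threshold, right-hand side of Eq. (13.42), from the realized degrees.
k = np.array([d for _, d in G.degree()])
moment_estimate = k.mean() / ((k**2).mean() - k.mean())

# Spectral radius of the non-backtracking matrix B of this single realization:
# B[(k,l), (j,k')] = 1 iff k' = k and j != l (the static analogue of Eq. (13.23)).
edges = [(u, v) for u, v in G.edges()] + [(v, u) for u, v in G.edges()]
idx = {e: i for i, e in enumerate(edges)}
B = lil_matrix((len(edges), len(edges)))
for (u, v), row in idx.items():                    # receiving edge u -> v
    for j in G.neighbors(u):                       # feeding edge j -> u
        if j != v:
            B[row, idx[(j, u)]] = 1.0
lam1 = abs(eigs(B.tocsr(), k=1, which="LM", return_eigenvectors=False)[0])

print("moment estimate  <n>/(<n^2>-<n>):", round(float(moment_estimate), 4))
print("spectral estimate      1/lam1(B):", round(float(1.0 / lam1), 4))
```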

13.6 Relation to the Message-Passing Framework

In the seminal work of Karrer and Newman (2010), the authors proposed a general
model for SIR spreading processes on sparse networks with non-Markovian infection
and recovery dynamics. The integro-differential formulation in Karrer and Newman
(2010) is a foundation of our CB model and therefore we will discuss in this section
the relation to their message-passing approach. For that we first propose a gener-
alization of the CB model to non-Markovian dynamics and then, taking the static
network limit, we will arrive at the previously proposed result.

As a first step, we transform the dynamic equations in Eqs. (13.9), (13.12), (13.14),
and (13.16) that define the continuous-time CB model to an integro-differential equa-
tion. To this end, we notice first that Eq. (13.14) is of the form

$$\frac{d}{dt} I_{k\to l}(t) = -\lambda_{k\to l}(t)\, I_{k\to l}(t) - \frac{d}{dt} S_{k\to l}(t). \qquad (13.43)$$

For notational convenience, we use the short-hand notation λk→l (τ ) = μ + βak→l (τ )
and $\Lambda_{k\to l}(t, t_k) = \exp\!\left[-\int_{t_k}^{t} \lambda_{k\to l}(\tau)\, d\tau\right]$. The former denotes the probability that
node k recovers or infects the cavity node l within the time interval [τ, τ + dτ) after
contracting the disease and the latter corresponds to the probability that no such
event took place between the time of infection and the observation time, tk and t,
respectively. Here, we denote the absolute and relative time after infection with t and
τ, respectively. Together with the initial condition Ik→l (0) = 1 − zk the solution to
the differential equation is given by
$$I_{k\to l}(t) = (1 - z_k)\,\Lambda_{k\to l}(t, 0) + \int_0^t \left[-\frac{d}{dt_k} S_{k\to l}(t_k)\right] \Lambda_{k\to l}(t, t_k)\, dt_k. \qquad (13.44)$$
0 dtk

In words, Eq. (13.44) states that node k has contracted the disease but not infected
its neighbor l by absolute time t if (i) node k was infected initially but has neither
recovered nor passed the infection or (ii) it was susceptible initially, contracted the
disease at time tk and has then neither recovered nor infected its neighbor up to the
observation time t.
Next, we integrate Eq. (13.9) and using the initial condition θk→l (0) = 1 we get
$$1 - \theta_{k\to l}(t) = \int_0^t \beta a_{k\to l}(t')\, I_{k\to l}(t')\, dt' \qquad (13.45a)$$
$$= (1 - z_k) \int_0^t dt'\, f_{k\to l}(t'|0) + \int_0^t dt' \int_0^{t'} dt_k \left[-\frac{d}{dt_k} S_{k\to l}(t_k)\right] f_{k\to l}(t'|t_k). \qquad (13.45b)$$

In Eq. (13.45b) we used Ik→l (t) from Eq. (13.44) and we also introduced the transmis-
sion probability $f_{k\to l}(t'|t_k) = \beta a_{k\to l}(t')\, \Lambda_{k\to l}(t', t_k)$: Given that node k contracted
the infection at absolute time tk, fk→l (t′|tk) gives the probability that the same node
passes the disease to its neighbor l at absolute time t′. In the context of static networks
the quantity $\int_{t_k}^{\infty} f_{k\to l}(t'|t_k)\, dt'$ is frequently referred to as transmissibility and plays a
crucial role in linking epidemic spreading to a percolation process (Karrer and New-
man 2010; Newman 2002). Note that the transmissibility can be smaller than one
as node k might recover before passing on the infection and for temporal networks,
unlike the static case, the value is edge- and time-dependent as we discussed already
at the end of Sect. 13.4.

The message-passing framework in Karrer and Newman (2010) assumes a non-


Markovian infection and recovery process. Similarly, our result in Eq. (13.45b)
demonstrates how a general epidemic model on temporal networks can be formu-
lated by redefining f k→l (t  |tk ) as proposed in Karrer and Newman (2010) (see also
Eq. (13.32)).
In order to demonstrate the reduction to the message-passing formulation of Karrer
and Newman (2010), we reformulate Eq. (13.45b) for a static underlying topology.
With ak→l (t) ≡ 1 the transmission probability f k→l (t|tk ) → f (τ ) depends only on
the relative time τ = t − tk after infection and becomes an identical function for all
edges k → l. Using this simplification we obtain
$$1 - \theta_{k\to l}(t) = (1 - z_k) \int_0^t d\tau\, f(\tau) + \int_0^t d\tau \int_0^\tau d\tau_i\, f(\tau - \tau_i) \left[-\frac{d}{d\tau_i} S_{k\to l}(\tau_i)\right]. \qquad (13.46)$$

Integrating the second term in Eq. (13.46) by parts, and using the fact that the double
integral can be reordered as
$$\int_0^t d\tau \int_0^\tau d\tau_i = \int_0^t d\tau_i \int_{\tau_i}^t d\tau, \qquad (13.47)$$

we arrive at the message-passing formulation equivalent to that in Karrer and Newman (2010):
$$\theta_{k\to l}(t) = 1 - \int_0^t d\tau\, f(\tau)\left[1 - S_{k\to l}(t - \tau)\right]. \qquad (13.48)$$

With this we have linked the continuous-time CB model with a previously introduced
message-passing framework for general non-Markovian epidemic models in the case
of a static underlying graph.

13.7 Summary and Discussion

We have presented a continuous-time description of a contact-based model. The


discussed theoretical framework allows us to study the spreading of epidemics and
extends the dynamic message-passing approach to networks with a time-varying
topology. At the center of the contact-based model is a shift in perspective from
node- to edge-centric quantities. This allows to accurately model, e.g., susceptible-
infected-recovered outbreaks on time-varying trees, that is, temporal networks with a
loop-free underlying topology. We have shown that on arbitrary graphs, the proposed
contact-based (edge-centric) model incorporates potential structural and temporal
heterogeneities of the underlying contact network and improves analytic estimations

with respect to the individual-based (node-centric) approach at a low computational


and conceptual cost. Within this new framework, we have derived an analytical
expression for the epidemic threshold on temporal networks.
We have taken a decidedly theoretical and analytical approach to the proposed
framework. This will facilitate the application to both empirical data sets and generic
classes of networks. In a recent work, a related framework has been proposed by
Humphries et al. (2021). That temporal pair-based approach is similar to, yet different
from, the contact-based model of this chapter. In brief, it keeps the vertex-based
approach of the individual-based model and the dynamic equations are given in
terms of vertices, but it additionally includes equations for pairs of
nodes. That way, an edge-centric perspective is recovered. As discussed in a separate
chapter of this book, the model is in exact agreement with Markovian epidemic
processes on temporal networks, which contain no more than one non-backtracking
path between any two vertices, e.g., on tree-like networks. Note that Frasca et al.
report a similar framework to systematically close the equations at the level of node
pairs (Frasca and Sharkey 2016).

Acknowledgements AK and PH acknowledge the support by Deutsche Forschungsgemeinschaft


(DFG) in the framework of Collaborative Research Center 910 during its second funding period from
2015 to 2018. AK acknowledges further support by German Academic Exchange Service (DAAD)
via a short-term scholarship. JPG acknowledges the support by Science Foundation Ireland (grant
numbers 16/IA/4470 and 16/RC/3918). PH acknowledges additional support by DFG (project ID
434434223 - SFB 1461).

References

R.M. Anderson, R.M. May, Infectious Diseases of Humans: Dynamics and Control (Oxford Uni-
versity Press, Oxford and New York, 1992)
N.T.J. Bailey, The Mathematical Theory of Epidemics (Griffin, London, 1957)
N.T.J. Bailey, The Mathematical Theory of Infectious Diseases and Its Applications. Mathematics
in Medicine Series (Griffin, London, 1975)
D. Balcan, V. Colizza, B. Gonçalves, H. Hu, J.J. Ramasco, A. Vespignani, Proc. Natl. Acad. Sci.
106(51), 21484 (2009)
D. Chakrabarti, Y. Wang, C. Wang, J. Leskovec, C. Faloutsos, ACM Trans. Inf. Syst. Secur. 10(4),
1:1 (2008)
D.L. Chao, M.E. Halloran, V.J. Obenchain, I.M. Longini Jr., PLOS Comput. Biol. 6(1), 1 (2010)
K.T.D. Eames, M.J. Keeling, Proc. Natl. Acad. Sci. 99(20), 13330 (2002)
S. Eubank, H. Guclu, V.S.A. Kumar, M.V. Marathe, A. Srinivasan, Z. Toroczkai, N. Wang, Nature
429, 180 (2004)
N.M. Ferguson, D.A.T. Cummings, S. Cauchemez, C. Fraser, S. Riley, A. Meeyai, S. Iamsirithaworn,
D.S. Burke, Nature 437(7056), 209 (2005)
M. Frasca, K.J. Sharkey, J. Theor. Biol. 399, 13 (2016)
A. Ganesh, L. Massoulié, D. Towsley, in INFOCOM 2005. 24th Annual Joint Conference of the
IEEE Computer and Communications Societies. Proceedings IEEE, vol. 2 (IEEE, 2005), pp.
1455–1466
S. Gómez, A. Arenas, J. Borge-Holthoefer, S. Meloni, Y. Moreno, Europhys. Lett. 89(3), 38009
(2010)

S. Gonçalves, G. Abramson, M.F.C. Gomes, Eur. Phys. J. B 81(3), 363 (2011)


M.E. Halloran, N.M. Ferguson, S. Eubank, I.M. Longini, D.A.T. Cummings, B. Lewis, S. Xu,
C. Fraser, A. Vullikanti, T.C. Germann, D. Wagener, R. Beckman, K. Kadau, C. Barrett, C.A.
Macken, D.S. Burke, P. Cooley, Proc. Natl. Acad. Sci. 105(12), 4639 (2008)
W.H. Hamer, Lancet 1, 733 (1906)
H.W. Hethcote, SIAM Rev. 42(4), 599 (2000)
R. Humphries, K. Mulchrone, J. Tratalos, S.J. More, P. Hövel, Appl. Netw. Sci. 6(1), 23 (2021)
B. Karrer, M.E.J. Newman, Phys. Rev. E 82, 016101 (2010)
B. Karrer, M.E.J. Newman, L. Zdeborová, Phys. Rev. Lett. 113, 208702 (2014)
M.J. Keeling, P. Rohani, Modeling Infectious Diseases in Humans and Animals (Princeton Univer-
sity Press, Princeton, 2008)
W.O. Kermack, A.G. McKendrick, Proc. R. Soc. A 115(772), 700 (1927)
I.Z. Kiss, C.G. Morris, F. Sélley, P.L. Simon, R.R. Wilkinson, J. Math. Biol. 70(3), 437 (2015a)
I.Z. Kiss, G. Röst, Z. Vizi, Phys. Rev. Lett. 115(7), 078701 (2015b)
A. Koher, J.P. Gleeson, P. Hövel, in Temporal Network Theory (Springer, Berlin, 2019a), pp. 235–
252
A. Koher, H.H.K. Lentz, J.P. Gleeson, P. Hövel, Phys. Rev. X 9(3), 031017 (2019b)
F. Krzakala, C. Moore, E. Mossel, J. Neeman, A. Sly, L. Zdeborová, P. Zhang, Proc. Natl. Acad.
Sci. 110(52), 20935 (2013)
J. Lindquist, J. Ma, P. van den Driessche, F.H. Willeboordse, J. Math. Biol. 62(2), 143 (2011)
A.Y. Lokhov, Dynamic cavity method and problems on graphs. Theses, Université Paris Sud - Paris
XI (2014)
A.Y. Lokhov, M. Mézard, H. Ohta, L. Zdeborová, Phys. Rev. E 90(1), 012801 (2014)
I.M. Longini, A. Nizam, S. Xu, K. Ungchusak, W. Hanshaoworakul, D.A.T. Cummings, M.E.
Halloran, Science 309(5737), 1083 (2005)
S. Merler, M. Ajelli, Pugliese. PLOS Comput. Biol. 7(9), 1 (2011)
J.C. Miller, Phys. Rev. E 76, 010101 (2007)
J.C. Miller, PLoS ONE 9(7), 1 (2014)
J.C. Miller, I.Z. Kiss, Math. Model. Nat. Phenom. 9(2), 4 (2014)
J.C. Miller, A.C. Slim, E.M. Volz, J. Royal Soc. Interface 9(70), 890 (2012)
J.C. Miller, E.M. Volz, J. Math. Biol. 67(4), 869 (2013)
M. Molloy, B. Reed, Random Struct. Algorithms 6(2–3), 161 (1995)
M.E.J. Newman, Phys. Rev. E 66(1), 016128 (2002)
M.E.J. Newman, S.H. Strogatz, D.J. Watts, Phys. Rev. E 64, 026118 (2001)
R. Pastor-Satorras, C. Castellano, P. Van Mieghem, A. Vespignani, Rev. Mod. Phys. 87, 925 (2015)
J. Pearl, in Proceedings of the Second AAAI Conference on Artificial Intelligence (AAAI Press,
1982), AAAI’82, pp. 133–136
L.E.C. Rocha, N. Masuda, Sci. Rep. 6, 31456 (2016)
R. Ross, The Prevention of Malaria (E.P. Dutton, New York, 1910)
N. Sherborne, J.C. Miller, K.B. Blyuss, I.Z. Kiss, J. Math. Biol. 76(3), 755 (2018)
P.L. Simon, M. Taylor, I.Z. Kiss, J. Math. Biol. 62(4), 479 (2011)
L. Speidel, K. Klemm, V.M. Eguiluz, N. Masuda, New J. Phys. 18(7), 073013 (2016)
E. Valdano, L. Ferreri, C. Poletto, V. Colizza, Phys. Rev. X 5, 021005 (2015)
E. Valdano, M.R. Fiorentin, C. Poletto, V. Colizza, Phys. Rev. Lett. 120(6), 068302 (2018)
N.G. van Kampen, Stochastic Processes in Physics and Chemistry (North-Holland, Amsterdam,
1981)
P. Van Mieghem, J. Omic, R. Kooij, IEEE/ACM Trans. Netw. 17(1), 1 (2009)
P. Van Mieghem, R. van de Bovenkamp, Phys. Rev. Lett. 110, 108701 (2013)
Y. Wang, D. Chakrabarti, C. Wang, C. Faloutsos, in 22nd international symposium on reliable
distributed systems, 2003. Proceedings. (2003), pp. 25–34
R.R. Wilkinson, F.G. Ball, K.J. Sharkey, J. Math. Biol. 75(6), 1563 (2017)
M. Youssef, C. Scoglio, J. Theor. Biol. 283(1), 136 (2011)
Chapter 14
The Effect of Concurrency on Epidemic
Threshold in Time-Varying Networks

Tomokatsu Onaga, James P. Gleeson, and Naoki Masuda

Abstract Various epidemic spreading processes are considered to take place on


time-varying networks. One key factor that alters epidemic spreading on time-varying
networks is concurrency, the number of neighbours that a node has at a given time
point. In this chapter, we present a theoretical study of the effects of concurrency
on the susceptible-infected-susceptible epidemic processes on a class of temporal
network models. By theoretical analysis that explicitly takes into account stochastic
dying-out effects, we show that network dynamics increase the epidemic threshold
(i.e., suppress epidemics), compared to that for the time-averaged network when the
nodes’ concurrency is low, but also decrease the epidemic threshold (i.e., enhance
epidemics) when the concurrency is high.

Keywords Temporal networks · Theoretical epidemiology · Network epidemiology · Concurrency

T. Onaga
Interdisciplinary Graduate School of Engineering Sciences, Kyushu University,
Kasuga 816-8580, Japan
e-mail: onaga.tomokatsu.617@m.kyushu-u.ac.jp
The Frontier Research Institute for Interdisciplinary Sciences, Tohoku University,
Sendai 980-8578, Japan
J. P. Gleeson
MACSI, Department of Mathematics and Statistics, University of Limerick,
Limerick V94 T9PX, Ireland
e-mail: james.gleeson@ul.ie
N. Masuda (B)
Department of Engineering Mathematics, University of Bristol, Woodland Road,
Bristol BS8 1UB, United Kingdom
e-mail: naokimas@gmail.com
Department of Mathematics, State University of New York at Buffalo,
Buffalo, NY 14260-2900, USA


14.1 Introduction

How infectious diseases and information spread on contact networks may consid-
erably depend on temporal dynamics as well as static structural properties of the
underlying networks. While earlier research has focused on the understanding of
epidemic processes on static networks, an overarching aim of what one could call
“temporal network epidemiology” is to reveal qualitative and quantitative changes
that the time-dependent nature of the network introduces to epidemic processes on
networks (Bansal et al. 2010; Masuda and Holme 2013, 2017; Holme 2015).
Concurrency in epidemiology is a notion proposed in mid 1990s that concerns the
number of contacts that a node has simultaneously (Morris and Kretzschmar 1995,
1997; Kretzschmar and Morris 1996). In a static network, all edges exist concurrently
(i.e., at the same time) such that if a node has degree k, the k edges incident to the node
are simultaneously present over time. This is not necessarily the case for temporal
networks, in which edges appear and disappear. In a temporal network in which
each node never has more than one edge at any point of time, one can say that the
temporal network completely lacks concurrency. In general, although it depends on
how to measure the concurrency, temporal networks are considered to have lower
concurrency than the corresponding static networks. However, different temporal
networks can have different levels of concurrency even if they are reduced to the

Fig. 14.1 a A temporal network lacking concurrency. b A temporal network having a high con-
currency. c The aggregate network corresponding to the temporal networks shown in (a) and (b)

same aggregate static network when the time information about the edges is ignored.
Consider two temporal networks shown in Fig. 14.1a and b, both of which have
N = 5 nodes and 10 discrete time points. The temporal network shown in Fig. 14.1a
lacks concurrency because all edges appear in isolation. Each node has at most m = 1
edge at any given time. In contrast, the temporal network shown in Fig. 14.1b has
high concurrency because some time points contain a star network and the hub in the
star has m = 4 edges that exist simultaneously. In fact, these two temporal networks
have the same aggregated network, which is defined as the time-averaged network
(Fig. 14.1c). In Fig. 14.1c, the edges are shown by thin lines because they should
have smaller weights than the edges shown in Fig. 14.1a and b to ensure that each
edge is used for the same “weight × time”. Specifically, assume that each edge in
Fig. 14.1a and b is of unit weight and each edge in Fig. 14.1c is of weight 1/5. In
Fig. 14.1a and b, an edge of unit weight appears between each node pair in two out
of the 10 discrete time points. In Fig. 14.1c, an edge of weight 1/5 appears between
each node pair in all the 10 discrete time points. Then, in each of the three networks
shown in Fig. 14.1, each edge in the complete graph is used for a total of two “weight
× time” units across the ten discrete time points.
In the present chapter, we pose the following question: how do different levels
of concurrency in temporal networks (e.g., Fig. 14.1a versus b) impact epidemic
spreading in temporal networks? We address this question by theoretically analysing
a susceptible-infected-susceptible (SIS) model on a temporal network model based
on activation of cliques (rather than stars as in Fig. 14.1).

14.2 Model

We consider the following continuous-time SIS model on a variant of the activity-


driven model of temporal networks (see Perra et al. 2012 for the original activity-
driven model), which we call the clique-based activity-driven network. For the mod-
elling and analysis of the epidemic threshold in the case of the original activity-driven
network model, see Onaga et al. (2017).
We denote the number of nodes by N . We assign each node i (1 ≤ i ≤ N ) an
activity potential ai , drawn from a probability density F(a) (0 < a ≤ 1). Activity


Fig. 14.2 Schematic of a clique-based activity-driven network with m = 2



potential ai is the probability with which node i is activated in each time window
of constant period τ and is fixed over time. If activated, node i creates a clique with
m uniformly randomly selected other nodes in the network (Fig. 14.2), modelling a
group conversation event (Tantipathananandh et al. 2007; Stehlé et al. 2010; Zhao
et al. 2011). If two cliques overlap and share at least two nodes, we only create a single
edge between each pair of the nodes shared by the different cliques. However, for large
N and relatively small ai , such events seldom occur. At the end of each time window
of length τ , all cliques are discarded. Then, in the next time window, each node
is again activated with probability ai , independently of the activity in the previous
time window, and creates a clique with m uniformly randomly selected nodes. We
repeat this procedure. The present network model is an example of a switching
network (Liberzon 2003; Masuda et al. 2013; Hasler et al. 2013; Speidel et al. 2016).
A large τ implies that the dynamics of network structure are slow compared to
epidemic dynamics. In the limit of τ → 0, the network changes infinitesimally fast,
enabling the dynamical process on networks to be approximated by that on the time-
averaged static network (Hasler et al. 2013).
In the SIS model, each node takes either the susceptible or infected state. At any
time, each susceptible node contracts infection according to the Poisson process with
rate β per infected neighbouring node. Each infected node recovers to transit to the
susceptible state at rate μ irrespectively of the neighbours’ states. Changing τ to
cτ (c > 0) is equivalent to changing β and μ to β/c and μ/c, respectively, while
leaving τ unchanged. Hence, we set μ = 1 without loss of generality.
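As an illustration of the network model (not part of the original chapter), the following minimal Python sketch generates successive snapshots of a clique-based activity-driven network: in every time window of length τ, each node i activates with probability a_i and forms a clique of size m + 1 with m uniformly chosen other nodes, and all cliques are discarded at the end of the window. The function name and the power-law sampler are illustrative choices.

```python
import numpy as np

def clique_activity_driven_snapshots(activities, m, n_windows, seed=None):
    """One edge list per time window of the clique-based activity-driven model:
    every node i activates with probability a_i and forms a clique of size m + 1
    with m uniformly chosen other nodes; cliques are discarded at the window's end."""
    rng = np.random.default_rng(seed)
    N = len(activities)
    snapshots = []
    for _ in range(n_windows):
        edges = set()
        for i in np.flatnonzero(rng.random(N) < activities):
            others = rng.choice(np.delete(np.arange(N), i), size=m, replace=False)
            members = np.append(others, i)
            for u in range(m + 1):                 # connect all pairs within the clique;
                for v in range(u + 1, m + 1):      # overlapping cliques share single edges
                    e = tuple(sorted((int(members[u]), int(members[v]))))
                    edges.add(e)
        snapshots.append(sorted(edges))
    return snapshots

# Example: activities drawn from a power law F(a) ~ a^(-3) on [eps, 0.9]
rng = np.random.default_rng(1)
eps = 0.01
a = (eps ** -2 - (eps ** -2 - 0.9 ** -2) * rng.random(2000)) ** -0.5
snaps = clique_activity_driven_snapshots(a, m=2, n_windows=5, seed=2)
print([len(s) for s in snaps])    # number of edges present in each window
```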

14.3 Analysis

In this section, we calculate the epidemic threshold for the SIS model on the clique-
based activity-driven network as follows. First, we analyse SIS dynamics on a static
clique spanning a single time window of length τ by explicitly considering extinction
effects (Sect. 14.3.1). Second, we obtain a linear mapping that transforms the network
state at the beginning of the time window to that at the end of the time window,
which coincides with the beginning of the next time window (Sect. 14.3.2). Third,
we obtain the epidemic threshold as the root of an implicit function using a moment
closure method. For expository purposes, we confine ourselves to a simplified model
where the activity potential of all nodes is the same (Sect. 14.3.3). In Sect. 14.3.4, we
consider the general case in which the activity potential depends on the node.
For the sake of the theoretical analysis, we assume that cliques generated by
an activated node are disjoint from each other. Because a clique created by node
i overlaps with another clique with probability ≈ m Σ_{j≠i} a_j (m + 1)/N ∝ m^2⟨a⟩,
where ⟨a⟩ ≡ ∫ da F(a) a is the mean activity potential, we impose m^2⟨a⟩ ≪ 1 for
this assumption to be valid.

Fig. 14.3 Stochastic SIS dynamics on a clique of size 3. a Four possible states of the clique and the
transitions between them. b Time course of the probability of each state of the clique. The initial
condition is set to ρ_1^clique(0) = 1 and ρ_i^clique(0) = 0 for i ≠ 1. We set β = 2

14.3.1 SIS Dynamics on a Clique and Extinction Effects

In this section, we examine the SIS dynamics of a clique of size m + 1. We explicitly


calculate the effect of stochastic extinction in the following analysis. For demon-
stration purposes, we consider SIS dynamics on a clique of size 3 (Fig. 14.3a). We
denote the probability with which there are i infected nodes on a clique at time t by
ρ_i^clique(t). The master equation for ρ_i^clique(t) is given by

d/dt (ρ_0^clique, ρ_1^clique, ρ_2^clique, ρ_3^clique)^T =
\begin{pmatrix}
0 & 1 & 0 & 0 \\
0 & −2β − 1 & 2 & 0 \\
0 & 2β & −2β − 2 & 3 \\
0 & 0 & 2β & −3
\end{pmatrix}
(ρ_0^clique, ρ_1^clique, ρ_2^clique, ρ_3^clique)^T.   (14.1)

By solving Eq. (14.1), one can obtain {ρ_0^clique(t), ρ_1^clique(t), ρ_2^clique(t), ρ_3^clique(t)}
as a function of the state of the clique at the beginning of the time window, i.e.,
{ρ_0^clique(0), ρ_1^clique(0), ρ_2^clique(0), ρ_3^clique(0)}. With β = 2, the values of ρ_i^clique(t) for
a range of t are shown in Fig. 14.3b. Because the employed infection rate is relatively
large, infection initially spreads over the clique, causing the increase in ρ_2^clique and
ρ_3^clique. However, at larger t, ρ_0^clique grows dramatically. Because the present dynamics
are a Markov process with a unique, disease-free absorbing state, ρ_0^clique approaches
1 for any infection rate. Therefore, on the clique-based activity-driven network, an
infection would die out for any infection rate if the length of the time window, τ, is
large.
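For concreteness, the curves in Fig. 14.3b can be reproduced by propagating Eq. (14.1) with a matrix exponential. The short Python sketch below is our own illustration (not the authors' code) and uses β = 2, as in the figure.

```python
import numpy as np
from scipy.linalg import expm

beta = 2.0   # infection rate; the recovery rate has been set to mu = 1

# Transition-rate matrix of Eq. (14.1) for a clique of size 3
# (states: 0, 1, 2 or 3 infected nodes; columns sum to zero).
M = np.array([
    [0.0,            1.0,            0.0,   0.0],
    [0.0, -2 * beta - 1,             2.0,   0.0],
    [0.0,       2 * beta, -2 * beta - 2,    3.0],
    [0.0,            0.0,       2 * beta,  -3.0],
])

rho0 = np.array([0.0, 1.0, 0.0, 0.0])   # one infected node at t = 0
for t in (0.5, 2.0, 10.0):
    rho_t = expm(M * t) @ rho0
    print(t, np.round(rho_t, 3))         # rho_0 -> 1 as t grows: stochastic extinction
```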
The most common approach to SIS dynamics in statistical physics and mathemati-
cal biology is perhaps the mean-field theory (Pastor-Satorras et al. 2015). However, in
temporal networks, only a small number of contacts may be simultaneously present
such that stochastic extinction effects are not negligible. In this case, we need to
use the master equation or other approaches that explicitly deal with the stochastic
dynamics including extinction effects (Keeling et al. 2008; Simon et al. 2011; Hindes
and Schwartz 2016; Kiss et al. 2017; Van Mieghem et al. 2009). We note that the so-
called individual-based mean-field approximation is also a method aiming to track
the evolution of the probability that each node is infected (Pastor-Satorras et al. 2015;
Kiss et al. 2017). That method assumes that the states of different nodes are independent
of each other. For this reason, the approximation may fail to capture the probability
of the extinction state. Note that the extinction effect becomes stronger for smaller
m and that the mean-field theory and the individual-based mean-field approximation
are accurate in the limit m → ∞, where the extinction effects can be safely ignored.
Within the present modelling approach, the master-equation approach as opposed to
the mean-field approaches is better when m is roughly less than ten in practice. The
analysis presented in this section suggests that, for large values of τ , infection tends
to vanish even if the infection rate is large. Therefore, the epidemic threshold βc for
the clique-based activity-driven networks is expected to be large when τ is large.

14.3.2 Linear Mapping of the Network State Across a Time Window of Length τ

To calculate the epidemic threshold for the entire SIS model, we first formulate SIS
dynamics on a static clique with m + 1 nodes using a master equation. Let us denote
the state of the clique by {x, y, z} (x, y ∈ {S, I }, 0 ≤ z ≤ m − 1), where x and y are
the states of the activated node and another specific node, respectively, and z is the
number of infected nodes among the other m − 1 nodes. Although a general network
with m + 1 nodes has 2m+1 states, using this notation, we can describe SIS dynamics
on a clique by a continuous-time Markov process with 4m states (Simon et al. 2011).
We denote the 4m × 4m transition rate matrix of the Markov process by M. By
definition, the element M_{{x′,y′,z′},{x,y,z}} of M is equal to the rate of transition from

state {x, y, z} to state {x′, y′, z′}. The diagonal elements of M are given by

M_{{x,y,z},{x,y,z}} = − Σ_{{x′,y′,z′} ≠ {x,y,z}} M_{{x′,y′,z′},{x,y,z}} .   (14.2)

The transition rates owing to a recovery event are given by

M_{{S,y,z},{I,y,z}} = 1 ,   (14.3)
M_{{x,S,z},{x,I,z}} = 1 ,   (14.4)
M_{{x,y,z−1},{x,y,z}} = z   (z ≥ 1),   (14.5)

because the recovery rate μ has been set to 1. The transition rates owing to an infection
event are given by

M_{{I,S,z},{S,S,z}} = zβ,   (14.6)
M_{{S,I,z},{S,S,z}} = zβ,   (14.7)
M_{{I,I,z},{S,I,z}} = (z + 1)β,   (14.8)
M_{{I,I,z},{I,S,z}} = (z + 1)β,   (14.9)
M_{{S,S,z+1},{S,S,z}} = z(m − 1 − z)β   (z ≤ m − 2),   (14.10)
M_{{I,S,z+1},{I,S,z}} = (z + 1)(m − 1 − z)β   (z ≤ m − 2),   (14.11)
M_{{S,I,z+1},{S,I,z}} = (z + 1)(m − 1 − z)β   (z ≤ m − 2),   (14.12)
M_{{I,I,z+1},{I,I,z}} = (z + 2)(m − 1 − z)β   (z ≤ m − 2).   (14.13)

The remaining elements of M are equal to 0.


Let p_{{x,y,z}}(t) be the probability for a clique to be in state {x, y, z} at time t.
Because

ṗ(t) = M p(t),   (14.14)

where p(t) is the 4m-dimensional column vector whose elements are p_{{x,y,z}}(t), one
obtains

p(t) = exp(Mt) p(0).   (14.15)

Using Eq. (14.15), we obtain a linear mapping for the state of the entire temporal
network, i.e., mapping from the network state before a time window to the network
state after the time window, as follows.
Let c1 be the probability that the activated node in an isolated clique is infected at
time t + τ , when the activated node is the only infected node at time t and a new time
window of length τ is started exactly at time t. Note that c1 is the probability with
which x = I at time τ when the initial state is {I, S, 0}. Therefore, using Eq. (14.15),
one obtains
c_1(β, τ, m) = Σ_{y,z} [exp(Mτ)]_{{I,y,z},{I,S,0}} .   (14.16)

Let c2 be the probability that the activated node is infected at t + τ when another
single node but no other node is infected at t. One obtains

c_2(β, τ, m) = Σ_{y,z} [exp(Mτ)]_{{I,y,z},{S,I,0}} .   (14.17)
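For readers who want numbers, c_1 and c_2 can be evaluated directly: the Python sketch below (our own illustration; the function names are not from the chapter) assembles the 4m × 4m matrix M from Eqs. (14.2)–(14.13) and applies Eqs. (14.16)–(14.17) through the matrix exponential. The same helper is reused in the threshold sketch of Sect. 14.3.4.

```python
import numpy as np
from scipy.linalg import expm
from itertools import product

def build_M(beta, m):
    """4m x 4m transition-rate matrix of Sect. 14.3.2 for states {x, y, z}."""
    states = list(product("SI", "SI", range(m)))
    index = {s: i for i, s in enumerate(states)}
    M = np.zeros((4 * m, 4 * m))

    def add(src, dst, rate):
        if rate > 0:
            M[index[dst], index[src]] += rate

    for (x, y, z) in states:
        # recovery events, Eqs. (14.3)-(14.5); the recovery rate is 1
        if x == "I":
            add((x, y, z), ("S", y, z), 1.0)
        if y == "I":
            add((x, y, z), (x, "S", z), 1.0)
        if z >= 1:
            add((x, y, z), (x, y, z - 1), z)
        # infection events, Eqs. (14.6)-(14.13)
        n_inf = (x == "I") + (y == "I") + z          # infected nodes in the clique
        if x == "S":
            add((x, y, z), ("I", y, z), ((y == "I") + z) * beta)
        if y == "S":
            add((x, y, z), (x, "I", z), ((x == "I") + z) * beta)
        if z <= m - 2:
            add((x, y, z), (x, y, z + 1), n_inf * (m - 1 - z) * beta)
    M -= np.diag(M.sum(axis=0))                      # diagonal elements, Eq. (14.2)
    return M, index

def c1_c2(beta, tau, m):
    """Eqs. (14.16)-(14.17): probability that the activated node is infected at t + tau,
    starting from state {I,S,0} (c1) or {S,I,0} (c2)."""
    M, index = build_M(beta, m)
    P = expm(M * tau)
    rows = [i for (x, y, z), i in index.items() if x == "I"]
    c1 = P[rows, index[("I", "S", 0)]].sum()
    c2 = P[rows, index[("S", "I", 0)]].sum()
    return c1, c2

print(c1_c2(beta=2.0, tau=0.1, m=10))
```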

Let ρ(a, t) be the probability that a node with activity potential a is infected at
time t. The fraction of infected nodes in the entire network at time t is given by

⟨ρ(t)⟩ ≡ ∫ da F(a) ρ(a, t).   (14.18)

Denoting by ρ1 the probability that an activated node with activity potential a is


infected after the duration τ of the clique, one obtains

ρ_1(a, t + τ) = c_1 ρ(a, t) + c_2 m ⟨ρ(t)⟩,   (14.19)

where the first term on the right-hand side of Eq. (14.19) corresponds to the situation
in which the activated node with activity potential a is infected after duration τ , when
only that node is infected at time t. The second term corresponds to the situation in
which the activated node is infected after duration τ , when only another single node
in the clique is infected at time t. In deriving Eq. (14.19), we assumed that the value
of β is selected close to the epidemic threshold such that at most one node is infected
in the clique at time t (and hence ρ(a, t), ⟨ρ(t)⟩ ≪ 1).
Let ρ_2 be the probability that a node with activity potential a in a clique triggered
by activation of a different node with activity potential a′ is infected after time τ.
One obtains

ρ_2(a, a′, t + τ) = c_1 ρ(a, t) + c_2 ρ(a′, t) + c_2 (m − 1)⟨ρ(t)⟩,   (14.20)

where the first term on the right-hand side of Eq. (14.20) corresponds to the situation
in which the node with activity potential a is infected after duration τ , when that
node is the unique infected node in the clique at time t. The second term corresponds
to the situation in which the node with activity potential a is infected after duration
τ, when the activated node with activity potential a′ is the unique infected node in
the clique at time t. The third term corresponds to the situation in which the node
with activity potential a is infected after duration τ , when a different node is infected
at time t.
Finally, the probability that an isolated node with activity potential a is infected
after time τ is given by e−τ ρ(a, t).
By combining these contributions, one obtains

ρ(a, t + τ) = a ρ_1(a, t + τ) + ∫ da′ F(a′) m a′ ρ_2(a, a′, t + τ)
             + (1 − a − m⟨a⟩) e^{−τ} ρ(a, t)
           = [e^{−τ} + (a + m⟨a⟩)(c_1 − e^{−τ})] ρ(a, t)
             + [m a + m(m − 1)⟨a⟩] c_2 ⟨ρ(t)⟩ + m c_2 ⟨a ρ(a, t)⟩.   (14.21)

14.3.3 Epidemic Threshold When all Nodes Have the Same Activity Potential

Consider the special case in which all nodes have the same activity potential a. In
this case, Eq. (14.21) is reduced to the one-dimensional map given by

ρ(t + τ ) = T (β, τ, m)ρ(t), (14.22)

where

T(β, τ, m) = e^{−τ} + (m + 1)a [c_1(β, τ, m) − e^{−τ}] + m(m + 1)a c_2(β, τ, m)   (14.23)
and we have omitted the argument a from ρ(a, t). A positive prevalence ρ(t) (i.e.,
a positive fraction of infected nodes in the equilibrium state) occurs only if scalar
T exceeds 1. Therefore, the epidemic threshold βc is given by the solution of the
following implicit function:

f (βc , τ, m) ≡ T (βc , τ, m) − 1 = 0. (14.24)

Equation (14.24) in the limit τ → 0 is reduced to

β_c = 1/[m(m + 1)a] = 1/k̄,   (14.25)

where k̄ = m(m + 1)a is the degree of a node averaged over realisations of the net-
work structure generated by clique-based activity-driven networks. When the net-
work blinks infinitesimally fast (i.e., τ → 0), the epidemic dynamics are equivalent
to those occurring on the time-averaged network (Hasler et al. 2013). The time-
averaged network of the clique-based activity-driven network is the complete graph
with edge weight k̄/(N − 1). Therefore Eq. (14.25) is consistent with the result for
the well-mixed population.
For m = 1, Eq. (14.24) for any τ value is reduced to

2a e^{−(β_c + 1)τ/2} [cosh(κ_c τ/2) + ((1 + 3β_c)/κ_c) sinh(κ_c τ/2)] − e^{τ} + 1 − 2a = 0,   (14.26)

where κ_c = √(β_c^2 + 6β_c + 1).
We calculated the epidemic threshold by numerically solving Eq. (14.24) for m =
10 and Eq. (14.26) for m = 1. The epidemic threshold for a range of τ is indicated
by the solid lines in Fig. 14.4a and b. Note that we have kept the mean number of
edges in the network the same between m = 1 and m = 10. Therefore, βc at τ → 0
is the same between the two cases and is given by Eq. (14.25). The prevalence values
obtained by direct numerical simulations of the stochastic SIS model are also shown
in the same figures in different colours. The figures suggest that Eqs. (14.24) and
(14.26) describe results obtained by numerical simulations fairly well.
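As a numerical check of the m = 1 case, the following self-contained Python sketch (our own illustration) solves Eq. (14.26) for β_c by bracketing and root finding, using a common activity potential a = 0.05 so that k̄ = 2a = 0.1; the τ → 0 limit should then reproduce Eq. (14.25), β_c = 1/(2a) = 10, and β_c should grow rapidly as τ approaches τ∗ of Eq. (14.27).

```python
import numpy as np
from scipy.optimize import brentq

def f_m1(beta, tau, a):
    """Left-hand side of Eq. (14.26) (m = 1, identical activity potentials a)."""
    kappa = np.sqrt(beta ** 2 + 6 * beta + 1)
    bracket = np.cosh(kappa * tau / 2) + (1 + 3 * beta) / kappa * np.sinh(kappa * tau / 2)
    return 2 * a * np.exp(-(beta + 1) * tau / 2) * bracket - np.exp(tau) + 1 - 2 * a

a = 0.05
print("tau -> 0 limit:", 1 / (2 * a))          # Eq. (14.25)
for tau in (0.02, 0.05, 0.08):
    beta_c = brentq(lambda b: f_m1(b, tau, a), 1e-6, 1e4)
    print(tau, beta_c)                          # beta_c increases with tau
```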
For m = 1, the epidemic threshold increases with τ and diverges at τ ≈ 0.1
(Fig. 14.4a). The network dynamics (i.e., larger values of τ ) reduce the prevalence for
all values of β. In contrast, for m = 10, the epidemic threshold initially decreases,
then increases and finally diverges, as τ increases (Fig. 14.4b). Depending on the
level of concurrency (i.e., m = 1 versus m = 10), the network dynamics impact the
epidemic threshold in qualitatively different manners.
The phase diagram for the epidemic threshold when τ and m are varied is shown
in Fig. 14.5a. The colour indicates the βc values that we calculated by numerically
solving Eq. (14.24). Note that βc has the same value for all m at τ = 0. Depending
on the value of m, network dynamics (i.e., finite τ ) increase or decrease the epidemic
threshold as τ increases from zero.
Two boundaries partitioning the different phases are given as follows. First, we
derive the boundary between the “hindered” and “extinct” phases, which is shown
by the solid line in Fig. 14.5a. In Fig. 14.5a, the epidemic threshold diverges at τ =
τ∗ . In the limit βc → ∞, infection seeded by a single infected node immediately
infects the entire clique, leading to c1 → 1 and c2 → 1. By substituting c1 , c2 → 1
in Eq. (14.23), we obtain f (βc → ∞, τ∗ , m) = 0, where

τ∗ = ln[(1 − (1 + m)a)/(1 − (1 + m)^2 a)] ≈ k̄.   (14.27)

In Eq. (14.27), we used the approximation ln(1 − x) ≈ −x for x ≪ 1. For τ > τ∗


(i.e., “extinct” phase), infection always dies out even if the infection rate is infinite.
This is because, in a finite network (i.e., a finite clique in the present case), infection
always dies out after a sufficiently long time because of stochasticity (Sect. 14.3.1).
Second, there may exist τc such that βc at τ < τc is smaller than the βc value at
τ = 0. The comparison between the behaviour of βc at m = 1 and m = 10 (Fig. 14.4a
and b) leads us to hypothesise that τc (> 0) exists only for m > m c for a positive value
of m c between 1 and 10. We should obtain

dβ_c/dτ = 0   (14.28)

at (τ, m) = (0, m c ). The derivative of Eq. (14.24) with respect to τ gives

Fig. 14.4 Epidemic threshold and the numerically simulated prevalence when m = 1 (a and c)
and m = 10 (b and d). In a and b, all nodes have the same activity potential value a. In c and d,
the activity potential (ε ≤ a_i ≤ 0.9, 1 ≤ i ≤ N) is assumed to obey a power-law distribution with
exponent 3. The solid lines represent the analytical estimate of the epidemic threshold given by
Eq. (14.26) in (a), Eq. (14.24) in (b) and Eq. (14.35) in (c) and (d). We set N = 2000 and adjust
the values of a and ε such that the mean degree is the same (i.e., ⟨k⟩ = 0.1) for the four cases.
We simulated the stochastic SIS dynamics using the quasistationary state method (De Oliveira and
Dickman 2005), as we did in our previous studies (Onaga et al. 2017; Speidel et al. 2016), and
calculated the prevalence as an average over 100 realisations after discarding the first 15,000 time
steps (with time step 0.005)

∂f/∂τ + (∂f/∂β_c)(dβ_c/dτ) = 0.   (14.29)

By combining Eqs. (14.28) and (14.29), one obtains ∂f/∂τ = 0, leading to

m_c = 2.   (14.30)

When m ≤ m c , any finite value of τ increases the epidemic threshold and reduces
the risk of the prevalence. When m > m c , a small positive τ reduces the epidemic
threshold and increases the risk of the prevalence, whereas a larger τ increases the
epidemic threshold and reduces the prevalence.

Fig. 14.5 Epidemic threshold βc for the clique-based activity-driven network model. In a, we set
the activity potential to a for all nodes. In b, the activity potential (ε ≤ a_i ≤ 0.9, 1 ≤ i ≤ N) obeys
a power-law distribution with exponent 3. We set ⟨k⟩ = 0.1 for m = 1 and adjusted the values of a
(in (a)) or ε (in (b)) such that the value of βc at τ = 0 is independent of m. In the “extinct” phase,
the epidemic threshold βc is effectively infinity such that infection eventually dies out for any finite
β. In the “hindered” phase, βc is finite and is larger than the value at τ = 0. In the “promoted”
phase, βc is smaller than the value at τ = 0. The solid and dashed lines represent τ∗ (Eq. (14.27))
and τc , respectively. The “extinct” regions are determined as the regions in which βc > 100

14.3.4 General Activity Distributions

To obtain the epidemic threshold for general distributions of the activity potential,
we use a moment closure method in this section. Note that a generating function
approach, which is more complicated than the present moment closure method, yields
the epidemic threshold without approximation (Onaga et al. 2017). By averaging
Eq. (14.21) over the nodes having various activity potentials distributed according to
F(a), we obtain

⟨ρ(t + τ)⟩ = [e^{−τ} + m⟨a⟩(c_1 − e^{−τ}) + m^2⟨a⟩c_2] ⟨ρ(t)⟩
            + (c_1 − e^{−τ} + m c_2) ⟨a ρ(a, t)⟩.   (14.31)

By multiplying Eq. (14.21) by a and averaging over a, we obtain

⟨a ρ(a, t + τ)⟩ = [m⟨a^2⟩ + m(m − 1)⟨a⟩^2] c_2 ⟨ρ(t)⟩
               + [e^{−τ} + m⟨a⟩(c_1 − e^{−τ}) + m⟨a⟩c_2] ⟨a ρ(a, t)⟩
               + (c_1 − e^{−τ}) ⟨a^2 ρ(a, t)⟩.   (14.32)

To close the system of equations given by Eqs. (14.31) and (14.32), we approximate
⟨a^2 ρ(a, t)⟩ by ⟨a⟩⟨a ρ(a, t)⟩. Then, we obtain

(⟨ρ(t + τ)⟩, ⟨a ρ(a, t + τ)⟩)^T = T (⟨ρ(t)⟩, ⟨a ρ(a, t)⟩)^T,   (14.33)

where

T = \begin{pmatrix}
e^{−τ} + m⟨a⟩(c_1 − e^{−τ}) + m^2⟨a⟩c_2  &  c_1 − e^{−τ} + m c_2 \\
[m⟨a^2⟩ + m(m − 1)⟨a⟩^2] c_2  &  e^{−τ} + (m + 1)⟨a⟩(c_1 − e^{−τ}) + m⟨a⟩c_2
\end{pmatrix}.   (14.34)

A positive prevalence ⟨ρ(t)⟩ is expected if and only if the largest eigenvalue of


T (βc , τ, m) exceeds 1. This condition results in the implicit equation for the epidemic
threshold given by

f(β_c, τ, m) ≡ m(m + 1)⟨a⟩^2 q^2 + [(m^2 + m + 1)⟨a⟩^2 − ⟨a^2⟩] q r + [⟨a⟩^2 − ⟨a^2⟩] r^2
            − (2m + 1)⟨a⟩ q − (m + 1)⟨a⟩ r + 1 = 0,   (14.35)

where

q(β, τ, m) = [c_1(β, τ, m) − e^{−τ}] / (1 − e^{−τ}),   (14.36)
r(β, τ, m) = m c_2(β, τ, m) / (1 − e^{−τ}).   (14.37)

By solving Eq. (14.35), one can obtain the epidemic threshold β_c for any com-
bination of τ and m. If the activity potentials of all nodes are the same such that
⟨a^2⟩ = ⟨a⟩^2, Eq. (14.35) is reduced to Eq. (14.24).
In Fig. 14.4c and d, the epidemic threshold given as the numerical solution of
Eq. (14.35) (solid lines) is compared with the prevalence obtained by direct numerical
simulations of the stochastic dynamics of the model. Although we employed the
moment closure approximation, our theory describes the numerical results fairly
well. The dependence of βc on τ and m is similar to when the activity potential is
the same for all nodes (Fig. 14.4a and b).
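To turn Eqs. (14.34)–(14.37) into numbers, one can locate β_c as the value of β at which the largest eigenvalue of T crosses 1. The Python sketch below is our own illustration; it reuses the c1_c2 helper defined in the sketch of Sect. 14.3.2, and the power-law cutoff (and hence the mean degree) is an arbitrary choice rather than the value used in Fig. 14.4.

```python
import numpy as np
from scipy.optimize import brentq

# Assumes c1_c2(beta, tau, m) from the sketch in Sect. 14.3.2 is available.

def spectral_radius_T(beta, tau, m, a_mean, a2_mean):
    """Largest eigenvalue (in modulus) of the 2 x 2 matrix T of Eq. (14.34)."""
    c1, c2 = c1_c2(beta, tau, m)
    e = np.exp(-tau)
    T = np.array([
        [e + m * a_mean * (c1 - e) + m ** 2 * a_mean * c2, c1 - e + m * c2],
        [(m * a2_mean + m * (m - 1) * a_mean ** 2) * c2,
         e + (m + 1) * a_mean * (c1 - e) + m * a_mean * c2],
    ])
    return np.max(np.abs(np.linalg.eigvals(T)))

# Illustrative moments for a power-law activity distribution F(a) ~ a^(-3) on [eps, 0.9]
rng = np.random.default_rng(0)
eps = 0.001
a_samples = (eps ** -2 - (eps ** -2 - 0.9 ** -2) * rng.random(200_000)) ** -0.5
a_mean, a2_mean = a_samples.mean(), (a_samples ** 2).mean()

tau, m = 0.05, 10
beta_c = brentq(lambda b: spectral_radius_T(b, tau, m, a_mean, a2_mean) - 1, 1e-3, 50)
print(beta_c)   # epidemic threshold under the moment-closure approximation
```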
The phase diagram of the epidemic threshold is shown in Fig. 14.5b. The figure indi-
cates that the network dynamics (i.e., finite positive τ) always suppress epidemics
when m < 2, whereas small positive values of τ enhance epidemics when m > 2. If
τ is large enough, infection always vanishes.

Fig. 14.6 Numerically simulated prevalence on clique-based activity-driven networks with attrac-
tiveness of nodes. a γ = −1, m = 1. b γ = −1, m = 10. c γ = 0, m = 1. d γ = 0, m = 10. e
γ = 1, m = 1. f γ = 1, m = 10. The activity potential (ε ≤ a_i ≤ 0.9, 1 ≤ i ≤ N) obeys a power-
law distribution with exponent 3. The colour indicates the prevalence. We set N = 2000 and adjust
the value of ε such that the mean degree ⟨k⟩ is the same in all the cases

14.4 Clique-Based Activity-Driven Networks with Attractiveness

In this section, we consider a generalisation of the clique-based activity-driven net-


works. In social networks, the chance for nodes to be selected by active nodes, i.e.,
attractiveness, may depend on the nodes. As a variant of the activity-driven network
model, we consider the model in which each node is assigned a positive attractive-
ness bi (1 ≤ i ≤ N ) (Alessandretti et al. 2017; Pozzana et al. 2017). When a node
is activated, it creates a clique with m nodes randomly selected with probability
b_i/(⟨b⟩N). In general, a node is assigned an activity a_i and attractiveness b_i that
are drawn from the joint probability density H (a, b). We focus on the special case in
which activity and attractiveness have a deterministic relationship given by b ∼ a γ ,
i.e. , H (a, b) = F(a)δ(b − a γ ), where δ(x) is the Dirac delta function (Alessandretti
et al. 2017; Pozzana et al. 2017). We also assume that ai is distributed according to
the same power law as that used in Fig. 14.4c and d.
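Relative to the model of Sect. 14.2, the only change is how the m clique members are selected. A minimal sketch of this attractiveness-biased selection step (our own illustration, with b_i = a_i^γ) is:

```python
import numpy as np

rng = np.random.default_rng(0)
N, m, gamma = 2000, 10, 1.0
eps = 0.01
a = (eps ** -2 - (eps ** -2 - 0.9 ** -2) * rng.random(N)) ** -0.5   # F(a) ~ a^(-3)
b = a ** gamma                                                      # attractiveness b_i = a_i^gamma

i = 0                                     # an activated node
w = b.copy()
w[i] = 0.0                                # the activated node cannot pick itself
members = rng.choice(N, size=m, replace=False, p=w / w.sum())
print(members)
```

For γ > 0, high-activity nodes are also more likely to be recruited into cliques created by other nodes, which is the positive activity–attractiveness correlation discussed below.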
The prevalence obtained by direct numerical simulations is shown in Fig. 14.6
for two values of m, three values of γ and a range of τ and β. When the activity
potential and attractiveness are negatively correlated (Fig. 14.6a and b), the results
are qualitatively the same as the cases without attractiveness (Fig. 14.6c and d, which
are equivalent to the colour maps shown in Fig. 14.4c and d). In contrast, a positive
correlation between the activity potential and attractiveness yields larger prevalence
(Fig. 14.6e and f). In this case, for a wider range of τ , the prevalence is positive for both
m = 1 and m = 10. In addition, for m = 10 (Fig. 14.6f), the prevalence is positive
even at relatively small values of β. These results imply that the epidemic threshold
is smaller when the activity potential and attractiveness are positively correlated than
otherwise. This is consistent with the previous result that positive correlation between
the activity potential and attractiveness facilitates epidemics (Pozzana et al. 2017).

14.5 Conclusions

We introduced a theoretical approach to stochastic SIS dynamics on a switching


temporal network model and its extension, which are variants of the activity-driven
network model. We found that the epidemic threshold and prevalence, and how they
compare with the case of static networks, mainly depend on the level of concurrency
and the distribution of attractiveness.

Acknowledgements T. O. acknowledges the support provided through JSPS KAKENHI Grant


Number JP19K14618 and JP19H01506. J. G. acknowledges the support provided through Science
Foundation Ireland (Grants No. 16/IA/4470 and No. 16/RC/3918). N. M. acknowledges the support
provided through JST, CREST, and JST, ERATO, Kawarabayashi Large Graph Project.
274 T. Onaga et al.

References

L. Alessandretti, K. Sun, A. Baronchelli, N. Perra, Phys. Rev. E 95, 052318 (2017)


S. Bansal, J. Read, B. Pourbohloul, L.A. Meyers, J. Biol. Dyn. 4, 478–489 (2010)
M.M. De Oliveira, R. Dickman, Phys. Rev. E 71, 016129 (2005)
M. Hasler, V. Belykh, I. Belykh, SIAM J. Appl. Dyn. Syst. 12, 1031–1084 (2013)
J. Hindes, I.B. Schwartz, Phys. Rev. Lett. 117, 028302 (2016)
P. Holme, Eur. Phys. J. B 88, 234 (2015)
M.J. Keeling, J.V. Ross, J. R. Soc. Interface 5, 171–181 (2008)
I.Z. Kiss, J.C. Miller, P.L. Simon, Mathematics of Epidemics on Networks: From Exact to Approx-
imate Models (Springer, Cham, 2017)
M. Kretzschmar, M. Morris, Math. Biosci. 133, 165–195 (1996)
D. Liberzon, Switching in Systems and Control. Systems and Control: Foundations and Applications
(Birkhäuser Boston, Boston, MA, 2003)
N. Masuda, P. Holme (eds.), Temporal Network Epidemiology (Springer, Singapore, 2017)
N. Masuda, P. Holme, F1000Prime Rep. 5, 6 (2013)
N. Masuda, K. Klemm, V.M. Eguíluz, Phys. Rev. Lett. 111, 188701 (2013)
M. Morris, M. Kretzschmar, Soc. Netw. 17, 299–318 (1995)
M. Morris, M. Kretzschmar, AIDS 11, 641–648 (1997)
T. Onaga, J.P. Gleeson, N. Masuda, Phys. Rev. Lett. 119, 108301 (2017)
R. Pastor-Satorras, C. Castellano, P. Van Mieghem, A. Vespignani, Rev. Mod. Phys. 87, 925–979
(2015)
N. Perra, B. Gonçalves, R. Pastor-Satorras, A. Vespignani, Sci. Rep. 2, 469 (2012)
I. Pozzana, K. Sun, N. Perra, Phys. Rev. E 96, 042310 (2017)
P.L. Simon, M. Taylor, I.Z. Kiss, J. Math. Biol. 62, 479–508 (2011)
L. Speidel, K. Klemm, V.M. Eguíluz, N. Masuda, New J. Phys. 18, 073013 (2016)
J. Stehlé, A. Barrat, G. Bianconi, Phys. Rev. E 81, 035101(R) (2010)
C. Tantipathananandh, T. Berger-Wolf, D. Kempe, in Proceedings of the Thirteenth ACM SIGKDD
International Conference on Knowledge Discovery and Data Mining (ACM, New York, 2007),
pp. 717–726
P. Van Mieghem, J. Omic, R. Kooij, IEEE Trans. Netw. 17, 1–14 (2009)
K. Zhao, M. Karsai, G. Bianconi, PLoS One 6, e28116 (2011)
Chapter 15
Dynamics and Control of Stochastically
Switching Networks: Beyond Fast
Switching

Russell Jeter, Maurizio Porfiri, and Igor Belykh

Abstract Many complex systems throughout science and engineering display on-
off intermittent coupling among their units. Stochastically blinking networks as a
specific type of temporal networks are particularly relevant models for such spo-
radically interacting systems. In blinking networks, connections between oscillators
stochastically switch in time with a given switching period. Most of the current
understanding of dynamics of such switching temporal networks relies on the fast
switching hypothesis, where the network dynamics evolves at a much faster time scale
than the individual units. In this chapter, we go beyond fast switching and uncover
highly nontrivial phenomena by which a network can switch between asynchronous
regimes and synchronize against all odds. We review a series of our recent papers
and provide analytical insight into the existence of windows of opportunity, where
network synchronization may be induced through non-fast switching. Using stability
and ergodic theories, we demonstrate the emergence of windows of opportunity and
elucidate their nontrivial relationship with network dynamics under static coupling.
In particular, we derive analytical criteria for the stability of synchronization for two
coupled maps and the ability of a single map to control an arbitrary network of maps.
This work not only presents new phenomena in stochastically switching dynamical
networks, but also provides a rigorous basis for understanding the dynamic mecha-
nisms underlying the emergence of windows of opportunity and leveraging non-fast
switching in the design of temporal networks.

Keywords Blinking networks · Averaging · Master stability function · Stochastic stability · Window of opportunity

R. Jeter
Department of Biomedical Informatics, Emory University, Atlanta, GA 30322, USA
e-mail: rjeter@emory.edu
M. Porfiri · I. Belykh (B)
Department of Mechanical and Aerospace Engineering, New York University Tandon
School of Engineering, Brooklyn, NY 11201, USA
e-mail: ibelykh@gsu.edu
M. Porfiri
e-mail: mporfiri@nyu.edu


15.1 Introduction

Collective behavior within networks has received a considerable amount of attention


in the literature, from animal grouping to robotic motion (Sumpter 2010; Ren and
Beard 2008). One type of collective behavior, synchronization, is particularly impor-
tant in how prevalent it is in real-world systems (Boccaletti et al. 2002; Pikovsky
et al. 2003; Arenas et al. 2008). Synchronization is one of the most basic instances
of collective behavior, and one of the easiest to diagnose: it occurs when all of the
nodes in a network act in unison. Typically, it manifests in ways similar to a school of
fish moving as one larger unit to confuse or escape from a predator (Camazine et al.
2003), or a collection of neurons firing together during an epileptic seizure (Netoff
and Schiff 2002).
Significant attention has been devoted to the interplay between node dynamics
and network topology which controls the stability of synchronization (Pecora and
Carroll 1998; Belykh et al. 2004b; Li and Chen 2006; Nishikawa and Motter 2010).
Most studies have looked at networks whose connections are static; networks with
a dynamically changing network topology, called temporal or evolving networks,
are only recently appearing into the scientific literature (Gorochowski et al. 2010,
2012; De Lellis et al. 2008, 2010a, b, 2013; Stojanovski et al. 1997; Ito and Kaneko
2001; Belykh et al. 2004a, 2013; Hasler and Belykh 2005; Skufca and Bollt 2004;
Lu and Chen 2005; Lu 2007; Mondal et al. 2008; Chen et al. 2009; Zanette and
Mikhailov 2004; Porfiri et al. 2006, 2008; Porfiri and Pigliacampo 2008; Porfiri and
Fiorilli 2009a; Porfiri 2012, 2011; Yu et al. 2012; Sorrentino and Ott 2008; So et al.
2008; Hasler et al. 2013a, b; Dorogovtsev and Mendes 2002; Abaid and Porfiri 2011;
Frasca et al. 2008; Masuda and Holme 2017; Masuda et al. 2013) (see the recent
books Holme and Saramäki 2012, 2013 for additional references).
A particular class of evolving dynamical networks is represented by on-off switch-
ing networks, called “blinking” networks (Belykh et al. 2004a; Porfiri et al. 2008)
where connections switch on and off randomly and the switching time is fast, with
respect to the characteristic time of the individual node dynamics. As summarized
in a recent review (Belykh et al. 2014), different aspects of synchronization, con-
sensus, and multistability in stochastically blinking networks of continuous-time
and discrete-time oscillators have been studied in the fast-switching limit where the
dynamics of a stochastically switching network is close to the dynamics of a static
network with averaged, time-independent connections. While a mathematically rig-
orous theory of synchronization in fast-switching blinking networks is available
(Belykh et al. 2004a; Hasler and Belykh 2005; Skufca and Bollt 2004; Porfiri et al.
2006, 2008; Porfiri and Pigliacampo 2008; Porfiri and Fiorilli 2009a; Porfiri 2012,
2011; Hasler et al. 2013a, b; Belykh et al. 2013; Jeter and Belykh 2015a), the analysis
of synchronization in non-fast switching networks of continuous-time oscillators has
proven to be challenging and often elusive.
Non-fast switching connections yield a plethora of unexpected dynamical phe-
nomena, including (i) the existence of a significant set of stochastic sequences and
optimal frequencies for which the trajectory of a multistable switching oscillator can

converge to a “wrong” ghost attractor (Belykh et al. 2013) and (ii) bounded windows
of intermediate switching frequencies (“windows of opportunity”) in which syn-
chronization becomes stable in a switching network over bounded intervals of the
switching frequency, which may not include the fast switching limit (Jeter and Belykh
2015a). As a result, networks that do not synchronize in the fast switching limit may
synchronize for non-fast switching, and then lose synchronization as the frequency
is further reduced. Found numerically in networks of continuous-time Rössler and
Duffing oscillators (Jeter and Belykh 2015a) and Rosenzweig–MacArthur food chain
models (Jeter and Belykh 2015b), the emergence of windows of opportunity calls
for a rigorous explanation of unexpected synchronization from non-fast switching.
Blinking networks of discrete-time systems (maps) with non-fast switching offer
such a mathematical treatment (Golovneva et al. 2017; Porfiri and Belykh 2017;
Jeter et al. 2018a, b; Porfiri et al. 2019). More precisely, the switching period in
discrete-time networks can be quantified as a number of the individual map’s iterates
such that rescaling of time yields a new, multi-iterate map that is more convenient
to work with. This enables the formulation of a rigorous mathematical framework
for the analysis of the stochastic stability of synchronization as a function of the
switching period.
The purpose of this chapter is to give a detailed overview of this rigorous math-
ematical framework and reveal the central role of non-fast switching which may
provide opportunity for stochastic synchronization in a range of switching peri-
ods where fast switching fails to synchronize the maps. We start with a historical
perspective and a short review of the existing fast-switching theory for networks
of continuous-time oscillators and discuss a motivating example of coupled Rössler
oscillators with non-fast switching (Sect. 15.2). Then, we present the stochastic model
of coupled maps and introduce the mean square stability of the transverse dynam-
ics. To isolate the delicate mechanisms underpinning stochastic synchronization,
we consider two coupled maps with independent identically distributed stochastic
switching and study the stability of synchronization as a function of the switch-
ing period (Sect. 15.3). In Sect. 15.4, we extend our rigorous toolbox to assess the
mean-square stability of controlled synchronization in broadcaster-network systems.
We examine the feasibility of on-off broadcasting from a single reference node to
induce synchronization in a target network with connections from the reference node
that stochastically switch in time with an arbitrary switching period. Internal con-
nections within the target network are static and promote the network’s resilience
to externally induced synchronization. Through rigorous mathematical analysis, we
uncover a complex interplay between the network topology and the switching period
of stochastic broadcasting, fostering or hindering synchronization to the reference
node. With coupled chaotic tent maps as our test-bed, we prove the emergence of
“windows of opportunity” where only non-fast switching periods are favorable for
synchronization. The size of these windows of opportunity is shaped by the Lapla-
cian spectrum such that the switching period needs to be manipulated accordingly
to induce synchronization. Surprisingly, only the zero and the largest eigenvalues of
the Laplacian matrix control these windows of opportunities for tent maps within a
wide parameter region.

15.2 The Blinking Network Model: Continuous-Time Systems

“Blinking” networks were originally introduced for continuous-time oscillators in


the context of network synchronization in Belykh et al. (2004a). A blinking network
consists of N oscillators interconnected pairwise via a stochastic communication
network:
dx_i/dt = F_i(x_i) + ε Σ_{j=1}^{N} s_{ij}(t) P (x_j − x_i),   (15.1)

where xi (t) ∈ Rd is the state of oscillator i, Fi : Rd → Rd describes the oscillators’


individual dynamics, ε > 0 is the coupling strength. The d × d matrix P determines
which variables couple the oscillators, si j (t) are the elements of the time-varying
connectivity (Laplacian) matrix G(t). The existence of an edge from vertex i to vertex
j is determined randomly and independently of other edges with probability p ∈
[0, 1]. Expressed in words, every switch in the network is operated independently,
according to a similar probability law, and each switch opens and closes in different
time intervals independently. All possible edges si j = s ji are allowed to switch on
and off so that the communication network G(t) is constant during each time interval
[kτ, (k + 1)τ ) and represents an Erdős–Rényi graph of N vertices. Figure 15.1 gives
an example of a “blinking” graph.
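As a quick consistency check of the averaging idea (our own illustration, not from the chapter), one can draw many independent Erdős–Rényi snapshots and verify that their average Laplacian, scaled by the coupling strength ε, approaches the all-to-all Laplacian with coupling pε shown in the bottom panel of Fig. 15.1:

```python
import numpy as np

rng = np.random.default_rng(0)
N, p, eps, n_windows = 6, 0.5, 1.0, 50_000

L_sum = np.zeros((N, N))
for _ in range(n_windows):
    A = np.triu(rng.random((N, N)) < p, 1)          # Erdos-Renyi snapshot
    A = A + A.T
    L_sum += eps * (np.diag(A.sum(axis=1)) - A)     # graph Laplacian of the snapshot

L_empirical = L_sum / n_windows
L_averaged = p * eps * (N * np.eye(N) - np.ones((N, N)))   # all-to-all, strength p*eps
print(np.abs(L_empirical - L_averaged).max())               # small for many snapshots
```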
The switching network (15.1) is a relevant model for stochastically changing
networks such as information processing cellular neural networks (Hasler and Belykh
2005) or epidemiological networks (Porfiri et al. 2006; Frasca et al. 2008; Jeter
and Belykh 2015b). For example, independent and identically distributed (i.i.d.)
stochastic switching of packet networks communicating through the Internet comes
from the fact that network links have to share the available communication time

Fig. 15.1 (Top) Two subsequent instances of the switching network, at t = 0.05 and t = 0.06.
Probability of an edge p = 0.5, the switching time step τ = 0.01. (Bottom) The corresponding
averaged network, where the switching connections of strength ε are replaced with static all-to-all
connections of strength pε, representing their mean value

slots with many other packets belonging to other communication processes and the
congestion of the links by the other packets can also occur independently.
As far as network synchronization is concerned, local computer clocks, that are
required to be synchronized throughout the network, are a representative example.
Clock synchronization is achieved by sending information about each computer’s
time as packets through the communication network (Belykh et al. 2004a). The local
clocks are typically implemented by an uncompensated quartz oscillator. As a result,
the clocks can be unstable/inaccurate and need to receive synchronizing signals,
that aim to reduce the timing errors. These signals must be sufficiently frequent to
guarantee sufficient precision of synchronization between the clocks. At the same
time, the communication network must not be overloaded by the administrative
signals. This is a compromise between the precision of synchronization and the traffic
load on the network. Remarkably, this blinking network administration can provide
precise functioning of a network composing of imprecise elements. It also indicates
the importance of optimal switching frequencies that ensure this compromise.

15.2.1 Historical Perspective: Fast Switching Theory

Over the years, various aspects of synchronization in fast switching networks of


continuous-time oscillators have been extensively investigated (Belykh et al. 2004a;
Skufca and Bollt 2004; Porfiri et al. 2006, 2008; Porfiri and Pigliacampo 2008;
Porfiri and Fiorilli 2009a; Porfiri 2012; Jeter and Belykh 2015a). In particular, it
was rigorously proved in both continuous and discrete-time cases that if the switching
frequency is sufficiently high, with respect to the characteristic time of the individual
oscillators (fast switching limit), the stochastically blinking network can synchronize
even if the network is disconnected at every instant of time.
Beyond synchronization, a rigorous theory for the behavior of stochastic switching
networks of continuous-time oscillators in the fast switching limit was developed in
Hasler and Belykh (2005), Hasler et al. (2013a, b) and Belykh et al. (2013). In general,
it was proved in Hasler et al. (2013a, b) that for switching dynamical systems of this
type, if the stochastic variables switch sufficiently fast, the behavior of the stochastic
system will converge to the behavior of the averaged system in finite time, where the
dynamical law is given by the expectation of the stochastic variables. These studies
have also helped clarify a number of counterintuitive findings about the relationship
between the stochastic network and its time-averaged counterpart. While intuition
suggests that the switching network should follow the averaged system in the fast
switching limit, this is not always the case, especially when the averaged system
is multistable and its attractors are not invariant under the switching system. These
attractors act as ghost attractors for the switching system, whereby the trajectory
of the switching system can only reach a neighborhood of the ghost attractor, and
remains close most of the time with high probability when switching is fast. In a
multistable system, the trajectory may escape to another ghost attractor with low
probability (Hasler et al. 2013b). This theory uses the Lyapunov function method

along with large deviation bounds to derive explicit conditions that connect the
probability of converging towards an attractor of a multistable blinking network,
the fast switching frequency, and the initial conditions. As the switching frequency
decreases, it was shown that there is a range of “resonant” frequencies where the
trajectory of a multistable switching oscillator receives enough kicks in the wrong
direction to escape from the ghost attractor against all odds (Belykh et al. 2013).
Indeed, there are circumstances for which not converging to the averaged system is
favorable, and the present fast-switching theory is not able to make definitive claims
about the behavior of the stochastic system. This leads us to explore the effects of
non-fast-switching on the dynamics of the switching network.

15.2.2 Beyond Fast Switching: A Motivating Example

We begin with a numerical example of the stochastic Erdős–Rényi network (15.1)


composed of ten x-coupled Rössler oscillators:
ẋ_i = −(y_i + z_i) + ε Σ_{j=1}^{10} s_{ij}(t)(x_j − x_i),
ẏ_i = x_i + a y_i,                                            (15.2)
ż_i = b + z_i (x_i − c).

Hereafter, the intrinsic parameters are chosen and fixed as follows: a = 0.2, b = 0.2,
c = 7. The averaged network is an all-to-all network with a fixed coupling strength
pε. Synchronization in a network of x-coupled Rössler systems is known (Pecora and
Carroll 1998) to destabilize after a critical coupling strength ε∗ , which depends on
the eigenvalues of the connectivity matrix G. We choose the coupling strengths in the
stochastic network such that the coupling in the averaged network is defined by one
of the three values, marked in Fig. 15.2. In particular, for ε = 1, synchronization in
the averaged network is unstable. As a result, synchronization in the fast-switching
network is also unstable. Surprisingly, there is a window of intermediate switching
frequencies for which synchronization becomes stable (see Fig. 15.3). In fact, the
stochastic network switches between topologies, a large proportion of which do not
support synchronization or are simply disconnected.
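One way to reproduce this kind of experiment is to integrate Eq. (15.2) window by window, redrawing the Erdős–Rényi coupling graph every τ time units. The Python sketch below is our own minimal illustration: the intrinsic parameters follow the text, while the synchronization measure (the standard deviation of the x-coordinates at the window boundaries) is a crude proxy rather than the criterion used to produce Fig. 15.3.

```python
import numpy as np
from scipy.integrate import solve_ivp

def blinking_rossler(N=10, eps=1.0, p=0.5, tau=1.0, n_windows=200, seed=0):
    """Eq. (15.2): x-coupled Rossler oscillators on an Erdos-Renyi graph
    that is redrawn independently at the start of every window of length tau."""
    a, b, c = 0.2, 0.2, 7.0
    rng = np.random.default_rng(seed)
    state = rng.uniform(-1, 1, size=3 * N)        # (x_1..x_N, y_1..y_N, z_1..z_N)

    def rhs(t, s, L):
        x, y, z = s[:N], s[N:2 * N], s[2 * N:]
        dx = -(y + z) - eps * (L @ x)             # diffusive coupling through the x variable
        dy = x + a * y
        dz = b + z * (x - c)
        return np.concatenate([dx, dy, dz])

    sync_error = []
    for _ in range(n_windows):
        A = np.triu(rng.random((N, N)) < p, 1)    # new Erdos-Renyi graph for this window
        A = A + A.T
        L = np.diag(A.sum(axis=1)) - A
        sol = solve_ivp(rhs, (0.0, tau), state, args=(L,), rtol=1e-6, atol=1e-9)
        state = sol.y[:, -1]
        sync_error.append(np.std(state[:N]))      # spread of the x-coordinates
    return np.array(sync_error)

err = blinking_rossler(eps=1.0, p=0.5, tau=1.0)
print(err[-5:])   # persistently small values indicate (near-)synchronization
```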
To better isolate the above effect and gain insight into what happens when switch-
ing between a connected network in which the synchronous solution is unstable, and
a completely disconnected network in which the nodes’ trajectories behave indepen-
dently of one another, we consider a two-node Rössler network (15.2). Figure 15.4
demonstrates the emergence of synchronization windows for various intermediate
values of τ for which the fast-switching network does not support synchronization.
In essence, the system is switching between two unstable systems, and yet when
the switching period τ is in a favorable range within a window of opportunity, the
system stabilizes.

Fig. 15.2 Transversal stability of synchronization in the averaged ten-node x-coupled Rössler
system, expressed via the largest transversal Lyapunov exponent. Synchronization is stable within
the interval ε− < ε < ε+ [not shown]. The values of ε used in Fig. 15.3 are marked with dots in red,
light blue, and navy

Fig. 15.3 Probability of synchronization in the ten-node stochastic Rössler network with differ-
ing coupling strengths, showing the effects of varying τ . These are coupling strengths for which
synchronization in the averaged system is stable (red), weakly unstable (light blue), and strongly
unstable (navy), respectively (cf. Fig. 15.2 for the values marked with appropriately colored dots).
The bell-shaped curve corresponds to an optimal range of non-fast switching 0.6 < τ < 2.2 (the
“window of opportunity”), where synchronization in the stochastic network becomes stable with
high probability, whereas synchronization in the corresponding averaged system is unstable (ε = 1).
Switching probability p = 0.5. Probability calculations are based on 1000 trials

Fig. 15.4 Probability of synchronization in the two-node Rössler network (15.2) as a function of
the switching probability p and switching period τ . Yellow (lighter) colors correspond to higher
probability of convergence (with light yellow at probability 1) and blue (darker) colors correspond
to lower probabilities (dark blue at probability 0). The coupling strength of the connection is fixed at
ε = 7. As p increases, pε, the effective coupling in the averaged/fast-switching network progresses
through the window of synchrony indicated in Fig. 15.2. For the two-node network this interval is
pε ∈ [0.08, 2.2], yielding the stability range p ∈ [0.011, 0.31] (the yellow interval on the y-axis)
for ε = 7 and small τ . Probability calculations are based on 1000 trials

Observed numerically in the network of continuous-time oscillators, this phe-


nomenon calls for a more rigorous study to isolate the principal mechanisms under-
pinning unexpected synchronization from non-fast switching. The following sections
aim at establishing such an analytical insight in more analytically tractable networks
of discrete-time oscillators.

15.3 Revealing Windows of Opportunity in Two Stochastically Coupled Maps

We focus on a discrete-time setting, where the coupling between the maps is held fixed
for a finite number of time steps (switching period) and then it stochastically switches,
independent of the time history. In this case, non-fast switching can be studied by re-
scaling the time variable and consequently modifying the individual dynamics of the
coupled maps. This enables the formulation of a rigorous mathematical framework
for the analysis of the stochastic stability of synchronization as a function of the
switching period. We restrict our analysis to two coupled maps with the two-fold

aim of: (i) providing a clear demonstration for the origin of this phenomenon, which
may be hidden by topological factors in large networks and (ii) establishing a toolbox
of closed-form results for the emergence of windows of opportunity.

15.3.1 Network Model

We study the stochastic synchronization of two maps characterized by the state vari-
ables xi ∈ R, i ∈ {1, 2}. We assume that the individual dynamics of each node evolves
according to xi (k + 1) = F(xi (k)), where k ∈ Z+ is the time step and F : R → R
is a smooth nonlinear scalar function. The maps are linearly coupled through the
stochastic gains ε1 (k), ε2 (k) ∈ R, such that

x_1(k + 1) = F(x_1(k)) + ε_1(k)(x_2(k) − x_1(k)),
x_2(k + 1) = F(x_2(k)) + ε_2(k)(x_1(k) − x_2(k)).   (15.3)

Each of the sequences of coupling gains, ε1 (0), ε1 (1), ε1 (2), . . . and ε2 (0),
ε2 (1), ε2 (2), . . ., is assumed to be switching stochastically with the same period
m ∈ Z+ \ {0}. Every m time steps, the coupling gains simultaneously switch,
such that ε1 (mk) = ε1 (mk + 1) = · · · = ε1 (mk + m − 1) = ε̃1 (k) and ε2 (mk) =
ε2 (mk + 1) = · · · = ε2 (mk + m − 1) = ε̃2 (k) for every time step k, where ε̃1 (0),
ε̃1 (1), . . . and ε̃2 (0), ε̃2 (1), . . . are two sequences of independent and identically
distributed random variables.
The evolution of the coupled maps in Eq. (15.3) is determined by the random
variables ε̃1 and ε̃2 , from which the coupling gains are drawn. In general, these random
variables may be related to each other and may not share the same distribution.
For example, in the case of uni-directional stochastic coupling, one of the random
variables is zero; on the other hand, for bi-directional interactions, the two random
variables coincide.
The majority of the work on stochastic synchronization of coupled discrete maps
is largely limited to the case m = 1, for which the coupling gains switch at every
time step (Porfiri 2011). In this case, the random variables εi (0), εi (1), εi (2), . . ., for
i ∈ {1, 2}, are mutually independent. For each value of k, x1 (k + 1) and x2 (k + 1)
are functions only of the previous values x1 (k) and x2 (k), and Eq. (15.3) reduces to
a first order Markov chain with explicit dependence on time through the individual
dynamics. In the case of m > 1, the random variables εi (0), εi (1), εi (2), . . ., for
i ∈ {1, 2}, are no longer independent, which poses further technical challenges for
the analysis of the system, while opening the door for rich behavior to emerge from
the stochastically driven coupling.
The oscillators synchronize at time step k if their states are identical, that is,
x1 (k) = x2 (k). From Eq. (15.3), once the oscillators are synchronized at some time
step, they will stay synchronized for each subsequent time step. The common
synchronized trajectory s(k) is a solution of the individual dynamics, whereby
s(k + 1) = F(s(k)). The linear stability of synchronization can be studied through

the following variational equation, obtained by linearizing Eq. (15.3) in the neigh-
borhood of the synchronization manifold:

ξ(k + 1) = [F′(s(k)) − d(k)] ξ(k),   (15.4)

where prime indicates differentiation, d(k) = ε1 (k) + ε2 (k) is the net coupling, and
ξ(k) = x1 (k) − x2 (k) is the synchronization error at time step k. Equation (15.4)
defines the linear transverse dynamics of the coupled oscillators, measured with
respect to the difference between their states ξ(k). This quantity is zero when the
two oscillators are synchronized. Equation (15.4) relies on the assumption that the
mapping governing the individual dynamics, F, is differentiable everywhere. This
assumption can be relaxed, however, to functions that are differentiable almost every-
where (Pikovsky and Politi 2016).
Only the sum of the two coupling gains ε1 (k) and ε2 (k) affects the transverse
dynamics, so that only the statistics of the random variable d(k) modulate the linear
stability of the synchronization manifold. To simplify the treatment of the variational
problem in Eq. (15.4), we can rescale the time variable with respect to the switching
period as follows:


ξ̃(k + 1) = ∏_{i=0}^{m−1} (F′(s(mk + i)) − d̃(k)) ξ̃(k),    (15.5)

where ξ̃ (k) = ξ(mk) and d̃(k) = ε̃1 (k) + ε̃2 (k). Equation (15.5) casts the variational
dynamics in the form of a first order time-dependent Markov chain, generated by a
linear time-varying stochastic finite difference equation (Fang 1994; Kushner 1971).
It is important to emphasize that the synchronization manifold x1 (k) = x2 (k) is
an invariant set of the stochastic Eq. (15.3). Therefore, the dynamics of the synchro-
nization manifold is governed by an attractor of the mapping function F(s(k)).

15.3.2 Mean Square Stability of Synchronization

In determining the stability of the synchronous state, various criteria can be consid-
ered, such as almost sure, in probability, and mean square (Fang 1994; Kushner 1971;
Porfiri et al. 2019). The concept of mean square stability is particularly attractive,
due to its practicality of implementation and its inclusiveness with respect to other
criteria. Mean square stability of the synchronous state is ascertained through the
analysis of the temporal evolution of the second moment of the error E[ξ̃ 2 ], where
E[·] indicates expectation with respect to the σ -algebra generated by the switching.
By taking the square of each side of Eq. (15.5) and computing the expectation, we
obtain
E[ξ̃²(k + 1)] = E[ ∏_{i=0}^{m−1} (F′(s(mk + i)) − d̃(k))² ] E[ξ̃²(k)].    (15.6)

This recursion is a linear, time-varying, deterministic finite difference equation whose initial condition is ξ̃²(0), which is treated as a given value and not as a random variable. We say that Eq. (15.5) is mean square asymptotically stable if Eq. (15.6) is asymptotically stable, that is, if the Lyapunov exponent λ of Eq. (15.6) is negative. This implies that any small difference between the states of the oscillators will converge to zero in the mean square sense as time increases.
The Lyapunov exponent is a function of the switching period m and can be computed from Eq. (15.6) as follows (Pikovsky and Politi 2016):

λ(m) = lim_{k→∞} (1/k) Σ_{j=0}^{k−1} ln E[ ∏_{i=0}^{m−1} (F′(s(mj + i)) − d̃(j))² ].    (15.7)

In general, the stability of the synchronization manifold depends on the underlying synchronous solution, whereby λ(m) in Eq. (15.7) explicitly depends on s(k). In what
follows, we focus on the case where s(k) is a chaotic trajectory. We comment that our
approach is based on the linearized dynamics in Eq. (15.4), which describes small
perturbations from the synchronous state. Thus, our analysis is only applicable to the
study of local stability of the synchronization manifold, and initial conditions cannot
be arbitrarily selected in the basin of attraction.

15.3.3 Preliminary Claims

We assume that d̃(k) takes values on a finite sample space D = {d1, d2, . . . , dn} of cardinality n. For l = 1, . . . , n, the probability that the net coupling is equal to dl is chosen to be equal to pl. For example, in the case of simple on-off connections, the individual coupling gains take values 0 and ε with corresponding probabilities p and 1 − p. Therefore, the net coupling gain d̃(k) takes values d1 = 0, d2 = ε, and d3 = 2ε with corresponding probabilities p1 = p², p2 = 2p(1 − p), and p3 = (1 − p)².
From the individual values of the net coupling and their probabilities, we can
evaluate the Lyapunov exponent in Eq. (15.7) as
λ(m) = lim_{k→∞} (1/k) Σ_{j=0}^{k−1} ln [ Σ_{l=1}^{n} p_l ∏_{i=0}^{m−1} (F′(s(mj + i)) − d_l)² ].    (15.8)
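The time average in Eq. (15.8) can also be estimated directly along a single trajectory. The sketch below assumes the tent map of the representative example (Sect. 15.3.6) as the individual dynamics, so that F′(s) = ±a, and uses illustrative gains and probabilities; such direct estimates are subject to the numerical caveats discussed in Sect. 15.3.5.

    import numpy as np

    def tent(x, a=1.99999):
        # Slope slightly below 2: the exact slope-2 tent map collapses to 0 in floating point.
        return a * x if x < 0.5 else a * (1.0 - x)

    def tent_prime(x, a=1.99999):
        return a if x < 0.5 else -a

    def lyapunov_eq_15_8(m, d_vals, p_vals, n_blocks=50_000, s0=0.1234):
        # Time-average estimate of lambda(m) in Eq. (15.8) along the trajectory s(k).
        d = np.asarray(d_vals, dtype=float)
        p = np.asarray(p_vals, dtype=float)
        s, total = s0, 0.0
        for _ in range(n_blocks):
            block = np.ones_like(d)  # product over one window of m steps, one entry per d_l
            for _ in range(m):
                block *= (tent_prime(s) - d) ** 2
                s = tent(s)
            total += np.log(np.dot(p, block))
        return total / n_blocks

    # On-off example from the text: individual gains 0 and eps give net couplings 0, eps, 2*eps.
    eps, prob = 1.9, 0.5
    print(lyapunov_eq_15_8(m=3,
                           d_vals=[0.0, eps, 2 * eps],
                           p_vals=[prob ** 2, 2 * prob * (1 - prob), (1 - prob) ** 2]))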

One of the central objectives of this study is to understand the relationship between
the synchronizability of the coupled maps when statically coupled through the net
coupling gains in D and their stochastic synchronizability when the net coupling

randomly switches at a period m. Toward this aim, we adjust Eq. (15.8) to the case
of statically coupled maps with a net coupling d′:

λ^st(d′) = lim_{k→∞} (1/k) Σ_{j=0}^{k−1} ln (F′(s(j)) − d′)².    (15.9)

For convenience, we write λ_l^st = λ^st(d_l) for l = 1, . . . , n. Depending on the value of d_l, the statically coupled systems may synchronize or not, that is, the corresponding error dynamics may be asymptotically stable or unstable.
If all of the Lyapunov exponents of the statically coupled systems are finite, then we can establish the following relationship between the Lyapunov exponent of the stochastic error dynamics (15.8) and {λ_r^st}_{r=1}^{n}:
λ(m) = m λ_r^st + lim_{k→∞} (1/k) Σ_{j=0}^{k−1} ln [ Σ_{l=1}^{n} p_l ∏_{i=0}^{m−1} (F′(s(mj + i)) − d_l)² / ∏_{i=0}^{m−1} (F′(s(mj + i)) − d_r)² ].    (15.10)

Equation (15.10) is derived from Eq. (15.8) by: (i) dividing and multiplying the argument of the logarithm by ∏_{i=0}^{m−1} (F′(s(mj + i)) − d_r)²; (ii) using the product rule of logarithms; and (iii) applying Eq. (15.9) upon rescaling of the time variable by the period m.
By multiplying both sides of Eq. (15.10) by pr and summing over r , we obtain
the following compact relationship between the Lyapunov exponent of the stochastic
dynamics and the individual Lyapunov exponents for statically coupled maps:


n
1
k−1 n
pl ζl ( j)
λ(m) = m pl λlst + lim ln l=1
n pl . (15.11)
l=1 ζl ( j)
k→∞ k
l=1 j=0

Here, we have introduced:


m−1
ζl ( j) = (F  (s(m j + i)) − dl )2 , (15.12)
i=0

which we assume to be different from zero to ensure that the Lyapunov exponent
stays finite.
The first summand on the right-hand side of Eq. (15.11) is linearly proportional to the switching period m and the “effective” Lyapunov exponent λ̄ = Σ_{l=1}^{n} p_l λ_l^st, which corresponds to the average of the Lyapunov exponents associated with the statically coupled maps, weighted by the probability of the corresponding switching. The second summand is a residual quantity, which is always nonnegative and encapsulates the complex dependence of the transverse dynamics on the switching period beyond the linear dependence associated with the first summand.

15.3.4 Necessary Condition for Mean Square Synchronization

Proposition 15.3.1 The synchronization of the stochastic system (15.3) is mean square stable only if the effective Lyapunov exponent λ̄ is negative.

Proof A lower bound for the Lyapunov exponent λ(m) can be obtained by applying the weighted arithmetic-geometric mean inequality (Bullen and Mitrinovic 2013)

∏_{l=1}^{n} ζ_l^{p_l} ≤ Σ_{l=1}^{n} p_l ζ_l.    (15.13)

From inequality (15.13), it follows that the argument of the logarithm in Eq. (15.11)
is larger than or equal to 1. As a result, we obtain

λ(m) ≥ m λ̄. (15.14)

This inequality establishes that for λ(m) to be negative, λ̄ must also be negative. 

Remark 15.1 From the previous claim, we posit that if none of the Lyapunov exponents {λ_r^st}_{r=1}^{n} are negative, synchronization is not feasible for any selection of m and {p_r}_{r=1}^{n}. Thus, stochastic synchronization cannot be achieved without at least one coupling configuration that supports synchronization. This is in contrast with observations from continuous-time systems, which indicate the possibility of stable synchronization even if none of the coupling configurations support synchronization (Jeter and Belykh 2015a, b).

Remark 15.2 The weighted arithmetic and geometric means, introduced in (15.13), are equal if and only if ζ_1 = ζ_2 = · · · = ζ_n. Thus, inequality (15.14) reduces to an equality if and only if

∏_{i=0}^{m−1} (F′(s(mj + i)) − d_1)² = ∏_{i=0}^{m−1} (F′(s(mj + i)) − d_2)² = · · ·    (15.15)
        · · · = ∏_{i=0}^{m−1} (F′(s(mj + i)) − d_n)²    (15.16)

holds for any j ∈ Z+. For the case of chaotic dynamics, where s(k) does not evolve periodically in time, this condition cannot be satisfied and inequality (15.14) is strict.

For continuous-time systems (Belykh et al. 2004a; Hasler and Belykh 2005; Por-
firi et al. 2008, 2006; Porfiri and Fiorilli 2009a, b, 2010; Porfiri and Pigliacampo
2008; Porfiri and Stilwell 2007), it was shown that under fast switching conditions

the synchronizability of a stochastically switching system can be assessed from the synchronizability of the averaged system. Here, we re-examine this limit in the case of coupled maps, whereby the averaged system is obtained by replacing the switching gain by its expected value. The synchronizability of the averaged system is ascertained by studying the Lyapunov exponent obtained by replacing d′ with E[d]
in Eq. (15.9), that is,

λ^aver = lim_{k→∞} (1/k) Σ_{j=0}^{k−1} ln (F′(s(j)) − E[d])².    (15.17)

In what follows, we demonstrate through examples that the weighted average Lyapunov exponent λ̄ can be positive or negative, independent of the value of λ^aver.
Therefore, the averaged system does not offer valuable insight on the stability of
the synchronization manifold of the stochastically coupled maps. For the sake of
illustration, we consider the case in which the individual dynamics corresponds to
the identity, such that

x1(k + 1) = x1(k) + ε1(k) (x2(k) − x1(k)),
x2(k + 1) = x2(k) + ε2(k) (x1(k) − x2(k)).    (15.18)

In this case, the transverse dynamics in (15.4) takes the simple form

ξ(k + 1) = [1 − d(k)] ξ(k). (15.19)

Statically coupled identity maps have a Lyapunov exponent given by (15.9) with F′(s(j)) = 1, that is,

λ^st(d′) = ln (1 − d′)².    (15.20)

Suppose that the net switching gain is a random variable that takes values d1 = 1
and d2 = −1 with equal probabilities 0.5. Then, using Eq. (15.20) we compute

λ̄ = (1/2) [λ^st(1) + λ^st(−1)] = −∞,    (15.21a)
λ^aver = λ^st(0) = 0 > λ̄.    (15.21b)

Thus, the average coupling does not support synchronization, even though the effec-
tive Lyapunov exponent is negative.
Now, we assume d1 = 0 and d2 = 2 with the same probability 0.5, which yields

λ̄ = (1/2) [λ^st(0) + λ^st(2)] = 0,    (15.22a)
λ^aver = λ^st(1) = −∞ < λ̄.    (15.22b)

This posits that the stochastically coupled maps cannot synchronize for any selection
of the period m, even though the average coupling affords synchronization in a single
time step.
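Both identity-map examples can be verified in a few lines; the helper below (our own naming, an illustration only) evaluates Eq. (15.20) and reproduces the values in Eqs. (15.21) and (15.22).

    import numpy as np

    def lambda_st_identity(d):
        # Eq. (15.20): static Lyapunov exponent of coupled identity maps; equals -inf at d = 1.
        with np.errstate(divide="ignore"):
            return np.log((1.0 - d) ** 2)

    def effective_and_averaged(d1, d2, p=0.5):
        lam_bar = p * lambda_st_identity(d1) + (1 - p) * lambda_st_identity(d2)
        lam_aver = lambda_st_identity(p * d1 + (1 - p) * d2)
        return lam_bar, lam_aver

    print(effective_and_averaged(1.0, -1.0))  # (-inf, 0.0): Eq. (15.21)
    print(effective_and_averaged(0.0, 2.0))   # (0.0, -inf): Eq. (15.22)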
If the difference between the possible values of the net coupling gain in D is sufficiently small, the stability of the stochastic system can be related to the stability of the error dynamics of the averaged system. In this case, for all l = 1, . . . , n, we can write F′(x) − d_l as F′(x) − E[d] + Δd_l, where |Δd_l| ≪ |F′(x) − E[d]| is the deviation of the stochastic switching with respect to its expected value. Thus, we obtain


λ̄ = Σ_{l=1}^{n} p_l lim_{k→∞} (1/k) Σ_{j=0}^{k−1} ln (F′(s(j)) − E[d] + Δd_l)²
  ≈ lim_{k→∞} (1/k) Σ_{j=0}^{k−1} ln (F′(s(j)) − E[d])² + lim_{k→∞} (1/k) Σ_{j=0}^{k−1} Σ_{l=1}^{n} 2 p_l Δd_l / (F′(s(j)) − E[d]) = λ^aver,    (15.23)

where we have expanded the logarithm in series in the neighborhood of F′(s(j)) − E[d] and we have used the fact that Σ_{l=1}^{n} p_l Δd_l = 0 by construction.

15.3.5 Chaotic Dynamics

Direct computation of the Lyapunov exponent as a limit of a time series from Eq. (15.8) or (15.11) may be challenging or even infeasible; for example, if F′(x) is
undefined on a finite set of points x. Following the approach of Hasler and Maistrenko
(1997), we replace the summation with integration using Birkhoff’s ergodic theo-
rem (Bunimovich et al. 2000).
Toward this aim, we introduce ρ(x) as the probability density function of the
map F(x), defined on a set B and continuously differentiable on B except for a
finite number of points. The probability density function of each map can be found
analytically or numerically (Billings and Bollt 2001; Bollt 2013). Using Birkhoff’s
ergodic theorem, Eqs. (15.8), (15.9), and (15.11) can be written as

λ_l^st = ∫_B ln (F′(t) − d_l)² ρ(t) dt,    (15.24a)

λ(m) = ∫_B ln [ Σ_{l=1}^{n} p_l Y_l(t, m) ] ρ(t) dt,    (15.24b)

λ(m) = m Σ_{l=1}^{n} p_l λ_l^st + ∫_B ln [ Σ_{l=1}^{n} p_l Y_l(t, m) / ∏_{l=1}^{n} Y_l(t, m)^{p_l} ] ρ(t) dt.    (15.24c)

Here, we have introduced the function of time and switching period


Y_l(t, m) = ∏_{i=0}^{m−1} (F′(F^i(t)) − d_l)²,    (15.25)

where F^i(t) = [F ◦ F ◦ · · · ◦ F](t) is the composite function of order i.


If the analytical expression of the probability density function is known, the Lyapunov exponents can be found explicitly, as further detailed in what follows when we study coupled tent maps. Numerical analysis can also benefit from the above formulation, which obviates computational challenges related to rounding uncertainties in Eqs. (15.8), (15.9), and (15.11) for large values of k. This may be especially evident for large curvatures of the individual map, which could result in sudden changes in the synchronization dynamics.

Remark 15.3 Equation set (15.24) can be used to explore the synchronizability of an N-periodic trajectory s(Nk + i) = s_i, where i = 0, 1, . . . , N − 1, k ∈ Z+, and N ∈ Z+ \ {0}, by using the appropriate probability density function (Bollt 2013) ρ(s) = (1/N) Σ_{i=0}^{N−1} δ(s − s_i), where δ(·) denotes the Dirac delta distribution. Specifically, from (15.24a) and (15.24b), we establish

λ(m) = (1/N) Σ_{i=0}^{N−1} ln [ Σ_{l=1}^{n} p_l Y_l(s_i, m) ].    (15.26)
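As a simple illustration of Eq. (15.26), consider the fixed point s* = 2/3 of the a = 2 tent map, a period-one orbit with F′(s*) = −2 at every step, so that Y_l(s*, m) = (−2 − d_l)^{2m}. The sketch below (an example of ours, with illustrative gains and probabilities) evaluates the resulting expression.

    import numpy as np

    def lyap_period_one_tent(m, d_vals, p_vals):
        # Eq. (15.26) specialized to the fixed point s* = 2/3 of the a = 2 tent map,
        # where F'(s*) = -2 at every step, so Y_l(s*, m) = (-2 - d_l)^(2 m).
        d = np.asarray(d_vals, dtype=float)
        p = np.asarray(p_vals, dtype=float)
        return np.log(np.dot(p, (-2.0 - d) ** (2 * m)))

    for m in (1, 2, 5, 10):
        print(m, lyap_period_one_tent(m, d_vals=[-1.9, 0.0], p_vals=[0.5, 0.5]))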

15.3.6 A Representative Example: Coupled Tent Maps

To illustrate our approach, we use the paradigm of two linearly coupled one-
dimensional tent maps. Statically coupled tent maps are known to have two symmetric
ranges of positive and negative coupling for which synchronization is locally stable
(Hasler and Maistrenko 1997) (see Fig. 15.5). In our setting, we let the coupling
stochastically switch between values within and outside these stability regions to
explore the emergence of windows of opportunity. We will demonstrate that while
fast switching, occurring at each time step, may not synchronize the maps, there can
be a range of lower frequencies that yields stable synchronization. We argue that
this is possible for coupled maps where the probability of switching between stable
and unstable configurations is uneven, inducing a non-trivial balance between the
dynamics of the coupled maps and the switching periods.
The chaotic tent map, described by the equation

x(k + 1) = F(x(k)) = a x(k),        x(k) < 1/2,
                     a (1 − x(k)),  x(k) ≥ 1/2,    (15.27)

Fig. 15.5 Transversal Lyapunov exponent, λst , for stability of synchronization in the static network
of tent maps (15.3), calculated through (15.28) as a function of coupling ε

with parameter a = 2 is known to have a constant invariant density function ρ(t) = 1 (Hasler and Maistrenko 1997).

15.3.6.1 Statically Coupled Maps

The stability of synchronization in a static network (15.3) of tent maps (15.27) is controlled by the sign of the transversal Lyapunov exponent (Hasler and Maistrenko 1997)

λ^st = ln |2 − ε| + ln |2 + ε|.    (15.28)

Figure 15.5 indicates two disjoint regions, ε ∈ (−√5, −√3) and ε ∈ (√3, √5), in which λ^st < 0 and synchronization is stable.
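These stability intervals can be confirmed numerically; the following check of ours scans Eq. (15.28) over ε and locates the sign changes, which fall at ±√3 and ±√5.

    import numpy as np

    def lambda_st_tent(eps):
        # Eq. (15.28): transversal Lyapunov exponent of statically coupled a = 2 tent maps.
        return np.log(np.abs(2.0 - eps)) + np.log(np.abs(2.0 + eps))

    eps = np.linspace(-4.0, 4.0, 80001)
    with np.errstate(divide="ignore"):
        stable = lambda_st_tent(eps) < 0
    # Boundaries of the stability intervals; they approach +/- sqrt(3) and +/- sqrt(5).
    edges = eps[np.where(np.diff(stable.astype(int)) != 0)[0]]
    print("stability boundaries:", np.round(edges, 3))
    print("sqrt(3) =", round(np.sqrt(3), 3), " sqrt(5) =", round(np.sqrt(5), 3))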

15.3.6.2 Stochastically Coupled Maps

To elucidate the synchronizability of stochastically coupled tent maps, we assume that the net coupling gain d takes values d1 and d2 with corresponding probabilities p1 and p2 = 1 − p1. The numerical computation of the Lyapunov exponent in (15.8) is performed for different values of d2 from −4 to 4 with a step of 0.01 and m from 1 to 25 with a step of 1. The probability p1 is held fixed at 0.5 and the net coupling gain d1 at −1.90.
This wide parameter selection allows for exploring the connection between the
stability of synchronization for static coupling and the resulting stochastic synchro-
nization. We consider different cases, where stochastic switching is implemented
on coupling gains which could individually support or hamper synchronization for
statically coupled maps. Specifically, we contemplate the case in which: none (case
I), one (case II), or both (case III) of the coupling gains yield synchronization.
A closed-form expression for the Lyapunov exponent of coupled tent maps can
be derived from Eq. (15.24b) using the probability density function ρ(t) = 1, see
Golovneva et al. (2017) for a precise derivation,

λ(m) = (1/2^m) Σ_{i=0}^{m} C(m, i) ln [ Σ_{l=1}^{n} p_l (2 − d_l)^{2(m−i)} (2 + d_l)^{2i} ],    (15.29)

where C(m, i) = m!/(i! (m − i)!) denotes the binomial coefficient.


We comment that for large m the binomial coefficient grows as 2^m/√m according to Stirling’s formula, which ensures that the summation is well behaved in the slow switching limit (Olver et al. 2010).
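Equation (15.29) is straightforward to evaluate; the sketch below (our own helper, with an illustrative case II pair of gains and probability) scans the switching period m and reports the periods for which λ(m) < 0, i.e., the windows of opportunity discussed next.

    import math
    import numpy as np

    def lambda_stochastic_tent(m, d_vals, p_vals):
        # Eq. (15.29): closed-form Lyapunov exponent for stochastically coupled a = 2 tent maps.
        d = np.asarray(d_vals, dtype=float)
        p = np.asarray(p_vals, dtype=float)
        total = 0.0
        for i in range(m + 1):
            weighted = np.dot(p, (2.0 - d) ** (2 * (m - i)) * (2.0 + d) ** (2 * i))
            total += math.comb(m, i) * math.log(weighted)
        return total / 2 ** m

    # A case II pair (one stabilizing, one destabilizing net coupling), as used later in the text.
    d1, d2, p1 = -1.9999, 1.7, 0.8
    stable_m = [m for m in range(1, 26)
                if lambda_stochastic_tent(m, [d1, d2], [p1, 1 - p1]) < 0]
    print("switching periods with stable stochastic synchronization:", stable_m)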
Figure 15.6 (Top) provides the Lyapunov exponent of two stochastically coupled tent maps, analytically computed from Eq. (15.29). The effective Lyapunov exponent is directly computed from Eq. (15.28): since λ^st(d) = ln |4 − d²| and λ^st(−1.90) = ln 0.39, the condition λ̄ < 0 for the selected parameters, p1 = p2 = 0.5 and d1 = −1.90, is equivalent to |4 − d2²| < 1/0.39, which yields the following intervals for d2:

d2 ∈ ( −√(4 + 1/0.39), −√(4 − 1/0.39) ) ∪ ( √(4 − 1/0.39), √(4 + 1/0.39) ).

Importantly, analytical results for large periods in Fig. 15.6 (Bottom) confirm that
slow switching in case III favors stochastic synchronization. Figure 15.6 (Bottom)
also confirms the existence of a thin green zone surrounding the blue bands, where
synchronization is stable even though one of the coupling gains does not support
synchronization (case II). For example, in the case of fast switching, m = 1, these
regions are (−2.33, −2.24) and (−1.73, −1.64) from the closed-form expressions
in Eqs. (15.28) and (15.29).
The analytical solution in Eq. (15.29) allows for shedding further light on the possibility of synchronizing coupled maps in case II. Specifically, in Fig. 15.7 we consider switching between coupling gains d1 = −1.9999 and d2 = 1.7000, which are associated with λ_1^st = −7.82 (strongly stable synchronization) and λ_2^st = 0.10 (weakly unstable synchronization). We systematically vary the probability of switching p1 from 0.6 to 1 with a step of 0.001, so that the coupled maps spend most of the time with the coupling gain that would support synchronization. In this case, the effective Lyapunov exponent is always negative, and synchronization may be attained everywhere in the parameter space.

Fig. 15.6 Analytical demonstration of synchronization through non-fast switching. (Top) Lyapunov
exponent of two stochastically coupled tent maps, where the net coupling is switching with equal probability between d1 = −1.90 and d2 at a period m, analytically computed from Eq. (15.29). The color bar illustrates the range of Lyapunov exponents attained. The dashed line identifies
the values of d2 and m for which the Lyapunov exponent is zero; the regions within such contours
correspond to negative values of the Lyapunov exponent and thus stochastic synchronization. The
solid lines refer to the values of d2 and m for which the effective Lyapunov exponent is zero. The
vertical bands identified by such solid lines correspond to regions where stochastic synchronization
is feasible, as predicted by Proposition 3.1. (Bottom) Interplay between synchronization in stochas-
tically and statically coupled tent maps. The partition into cases I, II, and III is based on the sign of
the Lyapunov exponent in Eq. (15.28), corresponding to the net couplings d1 and d2 . The regions
are colored as follows: orange (case II without stochastic synchronization); yellow (case III without
stochastic synchronization); green (case II with stochastic synchronization); and blue (case III with
stochastic synchronization)

Fig. 15.7 Analytical demonstration of emergence of windows of opportunity. (Top) Lyapunov exponent of two stochastically coupled tent maps as a function of the switching probability p1 and the period m, analytically computed from Eq. (15.29) with d1 = −1.9999 and d2 = 1.7000. The color bar illustrates the range of Lyapunov exponents attained. The dashed lines identify the values of p1 and m for which the Lyapunov exponent is zero; the regions within such contours correspond to negative values of the Lyapunov exponent and thus stochastic synchronization. (Bottom) Interplay between synchronization in stochastically and statically coupled tent maps. For the selected values of the net couplings, λ_1^st = −7.82 and λ_2^st = 0.10, which correspond to case II. The regions are colored as follows: orange (case II without stochastic synchronization) and green (case II with stochastic synchronization)

Surprisingly, under fast switching conditions, synchronization is not attained if p1 ≲ 1, as shown in Fig. 15.7. Although the maps spend most of the time in a configuration that would strongly support synchronization, the sporadic (p2 ≈ 0) occurrence of a coupling gain which would lead to weak instability hampers stochastic synchronization under fast switching. Increasing the switching period, synchronization may be attained for p1 > 0.995 (see the “Pinocchio nose” in Fig. 15.7 (Bottom)). For 0.753 < p1 < 0.795, we observe a single window of opportunity, whereby synchronization is achieved in a compact region around m = 10. For 0.795 ≲ p1 ≲ 0.824, a second window of opportunity emerges for smaller values of m around 5. The two windows ultimately merge for p1 ≈ 0.83 in a larger window that grows in size as p1 approaches 1.
In summary, we have studied the stochastic stability of the transverse dynamics
using the notion of mean square stability, establishing a mathematically-tractable
form for the Lyapunov exponent of the error dynamics. We have demonstrated the
computation of the stochastic Lyapunov exponent from the knowledge of the prob-
ability density function. A necessary condition for stochastic synchronization has
been established, aggregating the Lyapunov exponents associated with each static
coupling configuration into an effective Lyapunov exponent for the stochastic dynam-
ics. For tent maps, we have established a closed-form expression for the stochastic
Lyapunov exponent, which helps dissect the contribution of the coupling gains,
switching probabilities, and switching period on stochastic synchronization.
We have demonstrated that non-fast switching may promote synchronization of
maps whose coupling alternates between one configuration where synchronization
is unstable and another where synchronization is stable (case II). These windows
of opportunity for the selection of the switching period may be disconnected and
located away from the fast switching limit, where the coupling is allowed to change
at each time step.
In contrast to one’s expectations, fast switching may not even be successful in
synchronizing maps that are coupled by switching between two configurations that
would support synchronization (case III). However, a sufficiently slow switching
that allows the maps to spend more time in one of the two stable synchronization
states will induce stochastic synchronization. The emergence of a lower limit for
the switching period to ensure stochastic synchronization is highly non-trivial, while
the stabilization of synchronization by slow switching in the dwell time limit should
be expected as the maps will spend the time necessary to synchronize in one of the
stable configurations, before being re-wired to the other stable configuration.

15.4 Network Synchronization Through Stochastic Broadcasting

Building on our results from the previous section on the stochastic synchronization
of two intermittently coupled maps, in this section, we go further and address an
important problem of how non-fast switching can be used to control synchronization
in a target network through stochastic broadcasting from a single external node.
This problem of controlling synchronous behavior of a network towards a desired
common trajectory (Motter 2015) arises in many technological and biological sys-
tems where agents are required to coordinate their motion to follow a leader and
maintain a desired formation (Ren and Beard 2008). In our setting, each node of
the target network, implemented as a discrete-time map, is coupled to the external
node with connections that stochastically switch in time with an arbitrary switching
period. The network is harder to synchronize than its isolated nodes, as its struc-
ture contributes to resilience to controlled synchronization probed by the externally
broadcasting node.
In the following, we will rigorously study the mean square stability of the syn-
chronous solution in terms of the error dynamics and provide an explicit dependence
of the stability of controlled synchronization on the network structure and the prop-
erties of the underlying broadcasting signal, defined by the strength of broadcasting
connections and their switching period and probability. Via an analytical treatment
of the Lyapunov exponents of the error dynamics and the use of tools from ergodic
theory, we derive a set of stability conditions that provide an explicit criterion on
how the switching period should be manipulated to overcome network resilience to
synchronization as a function of the Laplacian spectrum of the network (Godsil and
Royle 2013).
Through the lens of chaotic tent maps, we discover that the network topology
shapes the windows of opportunity of favorable non-fast switching in a highly non-
linear fashion. In contrast to mutual synchronization with a network whose stability is
determined by the second smallest and largest eigenvalue of the Laplacian matrix via
the master stability function (Pecora and Carroll 1998), controlled synchronization
by the external node is defined by all its eigenvalues, including the zero eigenvalue.
In the case of chaotic tent maps, the zero and the largest eigenvalue appear to effec-
tively control the size of these windows of opportunity. This leads to the appearance
of a persistent window of favorable switching periods where all network topologies
sharing the largest eigenvalue become more prone to controlled synchronization.
We study the synchronization of a network of N discrete-time oscillators given by the state variables yi ∈ R for i = 1, 2, . . . , N (these results generalize to yi ∈ Rⁿ) that are driven by an external reference node given by x ∈ R via a signal that is stochastically broadcasted to all of the nodes in the network. The topology of the network is undirected and unweighted. It is described by the graph G = (V, E), where V is the set of vertices and E is the set of edges. The broadcaster-network system is depicted in Fig. 15.8.



Fig. 15.8 The reference node (blue) stochastically broadcasts a signal to each of the nodes in a static network of N oscillators (pink)

The evolution of the oscillators in the network and of the reference node is given by the same
mapping function F : R → R, such that x(k + 1) = F(x(k)). The switching of the
broadcasted signal is an independent and identically distributed (i.i.d) stochastic
process that re-switches every m time steps. That is, the coupling strength of the ref-
erence node ε(mk) = ε(mk + 1) = · · · = ε(m(k + 1) − 1) is drawn randomly from
a set of n coupling strengths {ε1, . . . , εn} with probabilities p1, . . . , pn, respectively (Σ_{l=1}^{n} p_l = 1).
The evolution of the discrete-time broadcaster-network system can be written
compactly as

x(k + 1) = F(x(k)),
y(k + 1) = F(y(k)) − μ L y(k) − ε(k) I_N (y(k) − x(k) 1_N),    (15.30)

where F is the natural vector-valued extension of F, μ is the coupling strength within the network, 1_N is the vector of ones of length N, I_N is the N × N identity matrix, and L is the Laplacian matrix of G, i.e., L_ij = −1 for ij ∈ E and L_ii = −Σ_{j=1, j≠i}^{N} L_ij for i = 1, 2, . . . , N. Without loss of generality, we order and label the Laplacian spectrum of L: γ1 = 0 ≤ γ2 ≤ · · · ≤ γN.
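As an illustration of the model, the following sketch of ours (not the authors' code) assembles the Laplacian of the ten-node star-plus-edge network with spectrum {0, 1, 3, 10} discussed later in connection with Fig. 15.11 and iterates Eq. (15.30) with tent-map dynamics and a period-m switched broadcasting gain; all parameter values are placeholders.

    import numpy as np

    rng = np.random.default_rng(1)

    def tent(x, a=1.99999):
        # Slope slightly below 2 to avoid the finite-precision collapse of the slope-2 tent map.
        return np.where(x < 0.5, a * x, a * (1.0 - x))

    def simulate_broadcast(L, mu=0.01, eps_vals=(-1.9999, 1.7), probs=(0.9, 0.1), m=25, steps=5000):
        # Iterate Eq. (15.30): reference node x broadcasts to the network y with a period-m switched gain.
        N = L.shape[0]
        x = rng.random()
        y = rng.random(N)
        err = np.empty(steps)
        for k in range(steps):
            if k % m == 0:
                eps = rng.choice(eps_vals, p=probs)
            x_next = tent(x)
            y = tent(y) - mu * (L @ y) - eps * (y - x)
            x = x_next
            err[k] = np.max(np.abs(y - x))
            if not np.isfinite(err[k]) or err[k] > 1e6:
                return err[:k + 1]  # divergence signals loss of controlled synchronization
        return err

    # Star of 10 nodes plus one edge between two leaves: Laplacian spectrum {0, 1, 3, 10} (cf. Fig. 15.11).
    N = 10
    A = np.zeros((N, N))
    A[0, 1:] = A[1:, 0] = 1.0
    A[1, 2] = A[2, 1] = 1.0
    L = np.diag(A.sum(axis=1)) - A
    print("Laplacian eigenvalues:", np.round(np.linalg.eigvalsh(L), 2))
    print("final synchronization error:", simulate_broadcast(L)[-1])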
We study the stability of the stochastic synchronization of the network about
the reference node’s trajectory, or y1 (k) = y2 (k) = · · · = y N (k) = x(k). Towards
this goal, it is beneficial to re-format the problem and study the evolution of the
error dynamics ξ(k) = x(k)1 N − y(k). When all of the nodes yi (k) have converged
to the reference trajectory, ξ(k) = x(k)1 N − y(k) = 0 N . To study the stability of
synchronization, we linearize the system about the reference trajectory

ξ(k + 1) = [D F(x(k))I N − μL − ε(k)I N ] ξ(k), (15.31)

where D F(x(k)) is the Jacobian of F evaluated along the reference trajectory x(k).
As is typical of linearization, we assume that the perturbations ξi (k) in the varia-
tional Eq. (15.31) are small and in directions transversal to the reference trajectory.

Convergence to the reference trajectory along these transversal directions ensures the
local stability of the synchronous solution. Despite the stochastic and time-dependent
nature of the broadcasting signal ε(k), it only appears on the diagonal elements under-
lying the evolution of the error vector ξ(k). Because μL is the only matrix in (15.31)
that is not diagonal, we can diagonalize (15.31) with respect to the eigenspaces of
the Laplacian matrix.
We obtain the stochastic master stability equation

ζ(k + 1) = (D F(x(k)) − μγ − ε(k)) ζ(k),    (15.32)

where γ ∈ {γ1, . . . , γN} and ζ ∈ R is a generic perturbation along the eigendirection of L. Notice that γ1 = 0 corresponds to the evolution of the error dynamics in the absence of a network. Lastly, in order to simplify the analysis of the evolution of the variational equations, we re-scale the time variable with respect to the switching period

ζ̃(k + 1) = ∏_{i=0}^{m−1} (D F(x(mk + i)) − μγ − ε̃(k)) ζ̃(k),    (15.33)

where ζ̃ (k) = ζ (mk) and ε̃(k) = ε(mk). This scalar equation provides the explicit
dependence of the synchronization error on the network topology (via μγ ) and
the strength of the broadcasted signal (via ε). With this in mind, we continue by
discussing the stability of the synchronization to the reference trajectory.

Definition 15.4.1 The synchronous solution yi(k) = x(k) for i = 1, 2, . . . , N in the stochastic system (15.30) is locally mean square asymptotically stable if lim_{k→∞} E[ζ̃²(k)] = 0 for any ζ̃(0) and γ ∈ {γ1, . . . , γN} in (15.33), where E[·] denotes expectation with respect to the σ-algebra generated by the stochastic process underlying the switching.
Mean square stability of the stochastic system in (15.33), and by extension syn-
chronization in the original system (15.30), corresponds to studying the second
moment of ζ̃ (k). We take the expectation of the square of the error in (15.33)
E[ζ̃²(k + 1)] = Σ_{l=1}^{n} p_l ( ∏_{i=0}^{m−1} (D F(x(mk + i)) − μγ − ε_l) )² E[ζ̃²(k)].    (15.34)

Reducing the stochastically switching system (15.30) to a deterministic system (15.34) allows for the use of standard tools from stability theory, such as Lyapunov exponents (Ott 2002). The Lyapunov exponent for (15.34) is computed as
 
λ = lim_{k→∞} (1/k) ln ( E[ζ̃²(k)] / ζ̃²(0) ) = lim_{j→∞} (1/j) Σ_{k=1}^{j} ln ( E[ζ̃²(k + 1)] / E[ζ̃²(k)] ).    (15.35)

There are numerous pitfalls that can undermine the numerical computation of the Lyapunov exponent from a time series, such as E[ζ̃²] falling below numerical precision in a few time steps and incorrectly predicting stochastic synchronization for trajectories that would eventually diverge. With the proper assumptions, one can use Birkhoff’s ergodic theorem (Ott 2002) to avoid these confounds and form the main analytical result of this section.
Proposition 15.4.1 The synchronous solution x(k) of the stochastic system (15.30) is locally mean square asymptotically stable if

λ = ∫_B ln [ Σ_{l=1}^{n} p_l ( ∏_{i=0}^{m−1} (D F(F^i(t)) − μγ − ε_l) )² ] ρ(t) dt    (15.36)

is negative for all γ ∈ {γ1, . . . , γN}. Here, B is the region on which the invariant density ρ(t) of F is defined.
Proof Assuming F is ergodic with invariant density ρ(t), one can avoid computing
the Lyapunov exponent from a time series, using Birkhoff’s ergodic theorem to replace
the averaging over time with averaging over the state. This amounts to replacing the
summation with integration in (15.35). Then, by virtue of (15.35) and the definition
of a Lyapunov exponent, stability of the stochastic system reduces to monitoring the
sign of this Lyapunov exponent. 
Remark 15.4 We reduce studying the stability of synchronization in (15.30) to mon-
itoring the sign of the Lyapunov exponents in (15.36), with a different exponent for
each eigenvalue γ . If each of these Lyapunov exponents is negative, the dynamics
of the network in the original system (15.30) converges to the dynamics of the refer-
ence trajectory. Furthermore, this allows the stability of stochastic synchronization
to be studied explicitly in the network and broadcasting parameters μ, {γ1 , . . . , γ N },
{ε1 , . . . , εn }, { p1 , . . . , pn }, and m.
Remark 15.5 There are two notable consequences of the Laplacian spectrum on
the stability conditions given by the sign of (15.36): (i) γ = 0 is always an eigen-
value, such that it is necessary that the nodes in the network pairwise synchronize
to the reference node in the absence of a network topology, and (ii) if the network
is disconnected, fewer stability conditions need to be satisfied, whereby there will
be repeated zero eigenvalues. In light of these consequences, a network is inherently
resilient to broadcasting synchronization, in that it necessitates satisfying more sta-
bility conditions, and synchronization in the absence of a network is always one of
the stability conditions.

15.4.1 Tent Maps Revisited

To explore some of the theoretical implications of the general stability criterion (15.36), we consider the broadcaster-network system (15.30) composed of chaotic tent maps. In this context, the general criterion (15.36) can be written in a compact form that depends only on the network and broadcasting parameters.

Proposition 15.4.2 A stochastic system (15.30) of chaotic tent maps is locally mean square asymptotically stable if

λ = (1/2^m) Σ_{i=0}^{m} C(m, i) ln [ Σ_{l=1}^{n} p_l Y(i, m, μγ, ε_l) ]    (15.37)

is less than zero, where Y(i, m, μγ, ε_l) is given by (2 + μγ + ε_l)^{2i} (2 − μγ − ε_l)^{2(m−i)} and C(m, i) = m!/((m − i)! i!) is the binomial coefficient.

Remark 15.6 The closed-form analytical expression (15.37) for the Lyapunov
exponents indicates the explicit dependence of the stability of controlled synchroniza-
tion on the network coupling strength μ, the eigenvalues of the Laplacian matrix for
the network, the switching period m, the stochastically switching coupling strengths
{ε1 , . . . , εn }, and their respective probabilities { p1 , . . . , pn }. For controlled synchro-
nization to be mean square stable, the Lyapunov exponent for each eigenvalue in the
Laplacian spectrum must be negative.
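In practice, Proposition 15.4.2 is applied eigenvalue by eigenvalue; the sketch below (our own helper functions, with illustrative parameters) evaluates Eq. (15.37) for every distinct Laplacian eigenvalue and declares controlled synchronization mean square stable only if all of the resulting exponents are negative.

    import math
    import numpy as np

    def lyap_broadcast_tent(m, mu_gamma, eps_vals, p_vals):
        # Eq. (15.37) for a single Laplacian eigenvalue gamma (entering through mu*gamma).
        e = np.asarray(eps_vals, dtype=float)
        p = np.asarray(p_vals, dtype=float)
        total = 0.0
        for i in range(m + 1):
            Y = (2.0 + mu_gamma + e) ** (2 * i) * (2.0 - mu_gamma - e) ** (2 * (m - i))
            total += math.comb(m, i) * math.log(np.dot(p, Y))
        return total / 2 ** m

    def controlled_sync_stable(m, mu, gammas, eps_vals, p_vals):
        # Stability requires a negative exponent for every eigenvalue, including gamma = 0.
        return all(lyap_broadcast_tent(m, mu * g, eps_vals, p_vals) < 0 for g in gammas)

    gammas = [0, 1, 3, 10]  # distinct Laplacian eigenvalues of the star-plus-edge network of Fig. 15.11
    for m in (1, 10, 25, 40):
        print(m, controlled_sync_stable(m, mu=0.01, gammas=gammas,
                                        eps_vals=[-1.9999, 1.7], p_vals=[0.9, 0.1]))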

To illustrate the power of our explicit criterion (15.37) for controlled synchronization and to clearly demonstrate the emergence of windows of opportunity, we limit our attention to stochastic broadcasting between two coupling strengths ε1 (with probability p) and ε2 (with probability 1 − p).
To choose the coupling strengths ε1 and ε2 , we consider two statically coupled
tent maps (15.27):

x(k + 1) = f(x(k)),
y(k + 1) = f(y(k)) + ε (x(k) − y(k)).    (15.38)

This network (15.38) describes a pairwise, directed interaction between the dynamics
of the broadcasting map x(k) and a single, isolated map y(k) from the network where
the switching broadcasting coupling is replaced with a static connection of strength
ε. The stability of synchronization in the static network (15.38) is controlled by the
sign of the transversal Lyapunov exponent (Hasler and Maistrenko 1997) given in (15.28). Figure 15.5 indicates two disjoint regions, ε ∈ (−√5, −√3) and ε ∈ (√3, √5), in which λ^st < 0 and synchronization is stable.

15.4.2 Stochastic Broadcasting: Fast Switching (m = 1)

When switching occurs at every time step, the condition described in Proposi-
tion 15.4.2 can be simplified to the following corollary, which we state without addi-
tional proof.

Corollary 15.4.1 The Lyapunov exponent for the mean square stability of the syn-
chronous solution in the fast-switching system represented by (15.30) of chaotic tent
maps is

λ = (1/2) ln { [ (2 − μγ)² + 2(μγ − 2) E[ε(k)] + E[ε²(k)] ] · [ (−2 − μγ)² + 2(μγ + 2) E[ε(k)] + E[ε²(k)] ] },    (15.39)

where E[ε(k)] = p1 ε1 + p2 ε2 and E[ε²(k)] = p1 ε1² + p2 ε2².
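The shape of the corresponding master stability function can be reproduced by evaluating Eq. (15.39) on a grid; the snippet below (our own naming, using the parameter values quoted in the caption of Fig. 15.9) reports the admissible range of μγ at ε1 = 2.

    import numpy as np

    def lyap_fast_switching(eps1, mu_gamma, eps2=2.2, p1=0.5):
        # Eq. (15.39): fast-switching (m = 1) Lyapunov exponent for chaotic tent maps.
        e_mean = p1 * eps1 + (1 - p1) * eps2
        e_sq = p1 * eps1 ** 2 + (1 - p1) * eps2 ** 2
        term1 = (2 - mu_gamma) ** 2 + 2 * (mu_gamma - 2) * e_mean + e_sq
        term2 = (2 + mu_gamma) ** 2 + 2 * (mu_gamma + 2) * e_mean + e_sq
        return 0.5 * np.log(term1 * term2)

    mu_gamma = np.linspace(0.0, 4.0, 4001)
    admissible = mu_gamma[lyap_fast_switching(2.0, mu_gamma) < 0]
    print("mu*gamma must lie between", admissible.min(), "and", admissible.max())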

15.4.2.1 Master Stability Function

The Lyapunov exponent (15.39) demonstrates the explicit dependence of the stability
of stochastic synchronization on the node-to-node coupling strength, the eigenvalues
of the Laplacian matrix, and the stochastically switching coupling strengths along
with their respective probabilities.
Figure 15.9 illustrates the dependence of λ on ε1 and μγ . The dashed curve in
Fig. 15.9 indicates the boundary between positive and negative Lyapunov exponents,
identifying the onset of mean square stability of the error dynamics. In order for the
network to synchronize to the reference node, the point (ε1 , μγ ) must fall within
the dashed curve for every eigenvalue in the spectrum of the Laplacian matrix. In
agreement with our predictions, we find that as μγ increases the range of values of
ε1 which affords stable synchronization becomes smaller and smaller. This suggests
that the resilience of the network to synchronization increases with μγ.

Remark 15.7 While the nonlinear dependence of the stability boundary on ε1 and
μγ is modulated by the nonlinearity in the individual dynamics, it should not be
deemed as a prerogative of nonlinear systems. As shown in Remark 4, the stochastic
stability of synchronization in the simplest case of a linear system is also nonlinearly
related to the spectrum of the Laplacian matrix and to the expectation and variance of
the broadcasting signal – even for classical consensus with α = 1 (Cao et al. 2013).

Remark 15.8 In this example of a chaotic tent map, the stability boundary is a sin-
gle curve, defining a connected stability region. To ensure stable synchronization
of a generic network, it is thus sufficient to monitor the largest eigenvalue of the
Laplacian matrix, γ N , such that (ε1 , μγ ) will fall within the stability region. This is
in contrast with the master stability function for uncontrolled, spontaneous synchro-
nization (Pecora and Carroll 1998), which would typically require the consideration
of the second smallest eigenvalue, often referred to as the algebraic connectivity
(Godsil and Royle 2013). However, similar to master stability functions for uncon-
trolled, spontaneous synchronization (Stefański et al. 2007), we would expect that
for different maps, one may find several disjoint regions in the (ε1 , μγ )-plane where
stable stochastic synchronization can be attained.

Fig. 15.9 Master stability function for stochastic synchronization of chaotic tent maps, for ε2 =
2.2, m = 1, and p1 = p2 = 0.5. For synchronization to be stable, each eigenvalue of the Laplacian
matrix must correspond to a negative Lyapunov exponent (indicated by the yellow color, isolated by
the black dashed curve). For example, the black vertical line shows the range of admissible values
of μγ that would guarantee stability at ε1 = 2

15.4.2.2 Role of Network Topology

The master stability function in Fig. 15.9 shows that both μ and γ contribute to
the resilience of the network to synchronization induced by stochastic broadcasting.
For a given value of the node-to-node coupling strength μ, different networks will
exhibit different resiliences based on their topology. Based on the lower bound by
Grone and Merris (1994) and the upper bound by Anderson and Morley (1985), for
a graph with at least one edge, we can write max{di , i = 1, . . . , N } + 1 ≤ γ N ≤
max{di + d j , i j ∈ E }, where di is the degree of node i = 1, . . . , N . While these
bounds are not tight, they suggest that the degree distribution has a key role on γ N . For
a given number of edges, one may expect that networks with highly heterogeneous
degree distribution, such as scale-free networks (Boccaletti et al. 2006), could lead
to stronger resilience to broadcasting as compared to regular or random networks,
with more homogenous degree distributions (Boccaletti et al. 2006).
In Fig. 15.10, we illustrate this proposition by numerically computing the largest
eigenvalue of the graph Laplacian for three different network types:

(i) A 2K-regular network, in which each node is connected to its 2K nearest neighbors, such that the degree is equal to 2K. As K increases, the network approaches a complete graph.

Fig. 15.10 Largest eigenvalue γ N of the Laplacian matrix as a function of the number of edges for
three different types of networks of 100 nodes: a 2K -regular network (navy curve), scale-free (light
blue curve), and random Erdős–Rényi (red curve) networks. Scale-free and random networks are
run 10000 times to compute means and standard deviations, reported herein – note that error bars
are only vertical for scale-free networks since the number of edges is fully determined by q, while
for random networks also horizontal error bars can be seen due to the process of network assembly

(ii) A scale-free network (Barabási and Albert 1999) which is grown from a small
network of q nodes. At each iteration of the graph generation algorithm, a node
is added with q edges to nodes already in the network. The probability that an
edge will be connected to a specific node is given by the ratio of its degree to
the total number of edges in the network. Nodes are added until there are N
nodes in the network. When q is small, there are a few hub nodes that have a
large degree and many secondary nodes with small degree, whereas when q
is large, the scale-free network is highly connected and similar to a complete
graph.
(iii) A random Erdős–Rényi network which takes as input the probability, p, of an
edge between any two nodes. When p is small, the network is almost surely
disconnected, and when p approaches 1, it is a complete graph.

We fix N to 100 and vary K , q, and p in (i), (ii), and (iii), respectively, to explore
the role of the number of edges.
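A lightweight way to reproduce this comparison is sketched below, using the networkx package with parameter choices of ours (one instance of each network type with a comparable number of edges); it is an illustration, not the code behind Fig. 15.10.

    import networkx as nx
    import numpy as np

    def largest_laplacian_eigenvalue(G):
        # gamma_N controls the resilience of the network to broadcasting-induced synchronization.
        L = nx.laplacian_matrix(G).toarray().astype(float)
        return np.linalg.eigvalsh(L)[-1]

    N = 100
    networks = {
        "2K-regular (K = 3)": nx.circulant_graph(N, range(1, 4)),
        "scale-free (q = 3)": nx.barabasi_albert_graph(N, 3, seed=0),
        "Erdos-Renyi": nx.erdos_renyi_graph(N, 6.0 / (N - 1), seed=0),  # ~3N edges on average
    }
    for name, G in networks.items():
        print(name, G.number_of_edges(), round(largest_laplacian_eigenvalue(G), 2))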
As expected from the bounds in Grone and Merris (1994) and Anderson and
Morley (1985), for a given number of edges, the scale-free network tends to exhibit
larger values of γ N . This is particularly noticeable for networks of intermediate size,
whereby growing the number of edges will cause the three network types to collapse
on a complete graph of N nodes. As the largest eigenvalue of the Laplacian matrix

fully controls the resilience of the network to broadcasting-induced synchronization (in the case of linear and chaotic tent maps), we may argue that, given a fixed number of edges, the network can be configured such that it is either more conducive (regular graph) or resistant (scale-free graph) to synchronization. The increased resilience of scale-free networks should be attributed to the process of broadcasting-induced synchronization, which globally acts on all nodes simultaneously, without targeting critical nodes (low or high degree) as in pinning control (Cao et al. 2013; Tang et al. 2014).

15.4.3 Stochastic Broadcasting: Beyond Fast Switching (m > 1)

Returning to the stochastically switching broadcaster-network system, but without the limitation of m = 1, we use the master stability function of Fig. 15.5 to choose
ε1 = −1.999 from a stability region and ε2 = −1.7 from an instability region such
that the connection from the broadcasting node to the network switches between the
two values where one value supports controlled synchronization whereas the other
destabilizes it. In this way, the broadcaster sends two conflicting messages to the
network to follow and not to follow its trajectory.
We pay particular attention to the case where the switching probability of the stabilizing coupling ε1 is higher (p > 0.5). One’s intuition would suggest that fast switching between the stable and unstable states of controlled synchronization with probability p > 0.5, which makes the system spend more time in the stable state, would favor the stability of synchronization. However, the master stability function
of Fig. 15.11 calculated through the analytical expression for the Lyapunov exponent
(15.37) shows that this is not the case. Our results reveal the presence of a stability
zone (black area) which, in terms of the switching periods m, yields a window of
opportunity when non-fast switching favors controlled synchronization, whereas fast
or slow switching does not. The fact that slower switching at m > 25 at the switching
probability p = 0.9 (see the transition from point A to B) desynchronizes the system
is somewhat unexpected, as the system is likely to stay most of the time in the stable
state, defined by ε1 .
The exact cause of this effect remains to be studied; however, we hypothesize
that this instability originates from a large disparity between the time scale of weak
convergence in the vicinity of the synchronization state during the (long) time lapse
when the stabilizing coupling ε1 is on and the time scale of strong divergence from the
synchronization solution far away from it when the destabilizing coupling ε2 finally
switches on. As a result, this imbalance between the convergence and divergence
makes synchronization unstable.
The window of opportunity displayed in Fig. 15.11 appears as a result of inter-
sections between the boundaries (dashed curves) of the stability zones where each
boundary is calculated from the criterion (15.37) when the Lyapunov exponent is

Fig. 15.11 Analytical calculation of the master stability function (15.37) for controlled synchro-
nization of tent maps as a function of the switching probability p and switching period m for
ε1 = −1.9999, ε2 = 1.7, and μ = 0.01. (Top). The black region indicates the stability of con-
trolled synchronization, and the dashed lines represent the boundaries for the stability region for
various eigenvalues γ of a network’s Laplacian matrix. Notice that the size of the stability region is
primarily controlled by only two curves, corresponding to γ = 0 (red dashed) and γ = 10 (black
dashed) such that the addition of curves for eigenvalues γ ∈ (0, 10) only affects the small cusp
part of the stability region (see the zoomed-in area). (Bottom). Zoom-in of the region marked by
the white rectangle in (Top). Points A, B, and C indicate pairs ( p, m) for which synchronization
is unstable, stable, and unstable, respectively for different values of the switching period m. Note
the window of favorable frequencies m which includes point B in the vertical direction from A to
C. Remarkably, the size of the stability region remains persistent to changes of the intra-network
coupling μ (not shown), suggesting the existence of soft, lower and upper thresholds for favorable
switching frequencies between m = 20 and 30

zero for each eigenvalue of the Laplacian matrix. The red curve for γ1 = 0 shows the
stability region in the absence of a network, and is therefore a necessary condition
for controlled synchronization in the presence of the network. In the general case
of N distinct eigenvalues, there will be N curves. Each curve adds a constraint and,
therefore, one would expect each eigenvalue γ1 , . . . , γ N to play a role in reducing
the size of the stability zone and shaping the window of opportunity as a function of
network topology.
In contrast to these expectations, Fig. 15.11 provides a convincing argument that
the stability zone is essentially defined by two curves, corresponding to the zero
eigenvalue, γ1 (red dashed curve) and the largest eigenvalue, γ N (black dashed curve).
All the other curves offer a very minor contribution to shaping the stability region.
As a consequence, windows of opportunity should be relatively robust to topological
changes, preserving the maximum largest eigenvalue of the Laplacian spectrum. For
example, the set of four distinct eigenvalues (0, 1, 3, 10) in Fig. 15.11 corresponds
to a star network of 10 nodes with an additional edge connecting two outer nodes.
In this case, the removal of the additional link reduces the spectrum to three distinct
eigenvalues (0, 1, 10) and eliminates the curve for γ = 3 which, however, does not
essentially change the stability region. This observation suggests that the addition of
an edge to a controlled network, which would be expected to help a network better
shield from the external influence of the broadcasting node, might not necessarily
improve network resilience to synchronization.
Similarly, the removal of an edge from an all-to-all network with two distinct
eigenvalues (0, N ) changes the spectrum to (0, N − 1, N ), which according to
Fig. 15.11 does not significantly alter the stability region either. For general topolo-
gies, one may look at the degree distribution to gather insight on the largest eigenvalue
(Anderson and Morley 1985; Grone and Merris 1994), thereby drawing conclusions
on the switching periods that guarantee the success of the broadcaster to synchronize
the network. Put simply, “you can run but you cannot hide:” the broadcaster will
identify suitable switching rates to overcome the resilience of the network.

15.5 Conclusions

While the study of stochastically switching networks has gained significant momen-
tum, most analytical results have been obtained under the assumption that the charac-
teristic time scales of the intrinsic oscillators and evolving connections are drastically
different, enabling the use of averaging and perturbation methods. In regard to on-
off stochastically switching systems, these assumptions typically yield two extremes,
fast or slow (dwell-time Hespanha and Morse 1999) switching, for which rigorous
theory has been developed (Hasler et al. 2013a, b; Belykh et al. 2004a, 2013; Porfiri
et al. 2006, 2008; Porfiri and Pigliacampo 2008; Porfiri and Fiorilli 2009a). However,
our understanding of dynamical networks with non-fast switching connections is elu-
sive, and the problem of an analytical treatment of the dynamics and synchronization
in non-fast switching networks remains practically untouched.

In this chapter, we sought to close this gap by presenting an analytical approach to characterize the stability of synchronization in stochastically switching networks of
discrete-time oscillators as a function of network topology and switching period. We
first focused on the simplest stochastic network composed of two maps and studied
the stability of synchronization by analyzing the linear stability of an augmented
system, associated with the linear mean square transverse dynamics. We performed
a detailed analysis of the Lyapunov exponent of the transverse dynamics, based on
the knowledge of the probability density function for the synchronized trajectory.
We established a necessary condition for stochastic synchronization in terms of the
synchronizability of the coupled maps with a static coupling. The necessary condi-
tion can be used to demonstrate that switching between configurations which do not
individually support synchronization will not stabilize stochastic synchronization
for any switching frequencies. This is in contrast with networks of continuous-time
oscillators where windows of opportunity for stable synchronization may appear as
a result of switching between unstable states (Jeter and Belykh 2015a, b). Through
closed-form and numerical findings, we have demonstrated the emergence of win-
dows of opportunity and elucidated their nontrivial relationship with the stability of
synchronization under static coupling.
While the details of the mechanisms for the appearance of windows of opportunity
in stochastically switching networks are yet to be clarified, it is tenable to hypothesize
that this effect is related to the dynamic stabilization of an unstable state. From a
mechanics perspective, this can be loosely explained by an analogy to the dynamics of
Kapitza’s pendulum. Kapitza’s pendulum is a rigid pendulum in which the pivot point
vibrates in a vertical direction, up and down (Kapitza 1951). Stochastic vibrations
of the suspension are known to stabilize Kapitza’s pendulum in an upright vertical
position, which corresponds to an otherwise unstable equilibrium in the absence
of suspension vibrations. By this analogy, stochastic switching between stable and
unstable configurations can be proposed to perform a similar stabilizing role.
Extending our analysis of synchronization of two maps, we then established a rig-
orous toolbox for assessing the mean-square stability of controlled synchronization
in a static network of coupled maps induced by stochastic broadcasting from a single,
external node. We studied the conditions under which a reference broadcasting node
can synchronize a target network by stochastically transmitting sporadic, possibly
conflicting signals. We demonstrated that manipulating the rate at which the con-
nections between the broadcasting node and the network stochastically switch can
overcome network resilience to synchronization. Through a rigorous mathematical
treatment, we discovered a nontrivial interplay between the network properties that
control this resilience and the switching rate of stochastic broadcasting that should
be adapted to induce synchronization. Unexpectedly, non-fast switching rates con-
trolling the so-called windows of opportunity guarantee stable synchrony, whereas
fast or slow switching leads to desynchronization, even though the networked system
spends more time in a state favorable to synchronization.
In contrast to classical master stability functions for uncontrolled synchroniza-
tion, where both the algebraic connectivity and the largest eigenvalue of the Lapla-
cian matrix determine the onset of synchronization, we report that the algebraic

connectivity has no role on broadcasting-induced synchronization of chaotic tent


maps. Specifically, the resilience of the network to broadcasting synchronization
increases with the value of the largest eigenvalue of the Laplacian matrix. Heteroge-
nous topologies with hubs of large degree should be preferred over homogenous
topologies, when designing networks that should be resilient to influence from a
broadcasting oscillator. On the contrary, homogenous topologies, such as regular or
random topologies, should be preferred when seeking networks that could be easily
tamed through an external broadcasting oscillator. Interestingly, these predictions
would be hampered by a simplified analysis based on averaging, which could lead
to false claims regarding the stability of synchronous solutions.
Our approach is directly applicable to high-dimensional maps whose invariant
density measure can be calculated explicitly. These systems include two-dimensional
diffeomorphisms on tori such as Anosov maps (Hasselblatt and Katok 2002), for
which the invariant density measure can be calculated analytically, and volume-
preserving two-dimensional standard maps whose invariant density function can
be assessed through computer-assisted calculations (Levnajić and Mezić 2010).
Although our work provides an unprecedented understanding of network synchro-
nization beyond the fast switching limit, we have hardly scratched the surface of
temporal dynamical networks theory. This work immediately raises the following
questions: (i) What if the i.i.d. process underlying the switching were relaxed to a more general Markov process? (ii) What if the underlying topology of the broad-
casting was more complex? Both of these questions are of interest, but provide their
own technical challenges and require further study. We anticipate that combining
our recent work on synchronization of two maps under Markovian switching with
memory (Porfiri and Belykh 2017) with the approach presented in this chapter should
make progress toward unraveling a complex interplay between switching memory
and network topology for controlled synchronization.

Acknowledgements This work was supported by the U.S. Army Research Office under Grant No.
W911NF-15-1-0267 (to I.B., R.J., and M.P.), the National Science Foundation (USA) under Grants
No. DMS-1009744 and DMS-1616345 (to I.B and R.J.), and CMMI 1561134, CMMI 1505832,
and CMMI 1433670 (to M.P.).

References

N. Abaid, M. Porfiri, Consensus over numerosity-constrained random networks. IEEE Trans. Autom. Control 56(3), 649–654 (2011)
W.N. Anderson Jr., T.D. Morley, Eigenvalues of the laplacian of a graph. Linear Multilinear Algebra
18(2), 141–145 (1985)
A. Arenas, A. Díaz-Guilera, J. Kurths, Y. Moreno, C. Zhou, Synchronization in complex networks.
Phys. Rep. 469(3), 93–153 (2008)
A.L. Barabási, R. Albert, Emergence of scaling in random networks. Science 286(5439), 509–512
(1999)
I. Belykh, V. Belykh, R. Jeter, M. Hasler, Multistable randomly switching oscillators: the odds of
meeting a ghost. Eur. Phys. J. Spec. Top. 222(10), 2497–2507 (2013)
I. Belykh, M. Di Bernardo, J. Kurths, M. Porfiri, Evolving dynamical networks. Physica D 267(1), 1–6 (2014)
I.V. Belykh, V.N. Belykh, M. Hasler, Blinking model and synchronization in small-world networks
with a time-varying coupling. Physica D 195(1), 188–206 (2004a)
V.N. Belykh, I.V. Belykh, M. Hasler, Connection graph stability method for synchronized coupled
chaotic systems. Physica D 195(1), 159–187 (2004b)
L. Billings, E. Bollt, Probability density functions of some skew tent maps. Chaos Solitons &
Fractals 12(2), 365–376 (2001)
S. Boccaletti, J. Kurths, G. Osipov, D. Valladares, C. Zhou, The synchronization of chaotic systems.
Phys. Rep. 366(1), 1–101 (2002)
S. Boccaletti, V. Latora, Y. Moreno, M. Chavez, D.U. Hwang, Complex networks: structure and
dynamics. Phys. Rep. 424(4), 175–308 (2006)
E.M. Bollt, N. Santitissadeekorn, Applied and Computational Measurable Dynamics (SIAM,
Philadelphia, 2013)
P.S. Bullen, D.S. Mitrinovic, M. Vasic, Means and their Inequalities, vol. 31 (Springer Science &
Business Media, Berlin, 2013)
L. Bunimovich, S. Dani, R. Dobrushin, M. Jakobson, I. Kornfeld, N. Maslova, Y.B. Pesin, J. Smil-
lie, Y.M. Sukhov, A. Vershik, Dynamical Systems, Ergodic Theory and Applications, vol. 100
(Springer Science & Business Media, Berlin, 2000)
S. Camazine, J.L. Deneubourg, N.R. Franks, J. Sneyd, E. Bonabeau, G. Theraulaz, Self-organization
in Biological Systems, vol. 7 (Princeton University Press, Princeton, 2003)
Y. Cao, W. Yu, W. Ren, G. Chen, An overview of recent progress in the study of distributed multi-
agent coordination. IEEE Trans. Ind. Inf. 9(1), 427–438 (2013)
M. Chen, Y. Shang, C. Zhou, Y. Wu, J. Kurths, Enhanced synchronizability in scale-free networks.
Chaos: Interdiscip. J. Nonlinear Sci. 19(1), 013,105 (2009)
P. De Lellis, M. di Bernardo, F. Garofalo, Synchronization of complex networks through local
adaptive coupling. Chaos: Interdiscip. J. Nonlinear Sci. 18(3), 037,110 (2008)
P. De Lellis, M. Di Bernardo, F. Garofalo, Adaptive pinning control of networks of circuits and
systems in lur’e form. IEEE Trans. Circuits Syst. I Regul. Pap. 60(11), 3033–3042 (2013)
P. De Lellis, M. Di Bernardo, F. Garofalo, M. Porfiri, Evolution of complex networks via edge
snapping. IEEE Trans. Circuits Syst. I Regul. Pap. 57(8), 2132–2143 (2010a)
P. De Lellis, M. Di Bernardo, T.E. Gorochowski, G. Russo, Synchronization and control of com-
plex networks via contraction, adaptation and evolution. IEEE Circuits Syst. Mag. 10(3), 64–82
(2010b)
S.N. Dorogovtsev, J.F. Mendes, Evolution of networks. Adv. Phys. 51(4), 1079–1187 (2002)
Y. Fang, Stability analysis of linear control systems with uncertain parameters. Ph.D. thesis, Case
Western Reserve University (1994)
M. Frasca, A. Buscarino, A. Rizzo, L. Fortuna, S. Boccaletti, Synchronization of moving chaotic
agents. Phys. Rev. Lett. 100(4), 044,102 (2008)
C. Godsil, G.F. Royle, Algebraic Graph Theory, vol. 207 (Springer Science & Business Media,
Berlin, 2013)
O. Golovneva, R. Jeter, I. Belykh, M. Porfiri, Windows of opportunity for synchronization in stochas-
tically coupled maps. Physica D 340, 1–13 (2017)
T.E. Gorochowski, M. di Bernardo, C.S. Grierson, Evolving enhanced topologies for the synchro-
nization of dynamical complex networks. Phys. Rev. E 81(5), 056,212 (2010)
T.E. Gorochowski, M.D. Bernardo, C.S. Grierson, Evolving dynamical networks: a formalism for
describing complex systems. Complexity 17(3), 18–25 (2012)
R. Grone, R. Merris, The Laplacian spectrum of a graph II. SIAM J. Discret. Math. 7(2), 221–229
(1994)
M. Hasler, I. Belykh, Blinking long-range connections increase the functionality of locally con-
nected networks. IEICE Trans. Fundam. Electron. Commun. Comput. Sci. 88(10), 2647–2655
(2005)
M. Hasler, V. Belykh, I. Belykh, Dynamics of stochastically blinking systems. Part i: finite time
properties. SIAM J. Appl. Dyn. Syst. 12(2), 1007–1030 (2013a)
M. Hasler, V. Belykh, I. Belykh, Dynamics of stochastically blinking systems. Part ii: asymptotic
properties. SIAM J. Appl. Dyn. Syst. 12(2), 1031–1084 (2013b)
M. Hasler, Y.L. Maistrenko, An introduction to the synchronization of chaotic systems: coupled
skew tent maps. IEEE Trans. Circuits Syst. I: Fund. Theory Appl. 44(10), 856–866 (1997)
B. Hasselblatt, A. Katok, Handbook of Dynamical Systems (Elsevier, Amsterdam, 2002)
J.P. Hespanha, A.S. Morse, Stability of switched systems with average dwell-time, in Proceedings
of the 38th IEEE Conference on Decision and Control, vol. 3 (IEEE, 1999), pp. 2655–2660
P. Holme, J. Saramäki, Temporal networks. Phys. Rep. 519(3), 97–125 (2012)
P. Holme, J. Saramäki, Temporal Networks (Springer, Berlin, 2013)
J. Ito, K. Kaneko, Spontaneous structure formation in a network of chaotic units with variable
connection strengths. Phys. Rev. Lett. 88(2), 028,701 (2001)
R. Jeter, I. Belykh, Synchronization in on-off stochastic networks: windows of opportunity. IEEE
Trans. Circuits Syst. I Regul. Pap. 62(5), 1260–1269 (2015a)
R. Jeter, I. Belykh, Synchrony in metapopulations with sporadic dispersal. Int. J. Bifur. Chaos
25(07), 1540,002 (2015b)
R. Jeter, M. Porfiri, I. Belykh, Network synchronization through stochastic broadcasting. IEEE
Control Syst. Lett. 2(1), 103–108 (2018a). https://doi.org/10.1109/LCSYS.2017.2756077
R. Jeter, M. Porfiri, I. Belykh, Overcoming network resilience to synchronization through non-fast
stochastic broadcasting. Chaos Interdiscip. J. Nonlinear Sci. 28(7), 071,104 (2018b)
P.L. Kapitza, Dynamic stability of a pendulum when its point of suspension vibrates. Soviet Phys.
JETP 21, 588–592 (1951)
H.J. Kushner, Introduction to Stochastic Control (Holt, Rinehart and Winston, New York, 1971)
Z. Levnajić, I. Mezić, Ergodic theory and visualization. i. Mesochronic plots for visualization of
ergodic partition and invariant sets. Chaos: Interdiscip. J. Nonlinear Sci. 20(3), 033,114 (2010)
Z. Li, G. Chen, Global synchronization and asymptotic stability of complex dynamical networks.
IEEE Trans. Circuits Syst. II Express Briefs 53(1), 28–33 (2006)
J. Lu, G. Chen, A time-varying complex dynamical network model and its controlled synchroniza-
tion criteria. IEEE Trans. Autom. Control 50(6), 841–846 (2005)
W. Lu, Adaptive dynamical networks via neighborhood information: Synchronization and pinning
control. Chaos: Interdiscip. J. Nonlinear Sci. 17(2), 023,122 (2007)
N. Masuda, P. Holme, Temporal Network Epidemiology (Springer, Berlin, 2017)
N. Masuda, K. Klemm, V.M. Eguíluz, Temporal networks: slowing down diffusion by long lasting
interactions. Phys. Rev. Lett. 111(18), 188,701 (2013)
A. Mondal, S. Sinha, J. Kurths, Rapidly switched random links enhance spatiotemporal regularity.
Phys. Rev. E 78(6), 066,209 (2008)
A.E. Motter, Networkcontrology. Chaos: Interdiscip. J. Nonlinear Sci. 25(9), 097,621 (2015)
T.I. Netoff, S.J. Schiff, Decreased neuronal synchronization during experimental seizures. J. Neu-
rosci. 22(16), 7297–7307 (2002)
T. Nishikawa, A.E. Motter, Network synchronization landscape reveals compensatory structures,
quantization, and the positive effect of negative interactions. Proc. Natl. Acad. Sci. 107(23),
10342–10347 (2010)
F. Olver, D. Lozier, R. Boisvert, C. Clark, NIST Handbook of Mathematical Functions (Cambridge
University Press, Cambridge, 2010)
E. Ott, Chaos in Dynamical Systems (Cambridge University Press, Cambridge, 2002)
L.M. Pecora, T.L. Carroll, Master stability functions for synchronized coupled systems. Phys. Rev.
Lett. 80(10), 2109 (1998)
A. Pikovsky, A. Politi, Lyapunov Exponents: A Tool to Explore Complex Dynamics (Cambridge
University Press, Cambridge, 2016)
A. Pikovsky, M. Rosenblum, J. Kurths, Synchronization: A Universal Concept in Nonlinear Sci-
ences, vol. 12 (Cambridge University Press, Cambridge, 2003)
M. Porfiri, A master stability function for stochastically coupled chaotic maps. Europhys. Lett.
96(4), 40,014 (2011)
M. Porfiri, Stochastic synchronization in blinking networks of chaotic maps. Phys. Rev. E 85(5),
056,114 (2012)
M. Porfiri, I. Belykh, Memory matters in synchronization of stochastically coupled maps. SIAM J.
Appl. Dyn. Syst. 16(3), 1372–1396 (2017)
M. Porfiri, F. Fiorilli, Global pulse synchronization of chaotic oscillators through fast-switching:
theory and experiments. Chaos Solitons & Fractals 41(1), 245–262 (2009a)
M. Porfiri, F. Fiorilli, Node-to-node pinning control of complex networks. Chaos: Interdiscip. J.
Nonlinear Sci. 19(1), 013,122 (2009b)
M. Porfiri, F. Fiorilli, Experiments on node-to-node pinning control of Chua's circuits. Physica D
239(8), 454–464 (2010)
M. Porfiri, R. Jeter, I. Belykh, Windows of opportunity for the stability of jump linear systems:
almost sure versus moment convergence. Automatica 100, 323–329 (2019)
M. Porfiri, R. Pigliacampo, Master-slave global stochastic synchronization of chaotic oscillators.
SIAM J. Appl. Dyn. Syst. 7(3), 825–842 (2008)
M. Porfiri, D.J. Stilwell, Consensus seeking over random weighted directed graphs. IEEE Trans.
Autom. Control 52(9), 1767–1773 (2007)
M. Porfiri, D.J. Stilwell, E.M. Bollt, Synchronization in random weighted directed networks. IEEE
Trans. Circuits Syst. I Regul. Pap. 55(10), 3170–3177 (2008)
M. Porfiri, D.J. Stilwell, E.M. Bollt, J.D. Skufca, Random talk: Random walk and synchronizability
in a moving neighborhood network. Physica D 224(1), 102–113 (2006)
W. Ren, R.W. Beard, Distributed Consensus in Multi-vehicle Cooperative Control (Springer, Berlin,
2008)
J.D. Skufca, E.M. Bollt, Communication and synchronization in disconnected networks with
dynamic topology: moving neighborhood networks. Math. Biosci. Eng. (MBE) 1(2), 347–359
(2004)
P. So, B.C. Cotton, E. Barreto, Synchronization in interacting populations of heterogeneous oscil-
lators with time-varying coupling. Chaos Interdiscip. J. Nonlinear Sci. 18(3), 037,114 (2008)
F. Sorrentino, E. Ott, Adaptive synchronization of dynamics on evolving complex networks. Phys.
Rev. Lett. 100(11), 114,101 (2008)
A. Stefański, P. Perlikowski, T. Kapitaniak, Ragged synchronizability of coupled oscillators. Phys.
Rev. E 75(1), 016,210 (2007)
T. Stojanovski, L. Kocarev, U. Parlitz, R. Harris, Sporadic driving of dynamical systems. Phys. Rev.
E 55(4), 4035 (1997)
D.J. Sumpter, Collective Animal Behavior (Princeton University Press, Princeton, NJ, 2010)
Y. Tang, F. Qian, H. Gao, J. Kurths, Synchronization in complex networks and its application-a
survey of recent advances and challenges. Annu. Rev. Control. 38(2), 184–198 (2014)
W. Yu, P. DeLellis, G. Chen, M. Di Bernardo, J. Kurths, Distributed adaptive control of synchro-
nization in complex networks. IEEE Trans. Autom. Control 57(8), 2153–2158 (2012)
D.H. Zanette, A.S. Mikhailov, Dynamical systems with time-dependent coupling: clustering and
critical behaviour. Physica D 194(3), 203–218 (2004)
Chapter 16
The Effects of Local and Global Link
Creation Mechanisms on Contagion
Processes Unfolding on Time-Varying
Networks

Kaiyuan Sun, Enrico Ubaldi, Jie Zhang, Márton Karsai, and Nicola Perra

Abstract Social closeness and popularity are key ingredients that shape the emer-
gence and evolution of social connections over time. Social closeness captures local
reinforcement mechanisms which are behind the formation of strong ties and com-
munities. Popularity, on the other hand, describes global link formation dynamics
which drive, among other things, hubs, weak ties and bridges between groups. In
this chapter, we characterize how these mechanisms affect spreading processes tak-
ing place on time-varying networks. We study contagion phenomena unfolding on
a family of artificial temporal networks. In particular, we review four different variations
of activity-driven networks that capture (i) heterogeneity of activation patterns,
(ii) popularity, (iii) the emergence of strong and weak ties, and (iv) community structure. By
means of analytical and numerical analyses we uncover a rich and process-dependent
phenomenology where the interplay between spreading phenomena and link
formation mechanisms might either speed up or slow down the spreading.

Keywords Activity driven networks · Epidemic modeling · Dynamical processes


on time-varying networks · Time-varying networks models · Popularity · Social
closeness

K. Sun
MOBS Lab, Network Science Institute, Northeastern University, Boston, USA
E. Ubaldi
Sony Computer Science Laboratories, Paris, France
J. Zhang · N. Perra (B)
Networks and Urban Systems Centre, University of Greenwich, London, UK
e-mail: n.perra@gre.ac.uk
J. Zhang
e-mail: jie.zhang@gre.ac.uk
M. Karsai
University of Lyon, Inria, CNRS, ENS de Lyon, UCB Lyon 1, LIP UMR 5668, 69007 Lyon,
France
e-mail: marton.karsai@ens-lyon.fr

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023
P. Holme and J. Saramäki (eds.), Temporal Network Theory, Computational Social
Sciences, https://doi.org/10.1007/978-3-031-30399-9_16
16.1 Introduction

Think about the last conference you attended. In particular, focus on the social inter-
actions you had throughout the week. Chances are that you spent a disproportionate
fraction of time chatting with old and current collaborators as well as with long-time
colleagues you typically meet in such settings. Chances are that you also networked
with new people. Most of these interactions were probably short and spontaneous
exchanges, maybe while waiting in line for coffee or after your presentation. Others
might have been with editors, to whom you were trying to pitch your new idea for a
book, or with one of the keynote speakers after her inspiring talk. Bear with us and
think about the last time you visited Twitter. Chances are that you read and interacted
with the posts of popular users, whom you do not know personally but follow avidly.
Finally, chances are that you also interacted with the personal friends you follow.
These two scenarios highlight how both face-to-face and digital interactions are
temporal acts driven by intricate social mechanisms. Among them we can identify
two categories. The first refers to frequent, reciprocated connections with individuals
in your close social circle(s). These are interactions you reinforce and activate over
time. The second refers to (mostly) one-sided interactions with popular individuals.
These are connections you initiate with people who, thanks to their status and fame,
are able to attract a large share of the attention from many others. The first category
encompasses local mechanisms that do not depend on the behavior of people outside
your social circle. The second, instead, encompasses global mechanisms that depend on the
behavior of, potentially, a large fraction of individuals.
Let’s go back to your last conference. Many of the interactions you probably had
were with people that, conference after conference, dinner after dinner, paper after
paper, entered your close social circle. These are individuals that you know very
well and that are likely part of the same community. Some of the other interactions
instead were probably with popular and influential people such as the keynote speak-
ers or editors with whom many other participants wanted to speak. The first type
of interaction was driven by social closeness, the second instead by popularity.
A large literature, mostly built on time-aggregated (static) data, substantiates
this picture. In particular, it is well known that social ties (both offline and online)
can be categorized as strong or weak (Granovetter 1973, 1995; Onnela et al. 2007;
Saramäki et al. 2014; Bakshy et al. 2012; Levin and Cross 2004; Friedkin 1980;
Brown and Reingen 1987). The first describes a small subset of ties which are acti-
vated frequently. The second instead describes sporadic (such as the person you met
while waiting for coffee) interactions. A classic signature of this tendency is found
in the distribution of link weights, which is heterogeneous. There is more. In
fact, as alluded to above, people with whom we share strong ties are likely to be also
connected in tight communities (Onnela et al. 2007; Fortunato 2010; Karsai et al.
2014, 2011). Thus, strong ties are clustered around groups of people characterized
by large link overlap (Onnela et al. 2007; Weng et al. 2018). Some of the weak ties,
instead, bridge such groups (Onnela et al. 2007; Burt 2009). Another well-known
property of real networks is the heterogeneity in the distribution of number of ties
(the degree) (Newman 2010; Barabási et al. 2016). In fact, networks are typically
sparse. Many nodes are poorly connected. Few of them instead are able to attract
a disproportionate amount of connections. It is important to stress how strong and
weak ties, communities, and hubs emerge and evolve over time (Holme 2015; Holme
and Saramäki 2012). Which model(s) can be used to reproduce such features? How
do they affect contagion phenomena taking place on their fabric? These are the main
questions we will tackle in this chapter. In particular, we aim to review and discuss
a set of models that have been proposed to capture the evolution of social ties as a
function of time. Specifically, we will consider both local and global approaches
able to reproduce the formation of strong ties, communities, as well as the presence
of popular individuals. From this standpoint, we will then study how they affect
contagion (epidemic) processes unfolding on such networks.

16.2 The Activity-Driven Framework

To explore the effects of local and global link formation mechanisms on epidemic
spreading processes, we will consider several variations of the activity-driven frame-
work (Perra et al. 2012). These are models of time-varying networks, based on
the same fundamental scheme. In particular, when one is tasked to describe the
evolution of the connections between N nodes, one needs to specify (at least) which
nodes are involved in interactions at each time step. In the activity-driven framework
this prescription is divided into two steps: (i) activation, (ii) partner selection. The
first describes which nodes are active and willing to connect. The second instead
describes how such active nodes select the partners with whom they interact. The modeling
of the activation process, which will be the same for all the different variations of the
framework we discuss here, is based on the intuition that not all nodes are equally
willing to create or be part of social interactions. This has been confirmed with a
range of observations in real datasets capturing very different types of interactions
ranging from scientific collaborations to R&D alliances between firms (Perra et al.
2012; Ribeiro et al. 2013; Karsai et al. 2014; Tomasello et al. 2014; Ubaldi et al.
2016; Alessandretti et al. 2017; Ubaldi et al. 2017). In particular, it turns out that the
activity rate (measured in a series of time windows of size Δt) is very heterogeneous.
Furthermore, the distribution of activity is largely independent of the choice of Δt.
In other words, if we measure the propensity of nodes to be involved in a social
interaction by splitting the data in time windows of size Δt or Δt′, we will get
extremely similar distributions. The partner selection process instead describes the
mechanism behind the formation of links. Here, we will review and consider three
different models that capture popularity and social closeness mentioned above. In
addition, we will consider a basic version of the model in which link creation is ran-
dom. This will serve as a null model (baseline). In all cases, we will first discuss the
details of the link formation mechanism and then its effect on epidemic processes
unfolding on the network at a time scale comparable to that of the evolution of the
graph's structure.
The general setting of activity driven models is the following. N nodes are
described by at least one variable: their activity a. This quantity regulates their
propensity to be active and willing to initiate social interactions at each time step.
Activities are extracted and assigned to nodes from a distribution F(a), which in the
following we consider to be a power law. Thus F(a) = C a^−γ with ε ≤ a ≤ 1 to avoid
divergences for small values of the activity. At each time step t the network G_t starts
completely disconnected. Each node i is active with a probability ai Δt. Active nodes
create m connections with others. As mentioned above we will consider four link
creation mechanisms. At time t + Δt each link is deleted, the process restarts, and
a network G_{t+Δt} is generated. It is important to stress that all the links are deleted at
the end of each iteration and thus links do not persist across time steps unless they
are re-formed.

16.2.1 Model 1: Baseline

The simplest link formation mechanism is random (Perra et al. 2012) (see Fig. 16.1a).
In this very unrealistic scenario partners are selected homogeneously across the entire
system. While very active nodes are likely to initiate connections in adjacent time
steps, the probability that the same link is activated more than once (the weight in
a time-integrated network) follows a Poisson distribution which, as mentioned in the
introduction, is quite far from real observations. However, it is possible to show that
integrating links over T time steps, the degree distribution follows the functional
form of the activity (Starnini and Pastor-Satorras 2014). Thus, the heterogeneous
propensity to initiate social interactions results in a heterogeneous degree distribution.
It is important to stress, however, that each G_t network is mostly constituted by a
set of disconnected stars formed around active nodes. Hubs emerge in time due to the
active engagement of such nodes.
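To make the activation and random partner-selection steps concrete, the following minimal sketch (in Python) generates one snapshot of the baseline model. It is only an illustration under the assumptions stated in the comments, not the reference implementation used in the works cited above; all function and parameter names are ours.

```python
import numpy as np

def sample_activities(n_nodes, gamma=2.1, eps=1e-3, rng=None):
    """Draw activities from F(a) ~ a^(-gamma) on [eps, 1] via inverse-CDF sampling."""
    rng = rng or np.random.default_rng()
    u = rng.random(n_nodes)
    return (eps ** (1 - gamma) + u * (1 - eps ** (1 - gamma))) ** (1 / (1 - gamma))

def snapshot_random(activities, m=1, dt=1.0, rng=None):
    """One time step of the baseline model: each active node wires m random links.
    Links are discarded before the next snapshot is generated."""
    rng = rng or np.random.default_rng()
    n = len(activities)
    edges = []
    for i in np.flatnonzero(rng.random(n) < activities * dt):
        for j in rng.choice(n, size=m, replace=False):
            if j != i:
                edges.append((i, int(j)))
    return edges

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    a = sample_activities(10_000, rng=rng)
    print(len(snapshot_random(a, m=2, rng=rng)), "edges in one snapshot")
```

Integrating the edge lists of many such snapshots reproduces the time-aggregated degree distribution discussed above.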

16.2.2 Model 2: Global Link Formation Process Driven by Popularity

The second link formation mechanism we consider aims to capture a global, popularity-based,
mechanism (Alessandretti et al. 2017) (see Fig. 16.1b). The basic intuition here is that
not all nodes are equally attractive. Keynote speakers at conferences and celebrities on
Twitter attract a disproportionate fraction of the connections in the system. To account
for this, we assume that nodes, besides the activity, are characterized by another
feature: the attractiveness b. Observations in different online social platforms sug-
gest that indeed the propensity of people to attract connections is heterogeneously
distributed (Alessandretti et al. 2017). All these aspects can be modeled within the
activity-driven framework as follows. Nodes are assigned two features: activity
Fig. 16.1 Schematic representation of the four different variations of the activity-driven framework.
The first three columns describe three time steps of the evolution of the network. The final column
describes the union of links created in the three time steps. At each time, for all the networks,
we assume that the same nodes are active (nodes in red), but the link creation process is instead
different. For simplicity we assume m = 1; the width of each link in the final column is proportional
to how many times that link was activated. The first row (a) shows the case of random link creation
(model 1). The second row (b) describes the global link creation mechanism based on popularity (model
2). In this representation, one of the nodes (the node with degree 4 in the integrated network) is the
most attractive node. The third row (c) describes the local link creation mechanism based on
social memory (model 3). The final row (d) describes the local link creation mechanism driven by
the presence of communities which are depicted by shaded areas (model 4)

a and attractiveness b. These two quantities are extracted from a joint distribution
H (a, b). Interestingly, observations on online social platforms indicate that active
people are also more attractive, thus the two features are in general positively corre-
lated (Alessandretti et al. 2017). In these settings the dynamics of the networks are
very similar to those described above. At each time step t, the network G_t starts dis-
connected. Each node is active with probability aΔt and connects to m other nodes.
Each partner j is selected following a simple preferential attachment, thus with
probability b_j/(⟨b⟩N). At time t + Δt all links are deleted and the process starts from
scratch. It is possible to show that also in this case the degree distribution, obtained
integrating links over time, is heterogeneous. However, the presence of attractiveness
introduces some level of heterogeneity in the weight distribution (Alessandretti
et al. 2017).
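As a hedged sketch (names and structure are ours, not the original code), the only change with respect to the baseline generator above is the partner-selection step, which becomes proportional to the attractiveness b_j:

```python
import numpy as np

def snapshot_attractiveness(activities, attractiveness, m=1, dt=1.0, rng=None):
    """One step of model 2: active nodes pick partners with probability b_j / (N <b>)."""
    rng = rng or np.random.default_rng()
    n = len(activities)
    p_select = attractiveness / attractiveness.sum()   # preferential selection
    edges = []
    for i in np.flatnonzero(rng.random(n) < activities * dt):
        for j in rng.choice(n, size=m, replace=False, p=p_select):
            if j != i:
                edges.append((i, int(j)))
    return edges

# Correlated scenario used later in the chapter: set attractiveness = activities,
# i.e. H(a, b) = F(a) delta(b - a); the uncorrelated one draws b independently of a.
```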
16.2.3 Model 3: Local Link Formation Process Driven by Social Memory

The third link formation mechanism is local and based on the idea of social closeness
(see Fig. 16.1c). The intuition is that, due to the need for close social connections and
to cognitive and temporal constraints, a large part of our interactions takes place within
a small social circle (Saramäki et al. 2014; Holt-Lunstad et al. 2010; Dunbar 2009;
Miritello et al. 2011; Stiller and Dunbar 2007; Powell et al. 2012; Gonçalves
et al. 2011). These are strong ties that we remember (hence social memory) and acti-
vate frequently. We also have sporadic, weak interactions with people outside the
circle such as the conference participants we met waiting for coffee in the hypothet-
ical scenario described in the introduction. Observations across collaboration and
communication networks corroborate this picture (Onnela et al. 2007; Burt 2009;
Karsai et al. 2014; Ubaldi et al. 2016; Alessandretti et al. 2017; Ubaldi et al. 2017).
Indeed, the probability that the next social act, for nodes that have already contacted
k distinct individuals in the past, will result in the establishment of a new, (k + 1)-th,
tie follows the function p(k) = (1 + k/c)^−η (Karsai et al. 2014; Ubaldi et al. 2016).
This implies that the larger the size of the social circle, the smaller the probability of
increasing it. Consequently, social acts are frequently repeated within small circles
of nodes. Remarkably, the behavior of a large number of individuals can be modeled
using one single value of c (that captures the offset after which the memory effects
become effective) and a single (or multiple within a small region) value of η (that
captures the memory strength) (Karsai et al. 2014; Ubaldi et al. 2016). We can mod-
ify the activity-driven framework to account for the function p(k) that regulates the
tendency towards new/old connections (social memory). The evolution of these net-
works, driven by such a local link formation mechanism, is modeled as follows. At
each time t the network G t starts completely disconnected. Each node i activates
with probability ai Δt and creates m connections. Each of these is created towards a
randomly selected node never contacted before with probability p(ki ) (where ki is
the number of nodes already in the social circle of i) and with probability 1 − p(ki )
towards a node already contacted before. It is possible to show that both the emergent
degree and weight distributions are heterogeneous and depend on η (Karsai et al.
2014; Ubaldi et al. 2016).
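A minimal sketch of the memory rule, under our own naming conventions (not the authors' code): each node keeps the set of partners it has already contacted, and the probability of reaching out to a new node decays as p(k) = (1 + k/c)^−η.

```python
import numpy as np

def p_new_tie(k, c=1.0, eta=1.0):
    """Probability that the next contact of a node with k known partners is a new tie."""
    return (1.0 + k / c) ** (-eta)

def snapshot_memory(activities, circles, m=1, dt=1.0, c=1.0, eta=1.0, rng=None):
    """One step of model 3; `circles` maps node index -> set of previously contacted
    nodes and, unlike the links themselves, persists across snapshots."""
    rng = rng or np.random.default_rng()
    n = len(activities)
    edges = []
    for i in np.flatnonzero(rng.random(n) < activities * dt):
        for _ in range(m):
            known = circles[i]
            if known and rng.random() > p_new_tie(len(known), c, eta):
                j = int(rng.choice(list(known)))   # reinforce a previously activated tie
            else:
                j = int(rng.integers(n))           # reach out to a new node
                while j == i or j in known:
                    j = int(rng.integers(n))
            edges.append((i, j))
            circles[i].add(j)
            circles[j].add(i)
    return edges

# circles = [set() for _ in range(N)] must be created once and reused at every step.
```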

16.2.4 Model 4: Local Link Formation Process Driven by Communities

The fourth, and final, mechanism is also based on a local link formation process.
As mentioned in the introduction, not only are our social ties organized into strong
and weak ones, but the people in our social circles are also likely to be friends with one another. In
network terms, social circles are communities formed by groups of tightly connected
people (Fortunato 2010). It is important to notice how the local mechanism described
above (model 3) does not account for this important aspect. Indeed, triangles (i.e.
a friend of my friend is my friend), which are crucial aspects of real communities,
might emerge but are not likely by construction. There have been multiple proposals
on how to model emergent clusters using the activity-driven framework (Laurent
et al. 2015; Nadini et al. 2018). Here, we will consider the approach developed
in Ref. Nadini et al. (2018) (see Fig. 16.1d). Observations in real networks show
how the size of communities is heterogeneous (Fortunato 2010). Thus, we can set a
distribution P(s) ∼ s^−α with s_min ≤ s ≤ s_max to describe the size, s, of communities
in the system. Each node is then associated with a community. In these settings, we can
easily modify the activity-driven framework to account for communities. As before, at
each time step t the network G t starts completely disconnected. Each node i is active
with probability ai Δt and creates m connections. With probability q each of these is
directed towards nodes in the same community (at random). With probability 1 − q
instead links are created randomly outside the community. The parameter q regulates
the modularity of the time-aggregated network. Clearly, if q = 0 the community
structure does not play any role in the dynamics of the network. Instead, if q = 1
the system is formed by disconnected communities. Values in between connect these
two limits. It is possible to show that, for moderately high values of q, the degree
and weight distributions are heterogeneous (Nadini et al. 2018).
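A rough sketch of the community-based partner selection (illustrative only; variable names are ours). Nodes are pre-assigned to communities whose sizes are drawn from P(s); with probability q an active node picks a partner inside its own community, otherwise it picks one outside.

```python
import numpy as np

def snapshot_communities(activities, community_of, members, m=1, q=0.9, dt=1.0, rng=None):
    """One step of model 4. `community_of[i]` is the label of node i's community and
    `members[c]` is the list of nodes in community c (both fixed over time).
    Assumes the network contains more than one community."""
    rng = rng or np.random.default_rng()
    n = len(activities)
    edges = []
    for i in np.flatnonzero(rng.random(n) < activities * dt):
        own = members[community_of[i]]
        for _ in range(m):
            if rng.random() < q and len(own) > 1:
                j = int(rng.choice(own))           # partner inside the community
                while j == i:
                    j = int(rng.choice(own))
            else:
                j = int(rng.integers(n))           # partner outside the community
                while community_of[j] == community_of[i]:
                    j = int(rng.integers(n))
            edges.append((i, j))
    return edges
```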

16.3 Epidemic Spreading on Activity-Driven Networks: Analytical Approach

After this preamble, we are in a position to investigate how global and local link
formation mechanisms affect epidemic spreading. To this end, we will consider a
prototypical contagion process: the Susceptible-Infected-Susceptible model (Keeling
and Rohani 2008; Barrat et al. 2008; Pastor-Satorras et al. 2015). Here, nodes are
divided into two compartments according to their disease status. Susceptible, S for
short, are healthy nodes that might be infected. Infected, I for short, are infectious
nodes. The natural history of the disease is described by two transitions. The first
is the infection process which is linked to a contact between S and I nodes. In
particular, a susceptible node in contact with an infected one gets infected with
probability λ: S + I → 2I. The second is instead the recovery process. Infected
nodes spontaneously recover and become susceptible again with rate μ: I → S. A
key quantity, that can be used to study the spreading of a disease with a given λ
and μ in a given network, is the epidemic threshold. The disease will be able to
spread, and reach an endemic state, only above a critical value (which is determined
by the features of the network where the process unfolds). Below such a critical value
the disease will die out and affect only a small fraction of the population. Before
going forward we need another piece. In particular, how do we go about estimating,
numerically, the epidemic threshold? The classic approach is to study I_∞ as a function
of λ/μ (Barrat et al. 2008; Wang 2016). As already mentioned, above the threshold
Table 16.1 Summary of the four different activity-driven models and their key features
Model | Link creation mechanism | Key variable(s)
1 | Random | Activity a
2 | Global: driven by popularity | Activity a and attractiveness b
3 | Local: driven by social memory | Activity a, social-tie reinforcement parameters c and η
4 | Local: driven by communities | Activity a, details of the community size distribution, probability q to select partners in the same community

the process reaches an endemic state. This is a dynamical equilibrium in which the
total number of infected nodes is constant. Thus above threshold I∞ > 0, while below
threshold I∞ = 0. Due to the stochastic nature of the process, the estimation of the
threshold by looking at the behavior of I∞ is quite hard. We will adopt an alternative
and recent approach, which looks at the life time, L, of the process (Boguña et al.
2013). This quantity is defined as the time the disease needs either to die out or to
reach a finite fraction, Y , of the population. Indeed, well below the threshold the
disease will quickly die out. Well above the threshold the disease will be able to
reach the fraction Y quite quickly. For values between these two regimes the life
time increases and reaches a peak for values in proximity of the real threshold. In
the language of phase transitions, the life time acts as the susceptibility χ (Boguña
et al. 2013).
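The lifetime just described is straightforward to estimate in simulations. Below is an illustrative sketch (our own code, with hypothetical names, under the assumption of a discrete-time SIS dynamics and of the cumulative-coverage reading of the lifetime definition given above); the snapshot generator can be any of the models sketched earlier in this chapter.

```python
import numpy as np

def sis_lifetime(snapshot_fn, n, lam, mu, y_frac=0.25, seed_frac=0.01,
                 t_max=10_000, rng=None):
    """Return the lifetime L of one SIS realization: the number of steps needed either
    to die out or for the cumulative coverage to reach a fraction y_frac of the nodes."""
    rng = rng or np.random.default_rng()
    infected = np.zeros(n, dtype=bool)
    seeds = rng.choice(n, size=max(1, int(seed_frac * n)), replace=False)
    infected[seeds] = True
    ever_infected = infected.copy()
    for t in range(1, t_max + 1):
        new_cases = []
        for i, j in snapshot_fn(t):                   # list of (i, j) links at time t
            if infected[i] != infected[j] and rng.random() < lam:
                new_cases.append(j if infected[i] else i)
        recovered = infected & (rng.random(n) < mu)   # recoveries with rate mu
        infected[np.asarray(new_cases, dtype=int)] = True
        infected[recovered] = False
        ever_infected |= infected
        if not infected.any() or ever_infected.mean() >= y_frac:
            return t
    return t_max
```

Averaging L over many realizations and scanning β/μ reproduces the peaked curves used later in this chapter to locate the threshold numerically.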
We have now all the ingredients to study a contagion process unfolding on different
versions of activity-driven networks which capture local and global link formation
dynamics. Before jumping to the details we summarize the key features of the various
models in Table 16.1.

16.3.1 SIS Epidemic Processes Unfolding on Model 1: Baseline

As a first step, let us consider the basic model where links are created randomly. This is
the baseline which will highlight the effects of heterogeneous activity patterns (since
the link formation is featureless). We can assume that nodes in the same activity
class are statistically equivalent. Furthermore, we can differentiate them according
to their disease status. Thus we will refer to S_a and I_a as the number of susceptible
and infected nodes in activity class a. As we are considering a fixed population,
N_a = S_a + I_a at all times. In order to derive the conditions for the spreading, we can
study the evolution of the infected population. In particular, we can write the number
of infected nodes at time t + Δt as:
I_a(t + Δt) = I_a(t) − μ I_a(t) Δt + mλ S_a(t) a Δt ∫ da′ I_a′(t)/N + mλ S_a(t) ∫ da′ a′ I_a′(t)/N Δt.    (16.1)
In particular, this is given by the number of infected nodes at time t (first term on the
r.h.s.), minus the nodes that recover (second term on the r.h.s.), plus susceptible nodes
that are active, get in contact with infected nodes in other classes and get infected as a
result (third term on the r.h.s.), plus susceptible nodes that get contacted and infected
by active infectious nodes in other activity classes (fourth term on the r.h.s.).
It is important to stress that, as each of the m links is created randomly, the probability of
selecting a node in a particular class is simply m/N. Dividing by Δt and taking the limit
Δt → 0, we can write:
d_t I_a = −μ I_a + mλ a (N_a − I_a) I / N + mλ (N_a − I_a) Θ / N,    (16.2)
where, to simplify the notation, we removed the explicit time dependence, wrote
S_a = N_a − I_a, defined Θ = ∫ a I_a da, and considered that I = ∫ I_a da is the total
number of infected nodes. Since we are interested in the early stages of the spreading,
we can move forward by linearizing the expression, assuming that N_a ∼ S_a and
keeping just the first-order terms in I_a. Thus we get

d_t I_a = −μ I_a + mλ a (N_a/N) I + mλ (N_a/N) Θ.    (16.3)
By summing over all classes of activity we get

d_t I = −μ I + mλ ⟨a⟩ I + mλ Θ,    (16.4)

since ⟨a⟩ = ∫ a N_a/N da = ∫ a F(a) da. The expression is now a function of two variables,
I and Θ. In order to understand their behavior we need to get an expression
for Θ. To this end, we multiply Eq. 16.3 by a and sum over all activity classes:

d_t Θ = −μ Θ + mλ ⟨a²⟩ I + mλ ⟨a⟩ Θ.    (16.5)

At this stage, we have obtained a system of two differential equations, one in I and one in
Θ. The disease will be able to spread only if the largest eigenvalue of the Jacobian
of the system is larger than zero. In fact, this implies that the disease-free state around
which we linearized the system is unstable. The Jacobian matrix can be written as

J_m = | −μ + λm⟨a⟩    λm          |
      | λm⟨a²⟩        −μ + λm⟨a⟩ |,

with eigenvalues

Λ_(1,2) = m⟨a⟩λ − μ ± λm √⟨a²⟩.    (16.6)
Thus, the epidemic threshold can be simply written as (Perra et al. 2012)

λ/μ > (1/m) · 1/(⟨a⟩ + √⟨a²⟩).    (16.7)

We can define β as the per capita rate at which people get infected. This is equal to
β = λ⟨k⟩. The average degree at each time step is equal to ⟨k⟩ = 2m⟨a⟩. Thus we
can write

β/μ > ξ_r ≡ 2⟨a⟩/(⟨a⟩ + √⟨a²⟩),    (16.8)

where we defined ξr as the epidemic threshold for the random link creation pro-
cess. It is interesting to notice how the threshold is a function of the first and second
moments of the activity distribution, and that it has also been derived using other
methods (Starnini and Pastor-Satorras 2014; Rizzo et al. 2014; Zino et al. 2016).
As the process unfolds while the network changes structure, the threshold is not a
function of the integrated degree distribution but only of the quantities describing
the activation of each node at each time step.
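As a quick numerical illustration (a sketch under our own assumptions, not taken from the cited works), Eq. (16.8) can be evaluated directly from a sample of activities:

```python
import numpy as np

def xi_random(activities):
    """Threshold of Eq. (16.8): 2<a> / (<a> + sqrt(<a^2>))."""
    a1, a2 = activities.mean(), (activities ** 2).mean()
    return 2 * a1 / (a1 + np.sqrt(a2))

# Example with a bounded power-law activity distribution F(a) ~ a^(-2.1) on [1e-3, 1]
rng = np.random.default_rng(0)
u, gamma, eps = rng.random(100_000), 2.1, 1e-3
a = (eps ** (1 - gamma) + u * (1 - eps ** (1 - gamma))) ** (1 / (1 - gamma))
print(f"xi_r ≈ {xi_random(a):.3f}")
```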

16.3.2 SIS Epidemic Processes in Model 2: The Effects of Popularity

Let us now shift gears and analyze the first non-random link creation mechanism. In
particular, following the previous order, let us consider the global mechanism based
on popularity. To this end, nodes are characterized by two features extracted from a
general joint distribution H (a, b). As before a describes the activity, b instead the
attractiveness. In these settings, it is necessary to divide nodes according to these
two features. Thus, the numbers of susceptible and infected nodes of activity a and
attractiveness b at time t are indicated as S_a,b and I_a,b, respectively. The evolution of
the number of infected nodes can be written as:

d_t I_a,b = −μ I_a,b + (λm/(N⟨b⟩)) S_a,b [ a ∫ da′ db′ b′ I_a′,b′ + b ∫ da′ db′ a′ I_a′,b′ ].    (16.9)

The first term on the r.h.s. accounts for the recovery process. The second describes
susceptible nodes that are active, select (with probability b′/(⟨b⟩N)) infected nodes
in other classes, and get infected. The third term finally describes susceptible nodes
selected by active and infected nodes in other classes, which become infectious as a
result. It is interesting to notice the symmetry of the last two terms. To move forward,
let us define two functions θ = ∫ a I_a,b da db and φ = ∫ b I_a,b da db. The previous
expression becomes:

d_t I_a,b = −μ I_a,b + (λm/(N⟨b⟩)) S_a,b [ a φ + b θ ].    (16.10)

As before, we can assume that at the early stages of the spreading N_a,b ∼ S_a,b and
neglect terms of second order in I_a,b; thus we are left with

d_t I_a,b = −μ I_a,b + (λm/(N⟨b⟩)) N_a,b [ a φ + b θ ].    (16.11)

From the last expression, we can obtain a system of three equations necessary to study
the behavior of the number of infected nodes in the early stages. In particular, we can
obtain an expression for (i) I, by summing over all activity and attractiveness classes, (ii) θ,
by multiplying both sides by a and summing over all classes, and (iii) φ, by multiplying
both sides by b and summing over all classes. Doing so, we obtain the following system
of differential equations:

d_t I = −μ I + (λm/⟨b⟩) [⟨a⟩ φ + ⟨b⟩ θ],    (16.12)
d_t θ = −μ θ + (λm/⟨b⟩) [⟨a²⟩ φ + ⟨ab⟩ θ],    (16.13)
d_t φ = −μ φ + (λm/⟨b⟩) [⟨ab⟩ φ + ⟨b²⟩ θ].    (16.14)

The eigenvalues of the Jacobian matrix read:

Λ_1 = −μ,    Λ_(2,3) = (λm/⟨b⟩) [⟨ab⟩ ± √(⟨a²⟩⟨b²⟩)] − μ.    (16.15)

As before, the disease is able to spread if the largest eigenvalue is larger than zero.
This condition implies (Pozzana et al. 2017):

β/μ > ξ_att ≡ 2⟨a⟩⟨b⟩ / (⟨ab⟩ + √(⟨a²⟩⟨b²⟩)).    (16.16)

It is important to notice how the threshold has been computed without any assump-
tion on the form of the distribution H (a, b), thus it is valid for any (integrable)
form. A comparison between ξr and ξatt reveals how the general structure of the
threshold is similar. In particular, the second moments are under the square root in
the denominator. However, note how the details of the correlation between the two
features appear explicitly in the term ⟨ab⟩. To gather a deeper understanding of the
differences between the two thresholds, let us first consider the uncorrelated case,
i.e., H(a, b) = F(a)G(b). In this case, we can re-write the threshold as:

ξ_att = 2 / (1 + √(⟨a²⟩⟨b²⟩ / (⟨a⟩²⟨b⟩²))).    (16.17)
As the dependence of the threshold on the two moments is symmetric, the case
with constant attractiveness and generic F(a) (baseline) can be mapped to the one
with constant activity and attractiveness distribution F(b). Clearly the symmetry
would be broken in case of directed networks, since the activity would regulate out-
links while the attractiveness would regulate in-links. Furthermore, as ⟨b²⟩ ≥ ⟨b⟩² always holds, the
threshold can only be lower than or equal to the one found in the first model. This means
that the introduction of any amount of heterogeneity in the attractiveness helps the
epidemic spreading, pushing the threshold to smaller values. As mentioned in the
introduction, observations in real networks suggest that activity and attractiveness
are correlated. The relation between the two, in two online communication networks,
can be modeled as a ∼ b^γ_c with γ_c close to one. What happens to the threshold in
this case? To answer this question, we can study the case of deterministic correlation
between the two variables imposing:

H (a, b) = F(a)δ(b − q(a)), (16.18)

where δ(x) is the Dirac delta and q(a) is the function that determines the attractiveness
of a node given its activity: bi = q(ai ), ∀i. Using the relation G(b) = F(a)|da/db|,
we can obtain an expression for G(b):

G(b) = F(q^−1(b)) |dq^−1(b)/db|.    (16.19)

To account for the observations mentioned above, we can set q(a) = a^γ_c, with γ_c > 0.
Since the activity is distributed according to a power law (F(a) ∝ a^−γ_a), the
attractiveness will be distributed as G(b) ∝ b^(−1 + (1−γ_a)/γ_c). In these settings, a generic moment
of the joint distribution can be expressed as:

⟨a^n b^m⟩ = ⟨a^(n + γ_c m)⟩,    (16.20)

thus the epidemic threshold becomes:

ξ_att = 2⟨a⟩⟨a^γ_c⟩ / (⟨a^(1+γ_c)⟩ + √(⟨a²⟩⟨a^(2γ_c)⟩)).

Generally speaking (this is controlled by the value of γ_c), the threshold is not only
smaller than ξ_r but also smaller than in the uncorrelated case. In fact, the disease is able
to spread faster when popular people, who are able to attract connections from
many others, are also very active in contacting other nodes.
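The comparison between Eqs. (16.8) and (16.16) can be checked numerically with a short sketch (ours, with an assumed γ_c = 1 deterministic correlation and an independently shuffled attractiveness for the uncorrelated case):

```python
import numpy as np

def xi_attractiveness(a, b):
    """Threshold of Eq. (16.16): 2<a><b> / (<ab> + sqrt(<a^2><b^2>))."""
    return (2 * a.mean() * b.mean()
            / ((a * b).mean() + np.sqrt((a ** 2).mean() * (b ** 2).mean())))

rng = np.random.default_rng(0)
u, gamma, eps = rng.random(100_000), 2.1, 1e-3
a = (eps ** (1 - gamma) + u * (1 - eps ** (1 - gamma))) ** (1 / (1 - gamma))

b_uncorr = rng.permutation(a)   # same marginal distribution, correlations destroyed
b_corr = a.copy()               # deterministic correlation with gamma_c = 1

print("xi_r          ", 2 * a.mean() / (a.mean() + np.sqrt((a ** 2).mean())))
print("xi_att uncorr ", xi_attractiveness(a, b_uncorr))
print("xi_att corr   ", xi_attractiveness(a, b_corr))
# Expected ordering for this choice: xi_att(corr) <= xi_att(uncorr) <= xi_r
```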
16.3.3 SIS Epidemic Processes in Model 3: The Effects of Social Memory

The third model, based on the local reinforcement of previously activated ties (social
memory), does not allow (to the best of our knowledge) a derivation of a closed
expression for the threshold as we did in the previous two cases. Exact numerical
methods, based on the spectral properties of matrices obtained from G_t, can be used to
derive it, but these do not yield an explicit expression (Valdano et al. 2015;
Prakash et al. 2010). However, a recent paper by Tizzani et al. (2018) provides an
analytical treatment with some approximations. While we refer the interested reader
to the original paper for details, here we provide a summary of their derivation as it
nicely complements the techniques we discussed above. First of all, they adopted an
individual-based approach, in which, rather than considering classes of activity, each
node is considered explicitly. In fact, the memory effects for each node make the
interactions with a given social circle more likely. Thus nodes in the same activity
class cannot be considered statistically equivalent as their behavior depends on their
memory of past interactions. In the individual based approach, the focus goes from
the study of the evolution of the number of infected nodes in a given activity class
to the study of the probability ρi (t) that the node i is infected at time t. This can be
written as (Tizzani et al. 2018):

d_t ρ_i(t) = −μ ρ_i(t) + λ [1 − ρ_i(t)] { a_i [1 − p(k_i)] Σ_j (A_ij(t)/k_i) ρ_j(t)
             + a_i p(k_i) Σ_{j∉i} ρ_j(t)/(N − k_i − 1)
             + Σ_j a_j [1 − p(k_j)] (A_ij(t)/k_j) ρ_j(t)
             + Σ_{j∉i} a_j p(k_j) ρ_j(t)/(N − k_j − 1) },    (16.21)

where A_ij(t) is the adjacency matrix of the integrated graph up to time t, and the notation j ∉ i
selects only the nodes j not yet connected to i; by construction, their number is N − k_i(t) − 1. The
first term on the r.h.s. describes the recovery rate of the node. All the other terms
describe the infection processes, which depend on the infection probability λ and the
probability that the node is susceptible (1 − ρi (t)). The first two terms in the large
brackets account for the fact that the node i is active and connects with a node j that
it has already contacted before (first term) or that it has never seen (second term). The
last two terms are the same but in this case they account for the fact that the other
nodes are active and connect to i (Tizzani et al. 2018). It is important to notice that
this expression relies on an approximation: the state of every node is independent
of the state of the neighbors. Clearly, this neglects the correlation between nodes.
The challenges induced by the memory become clear when one considers that the adjacency matrix
and the social circle of each node are functions of time. Thus, the unfolding of the
disease is clearly a function of its starting point in time. As Tizzani et al. noted, if we
consider the limit in which 1 ≪ k_i(t) ≪ N, thus if the disease starts spreading when
the degree of each node is far from one and from N, hence at large times (but not too
large), the expression can be reduced as (see Ref. Tizzani et al. (2018) for details):

  
d_t ρ_i(t) = −μ ρ_i(t) + λ [1 − ρ_i(t)] Σ_j A_ij(t) (a_i/k_i + a_j/k_j) ρ_j(t).    (16.22)

In order to find a solution to this expression, Tizzani et al. transitioned from the time
integrated connectivity patterns (Ai j ) to an annealed form (Pi j (t)) which describes
the probability that i and j have been connected in the past. Interestingly, they show
that

P_ij(t) = (1 + η) (t^(1/(1+η)) / N) [ g(a_i) + g(a_j) ],    (16.23)

where they defined g(x) = x/(C x)^η. The expression is regulated by the memory strength η as
well as by the activities of the two nodes. From here Tizzani et al. can move
from the probability that a node i is infected at time t to the probability that a node of
activity a at time t is infected (thanks to the annealed approximation). In particular,
they obtain (Tizzani et al. 2018):
 
d_t ρ(a, t) = −μ ρ(a, t) + λ [1 − ρ(a, t)] { (a g(a)/(g(a) + ⟨g⟩)) ∫ da′ F(a′) ρ(a′, t)
             + (a/(g(a) + ⟨g⟩)) ∫ da′ F(a′) ρ(a′, t) g(a′)
             + g(a) ∫ da′ F(a′) ρ(a′, t)/(g(a′) + ⟨g⟩)
             + ∫ da′ F(a′) (a′ g(a′)/(g(a′) + ⟨g⟩)) ρ(a′, t) },    (16.24)

where ⟨g⟩ = ∫ da F(a) g(a) and where, interestingly, we find terms similar to those of the two previous cases. This expres-
sion is rather complex. However, the conditions for the spreading can be found,
conceptually as before, by linearizing it at early times and studying the Jacobian of
the system of four differential equations obtained for it (see Ref. Tizzani et al. (2018)
for the derivation). While the condition can be obtained fairly easily numerically,
the nature of the terms and the size of the matrix do not allow for a simple closed
expression. Nevertheless, in case a disease starts spreading in a mature social network
in which nodes have built their social circles, this analytical treatment works extremely
well (Tizzani et al. 2018). However, the general case does not yet have a closed
solution.
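The exact numerical route mentioned at the beginning of this subsection (spectral conditions on matrices built from the G_t snapshots, in the spirit of the infection propagator of Valdano et al. 2015) can be sketched as follows. This is an illustrative implementation under our own assumptions (a discrete-time SIS process over a finite, stored sequence of adjacency matrices), not the code used in the cited works.

```python
import numpy as np

def propagator_radius(adjacency_seq, lam, mu):
    """Spectral radius of P = prod_t [(1 - mu) * Id + lam * A_t] for a finite
    sequence of snapshot adjacency matrices A_t (dense numpy arrays)."""
    n = adjacency_seq[0].shape[0]
    propagator = np.eye(n)
    for a_t in adjacency_seq:
        propagator = ((1.0 - mu) * np.eye(n) + lam * a_t) @ propagator
    return np.max(np.abs(np.linalg.eigvals(propagator)))

def critical_lambda(adjacency_seq, mu, lam_lo=1e-4, lam_hi=1.0, iters=30):
    """Bisection on the per-contact infection probability: the threshold sits where
    the per-step growth rate rho(P)^(1/T) crosses one (rho is monotone in lambda)."""
    t_steps = len(adjacency_seq)
    for _ in range(iters):
        lam = 0.5 * (lam_lo + lam_hi)
        if propagator_radius(adjacency_seq, lam, mu) ** (1.0 / t_steps) > 1.0:
            lam_hi = lam
        else:
            lam_lo = lam
    return 0.5 * (lam_lo + lam_hi)
```

Applied to snapshots generated with the memory model sketched earlier, this gives a numerical threshold even where no closed expression is available.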
16.3.4 SIS Epidemic Processes in Model 4: The Effects of Communities

Finally, let us turn our attention to the last model, where the link creation dynamics
are influenced by membership in specific communities. In particular, active nodes
select (at random) a connection with nodes in their community with probability q,
and outside their community with probability 1 − q. Although we will not be able to
solve them, it is instructive to write the dynamical equations describing the contagion
process in these settings. Similarly to what we did before, let us define Sa,s and Ia,s
as the number of susceptible and infected individuals, respectively, in the class of
activity a and community of size s at time t. We can then write (Nadini et al. 2018):
 
d_t I_a,s = −μ I_a,s + λ a S_a,s [ q I_s/s + (1 − q) I/N ]
             + λ ∫ da′ a′ [ q (S_a,s/s) I_a′,s + (1 − q) (S_a,s/N) I_a′ ],    (16.25)

where Is and I are the number of infected in communities of size s and in the whole
network, respectively. As usual, the first term on the r.h.s. accounts for the recovery of
infected individuals. The second and third terms describe susceptible nodes that are
active and select infected nodes inside or outside their community. The fourth and fifth
terms are similar but consider that active, infected nodes select susceptible
nodes of class a in a community of size s. For simplicity, we consider that N − s ∼ N
and, at least initially, I − I_s ∼ I. Summing over all the activities and community
sizes, and keeping only the first-order terms in I_a,s, we
obtain

d_t I = −μ I + λ⟨a⟩ I + λ θ + λq Σ_s (⟨a⟩_s − ⟨a⟩) I_s,    (16.26)
d_t θ = −μ θ + λ⟨a²⟩ I + λ⟨a⟩ θ + λq Σ_s [ (⟨a²⟩_s − ⟨a²⟩) I_s + (⟨a⟩_s − ⟨a⟩) θ_s ],    (16.27)

 
where we defined, as before, θ = ∫ a I_a da and θ_s = ∫ a I_a,s da. The term ⟨a^x⟩_s = ∫ da N_a,s a^x / s
describes the moments of the activity distribution in a community
of size s; for x = 1 it is the average activity in a community of size s. As before, the
second, auxiliary, equation is obtained from the first by multiplying both sides by
a and summing over all s and a. The epidemic threshold, at least in principle, can
be derived by evaluating the largest eigenvalue of the Jacobian matrix of the system
of differential equations in I and θ . Unfortunately, a closed expression, to the best
of our knowledge, has not been derived yet. Nevertheless, we can point out some
interesting observations. First of all, the terms in q weight a comparison between the
moments of the activity distribution in the whole network with the corresponding
values computed inside each community. In case the fluctuations of these terms are
negligible, due for example to very large community sizes or to a narrow distribution
of activity, the equations become equivalent to the case q = 0 (which is equivalent,
for small community sizes, to a random link creation mechanism). Similarly, in the case
q → 0, the network has no modular structure, and the threshold becomes equal to
that of the first, simple, model. In the opposite limit q → 1, the large majority of connections
take place inside each community. Thus the coupling between clusters becomes very
weak. Especially when the average size of communities is small, the probability of
selecting the same node as partner increases significantly.
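A small illustrative helper (our own naming) makes the role of the q-dependent correction terms in Eqs. (16.26)-(16.27) tangible: it computes the per-community activity moments that are compared against the global ones.

```python
import numpy as np

def community_activity_moments(activities, community_of):
    """Per-community moments <a>_s and <a^2>_s entering Eqs. (16.26)-(16.27)."""
    moments = {}
    for c in np.unique(community_of):
        a_c = activities[community_of == c]
        moments[c] = (a_c.mean(), (a_c ** 2).mean())
    return moments

# When these moments barely fluctuate around the global <a> and <a^2> (very large
# communities or a narrow activity distribution), the q-dependent corrections vanish
# and the threshold reduces to the q = 0 (random) case discussed above.
```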
Before moving to a more direct comparison between the thresholds in the four
different models, let us spend a few words on another important, and prototypical,
contagion process: the SIR model (Keeling and Rohani 2008; Barrat et al. 2008;
Pastor-Satorras et al. 2015). While the infection mechanism is equivalent to the SIS,
the recovery is radically different. In fact, there is another compartment, R, describ-
ing infected nodes that recover. These cannot be infected again as they acquire a
permanent immunity. It is easy to show that, in case the population is fully susceptible
at the early stages and thus R ∼ 0, the threshold for the SIR model unfolding on
activity-driven networks with random or global link creation dynamics is equivalent
to that of a SIS process. However, the symmetry breaks in case of local link cre-
ation mechanisms (for the last model in the limit of high q and small community
sizes) (Nadini et al. 2018; Sun et al. 2015). In fact, the presence of memory in the
connectivity patterns induced by the reinforcement of previously activated ties or
by high modularity has opposite effects in SIR and SIS models. The repetition of
a small number of connections hinders the spreading power of SIR processes. In
fact, as soon as a node recovers, links towards it cannot result in further infections.
However, such repetition (as we will see in more detail in the next section) favors SIS
dynamics since it allows the disease to survive in small patches (infected nodes will
eventually be susceptible again).

16.4 Epidemic Spreading on Activity-Driven Networks: Numerical Simulations

In Fig. 16.2 we show the normalized lifetime, L n , for the different variations of the
activity-driven framework discussed above. The normalization is done by dividing each
curve by its maximum, i.e. L_n = L/max L. In the plot we considered two versions
of the model with heterogeneous attractiveness (model 2). In the first, attractiveness
and activity, for each node, are extracted independently (uncorrelated scenario). In
the second instead, attractiveness and activity are equal (correlated scenario). In the
case of social memory (model 3), we set c = 1 and η = 1, thus previously activated
ties are repeated with probability 1 − p(k) = k/(1 + k). We also considered two different
versions of the model with communities (model 4). In the first, we set q = 0.45, thus
only 45% of the links are created within each community. In the second instead, we
Fig. 16.2 We show the normalized lifetime of SIS processes unfolding on different variations of
the activity-driven framework as a function of β/μ. In all cases, we set N = 10^5, m = 1, F(a) ∝ a^−2.1,
ε = 10^−3, μ = 10^−2, Y = 0.25, start each simulation with 1% of randomly selected seeds, and
each point is the average of 10^2 independent simulations. In the models with attractiveness, we set
the distribution of popularity as G(b) ∝ b^−2.1 and considered two scenarios: (1) H(a, b) = F(a)G(b)
(uncorrelated case), (2) H(a, b) = F(a)δ(b − a) (correlated case). In the model with social memory,
we set c = 1 and η = 1, thus a node that has already contacted k nodes will connect to a new
tie with probability p(k) = 1/(k + 1) and will reinforce a previously activated link with probability
1 − p(k) = k/(k + 1). Finally, in the models with community structure, we extracted community sizes
from a power-law distribution P(s) ∝ s^−2.1 with 10 ≤ s ≤ √N and considered two values of q

set q = 0.9, a much higher value. In both scenarios, community sizes are extracted
from a power-law distribution P(s) ∝ s^−2.1 with 10 ≤ s ≤ √N. In order to compare
the different scenarios, we fixed (across the board) all the other parameters (see the
legend of Fig. 16.2 for details). In these settings, several observations are in order.
First, all global and local link creation mechanisms result in smaller values of the
epidemic threshold with respect to the baseline (random link creation). Second, global,
popularity-driven mechanisms based on heterogeneous distributions of attractiveness
(model 2) push the threshold to much smaller values with respect to local mechanisms
based on social closeness principles (models 3 and 4). Interestingly, correlations
between activity and attractiveness help the spreading of the disease even further
with respect to uncorrelated scenarios. Third, the effects of communities depend on the
value of q. In particular, high values of this quantity help the spreading more than
smaller values of it. In fact, the repetition of connections within each cluster helps
the disease to survive for smaller values of the spreading rate. Finally, social memory,
thus the repetition of previously activated ties, helps the spreading more than medium-
low values of modularity (q = 0.45), but not as much as larger values of it. In order
to gather a deeper understanding of the dynamics, in Fig. 16.3 we show, as a function
of β/μ, the fraction of the infected population evaluated at the time step equal to the
lifetime, I_L. This quantity provides complementary information with respect to the previous
plot by showing the prevalence of the disease at the moment the conditions that define
the lifetime are met (i.e., the disease either dies out or reaches a cumulative fraction Y of
the population). Several observations are in order. First, the presence of heterogeneity
in nodes' attractiveness (model 2) not only results in the lowest value of the threshold
(as shown in Fig. 16.2), but also affects a larger fraction of the population with respect to all
Fig. 16.3 We show I_L for the different versions of the activity-driven models. This quantity describes
the fraction of infected nodes at time t = L. The parameters for all the different cases are the same
as those used in Fig. 16.2

the other link creation mechanisms. The effects of correlations between activity and
attractiveness are not as visible as for the threshold. Second, despite the fact that in the case of strong
modularity (q = 0.9) the threshold is smaller than in the case of social memory, the latter
has a larger impact on the population. Indeed, while the presence of tightly connected
communities allows for the survival of the disease for smaller values of the spreading
rate, high values of modularity confine the impact of the virus to small patches. It is
important to notice that this effect depends on the average community size and
the distribution of sizes. Intuitively, in case nodes are arranged in a few very large
communities such effects would be reduced, as the repetition of ties would be much
less likely. Finally, social memory (model 3), thus the reinforcement of previously
activated ties, has a large impact on the system with respect to random and the other local
mechanisms for values of β/μ close to the epidemic threshold. As β/μ increases
we enter a region of the phase space where the repetition of the same connections
hampers the spreading power of the disease with respect to randomly activated ties or to
networks with low modularity (model 4). In this regime, the spreading rate is high
and having connections with a large number of nodes, rather than repeating the links
with a few of them, helps the unfolding of the disease.

16.5 Conclusions

In this chapter, we have investigated the effects of different link-creation mecha-
nisms on contagion processes unfolding on time-varying networks. In particular, we
focused on two main classes: global and local mechanisms. We modeled the first by
considering that the propensity of nodes to attract social interactions is heteroge-
neous. We modeled the second by considering two different approaches: (i) ties activated
in the past are more likely to be re-activated than new ones (social memory); (ii)
social ties are typically organized in tight communities that are sparsely connected
to one another. The first mechanism is inspired by popularity (attractiveness), the second by

social-closeness mechanisms. Furthermore, as a null model, we considered the sim-
ple, and unrealistic, case in which links are created randomly (baseline). We first
provided details about how to analytically tackle the spreading of SIS processes in
these models. We then presented a more direct numerical comparison between them.
Interestingly, we found that global link-creation mechanisms, driven by heteroge-
neous distributions of attractiveness, drastically reduce the epidemic threshold with respect
to the case of a homogeneous distribution of this quantity (baseline) and to the case of
local mechanisms driven by social closeness. Thus, the presence of globally popular
nodes, able to attract a large share of the interactions, facilitates the spreading with respect
to the local correlated dynamics induced by social-closeness mechanisms. In fact,
as soon as such popular hubs get infected, they affect a large fraction of the population
that connects to them, even for small values of the spreading rate. The effect of com-
munities is a function of the modularity. High values of modularity push the threshold
to smaller values with respect to low values of it and to the social-memory mechanism
based on the repetition of previously activated ties. However, social memory might
have a large impact on the system in terms of the disease’s prevalence in the population. In
fact, for values of the spreading rate close to the threshold, we observe an interesting
phenomenology in which the fraction of infected nodes is larger with respect to the case of
communities as well as to the baseline. While the presence of communities allows
the disease to survive for smaller values of the spreading rate, it confines the disease
to smaller patches with respect to the case of social memory.
Arguably, the mechanisms considered here are not mutually exclusive. In
fact, both offline and online social networks are driven by their interplay. Here, we
have shown that, even taken singly, they introduce non-trivial dynamics in contagion
processes. More research should be conducted in the future to study their interplay
and trade-offs.

Chapter 17
Supracentrality Analysis of Temporal
Networks with Directed Interlayer
Coupling

Dane Taylor, Mason A. Porter, and Peter J. Mucha

Abstract We describe centralities in temporal networks using a supracentrality framework to study centrality trajectories, which characterize how the importances
of nodes change with time. We study supracentrality generalizations of eigenvector-
based centralities, a family of centrality measures for time-independent networks
that includes PageRank, hub and authority scores, and eigenvector centrality. We
start with a sequence of adjacency matrices, each of which represents a time layer
of a network at a different point or interval of time. Coupling centrality matrices
across time layers with weighted interlayer edges yields a supracentrality matrix
C(ω), where ω controls the extent to which centrality trajectories change with time.
We can flexibly tune the weight and topology of the interlayer coupling to cater to
different scientific applications. The entries of the dominant eigenvector of C(ω)
represent joint centralities, which simultaneously quantify the importances of every
node in every time layer. Inspired by probability theory, we also compute marginal
and conditional centralities. We illustrate how to adjust the coupling between time
layers to tune the extent to which nodes’ centrality trajectories are influenced by the
oldest and newest time layers. We support our findings by analysis in the limits of
small and large ω.

Keywords Temporal networks · Centrality · PageRank · Multilayer networks · Multiplex networks

D. Taylor (B)
University of Wyoming, Laramie, WY, USA
e-mail: dane.taylor@uwyo.edu
M. A. Porter
University of California, Los Angeles, CA, USA
e-mail: mason@math.ucla.edu
Santa Fe Institute, Santa Fe, NM, USA
P. J. Mucha
Dartmouth College, Hanover, NH, USA
e-mail: peter.j.mucha@dartmouth.edu

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023
P. Holme and J. Saramäki (eds.), Temporal Network Theory, Computational Social
Sciences, https://doi.org/10.1007/978-3-031-30399-9_17

17.1 Introduction

Quantifying the importances of nodes through the calculation of ‘centrality’ measures is a central topic in the study of networks (Newman 2018). It is important
in numerous and diverse applications, including identification of influential people
(Bonacich 1972; Faust 1997; Borgatti et al. 1998; Kempe et al. 2003), ranking web
pages in searches (Brin and Page 1998; Page et al. 1999; Kleinberg 1999), ranking
teams and individual athletes in sports (Callaghan et al. 2007; Saavedra et al. 2010;
Chartier et al. 2011), identification of critical infrastructures that are susceptible to
congestion or failure (Holme 2003; Guimerà et al. 2005), quantifying impactful judi-
cial documents (Leicht et al. 2007; Fowler et al. 2007; Fowler and Jeon 2008) and
scientific publications (Bergstrom et al. 2008), revealing drug targets in biological
systems (Jeong et al. 2001), and much more.
Because most networks change with time (Holme and Saramäki 2012, 2013;
Holme 2015), there is much interest in extending centralities to temporal networks
(Liao et al. 2017). Past efforts have generalized quantities such as betweenness cen-
trality (Tang et al. 2010; Kim et al. 2012; Alsayed and Higham 2015; Williams and
Musolesi 2016; Fenu and Higham 2017), closeness centrality (Tang et al. 2010; Pan
and Saramäki 2011; Kim et al. 2012; Williams and Musolesi 2016), Bonacich and
Katz centrality (Lerman et al. 2010; Grindrod and Higham 2014), win/lose cen-
trality (Motegi and Masuda 2012), communicability (Grindrod et al. 2011; Estrada
2013; Grindrod and Higham 2013; Chen et al. 2016; Arrigo and Higham 2017; Fenu
and Higham 2017), dynamic sensitivity (Huang and Yu 2017), coverage centrality
(Takaguchi et al. 2016), PageRank (Walker et al. 2007; Rossi and Gleich 2012; Mar-
iani et al. 2015, 2016; You et al. 2017), and eigenvector centrality (Praprotnik and
Batagelj 2015; Huang et al. 2017; Flores and Romance 2018). A common feature
of these extensions is that they illustrate the importance of using methods that are
designed explicitly for temporal networks, as opposed to various alternatives. These
alternatives include aggregating a temporal network into a single ‘time-independent’
network, independently analyzing a temporal network at different instances of time,
and binning a temporal network into time windows and analyzing those windows
independently. With aggregation, it is not even possible to study centrality trajectories
(i.e., how centralities change with time).
Because one can derive many centralities by studying walks on a network, some
of the above temporal generalizations of centrality involve the analysis of so-called
‘time-respecting paths’ (Kossinets et al. 2008; Kostakos 2009). There are multiple
ways to define a time-respecting path, including the possibility of allowing multiple
edge traversals per time step for a discrete-time temporal network. There are also
multiple ways to quantify the length of a time-respecting path (Williams and Musolesi
2016), because such a path can describe the number of edges that are traversed by
a path, latency between the initial and terminal times of a path, or a combination of
these ideas. It is necessary to make choices even to define a notion of a ‘shortest path’
(from which one can formulate several types of centrality). Consequently, some of

the diversity in the various temporal generalizations of centrality measures arises from the diversity in defining and measuring the length of a time-respecting path.
In the present work, we examine a notion of supracentrality (Taylor et al. 2017,
2021), which one can calculate by representing a temporal network as a sequence of
network layers and coupling those layers to form a multilayer network (specifically,
a multiplex network (Kivelä et al. 2014; Porter 2018)). See Fig. 17.1 for illustrative
examples. The sequence of network layers, which constitute time layers, can repre-
sent a discrete-time temporal network at different time instances or a continuous-time
network in which one bins (i.e., aggregates (Taylor et al. 2017)) the network’s edges
to form a sequence of time windows with interactions in each window. This approach
is motivated by the use of a multiplex-network representation to detect communities
in temporal networks through maximization of multilayer modularity (Mucha and
Porter 2010; Bassett et al. 2013; Weir et al. 2017; Pamfil et al. 2019). There is also
widespread interest in generalizing centrality measures to multilayer networks more
generally (Magnani and Rossi 2011; Ng et al. 2011; De Domenico et al. 2013; Halu
et al. 2013; Magnani et al. 2013; Solá et al. 2013; Battiston et al. 2014; Solé-Ribalta
et al. 2014; Chakraborty and Narayanam 2016; Solé-Ribalta et al. 2016; Tavassoli
and Zweig 2016; DeFord 2017; DeFord and Pauls 2017; Ding and Li 2018; Rahmede
et al. 2017; Spatocco et al. 2018; Tudisco et al. 2018).
Our supracentrality framework generalizes a family of time-independent network
centralities that are called eigenvector-based centralities, which are defined by the
property of calculating centralities as the entries of an eigenvector (the so-called
‘dominant’ eigenvector) that corresponds to the largest-magnitude eigenvalue (the
‘dominant’ eigenvalue¹) of a centrality matrix C(A), which one defines by some
function of a network’s adjacency matrix A. Different choices for the centrality
matrix recover some of the most popular centrality measures, including eigenvector
centrality (by using C(A) = A) (Bonacich 1972), hub and authority scores (by using
C(A) = AA^T for hubs and A^T A for authorities) (Kleinberg 1999), and PageRank
(Page et al. 1999) (see Sect. 17.2.2). Given a discrete-time temporal network in the
form of a sequence of adjacency matrices A^{(t)} ∈ R^{N×N} for t ∈ {1, . . . , T}, where
A^{(t)}_{ij} denotes a directed edge from entity i to entity j in time layer t, examining
supracentralities involves two steps:
1. Construct a supracentrality matrix C(ω), which couples centrality matrices
C(A(t) ) of the individual time layers t = 1, t = 2, t = 3, …
2. Compute and interpret the dominant eigenvector of C(ω).
For a temporal network with N nodes and T time layers, C(ω) is a square matrix
of size N T × N T . We require the set of nodes to be the same for all time layers.
However, it is easy to accommodate the appearance and disappearance of nodes
by including extra instances of the entities in layers in which they otherwise do
not appear (but without including any associated intralayer edges). The parameter ω

¹ Technically, we study the eigenvector that is associated with the largest positive eigenvalue of
an irreducible, nonnegative matrix. We assume that this eigenvalue has a larger magnitude than all
other eigenvalues.


Fig. 17.1 Multiplex-network representations of a discrete-time temporal network. Given a temporal network with N = 4 nodes and T = 6 times, we represent the network at each time by a ‘time layer’ with adjacency matrix A^{(t)} ∈ R^{N×N} for t ∈ {1, . . . , T}. a, b We represent the network as a multiplex network by coupling the layers with ‘interlayer edges’ (gray edges) that we encode in an interlayer-adjacency matrix Ã ∈ R^{T×T}. Panel a illustrates interlayer coupling in the form of an undirected chain, and panel b depicts directed coupling between layers. In panels c and d, we show visualizations of the networks that are associated with the matrix Ã for panels a and b, respectively. In panel d, there are directed interlayer edges between consecutive time layers, so these interlayer edges respect the direction of time. Additionally, we construct connections of weight γ > 0 between corresponding nodes from all pairs of layers to ensure that Ã corresponds to a strongly connected network, which in turn ensures that the centralities are positive and unique. By analogy to ‘node teleportation’ in PageRank (Gleich 2015), we refer to this coupling as ‘layer teleportation’

scales the weights of the interlayer couplings to control the strength of the connection
between time layers. It thus provides a ‘tuning knob’ to control how rapidly centrality
trajectories can change with time.
An important aspect of the first step is that one chooses a topology to couple layers
to each other. To do this, we define an interlayer-adjacency matrix Ã ∈ R^{T×T}, where
the entry Ã_{tt′} encodes the coupling from time layer t to time layer t′. In Fig. 17.1,
we illustrate two possible choices for coupling the time layers. In the upper row,
Ã ∈ R^{T×T} encodes an undirected chain, which couples the time layers with adjacent-
in-time coupling but neglects the directionality of time. In the lower row, by contrast,
we couple the time layers with a directed chain that reflects the directionality of time.
In addition to the directed, time-respecting edges, Fig. 17.1d also illustrates that we
include weighted, undirected edges between corresponding nodes for all pairs of
layers. This implements ‘layer teleportation’, which is akin to the well-known ‘node
teleportation’ of the PageRank algorithm (Gleich 2015). Just like node teleportation,
layer teleportation ensures that supracentralities are well-behaved (specifically, that
they are positive and unique).
The second step to examine supracentralities involves studying the dominant
right eigenvector of the supracentrality matrix C(ω), which characterizes the joint
centrality of each node-layer pair (i, t)—that is, the centrality of node i in time
layer t—and thus reflects the importances of both node i and layer t. From the joint

centralities, one can calculate marginal centralities for only the nodes or only the
time layers. One can also calculate conditional centralities that measure a node’s
centrality at time t relative only to the other nodes’ centralities at time t. These
concepts, which are inspired by ideas from probability theory, allow one to develop
a rich characterization for how node centralities change with time.
In this chapter, we describe the supracentrality framework that we developed in
Taylor et al. (2017, 2021) and extend these papers with further numerical explorations
of how interlayer-coupling topology affects supracentralities. We apply this approach
to a data set, which we studied in Taylor et al. (2017) and is available at Taylor (2019),
that encodes the graduation and hiring of Ph.D. recipients between mathematical-
sciences doctoral programs in the United States. We focus our attention on five top
universities and examine how they are affected by the value of ω and the choice of Ã.
Specifically, we compare the two strategies for interlayer coupling in Fig. 17.1 and
explore the effect of reversing the directions of all directed edges. Our experiments
reveal how to use ω and à to tune the extent to which centrality trajectories of nodes
are influenced by the oldest time layers, the newest time layers, and the direction of
time.

17.2 Background Information

We now give some background information on multiplex networks and eigenvector-based centralities. Our supracentrality framework involves representing a tempo-
ral network as a multiplex network (see Sect. 17.2.1). In Sect. 17.2.2, we review
eigenvector-based centrality measures.

17.2.1 Analysis of Temporal Networks with Multiplex-Network Representations

We study discrete-time temporal networks, for which we provide a formal definition.


Definition 17.1 (Discrete-Time Temporal Network) A discrete-time temporal network consists of a set V = {1, . . . , N} of nodes and sets E^{(t)} of weighted edges that we index (using t) in a sequence of network layers. We denote such a network either by G(V, {E^{(t)}}) or by the sequence {A^{(t)}} of adjacency matrices, where A^{(t)}_{ij} = w^t_{ij} if (i, j, w^t_{ij}) ∈ E^{(t)} and A^{(t)}_{ij} = 0 otherwise.
As we illustrated in Fig. 17.1, we represent a discrete-time temporal network as
a multiplex network with weighted and possibly directed coupling between the time
layers. We restrict our attention to the following type of multiplex network.
Definition 17.2 (Uniformly and Diagonally Coupled (i.e., Layer-Coupled) Multiplex Network) Let G(V, {E^{(t)}}, Ẽ) be a T-layer multilayer network with node set V = {1, . . . , N} and interactions between node-layer pairs that are encoded by the sets {E^{(t)}} of weighted edges, where (i, j, w^t_{ij}) ∈ E^{(t)} if and only if there is an edge (i, j) with weight w^t_{ij} in layer t. The set Ẽ = {(s, t, w̃_{st})} encodes the topology and weights for coupling separate instantiations of the same node between a pair (s, t) ∈ {1, . . . , T} × {1, . . . , T} of layers. Equivalently, one can encode a multiplex network as a set {A^{(t)}} of adjacency matrices, such that A^{(t)}_{ij} = w^t_{ij} if (i, j, w^t_{ij}) ∈ E^{(t)} and A^{(t)}_{ij} = 0 otherwise, along with an interlayer-adjacency matrix Ã with entries Ã_{st} = w̃_{st} if (s, t, w̃_{st}) ∈ Ẽ and Ã_{st} = 0 otherwise.

The coupling in Definition 17.2 is ‘diagonal’ because the only interlayer edges are
ones that couple a node in one layer to that same node in another layer. It is ‘uniform’
because the coupling between two layers is identical for all nodes in those two layers.
A multilayer network with both conditions is called ‘layer-coupled’ (Kivelä et al.
2014).
As we illustrated in Fig. 17.1, we focus our attention on two choices for coupling
time layers:
(A) Ã encodes an undirected chain:

    Ã_{tt′} = 1 if |t′ − t| = 1, and Ã_{tt′} = 0 otherwise;    (17.1)

(B) Ã encodes a directed chain with layer teleportation:

    Ã_{tt′} = 1 + γ if t′ − t = 1, and Ã_{tt′} = γ otherwise,    (17.2)

    where γ > 0 is the layer-teleportation probability. In Sect. 17.4, we compare the effects on centrality trajectories of these two choices of Ã.
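For illustration, the following Python sketch (our own helper functions, assuming numpy; it is not code from the chapter and it follows the reconstruction of Eqs. (17.1) and (17.2) given above) builds these two interlayer-adjacency matrices:

import numpy as np

def undirected_chain(T):
    """Interlayer-adjacency matrix of Eq. (17.1): an undirected chain of T time layers."""
    A_tilde = np.zeros((T, T))
    for t in range(T - 1):
        A_tilde[t, t + 1] = 1.0
        A_tilde[t + 1, t] = 1.0
    return A_tilde

def directed_chain_with_teleportation(T, gamma):
    """Interlayer-adjacency matrix of Eq. (17.2): every pair of layers is coupled with
    the layer-teleportation weight gamma, and consecutive layers get an extra directed
    edge of weight 1 from layer t to layer t + 1 (as in the reconstruction above)."""
    A_tilde = np.full((T, T), float(gamma))
    for t in range(T - 1):
        A_tilde[t, t + 1] += 1.0
    return A_tilde

For example, undirected_chain(6) gives the coupling in the top row of Fig. 17.1, and directed_chain_with_teleportation(6, 0.001) gives a directed chain of the kind shown in the bottom row.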

17.2.2 Eigenvector-Based Centrality for Time-Independent Networks

Arguably the most notable—and certainly the most profitable—type of centrality is PageRank, which provided the mathematical foundation for the birth of the web-
search algorithm of the technology giant Google (Brin and Page 1998; Page et al.
1999; Gleich 2015). PageRank quantifies the importances of the nodes of a network
(e.g., a directed network that encodes hyperlinks between web pages) by computing
the dominant eigenvector of the ‘PageRank matrix’ (i.e., ‘Google matrix’ (Langville
and Meyer 2006))

C^{(PR)} = σ A^T D^{−1} + (1 − σ) N^{−1} 1 1^T ,    (17.3)



where N is the number of nodes, 1 = [1, . . . , 1]^T is a length-N vector of ones, and
A is an adjacency matrix in which each entry A_{ij} encodes a directed (and possibly
weighted) edge from node i to node j. The matrix D = diag[d_1^{out}, . . . , d_N^{out}] is a
diagonal matrix that encodes the node out-degrees d_i^{out} = Σ_j A_{ij}.

The PageRank matrix’s dominant right eigenvector is a natural choice for ranking
nodes, as it encodes a random walk’s stationary distribution (which estimates the
fraction of web surfers on each web page in the context of a web-search engine²).
The term A^T D^{−1} is a transition matrix that operates on column vectors that encode
the densities of random walkers (Masuda et al. 2017). The term N^{−1} 1 1^T is a telepor-
tation matrix; it is a transition matrix of an unbiased random walk on a network with
uniform all-to-all coupling between nodes. The teleportation parameter σ ∈ (0, 1)
implements a linear superposition of the two transition matrices and yields an irre-
ducible matrix, even when the transition matrix A^T D^{−1} is reducible. Because we
introduced the concept of layer teleportation in Sect. 17.2.1, we henceforth refer to
the traditional teleportation in PageRank as ‘node teleportation’.
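As a minimal illustration (our own numpy sketch, not code from the chapter; it assumes that every node has at least one out-edge, so it does not handle dangling nodes), the PageRank matrix of Eq. (17.3) can be assembled as follows:

import numpy as np

def pagerank_matrix(A, sigma=0.85):
    """PageRank (Google) matrix of Eq. (17.3):
    C^(PR) = sigma * A^T D^{-1} + (1 - sigma) * N^{-1} * 1 1^T,
    where A[i, j] encodes a directed edge from node i to node j."""
    A = np.asarray(A, dtype=float)
    N = A.shape[0]
    d_out = A.sum(axis=1)                    # out-degrees d_i^out = sum_j A_ij
    transition = A.T / d_out                 # column j of A^T D^{-1} is divided by d_j^out
    teleportation = np.ones((N, N)) / N      # uniform node-teleportation matrix
    return sigma * transition + (1.0 - sigma) * teleportation

The dominant right eigenvector of this matrix then gives the PageRank centralities of Definition 17.4.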
It is common to define the PageRank matrix as the transpose of Eq. (17.3); in
that case, one computes the dominant left eigenvector instead of the dominant right
eigenvector. However, we use the right-eigenvector convention to be consistent with a
broader class of centrality measures called ‘eigenvector-based centralities’, in which
one encodes node importances in the elements of the dominant eigenvector of some
centrality matrix. In addition to PageRank, prominent examples of eigenvector-based
centralities include (vanilla) eigenvector centrality (Bonacich 1972) and hub and
authority (i.e., HITS) centralities (Kleinberg 1999). We now provide formal defini-
tions.

Definition 17.3 (Eigenvector-Based Centrality) Let C = C(A) be a centrality matrix, which we obtain from some function C : R^{N×N} → R^{N×N} of the adjacency matrix A, of a network G(V, E). Consider the dominant right eigenvector u, which satisfies

Cu = λ_max u ,    (17.4)

where λ_max ∈ R_+ is the largest eigenvalue of C. (This eigenvalue is guaranteed to be positive.) The ith entry u_i specifies the eigenvector-based centrality of node i ∈ V that is associated with the function C.

Definition 17.4 (PageRank (Page et al. 1999; Gleich 2015)) When C is given by Eq. (17.3), we say that Eq. (17.4) yields PageRank centralities {u_i^{(PR)}}.

Remark 17.1 It is also common to compute PageRank centralities from a left eigen-
vector (Gleich 2015). In the present chapter, we use a right-eigenvector formulation
to be consistent with the other eigenvector-based centralities. One can recover the
left-eigenvector formulation by taking the transpose of Eq. (17.4).

2 PageRank has had intellectual impact well beyond web searches (Gleich 2015).

17.3 Supracentrality Framework

We now describe the supracentrality framework that we presented in Taylor et al. (2021). The present formulation generalizes our formulation of supracentrality from
Taylor et al. (2017) that required interlayer coupling to take the form of an undirected
chain. (See the top row of Fig. 17.1.) To aid our presentation, we summarize our
mathematical notation in Table 17.1.

17.3.1 Supracentrality Matrices

We first describe a supracentrality matrix from Taylor et al. (2021).

Definition 17.5 (Supracentrality Matrix) Let {C^{(t)}} be a set of T centrality matrices for a discrete-time temporal network with a common set V = {1, . . . , N} of nodes, and assume that C^{(t)}_{ij} ≥ 0. Let Ã, with entries Ã_{ij} ≥ 0, be a T × T interlayer-adjacency matrix that encodes the interlayer couplings. We define a family of supracentrality matrices C(ω), which are parameterized by the interlayer-coupling strength ω ≥ 0, of the form

C(ω) = Ĉ + ω Â =

⎡ C^{(1)}    0        0      ⋯ ⎤        ⎡ Ã_{11} I   Ã_{12} I   Ã_{13} I   ⋯ ⎤
⎢ 0        C^{(2)}    0      ⋯ ⎥        ⎢ Ã_{21} I   Ã_{22} I   Ã_{23} I   ⋯ ⎥
⎢ 0          0      C^{(3)}  ⋱ ⎥  + ω   ⎢ Ã_{31} I   Ã_{32} I   Ã_{33} I   ⋯ ⎥ ,    (17.5)
⎣ ⋮          ⋮        ⋱      ⋱ ⎦        ⎣  ⋮          ⋮          ⋮        ⋱ ⎦

where Ĉ = diag[C^{(1)}, . . . , C^{(T)}] and Â = Ã ⊗ I is the Kronecker product of Ã and I.

Table 17.1 Summary of our mathematical notation for objects with different dimensions

  Typeface                                        Class     Dimension
  M  (supracentrality scale; e.g., C(ω), Â)       Matrix    NT × NT
  M  (layer scale; e.g., C^{(t)}, A^{(t)}, I)     Matrix    N × N
  M̃  (interlayer scale; e.g., Ã)                  Matrix    T × T
  v  (supracentrality scale; e.g., v(ω))          Vector    NT × 1
  v  (layer scale; e.g., v^{(1,t)})               Vector    N × 1
  ṽ  (interlayer scale; e.g., ṽ^{(1)})            Vector    T × 1
  M_{ij}                                          Scalar    1
  v_i                                             Scalar    1

For layer t, the matrix C^{(t)} can be any matrix whose dominant eigenvector is of
interest. In our discussion, we focus on PageRank centrality matrices (see Defini-
tion 17.4), but one can alternatively choose eigenvector centrality (Bonacich 1972),
hub and authority centralities (Kleinberg 1999), or something else.
The NT × NT supracentrality matrix C(ω) encodes the effects of two distinct
types of connections: the layer-specific centrality entries {C^{(t)}_{ij}} in the diagonal blocks
relate centralities between nodes in layer t; and entries in the off-diagonal blocks
encode coupling between layers. The matrix Â = Ã ⊗ I implements uniform and
diagonal coupling. The matrix I encodes diagonal coupling; any two layers t and
t′ are uniformly coupled because all interlayer edges between them have the same
weight ω Ã_{tt′}.
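To make the construction concrete, here is a short numpy/scipy sketch (our own illustration, with our own function name) that assembles the supracentrality matrix of Eq. (17.5) from a list of layer centrality matrices and an interlayer-adjacency matrix:

import numpy as np
from scipy.linalg import block_diag

def supracentrality_matrix(C_layers, A_tilde, omega):
    """Supracentrality matrix of Eq. (17.5): C(omega) = C_hat + omega * (A_tilde kron I).

    C_layers: list of T layer centrality matrices, each N x N (e.g., PageRank matrices).
    A_tilde : T x T interlayer-adjacency matrix (e.g., from Eq. (17.1) or Eq. (17.2)).
    omega   : interlayer-coupling strength."""
    N = C_layers[0].shape[0]
    C_hat = block_diag(*C_layers)            # block-diagonal intralayer part
    A_hat = np.kron(A_tilde, np.eye(N))      # uniform, diagonal interlayer coupling
    return C_hat + omega * A_hat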

17.3.2 Joint, Marginal, and Conditional Centralities

As we indicated earlier, we study the dominant right-eigenvalue equation for supracentrality matrices. That is, we solve the eigenvalue equation

C(ω) v(ω) = λ_max(ω) v(ω) ,    (17.6)

and we interpret entries in the dominant right eigenvector v(ω) as scores that measure
the importances of node-layer pairs {(i, t)}. Because the vector v(ω) has a block
form—its first N entries encode the joint centralities for layer t = 1, its next N
entries encode the joint centralities for layer t = 2, and so on—it is useful to reshape
v(ω) into a matrix.

Definition 17.6 (Joint Centrality of a Node-Layer Pair (Taylor et al. 2017)) Let
C(ω) be a supracentrality matrix given by Definition 17.5, and let v(ω) be its dom-
inant right eigenvector. We encode the joint centrality of node i in layer t via the
N × T matrix W(ω) with entries

W_{it}(ω) = v_{N(t−1)+i}(ω) .    (17.7)

We refer to W_{it}(ω) as a ‘joint centrality’ because it reflects the importance of both node i and layer t.

Definition 17.7 (Marginal Centralities of Nodes and Layers (Taylor et al. 2017))
Let W(ω) encode the joint centralities given by Definition 17.6. We define the
marginal layer centrality (MLC) and marginal node centrality (MNC), respectively,
by

x_t(ω) = Σ_i W_{it}(ω) ,

x̂_i(ω) = Σ_t W_{it}(ω) .    (17.8)

Definition 17.8 (Conditional Centralities of Nodes and Layers (Taylor et al. 2017)) Let {W_{it}(ω)} be the joint centralities given by Definition 17.6; and let {x_t(ω)} and {x̂_i(ω)}, respectively, be the marginal layer and node centralities given by Definition 17.7. We define the conditional centralities of nodes and layers by

Z_{it}(ω) = W_{it}(ω)/x_t(ω) ,

Ẑ_{it}(ω) = W_{it}(ω)/x̂_i(ω) ,    (17.9)

where Z_{it}(ω) gives the centrality of node i conditioned on layer t and Ẑ_{it}(ω) gives the centrality of layer t conditioned on node i. The quantity Z_{it}(ω) indicates the importance of node i relative just to the other nodes in layer t.
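Continuing the earlier sketch, the following illustrative Python function (our own naming; it uses a dense eigensolver and assumes that the conditions of Theorem 17.1 below hold, so the dominant eigenvector is positive) computes the joint, marginal, and conditional centralities of Definitions 17.6–17.8:

import numpy as np

def supracentralities(C_omega, N, T):
    """Joint, marginal, and conditional centralities from the dominant right
    eigenvector of an NT x NT supracentrality matrix C(omega)."""
    vals, vecs = np.linalg.eig(C_omega)
    v = vecs[:, np.argmax(vals.real)].real
    v = np.abs(v) / np.abs(v).sum()           # 1-norm normalization
    W = v.reshape(T, N).T                     # joint centralities: W[i, t] = v_{N(t-1)+i}
    x_layer = W.sum(axis=0)                   # marginal layer centralities (MLC)
    x_node = W.sum(axis=1)                    # marginal node centralities (MNC)
    Z_node = W / x_layer[np.newaxis, :]       # conditional node centralities Z_{it}
    Z_layer = W / x_node[:, np.newaxis]       # conditional layer centralities Z-hat_{it}
    return W, x_layer, x_node, Z_node, Z_layer

Applied to the example of Fig. 17.1a (with PageRank layer matrices and ω = 1), the array W plays the role of the joint-centrality table shown in Fig. 17.2.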

We ensure that the supracentralities are well-defined (i.e., unique, positive, and
finite) with the following theorem.

Theorem 17.1 (Uniqueness and Positivity of Supracentralities (Taylor et al. 2021)) Let C(ω) be a supracentrality matrix given by Eq. (17.5). Additionally, suppose that Ã is an adjacency matrix for a strongly connected graph and that Σ_t C^{(t)} is an irreducible, nonnegative matrix. It then follows that C(ω) is irreducible, nonnegative, and has a simple largest positive eigenvalue λ_max(ω), with corresponding left eigenvector u(ω) and right eigenvector v(ω) that are each unique and positive. The centralities {W_{it}(ω)}, {x_t(ω)}, {x̂_i(ω)}, {Z_{it}(ω)}, and {Ẑ_{it}(ω)} are then positive and finite. If we also assume that C(ω) is aperiodic, it follows that λ_max(ω) is a unique dominant eigenvalue.

In Fig. 17.2, we show the joint and marginal centralities for the network in
Fig. 17.1a. We have normalized the vector v(ω) using the 1-norm.

                                layer index
  node index      1        2        3        4        5        6       MNC
      1         0.0305   0.0461   0.0493   0.0460   0.0360   0.0195   0.2272
      2         0.0198   0.0368   0.0480   0.0501   0.0471   0.0308   0.2326
      3         0.0249   0.0491   0.0592   0.0520   0.0402   0.0212   0.2466
      4         0.0238   0.0465   0.0660   0.0744   0.0552   0.0275   0.2935
     MLC        0.0990   0.1784   0.2225   0.2225   0.1784   0.0990

Fig. 17.2 Joint centralities {W_{it}(ω)} of Definition 17.6 (white cells), with corresponding marginal layer centralities (MLCs) {x_t(ω)} and marginal node centralities (MNCs) {x̂_i(ω)} from Definition 17.7 (gray cells), for the network in panel a of Fig. 17.1 with ω = 1. The centrality matrices of the layers are PageRank centrality matrices (see Eq. (17.3)) with a node-teleportation parameter of σ = 0.85

17.4 Application to a Ph.D. Exchange Network

We apply our supracentrality framework to study centrality trajectories for a temporal network that encodes the graduation and hiring of mathematicians between N = 231
mathematical-sciences doctoral programs in the United States during the years 1946–
2010 (Taylor et al. 2017). Each edge A^{(t)}_{ij} of the temporal network encodes the number
of Ph.D. recipients who graduated from university j in year t and subsequently
supervised a Ph.D. student at university i. The edge directions, where A^{(t)}_{ij} is an edge
from university i to university j, point in the opposite direction to the flow of people
who earn their Ph.D. degrees. We define edge directions in this way to indicate that
university i effectively selects the output of university j when hiring someone who
received their Ph.D. from university j (Burris 2004; Myers et al. 2011; Clauset et al.
2015). With this convention for the direction of edges, {C(t) } encodes the PageRank
matrices of the layers; the highest-ranking universities are the ones that are good
sources for the flow of Ph.D. recipients. The network, which we constructed using
data from the Mathematics Genealogy Project (2009), is available at Taylor (2019).
We focus our discussion on five universities: Harvard, Massachusetts Institute of
Technology (MIT), Princeton, Stanford, and University of California, Berkeley (UC
Berkeley). They have the largest PageRank centralities (using a node-teleportation
parameter of σ = 0.85) for a temporally aggregated network with adjacency matrix
Σ_t A^{(t)}. In all of our experiments, we assume that the layers’ centrality matrices are
given by PageRank matrices (17.3). As in our other explorations (Taylor et al. 2017,
2021), we vary the interlayer-coupling strength ω to adjust how rapidly centralities
change with time. In the present work, our primary focus is investigating the effects
on supracentralities of undirected and directed interlayer coupling. See Eqs. (17.1)
and (17.2) for the definitions of these interlayer-coupling schemes, and see Fig. 17.1
for visualizations of these two types of interlayer coupling.
We first consider undirected interlayer coupling, so we define à by Eq. (17.1).
In Fig. 17.3, we plot the joint and conditional centralities for the five universities.
The columns show results for interlayer-coupling strengths ω ∈ {1, 10, 10², 10³}.
In the bottom row, we see that progressively larger values of ω yield progressively
smoother conditional-centrality trajectories. In the top row, we observe that as one
increases ω, the joint centrality appears to limit to one arc of a sinusoidal curve. We
prove this result in Sect. 17.5. The most striking results appear in the bottom row of
the third column. Based on conditional node centrality, we see that MIT becomes the
top-ranked university in the 1950s and then remains so in our data set. Stanford and
UC Berkeley gradually develop larger conditional centralities over the 64 years in
the data set, whereas the conditional centralities of Princeton and Harvard decrease
gradually over this period. The conditional centralities of these five universities are in
the top-10 values among all universities in all years of the data set. This is consistent
with our results in Taylor et al. (2017, 2021).
We now examine directed interlayer coupling, and we let à encode a directed
chain with layer teleportation. See Eq. (17.2) for the specific formula and the bot-
tom row of Fig. 17.1 for an associated visualization. In each panel of Fig. 17.4, we

(Figure legend: Harvard, MIT, Princeton, Stanford, UC Berkeley)

Fig. 17.3 Trajectories of node centralities using undirected interlayer coupling for mathematical-
sciences Ph.D. programs at five top universities. The top and bottom rows illustrate joint centralities
and conditional node centralities, respectively, that we compute with PageRank centrality matrices
with a node-teleportation parameter of σ = 0.85 and undirected interlayer-adjacency matrix Ã
given by Eq. (17.1) with ω ∈ {1, 10, 100, 1000}. The dotted black curve in the rightmost top panel
is the result of an asymptotic approximation that we present in Sect. 17.5

plot the joint centralities and conditional node centralities. The columns give results
for interlayer-coupling strengths ω ∈ {0.1, 1, 10, 100}, and the three panels indicate
different layer-teleportation probabilities: (a) γ = 0.0001; (b) γ = 0.001; and (c)
γ = 0.01. The dotted black curves in the rightmost column indicate large-ω asymp-
totic approximations that we will present in Sect. 17.5.
To understand the main effect of directed interlayer coupling, we first compare the
joint centralities in Fig. 17.4 to those in Fig. 17.3. To help our discussion, we focus
on the rightmost column of the two figures. We observe that the joint-centrality
trajectories tend to decay with time for directed interlayer coupling, whereas they
have peaks and attain their largest values near t = 1978 for undirected interlayer
coupling. Therefore, directed interlayer coupling tends to “boost” the joint centralities
of earlier time layers in comparison to undirected coupling. Comparing panels a–c
of Fig. 17.4 (and again focusing on the rightmost column), we observe that the decay
is fastest for γ = 0.0001 (panel a) and slowest for γ = 0.01 (panel c).
The conditional centralities are also affected by directed interlayer coupling. Con-
sider ω = 10 in Fig. 17.3, and observe that the conditional centrality of Princeton
decreases monotonically with time. By contrast, observe in Fig. 17.4a,b for ω = 10
that the conditional centrality of Princeton now decreases between t = 1946 and
about t = 1988, but then it increases.
For our last experiment, we examine how reversing the direction of the interlayer
edges changes the results of our supracentrality calculations. We repeat the previous
experiment with directed interlayer edges, except that we set à to be the transpose of
the matrix that we defined by Eq. (17.2). One motivation is that for some applications,
the most recent time layers are more important than the earliest time layers. One can
incorporate this idea into our supracentrality framework by reversing the direction of
the interlayer edges. In Fig. 17.5, we plot the same quantities as in Fig. 17.4, except

(Panels a–c: γ = 0.0001, 0.001, 0.01. Figure legend: Harvard, MIT, Princeton, Stanford, UC Berkeley)

Fig. 17.4 Trajectories of node centralities using directed interlayer coupling for mathematical-
sciences Ph.D. programs at five top universities. This figure is similar to Fig. 17.3, except that the
interlayer-adjacency matrix à is now given by Eq. (17.2), which corresponds to a directed chain with
layer teleportation with probability γ . Panels a, b, and c show results for γ = 0.0001, γ = 0.001,
and γ = 0.01, respectively. The dotted black curves in the rightmost top subpanels of panels a–c
are results of an asymptotic approximation that we present in Sect. 17.5. For sufficiently large ω
and sufficiently small γ , the joint centralities decrease with time

that now we take the directed interlayer edges to have the opposite direction (so
we have reversed the arrow of time). Observe that the joint centralities now tend to
increase with time; by contrast, in Fig. 17.4, they tend to decrease with time. These
trends are most evident in the rightmost columns. We also observe differences in
the conditional centralities. For example, focusing on ω = 10 in the third column
of Fig. 17.5, we see that Princeton never has the largest conditional centrality. By
contrast, for ω = 10 in Figs. 17.3 and 17.4a,b, Princeton has the largest conditional
centrality for the earliest time steps (specifically, for t ∈ {1946, . . . , 1954}).
Understanding how the weights, topologies, and directions of interlayer coupling
affect supracentralities is essential to successfully deploying supracentrality analysis
to reveal meaningful insights. The above experiments highlight that one can tune the
weights and topology of interlayer coupling to emphasize either earlier or later time
layers. Specifically, one can adjust the parameters ω and γ , as well as the direction
of the interlayer edges, to cater a study to particular data sets and particular research
questions. In our investigation in this section, we considered both the case in which Ã
is given by Eq. (17.2) and the case in which it is given by the transpose of the matrix
that we determine from Eq. (17.2). It is worth considering how these different choices
of interlayer edge directions are represented in the supracentrality matrix C(ω) and
the consequences of these choices. Specifically, each layer’s PageRank matrix C^{(t)} is
defined in Eq. (17.3) using the transpose of the layer’s adjacency matrix A^{(t)}, yet when
coupling the centrality matrices, we do not take the transpose of à when defining
C(ω) in Eq. (17.5). Accordingly, one may worry that the matrix C(ω) effectively
acts in the forward direction for the intralayer edges but in the opposite direction
for the interlayer edges. However, this does not lead to any inherent contradiction,
as the meanings of the directions of these two types of edges are fundamentally
different. The direction of the intralayer edges dictates the flow of random walkers,
whereas the direction of the interlayer edges couples the centralities of the different
layers. In other applications, it may be necessary to encode the directions of the
interlayer and intralayer edges in the same way, but there is no reason why one
cannot encode the directions of interlayer and intralayer edges in different ways in a
supracentrality formalism. As we have demonstrated by considering both à and its
transpose—and thus by treating the effect of the interlayer edges in opposite ways in
these two calculations—both uses are meaningful. They also probe different aspects
of temporal data.

17.5 Asymptotic Behavior for Small and Large Interlayer-Coupling Strength ω

In this section, we summarize the asymptotic results from Taylor et al. (2021) that
reveal the behavior of supracentralities in the limit of small and large ω. In our present
discussion, we focus on dominant right eigenvectors.

(Panels a–c: γ = 0.0001, 0.001, 0.01. Figure legend: Harvard, MIT, Princeton, Stanford, UC Berkeley)

Fig. 17.5 Trajectories of node centralities using reversed directed interlayer coupling for
mathematical-sciences Ph.D. programs at five top universities. This figure is identical to Fig. 17.4,
except that à is now given by the transpose of the matrix from Eq. (17.2), such that the directed
chain points backwards in time. For sufficiently large ω and sufficiently small γ , the joint centralities
now increase with time

To motivate our asymptotic analysis, consider the top-right subpanels in each panel of Figs. 17.3, 17.4, and 17.5. In each of these subpanels, we plot (in dotted
black curves) the results of an asymptotic analysis of the dominant right eigenvector
ṽ^{(1)} of Ã for the joint centrality of MIT in the limit of large ω. We observe excel-
lent agreement with our numerical calculations. Therefore, for sufficiently large ω,
one can understand the effects of both undirected and directed interlayer couplings
(as encoded in an interlayer-adjacency matrix Ã) by examining the dominant right
eigenvector of Ã. For large values of ω, this eigenvector captures the limit of the joint
centralities as a function with a peak for undirected coupled (see Fig. 17.3), decay in
time for directed coupling (see Fig. 17.4), and growth in time for directed coupling
when reversing the arrow of time (see Fig. 17.5).

17.5.1 Layer Decoupling in the Limit of Small ω

We begin with some notation. Let μ̃_1 be the dominant eigenvalue (which we assume
to be simple) of Ã, and let ũ^{(1)} and ṽ^{(1)} denote its corresponding left and right
eigenvectors. Given a set {C^{(t)}} of centrality matrices, we let μ^{(t)}_1 be the dominant
eigenvalue (which we also assume to be simple) of C^{(t)}; the vectors u^{(1,t)} and v^{(1,t)}
are the corresponding left and right eigenvectors. Let {μ^{(t)}_1} denote the set of spectral
radii, where λ_max(0) = max_t μ^{(t)}_1 is the maximum eigenvalue over all layers. (Recall
that λ_max(ω) is the dominant eigenvalue of the supracentrality matrix C(ω).) Let
P = {t : μ^{(t)}_1 = λ_max(0)} denote the set of layers whose centrality matrices have an
eigenvalue that achieves the maximum. When the layers’ centrality matrices {C^{(t)}}
are PageRank matrices given by Eq. (17.3), it follows that μ^{(t)}_1 = 1 for all t (i.e.,
P = {1, . . . , T}), the corresponding left eigenvector is u^{(1,t)} = [1, . . . , 1]^T/N, and
v^{(1,t)} is the PageRank vector for layer t. Furthermore, for each t, we define the length-
NT “block” vector v^{(1,t)} = e^{(t)} ⊗ v^{(1,t)} (a supracentrality-scale vector), which consists of zeros in all blocks except
for block t, which equals v^{(1,t)}. The vector e^{(t)} is a length-T unit vector that consists
of zeros in all entries except for entry t, which is 1.
We now present a theorem from Taylor et al. (2021), although we restrict our
attention to the part of it that pertains to the right dominant eigenvector.

Theorem 17.2 (Weak-Coupling Limit of Dominant Right Eigenvectors (Taylor et al. 2021)) Let v(ω) be the dominant right eigenvector of a supracentrality matrix that is normalized using the 1-norm and satisfies the assumptions of Theorem 17.1. Additionally, let P = {t : μ^{(t)}_1 = λ_max(0)} denote the set of indices associated with the eigenvalues of C^{(t)} that equal the largest eigenvalue λ_max(0) of C(0). We assume that each layer’s dominant eigenvalue μ^{(t)}_1 is simple. It then follows that the ω → 0⁺ limit of v(ω) satisfies

v(ω) → Σ_{t∈P} α_t v^{(1,t)} ,    (17.10)

where the vector α = [α_1, . . . , α_T]^T has nonnegative entries and is the unique solution of the dominant eigenvalue equation

X α = λ_1 α .    (17.11)

The eigenvalue λ_1 needs to be determined, and the entries of X are

X_{tt′} = Ã_{t,t′} χ(t) χ(t′) ⟨u^{(1,t)}, v^{(1,t′)}⟩ / ⟨u^{(1,t)}, v^{(1,t)}⟩ ,    (17.12)

where χ(t) = Σ_{t′∈P} δ_{tt′} is an indicator function: χ(t) = 1 if t ∈ P and χ(t) = 0 otherwise. The vector α must also be normalized to ensure that the right-hand side of Eq. (17.10) is normalized (by setting ‖α‖_p = 1 for normalization using a p-norm).

17.5.2 Layer Aggregation in the Limit of Large ω

To study the ω → ∞ limit, it is convenient to divide Eq. (17.6) by ω and define ε = 1/ω to obtain

C̃(ε) = ε C(ε^{−1}) = ε Ĉ + Â ,    (17.13)

which has right eigenvectors ṽ(ε) that are identical to those of C(ω) (specifically, ṽ(ε) = v(ε^{−1})). Its eigenvalues {λ̃_i} are scaled by ε, so λ̃_i(ε) = ε λ_i(ε^{−1}).
Before presenting results from Taylor et al. (2021), we define a few additional concepts. Let ṽ^{(1,j)} = ẽ^{(j)} ⊗ ṽ^{(1)} denote a block vector that consists of zeros in all blocks except for block j, which equals the dominant right eigenvector ṽ^{(1)} of Ã. The vector ẽ^{(j)} is a length-N unit vector that consists of zeros in all entries except for entry j, which is 1. We also define the stride permutation matrix

P_{kl} = 1 if l = ⌈k/N⌉ + T[(k − 1) mod N], and P_{kl} = 0 otherwise,    (17.14)

where the ceiling function ⌈θ⌉ denotes the smallest integer that is at least θ and ‘mod’ denotes the modulus function (i.e., a mod b = a − b⌈a/b − 1⌉).
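For concreteness, here is a small Python sketch (our own, using 1-based indices k and l as in Eq. (17.14)) that constructs the stride permutation matrix and checks that it is indeed a permutation:

import numpy as np

def stride_permutation(N, T):
    """Stride permutation matrix P of Eq. (17.14)."""
    P = np.zeros((N * T, N * T))
    for k in range(1, N * T + 1):
        l = int(np.ceil(k / N)) + T * ((k - 1) % N)
        P[k - 1, l - 1] = 1.0          # shift to 0-based storage
    return P

P = stride_permutation(N=4, T=6)
assert (P.sum(axis=0) == 1).all() and (P.sum(axis=1) == 1).all()   # exactly one 1 per row and column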

Theorem 17.3 (Strong-Coupling Limit of Dominant Eigenvectors (Taylor et al. 2021)) Let Ã, μ̃_1, ũ^{(1)}, and ṽ^{(1)} be defined as above, with the same assumptions as in Theorem 17.1. It then follows that the dominant eigenvalue λ̃_max(ε) and the associated eigenvector ṽ(ε) of C̃(ε) converge as ε → 0⁺ to the following expressions:

λ̃_max(ε) → μ̃_1 ,

ṽ(ε) → Σ_j α̃_j P ṽ^{(1,j)} ,    (17.15)

where the constants {α̃_j} solve the dominant eigenvalue equation

X̃ α̃ = μ̃_1 α̃ ,    (17.16)

with

X̃_{ij} = Σ_t C^{(t)}_{ij} ũ^{(1)}_t ṽ^{(1)}_t / ⟨ũ^{(1)}, ṽ^{(1)}⟩ .    (17.17)

We normalize the vector α̃ to ensure that the right-hand side of Eq. (17.15) is normalized.

Equation (17.17) indicates that the strong-coupling limit effectively aggregates the centrality matrices {C^{(t)}} across time via a weighted average, with weights that depend on the dominant left and right eigenvectors of Ã. When Ã encodes an undirected chain (see Eq. (17.1) and the top row of Fig. 17.1), it follows that (Taylor et al. 2017)

X̃ = Σ_t C^{(t)} sin²(πt/(T + 1)) / [ Σ_{t′=1}^{T} sin²(πt′/(T + 1)) ] .    (17.18)

The dotted black curve in the top-right subpanel of Fig. 17.3 shows a scaled version of ṽ^{(1)}, which is defined by the normalized sinusoidal weightings in Eq. (17.18). The dotted black curves in the top-right subpanels of each panel of Figs. 17.4 and 17.5 also show ṽ^{(1)} (with Ã given by Eq. (17.2) and by the transpose of the matrix that we obtain from Eq. (17.2), respectively), which we scale to normalize the joint centralities.

17.6 Discussion

We presented a supracentrality framework to study how the importances of the nodes of a temporal network change with time. Our approach involves representing a tem-
poral sequence of networks as time layers of a multiplex network and using the
strength and topology of coupling between time layers to tune centrality trajectories.
A key feature of our approach is that it simultaneously yields the centralities of all
nodes at all times by computing the dominant right eigenvector of a supracentrality
matrix.
Inspired by ideas from probability theory, we examined three types of eigenvector-
based supracentralities:

(i) the joint centrality of a node-layer pair (i, t); this captures the combined impor-
tance of node i and time layer t;
(ii) the marginal centrality of a node i or time t; these capture separate importances
of a node or a time layer; and
(iii) the conditional centrality of a node i at time t; this captures the importance of
a node relative only to other nodes at that particular time.
Because our approach involves analyzing the dominant eigenvector of a central-
ity matrix, it generalizes eigenvector-based centralities, such as PageRank, hub and
authority centralities, and (vanilla) eigenvector centrality. Naturally, it is desirable
to extend supracentralities to analyze networks that are both temporal and multiplex
(Kivelä et al. 2014). Another important generalization of centrality analysis is the
study of continuous-time temporal networks and streaming network data (Grindrod
and Higham 2014; Ahmad et al. 2021), and it will be insightful to extend supracen-
tralities to such situations.

Acknowledgements We thank Petter Holme and Jari Saramäki for the invitation to write this
chapter. We thank Daryl DeFord, Tina Eliassi-Rad, Des Higham, Christine Klymko, Marianne
McKenzie, Scott Pauls, and Michael Schaub for fruitful conversations. DT was supported by the
Simons Foundation under Award #578333. PJM was supported by the James S. McDonnell Foun-
dation 21st Century Science Initiative—Complex Systems Scholar Award #220020315.

References

W. Ahmad, M.A. Porter, M. Beguerisse-Díaz, IEEE Trans. Netw. Sci. Eng. 8(2), 1759 (2021)
A. Alsayed, D.J. Higham, Chaos, Solitons Fractals 72, 35 (2015)
F. Arrigo, D.J. Higham, Appl. Netw. Sci. 2(1), 17 (2017)
D.S. Bassett, M.A. Porter, N.F. Wymbs, S.T. Grafton, J.M. Carlson, P.J. Mucha, Chaos 23(1), 013142
(2013)
F. Battiston, V. Nicosia, V. Latora, Phys. Rev. E 89(3), 032804 (2014)
C.T. Bergstrom, J.D. West, M.A. Wiseman, J. Neurosci. 28(45), 11433 (2008)
P. Bonacich, J. Math. Sociol. 2(1), 113 (1972)
S.P. Borgatti, C. Jones, M.G. Everett, Connections 21(2), 27 (1998)
S. Brin, L. Page, in Proceedings of the Seventh International World Wide Web Conference (1998),
pp. 107–117
V. Burris, Am. Sociol. Rev. 69(2), 239 (2004)
T. Callaghan, M.A. Porter, P.J. Mucha, Am. Math. Mon. 114(9), 761 (2007)
T. Chakraborty, R. Narayanam, in 2016 IEEE 32nd International Conference on Data Engineering
(ICDE) (IEEE, Piscataway, 2016), pp. 397–408
T.P. Chartier, E. Kreutzer, A.N. Langville, K.E. Pedings, SIAM J. Sci. Comput. 33(3), 1077 (2011)
I. Chen, M. Benzi, H.H. Chang, V.S. Hertzberg, J. Complex Netw. 5(2), 274 (2016)
A. Clauset, S. Arbesman, D.B. Larremore, Sci. Adv. 1(1), e1400005 (2015)
M. De Domenico, A. Solé-Ribalta, E. Cozzo, M. Kivelä, Y. Moreno, M.A. Porter, S. Gómez, A.
Arenas, Phys. Rev. X 3(4), 041022 (2013)
D.R. DeFord, in International Workshop on Complex Networks and their Applications (Springer,
Berlin, 2017), pp.1111–1123
D.R. DeFord, S.D. Pauls, J. Complex Netw. 6(3), 353 (2017)
C. Ding, K. Li, Neurocomputing 312, 263 (2018)

E. Estrada, Phys. Rev. E 88(4), 042811 (2013)


K. Faust, Soc. Netw. 19(2), 157 (1997)
C. Fenu, D.J. Higham, SIAM J. Matrix Anal. Appl. 38(2), 343 (2017)
J. Flores, M. Romance, J. Comput. Appl. Math. 330, 1041 (2018)
J.H. Fowler, S. Jeon, Soc. Netw. 30(1), 16 (2008)
J.H. Fowler, T.R. Johnson, J.F. Spriggs II, S. Jeon, P.J. Wahlbeck, Polit. Anal. 15(3), 324 (2007)
D.F. Gleich, SIAM Rev. 57(3), 321 (2015)
P. Grindrod, D.J. Higham, SIAM Rev. 55(1), 118 (2013)
P. Grindrod, D.J. Higham, Proc. R. Soc. A 470(2165), 20130835 (2014)
P. Grindrod, M.C. Parsons, D.J. Higham, E. Estrada, Phys. Rev. E 83(4), 046120 (2011)
R. Guimerà, S. Mossa, A. Turtschi, L.A.N. Amaral, Proc. Natl. Acad. Sci. U. S. A. 102(22), 7794
(2005)
A. Halu, R.J. Mondragón, P. Panzarasa, G. Bianconi, PLoS ONE 8(10), e78293 (2013)
P. Holme, Adv. Complex Syst. 6(2), 163 (2003)
P. Holme, Eur. Phys. J. B 88(9), 234 (2015)
P. Holme, J. Saramäki, Phys. Rep. 519(3), 97 (2012)
P. Holme, J. Saramäki (eds.), Temporal Networks (Springer, Berlin, 2013)
D.W. Huang, Z.G. Yu, Sci. Rep. 7, 41454 (2017)
Q. Huang, C. Zhao, X. Zhang, X. Wang, D. Yi, Europhys. Lett. 118(3), 36001 (2017)
H. Jeong, S.P. Mason, A.L. Barabási, Z.N. Oltvai, Nature 411(6833), 41 (2001)
D. Kempe, J. Kleinberg, É. Tardos, in Proceedings of the Ninth ACM SIGKDD International Con-
ference on Knowledge Discovery and Data Mining (ACM, New York, 2003), pp. 137–146
H. Kim, J. Tang, R. Anderson, C. Mascolo, Comput. Netw. 56(3), 983 (2012)
M. Kivelä, A. Arenas, M. Barthelemy, J.P. Gleeson, Y. Moreno, M.A. Porter, J. Complex Netw.
2(3), 203 (2014)
J. Kleinberg, J. ACM 46(5), 604 (1999)
G. Kossinets, J. Kleinberg, D. Watts, in Proceedings of the 14th ACM SIGKDD International
Conference on Knowledge Discovery and Data Mining (ACM, New York, 2008), pp. 435–443
V. Kostakos, Phys. A 388(6), 1007 (2009)
A.N. Langville, C.D. Meyer, Google’s PageRank and Beyond: The Science of Search Engine Rank-
ings (Princeton University Press, Princeton, 2006)
E.A. Leicht, G. Clarkson, K. Shedden, M.E.J. Newman, Eur. Phys. J. B 59(1), 75 (2007)
K. Lerman, R. Ghosh, J.H. Kang, in Proceedings of the Eighth Workshop on Mining and Learning
with Graphs (ACM, New York, 2010), pp. 70–77
H. Liao, M.S. Mariani, M. Medo, Y.C. Zhang, M.Y. Zhou, Phys. Rep. 689, 1 (2017)
M. Magnani, B. Micenkova, L. Rossi, arXiv preprint arXiv:1303.4986 (2013)
M. Magnani, L. Rossi, in 2011 International Conference on Advances in Social Networks Analysis
and Mining (ASONAM) (IEEE, Piscataway, 2011), pp. 5–12
M.S. Mariani, M. Medo, Y.C. Zhang, Sci. Rep. 5, 16181 (2015)
M.S. Mariani, M. Medo, Y.C. Zhang, J. Informetr. 10(4), 1207 (2016)
N. Masuda, M.A. Porter, R. Lambiotte, Phys. Rep. 716–717, 1 (2017)
S. Motegi, N. Masuda, Sci. Rep. 2, 904 (2012)
P.J. Mucha, M.A. Porter, Chaos 20(4), 041108 (2010)
S.A. Myers, P.J. Mucha, M.A. Porter, Chaos 21(4), 041104 (2011)
M.E.J. Newman, Networks, 2nd edn. (Oxford University Press, Oxford, 2018)
M.K.P. Ng, X. Li, Y. Ye, in Proceedings of the 17th ACM SIGKDD International Conference on
Knowledge Discovery and Data Mining (ACM, New York, 2011), pp. 1217–1225
L. Page, S. Brin, R. Motwani, T. Winograd, The PageRank citation ranking: Bringing order to the
Web. Technical Report 1999-66. Stanford InfoLab (1999)
A.R. Pamfil, S.D. Howison, R. Lambiotte, M.A. Porter, SIAM J. Math. Data Sci. 1(4), 667 (2019)
R. Pan, J. Saramäki, Phys. Rev. E 84(1), 016105 (2011)
M.A. Porter, Not. Am. Math. Soc. 65(11), 1419 (2018)
S. Praprotnik, V. Batagelj, Ars Mat. Contemp. 11(1), 11 (2015)

C. Rahmede, J. Iacovacci, A. Arenas, G. Bianconi, J. Complex Netw. 6(5), 733 (2017)


R.A. Rossi, D.F. Gleich, Algorithms and Models for the Web Graph (Springer, Berlin, 2012),
pp. 126–137
S. Saavedra, S. Powers, T. McCotter, M.A. Porter, P.J. Mucha, Phys. A 389(5), 1131 (2010)
L. Solá, M. Romance, R. Criado, J. Flores, A.G. del Amo, S. Boccaletti, Chaos 23(3), 033131
(2013)
A. Solé-Ribalta, M. De Domenico, S. Gómez, A. Arenas, in Proceedings of the 2014 ACM Confer-
ence on Web Science (ACM, New York, 2014), pp. 149–155
A. Solé-Ribalta, M. De Domenico, S. Gómez, A. Arenas, Phys. D 323, 73 (2016)
C. Spatocco, G. Stilo, C. Domeniconi, arXiv preprint arXiv:1801.08026 (2018)
T. Takaguchi, Y. Yano, Y. Yoshida, Eur. Phys. J. B 89(2), 1 (2016)
M.Y. Tang, M. Musolesi, C. Mascolo, V. Latora, V. Nicosia, in Proceedings of the 3rd Workshop
on Social Network Systems—SNS ’10 (2010), pp. 1–6
S. Tavassoli, K.A. Zweig, in 2016 Third European Network Intelligence Conference (ENIC) (IEEE,
Piscataway, 2016), pp. 25–32
D. Taylor, Data release: Ph.D. exchange in the Mathematical Genealogy Project. https://sites.google.
com/site/danetaylorresearch/data (2019)
D. Taylor, R.S. Caceres, P.J. Mucha, Phys. Rev. X 7(3), 031056 (2017)
D. Taylor, S.A. Myers, A. Clauset, M.A. Porter, P.J. Mucha, Multiscale Model. Simul. 15(1), 537
(2017)
D. Taylor, M.A. Porter, P.J. Mucha, Multiscale Model. Simul. 19(1), 113 (2021)
The Mathematics Genealogy Project. http://www.genealogy.ams.org; data provided 19 October
2009
F. Tudisco, F. Arrigo, A. Gautier, SIAM J. Appl. Math. 78(2), 853 (2018)
D. Walker, H. Xie, K.K. Yan, S. Maslov, J. Stat. Mech. 2007(6), P06010 (2007)
W.H. Weir, S. Emmons, R. Gibson, D. Taylor, P.J. Mucha, Algorithms 10(3), 93 (2017)
M.J. Williams, M. Musolesi, R. Soc. Open Sci. 3(6) (2016)
K. You, R. Tempo, L. Qiu, IEEE Trans. Autom. Control 62(5), 2080 (2017)
Chapter 18
Approximation Methods for Influence
Maximization in Temporal Networks

Tsuyoshi Murata and Hokuto Koga

Abstract The process of rumor spreading among people can be represented as information diffusion in a social network. The scale of the rumor spread can change greatly depending on the starting nodes. If we can select nodes that trigger large-scale diffusion events, those nodes are expected to be important for viral marketing. Given a network and the number of starting nodes, the problem of selecting nodes that maximize information diffusion is called the influence maximization problem. We propose three new approximation methods (Dynamic Degree Discount, Dynamic CI, and Dynamic RIS) for the influence maximization problem in temporal networks. These methods extend previous methods for static networks to temporal networks. Although the performance of MC greedy was better than that of the three methods, it was computationally expensive and intractable for large-scale networks. Our proposed methods were more than 10 times faster than MC greedy. Compared with Osawa, the three methods performed better in most cases.

Keywords Temporal networks · Influence maximization · Network centrality

18.1 Introduction

Diffusion of rumors (or information) can be represented as information propagation in a social network whose nodes are people and whose edges are contacts among them. The scale of information propagation depends on where and when the propagation starts. In order to propagate information as widely as possible, the starting nodes should be carefully selected. Selecting starting nodes for large-scale information propagation is important as one of the methods for viral marketing.
From a given network, selecting such starting nodes for large-scale information propagation was formalized as the "influence maximization problem" by Kempe et al.

T. Murata (B) · H. Koga


Department of Computer Science, School of Computing, Tokyo Institute of Technology, W8-59
2-12-1 Ookayama, Meguro, Tokyo 152-8552, Japan
e-mail: murata@c.titech.ac.jp


(2003). The original formalization is for static networks. However, nodes and edges
can be newly added or deleted in many real social networks. Therefore, influence
maximization problem in temporal networks should be considered. Habiba et al.
defined the problem for temporal networks (Habiba and Berger-Wolf 2007). Since
the problem was proved to be NP-hard, computing the best solution in realistic time
is computationally intractable. Therefore, many approximation schemes based on
Monte-Carlo simulation and other heuristic methods have been proposed. Methods
based on Monte-Carlo simulation are more accurate but computationally expensive.
On the other hand, other heuristic methods are fast but they are less accurate.
In order to find better solutions for the influence maximization problem, we propose three new methods for temporal networks as extensions of methods for static networks. Dynamic Degree Discount is a heuristic method based on node degree. Dynamic CI is a method based on a node's degree and the degrees of nodes reachable from it within a specific time. Dynamic RIS uses many similar networks generated by random edge removal. We compare the proposed methods
with previous methods. Although the performance of MC greedy was better than
the three methods, it was computationally expensive and intractable for large-scale
networks. The computational time of our proposed methods was more than 10 times
faster than MC greedy. When compared with Osawa, the performances of the three
methods were better for most of the cases.
We discuss extended methods for influence maximization in temporal networks
(Murata and Koga 2017, 2018). This chapter includes detailed explanation of back-
ground knowledge, discussions of the effect of different values of parameters in the
proposed methods, and detailed analysis of the advantages and disadvantages of the
proposed methods.
The structure of this chapter is as follows. Section 18.2 shows related work.
Section 18.3 presents proposed methods (Dynamic Degree Discount, Dynamic CI
and Dynamic RIS), Sect. 18.4 explains our experiments, and Sect. 18.5 shows the
experimental results. Section 18.6 shows discussions about the experimental results,
and Sect. 18.7 concludes the chapter.

18.2 Related Work

18.2.1 Model of Information Propagation

We use the SI model as the model of information propagation on networks. In the SI model, each node in the network is either in state S (susceptible) or in state I (infected). Nodes in state S do not know the information, and those in state I know it. At the beginning of information propagation (at time t = 1), a set of nodes in state I is fixed as the seed nodes. For all edges (t, u, v) at times t = 1, 2, . . . , T, the following operation is performed: if node u is in state I and node v is in state S, the information is propagated from u to v with probability λ, which means the state of v changes from S to I at time t + 1. The probability λ is the susceptibility parameter, and it controls how readily information propagates. At time t = T + 1, information propagation is terminated.
Based on the above notation, we can formulate the influence maximization problem as follows. We define σ(S) as the expected number of nodes in state I at time T + 1 when information propagation starts at time 1 from seed nodes S in state I under the SI model. (Please keep in mind that S in σ(S) is a set of seed nodes, whereas S in the SI model denotes the susceptible state.) The influence maximization problem in a temporal network is to search for a set of seed nodes S of size k that maximizes σ(S), given a temporal network G, the duration of the network T, the susceptibility λ of the SI model, and the size k of the seed set.
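As a minimal sketch of this formulation, σ(S) can be estimated by repeated simulation of the SI process over a list of time-stamped contacts (t, u, v); the function below is an illustration, with the input format as an assumption:

```python
import random

def sigma_SI(edges, seeds, T, lam, runs=1000):
    """Monte-Carlo estimate of sigma(S): the expected number of nodes in state I
    at time T + 1 when propagation starts from the seed set at time t = 1.
    `edges` is a list of contacts (t, u, v) with 1 <= t <= T."""
    by_time = {}
    for t, u, v in edges:                    # group contacts by time step
        by_time.setdefault(t, []).append((u, v))

    total = 0
    for _ in range(runs):
        infected = set(seeds)
        for t in range(1, T + 1):
            newly = set()
            for u, v in by_time.get(t, []):
                # u in state I and v in state S: v becomes infected at time t + 1
                if u in infected and v not in infected and random.random() < lam:
                    newly.add(v)
            infected |= newly
        total += len(infected)
    return total / runs
```

With such an estimator, the greedy method described in Sect. 18.2.3 repeatedly adds the node v that maximizes σ(S ∪ {v}) − σ(S).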

18.2.2 Problems Related to Influence Maximization in Temporal Networks

There are several problems related to influence maximization in temporal networks. Instead of giving an item (or information) to seed nodes for free, revenue maximization (Babaei et al. 2013) is the problem of finding seed customers (nodes) and offering discounts to them in order to increase total revenue. Although the problem is important in the field of marketing, it is more complicated than the influence maximization problem, since seed nodes are not treated equally and the amount of discount for each node may differ. The number of possible parameters increases greatly, especially in the case of temporal networks.
Opinion formation (Jalili 2013, 2012; Afshar and Asadpour 2010) is another problem related to the influence maximization problem. Each agent (node) has an opinion, which may be a continuous or a discrete quantity, and the underlying network represents the society in which the agents interact; each agent's opinion is influenced by that society. Analyzing the increase and decrease of each opinion is important for modeling the dynamics of opinion formation and for studying opinion polarization (Garimella et al. 2018).
It is often pointed out that the properties of temporal networks are quite different from those of static networks. Braha and Bar-Yam (2006, 2009) pointed out that the overlap between node centralities in a temporal network and those in the corresponding aggregated (static) network is very small. Hill and Braha (2010) proposed a dynamic preferential attachment mechanism that reproduces such dynamic centrality phenomena. Holme presents good surveys of temporal networks (Holme 2015; Holme and Saramäki 2012).

18.2.3 Influence Maximization Methods for Static Networks

Jalili presents a survey on the spreading dynamics of rumors and disease based on centrality (Jalili and Perc 2017). There are roughly three approaches to the influence maximization problem in static networks: the first is Monte-Carlo simulation methods, the second is heuristic-based methods, and the third is methods that generate a large number of networks with random edge removal and select seed nodes based on the generated networks.
The Monte-Carlo simulation method was proposed by Kempe (2003). In Kempe's method, σ(S) is estimated by repeating Monte-Carlo simulations. When S is given as a set of seed nodes, simulations of information propagation are repeated R times, and the average number of infected nodes is taken as σ(S). Next, the node v that maximizes the difference σ(S ∪ {v}) − σ(S) is added to the seed nodes greedily based on the estimated σ(S). This operation is repeated until |S| = k.
Since σ(·) is a monotonic and submodular function, if we denote the optimal set of seed nodes by S∗, the seed nodes S_greedy obtained by the above greedy algorithm are proved to satisfy σ(S_greedy) ≥ (1 − 1/e) σ(S∗) (Kempe et al. 2003). Because of this property, the quality of the solutions of Kempe's method is good. However, many repetitions of the Monte-Carlo simulation are needed in order to estimate σ(S) accurately. Since the computational cost of finding seed nodes with this method is high, it is not possible to find seed nodes in realistic time for large-scale networks.
Heuristic methods have been proposed in order to search for seed nodes at high speed. Chen (2010) proposed PMIA, which finds seed nodes by focusing on the paths with a high information propagation ratio. Jiang (2011) proposed SAEDV, which searches for seed nodes by simulated annealing, estimating σ(·) from the nodes adjacent to the seed nodes. Chen (2009) proposed Degree Discount, which is based on node degree and penalizes nodes adjacent to already selected nodes. This is because when node v is selected as one of the seed nodes and u is its neighbor, it is highly likely that v propagates the information to u, so selecting nodes other than u as seed nodes is better for information diffusion.
The Degree Discount procedure is shown in Algorithm 18.1. In the algorithm, t_i is the penalty of node i, and dd_i is the degree of node i after the penalty is applied; dd_i becomes smaller as t_i becomes bigger.
Morone et al. (2015) proposed a method for finding seed nodes considering the
degrees of distant nodes. The method calculates the following CIl (v) for each node
and selects seed nodes based on the values:

$$\mathrm{CI}_l(v) = (k_v - 1) \sum_{u \in \partial \mathrm{Ball}(v,l)} (k_u - 1).$$

∂Ball(v, l) in the above formula represents the set of nodes at distance l from node v. An example of CI_l(v) is shown in Fig. 18.1: when l = 2, ∂Ball(v, 2) consists of the two red nodes at distance 2 from node v, and the degrees of both nodes are 8. Therefore, CI_2(v) = (2 − 1) × {(8 − 1) + (8 − 1)} = 14.

Algorithm 18.1 Degree discount


Inputs
Static network G
The size of seed nodes k
Susceptibility λ
Outputs
Seed nodes S
Algorithm
(1) Initialize seed nodes as S = φ, and initialize ddi = ki , ti = 0 for each node i. ki is the degree
of node i in network G.
(2) Add node v to seed nodes S such that v = argmaxi {ddi |i ∈ V \S}. V is the nodes in network
G.
(3) Update ddu and tu for all nodes u adjacent to v.

tu = tu + 1
ddu = ku − 2tu − (ku − tu )tu λ

(4) Repeat (2) and (3) until |S| = k.
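A minimal sketch of Degree Discount in Python (an illustration, assuming the network is given as an adjacency list of neighbour sets) is:

```python
def degree_discount(adj, k, lam):
    """Degree Discount seed selection (Algorithm 18.1).
    `adj` maps every node to the set of its neighbours."""
    dd = {v: len(adj[v]) for v in adj}   # discounted degrees, initially the plain degrees
    t = {v: 0 for v in adj}              # number of already selected neighbours of each node
    seeds = set()
    while len(seeds) < k:
        v = max((u for u in adj if u not in seeds), key=lambda u: dd[u])
        seeds.add(v)
        for u in adj[v] - seeds:         # penalize the not-yet-selected neighbours of v
            t[u] += 1
            dd[u] = len(adj[u]) - 2 * t[u] - (len(adj[u]) - t[u]) * t[u] * lam
    return seeds
```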

Fig. 18.1 Example for explaining CI_l(v). When l = 2, CI_l(v) = 14

The degree of node v itself is low in the network of Fig. 18.1, but node v is effective for information propagation because it is connected to some high-degree nodes at distance two. This method thus selects seed nodes that cause wider propagation compared with selecting seed nodes based on the degree of node v alone.
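A minimal sketch for computing CI_l(v) on a static network (an illustration based on a breadth-first search truncated at distance l; the adjacency-list format is an assumption) is:

```python
from collections import deque

def collective_influence(adj, l):
    """CI_l(v) = (k_v - 1) * sum of (k_u - 1) over nodes u at shortest-path
    distance exactly l from v.  `adj` maps nodes to sets of neighbours."""
    ci = {}
    for v in adj:
        dist = {v: 0}
        queue = deque([v])
        boundary = []                      # nodes at distance exactly l from v
        while queue:
            x = queue.popleft()
            if dist[x] == l:
                boundary.append(x)
                continue                   # do not expand beyond distance l
            for y in adj[x]:
                if y not in dist:
                    dist[y] = dist[x] + 1
                    queue.append(y)
        ci[v] = (len(adj[v]) - 1) * sum(len(adj[u]) - 1 for u in boundary)
    return ci
```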
These heuristic methods compute seed nodes faster than the methods based on Monte-Carlo simulation. However, it has been experimentally confirmed that the scale of propagation achieved by these methods depends on the network structure and parameters.
Ohsaka et al. (2014) proposed a method that generates many networks with random edge removal in order to solve this problem. Ohsaka's method is based on the "coin flip" argument mentioned in Kempe's paper (2003). Let D_G(S) be the distribution of the nodes to which information is propagated from seed nodes S in a static network G, and let D_G'(S) be the corresponding distribution on a network G' in which edges have been removed from G at a constant ratio. The "coin flip" argument states that D_G(S) equals D_G'(S) in this situation. Thus σ(·) can be estimated by generating many networks with edges removed at a constant ratio, instead of repeating Monte-Carlo simulations. Ohsaka's method estimates σ(·) from the Strongly Connected Components (SCCs) of each of the generated networks. An SCC is a subgraph in which every node is reachable from every other node.
Borgs (2014) and Tang (2014) also proposed methods similar to Ohsaka's method. The difference from Ohsaka's method is that σ(·) is not estimated directly from the generated networks. Instead, the nodes that can reach a randomly selected node v are computed, and the seed nodes are then selected based on these sets. More specifically, the algorithm is as follows.

Algorithm 18.2 Algorithm by Borgs and Tang


Input
Static network G
The size of seed nodes k
Susceptibility λ
Generated number of networks θ
Outputs
Seed nodes S
Algorithm
(1) Initialize S = φ, U = φ. U is a set of all R R.
(2) Select node v at random.
(3) Remove edges with probability 1 − λ from network G and set as G p .
(4) Acquire nodes R R reachable to v by G p . Add R R to U .
(5) Repeat θ times from (2) to (4).
(6) Add node u with the highest frequency in U to S.
(7) Delete all R R containing u from U .
(8) Repeat (6) and (7) until |S| = k.
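A minimal sketch of this procedure for an undirected static network (an illustration; the adjacency-list format and function names are assumptions) is:

```python
import random
from collections import deque, Counter

def ris_seeds(adj, k, lam, theta):
    """Seed selection in the style of Algorithm 18.2: build theta reachable sets
    on independently edge-sampled copies of the network, then greedily pick the
    nodes covering the most of them.  `adj` maps nodes to sets of neighbours."""
    rr_sets, nodes = [], list(adj)
    for _ in range(theta):
        v = random.choice(nodes)
        reached, queue = {v}, deque([v])
        while queue:                         # BFS over edges kept with probability lam
            x = queue.popleft()
            for y in adj[x]:
                if y not in reached and random.random() < lam:
                    reached.add(y)
                    queue.append(y)
        rr_sets.append(reached)

    seeds = set()
    while len(seeds) < k:
        counts = Counter(u for rr in rr_sets for u in rr)
        if not counts:
            break
        u, _ = counts.most_common(1)[0]      # most frequent node over remaining RR sets
        seeds.add(u)
        rr_sets = [rr for rr in rr_sets if u not in rr]
    return seeds
```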

There are other approaches to influence maximization in different problem settings. Chen (2012) proposed a method to solve the problem with a time limit. Feng (2014) solves the influence maximization problem in a situation where the freshness of the information degrades as it spreads. Mihara (2015) proposed a method for the influence maximization problem where the whole network structure is unknown.

18.2.4 Degrees in Temporal Networks

The notation for edges and paths in temporal networks is the same as in Osawa and Murata (2015). (t, u, v) represents an edge from node u to v at time t. A path from node v_1 to v_k of length k − 1 is represented as (t_1, v_1, v_2), (t_2, v_2, v_3), . . . , (t_{k−1}, v_{k−1}, v_k), where t_1 < t_2 < . . . < t_{k−1} and v_i ≠ v_j for all i ≠ j. The duration from the start to the end of a path, t_{k−1} − t_1, is the length of time of the path, and the smallest such value is the minimum length of time.
Habiba et al. (2010) define degrees in temporal networks using the symmetric difference of past connections and future connections. However, diffusion in temporal networks is from past to future only; it is not bidirectional.

Fig. 18.2 Example of low degree nodes in a temporal network. Nodes adjacent to node A do not
change over time ({B, C} → {B, C} → {B, C})

Fig. 18.3 Example of high degree nodes in a temporal network. Nodes adjacent to node A change
at each time ({B, C} → {D, E} → {B, C})

We therefore define the degree D_T(v) of node v in a temporal network as follows:
$$D_T(v) = \sum_{1 < t \le T} \frac{|N(v,t-1)\setminus N(v,t)|}{|N(v,t-1)\cup N(v,t)|}\,|N(v,t)|,$$

where N (v, t) is a collection of nodes adjacent to node v at time t. Figures 18.2 and
18.3 illustrate the examples of degrees on temporal networks.
In Fig. 18.2, the adjacent nodes of node A do not change during the period, so the set differences N(A, 1)\N(A, 2) and N(A, 2)\N(A, 3) are empty. Therefore, the degree of node A in Fig. 18.2 is calculated as follows:

$$D_3(A) = \frac{|N(A,1)\setminus N(A,2)|}{|N(A,1)\cup N(A,2)|}\,|N(A,2)| + \frac{|N(A,2)\setminus N(A,3)|}{|N(A,2)\cup N(A,3)|}\,|N(A,3)| = \frac{0}{2}\times 2 + \frac{0}{2}\times 2 = 0$$
On the other hand, in Fig. 18.3, nodes adjacent to node A change over time. So the
degree of node A is bigger than that in Fig. 18.2.

$$D_3(A) = \frac{|N(A,1)\setminus N(A,2)|}{|N(A,1)\cup N(A,2)|}\,|N(A,2)| + \frac{|N(A,2)\setminus N(A,3)|}{|N(A,2)\cup N(A,3)|}\,|N(A,3)| = \frac{2}{4}\times 2 + \frac{2}{4}\times 2 = 2$$

In Figs. 18.2 and 18.3, the number of adjacent nodes of node A is the same every
time, so the average degree of node A is the same in Figs. 18.2 and 18.3. On the other
hand, if we employ DT (v) as the definition of node degree, D3 (A) = 0 in Fig. 18.2
and D3 (A) = 2 in Fig. 18.3. DT (v) captures the number of newly adjacent nodes, and
this is important for influence maximization problem. We therefore employ DT (v)
as the definition of node degree in temporal networks.
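A minimal sketch for computing D_T(v) from per-snapshot neighbour sets (an illustration; the snapshot dictionary format is an assumption) is shown below, reproducing the example of Fig. 18.3:

```python
def temporal_degree(neigh, T):
    """D_T(v) for every node appearing in the snapshots, where neigh[t][v] is
    the set of neighbours of node v at time t (t = 1, ..., T)."""
    nodes = set()
    for t in range(1, T + 1):
        nodes.update(neigh.get(t, {}))
    D = {}
    for v in nodes:
        total = 0.0
        for t in range(2, T + 1):
            prev = neigh.get(t - 1, {}).get(v, set())
            cur = neigh.get(t, {}).get(v, set())
            if prev | cur:
                # weight of newly changed neighbours times the current degree
                total += len(prev - cur) / len(prev | cur) * len(cur)
        D[v] = total
    return D

# Figure 18.3: A's neighbours change {B, C} -> {D, E} -> {B, C}, giving D_3(A) = 2
snapshots = {1: {"A": {"B", "C"}}, 2: {"A": {"D", "E"}}, 3: {"A": {"B", "C"}}}
print(temporal_degree(snapshots, 3)["A"])   # 2.0
```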

18.2.5 Influence Maximization Methods for Temporal Networks

There are two approaches to the influence maximization problem in temporal networks: methods based on Monte-Carlo simulation and heuristic-based methods. The former was proposed by Habiba (2007). The method estimates the scale of propagation σ(·) by repeating Monte-Carlo simulations, just as in static networks. Since σ(·) is monotonic and submodular also in temporal networks, this method achieves large-scale propagation. However, the computational cost of this method is as high as in static networks. Osawa (2015) proposed a heuristic method for calculating σ(·) at high speed. His algorithm for computing σ(S) for seed nodes S is shown in Algorithm 18.3.
After σ(S) is computed, seed nodes are obtained by a greedy algorithm, as in the Monte-Carlo simulation method. Osawa's method finds seed nodes in realistic computational time. However, the quality of its solution depends on the given network because σ(·) is calculated approximately, and it is worse than the solutions obtained by Monte-Carlo simulation.

18.3 Proposed Methods

In this section, we propose three new methods for the influence maximization problem in temporal networks (Dynamic Degree Discount, Dynamic CI and Dynamic RIS), which are extensions of static-network methods to temporal networks. We use the following notation: G: temporal network, T: duration of the temporal network, k: the size of seed nodes, λ: susceptibility, θ: the number of generated networks, and S: seed nodes.

18.3.1 Dynamic Degree Discount

Dynamic Degree Discount is the extension of Degree Discount by Chen et al. (2009) to temporal networks. In Dynamic Degree Discount, the definitions of degrees and adjacent nodes in the algorithm of Degree Discount are modified for temporal networks. Algorithm 18.4 shows the algorithm of Dynamic Degree Discount; underlines show the parts modified from the original Degree Discount.

Algorithm 18.3 Osawa’s algorithm


Input
Temporal network G
Duration of temporal network T
Seed nodes S
Susceptibility λ
Output
The number of state I nodes σ (S)
Algorithm
(1) Initialize p̂_v(1) as follows, where p̂_v(t) denotes the probability that node v is in state I at time t (1 ≤ t ≤ T + 1):

$$\hat{p}_v(1) = \begin{cases} 1 & v \in S \\ 0 & v \notin S \end{cases}$$

(2) Update p̂v (t) in time t as follows.

$$\hat{p}_v(t+1) = 1 - \bigl(1 - \hat{p}_v(t)\bigr) R_v(t), \qquad R_v(t) = \prod_{u \in N(v,t)} \bigl(1 - \hat{p}_u(t)\,\lambda\bigr)$$

where R_v(t) is the probability that the information is not propagated to node v from any adjacent node at time t, and N(v, t) is the set of nodes adjacent to node v at time t.
(3) σ(S) is computed by summing the probabilities p̂_v(T + 1) that each node is in state I at time T + 1:

$$\sigma(S) = \sum_{v \in V} \hat{p}_v(T+1)$$
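A minimal sketch of the update rules of Algorithm 18.3 (an illustration; the snapshot dictionary format is an assumption) is:

```python
def sigma_osawa(neigh, T, seeds, lam):
    """Heuristic estimate of sigma(S) following Algorithm 18.3.
    neigh[t][v] is the set of neighbours of node v at time t (t = 1, ..., T)."""
    nodes = set()
    for t in range(1, T + 1):
        for v, nb in neigh.get(t, {}).items():
            nodes.add(v)
            nodes.update(nb)
    p = {v: 1.0 if v in seeds else 0.0 for v in nodes}     # p_v(1)
    for t in range(1, T + 1):
        new_p = {}
        for v in nodes:
            R = 1.0                                        # prob. that no neighbour infects v at time t
            for u in neigh.get(t, {}).get(v, set()):
                R *= 1.0 - p[u] * lam
            new_p[v] = 1.0 - (1.0 - p[v]) * R
        p = new_p
    return sum(p.values())                                 # sigma(S) = sum of p_v(T + 1)
```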

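Combining the temporal degree D_T of Sect. 18.2.4 with the aggregated neighbour set N_T, Dynamic Degree Discount (Algorithm 18.4) can be sketched as follows (an illustration; the snapshot dictionary format is an assumption):

```python
def dynamic_degree_discount(neigh, T, k, lam):
    """Dynamic Degree Discount: Degree Discount with the temporal degree D_T and
    the aggregated neighbour set N_T.  neigh[t][v] is the set of neighbours of
    node v at time t (t = 1, ..., T)."""
    nodes = set()
    for t in range(1, T + 1):
        for v, nb in neigh.get(t, {}).items():
            nodes.add(v)
            nodes.update(nb)

    def N_T(v):                      # every node adjacent to v at any time
        out = set()
        for t in range(1, T + 1):
            out |= neigh.get(t, {}).get(v, set())
        return out

    def D_T(v):                      # temporal degree of Sect. 18.2.4
        total = 0.0
        for t in range(2, T + 1):
            prev = neigh.get(t - 1, {}).get(v, set())
            cur = neigh.get(t, {}).get(v, set())
            if prev | cur:
                total += len(prev - cur) / len(prev | cur) * len(cur)
        return total

    deg = {v: D_T(v) for v in nodes}
    dd, tcount, seeds = dict(deg), {v: 0 for v in nodes}, set()
    while len(seeds) < k:
        v = max((u for u in nodes if u not in seeds), key=lambda u: dd[u])
        seeds.add(v)
        for u in N_T(v) - seeds:     # penalize not-yet-selected temporal neighbours
            tcount[u] += 1
            dd[u] = deg[u] - 2 * tcount[u] - (deg[u] - tcount[u]) * tcount[u] * lam
    return seeds
```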

18.3.2 Dynamic CI

Dynamic CI is an extension of Morone's method (Morone and Makse 2015) to temporal networks. Morone's method focuses on the degree of node v and the degrees of nodes at distance l from v. Dynamic CI defines an index D_CI_l(v) in which both degree and distance are extended to temporal networks:

$$\mathrm{D\_CI}_l(v) = D_T(v) \sum_{u \in \mathrm{DBall}(v,l)} D_T(u)$$

The differences between CI_l(v) and D_CI_l(v) are: (1) the definition of degree is changed to that for temporal networks, and (2) ∂Ball(v, l) in CI_l(v) is changed to DBall(v, l).

Algorithm 18.4 Dynamic degree discount


Input
Temporal network G
Duration of temporal network T
The size of seed nodes k
Susceptibility λ
Output
Seed nodes S
Algorithm
(1) Initialize seed nodes as S = φ. Also initialize the values of each node i as ddi = DT (i) and
ti = 0.
(2) Add node v where v = argmaxi {ddi |i ∈ V \S} to S. V is the set of nodes in the network.
(3) For all nodes u where u ∈ N T (v), update ddu and tu as follows.

tu = tu + 1
ddu = DT (u) − 2tu − (DT (u) − tu )tu λ

N T (v) represents a set of all nodes adjacent to v during the whole period of the temporal
network.


$$N_T(v) = \bigcup_{t=1}^{T} N(v,t)$$

(4) Repeat (2) and (3) until |S| = k.

DBall(v, l) represents the set of nodes whose shortest duration of time from node v is l, where l is a parameter that takes values in the range 1 ≤ l ≤ T. In the algorithm of Dynamic CI, D_CI_l(v) is computed for each node, and the top k nodes are selected as seed nodes.
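A minimal sketch of the Dynamic CI ranking (an illustration; min_time_duration is a hypothetical helper that returns the minimum length of time of a time-respecting path between two nodes) is:

```python
def dynamic_ci_seeds(nodes, D_T, min_time_duration, l, k):
    """Rank nodes by D_CI_l(v) = D_T(v) * sum of D_T(u) over u in DBall(v, l).
    `D_T` maps each node to its temporal degree; `min_time_duration(v, u)` is a
    hypothetical helper giving the minimum length of time from v to u."""
    scores = {}
    for v in nodes:
        ball = [u for u in nodes if u != v and min_time_duration(v, u) == l]
        scores[v] = D_T[v] * sum(D_T[u] for u in ball)
    return sorted(nodes, key=lambda v: scores[v], reverse=True)[:k]
```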

18.3.3 Dynamic RIS

Dynamic RIS is an extension of Borgs's method (Borgs et al. 2014) and Tang's method (Tang et al. 2014) to temporal networks. The difference between their algorithms and Dynamic RIS is that RR in their algorithms is replaced by RR(v, d) in ours. RR(v, d) is the set of all nodes that can reach v within the shortest duration of time d, taken over the whole duration of the temporal network, and is defined as follows:


$$RR(v,d) = \bigcup_{t=1}^{T} RR_t(v,d).$$

Algorithm 18.5 Dynamic RIS


Inputs
Temporal network G
Duration of the temporal network T
Size of seed nodes k
Susceptibility λ
The number of generated networks θ
Output
Seed nodes S
Algorithm
(1) Initialize as S = φ, U = φ, where U is the set containing all R R(v, d).
(2) Select node v at random.
(3) Remove edges from temporal network G with probability 1 − λ and set as G p .
(4) Acquire nodes R R(v, d) reachable to v on G p . Add R R(v, d) to U .
(5) Repeat (2) to (4) θ times.
(6) Add the most frequent node u in U to S.
(7) Remove all R R(v, d) with u from U .
(8) Repeat (6) and (7) until |S| = k.

RR_t(v, d) is the set of nodes that can reach "node v at time t" within the shortest period d. The computational complexities of these methods are as follows.

Dynamic Degree Discount:


According to the paper of Chen (2009), the computational complexity of Degree
Discount is O(k · log n + m), where k is the number of seed nodes, n is the number
of nodes, and m is the number of edges, respectively. Dynamic Degree Discount is an
extension of Degree Discount. Static degree is replaced with dynamic one (DT (i))
and Static neighbors is replaced with dynamic one (N T (v)). Computational complex-
ity for dynamic degree and dynamic neighbors are Tn·m , where T is the total duration
of time of given temporal network. Therefore, the total computational complexity of
Dynamic Degree Discount is O(k · log n + m + Tn·m ).

Dynamic CI:
According to the paper of Morone (2015), the computational complexity of CI is
O(n · logn), where n is the number of nodes. Dynamic CI is an extension of CI.
Static degree is replaced with dynamic one (DT (i)), and its computational complex-
ity is Tn·m , where T is the total duration of time of given temporal network. Therefore,
the total computational complexity of Dynamic CI is O(n · log n + Tn·m ).

Dynamic RIS:
According to the paper of Tang (2014), the computational complexity of RIS is
O(k · l² (m + n) log² n / ε³), which returns a (1 − 1/e − ε)-approximate solution with probability at least 1 − n^(−l), where l and ε are constants.
ity of Dynamic RIS heavily depends on the parameters θ and d, which are the
number of generated networks and the duration of time for computing R R(v, d),

respectively. Therefore, the total computational complexity of Dynamic RIS is


O(θ · d · k · l² (m + n) log² n / ε³).
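A minimal sketch of the selection loop of Algorithm 18.5 (an illustration; reverse_reachable_within is a hypothetical helper that returns RR(v, d) over the retained contacts) is:

```python
import random
from collections import Counter

def dynamic_ris(contacts, nodes, k, lam, theta, d, reverse_reachable_within):
    """Dynamic RIS seed selection in the spirit of Algorithm 18.5.
    `contacts` is a list of time-stamped edges (t, u, v); the hypothetical helper
    `reverse_reachable_within(kept, v, d)` returns the nodes with a time-respecting
    path to v of duration at most d over the kept contacts."""
    rr_sets = []
    for _ in range(theta):
        v = random.choice(list(nodes))
        kept = [c for c in contacts if random.random() < lam]   # remove edges with prob. 1 - lam
        rr_sets.append(set(reverse_reachable_within(kept, v, d)))

    seeds = set()
    while len(seeds) < k:
        counts = Counter(u for rr in rr_sets for u in rr)
        if not counts:
            break
        u, _ = counts.most_common(1)[0]
        seeds.add(u)
        rr_sets = [rr for rr in rr_sets if u not in rr]          # drop RR sets already covered
    return seeds
```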

18.4 Experiments

We perform experiments comparing the proposed methods with previous ones in order to confirm their effectiveness. The temporal networks used for the experiments are shown in Table 18.1. These networks are the same as the ones used in previous research. The average degree in Table 18.1 is the average over all nodes in the network, (1/|V|) Σ_{v∈V} D_T(v). Hospital (Vanhems et al. 2013) is a time-stamped network of contacts between patients and medical staff at a hospital. Primary School (Stehlé et al. 2011; Gemmetto et al. 2014) is a network of contacts between students and teachers at a school. High School 2013 (Mastrandrea et al. 2015) is a network of contacts between students. The unit of the duration in these three datasets is 20 s. Each dataset is available at SocioPatterns (http://www.sociopatterns.org).
The methods used in the experiments are the two previous methods for temporal networks explained in Sect. 18.2.5 (Monte-Carlo simulation, i.e. MC Greedy, and Osawa) and our proposed methods from Sect. 18.3 (Dynamic Degree Discount, Dynamic CI and Dynamic RIS). Given a network as input, each method computes seed nodes S. The SI-model simulation of influence maximization is then repeated R times with the obtained seed nodes, and the average number of nodes in state I is taken as σ(S). The values of σ(S) are compared in order to evaluate the methods.
Experiments are performed for the following purposes:
(1) Comparison of σ (S) when the size of seed nodes k changes
(2) Comparison of computational time when the size of seed nodes k changes
The parameters in the experiments are set as follows. The number of repetitions of the information propagation simulation is set to R = 50. The number of repetitions of the Monte-Carlo simulation in MC Greedy is set to 1000. These two parameters are common to all experiments. The size of seed nodes k is varied from 0% to 20%. Susceptibility λ is set to λ = 0.01. It is difficult to perform experiments for all possible values of the parameter l in Dynamic CI, which takes values 1 ≤ l ≤ T; we use l = 1, 5, 10, 20 in the experiments. As for the parameters θ and d in Dynamic

Table 18.1 Dataset for the experiments


Nodes Edges Duration Ave. deg.
Hospital 75 32,424 9,453 69.3
Primary school 242 125,773 3,100 142.7
High school 2013 327 188,508 7,375 63.0

RIS, θ is set to θ = 1000. As for d, the values d = 0, 5, 10, 20 are used, since it is difficult to perform experiments for all possible values, as with l in Dynamic CI.
CELF (Leskovec et al. 2007) is used to speed up the experiments when greedy algorithms are used in MC Greedy and Osawa. CELF is an algorithm that can be used when the greedy algorithm is applied to a problem with a submodular objective, and its solution is the same as that of the normal greedy algorithm. According to the experiments by Leskovec (2007), the computation is 700 times faster than the normal greedy algorithm when CELF is used.

18.5 Experimental Results

18.5.1 Comparison of σ(S) When the Size of Seed Nodes k Changes

The results of information propagation for each size of seed nodes k, with the susceptibility of the SI model fixed at λ = 0.01, are shown in Fig. 18.4. The x-axis shows the percentage of seed nodes among all nodes in the network, (k/|V|) × 100, and the y-axis shows the percentage of infected nodes, (σ(S)/|V|) × 100. The best values of l in Dynamic CI and d in Dynamic RIS are used in our experiments. As shown in Fig. 18.4, MC Greedy achieves the highest diffusion in all datasets. The diffusion of the proposed methods, Dynamic Degree Discount, Dynamic CI and Dynamic RIS, is inferior to MC Greedy, but still better than Osawa. The scale of diffusion of Dynamic RIS in High School 2013 is 1.5 times that of Osawa.
There is not much difference in the scale of diffusion among the three proposed methods. Dynamic RIS achieves the highest diffusion in High School 2013, for example, but the difference among the proposed methods is small compared with the difference between the proposed methods and the previous methods (MC Greedy and Osawa).

18.5.2 Comparison of σ (S) When Susceptibility λ Changes

Figure 18.5 shows the diffusion when the size of the seed set is fixed at 20% of all nodes in the networks and the susceptibility is varied as λ = 0.001, 0.01, 0.05. The x-axis shows the value of λ, and the y-axis shows the percentage of diffusion. The parameters l and d are the same as in the previous experiments. As shown in Fig. 18.5, MC Greedy achieves the highest diffusion regardless of the value of λ. The differences among the three proposed methods are small.
Comparing the proposed methods with Osawa, our proposed methods achieve a larger scale of diffusion than Osawa in Hospital and High School

Fig. 18.4 (panels: Hospital, Primary School, High School 2013) Comparison of σ(S) when the size of seed nodes k changes. Except for the (computationally expensive) MC Greedy, the three proposed methods are better than Osawa

2013 when λ = 0.05. Osawa achieves higher diffusion than Dynamic RIS only in
Primary School. When λ = 0.001, the difference between proposed methods and
Osawa is very small compared with the cases of other λ values.

18.5.3 Comparison of Computational Time When the Size of Seed Nodes k Changes

Figure 18.6 shows the computational time when λ is set to λ = 0.01 and the size of the seed set is varied. A PC with an Intel Core i7 (3.4 GHz) CPU and 8 GB of memory is used for the experiments. The x-axis shows the percentage of seed nodes, and the y-axis shows the computational time (log scale).
Figure 18.6 shows that, for all datasets, the methods other than MC Greedy can compute seed nodes in realistic time. MC Greedy needs several hours to compute seed nodes, which shows that MC Greedy is intractable in realistic time for large-scale networks.

Fig. 18.5 (panels: Hospital, Primary School, High School 2013) Comparison of σ(S) when susceptibility λ changes. Except for the (computationally expensive) MC Greedy, the three proposed methods are better than Osawa for most of the cases

Regarding the comparison among the three proposed algorithms, the computational times of Dynamic Degree Discount and Dynamic CI are almost the same on all datasets. Dynamic RIS has about the same computational time as the other two proposed methods on Hospital, and is faster on Primary School and High School 2013. Regarding the comparison of the proposed methods with Osawa, Dynamic RIS is approximately 7.8 times faster than Osawa except on the very small network (Hospital).

18.5.4 Parameters of Dynamic CI and Dynamic RIS

The diffusion of the proposed methods with different parameter values is shown in this section. We vary the parameter l of Dynamic CI, and θ and d of Dynamic RIS.

Fig. 18.6 (panels: Hospital, Primary School, High School 2013) Comparison of computational time when the size of seed nodes k changes. Methods other than MC Greedy can compute in realistic time

18.5.4.1 Diffusion and Computational Time of Different l in Dynamic CI

The diffusion and computational time when l in Dynamic CI is set to 1, 5, 10, 20 are shown in Fig. 18.7. The left line graphs show the scale of diffusion for each value of l in each network, and the right bar graphs show the computational time. The left graphs show that the diffusion depends on the value of l; therefore, it is important to find an appropriate l for Dynamic CI. Since there is no simple correlation between the scale of diffusion and the value of l (such as the diffusion becoming larger as l becomes larger), the diffusion for various values of l should be investigated and compared. The right graphs show that there are no big differences in execution time when the value of l changes.

Fig. 18.7 (panels: diffusion and computational time for Hospital, Primary School, and High School 2013) Diffusion and computational time for different l in Dynamic CI. Left: there is no simple correlation between the scale of diffusion and the value of l. Right: there are no big differences in execution time when the value of l changes

18.5.4.2 Diffusion and Computational Time of Different θ in Dynamic RIS

In Dynamic RIS, θ is the parameter for the number of generated graphs in RR(v, d). The diffusion and computational time when θ is set to 500, 1000, 1500, 2000 are shown in Fig. 18.8. The left line graphs show that the diffusion does not change much when θ changes. However, the scale of diffusion is slightly smaller when θ = 500 in Hospital and High School 2013, which means that a bigger θ is desirable from the viewpoint of diffusion. On the other hand, the right bar graphs show that a higher value of θ results in an increase of computational time, so from the viewpoint of computational time a smaller θ is better. Regarding the value of θ, there is thus a trade-off between the scale of diffusion and the computational time: it is important to find a smaller θ for shorter computational time, but too small a θ results in small-scale diffusion.

18.5.4.3 Diffusion and Computational Time of Different d in Dynamic RIS

In Dynamic RIS, d is the parameter for the number of time steps to look back. Figure 18.9 shows the diffusion and execution time when d is set to 0, 5, 10, 20. The left line graphs show that there is almost no difference in diffusion when d changes, while the right bar graphs show that the computational time increases as the value of d becomes bigger. The scale of diffusion does not change even if the value of d becomes bigger in our experiments.

18.6 Discussion

18.6.1 Analysis Focused on Diffusion of Each Node

In the experiments where the susceptibility changes (Sect. 18.5.2), the difference between the proposed methods and Osawa was small when λ = 0.001 compared with the experiments with other values of λ. When λ = 0.05, Osawa outperforms the proposed methods only in Primary School. This section discusses these two points.
Figure 18.10 shows the distribution of the diffusion σ({v}) of each node v when Monte-Carlo simulation is used. The x-axis shows the percentage of diffusion from node v to the whole network (σ({v})), and the y-axis shows the frequency of nodes with each percentage on the x-axis. When λ = 0.001, almost all nodes reach less than 5% diffusion in all networks. This means that there is no big difference in the diffusion from different seed nodes, which is why the difference between the proposed methods and Osawa is small in the experiment of Sect. 18.5.2. In contrast, in Primary School there are many nodes with more than 60% diffusion when λ = 0.05, compared with the other two networks. In this case, large-scale diffusion is easily

Fig. 18.8 (panels: diffusion and computational time for Hospital, Primary School, and High School 2013) Diffusion and computational time for different θ in Dynamic RIS. Left: diffusion does not change much when θ changes. Right: a higher value of θ results in an increase of computational time

Fig. 18.9 (panels: diffusion and computational time for Hospital, Primary School, and High School 2013) Diffusion and computational time for different d in Dynamic RIS. Left: there is almost no difference in diffusion when d changes. Right: computational time increases as the value of d becomes bigger

Fig. 18.10 (panels: Hospital, Primary School, High School 2013) Distribution of the diffusion σ({v}) of each node v. In the case of Primary School with λ = 0.05, there are many nodes of high diffusion. Therefore, large-scale diffusion is easily achieved even if the most appropriate seed nodes are not selected. This is the reason Osawa outperforms the proposed methods in Primary School

achieved even if the most appropriate seed nodes are not selected. This is why Osawa outperforms the proposed methods in Primary School in Sect. 18.5.2.

18.6.2 Advantages and Disadvantages of Each of the Proposed Methods

The advantages and disadvantages of each of the proposed methods are discussed in this section. An advantage of Dynamic Degree Discount is that it contains no parameter, so there is no need for parameter tuning. Its disadvantage is that it is designed for the SI model only, so the method cannot be used for other models. This is because Dynamic Degree Discount is an extension of Chen's Degree Discount, which is for the SI model. There are other information propagation models, such as the LT model and the Triggering model proposed by Kempe et al., and Dynamic Degree Discount cannot be applied to such models.

An advantage of Dynamic CI is that, in contrast to Dynamic Degree Discount, it can be applied to many information propagation models, because Dynamic CI uses only degree information when it calculates seed nodes. Its disadvantage is that the achievable diffusion depends on the value of the parameter l, as mentioned in Sect. 18.5.4.1. It is necessary to search for appropriate values of l for Dynamic CI. The parameter l takes values within the range 1 < l < T, so this search takes time in general.
An advantage of Dynamic RIS is that its computational time is short. As shown in the experimental results, its computational time is shorter than that of the other methods on all networks except Hospital. Since its short computational time allows the method to be applied to large networks, this is a big advantage. The disadvantage of Dynamic RIS is that it needs two parameters, θ and d, to be adjusted. As mentioned in the previous section, the computational time grows as θ becomes bigger, and the scale of diffusion becomes smaller for too small a θ. Therefore, it is necessary to set an appropriate value for θ. However, the parameter sensitivity of θ and d is not as strong as the sensitivity to l in Dynamic CI.

18.7 Conclusion

We propose three new methods for the influence maximization problem in temporal networks, which are extensions of methods for static networks. As the result of experiments comparing them with the previous methods, MC Greedy and Osawa, our three proposed methods are better than the previous methods in the following sense. Although the performance of MC Greedy is better than that of the three methods, it is computationally expensive and intractable for large-scale networks. Our proposed methods are more than 10 times faster than MC Greedy, so they can be run in realistic time even for large-scale temporal networks. In comparison with Osawa, the performances of the three methods are almost the same as Osawa's, but they are approximately 7.8 times faster. Based on these facts, the proposed methods are suitable for influence maximization in temporal networks.
The comparison of Dynamic Degree Discount, Dynamic CI and Dynamic RIS is as follows. The choice of method should be made based on the following pros and cons.
Dynamic Degree Discount
• It requires no parameter.
• It is applicable to SI model only.
Dynamic CI
• It is applicable to other information propagation models.
• The performance heavily depends on the parameter l.

Dynamic RIS
• It is relatively fast among these three methods.
• It requires two parameters to be adjusted (θ and d).
Finding strategies for choosing a suitable method for a given temporal network is practically important. It is a challenging open question and is left for our future work. The problem of adjusting the parameters of Dynamic CI and Dynamic RIS is also left for our future work.

Acknowledgements This work was supported by JSPS Grant-in-Aid for Scientific Research(B)
(Grant Number 17H01785).

References

M. Afshar, M. Asadpour, Opinion formation by informed agents. J. Artif. Soc. Soc. Simul. 13(4),
1–5 (2010)
M. Babaei, B. Mirzasoleiman, M. Jalili, M.A. Safari, Revenue maximization in social networks
through discounting. Soc. Netw. Anal. Min. 3(4), 1249–1262 (2013)
C. Borgs, M. Brautbar, J. Chayes, B. Lucier, Maximizing social influence in nearly optimal time, in
Proceedings of the Twenty-Fifth Annual ACM-SIAM Symposium on Discrete Algorithms (2014),
pp. 946–957
D. Braha, Y. Bar-Yam, Time-dependent complex networks: Dynamic centrality, dynamic motifs,
and cycles of social interactions, in Adaptive Networks: Theory, Models and Applications (2009),
pp. 39–50
D. Braha, Y. Bar-Yam, From centrality to temporary fame: dynamic centrality in complex networks.
Complexity 12(2), 59–63 (2006)
W. Chen, W. Lu, N. Zhang, Time-critical influence maximization in social networks with time-
delayed diffusion process, in Proceedings of the Twenty-Sixth AAAI Conference on Artificial
Intelligence (2012), pp. 592–598
W. Chen, C. Wang, Y. Wang, Scalable influence maximization for prevalent viral marketing in
large-scale social networks, in Proceedings of the 16th ACM SIGKDD International Conference
on Knowledge Discovery and Data Mining - KDD ’10 (2010), pp. 1029–1038
W. Chen, Y. Wang, S. Yang, Efficient influence maximization in social networks, in Proceedings
of the 15th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
- KDD ’09 (2009), pp. 199–207
S. Feng, X. Chen, G. Cong, Z. Yifeng, C.Y. Meng, X. Yanping, Influence maximization with novelty
decay in social networks, in Proceedings of the Twenty-Eighth AAAI Conference on Artificial
Intelligence (2014), pp. 37–43
K. Garimella, G.D.F. Morales, M. Mathioudakis, A. Gionis, Polarization on social media. Web
Conf. 2018 Tutor. 1(1), 1–191 (2018)
V. Gemmetto, A. Barrat, C. Cattuto, Mitigation of infectious disease at school: targeted class closure
vs school closure. BMC Infect. Dis. 14(1), 1 (2014)
T.B.W. Habiba, T.Y. Berger-Wolf, Maximizing the extent of spread in a dynamic network. Technical
report, DIMACS Technical Report 2007-20, 10 pages (2007)
Y.Y. Habiba, T.Y. Berger-Wolf, J. Saia, Finding spread blockers in dynamic networks. Advances in
Social Network Mining and Analysis 5498, 55–76 (2010)
S.A. Hill, D. Braha, Dynamic model of time-dependent complex networks. Phys. Rev. E 82(046105),
1–7 (2010)
P. Holme, Modern temporal network theory: a colloquium. Eur. Phys. J. B 88(234), 1–30 (2015)

P. Holme, J. Saramäki, Temporal networks. Phys. Rep. 519(3), 97–125 (2012)


M. Jalili, Effects of leaders and social power on opinion formation in complex networks. Simulation
89(5), 578–588 (2012)
M. Jalili, Social power and opinion formation in complex networks. Phys. A 392(4), 959–966 (2013)
M. Jalili, M. Perc, Information cascades in complex networks. J. Complex Netw. 5(5), 665–693
(2017)
Q. Jiang, G. Song, G. Cong, Y. Wang, W. Si, K. Xie, Simulated annealing based influence maxi-
mization in social networks, in Proceedings of the Twenty-Fifth AAAI Conference on Artificial
Intelligence (2011), pp. 127–132
D. Kempe, J. Kleinberg, É. Tardos, Maximizing the spread of influence through a social network,
in Proceedings of the Ninth ACM SIGKDD International Conference on Knowledge Discovery
and Data Mining - KDD ’03 (2003), pp. 137–146
J. Leskovec, A. Krause, C. Guestrin, C. Faloutsos, J. VanBriesen, N. Glance, Cost-effective outbreak
detection in networks, in Proceedings of the 13th ACM SIGKDD International Conference on
Knowledge Discovery and Data Mining - KDD ’07 (2007), pp. 420–429
R. Mastrandrea, J. Fournet, A. Barrat, Contact patterns in a high school: a comparison between
data collected using wearable sensors, contact diaries and friendship surveys. PloS One 10(9),
e0136,497 (2015)
S. Mihara, S. Tsugawa, H. Ohsaki, Influence maximization problem for unknown social networks,
in Proceedings of the 2015 IEEE/ACM International Conference on Advances in Social Networks
Analysis and Mining 2015 - ASONAM ’15 (2015), pp. 1539–1546
F. Morone, H.A. Makse, Influence maximization in complex networks through optimal percolation.
Nature 524(7563), 65–68 (2015)
T. Murata, H. Koga, Methods for influence maximization in dynamic networks, in Proceedings of the
6th International Conference on Complex Networks and Their Applications (Complex Networks
2017). Studies in Computational Intelligence (Springer, 2017), pp. 955–966
T. Murata, H. Koga, Extended methods for influence maximization in dynamic networks. Comput.
Soc. Netw. 5(8), 1–21 (2018)
N. Ohsaka, T. Akiba, Y. Yoshida, K.I. Kawarabayashi, Fast and accurate influence maximization on
large networks with pruned monte-carlo simulations, in Proceedings of the Twenty-Eighth AAAI
Conference on Artificial Intelligence (2014), pp. 138–144
S. Osawa, T. Murata, Selecting seed nodes for influence maximization in dynamic networks, in Pro-
ceedings of the 6th Workshop on Complex Networks (CompleNet 2015). Studies in Computational
Intelligence (Springer, 2015), pp. 91–98
J. Stehlé, N. Voirin, A. Barrat, C. Cattuto, L. Isella, J.R. Pinton, M. Quaggiotto, W. den Broeck,
C. Régis, B. Lina, Others, High-resolution measurements of face-to-face contact patterns in a
primary school. PloS One 6(8), e23,176 (2011)
Y. Tang, X. Xiao, Y. Shi, Influence maximization: near-optimal time complexity meets practical
efficiency, in Proceedings of the 2014 ACM SIGMOD International Conference on Management
of Data (2014), pp. 75–86
P. Vanhems, A. Barrat, C. Cattuto, J.F. Pinton, N. Khanafer, C. Régis, B.A. Kim, B. Comte, N. Voirin,
Estimating potential infection transmission routes in hospital wards using wearable proximity
sensors. PloS One 8(9), e73,970 (2013)
Chapter 19
Temporal Link Prediction Methods
Based on Behavioral Synchrony

Yueran Duan, Qing Guan, Petter Holme, Yacheng Yang, and Wei Guan

Abstract Link prediction—to identify potential missing or spurious links in


temporal network data—has typically been based on local structures, ignoring long-
term temporal effects. In this chapter, we propose link-prediction methods based
on agents’ behavioral synchrony. Since synchronous behavior signals similarity and
similar agents are known to have a tendency to connect in the future, behavioral
synchrony could function as a precursor of contacts and, thus, as a basis for link pre-
diction. We use four data sets of different sizes to test the algorithm’s accuracy. We
compare the results with traditional link prediction models involving both static and
temporal networks. Among our findings, we note that the proposed algorithm is supe-
rior to conventional methods, with the average accuracy improved by approximately
2–5%. We identify different evolution patterns of four network topologies—a prox-
imity network, a communication network, transportation data, and a collaboration
network. We found that: (1) timescale similarity contributes more to the evolution of
the human contact network and the human communication network; (2) such con-
tribution is not observed through a transportation network whose evolution pattern
is more dependent on network structure than on the behavior of regional agents; (3)
both timescale similarity and local structural similarity contribute to the collaboration
network.

Y. Duan
School of Economics and Management, China University of Geosciences, Beijing, Beijing, China
Department of Computer Science, Aalto University, Espoo, Finland
Q. Guan (B)
School of Information Engineering, China University of Geosciences, Beijing, Beijing, China
e-mail: guanqing35@126.com
P. Holme
Department of Computer Science, Aalto University, Espoo, Finland
e-mail: petter.holme@aalto.fi
Y. Yang
School of Economics and Management, Tsinghua University, Beijing, China
e-mail: yang-yc21@mails.tsinghua.edu.cn
W. Guan
School of Economics and Management, China University of Geosciences, Beijing, Beijing, China


Keywords Link prediction · Temporal networks · Behavioral synchrony · Time


decay · Network evolution

19.1 Introduction

Homophily is an important mechanism for network link-formation in systems as


diverse as friendships, collaborations, investments, information diffusion, citations,
etc. (Garrod and Jones 2021; McPherson et al. 2001). Furthermore, some studies
argue that synchronous behavior between agents indicates a similarity (Dong et al.
2015; Wiltermuth and Heath 2009). For example, the times for stock-market agents
(investors, fund managers, companies, etc.) to respond to financial distress, policy
change, etc., could divide them into groups of different abilities—agents able to
respond similarly show homophily throughout a data set (Chakrabarti et al. 2021).
This phenomenon could also happen in trade (Meshcheryakova 2020), collabora-
tion (Li et al. 2020), information diffusion, communication (Riad et al. 2019), and
other scenarios.
For a decade, link prediction has been an active area of research (Lü and Zhou
2011). One common principle is to exploit neighborhood-based similarity. There
are two commonly used methods: First, first-order neighborhood methods, such
as common neighbor index (Liben-Nowell and Kleinberg 2003) and preferential
attachment index (Barabási and Albert 1999). Second, second-order neighborhood
methods, such as the Adamic-Adar index (Adamic and Adar 2003) and resource
allocation index (Zhou et al. 2009).
Several studies have incorporated temporal information into link prediction mod-
els to further test the time effects on network evolution (Ahmed and Xing 2009).
Recent progress in temporal link prediction involves four methods: time decay func-
tion (Liao et al. 2017; Bütün et al. 2018), matrix factorization-based prediction
model (Ma et al. 2017; Wu et al. 2018), deep learning (Xiang et al. 2020), and proba-
bility model (Ouzienko et al. 2010). The time decay function is popular in empirical
research due to its relatively light computational complexity, but two issues remain.
First, the behavior of an agent has the same effect on the link prediction, irrespective
of time. Second, the time decay function cannot capture the temporal changes in the
similarity between agents.
This research provides what we call a neighborhood-based similarity time vector
(NSTV) link prediction method, which quantitatively captures the behavioral syn-
chrony of agents and utilizes such information to design an effective and universal
link prediction method for temporal networks. We depicted the effects of time on
network evolution by providing a timescale similarity index that measures the similar
contacts at points in time between two agents. Then, we add the timescale similarity
index to the link prediction model to predict the potential linkage. We selected four
data sets of different sizes to test the accuracy of our algorithm. We also compared the
results with static network link prediction models and competing typical temporal
link prediction models. We also distinguished topologies of networks with different

evolution patterns by exploring the combination of two similarity measures. This research contributes to the effects of agents' behavior synchrony on network evolu-
tion by adding the time information into the network and improving the prediction
accuracy of the NSTV model using the temporal similarity index.
The rest of this chapter is arranged as follows: Sect. 19.2 introduces the prob-
lem statement, and its evaluation metrics; Sect. 19.3 reviews some related works;
Sect. 19.4 presents the NSTV link prediction model; Sect. 19.5 presents the data sets;
Sect. 19.6 designs the experiments and analyzes the results; Sect. 19.7 provides the
conclusions.

19.2 Problem Statement and Evaluation Metrics

19.2.1 Temporal Link Prediction

Consider an undirected temporal network G(V, E, T), where V = {v_1, v_2, …, v_n}, E = {e_1, e_2, …, e_m}, and T = {τ_1, τ_2, …, τ_k} denote the sets of nodes, links, and time stamps, respectively. Additionally, we use the notations n = |V|, m = |E|, and k = |T|. We do not consider self-connections and multiple links.
A_{τ_h} is the adjacency matrix of G_{τ_h}:

A_{τ_h} = \begin{pmatrix} a_{11τ_h} & ⋯ & a_{1nτ_h} \\ ⋮ & ⋱ & ⋮ \\ a_{n1τ_h} & ⋯ & a_{nnτ_h} \end{pmatrix}   (19.1)

where a_{ijτ_h} = 1 if there is a link between v_i and v_j at time τ_h, and a_{ijτ_h} = 0 otherwise.


Correspondingly, each node has a time vector according to the time of node interaction. The temporal activity of node v_i can be represented by:

T_i = (d_{i1}, d_{i2}, …, d_{ik}),   (19.2)

where d_{ij} indicates the degree of node v_i at time τ_j.


According to this definition, we attempt to forecast the potential link at time τk+1 .
Definitions are listed in Table 19.1 for reference.
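As a concrete illustration of Eq. (19.2) (not part of the original text), the following Python sketch builds the temporal activity vector T_i of every node from a list of per-time-stamp adjacency matrices; the function and variable names are our own.

import numpy as np

def temporal_activity_vectors(adjacency_per_step):
    # Returns an (n x k) array whose entry [i, j] is d_ij, the degree of
    # node v_i at time tau_j, i.e. the activity vector T_i of Eq. (19.2).
    return np.column_stack([A.sum(axis=1) for A in adjacency_per_step])

# Toy usage with 3 nodes and 2 time stamps.
A1 = np.array([[0, 1, 0],
               [1, 0, 1],
               [0, 1, 0]])
A2 = np.array([[0, 0, 1],
               [0, 0, 0],
               [1, 0, 0]])
T = temporal_activity_vectors([A1, A2])   # T[1] = [2, 0]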

19.2.2 Evaluation Metrics

We choose AUC and the ranking score as the performance evaluation indexes of the algorithm. In general, we delete 10% of the edges at random from the networks, and the remaining 90% of the edges make up the training set E^T.

Table 19.1 Variables used to deal with each data unit

V: the set of nodes; n: the number of nodes
E: the set of edges; m: the number of edges
T: the set of time stamps; k: the number of time stamps
E^T: the set of training edges; E^P: the set of probe edges
E^N: the set of nonexistent edges
U: the universal link set of all nodes (U = N(N − 1)/2 for an undirected network); s_{xy}: the link prediction score of the link between node v_x and node v_y
Γ(x): the neighbor set of node v_x; k_x: the degree of node v_x

AUC is defined as the probability that a randomly selected missing link is given a higher score than a randomly chosen nonexistent link (Lü and Zhou 2011). After n independent comparisons, the AUC value can be defined as:

AUC = (n' + 0.5 n'') / n,   (19.3)

where n' indicates the number of times that the score of the link in E^P is greater than that of the edge in E^N, and n'' indicates the number of times that the score of the link in E^P is equal to that of the link in E^N. According to Eq. 19.3, a higher value of AUC indicates the degree to which the potential linking mechanism of the algorithm is more accurate than random selection. AUC = 0.5 indicates the score is random.
The ranking score measures the ranking of the probe links among the unknown links. Let S = U − E^T; S is the set of unknown links, including both the probe links and the nonexistent links. The ranking score is defined as (Zhou et al. 2007):

RankS = (1/|E^P|) Σ_{i ∈ E^P} RankS_i = (1/|E^P|) Σ_{i ∈ E^P} r_i / |S|,   (19.4)

where r_i represents the rank of edge i's score in the set of unknown edges. According to this equation, a smaller RankS value represents a higher edge ranking in the probe set, which means that the probability of being predicted successfully is higher, so the algorithm's accuracy is higher.
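For reference, a minimal Python sketch of both evaluation metrics, written directly from Eqs. (19.3) and (19.4); the sampling size n, the random seed, and all function names are our own choices, not taken from the chapter.

import numpy as np

def auc(probe_scores, nonexistent_scores, n=10000, seed=0):
    # Eq. (19.3): compare n randomly drawn probe/nonexistent score pairs.
    rng = np.random.default_rng(seed)
    p = rng.choice(probe_scores, size=n)
    q = rng.choice(nonexistent_scores, size=n)
    n_higher = np.sum(p > q)      # probe link scored strictly higher
    n_equal = np.sum(p == q)      # tie
    return (n_higher + 0.5 * n_equal) / n

def ranking_score(probe_scores, nonexistent_scores):
    # Eq. (19.4): average normalised rank r_i / |S| of the probe links
    # among all unknown links S (probe plus nonexistent), rank 1 = best score.
    unknown = np.concatenate([probe_scores, nonexistent_scores])
    order = np.argsort(-unknown)                  # indices sorted by descending score
    rank = np.empty(len(unknown), dtype=int)
    rank[order] = np.arange(1, len(unknown) + 1)
    probe_ranks = rank[:len(probe_scores)]        # probe links occupy the first entries
    return np.mean(probe_ranks / len(unknown))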

Table 19.2 Indicators used to describe neighborhood-based similarity (NS)

Common Neighbor Index (CN): s^{CN}_{xy} = |Γ(x) ∩ Γ(y)|. CN takes the number of common neighbors of two nodes as the similarity score (Liben-Nowell and Kleinberg 2003). The calculation of this indicator is simple, and the performance is competitive.
Jaccard Coefficient (JC): s^{JC}_{xy} = |Γ(x) ∩ Γ(y)| / |Γ(x) ∪ Γ(y)|. JC uses the ratio of the common neighbors of two nodes to all the neighbors of the two nodes as the similarity score (Jaccard 1901).
Preferential Attachment Index (PA): s^{PA}_{xy} = k_x k_y. PA reflects the principle of preferential attachment (Barabási and Albert 1999).
Adamic-Adar Index (AA): s^{AA}_{xy} = Σ_{z ∈ Γ(x) ∩ Γ(y)} 1/log(k_z). The AA index considers that vertices with fewer shared relations have higher linking probability (Adamic and Adar 2003).
Resource Allocation (RA): s^{RA}_{xy} = Σ_{z ∈ Γ(x) ∩ Γ(y)} 1/k_z. RA is similar to AA in its way of empowerment (Zhou et al. 2009).

19.3 Related Work

19.3.1 Link Prediction in Static Network

The purpose of link prediction is to detect missing links and identify spurious inter-
actions from existing parts of the network, and we can try to reconstruct the observed
networks by using these models (Guimerà and Sales-Pardo 2009). A number of pre-
diction models have been developed, but the simplest framework for link prediction
is based on the similarity between nodes for design and prediction (Lü and Zhou
2011). The higher the similarity between two nodes is, the higher the probability
that two nodes may have a link (Newman 2001). For example, nodes with more common neighbors are more likely to form a link (Liben-Nowell and Kleinberg 2003). In the
field of similarity-based link prediction, there are three main methods: local index,
quasi-local index, and global index (Lü and Zhou 2011).
Verification on a large number of empirical networks has shown that link prediction methods based on local structural similarity perform well and are simple to operate (Lü and Zhou 2011). Table 19.2
lists five widely used link prediction models based on local structural similarity.
The five methods will be used as a comparative test to verify the superiority of the
algorithm proposed in this chapter.
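For concreteness, the five indices of Table 19.2 can be computed directly from neighbour sets. The sketch below (Python; the dictionary-based data layout is an assumption of ours, not prescribed by the chapter) scores a single candidate pair (x, y).

import math

def ns_scores(neighbors, degree, x, y):
    # neighbors: dict node -> set of neighbours Gamma(node)
    # degree:    dict node -> degree k
    common = neighbors[x] & neighbors[y]
    union = neighbors[x] | neighbors[y]
    return {
        "CN": len(common),
        "JC": len(common) / len(union) if union else 0.0,
        "PA": degree[x] * degree[y],
        # guard against log(1) = 0; a genuine common neighbour always has degree >= 2
        "AA": sum(1.0 / math.log(degree[z]) for z in common if degree[z] > 1),
        "RA": sum(1.0 / degree[z] for z in common if degree[z] > 0),
    }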

19.3.2 Link Prediction in Temporal Networks

Link prediction in temporal networks is a method of evaluating network evolving mechanisms by obtaining network data that evolves over time, and then finding future
links. Due to the time-varying characteristics of complex systems, more scholars tried
incorporating time information into the link prediction model. At present, there are
several temporal link prediction methods: deep learning, matrix factorization-based
prediction model, time decay function, etc.

19.3.2.1 Temporal Link Prediction with Deep Learning

Graph embedding can retain the properties of temporal networks while mapping the
networks into low-dimensional vector spaces (Xiang et al. 2020). Network embed-
ding can solve the problem that graph data is difficult to input into machine learning
algorithms efficiently.
With the development of graph neural networks, deep learning frameworks such
as graph convolutional network (GCN), long short-term memory network (LSTM),
and generative adversarial networks (GAN) are gradually applied to the dynamic
prediction of complex networks (Lei et al. 2019). In particular, Lei et al. (2019)
used GCN to explore the local topological features of each time segment, used
LSTM model to represent the evolution characteristics of dynamic networks, and
finally used GAN framework to enhance the model to generate the next weighted
network segment. Based on this model, many improved algorithms have attracted
more scholars’ attention, such as adding time decay coefficient (Meng et al. 2019),
increasing attention mechanism (Yang et al. 2019), and so on.
Graph embeddings are more practical than adjacency matrices because they pack node attributes into lower-dimensional vectors. Vector operations are simpler and faster than comparable operations on graphs. However, the prediction model
based on deep learning is complicated in design, its potential mechanism is unclear,
and it is not universal in different types of networks. Although these models perform
well, they are not widely used in empirical studies.

19.3.2.2 Matrix Factorization-Based Temporal Link Prediction

Link prediction can be regarded as an adjacency matrix filling problem, which can be solved by combining explicit and implicit features of nodes and links with a matrix decomposition method via a bilinear regression model (Menon and Elkan
2011). Because matrix/tensor is the most direct method to characterize networks,
matrix/tensor decomposition-based link prediction model is widely used in temporal
link prediction. Ma et al. (2017) proved the equivalence between the nonnegative
matrix decomposition (NMF) of a communicability matrix and the eigendecomposi-

tion (ED) of the Katz matrix, providing a theoretical explanation for the application
of NMF to the temporal link prediction problem.
Dunlavy et al. (2011) expressed the link prediction problem as a periodic temporal
link prediction, compressed the data into a single matrix through weight assignment,
and obtained the time characteristics by tensor decomposition. Ma et al. (2018) used
graph communicability to decompose each network to obtain features and then col-
lapsed the feature matrix to predict temporal links. However, these studies all decom-
posed the network or collapsed features at each time layer and ignored the relationships
among slices. To overcome this problem, scholars proposed a graph regularization
non-negative matrix decomposition algorithm for temporal link prediction without
destroying dynamic networks (Ma et al. 2018; Zhang et al. 2019). Based on the
graph regularization non-negative matrix decomposition algorithm, some scholars
fused the community topology as a piece of prior information into a new temporal
link prediction model, effectively improving the prediction accuracy (Zhang et al.
2021).
Compared with the prediction model based on deep learning methods, the matrix
decomposition-based prediction model has much fewer parameters to learn. How-
ever, it is difficult to be used in large-scale networks and empirical studies due to the
high cost of matrix/tensor calculation.

19.3.2.3 Link Prediction with a Time Decay Function

During the growth of the network, agents exhibit heterogeneous fitness values that
decay over time (Medo et al. 2011). Therefore, many scholars have incorporated the
time decay function into the link prediction model based on structural similarity to
obtain a better prediction effect (Liao et al. 2017; Bütün et al. 2018). Meanwhile,
some scholars added the time decay function as the weight to the node pair for graph
embedding and finally added it into the framework of deep learning (Meng et al.
2019).
The time decay effect was proposed because scholars focused on the mechanism
that some effects weaken over time during the evolution of networks. The time decay
function-based prediction model is simple and performs well in some networks.
However, the time decay function appears as the weight of some attributes of networks
and does not distinguish the network structure of different time layers. The time-
based link prediction method proposed in this chapter also adds the time factor to
the original prediction model, so the prediction model with the time decay factor is
mainly selected for comparison experiments.

19.3.2.4 Other Temporal Link Prediction Models

In addition to the above models, some temporal link prediction models, such as time
series-based prediction models, probabilistic prediction models, and higher-order
structure prediction models, have also been well developed.

The time series-based temporal link prediction model generally builds time series
based on various node similarity measures and scores the possibility of node pair
linking by time series prediction model (da Silva Soares and Prudêncio 2012; Huang
and Lin 2009; Yang et al. 2015). One case is that Güneş et al. (2016) combined some
Autoregressive (AR) and Moving Average (MA) processes as time series prediction
models. Probabilistic models quantify change and uncertainty by using maximum
likelihood methods or probability distributions. The probability distribution describes
the range of possible values and gives the most likely values, which is helpful in
considering all possible outcomes in the process of temporal link prediction. Such
methods focus on constructing probabilistic models, such as the Exponential Random
Graph Model (ERGM) (Ouzienko et al. 2010). To better simulate real interaction
scenarios, some scholars have developed temporal link prediction methods based on
motif features or higher-order features to help us understand and predict the evolution
mechanism of different systems in the real world (Yao et al. 2021; Li et al. 2018; He
et al. 2023).

19.4 From Time Decay Function to Time Vector

19.4.1 Neighborhood-based Similarities with a Temporal Logarithmic Decay Function (NSTD) Link Prediction Model

The NSTD link prediction model uses a time decay function as the weight of the
local structure similarity score. The model is simple and common in design and can
be embedded into some complex models. Therefore, the NSTD prediction model is
popular in both empirical research and algorithm improvement.
This chapter chooses the NSTD link prediction model with a temporal logarith-
mic decay function as a comparison experiment of link prediction algorithms. The
temporal function can be defined as (Bütün et al. 2018):

F_t(v_x, v_z) = log[(t(v_x, v_z) − t_start) + c]   (19.5)

where c is a constant. Then, the time function is added as a weight to the similarity
indicators based on common neighbors. The formulas are shown in Table 19.3:


Table 19.3 Functions used to describe NSTD link prediction model

CN: s^{TCN}_{xy} = Σ_{v_z ∈ Γ(x) ∩ Γ(y)} [f_t(v_x, v_z) + f_t(v_y, v_z)]
JC: s^{TJC}_{xy} = Σ_{v_z ∈ Γ(x) ∩ Γ(y)} [f_t(v_x, v_z) + f_t(v_y, v_z)] / [Σ_{v_a ∈ Γ(x)} f_t(v_x, v_a) + Σ_{v_b ∈ Γ(y)} f_t(v_y, v_b)]
PA: s^{TPA}_{xy} = [Σ_{v_a ∈ Γ(x)} f_t(v_x, v_a)] × [Σ_{v_b ∈ Γ(y)} f_t(v_y, v_b)]
AA: s^{TAA}_{xy} = Σ_{v_z ∈ Γ(x) ∩ Γ(y)} [f_t(v_x, v_z) + f_t(v_y, v_z)] / log(1 + Σ_{v_c ∈ Γ(z)} f_t(v_z, v_c))
RA: s^{TRA}_{xy} = Σ_{v_z ∈ Γ(x) ∩ Γ(y)} [f_t(v_x, v_z) + f_t(v_y, v_z)] / Σ_{v_c ∈ Γ(z)} f_t(v_z, v_c)
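To make the NSTD construction concrete, the sketch below (Python, with our own bookkeeping of contact times; an illustration, not the authors' code) evaluates the temporally weighted common-neighbour score s^{TCN}_{xy} from the first row of Table 19.3 with the logarithmic weight of Eq. (19.5).

import math

def f_t(contact_time, t_start, c=1.0):
    # Logarithmic temporal weight of Eq. (19.5).
    return math.log((contact_time - t_start) + c)

def tcn_score(neighbors, contact_time, x, y, t_start, c=1.0):
    # neighbors:    dict node -> set of neighbours
    # contact_time: dict (u, v) -> time stamp of the u-v contact
    #               (we assume the latest contact is stored if there are several)
    score = 0.0
    for z in neighbors[x] & neighbors[y]:
        score += f_t(contact_time[(x, z)], t_start, c)
        score += f_t(contact_time[(y, z)], t_start, c)
    return score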

Fig. 19.1 A sample of the neighborhood-based similarity with temporal vector link prediction
algorithm. a Calculate the common neighbors of unknown links: sC N (1, 4) = 2; sC N (2, 3) = 2;
sC N (2, 5) = 1; sC N (3, 5) = 1; sC N (4, 5) = 0. The most potentially possible links are 1–4 and
2-3 (sC N (1, 4) = sC N (2, 3) = 2). b Calculate the temporal vectors of each node: 1 : [1, 1, 1],
2 : [2, 0, 0], 3 : [0, 2, 0], 4 : [1, 1, 0], 5 : [0, 0, 1]. The timescale similarity between nodes is:
cos(1, 4) = 0.667; cos(2, 3) = 0. Because cos(1, 4) > cos(2, 3), then the most potentially possible
link is 1–4

19.4.2 Neighborhood-based Similarities and Temporal Vector (NSTV) Link Prediction Model

This chapter proposes a neighborhood-based similarity with a temporal vector (NSTV) link prediction algorithm. The temporal vector of each agent is generated according to whether the agent interacts at each point of time, and the similarity
measure of a temporal vector is added to the original link prediction model.
Specifically speaking, in the link prediction process based on common neighbors,
the common neighbors of node pairs are positive integers. Many node pairs have the
same number of common neighbors whose priority is indistinguishable. In this case,
the degree of similarity between agents can be judged by analyzing their behavior
synchrony at different times. As shown in Fig. 19.1, the network generation has n
time periods, so each agent has an n-dimensional temporal vector. The temporal
vector of each agent is obtained by recording d when the agent interacts d times at a given time stamp, and 0 when no interaction occurs at that time stamp. Then, we choose the
cosine of the temporal vectors of the two agents as the measurement of the similarity

between the agents and add it to the traditional link prediction model based on the
neighbor similarity.
The timescale similarity of behaviors between agents can be defined as:

t_{xy} = cos(T_x, T_y) = Σ_{i=1}^{k} (t_{xi} × t_{yi}) / ( \sqrt{Σ_{i=1}^{k} t_{xi}^2} × \sqrt{Σ_{i=1}^{k} t_{yi}^2} )   (19.6)

The link prediction score of this method can be defined as:

Sim^{NSTV}_{xy} = α × t_{xy}/t_{max} + (1 − α) × s_{xy}/s_{max},   α ∈ [0, 1]   (19.7)

where t_{max} represents the maximum of the timescale similarity over all node pairs, and s_{xy} represents the link prediction score based on a certain local structure. Accordingly, s_{max} represents the maximum score over all node pairs under the link prediction model.
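Putting Eqs. (19.6) and (19.7) together, a minimal NSTV scoring sketch follows (Python; the choice of a CN-style struct_score and all names are illustrative assumptions of ours):

import numpy as np

def timescale_similarity(T, x, y):
    # Cosine similarity of the activity vectors T_x and T_y, Eq. (19.6).
    den = np.linalg.norm(T[x]) * np.linalg.norm(T[y])
    return float(np.dot(T[x], T[y])) / den if den > 0 else 0.0

def nstv_scores(T, struct_score, pairs, alpha=0.5):
    # Eq. (19.7): combine the normalised timescale similarity with a
    # normalised local-structure score (e.g. CN) for every candidate pair.
    t = {p: timescale_similarity(T, *p) for p in pairs}
    t_max = max(t.values()) or 1.0
    s_max = max(struct_score[p] for p in pairs) or 1.0
    return {p: alpha * t[p] / t_max + (1 - alpha) * struct_score[p] / s_max
            for p in pairs}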

19.4.3 Neighborhood-based Similarities with a Temporal Logarithmic Decay Function and Temporal Vector (NSTDV) Link Prediction Model

The link prediction method based on the time decay function assumes that the
homophily effect of the agent will decrease with time. NSTD link prediction model
takes the time decay function as the weight of local structure similarity between
agents. The agents’ timescale similarity proposed in this chapter can also be used
as an index of the similarity between agents. Then, does timescale similarity have
a decay effect? Therefore, based on the NSTD link prediction model, we also take
the time decay function as the weight of the timescale similarity index to explore the
decay effect of agent behavior synchrony.
The temporal activity of node vi after adding the time decay function can be
expressed as:

T^D_i = (log(1 + c) d_{i1}, log(2 + c) d_{i2}, …, log(k + c) d_{ik})   (19.8)

Timescale similarity with the time decay function of behaviors between agents
can be defined as:
t^D_{xy} = cos(T^D_x, T^D_y)   (19.9)

The link prediction score of this method can be defined as:

Sim^{NSTDV}_{xy} = α × t^D_{xy}/t^D_{max} + (1 − α) × s^D_{xy}/s^D_{max},   α ∈ [0, 1]   (19.10)

where t^D_{max} represents the maximum of the timescale similarity with the time decay function over all node pairs.
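In code, the decay of Eq. (19.8) is just a column-wise rescaling of the activity matrix before the cosine of Eq. (19.9); a short sketch (Python, names ours):

import numpy as np

def decayed_activity(T, c=1.0):
    # Multiply column j (time tau_j, 1-indexed) of the (n x k) activity
    # matrix T by log(j + c), as in Eq. (19.8); t^D_xy of Eq. (19.9) is then
    # the cosine similarity of the rescaled rows.
    k = T.shape[1]
    return T * np.log(np.arange(1, k + 1) + c)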

19.4.4 Neighborhood-based Similarities Temporal Vector for Heterogeneous Time Layer (NSTHV) Link Prediction Model

The NSTDV model distinguishes the importance of time layers in the form of time
decay. The AA index considers that the contribution of less-connected common
neighbors is greater than that of more-connected common neighbors (Adamic and
Adar 2003). Inspired by the AA index, we believe that the less-connected time layer
contributes more than the more-connected time layer. Therefore, we proposed the
NSTHV prediction model, and the importance of the time layer can be expressed as:

h_i = 1 / log(m_i + c)   (19.11)

where h_i represents the importance of time layer i, m_i represents the number of edges in time layer i, and c is a constant. The temporal activity of node v_i after adding the importance of the time layer can be expressed as:

T^H_i = (h_1 d_{i1}, h_2 d_{i2}, …, h_k d_{ik})   (19.12)

Timescale similarity for heterogeneous time layers can be defined as:

t^H_{xy} = cos(T^H_x, T^H_y)   (19.13)

The link prediction score of this method can be defined as:

Sim^{NSTHV}_{xy} = α × t^H_{xy}/t^H_{max} + (1 − α) × s^H_{xy}/s^H_{max},   α ∈ [0, 1]   (19.14)

where t^H_{max} represents the maximum of the timescale similarity with the importance of the time layer over all node pairs.
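Analogously, Eqs. (19.11) and (19.12) amount to weighting each time layer by 1/log(m_i + c); a brief sketch (Python; the default c = 2 is our own choice, made only so that log(m_i + c) stays positive even for very sparse layers):

import numpy as np

def layer_importance(edges_per_layer, c=2.0):
    # h_i = 1 / log(m_i + c) of Eq. (19.11), one value per time layer.
    return 1.0 / np.log(np.asarray(edges_per_layer, dtype=float) + c)

def layer_weighted_activity(T, edges_per_layer, c=2.0):
    # Rescale column i of the (n x k) activity matrix T by h_i, as in Eq. (19.12).
    return T * layer_importance(edges_per_layer, c)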

19.5 Data

In this chapter, four different temporal networks are selected to verify the algorithm.
They are (1) Hypertext; (2) UK airline; (3) Hollywood; (4) EU email. These four
networks have been briefly introduced in the text above.

Table 19.4 Summary statistics of the four data sets


Dataset AD AWD ND Density ACC APL
Hypertext 38.867 368.46 3 0.347 0.54 1.656
UK airline 14.764 109.673 4 0.273 0.699 1.833
Hollywood 8.4 21.345 4 0.156 0.331 2.161
EU email 30.357 638.142 7 0.031 0.439 2.622

We calculated the average degree (AD), average weighted degree (AWD), network
diameter (ND), density, average clustering coefficient (ACC), and average path length
(APL) of the four networks in static state, as shown in Table 19.4. The four network
visualization pictures are shown in Fig. 19.2. Among them, Hypertext and EU email
have obvious community structures, while UK Airline shows typical geographical
characteristics.
To understand the distribution characteristics of the agent behavior at different
points in time, we also draw the distribution characteristics of node vectors (shown
in Fig. 19.3). The agent behavioral characteristics of Hypertext and EU email both
show periodicity, which is relevant to the law of human activity (for example, people
tend to socialize during the day rather than at night). For UK airline, agent behavior is more frequent at time 1 and more even afterwards, whereas the Hollywood actor behavior is concentrated in the middle, which also correlates with the span of an actor's career.

19.6 Experiments

19.6.1 Experimental Setup

We proposed a new link prediction model with time information named NSTV link
prediction model. Therefore, the accuracy of this algorithm should be compared with
the NS link prediction model without time information and the NSTD link prediction
model with a time decay function. In addition, we proposed the NSTDV model
and NSTHV model to distinguish the importance of the time layer from different
perspectives and compare them with the NSTV model. In the comparison experiments, α was set to 0.5.
To understand the contribution allocation of timescale similarity and local struc-
ture similarity to the link prediction model in different types of networks, we con-
ducted experiment 3 to observe the potential linking mechanism of different types
of networks by adjusting different α values.

Fig. 19.2 Visualization of the four empirical networks. Colors represent different communities,
obtained by a network clustering algorithm (for the purpose of this paper, they only serve as deco-
ration)

19.6.2 Experimental Datasets

Experiments are conducted on four different datasets: (1) Hypertext 2009 (Isella et al.
2011), (2) UK airline (Morer et al. 2020), (3) Hollywood (Taylor et al. 2017), and
(4) EU email (Paranjape et al. 2017). To verify the universality of the algorithm, we
selected four different temporal networks. They differ in network size, timestamp size,
time span, and domain. The details are shown in Table 19.5. Hypertext describes

Fig. 19.3 Time vector distribution of four data sets: Due to a large number of timestamps in
a Hypertext and d EU email, the abscissa of the time vector distribution of these two data sets
is the timestamp serial number, and the ordinate is the number of interactions. The distribution
characteristics of the time vector of the UK Airline b and Hollywood c data sets are the timestamp
serial number on the abscissa and the node serial number on the ordinate (the serial numbers are
ranked according to the degree value of the node)

a network of face-to-face contacts of the attendees of the ACM Hypertext 2009 conference. UK airline describes a temporal network of domestic flights operated
in the United Kingdom between 1990 and 2003. Hollywood describes a network of
all collaborations among 55 actors who were prominent during the Golden Age of
Hollywood (1930–1959). EU email is a temporal network that was generated using
email data from a large European research institution.

19.6.3 Experimental Results

The experimental results consist of three parts, which are (1) the comparative exper-
iment between the NSTV link prediction model with time information and the NS
link prediction model in static networks, (2) the comparative experiment between
the NSTV link prediction model with time vector, the NSTD link prediction model
with time decay function, and NSTDV link prediction model with time decay func-
tion, and (3) the proportion of the distribution of time and local structure in the link
prediction model.

Table 19.5 Four data sets' attributes and what they represent

Dataset | No. nodes | No. links | No. contacts | Time span | Network domain
Hypertext | 113 | 2196 | 5246 | 2.5 days | Human proximity network
UK airline | 47 | 406 | 14 | 1990–2003 | Transportation network
Hollywood | 55 | 1043 | 10 | 1909–2009 | Collaboration network
EU email | 986 | 24929 | 206313 | 803 days | Human communication network

19.6.3.1 Experiment 1: Comparing the NS and NSTV Models

To explore whether the introduction of time information can supplement the existing
link prediction model, we compared the NSTV prediction model with the NS predic-
tion model and NSTD prediction model to observe whether the algorithm accuracy
is improved. The experimental results are shown in Fig. 19.4.
A low RankS value and high AUC value indicate the high accuracy of the algo-
rithm. It can be seen from the figure that: firstly, compared with NS model and
NSTD model, the NSTV algorithm proposed in this chapter shows a better pre-
diction effect in all four data sets. Therefore, we have evidence to believe that the
temporal features of the network play an important role in identifying the mecha-
nism of network generation. Secondly, NSTV algorithm maintains the accuracy of
different local structure algorithms at a relatively stable and high level. For example,
in the EU email dataset, the accuracy obtained by using the PA index in the NS and NSTD models is very low, far below the average level of the other algorithms. However, the NSTV link prediction model brings the accuracy back to
the average level. Thirdly, the improvement of NSTV algorithm accuracy is slightly
different in different types of data sets. In the human contact network, the accuracy
improvement is obvious, while that in the transportation network is the least. This is
because the contact behavior of participants in the contact network changes greatly
with time, so time has a great effect on the generation and evolution of such networks.
The evolution of the transportation network depends more on the distance between
two places, economic forces, national relations, and other practical factors. Once a
stable structure is formed, it is difficult to change over time. This indicates that time
has different effects on different types of networks, and its realistic factors should
be considered more carefully in empirical studies. We will discuss the difference in
more detail later. Finally, the accuracy of NSTD model and NS model is similar, and
it has slight advantages in some situations. This shows that the time decay effect of
the local structure effect of the four types of networks is not obvious.

Fig. 19.4 Comparison of the NS, NSTV, and NSTD algorithms by RankS and AUC

19.6.3.2 Experiment 2: Comparing the NSTV, NSTDV, and NSTHV Models

The NSTV model does not consider the importance of different time layers to
the topology in computing the timescale similarity. We proposed the NSTDV and
NSTHV models that take into account the heterogeneity effect of time layers. We
compared the accuracy of NSTV model, NSTDV model, and NSTHV model, and
the experimental results are shown in Fig. 19.5.
Compared with the NSTV model, the two models with heterogeneous time effect,
NSTDV and NSTHV showed no significant improvement in most cases. However, in

Fig. 19.5 Comparison of AUC and RankS in NSTV, NSTDV, and NSTHV algorithms

the collaboration network, the accuracy of NSTDV and NSTHV models is improved,
especially the prediction model based on CN, PA, and AA indicators. This is because
the collaboration network has a large time span and the evolution of the network has
a certain dependence on time similarity. From another perspective, the accuracy of
NSTHV model is slightly better than that of NSTDV model in most situations, but
the gap between the two is not obvious. Therefore, we can ignore the time layer and
still achieve a good prediction effect.

19.6.3.3 Experiment 3: The Contribution of Timescale Similarity and Local Structure Similarity

In the previous comparative experiments, α was taken as 0.5. In other words, we evenly distributed the contribution of the node's timescale similarity and local struc-
ture similarity to the link prediction model. However, we have found in previous
experiments that NSTV algorithm improves to different degrees in different types of
networks. Therefore, the contributions of time and local structure in the prediction
model are different for different types of networks due to different potential linking
mechanisms. So, which matters more in the four types of networks: time or local structure?
Based on these problems, we observe the contribution distribution of time and
local structure in the four types of networks by changing the value of α. We set the
initial value of α to 0 (contribution is devoted all by local structure), the step size
is 0.1, and the maximum value is 1 (contribution is devoted all by time). Because
AUC has certain random selectivity in the calculation process, the random error will
affect the judgment of the evolution trend of the network. Therefore, we selected the
RankS index to observe the accuracy of the prediction model. Although the randomly
selected probe sets in the prediction model also produce random errors, it is reliable
that we choose the same probe sets when adjusting α values. The experimental results
are shown in Fig. 19.6.
To summarize our experimental results:

Fig. 19.6 The contribution of timescale similarity and local structure similarity in four data sets

• In the Hollywood collaborative network, when the weight of timescale similarity is about 0.4 and the weight of local structure similarity is about 0.6, the algorithm has
the highest accuracy. It shows that in the cooperation network, having a common
partner plays a more important role in an agent’s choice of cooperation agents.
The similarity of the choice of cooperation time also has a greater impact.
• In the UK airline aviation network, the accuracy is highest when the contribution
of timescale similarity is only 0.2–0.3. This shows that the aviation network is
not susceptible to the influence of time and is more dependent on its structural
characteristics. This is consistent with the phenomenon that the algorithm accuracy
of the temporal link prediction model in the aviation network was not significantly
improved in Experiment 1.
• The situation of Hypertext and EU email is similar. The accuracy trend of the
prediction algorithm is almost the same. When the timescale similarity weight
is 0.8–0.9, the accuracy is the highest. Moreover, the timescale similarity weight
is 1 in the EU email dataset. This indicates that both the timescale similarity of
face-to-face communication between people in an unfamiliar environment and the
timescale similarity of emails delivered by colleagues between departments affect
the choice of the next communication partner. Even in email networks, the effects of
local network structures, such as common neighbors and preferred connections,
can be ignored. The result also proves that the human proximity network and
the human communication network have similar evolutionary characteristics, and
empirical research can consider learning from each other’s evolutionary models.
We found out more about the NSTV model. The RankS value of the model drops
faster when α is moved from 0 to 0.1 than after. This once again proves that the
addition of timescale similarity can better predict potential links. Among the four
networks, the best method of NSTV algorithm combined with local structure sim-
ilarity is different. This implies that the dynamic mechanism of different types of
networks is not completely consistent. In empirical research, we need to design a
more suitable prediction model by exploring the evolution mechanism of different
networks.

19.7 Conclusion

We proposed the NSTV link prediction model by computing nodes’ behavior syn-
chrony as a similarity measure by exploring the temporal characteristics of the net-
work and adding this index to the local structure similarity-based prediction models.
Experimental results showed that the timescale similarity index based on nodes’
time activity vector could be used as the potential linking mechanism of network
generation to predict future links. In the four data sets, the NSTD model does not
perform well, which proves that not all networks have a decay effect of local structure
similarity. NSTDV and NSTHV are not significantly better than NSTV model: the
prediction is not much improved after the heterogeneous time layer effect is compen-

sated for. Therefore, we can ignore the heterogeneous effect of time layers to some
extent, whether it is the contribution of time decay or reciprocal link number.
The innovations of this chapter are as follows: First, we quantified the
agents’ behavior synchrony index by mining the network time information. Secondly,
we proposed an effective and universal temporal link prediction model. This NSTV
model not only reflects the full timescale feature of the network but also mines the
temporal characteristics of the network. Furthermore, we distinguished the potential
linking mechanism of different types of networks through the ratio of different simi-
larity indexes, which is very beneficial to further understanding networks’ evolution
characteristics. Finally, most previous empirical studies on temporal link prediction
have mainly focused on the networks related to human behavior. This chapter is the
first to include the evolutionary time characteristics of infrastructure networks in the
link prediction model. The experimental results showed that the time characteristic
has some weak effect on this kind of network.
Based on the research and methods in this chapter, we believe that time, as a rep-
resentation of agents’ behavior synchrony, can represent a brand-new characteristic
of network evolution, and the evolution mechanism and future trend of the entire
network can be obtained through the changes of agents’ behavior choices.
The NSTV model shows that the contribution of behaviors is not simply decaying
in the full timescale, and some behaviors in the past are still significant. Moreover,
our experiment proved that the accuracy of the two models with heterogeneous time
effects did not improve significantly. A universal temporal link prediction method,
to some extent, can ignore time-layer heterogeneity. However, the prediction results
based on different network characteristics and time heterogeneity are more repre-
sentative of the real world. In empirical studies, how to identify critical time points, how to weight agents' behavioral synchrony, and how to account for the importance of heterogeneous time layers will be the focus of our future work.

Acknowledgements Y.D., Q.G. and W.G. are grateful for the funding from the National Natural
Science Foundation of China (Grant No. 42001236), Young Elite Scientist Sponsorship Program
by Bast (Grant No. BYESS2023413), and the National Natural Science Foundation of China (Grant
No. 71991481, 71991480). P.H. was supported by JSPS KAKENHI Grant Number JP 21H04595.

References

L.A. Adamic, E. Adar, Friends and neighbors on the web. Soc. Netw. 25(3), 211–230 (2003)
A. Ahmed, E.P. Xing, Recovering time-varying networks of dependencies in social and biological
studies. Proc. Natl. Acad. Sci. USA 106(29), 11878–11883 (2009)
A.L. Barabási, R. Albert, Emergence of scaling in random networks. Science 286(5439), 509–512
(1999)
E. Bütün, M. Kaya, R. Alhajj, Extension of neighbor-based link prediction methods for directed,
weighted and temporal social networks. Inf. Sci. 463, 152–165 (2018)
P. Chakrabarti, M.S. Jawed, M. Sarkhel, Covid-19 pandemic and global financial market interlink-
ages: a dynamic temporal network analysis. Appl. Econ. 53(25), 2930–2945 (2021)

P.R. da Silva Soares, P.B.C. Prudêncio, Time series based link prediction, in The 2012 International
Joint Conference on Neural Networks (IJCNN) (IEEE, 2012), pp. 1–7
P. Dong, X. Dai, R.S. Wyer Jr., Actors conform, observers react: the effects of behavioral synchrony
on conformity. J. Pers. Soc. Psychol. 108(1), 60 (2015)
D.M. Dunlavy, T.G. Kolda, E. Acar, Temporal link prediction using matrix and tensor factorizations.
ACM Trans. Knowl. Discov. Data (TKDD) 5(2), 1–27 (2011)
M. Garrod, N.S. Jones, Influencing dynamics on social networks without knowledge of network
microstructure. J. R. Soc. Interface 18(181), 20210,435 (2021)
R. Guimerà, M. Sales-Pardo, Missing and spurious interactions and the reconstruction of complex
networks. Proc. Natl. Acad. Sci. USA 106(52), 22073–22078 (2009)
İ Güneş, Ş Gündüz-Öğüdücü, Z. Çataltepe, Link prediction using time series of neighborhood-based
node similarity scores. Data. Min. Knowl. Discov. 30, 147–180 (2016)
X. He, A. Ghasemian, E. Lee, A. Clauset, P. Mucha, Sequential stacking link prediction algorithms
for temporal networks (2023). Preprint available at Research Square
Z. Huang, D.K. Lin, The time-series link prediction problem with applications in communication
surveillance. INFORMS J. Comput. 21(2), 286–303 (2009)
L. Isella, J. Stehlé, A. Barrat, C. Cattuto, J.F. Pinton, W. Van den Broeck, What’s in a crowd?
Analysis of face-to-face behavioral networks. J. Theor. Biol. 271(1), 166–180 (2011)
P. Jaccard, Étude comparative de la distribution florale dans une portion des alpes et des jura. Bull.
de la Soc. Vaud. des Sci. Nat. 37, 547–579 (1901)
K. Lei, M. Qin, B. Bai, G. Zhang, M. Yang, Gcn-gan: a non-linear temporal link prediction model
for weighted dynamic networks, in IEEE INFOCOM 2019-IEEE Conference on Computer Com-
munications (IEEE, 2019), pp. 388–396
Y. Li, Y. Wen, P. Nie, X. Yuan, Temporal link prediction using cluster and temporal information
based motif feature, in 2018 International Joint Conference on Neural Networks (IJCNN) (2018),
pp. 1–8. https://doi.org/10.1109/IJCNN.2018.8489644
A. Li, L. Zhou, Q. Su, S.P. Cornelius, Y.Y. Liu, L. Wang, S.A. Levin, Evolution of cooperation on
temporal networks. Nat. Commun. 11(1), 2259 (2020)
H. Liao, M.S. Mariani, M. Medo, Y.C. Zhang, M.Y. Zhou, Ranking in evolving complex networks.
Phys. Rep. 689, 1–54 (2017)
D. Liben-Nowell, J. Kleinberg, The link prediction problem for social networks, in Proceedings of
the Twelfth International Conference on Information and Knowledge Management (2003), pp.
556–559
L. Lü, T. Zhou, Link prediction in complex networks: a survey. Phys. A: Stat. Mech. 390(6), 1150–
1170 (2011)
X. Ma, P. Sun, G. Qin, Nonnegative matrix factorization algorithms for link prediction in temporal
networks using graph communicability. Pattern Recognit. 71, 361–374 (2017)
X. Ma, P. Sun, Y. Wang, Graph regularized nonnegative matrix factorization for temporal link
prediction in dynamic networks. Phys. A: Stat. Mech. 496, 121–136 (2018)
M. McPherson, L. Smith-Lovin, J.M. Cook, Birds of a feather: homophily in social networks. Annu.
Rev. Sociol. 27(1), 415–444 (2001)
M. Medo, G. Cimini, S. Gualdi, Temporal effects in the growth of networks. Phys. Rev. Lett.
107(23), 238,701 (2011)
Y. Meng, P. Wang, J. Xiao, X. Zhou, Nelstm: a new model for temporal link prediction in social
networks, in 2019 IEEE 13th International Conference on Semantic Computing (ICSC) (IEEE,
2019), pp. 183–186
A.K. Menon, C. Elkan, Link prediction via matrix factorization, in Machine Learning and Knowl-
edge Discovery in Databases: European Conference, ECML PKDD 2011, Athens, Greece,
September 5-9, 2011, Proceedings, Part II 22 (Springer, 2011), pp. 437–452
N. Meshcheryakova, Similarity analysis in multilayer temporal food trade network, in Complex Net-
works XI: Proceedings of the 11th Conference on Complex Networks CompleNet 2020 (Springer,
2020), pp. 322–333

I. Morer, A. Cardillo, A. Díaz-Guilera, L. Prignano, S. Lozano, Comparing spatial networks: a one-size-fits-all efficiency-driven approach. Phys. Rev. E 101, 042301 (2020)
M.E. Newman, Clustering and preferential attachment in growing networks. Phys. Rev. E 64(2),
025,102 (2001)
V. Ouzienko, Y. Guo, Z. Obradovic, Prediction of attributes and links in temporal social networks,
in ECAI 2010 (IOS Press, 2010), pp. 1121–1122
A. Paranjape, A.R. Benson, J. Leskovec, Motifs in temporal networks, in Proceedings of the Tenth
ACM International Conference on Web Search and Data Mining (ACM, 2017), pp. 601–610
M.H. Riad, M. Sekamatte, F. Ocom, I. Makumbi, C.M. Scoglio, Risk assessment of ebola virus
disease spreading in uganda using a two-layer temporal network. Sci. Rep. 9(1), 16,060 (2019)
D. Taylor, S.A. Myers, A. Clauset, M.A. Porter, P.J. Mucha, Eigenvector-based centrality measures
for temporal networks. Multiscale Model. Simul. 15(1), 537–574 (2017)
S.S. Wiltermuth, C. Heath, Synchrony and cooperation. Psychol. Sci. 20(1), 1–5 (2009)
T. Wu, C.S. Chang, W. Liao, Tracking network evolution and their applications in structural network
analysis. IEEE Trans. Netw. Sci. Eng. 6(3), 562–575 (2018)
Y. Xiang, Y. Xiong, Y. Zhu, Ti-gcn: a dynamic network embedding method with time interval
information, in 2020 IEEE International Conference on Big Data (Big Data) (IEEE, 2020), pp.
838–847
L.M. Yang, W. Zhang, Y.F. Chen, Time-series prediction based on global fuzzy measure in social
networks. Front. Inf. Technol. Electron. Eng. 16(10), 805–816 (2015)
M. Yang, J. Liu, L. Chen, Z. Zhao, X. Chen, Y. Shen, An advanced deep generative framework for
temporal link prediction in dynamic networks. IEEE Trans. Cybern. 50(12), 4946–4957 (2019)
Q. Yao, B. Chen, T.S. Evans, K. Christensen, Higher-order temporal network effects through triplet
evolution. Sci. Rep. 11(1), 15,419 (2021)
T. Zhang, K. Zhang, X. Li, L. Lv, Q. Sun, Semi-supervised link prediction based on non-negative
matrix factorization for temporal networks. Chaos, Solitons Fractals 145, 110,769 (2021)
T. Zhang, K. Zhang, L. Lv, D. Bardou, Graph regularized non-negative matrix factorization for
temporal link prediction based on communicability. J. Phys. Soc. Japan 88(7), 074,002 (2019)
T. Zhou, J. Ren, M. Medo, Y.C. Zhang, Bipartite network projection and personal recommendation.
Phys. Rev. E 76(4), 046,115 (2007)
T. Zhou, L. Lü, Y.C. Zhang, Predicting missing links via local information. Eur. Phys. J. B 71,
623–630 (2009)
Chapter 20
A Systematic Derivation and Illustration
of Temporal Pair-Based Models

Rory Humphries, Kieran Mulchrone, and Philipp Hövel

Abstract A modelling framework is presented for the spread of epidemics on temporal networks from which both the Individual-Based (IB) and Pair-Based (PB)
models can be recovered. A temporal PB model is systematically derived using this
framework. We show that the model offers an improvement over existing PB models
and moves away from edge-centric descriptions, such as the contact-based model,
while the description is concise and relatively simple. For the contagion process, we
consider a Susceptible-Infected-Recovered model, which is realized on a network
with time-varying edges. We demonstrate that the shift in perspective from IB to PB
quantities enables exact modelling of Markovian epidemic processes on temporal
networks, which contain no more than one non-backtracking path between any two
vertices. On arbitrary networks, the proposed temporal PB model provides a substan-
tial increase in accuracy at a low computational and conceptual cost when compared
to the IB model. We analytically find the condition necessary for an epidemic to occur
on a temporal PB model, otherwise known as the epidemic threshold. We identify
an epidemic by looking at the stability of the basic disease-free equilibrium, that is,
the state of the system before any infected individuals are present. We explain the
accuracy of the PB model by considering the number of non-backtracking cycles in
the network structure.

Keywords Temporal networks · Analytical network theory

R. Humphries · K. Mulchrone
School of Mathematical Sciences, University College Cork, Western Road, T12 XF62 Cork,
Ireland
e-mail: rory@humphries.ie
K. Mulchrone
e-mail: k.mulchrone@ucc.ie
P. Hövel (B)
Theoretical Physics and Center for Biophysics, Universität des Saarlandes, Campus E2 6, 66123
Saarbrücken, Germany
e-mail: philipp.hoevel@uni-saarland.de


20.1 Overview

In recent years epidemiological modelling, along with many other fields, has seen
renewed activity thanks to the emergence of network science (Newman 2018;
Barabási and Pósfai 2016; Zhan et al. 2020; Masuda and Holme 2017). Approach-
ing these models from the view of complex coupled systems has shed new light on
spreading processes where early black-box Ordinary Differential Equation (ODE)
models, such as those developed by Kermack and McKendrick, are of limited applica-
bility (Keeling and Eames 2005; Humphries et al. 2021). These ODE models assume
homogeneous mixing of the entire population, which may be an appropriate approx-
imation for small communities. However, when attempting to model the spread of
disease at a national or international level, they fail to capture how heterogeneities in
both travel patterns and population distributions contribute to and affect the spread of
disease. Epidemiological models on complex networks aim to solve this problem by
moving away from averaged dynamics of populations and mean-field descriptions.
Instead, the focus is on interactions between individuals or meta-populations, where
the spreading process is driven by contacts in the network (Colizza et al. 2007; Wang
et al. 2003; Sharkey et al. 2015).
Recently, many improvements have been made to network models, e.g., gener-
alised multi-layer network structures or more specifically temporal networks that
allow for the network structure to change with time (Lentz et al. 2016; Masuda and
Holme 2017; Iannelli et al. 2017; Koher et al. 2016). Temporal networks are a natural
way of representing contacts and lead to an insightful interplay between the disease
dynamics and evolving network topology (Karrer and Newman 2010; Shrestha et al.
2015; Lentz et al. 2013). With the ever growing availability of mobility and con-
tact data, it has become easier to obtain accurate and high-resolution data to inform
network models. Results from network models can be extremely useful tools for
public-health bodies and other stakeholders (Génois and Barrat 2018; Gethmann
et al. 2019; Tratalos et al. 2020).
A widely used epidemiological concept in previous research is the Individual-
Based (IB) model (Newman 2018; Valdano et al. 2015; Sharkey 2011). It assumes
statistical independence in the state of each vertex. A major difficulty associated
with this approach is the occurrence of an echo chamber effect because there is no
memory of past interactions due to statistical independence.
To demonstrate the echo chamber phenomenon, we consider the graph shown in
Fig. 20.1, made up of 2 vertices which are connected for all time, i.e., a temporal
realisation of a static graph. Both vertices have an initial probability of being infected
of 0.2, i.e. PI1 (0) = PI2 (0) = 0.2, where we let PSi (t), PIi (t) and PRi (t) denote the
marginal probabilities of vertex i being in state S, I and R respectively at time
t. The system is evolved using the IB model (Valdano et al. 2015). To determine
the accuracy of the IB model, we also consider the average of many Monte-Carlo
realisations (MC model) of the stochastic SIR infection process as comparison to the
ground truth.

Fig. 20.1 A 2-vertex graph with both vertices connected by an edge for all time. The quantities
P_{I_1}(0) = 0.2 and P_{I_2}(0) = 0.2 refer to the initial conditions of the SIR model shown in Fig. 20.2


Fig. 20.2 The average number of recovered individuals in the SIR model according to the IB model
(green curve) and the MC model with 10^4 realisations (dashed blue curve). The parameters used
were PI1 (0) = PI2 (0) = 0.2, β = 0.4 and μ = 0.2

In Fig. 20.2, we show the results of running the IB model (green curve) as well
as the MC model with 10^4 realisations (dashed blue curve) for 40 time steps. The
parameters used were β = 0.4 and μ = 0.2 for the infection and recovery rates,
respectively. The y-axis shows the average proportion of recovered individuals, P_R^{avg}(t) = (1/2)[P_R^1(t) + P_R^2(t)]. From this very simple example, it is clear that
the IB model fails to describe the true process by vastly overestimating the probability
of contracting the disease at some point during the evolution of the model.
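The effect is easy to reproduce. The sketch below (Python with numpy; not from the original chapter, and the way the random initial condition is drawn in the Monte-Carlo runs is our own assumption) iterates the IB update that will be derived later in Eq. (20.14) and averages stochastic SIR realisations on the 2-vertex graph of Fig. 20.1 with the parameters of Fig. 20.2.

import numpy as np

beta, mu, steps, runs = 0.4, 0.2, 40, 10000
A = np.array([[0, 1], [1, 0]])                 # two vertices, permanently connected

# Individual-based (IB) update, cf. Eq. (20.14).
PS, PI, PR = np.full(2, 0.8), np.full(2, 0.2), np.zeros(2)
for _ in range(steps):
    escape = np.prod(1 - beta * A * PI, axis=1)         # prob. of not being infected
    PS, PI, PR = PS * escape, PI * (1 - mu) + PS * (1 - escape), PR + mu * PI
print("IB average recovered:", PR.mean())

# Monte-Carlo estimate of the true stochastic SIR process.
rng = np.random.default_rng(0)
recovered = 0.0
for _ in range(runs):
    state = np.where(rng.random(2) < 0.2, 1, 0)          # 0 = S, 1 = I, 2 = R
    for _ in range(steps):
        infectious = state == 1
        pressure = A @ infectious                        # number of infectious neighbours
        new_inf = (state == 0) & (rng.random(2) < 1 - (1 - beta) ** pressure)
        new_rec = infectious & (rng.random(2) < mu)
        state = np.where(new_inf, 1, state)
        state = np.where(new_rec, 2, state)
    recovered += np.mean(state == 2)
print("MC average recovered:", recovered / runs)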
Efforts to eliminate the echo chamber effect have focused on introducing mem-
ory at the level of each vertex’s direct neighbours. These models are referred to
as Contact-Based (CB) (Koher et al. 2019) or Pair-Based (PB) models (Frasca and
Sharkey 2016) and have been shown to significantly reduce the echo chamber effect,
depending on the underlying network structure. The two models differ in their ini-
tial approach. The CB model takes an edge-based perspective, which extends the
message-passing approach (Karrer and Newman 2010; Kirkwood 1935), and all
dynamic equations are formulated in terms of edges. By contrast, the PB model

keeps the vertex-based approach of the IB model and dynamic equations are in terms
of vertices.
This chapter is largely based on our work in Humphries et al. (2021), in which
we extend the PB model to a temporal setting giving a Temporal Pair-Based (TPB)
Model. We show how it can be drastically reduced and simplified under a certain
dynamical assumption (Sharkey and Wilkinson 2015). We deal specifically with SIR
models. Once the TPB model is written in concise form, it is then possible to show
that the contact-based model is equivalent to a linearised version of the TPB model.
We then establish the conditions for an epidemic to occur according to the TPB
model, also known as the epidemic threshold. We investigate how the TPB model
performs on a number of synthetic and empirical networks and investigate what kind
of network topologies work best with the TPB model.

20.2 Reduced Master Equations

Let us start by considering a discrete time temporal network, G = (G_1, …, G_{n_t}), to be a series of n_t networks where G_t = (V, E_t) denotes the network at time step t. The networks all share the same vertex set, V, but differ in their temporal edge sets, E_t. Let n_v = |V| be the number of vertices and n_{e,t} = |E_t| be the number of edges at time t. The adjacency matrix for the network at time t is denoted by A^{[t]}, and A^{[t]}_{ij} = 1 implies a directed edge from vertex i to vertex j at time t. If the network is undirected then A^{[t]}_{ij} = A^{[t]}_{ji}.
Let C be the set of compartments in an epidemiological compartment model, i.e. for the SIR model: C = {S, I, R}. Let x = (x_1, x_2, …, x_{n_v})^T ∈ C^{n_v} be the vector whose i-th element refers to the state of the i-th vertex, i.e., x_i = S means vertex i belongs to compartment S, and similarly for other compartments. The evolution of the disease is then described by the (discrete in time and space) master equation (Gardiner and Gardiner 2009):

ΔP(x, t) = Σ_{y ∈ C^{n_v}} [ W(x|y, t) P(y, t) − W(y|x, t) P(x, t) ],   (20.1)

and thus, we assume that the infection process is Markovian. P(x, t) is the probability
that the network is in the particular configuration of states given by x at time t.
W (x|y, t) is the transition rate of the network moving from the configuration of states
y to x at time t. This equation describes the entire process on the network. However,
in order to progress the system forward one step in time, the probabilities of all
combinations of state vectors must be found. This is not usually feasible for network
processes with potentially billions of vertices. For example, the total number of state combinations for the SIR process is 3^{n_v}.
An alternate approach is to describe the evolution of disease using a system of
Reduced Master Equations (RMEs) (Sharkey 2011), that describes the evolution of

subsystems within the network, such as individual vertices, removing the need to
obtain every possible combination of states. It is important to note that the RMEs are
not true master equations because they are not necessarily linear. This is because the
transition rates of the subsystems are nonlinear combinations of the transitions rates of
the original system. However, we shall continue to use the term RME consistent with
Sharkey (2011). We use the following notation for the joint marginal probabilities:

P_{x_{i_1}, x_{i_2}, …, x_{i_k}}(t),   (20.2)

which gives the probability of realising the states x_{i_1}, x_{i_2}, …, x_{i_k} for the vertices i_1, i_2, …, i_k, respectively, at time t. The conditional marginal probabilities are:

P_{x_{i_1}, x_{i_2}, …, x_{i_k} | x_{j_1}, x_{j_2}, …, x_{j_l}}(t),   (20.3)

representing the probability of the vertices i_1, i_2, …, i_k being in the states x_{i_1}, x_{i_2}, …, x_{i_k}, respectively, at time t, given that the vertices j_1, j_2, …, j_l are in the states x_{j_1}, x_{j_2}, …, x_{j_l}, respectively, at time t also. The marginal transition rates are denoted:

W_{x_{i_1}, x_{i_2}, …, x_{i_k} | y_{i_1}, y_{i_2}, …, y_{i_k}}(t),   (20.4)

which is the transition rate of the vertices i_1, i_2, …, i_k, respectively, moving to the states x_{i_1}, x_{i_2}, …, x_{i_k} at time t, given they are in the states y_{i_1}, y_{i_2}, …, y_{i_k}.
When we wish to specify a particular realisation of xi , we denote it by Si , Ii or
Ri to imply xi = S, xi = I or xi = R respectively. Employing this new notation we
start with the RME, which describes the evolution of individual vertices:

ΔP_{x_i}(t) = Σ_{y_i ∈ C} [ W_{x_i|y_i}(t) P_{y_i}(t) − W_{y_i|x_i}(t) P_{x_i}(t) ],   (20.5)

The next section deals with how to approximate the quantities given in Eq. (20.5).

20.3 Network Model

In the case of SIR dynamics, the evolution of each vertex in each compartment is
given by the following:

P_{S_i}(t_{n+1}) = P_{S_i}(t_n) + ΔP_{S_i}(t_n),   (20.6a)
P_{I_i}(t_{n+1}) = P_{I_i}(t_n) + ΔP_{I_i}(t_n),   (20.6b)
P_{R_i}(t_{n+1}) = P_{R_i}(t_n) + ΔP_{R_i}(t_n),   (20.6c)

where ΔP_{S_i}(t_n) is defined by the RME in the previous section as Eq. (20.5). Filling
in the transition rates we find:

P_{S_i}(t_{n+1}) = P_{S_i}(t_n) − W_{I_i|S_i}(t_n) P_{S_i}(t_n),   (20.7a)
P_{I_i}(t_{n+1}) = P_{I_i}(t_n) − W_{R_i|I_i}(t_n) P_{I_i}(t_n) + W_{I_i|S_i}(t_n) P_{S_i}(t_n),   (20.7b)
P_{R_i}(t_{n+1}) = P_{R_i}(t_n) + W_{R_i|I_i}(t_n) P_{I_i}(t_n).   (20.7c)

Note that transition rates such as W_{R_i|S_i} are not present because, under the SIR scheme, it is not possible to move directly from S_i to R_i, and thus W_{R_i|S_i} = 0. P_{R_i} can be recovered using the conservation of the probabilities P_{S_i} + P_{I_i} + P_{R_i} = 1. In order to compute the transition rates we define the following quantities: the probability of infection on contact, β, and the rate of recovery, μ. The quantity A^{[t_n]} is the temporal adjacency matrix of the network on which the process is occurring. Following Frasca and Sharkey (2016), the transition rates of moving from S to I, and from I to R, are given by:
W_{I_i|S_i}(t_n) = β Σ_{j_1 ∈ V} A^{[t_n]}_{i j_1} P_{I_{j_1}|S_i}(t_n) − β² Σ_{j_1 < j_2 ∈ V} A^{[t_n]}_{i j_1} A^{[t_n]}_{i j_2} P_{I_{j_1}, I_{j_2}|S_i}(t_n)
              + ⋯ − (−β)^{n_v − 1} Σ_{j_1 < ⋯ < j_{n_v − 1} ∈ V} A^{[t_n]}_{i j_1} ⋯ A^{[t_n]}_{i j_{n_v − 1}} P_{I_{j_1}, …, I_{j_{n_v − 1}}|S_i}(t_n),   (20.8a)

W_{R_i|I_i}(t_n) = μ.   (20.8b)

Equation (20.8a) can be thought of as analogous to the binomial distribution, where the probability of at least 1 success out of n_v − 1 trials is given as:

P(k > 0) = 1 − \binom{n_v − 1}{0} β^0 (1 − β)^{n_v − 1}
         = \binom{n_v − 1}{1} β − \binom{n_v − 1}{2} β² + \binom{n_v − 1}{3} β³ − ⋯ − (−β)^{n_v − 1},   (20.9)

where β is the probability of success and k is the number of successes. The difference
between this and Eq. (20.8a) however, is that the probability of success is different
for each “infection attempt” because it depends on the current state of the system, as
well as the state of the adjacency matrix. Equation (20.8) describes the probabilistic
SIR process on a temporal network. Note that the system of equations is not closed
because a description for joint conditional probabilities PI j1 ,Si and all higher order
quantities are missing. There are a number of ways in which this problem can be
tackled, usually by making a number of numerical or dynamical approximations
(Koher et al. 2019; Valdano et al. 2015; Wang et al. 2003; Gómez et al. 2010).
In the next sections we attempt to improve on and unify many existing approaches
showing how they are derived from the system of RMEs given by Eqs. (20.7) and
(20.8).
20 A Systematic Derivation and Illustration … 409

20.3.1 Temporal Individual-Based Model

One of the most commonly used epidemiological models on networks is the IB model,
which has been extended to the temporal setting in Valdano et al. (2015). We refer
to this extension as the Temporal Individual-Based (TIB) model. The key idea is the
assumption of statistical independence of vertices or the mean field approximation,
i.e., the factorisation:

Pxi1 ,xi2 ,...,xik (tn ) = Pxi1 (tn )Pxi2 (tn ) · · · Pxik (tn ). (20.10)

By assuming independence of vertices, we solve the problem of Eq. (20.8a) not being
closed and it simplifies to:
 
W Ii |Si (tn ) = β Ai[tjn1] PI j1 (tn ) − β 2 Ai[tjn1] Ai[tjn2] PI j1 (tn )PI j2 (tn )
j1 ∈V j1 , j2 ∈V
 (20.11)
+ · · · − (−β)n v −1 Ai[tjn1] · · · Ai[tjnn]v −1 PI j1 (tn ) · · · PI jnv −1 (tn ).
j1 <···< jn v −1 ∈V

Under the assumption of independence, the conditional probability is simplified using


the definition of conditional probability, i.e.:

PI j1 ,Si PI j1 PSi
PI j1 |Si = = = PI j1 . (20.12)
PSi PSi

Upon factorising Eq. (20.11), it may be written more concisely as:

[tn ]
W Ii |Si (tn ) = 1 − 1 − βAik PIk (tn ) . (20.13)
k∈V

Substituting the transition rates W Ii |Si and W Ri |Ii under the assumption of statistical
independence, the full TIB model is then written as:

[tn ]
PSi (tn+1 ) = PSi (tn ) 1 − βAik PIk (tn ) , (20.14a)
k∈V

[tn ]
PIi (tn+1 ) = PIi (tn )(1 − μ) + PSi (tn ) 1 − 1 − βAik PIk (tn ) , (20.14b)
k∈V

which is the same as the IB model given in Valdano et al. (2015). The quantity
that is multiplied by PSi (tn ) is the probability of a vertex not becoming infected
under the IB model. This model closes Eq. (20.8a) at the level of vertices, thus
ignoring all correlations with other vertices at previous times. However, ignoring
all past correlations causes the model to suffer quite badly from an echo chamber
effect (Shrestha et al. 2015) (see Sect. 20.1). That is, vertices artificially amplify
410 R. Humphries et al.

each others probability of being infected, PIi , at each new time step, because the
marginal probability of each vertex is highly correlated with the rest of the network
and the factorisation of Eq. (20.8a) means each vertex forgets its past interactions. As
demonstrated in Shrestha et al. (2015), in the absence of a recovered compartment, a
static network of two linked vertices for non-zero initial conditions has probabilities
of being infected which converge according to limn→∞ PI0 (tn ) = limn→∞ PI1 (tn ) =
1 for the TIB model.
Further simplifications of the TIB model are often seen Newman (2018); Sharkey
(2011) by taking numerical approximations of the product in Eqs. (20.14). Given a
sequence of numbers (ri )i=1
N
such that |ri | < 1 for all i, one obtains:

N 
N 
N 
N 
N 
N 
N
(1 − ri ) = 1 − ri + ri r j − ri r j rk + . . .
i=1 i=1 i=1 j=1 i=1 j=1 k=1
(20.15)

N
≈1− ri .
i=1

Using this numerical approximation, Eqs. (20.14) become:

 [tn ]
PSi (tn+1 ) = PSi (tn ) 1 − βAik PIk (tn ) (20.16)
k∈V
 [tn ]
PIi (tn+1 ) = PIi (tn )(1 − μ) + PSi (tn ) βAik PIk (tn ). (20.17)
k∈V

The TIB model derived above is one of the simplest network models for the
spread of disease available. The form of the model relates to the level of statistical
independence assumed for the vertices in order to close Eq. (20.8a). Hence, in the
next section we see how an alternative assumption about the independence of vertices
leads to a more accurate model.

20.3.2 Temporal Pair-Based Model

Instead of assuming independence of vertices, it is possible to approximate the


marginal probabilities in terms of combinations of lower order marginals using some
form of moment closure (Frasca and Sharkey 2016; Sharkey and Wilkinson 2015).
Here, we make an equivalent assumption to that of message passing approaches (Kar-
rer and Newman 2010; Shrestha et al. 2015). We assume that the graph contains no
more than a single time-respecting Non-Backtracking (NBT) path from one vertex to
another. The path may be taken multiple ways through time, but the vertex sequence
must be the same. If the graph is undirected, then this implies the graph contains
no time-respecting NBT cycles. In other words, starting at some initial vertex i that
20 A Systematic Derivation and Illustration … 411

leaves via vertex j, there is no way to find a time-respecting path returning to this
vertex that does not return via j. This means that an undirected graph must be a tree-
graph in the static case, or in the case of a temporal graph, tree-like when viewed in
its static embedding of the supra-adjacency representation (Bianconi 2018).
These assumptions allow us to write all higher order moments in Eq. (20.8a) as a
combination of pairs PSi ,Ik . To show why this is possible, consider the three vertices
i, j, k connected by two edges through i. If conditional independence of these vertices
is assumed given we have the state of i, then one can make the following assumption:

Pxi ,x j Pxi ,xk Pxi ,x j Pxi ,xk


Pxi ,x j ,xk = Px j ,xk |xi Pxi = Pxi = . (20.18)
Pxi Pxi Pxi

This has the effect of assuming that there exists only a single time-respecting NBT
path from one vertex to another, as it implies that the flow of probability from j to
k must occur through vertex i and does not occur through any other intermediary
vertex. In the Sect. 20.5, we will investigate how this assumption holds up on graphs
whose topology does not satisfy the tree-graph condition. The result obtained in
Eq. (20.18) is often referred to as the Kirkwood closure (Kirkwood 1935). Under
the above assumption of conditional independence, the following simplification is
obtained for Eq. (20.8a):
 
[tn ] PSi ,Ik (tn )
W Ii |Si (tn+1 ) = 1 − 1 − βAik . (20.19)
k∈V
PSi (tn )

However, we run into the problem that we have no description for pairs of vertices.
Thus, we derive expressions for their evolution from the RMEs for pairs of vertices
which is given by:
  
Pxi ,x j (t) = Wxi ,x j |yi ,y j (t)Pyi ,y j (t) − W yi ,y j |xi ,x j (t)Pxi ,x j (t) , (20.20)
yi ,y j ∈

For PSi ,I j , we obtain:

PSi ,I j (tn+1 ) = PSi ,I j (tn ) + PSi ,I j (tn )


= PSi ,I j (tn ) + W Si ,I j |Si ,S j (tn )PSi ,S j (tn ) − W Ii ,I j |Si ,I j (tn )PSi ,I j (tn )
− W Si ,R j |Si ,I j (tn )PSi ,I j (tn ) − W Ii ,R j |Si ,I j (tn )PSi ,I j (tn ).
(20.21)
Note that the above equation requires a description for PSi ,S j also, which we find to
be the following:

PSi ,S j (tn+1 ) = PSi ,S j (tn ) + PSi ,S j (tn )


= PSi ,S j (tn ) − W Si ,I j |Si ,S j (tn )PSi ,S j (tn ) (20.22)
− W Ii ,S j |Si ,S j (tn )PSi ,S j (tn ) − W Ii ,I j |Si ,S j (tn )PSi ,S j (tn ).
412 R. Humphries et al.

Since only the probabilities PSi ,I j and PSi ,S j are needed in order to describe the
RMEs in Eq. (20.19), we only consider those two combinations of states. From
Frasca and Sharkey (2016), we obtain the exact transition rates for pairs of vertices
and find that we can factorise the pair-wise transition rates similar to Eq. (20.8a).
For example, the exact transition rate W Si ,I j |Si ,S j , which features in the second term
of Eq. (20.22), is given by

W Si ,I j |Si ,S j (tn ) =

 [t ]  [tn ] [tn ]
1−β Aikn1 PIk1 |Si ,S j (tn ) + β 2 Aik 1
Aik2 PIk1 ,Ik2 |Si ,S j (tn )
k1 ∈V k1 <k2 ∈V


− . . . +(−β)n v −2 [tn ]
Aik 1
[tn ]
. . . Aik n v −2
PIk1 ,...,Iknv −2 |Si ,S j (tn )⎦
k1 <···<kn v −2 ∈V

 
× β A[tjkn1] PIk1 |Si ,S j (tn ) − β 2 A[tjkn1] A[tjkn2] PIk1 ,Ik2 |Si ,S j (tn )
k1 ∈V k1 <k2 ∈V


+ . . . −(−β)n v −2 A[tjkn1] . . . A[tjknn]v −2 PIk1 ,...,Iknv −2 |Si ,S j (tn )⎦ . (20.23)
k1 <···<kn v −2 ∈V

In this equation, the term in the first pair of square brackets corresponds to the
probability that vertex i does not become infected and the term in the second pair of
square brackets corresponds to the probability that vertex j becomes infected. Upon
applying the moment closure technique, Eq. (20.23) may be written as,
⎡ ⎤
   
PS ,I (tn ) ⎢ PS j ,Ik (tn ) ⎥
W Si ,I j |Si S j (tn ) = 1 − βA[tkin ] i k ⎣1 − 1 − βA[tk jn ] ⎦.
k∈V
PSi (tn ) k∈V
PS j (tn )
k= j k=i
(20.24)
Similarly, the other two, pair-wise transitions rates W Ii ,S j |Si S j and W Ii ,I j |Si S j in
Eq. (20.23) are
⎡ ⎤
   
⎢ [tn ] PSi ,Ik (tn ) ⎥ [tn ] PS j ,Ik (tn )
W Ii ,S j |Si S j (tn ) = ⎣1 − 1 − βAki ⎦ 1 − βAk j
k∈V
PSi (tn ) k∈V
PS j (tn )
k= j k=i
(20.25)
and
20 A Systematic Derivation and Illustration … 413
⎡ ⎤
 
⎢ PS j ,Ik (tn ) ⎥
W Ii ,I j |Si S j (tn ) = ⎣1 − 1 − βA[tkin ] ⎦
k∈V
PS j (tn )
k= j
⎡ ⎤
 
⎢ PS j ,Ik (tn ) ⎥
× ⎣1 − 1 − βA[tk jn ] ⎦ . (20.26)
k∈V
PS j (tn )
k=i

For completeness, the same moment closure approach leads to the following the
transition rates W Ii ,I j |Si I j , W Si ,R j |Si I j , and W Ii ,R j |Si I j for the pair-based model needed
for PSi ,I j in Eq. (20.21) obtained
⎡ ⎤
 
⎢ PS ,I (tn ) ⎥
W Ii ,I j |Si I j (tn ) = (1 − μ) ⎣1 − (1 − βA[tjin ] ) 1 − βA[tkin ] i k ⎦,
k∈V
PSi (tn )
k= j
⎡ ⎤ (20.27)
 
⎢ PS ,I (tn ) ⎥
W Si ,R j |Si I j (tn ) = μ ⎣(1 − βA[tjin ] ) 1 − βA[tkin ] i k ⎦, (20.28)
k∈V
PSi (tn )
k= j

and
⎡ ⎤
 
⎢ PS ,I (tn ) ⎥
W Ii ,R j |Si I j (tn ) = μ ⎣1 − (1 − βA[tjin ] ) 1 − βA[tkin ] i k ⎦. (20.29)
k∈V
PSi (tn )
k= j

By introducing the following functions, the RMEs for pairs, as well as the indi-
vidual vertices can be written more concisely. The probability that vertex i does not
become infected at time step tn+1 , given that i is not infected at time step tn is denoted
by:  
[tn ] PSi ,Ik (tn )
i (tn ) = 1 − βAik . (20.30)
k∈V
PSi (tn )

Similarly, the probability that vertex i does not become infected at time step tn+1 ,
given that i is not infected at time step tn while excluding any interaction with j, is
given by:  
[tn ] PSi ,Ik (tn )
i j (tn ) = 1 − βAik . (20.31)
k∈V
PSi (tn )
k= j

Then, the evolution of the state of every vertex in the network is determined by the
following closed set of equations:
414 R. Humphries et al.

PSi (tn+1 ) = i (tn )PSi (tn ), (20.32a)


PIi (tn+1 ) = (1 − μ)PIi (tn ) + (1 − i (tn )) PSi (tn ), (20.32b)
PSi ,I j (tn+1 ) = (1 − μ) 1 − βAi[tjn ]
i j (tn )PSi ,I j (tn )
 
+ i j (tn ) 1 −  ji (tn ) PSi ,S j (tn ), (20.32c)
PSi ,S j (tn+1 ) = i j (tn ) ji (tn )PSi ,S j (tn ). (20.32d)

This approximation allows a large increase in accuracy compared to TIB model


while only adding two equations to the final model. All past dynamic correlations
are now tracked by the model and so the echo chamber effect is eliminated, but
only with direct neighbours, that is, vertices that share an edge. A major benefit
of this particular Temporal Pair-Based (TPB) model over other existing iterations
(Koher et al. 2019; Gómez et al. 2010) is that this model can be implemented as
an element-wise sparse matrix multiplication rather than having to iterate through
all edges for every time step making it extremely efficient computationally even on
large networks. It also benefits from a low conceptual cost by not deviating from a
vertex-based perspective. By contrast, contact-based models move to the perspective
of edges and thus define the model in terms of the line-graphs and non-backtracking
matrices (Koher et al. 2019).
Similar to Shrestha et al. (2015), we can compare TPB models to the TIB model
using the two vertex example. In that illustrative configuration, we consider two
vertices connected by an undirected static edge and give the two vertices some initial
non-zero probability PI1 (0) = PI2 (0) = z of being infected. We then run the TIB
and TPB models for some given parameters β and μ and compare it to the ground
truth, given by the MC model.
In Fig. 20.3, we show the two vertex example from Fig. 20.1 once again, however,
this time we also plot the TPB model. It is apparent that the TIB model fails to
capture the true SIR process on the network due to the echo chamber effect induced
by assuming statistical independence of vertices. It is clear, however, that the TPB
model accurately describes the underlying SIR process for this simple example as
each vertex is able to recover the dynamic correlations of past interactions with direct
neighbours.

20.3.2.1 Equivalence Between The Contact-Based and Pair-Based


Models

The contact-based model as defined in Koher et al. (2019), is an extension of the


message-passing approach applied to the spreading of epidemics. It moves from con-
sidering individual vertices to edges in the network, thus tracking pairs of vertices
similar to our pair-based model. The central component is θi j (tn ), which is the prob-
ability that node j has not passed infection to node i up to time step n. From θi j (tn ),
the quantity PSi (tn ) may be computed as:
20 A Systematic Derivation and Illustration … 415

0.8 TPB
TIB
avg

MC
avg. recovered, PR

0.6

0.4

0.2

0.0

0 10 20 30 40
ti me (days)

Fig. 20.3 Running 40 time steps of the TIB (green curve) and TPB (red curve) SIR model as well
as the average over 105 MC simulations (blue dashed curve) for the two-vertex example given in
Fig. 20.1. Parameters: β = 0.4, μ = 0.2, and initial conditions PI1 (0) = PI2 (0) = 0.2

PSi (tn+1 ) = PSi (t0 ) θi j (tn+1 ). (20.33)


j∈V

This equation is the basis for the contact-based model and allows us to easily com-
pare it with the PB model because it describes the same quantity as our Eq. (20.32a).
The authors also assume that the evolution of θi j (tn ) satisfies the following relation:

PSi ,I j (tn )
θi j (tn+1 ) = θi j (tn ) − βAi[tjn ] (20.34)
PSi (tn )
θi j (t0 ) = 1.

In the PB model, the evolution of the susceptible probability, given by Eq. (20.32a),
can similarly be rewritten in terms of its initial conditions:

PSi (tn+1 ) = i (tn )PSi (tn ) (20.35a)


n
= PSi (t0 ) i (tm ) (20.35b)
m=0
n 
PSi ,I j (tm )
= PSi (t0 ) 1 − βAi[tjm ] . (20.35c)
j∈V m=0
PSi (tm )
416 R. Humphries et al.

From equating (20.33) and (20.35) it is clear that if the models are exactly equiv-
alent then θi j is defined by

n  
PSi ,I j (tm )
θi j (tn+1 ) = 1 − βAi[tjm ] . (20.36)
m=0
PSi (tm )

However, this contradicts the assumption made by Eq. (20.34). Thus the PB and
contact-based models are only equivalent if the following linearisation is assumed:
n   n
PSi ,I j (tm ) PSi ,I j (tm )
1 − βAi[tjm ] ≈1− βAi[tjm ] , (20.37)
m=0
P (t
Si m ) m=0
PSi (tm )

which then implies Eq. (20.36) can be written as


n−1
PSi ,I j (tm ) PSi ,I j (tn )
θi j (tn+1 ) = 1 − βAi[tjm ] − βAi[tjn ] (20.38a)
m=0
PSi (tm ) PSi (tn )
PSi ,I j (tn )
= θi j (tn ) − βAi[tjn ] . (20.38b)
PSi (tn )

This shows that the contact-based model is a linearised version of the PB model.
We have shown that using our framework we are able to derive a number of
existing models such as the TIB and contact-based models, and we derive a new
concise form of the PB model in a temporal setting which does not rely on any
numerical linearisation. In the next section we compute the conditions necessary for
an epidemic to occur in our TPB model.

20.4 Epidemic Threshold

One of the most important metrics used in epidemiological modelling is the epidemic
threshold. It allows us to determine the critical values of the model parameters at
which a transition in qualitative behaviour occurs and an epidemic occurs. In order
to determine these parameters, we first need the fixed points of the system, as their
stability can aid us in the definition of an epidemic. The fixed points are given as:

PSi (tn ) = Si∗ , PIi (tn ) = 0, PRi (tn ) = Ri∗ ∀i, (20.39)

where Si∗ + Ri∗ = 1. From this definition it is clear that there exists a set of disease-
free equilibria (DFEs) to be considered. At the critical point, defined by some func-
tion of the model parameters, the DFE becomes unstable and on average an epidemic
occurs. Defining an epidemic in the SIR model is a bit more difficult compared to the
SIS model due to the fact that the flow of probability is in only one direction between
20 A Systematic Derivation and Illustration … 417

compartments S → I → R. Therefore, the class of DFE solutions are always asymp-


totically stable. Thus, we will look at classifying the initial stability of the SIR model
as we perturb it from the state:

PSi (tn ) = Si∗ , PIi (tn ) = 0, PRi (tn ) = 0 ∀i, (20.40)

which is defined to be the basic disease-free equilibrium (BDFE). If it is unstable then


the disease has a chance to take hold and will spread through the network causing
an epidemic before dying out. We now look at small perturbations from the BDFE,
if they vanish then the disease will die out and will not have a chance to propagate
through the network. We shall define an epidemic in the SIR model as instability of the
BDFE under such perturbations. First, we look to linearise the difference equation
for PIi (tn+1 ) near the BDFE, this translates to linearising the non-linear function
i (tn ). Under the assumption PIi (tn ) = i for every vertex i such that 0 < i 1,
we find:
PSi ,I j (tn )
0≤ ≤ j (20.41)
PSi (tn )

by the fact that for the joint probability PSi ,I j (tn ) ≤ min{PSi (tn ), PI j (tn )}. Thus for
PSi ,I j (tn )
i 1 we may assume PSi (tn )
≈  j . Upon substituting this into i (tn ) we find that:

[tn ]
i (n) ≈ 1 − βAik k
k∈V
 [tn ]
(20.42)
≈1 − βAik k ,
k∈V

We use this to linearise PI j (tn+1 ) from Eq. (20.32). While i 1 holds, so does
the approximation,
 [tn ]
PIi (tn+1 ) ≈ PIi (tn )(1 − μ) + βAik PIk (tn ). (20.43)
k∈V

This linearisation eliminates PSi (tn ) from the equation. Interestingly, this is exactly
the form of the SIS model in the TIB framework for which the epidemic threshold is
easily found (Valdano et al. 2015). Therefore, we find that the SIS and SIR models
share the same epidemic threshold condition. We introduce the matrix M[tn ] , called
the infection propagator, which is a linear map that describes the evolution of the
SIR model close to the BDFE:

Mi[tjn ] = βAi[tjn ] + δi j (1 − μ). (20.44)

Following Valdano et al. (2015), we find that the condition required for an epi-
demic to occur is given by
418 R. Humphries et al.

n t −1
 
ρ M = ρ M[tnt ] M[tk ] > 1, (20.45)
k=1

where M is the infection propagator written in terms of the supra adjacency matrix
(Bianconi 2018) and ρ is the spectral radius operator, i.e., it gives the largest eigen-
value by magnitude. The matrix M is given by,
⎛ ⎞
0 M[t1 ] 0 ··· ··· 0
⎜ 0 0 M[t2 ] ··· ··· 0 ⎟
⎜ ⎟
⎜ 0 0 0 ··· ··· 0 ⎟
⎜ ⎟
M=⎜

.. .. .. .. .. ⎟.
⎟ (20.46)
⎜ . . . . . ⎟
⎜ .. ⎟
⎝ 0 0 0 .M [tn t −1 ] ⎠
M[tnt ] 0 0 ··· ··· 0

For the values of β and μ for which the above Eq. (20.45) is satisfied, implies that
when a disease is introduced into the network the BDFE is unstable for a period of
time. This means that in the equivalent SIS model with the same parameters, the
proportion of infected vertices never settles on a DFE. We wish to show that the
equivalence in Eq. (20.45) is true. First, we partition λId − M into:
⎛ ⎞
λId −M[t1 ] 0 ··· ··· 0
⎜ 0 λId −M[t2 ] ··· ··· 0 ⎟
⎜ ⎟
⎜ 0 0 λId ··· ··· 0 ⎟
⎜ ⎟
⎜ .. .. .. .. .. ⎟, (20.47)
⎜ . . . . . ⎟
⎜ ⎟
⎜ .. ⎟
⎝ 0 0 0 . −M t ⎠
[tn −1 ]

−M[tnt ] 0 0 · · · · · · λId

then using the following formula for the determinant of 2 × 2 block matrices, which
is derived by applying the determinant to Schur’s complement:
 
AB  
det = det(A) × det D − CA−1 B , (20.48)
CD

where A, B, C and D are matrices of sizes p × p, p × q, q × p and q × q respec-


tively. Thus from the above partitioning:
 
1
det(λId − M ) = det(λId) det λId − M 1 + E1 , (20.49)
λ

where the two new matrices introduced are


20 A Systematic Derivation and Illustration … 419
⎛ ⎞
0 M[tk+1 ] 0 ··· ··· 0
⎜0 0 M[tk+2 ] ··· ··· 0 ⎟
⎜ ⎟
⎜0 0 0 ··· ··· 0 ⎟
⎜ ⎟
Mk = ⎜
⎜.
.. .. .. .. ..⎟
⎟ (20.50)
⎜ . . . .⎟
⎜ .. ⎟
⎝0 0 0 .M t ⎠
[tn −1 ]

0 0 0 ··· ··· 0

and ⎛ ⎞
0 0 ··· ··· 0
⎜ .. .. . . ⎟
⎜ . . . ⎟
Ek = ⎜


.. ⎟ . (20.51)
⎝ 0 0 . 0⎠

M[tnt ] ik M[ti ] 0 ··· ··· 0

Continuing partitioning of the matrices in this way, the determinant reduces to


 
1
det(λId − M ) = det(λId) det λId − M 1 + E1
λ
 
1
= det(λId) det λId − M 2 − 2 E2
2
λ
.. (20.52)
.
 
n t −2 (−1)n t
= det(λId) det λId − M n t −2 + n −1 En t −2
λt
 
n t −1 (−1) nt
= det(λId) det λId + n −1 En t −1 .
λt

Now, by computing the determinant for the identity matrix and substituting in the
expression for En t −1 , we find that the above expression is
 
(−1)n t
det(λId − M ) = λn v (n t −1) det λId + n −1 En t −1
λt
n t −1 (20.53)
= det λn t Id + (−1)n t M[tnt ] M[tk ] ,
k

thus proving that the largest eigenvalue, λmax , of the matrix M is equivalent to σmax
nt
,
[tn t ] n t −1 [tk ]
where σmax is the maximum eigenvalue of M k M .
We now aim to test the accuracy of the TPB model compared with the TIB model
and investigate the graph topologies for which it performs well. We also test our
expression for the epidemic threshold.
420 R. Humphries et al.

20.5 Results

In this section, we compare the accuracy of the TIB model and the TPB model
against the ground truth MC model, that is, direct stochastic simulations. In short,
we show that the TPB model can offer a massive increase in accuracy and also discuss
where it fails to accurately capture the true dynamics of the stochastic SIR process.
Furthermore, we validate the analytical epidemic threshold.

20.5.1 Synthetic Networks

Now we test our results on a number of randomly generated static graphs, including a
tree graph, in order to test our hypothesis that the TPB model is exact on tree graphs.
In total, we consider 4 different randomly generated graphs, all consisting of 100
vertices. They are Erdős-Rényi, Barabási-Albert, Watts-Strogatz and random tree
graphs respectively. Details on the parameters used in the generation of the graphs
follow.
The basic assumption in the TPB model is conditional independence between
vertices with a neighbour in common, given the common neighbours state. This is
equivalent to assuming the graph contains only a single time-respecting NBT path
between any two vertices. In the case of an undirected static graph, this condition is
equivalent to implying the graph is a tree graph.
To illustrate this reasoning, we consider a static tree graph. The graph is generated
by starting with a randomly generated Erdős-Rényi graph where the probability of
two vertices being connected by an edge is p = 0.05. We then randomly choose a
vertex and perform a breadth-first search (see Bondy and Murty 2008; Newman 2018
for more details on graph search algorithms). The resulting breadth-first search tree
produces a random tree graph. All vertices start from some initial non-zero probability
PIi (0) = 0.1 of being infected. We then run the TIB model and the TPB model for
the parameters β = 0.4 and μ = 0.2 and compare it to the ground truth, which is the
average of a number of MC realisations. Figure 20.4b plots the average probability
v
of a vertex belonging to state I according to each model, that is PIavg = n1v nk=1 PIk .
The TPB model is depicted as the solid red curve, the TIB model is shown as the solid
green curve and the average of the MC realisations is shown as the dashed blue curve.
The figure shows how the TIB model fails to capture the true SIR process on the graph
due to the previously discussed echo chamber effect induced by assuming statistical
independence of vertices. The TPB model accurately describes the underlying SIR
process for this simple example because each vertex is able to recover the dynamic
correlations of past interactions with direct neighbours. As we will see from the next
section, temporal graphs that are well approximated by tree graphs are also well
approximated by the TPB model.
We have established the case in which our TPB model is exact and we now
turn our attention to general graphs which permit more than 1 time-respecting path
20 A Systematic Derivation and Illustration … 421

a) b)

0.4
TPB

avg
avg. infected, PI
T IB
0.3
MC

0.2

0.1

0.0

0 20 40
t ime (da ys)

Fig. 20.4 a A random tree network made up of 100 vertices. The tree is generated by taking the
breadth-first search tree starting from a randomly selected vertex of an Erdős-Rényi graph with
connection probability p = 0.05. b Time series of the TIB (solid green) and TPB (solid red) SIR
model as well as 105 Monte-Carlo (MC) simulations (dashed blue) for the tree network shown in
panel a. The parameters used were β = 0.4, μ = 0.02, and PIk (0) = 0.1 for all vertices

between any two vertices and we expect our model deviates from the MC average.
However, the question remains, in what cases does our model provide a substantial
improvement over the IB model? We start with testing our model on an Erdős-Rényi
random graph made up of 100 vertices. We set the probability of connection between
vertices at p = 0.05. In Fig. 20.5b, we show the results of running the TIB, TPB,
and the average of 105 MC realisations of the SIR model on a realisation of the
Erdős-Rényi graph. We run each model for 500 time steps and set the parameters
to β = 0.05 and μ = 0.05. The average probability of being in the infected state
across the entire graph is plotted for each model, with the TPB model given by the
solid red curve, the TIB model given by the solid green curve, and the average of the
MC realisations given as the green dashed curve. Looking at the curves, we see both
models fail to line up with the MC average, however the IB model appears to perform
worse by attaining a larger peak of average probability of infection when compared
to the PB model. The TPB appears to perform worse in the later stages, however
this is explained by the IB depleting its pool of “susceptible probability” early on,
as seen by the higher peak, thus in the later stages there is a higher probability that
vertices have already recovered, and thus, less of a probability of them being currently
infected.
Next we turn to the Barabási-Albert model. We generate a random Barabási-Albert
directed graph consisting of 100 vertices. We start the graph with 2 unconnected
vertices, then at each step of constructing the graph, we add a new vertex with 2 out
edges, the probability of which vertices these edges connect to is outlined in Barabási
(1999). This process is repeated until 100 vertices are reached (see Fig. 20.5a for a
visualisation of the generated graph). In a fashion similar to the previous example,
we run the TPB, TIB models, as well as compute the average of 105 MC realisations
of the SIR model on our random graph. The results of running the models using
422 R. Humphries et al.

(a) (b )

TPB

avg
0.4

avg. infected, PI
T IB
MC

0.2

0.0
0 50 100 150
t ime (d ays)

Fig. 20.5 a A random Erdős-Rényi graph made up of 100 vertices. The probability of connection
was taken as p = 0.05. b Time series of the TIB (solid green) and TPB (solid red) SIR model as
well as 105 Monte-Carlo (MC) simulations (dashed blue) for the Erdős-Rényi graph shown in panel
a. The parameters used were β = 0.05, μ = 0.005, and PIk (0) = 0.01 for all vertices

(a) (b)
0.04
TPB
avg
avg. infected, PI

0.03 T IB
MC
0.02

0.01

0.00

0 50 100 150
t ime (d ays)

Fig. 20.6 a A random Barabási-Albert directed graph made up of 100 vertices. The graph con-
struction starts with unconnected vertices with a vertex with two out edges being added at each step
of the construction. b Time series of the TIB (solid green) and TPB (solid red) SIR model as well
as 105 Monte-Carlo (MC) simulations (dashed blue) for the Barabási-Albert graph shown in panel
a. The parameters used were β = 0.05, μ = 0.005, and PIk (0) = 0.01 for all vertices

β = 0.05 and μ = 0.05 is given in Fig. 20.6b, which shows the average probability
of being in the infected state across the entire graph with the TPB model given by
the solid red curve, the TIB model given by the solid green curve, and the average
of the MC realisations given by the dashed blue curve. In this case, it is clear that
the TPB produces far more accurate results by lying closer to the MC average when
compared to the TIB model which appears to produce quite different dynamics.
The last synthetic graph type we consider is the Watts-Strogatz graph. Like in
the previous examples, we construct a graph with 100 vertices. We initialise a graph
in a ring, where each vertex is connected to the neighbour to its left and right (thus
20 A Systematic Derivation and Illustration … 423

(a) (b )

TPB

avg
avg. infected, PI
0.4 T IB
MC

0.2

0.0

0 50 100 150
t ime (d ays)

Fig. 20.7 a A random Watts-Strogatz graph made up of 100 vertices. The graph starts as a ring
with each vertex connected to the vertex to its left and right. Then each edge has an incident vertex
reassigned with probability p = 0.05. b Time series of the TIB (solid green) and TPB (solid red)
SIR model as well as 105 Monte-Carlo (MC) simulations (dashed blue) for the Watts-Strogatz
graph shown in panel a. The parameters used were β = 0.15, μ = 0.005, and PIk (0) = 0.01 for all
vertices

each vertex has a degree of 2), then with probability p = 0.05, each of the edges has
one of its incident vertices randomly reassigned. The generated graph is depicted in
Fig. 20.7a. We then run the TPB and TIB SIR models on the graph and compare it to
the average of 105 MC realisations. The parameters used in this case were β = 0.15
and μ = 0.05. Figure 20.7b shows the average probability of being in the infected
state across the whole graph for the TPB model as the solid red curve, the TIB as
the solid green curve, and average of the MC realisations as the dashed blue curve.
For this graph, we achieve remarkable agreement between the TPB model and MC
average, likely due to the fact that the graph is very close to a tree structure. The
TIB model on the other hand overestimates the prevalence of the disease by a huge
amount.

20.5.2 Non-backtracking Matrix

We have seen that our TPB model appears to perform better on particular graph
structures with few possible time-respecting NBT paths between vertices. As a result,
we look at the number of possible time-respecting NBT paths in each of the synthetic
networks analysed and compare this to the performance of the TPB models in each
case. In order to quantify the number of time-respecting NBT paths possible in a
network, we introduce the non-backtracking accessibility matrix (NBT matrix), a
concept inspire by the accessibility matrix in Lentz et al. (2013). First, we define the
non-backtracking matrix of a temporal graph as:
424 R. Humphries et al.

[tk ,tk ] 1, if e2 = f 1 , e1 = f 2 and e ∈ E k1 , f ∈ E k2 ,
Be, f1 2 = (20.54)
0, otherwise,

where tk1 , tk2 are time steps such that tk1 < tk2 and e = (e1 , e2 ) and f = ( f 1 , f 2 )
are edges active at those times, respectively. Of course for static graphs, all edges
are active for all time. This description becomes more useful when we consider the
temporal empirical networks in the following section. Note that the NBT matrix
is defined over two time-steps and is indexed over the aggregated edge set E agg =
 nt
k=1 E k . This matrix tells us if it is possible to traverse an edge e at time tk1 and then
traverse edge f at time tk2 given that the edge f starts from where edge e finished
and f does not bring us back to where e started from. Using this matrix, we define
the NBT accessibility matrix as:

l
B [tl ] = ⊗(Id ⊕ B[tk−1 ,tk ] ), (20.55)
k=2

where the two operations ⊗ and ⊕ are Boolean multiplication and addition respec-
[tl ]
tively (Luce 1952), and so the matrix B is a binary matrix. Thus, the element Be, f
tells us if there is an NBT path that starts with traversing e and ends by traversing
f which is up to length l at up to time tl . Now, in order to determine how many
possible paths there are between any two vertices up to length l, we define two new
matrices, the in and out incidence matrices. Let the in-incidence matrix of a graph,
J in , be given by:

1, if edge e is an in edge of vertex i
in
Ji,e = (20.56)
0, otherwise,

out
and let the out-incidence matrix of a graph, Ji,e , be given by, temporal graph, where:

1, if edge e is an out edge of vertex i
out
Ji,e = (20.57)
0, otherwise.

Now, we transform B [tl ] from the edge space to the vertex space by using the two
incidence matrices defined above. This allows us to determine the number of unique
NBT paths (unique in the route taken and not the times edges are traversed) there
are between any two vertices up to length l at up to time tl from the expression:
T
C [tl ] = JoutB [tl ] Jin . (20.58)

The quantity Ci[tjl ] is a matrix indexed over the vertices, which gives the number of
unique NBT paths there are from vertex i to j. In a graph for which the TPB model is
20 A Systematic Derivation and Illustration … 425

1.0

ER
c(t)

0.5 BA
WS

0.0

0 20 40
t ime, t (day)

Fig. 20.8 The NBT reachability proportion, c(t) of the synthetic networks given in Figs. 20.5,
20.6, and 20.7. The red curves refer to the Erdős-Rényi (ER) graph, the green curves refer to the
Barabási-Albert (BA) graph, and the blue curves refer to Watts-Strogatz graph. For each graph, the,
c(t) was computed for 50 time steps

exact, each entry in this matrix would be at most one. We call the proportion of entries
in the matrix C [tl ] greater than one, the NBT reachability proportion and denote it by
c(tl ).
In Fig. 20.8 we plot the NBT reachability proportion from time 1 up to 50 for
each of the synthetic networks shown in Figs. 20.5, 20.6, and 20.7. In other words,
we plot the proportion of vertex pairs that have more than 1 NBT path connecting
them. The red curve refers to the Erdős-Rényi graph, the green curve refers to the
Barabási-Albert graph, and the blue curve refers to the Watts-Strogatz graph.
We see that the NBT reachability proportion in the Erdős-Rényi graph approaches
one almost immediately, far surpassing both the Barabási-Albert graph and the Watts-
Strogatz graph. The Barabási-Albert and Watts-Strogatz graphs attain far lower
values, with the Watts-Strogatz graph starting lower, but eventually surpassing the
Barabási-Albert graph. These results line up with the agreement between the MC
models and the TPB models for each of the considered synthetic graphs. The Erdős-
Rényi graph, with the highest NBT reachability proportion, performs almost as poorly
as the IB model, whereas the other 2 graphs, with very low proportions, perform far
better than the IB model.
Now, we validate our analytical findings for the epidemic threshold of the TPB
SIR process. For this purpose, we fix a value for μ and then for increasing values of
β, perform a number of MC simulations for long times in orderto get a distribution
of the final outbreak size, which is given by R∞ = limn→∞ n1v nk=1 v
PRk (n). Again,
we use the same synthetic graphs generated for Figs. 20.5, 20.6, and 20.7, upon
426 R. Humphries et al.

0.6 ER
BA
0.5 WS

0.4
R1

0.3

0.2

0.1

0.0
0.00 0.05 0.10
β

Fig. 20.9 The final outbreak size, R∞ , for the random graphs given in the Figs. 20.5, 20.6, and 20.7.
The red curves refer to the Erdős-Rényi (ER) graph, the green curves refer to the Barabási-Albert
(BA) graph, and the blue curves refer to Watts-Strogatz graph. R∞ was computed for different
values of β with a fixed value of μ = 0.05. For each graph, R∞ was taken as the average of the final
number of recovered individuals of 103 MC simulations for each value of β. The initial conditions
of each simulation was set so that each individual had a 0.01 chance of starting infected. For each
graph, the analytically computed critical β at which the epidemic threshold becomes greater than
one is plotted as a vertical curve of the same colour

which we run the models and MC simulations. In the long-term dynamics of the SIR
process, R∞ , will usually exceed the observation time of the network. Therefore,
periodicity of the networks is assumed in a similar way to Valdano et al. (2015)
when computing the final outbreak sizes.
In Fig. 20.9 we see the distribution of final outbreak proportions for each random
graph against their critical β as computed from the epidemic threshold of the TPB
model. For the parameters, we fix μ at the value 0.05 and then vary β for from 0
up to 0.1, running 103 MC simulations for each of the values of β. We then plot the
average final outbreak size, R∞ , of each of these 103 simulations as the red curve for
the Erdős-Rényi graph, the green curve for Barabási-Albert graph and the blue curve
for the Watts-Strogatz graph. We then compute the analytical epidemic threshold
for this value of μ from Eq. 20.45 for each of the considered graphs, and plot those
values as vertical lines. The epidemic threshold for the Erdős-Rényi graph is given
as the red vertical line, for the Barabási-Albert graph as the green line, and the Watts-
Strogatz graph as the blue line. Note that the epidemic threshold of the Erdős-Rényi
and Watts-Strogatz lie very close together.
For values of β that are greater than the computed epidemic threshold in the
Erdős-Rényi and Watts-Strogatz graphs, there is an obvious but gradual change in
dynamics as local outbreaks no longer die out, but now propagate throughout the
20 A Systematic Derivation and Illustration … 427

network leading to larger final outbreak sizes, thus showing agreement with the
analytical result for the epidemic threshold. However, this change in dynamics is not
as clear in the case of the Barabási-Albert graph, where the epidemic threshold is far
higher than in the other two cases. It is important to note as well that this difference
in the epidemic threshold is due to the difference in network structure as the value
for μ is held constant in each case. The Barabási-Albert R∞ values appear to level
off before the epidemic threshold and don’t increase after it. This is likely due to the
network structure itself being particularly robust against epidemics, in part because
of its directedness.
Overall, for the considered synthetic graphs, our model offers large improvements
over the individual-based model for all cases except the case of the Erdős-Rényi
graph, but still does not overestimate the peak as much as the IB model, thus still
offering a minor improvement. We have also confirmed that our model is exact on
tree graphs. So far we have not considered temporal graphs or graphs generated from
empirical data, and so, in the next section we test the performance of our model on
two real-world temporal networks.

20.5.3 Empirical Networks

In this section, we consider two empirical temporal networks that both vary in struc-
ture and temporal activity. For each of the empirical networks we wish to test our
findings that the TPB model offers an increase in accuracy over the TIB model. Sim-
ilarly to the previous section where we test our models on synthetic networks, we
run the TIB and TPB SIR models for all the empirical networks for given values of
β and μ and then compare them to the average of a sufficiently large number of MC
simulations. This allows us to quantify how well the different models approximate
the dynamics of the true SIR process. The considered networks are now discussed
and an overview of their properties are given in Table 20.1.

20.5.3.1 Irish Cattle Trade

The Cattle Trades network consists of all trades between herds within the Republic
of Ireland during the year 2017 with a temporal resolution of one day (Tratalos et al.
2020) (cf. Table 20.1). Due to the nature of the trade data, interactions are directional.

Table 20.1 List of empirical networks


Network list
Network Vertices Agg. edges Avg. edges Snapshots
Conference 405 9699 20.02 3509
Cattle trades 111513 1041054 347.17 365
428 R. Humphries et al.

a) b)

1500 10³ out-d egree


in-degree
10²
tr ade s

1000

herds
500 10¹

0 10⁰

0 100 200 300 10¹ ⋅⁰ 10¹ ⋅⁵ 10² ⋅⁰ 10² ⋅⁵ 10³ ⋅⁰


t ime (day s) degree

Fig. 20.10 a Time series of number of active edges per day in the network. b In- and out-degree
of the network aggregated over the entire year worth of data

Thus, this data set is modelled by a directed graph, where each vertex represents a
herd and each edge represents a trade weighted by the number of animals traded.
The aggregated degree distribution of the graph is shown in Fig. 20.10b indicates
a scale-free behaviour often seen in empirical networks. The graph appears to be
quite sparse as is evident from Fig. 20.10a, with an average of only 347 edges per
day while having an aggregated 1,041,054 edges over the entire year. The data also
displays a strong bi-modal seasonal trend with there being two distinct peaks while
there tend to be very few trades occurring on Sundays when the data points lie near
zero. Although we ignore external drivers of the disease, this model still offers insight
into how susceptible to epidemics the graph is because trade is the main vector of
non-local transmission. There are a number of infectious diseases that affect cattle,
such as Foot and Mouth Disease and Bovine Tuberculosis, the latter of which is still a
major problem in Ireland, thus effective models for the spread of infectious diseases
among herds are particularly useful tools. In the present study, we focus on the SIR
dynamics, but the TPB model framework can easily be extended to other models.
From Fig. 20.11 we can compare the performance of the TIB and TPB models on
the cattle trades network. The figure shows a year worth of the average probability
of being infected PIavg of both models and the average of 103 MC realisations for the
same choice of parameters. In both plots, the solid red curve refers to the TPB model,
the solid green curve refers to the TIB model, and the dashed blue curve refers to the
average of the MC realisations. The parameters used were β = 0.5 for the panel (a)
and β = 0.3 for panel (b), in both cases μ = 0.005 and the initial conditions were
set to PIi (0) = 0.01 for every vertex.
As in the case of the synthetic graphs, we now look at what proportion of vertex
pairs are connected by more than one NBT path, i.e., the NBT reachability proportion,
as well as the final outbreak sizes, R∞ , in comparison to the analytical epidemic
threshold. In Fig. 20.12a we see the evolution of the NBT reachability proportion
over the course of a full year for the cattle trade network. It is quite clear from the
20 A Systematic Derivation and Illustration … 429

a) b)

0.025 0.010
T PB
avg. infected, PI (t) avg

TIB 0.009
0.020 MC
0.008

0.015
0.007

0.006
0.010

0 100 200 300 0 100 200 300


time t, (days)

Fig. 20.11 TIB (solid green) and TPB (solid red) SIR models on the Irish cattle trade network
together with the average of 104 Monte-Carlo simulations (dashed blue). Panels a and b show the
time series for the average probability of being infected PIavg , for different sets of parameters. Both
panels assumed that μ = 0.005 and initial conditions of PIi (0) = 0.01 for every vertex i, panel a
uses β = 0.5 and panel b used β = 0.3

(a ) (b)
proport ion o f verte x pa irs

0.0010

0.0010 0.0008

0.0006
R1

0.0005 0.0004

0.0002
0.0000

0 100 200 300 0.0 0.1 0.2 0.3


t ime (day) β

Fig. 20.12 a The NBT reachability proportion of the cattle trade network over one year. b The final
outbreak size, R∞ , for different values of β with a fixed value of μ = 0.005. R∞ was computed
from the average of the final number of recovered individuals of 103 MC simulations for each value
of β. The initial conditions of each simulation were set so that each individual had a 0.0001 chance
of starting infected. The analytically computed critical β at which the epidemic threshold becomes
greater than one is plotted as the vertical line

figure that the proportion is particularly low over the time period, staying relatively
close to zero for nearly 100 days. This helps provide insight into why the TPB model
performs so well on this particular data set in comparison to the TIB model. The
low NBT reachability proportion means that there are few opportunities for echo
chambers to occur as in the example given in Sect. 20.3.2, and so, this allows us
to justify the use of the TPB model on this data set while the NBT reachability
proportion remains low.
430 R. Humphries et al.

According to the analytical epidemic threshold as given by Eq. (20.45), we find


the critical β for the cattle trade network to be 0.0049, given the value for μ as
0.005. This is plotted as the vertical line given in Fig. 20.12b. The other curve in
this figure is the final outbreak sizes R∞ for the same value of μ = 0.005, but for
varying values of β. We compute the final outbreak size, R∞ by running 103 MC
realisations for each of the values of β until there are no infected individuals left, the
initial condition of each run is set such that each individual has a random chance of
0.0001 of starting the simulation as infected. We then take the average of the final
number of recovered individuals. The epidemic threshold, in this case, is very small
and we can see that near the critical β, R∞ stays very close to initial proportion of
infected, but as β begins to increase further away from the critical value, R∞ begins
to increase rapidly.
As is evident from Fig. 20.11, the TPB model offers a significant improvement
over the TIB model because there is far better agreement with the MC average for
both sets of parameters, with the IB model deviating significantly. The reason for
such a considerable improvement can be explained by the fact that the TPB model
is exact on graphs where the NBT reachability proportion is zero. However, because
the cattle trade network is a production network, there exist very few scenarios where
there are many possible NBT paths between herds, making the graph structure highly
tree-like in its supra-adjacency embedding. This can be explained by the fact that the
existence of such cycles are inefficient and cost prohibitive in the trade process. As
a result, the graph is well approximated by a tree graph. Therefore, the SIR process
is well approximated by the TPB model for such a graph.

20.5.3.2 Conference Contacts

The second data set (cf. Table 20.1) is the Conference network described in Génois
and Barrat (2018). It includes the face-to-face interactions of 405 participants at the
SFHH conference held in Nice, France 2009. Each snapshot of the network represents
the aggregated contacts in windows of 20s. Since this data set describes face-to-face
interactions, each contact is bi-directional and so an undirected graph is a natural
choice to model these interactions.
In Fig. 20.13a we see that the edge activity, in this case, shows a number of
peaks occurring then quickly dying out with a large break of no activity for many
hours. These are explained by breaks between sessions at the conference during
which the participants converse and interact as well as an overnight break as the
conference lasted more than 24 h. Because of the time scale and observation period
of this particular temporal graph, it is not feasible to model the spread of disease
as infection and recovery are unlikely to occur within the observation period, which
is approximately 30 h. However, we can use our model to simulate the spread of
viral information or “gossip” using the same dynamics as the SIR model. Infection
is equivalent to receiving some information in such a way that it becomes interesting
enough for the individual to try and spread to those they contact in the future and
recovery is equivalent to growing tired of the information and no longer informing
20 A Systematic Derivation and Illustration … 431

(a) (b )

10¹ ⋅⁵
100

ind ividu als


10¹ ⋅⁰
cont act s

50
10⁰⋅⁵
out- degree
in-d egree
0 10⁰⋅⁰

0 10 20 30 10¹ ⋅⁰ 10¹ ⋅⁵ 10² ⋅⁰ 10² ⋅⁵ 10³ ⋅⁰


tim e (hr) degree

Fig. 20.13 a Time series of number of active edges per time-step in the network. b In- and out-
degree of the network aggregated over the entire data set

a) b)
0.3
TP B 0.15
avg. infected, PI (t)
avg

TI B
0.2 MC
0.10

0.1 0.05

0.0 0.00

0 1000 2000 0 1000 2000


time t, (days)

Fig. 20.14 TIB (solid green) and TPB (solid red) SIR models on the conference network together
with the average of 104 Monte-Carlo simulations (dashed blue). Panels a and b show the time series
for the average probability of being infected PIavg , for different sets of parameters. Both panels
assumed that μ = 0.005 and initial conditions of PIi (0) = 0.01 for every vertex i, panel a uses
β = 0.5 and panel b used β = 0.05

others they meet. As shown in Fig. 20.13b, there is a clear heavy tail with most
vertices having a relatively small aggregated degree, meaning most individuals had
relatively few interactions in comparison to the most popular individuals.
Figure 20.14 shows the time series of the different models for two probabilities
of infection: β = 0.5 in a and β = 0.05 in b. In both cases, the μ = 0.05 and the
initial condition for each vertex is given as PIi (0) = 0.01. In both panels, the average
probability of belonging to the infected compartment, PI avg , is plotted for the TPB
model as the solid red curve, for the TIB curve as the solid green curve, and the average
of the 104 MC realisations as the dashed blue curve. Again, one can observe that in
every case the TPB approximation offers an improvement over the TIB. However,
432 R. Humphries et al.

(a ) (b)
0.4

0.06
0.3
c(t)

0.2

R1
0.04

0.1
0.02
0.0

0.0 0.5 1.0 1.5 2.0 0.00 0.02 0.04


time, t (hr) β

Fig. 20.15 a The NBT reachability proportion of the conference network over the first 2.5 h. b
The final outbreak size, R∞ , for different values of β with a fixed value of μ = 0.005. R∞ was
computed from the average of the final number of recovered individuals of 103 MC simulations for
each value of β. The initial conditions of each simulation were set so that each individual had a 0.01
chance of starting infected. The analytically computed critical β at which the epidemic threshold
becomes greater than one is plotted as the vertical line

compared to the MC simulations, it still performs quite poorly, barely outperforming


the IB model in panel (a).
In order to explain why the TPB model fails to provide a reasonable increase in
accuracy when compared to the TIB model for this particular data set, we look at the
NBT reachability proportion over time in Fig. 20.15a. When compared to the equiv-
alent plot for the cattle trade dataset in Fig. 20.12a, the NBT reachability proportion
in this data set is far higher, with approximately a 10 times difference between scales.
This can be compared to the NBT reachability proportion of the Erdős-Rényi graph
in Fig. 20.8, which has a similar scale in NBT reachability proportion and performs
approximately the same as the TIB model in the SIR simulations (cf. Figs. 20.5 and
20.14)
The reason we do not see a good agreement with the MC average for this particular
data set is due to the underlying topology of the graph, which is a physical social
interaction network where individuals congregate in groups and most or all in the
group interact with one another. This leads to large clusters that give rise to many
possible NBT paths between vertices. The more time-respecting NBT paths that exist
between any two vertices, the worse the TPB model will perform. It is for this reason
that we see a relatively large deviation from the MC simulations for the TPB model.
For the conference network, we computed the critical β to be 0.0029 for given
the value of μ was taken to be 0.005. In Fig. 20.15b, we show how the final outbreak
size R∞ increases for varying values of β where we fix the value of μ at 0.005 and
show the computed critical β as a vertical line. To compute R∞ we run 103 MC
simulations until there are no infected individuals left for each of our values of β and
then take the average of the final number of recovered left. The initial conditions of
each simulation were such that each individual had a probability of 0.01 of starting
20 A Systematic Derivation and Illustration … 433

infected. Similarly to the other scenarios, we see agreement between the analytical
epidemic threshold critical β and the final outbreak sizes R∞ . As the values of β move
further away from this critical β, the final outbreak size R∞ also begins to increase,
whereas, below the critical value, R∞ stays very close to the initial conditions.

20.6 Summary

In this chapter, we have presented work on pair-based susceptible-infected-recovered


(SIR) models by systematically extending them to a temporal setting and investigat-
ing the effect of non-backtracking cycles on the accuracy of the model on arbitrary
network structures. We have found that the existence of many such non-backtracking
cycles leads to a deviation in the pair-based model from the dynamics of the full SIR
process due to the echo chamber effect they induce. Thus, the pair-based model is
best suited to network structures that do not contain many cycles, such as production
networks. We also find that our analytical finding for the epidemic threshold holds
when compared to numerical simulations as demonstrated by a qualitative change in
the final outbreak proportion.

Acknowledgements RH acknowledges support via a Ph.D. scholarship of University College


Cork. PH acknowledges support by Deutsche Forschungsgemeinschaft (DFG) under project ID
434434223 - SFB 1461.

Chapter 21
Modularity-Based Selection of the Number of Slices in Temporal Network Clustering

Patrik Seiron, Axel Lindegren, Matteo Magnani, Christian Rohner, Tsuyoshi Murata, and Petter Holme

Abstract A popular way to cluster a temporal network is to transform it into a sequence of networks, also called slices, where each slice corresponds to a time interval and contains the vertices and edges existing in that interval. A reason to perform this transformation is that after a network has been sliced, existing algorithms designed to find clusters in multilayer networks can be used. However, to use this approach, we need to know how many slices to generate. This chapter discusses how to select the number of slices when generalized modularity is used to identify the clusters.

Keywords Temporal network · Clustering · Number of slices

21.1 Introduction

Clustering is one of the most studied network analysis tasks, with an ever-growing
number of articles proposing new algorithms and approaches (Fortunato 2010; Coscia
et al. 2011; Bothorel et al. 2015). The absence of a unique definition of a cluster can
partly explain the large number of available clustering algorithms. While
a cluster in a simple network is generally understood as a set of vertices that are
well connected to each other and less well connected to other vertices, different
algorithms use different specific definitions of cluster and clustering (that is, the
set of all clusters). When we add more information to vertices and edges, e.g., the

P. Seiron · A. Lindegren · M. Magnani (B) · C. Rohner
Department of Information Technology, Uppsala University, Uppsala, Sweden
e-mail: matteo.magnani@it.uu.se
T. Murata
Department of Computer Science, Tokyo Institute of Technology, Tokyo, Japan
P. Holme
Department of Computer Science, Aalto University, Espoo, Finland
Center for Computational Social Science, Kobe University, Kobe, Japan


Fig. 21.1 A temporal network with three timestamps

fact that edges only exist at specific times, defining what constitutes a cluster or a
clustering becomes even more challenging.
The availability of temporal information about the existence of edges has two main
consequences on the clustering task. First, clustering algorithms not considering the
temporal information may miss clusters only existing at specific times (because they
are hidden by edges active at other times) or identify clusters that do not exist at
any specific time. Second, new types of clusters can be defined, for example, clusters
recurring at regular times or clusters growing, shrinking, merging, and splitting (Palla
et al. 2007). For these reasons, different extensions of clustering methods for temporal
networks have been proposed.
In this chapter, we focus on a common approach to temporal network clustering,
consisting of two steps: first, the temporal network is sliced into a sequence of
static networks, also called slices. This sequence of networks is a specific type of
multilayer network (Kivela et al. 2014; Boccaletti et al. 2014; Dickison et al. 2016).
Then a clustering algorithm for multilayer networks is used to discover clusters. An
example of this approach is shown in Figs. 21.1 and 21.2.
While this approach allows reusing the many clustering algorithms defined for
multilayer networks, it relies on the ability to choose a number of slices leading to
good clustering. This number can sometimes be decided based on domain knowledge.
Still, for this approach to be usable in general, it is important to have ways to discover
a good number of slices directly from the data. This is fundamental in the absence
of domain knowledge, but also useful to check if the number suggested by a domain
expert is compatible with the data. Therefore, the research question we address in this
chapter is: given a temporal network and a multi-slice network clustering algorithm,
how can we find a number of slices for which well-defined clusters emerge?
The answer to this question depends on the algorithm used to cluster the network
after slicing. In this chapter, we focus on one of the most used multilayer network clus-
tering algorithms: generalized Louvain (Mucha et al. 2010). Multi-slice modularity,

Fig. 21.2 Three temporal slices, one for each timestamp, where distinct clusters can be identified

the objective function used by the algorithm, considers modularity in each layer and
also increases when the same vertex is included in the same cluster in different layers.
Since we have an objective function (in this case, multi-slice modularity), if our goal is to find the number of slices leading to the best clustering, we might be tempted to run the generalized Louvain optimization algorithm for different numbers of slices
and pick the result with the highest modularity. Given the same network (or the
same sliced sequence of networks), we can compare the modularity of different
clusterings to identify the best. In particular, if two clusterings of the same network
sliced into the same number of slices have a different modularity, it is often assumed
that the clustering with a higher modularity is a better clustering. Unfortunately,
in general, we cannot use modularity to compare clusterings of the same network
sliced differently. If we split the same network using two different numbers of slices,
a clustering of the former with higher modularity is not necessarily better than a
clustering of the latter with lower modularity.
As an example, Fig. 21.3 shows the modularity of the clusterings discovered by the
generalized Louvain algorithm on four real temporal networks varying the number
of slices. The four networks represent contacts between people measured by carried
wireless devices (haggle (Chaintreau et al. 2007)), face-to-face interactions (hyper,


Fig. 21.3 Modularity of the partitions returned by the generalized Louvain algorithm varying the
number of slices for four real temporal networks

infect (Isella et al. 2011)), and friendships between boys in a small high school in
Illinois (school (Coleman 1964)). We can see that the more slices we have, the higher
the modularity we get from the algorithm. This suggests that increasing modularity
values for different numbers of slices is not necessarily an indication of a better
clustering but can be a by-product of the changing size of the input networks.
To address this problem, we use the following hypothesis. Multi-slice modularity
has two components: one that increases with better clusterings and one that increases
just because the data size increases, e.g., if we duplicate a slice, the same cluster
extended across two slices will contain additional inter-slice edges. If this hypothesis
is correct, then we can try to isolate the first component in the modularity value and
use it to compare clusterings computed using different numbers of slices.
In this chapter, we show the dependency between the number of slices and multi-
slice modularity, both analytically and experimentally. We also use an edge reshuf-
fling algorithm to separate the effect of the number of slices, resulting in a first
corrected multi-slice modularity measure. We then experimentally validate the cor-
rected multi-slice modularity on synthetic networks where the best number of slices
is known in advance, and we identify clusters in different types of real temporal net-
works using the proposed approach. We conclude by discussing the limitations of the
reshuffling-based method and, more generally, of the modularity-based approach.

21.2 Related Work

For a detailed overview of clustering methods in temporal networks, we refer the reader to the dedicated chapter in this book.
Clustering algorithms for temporal networks approach the temporal evolution of
networks in different ways. Algorithms building on segmentation seek time slices
containing well-defined community structures (quality) yet being similar to neigh-
boring slices (stability). For example, Aynaud and Guillaume (2011) builds a hierar-
chical time segmentation by extracting interesting time windows based on structural
changes and identifying a unique decomposition for the time windows. He et al.
(2017) uses the so-called Moore’s Visualization Method, which implies a certain
overlap of time slices. Simpler approaches segment the temporal network in equal
time slices or slices with similar edge density. In this chapter, we abstract from the
segmentation approach by using equal time slices and focus on how many slices
should be used.
After a temporal network has been converted into a multilayer network, sev-
eral approaches can be used to discover clusters. These include, among others,
algorithms based on generative models (De Bacco et al. 2017), multilayer cliques (Tehrani and Magnani 2018), walks (Boutemine and Bouguessa 2017), information theory (De Domenico et al. 2015), aggregation of single-layer clusterings (Berlingerio et al. 2013; Tagarelli et al. 2017), and modularity (Mucha et al. 2010). An overview of existing approaches is available in Magnani et al. (2021). Different algo-
rithms using different definitions of cluster require different measures to evaluate the

goodness of a slicing, and the best number of slices does not need to be the same for
all approaches. In this chapter, we focus on modularity-based clustering.
Other algorithms avoid segmentation and instead track evolution in time by
observing events (i.e., birth, death, growth, contraction, merge, split, continue, resur-
gence) (Rossetti et al. 2017). Palla et al. (2007) build on clique percolation and
observe that small communities, in general, have static, time-independent member-
ship, while large communities are dynamic. Random walk-based approaches (Rosvall
and Bergstrom 2008) and stochastic block modeling (Matias and Miele 2017) are
other alternatives to modularity-based clustering.

21.3 Method

To evaluate the quality of a clustering we use the concept of modularity, describing the fraction of the edges within communities minus the expected fraction if edges were distributed at random, preserving degree distribution. For simple networks, and without considering the so-called resolution parameter that was introduced at a later time, modularity is defined as follows:

$$\frac{1}{2m}\sum_{i,j}\left(A_{ij} - \frac{k_i k_j}{2m}\right)\delta(\gamma_i, \gamma_j) \qquad (21.1)$$

where $A$ is the adjacency matrix, $k_i$ is the degree of vertex $i$, $\gamma_i$ is the community id of vertex $i$, $\delta(\gamma_i, \gamma_j) = 1$ if $\gamma_i = \gamma_j$ and 0 otherwise, and $m$ is the number of edges in the network.
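As a concrete illustration of Eq. (21.1), the following minimal NumPy sketch (not part of the original chapter) evaluates the modularity of a given partition from an adjacency matrix; the example values are invented for illustration only.

```python
import numpy as np

def modularity(A, communities):
    """Modularity of a partition (Eq. 21.1) for a simple undirected graph.

    A           : symmetric adjacency matrix (numpy array, shape n x n)
    communities : array of length n with the community id gamma_i of each vertex
    """
    k = A.sum(axis=1)                     # vertex degrees k_i
    two_m = A.sum()                       # 2m (every edge counted twice)
    same = communities[:, None] == communities[None, :]   # delta(gamma_i, gamma_j)
    return ((A - np.outer(k, k) / two_m) * same).sum() / two_m

# Two disconnected triangles split into two communities give modularity 0.5.
A = np.zeros((6, 6))
for i, j in [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5)]:
    A[i, j] = A[j, i] = 1
print(modularity(A, np.array([0, 0, 0, 1, 1, 1])))   # 0.5
```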
In multi-slice networks, modularity is defined as follows (Mucha et al. 2010):

$$\frac{1}{2\mu}\sum_{ijsr}\left[\left(A_{ijs} - \frac{k_{is} k_{js}}{2 m_s}\right)\delta(s, r) + c_{jsr}\,\delta(i, j)\right]\delta(\gamma_{i,s}, \gamma_{j,r}) \qquad (21.2)$$

where $i$ and $j$ indicate vertices and $s$ and $r$ slices. This formula is a combination of the modularity in each slice plus a contribution $c_{jsr}$ for vertices included in the same community in two slices. In this chapter, we consider the case where $c_{jsr} = 1$ iff $r = s + 1$, that is, the two slices are consecutive. $\mu$ is the number of all (intra-slice) edges plus the sum of all $c_{jsr}$.
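Equation (21.2) can be evaluated in the same spirit. The sketch below follows the formula as stated above (with $c_{jsr} = 1$ iff $r = s + 1$); it is an illustration only, not the authors' implementation, which relies on the generalized Louvain code both to optimize and to evaluate this quantity.

```python
import numpy as np

def multislice_modularity(slices, assignments):
    """Multi-slice modularity (Eq. 21.2) with c_{jsr} = 1 iff r = s + 1.

    slices      : list of symmetric adjacency matrices, one per slice
                  (each slice is assumed to contain at least one edge)
    assignments : list of arrays; assignments[s][i] = community of vertex i in slice s
    """
    n, S = slices[0].shape[0], len(slices)
    m = [A.sum() / 2 for A in slices]          # number of intra-slice edges m_s
    mu = sum(m) + n * (S - 1)                  # intra-slice edges + sum of all c_{jsr}
    total = 0.0
    for s, A in enumerate(slices):             # intra-slice terms (r = s)
        k = A.sum(axis=1)
        same = assignments[s][:, None] == assignments[s][None, :]
        total += ((A - np.outer(k, k) / (2 * m[s])) * same).sum()
    for s in range(S - 1):                     # coupling terms (i = j, r = s + 1)
        total += (assignments[s] == assignments[s + 1]).sum()
    return total / (2 * mu)
```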
The basic approach we use to separate the effect of the presence of clusters from the
effect of just increasing the number of slices is based on an edge reshuffling process
that destroys the clusters in the network without affecting the degree distribution.
For each number of slices, the Louvain algorithm is run both on the original data and
on the reshuffled data where the clusters have been destroyed. The modularity on the
dataset without clusters indicates the effect of the number of slices on modularity.
Here we use the difference between the two to estimate the part of modularity due
to the presence of clusters. We call this difference corrected modularity.

In particular, we run community detection multiple times and take the maximum
value of modularity to account for the non-deterministic character of the generalized
Louvain algorithm. To remove the effect of the number of slices, we use the edge
swapping randomization model (Karsai et al. 2011; Gauvin et al. 2018) that selects
two edges (i, j) and (u, v) at random and swaps two randomly selected ends of the
two edges. Some practical decisions have to be made to perform the randomization.
First, we have to avoid swaps producing existing edges so that the total number of
edges does not change. We must also do the shuffling slice by slice to preserve the
intra-slice degree distributions—that is, in this context, we do not use the exact same
reshuffling as described in (Karsai et al. 2011; Gauvin et al. 2018). It should also be
noted that using this process, we cannot break a single clique contained in a single
slice.
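A possible realization of this slice-by-slice reshuffling uses the degree-preserving double-edge swaps available in networkx, as sketched below. This is only an illustration; as noted above, the exact constraints applied by the authors may differ from what this helper does.

```python
import networkx as nx

def shuffle_slices(slices, swaps_per_edge=10, seed=0):
    """Reshuffle every slice with degree-preserving double-edge swaps.

    slices : list of networkx.Graph objects (simple undirected graphs).
    Returns reshuffled copies; the original graphs are left untouched."""
    shuffled = []
    for s, G in enumerate(slices):
        H = G.copy()
        if H.number_of_nodes() >= 4 and H.number_of_edges() >= 2:
            n_swaps = swaps_per_edge * H.number_of_edges()
            nx.double_edge_swap(H, nswap=n_swaps, max_tries=100 * n_swaps, seed=seed + s)
        shuffled.append(H)
    return shuffled
```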
In summary, the method can be described as follows (a minimal code sketch of this loop is given after the list):
1. For i from 1 to n:
   a. Slice the temporal network into i slices.
   b. Run modularity-based multilayer community detection multiple times and compute modularity. We call m_o(i) the maximum modularity found for i slices.
   c. Apply edge randomization in each slice.
   d. Run community detection multiple times and compute modularity. We call m_r(i) the maximum modularity found for i slices after randomization.
   e. Compute the corrected modularity m_n(i) = m_o(i) − m_r(i).
2. Return the i maximizing m_n(i).
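The following sketch puts the pieces together. The helper functions are hypothetical placeholders: `slicer` splits the temporal network into i slices, `detector` runs one round of (generalized Louvain style) multilayer community detection and returns the modularity of the partition it finds, and `shuffler` applies the intra-slice edge randomization described above.

```python
import numpy as np

def corrected_modularity_scan(stream, n_max, slicer, detector, shuffler, n_runs=10):
    """Brute-force scan over the number of slices, returning the i maximizing m_n(i)."""
    m_o = np.zeros(n_max + 1)
    m_r = np.zeros(n_max + 1)
    for i in range(1, n_max + 1):
        slices = slicer(stream, i)
        m_o[i] = max(detector(slices) for _ in range(n_runs))
        reshuffled = shuffler(slices)
        m_r[i] = max(detector(reshuffled) for _ in range(n_runs))
    m_n = m_o - m_r                            # corrected modularity
    return int(np.argmax(m_n[1:]) + 1), m_n
```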

21.4 Results

In this section, we present three main results. First, we support our assumption that
part of the growth in modularity when the number of slices increases is a direct
consequence of the increased number of slices and not necessarily of the presence
of better clusters. Then, we use synthetic datasets where the best number of slices is
known in advance to test if our approach can correctly identify these values. Finally,
we execute our method on real networks for which we do not know the best number
of slices.

21.4.1 Expected Modularity Increment in Sequentially Duplicated Networks

In this section, we analyze the behavior of modularity when we start from a single slice
with clear clusters and add additional identical slices. In this dataset, the assignment

of vertices (i, j, . . . ) into clusters does not change with the number of slices: the
clusters are replicated on all slices.
We first do this analytically. We start from a single slice, where modularity is computed as in Eq. 21.1. For convenience, let us call $A = \sum_{i,j} A_{ij}\,\delta(\gamma_i, \gamma_j)$ and $K = \sum_{i,j} \frac{k_i k_j}{2m}\,\delta(\gamma_i, \gamma_j)$. With this substitution, we can write the modularity of one slice as:

$$\frac{A - K}{2m} \qquad (21.3)$$
If we duplicate the network, that is, we add one slice identical to the original network and replicate the same cluster assignments on the two slices, the expected modularity includes the modularity on the two slices (each equal to the modularity above) plus one interlayer link for each vertex (as by definition each vertex belongs to the same cluster in both slices); denoting by $a$ the number of these interlayer links, one per vertex, we obtain:

$$\frac{(A - K) + (A - K) + 2a}{2(a + m + m)}$$

Generalizing to $S$ slices, we obtain:

$$\frac{S(A - K) + 2a(S - 1)}{2(a(S - 1) + Sm)}$$

This result is tested in Fig. 21.4, which shows that the theoretical model approximates the empirical one well.
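The growth predicted by this expression is easy to reproduce numerically. The values of A, K, a, and m below are arbitrary illustrative numbers, not taken from the chapter's experiments; the point is only that the predicted modularity increases with S.

```python
def replicated_modularity(A, K, a, m, S):
    """Predicted modularity when a single slice (with quantities A, K, m and
    a inter-slice links) is replicated over S identical slices."""
    return (S * (A - K) + 2 * a * (S - 1)) / (2 * (a * (S - 1) + S * m))

# Arbitrary illustrative values: the predicted modularity grows with S.
for S in (1, 2, 4, 8, 16):
    print(S, round(replicated_modularity(A=30.0, K=12.0, a=10, m=20, S=S), 3))
```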

21.4.2 Synthetic Data Validation

21.4.2.1 Hidden Cliques

Our first synthetic data consists of two cliques separated by random noise (20%
density), with this pattern repeated five times. The network is shown in Fig. 21.5a,
split into different numbers of slices. When we only have one slice, the combination
of the noise present throughout the existence of the network hides the clusters. When
we use five slices (Fig. 21.5b), the cliques are easily visible in all slices. For larger numbers of slices, the cliques disappear again because they are spread across several less dense slices.
With this dataset, we know that the best clusters appear when we have five slices.
Figure 21.6a shows the original modularity, the randomized modularity, and our cor-
rected modularity. While the first two increase when the number of slices increases,
the corrected modularity peaks at five slices. Figure 21.6b shows the result for the
same experiment but with ten repetitions of the clique-noise pattern instead of five.
The method correctly finds a peak at ten slices.

Fig. 21.4 Expected modularity of a replicated network: the value increases with the number of
slices, following a predictable pattern

Figure 21.6b also shows the normalized mutual information (NMI) between the
ground truth clusters and the clusters found by the algorithm, for different numbers
of slices. A higher value of NMI corresponds to more similar clusterings. We notice
how the number of slices identified by our approach corresponds to the highest NMI.
However, the generalized Louvain algorithm would still be able to reach the same
NMI with other numbers of slices (up to 15, respectively 25).
Figure 21.7 shows a different type of synthetic temporal network where five groups
of vertices are active at different times. An example of this type of behavior can be
a museum, where some organized groups enter the exhibition at different times and
go through it together, being active for the whole duration of their visit and then
disappearing from the data.
In this case, the best number of slices is the one where all the groups are collected
together, that is, only one slice. This corresponds to the highest value of corrected
modularity (Fig. 21.7d). Notice, however, that the generalized Louvain algorithm
would identify the same clusters independently of the number of slices.


Fig. 21.5 A temporal network with two cliques separated by random noise is repeated five times:
For a single slice (a), the aggregated noise hides the cliques. Five slices (b) reveal the cliques as
the number of slices separates the five repetitions of the network. The cliques disappear for an
increasing number of slices (c) because they are spread across several less dense slices


Fig. 21.6 A temporal network with two cliques separated by random noise is repeated five and
ten times, respectively: The figures show the modularity of the original network (red), the shuffled
network (green), the corrected modularity (blue), as well as the normalized mutual information NMI
(purple), for five (a) and ten (b) repetitions of the experiment. The peaks of the corrected modularity are at five and ten slices, respectively.

21.4.3 Real Data

Figure 21.8 shows the original modularity computed by the generalized Louvain
algorithm (black), the modularity of the randomized network (red), and the corrected
modularity (blue) for the four real datasets in Fig. 21.3. The corrected modularity fol-
lows similar trends for the Hypertext, Haggle, and School datasets, with a maximum
reached after a few slices have been obtained, while for the Infect dataset, we see a
maximum for the original, unsliced data.


Fig. 21.7 Five cliques are active at different times: the five cliques are easily identified with one slice (a), where the corrected modularity also has its maximum. The normalized mutual information reveals, however, that the clusters are correctly identified independently of the number of slices—five in (b) and ten in (c). Panel (d) shows the modularity.

21.5 Discussion

We have presented an approach to select the number of slices to discover communities in temporal networks using generalized modularity. Based on the observation
that the value of generalized modularity tends to increase when we increase the
number of slices, even when no additional information is generated, the proposed
approach corrects the modularity. In this chapter, we use edge reshuffling to perform
the correction.


Fig. 21.8 Examples from real-world data sets: Hypertext (a), Haggle (b), School (c), Infect (d). Original modularity (black), modularity of the reshuffled model (red), and corrected modularity (blue)

While the proposed method looks for the highest value of corrected modularity
to select the best number of slices, this does not mean that lower values of corrected
modularity would necessarily correspond to worse clusterings. In fact, even when the
proposed method identifies the number of slices corresponding to the best clustering,
our experiments show that there is often a range of values where the communities
are clear enough to be identified by the clustering algorithm.
This chapter does not focus on efficient computation. In the experiments, we use a
brute-force search to find the number of slices maximizing the corrected modularity,
trying all values from 1 to an arbitrary number. While this works for small or medium
networks, recomputing generalized Louvain hundreds of times may not be feasible
for larger networks, so a smarter exploration of the solution space would be necessary.
While this chapter focuses on modularity, there are intrinsic limitations of modu-
larity that should be considered when it is used to identify communities in temporal

Fig. 21.9 An example of recurrent communities that is not captured well by multi-slice modularity

networks. A first issue is that generalized Louvain tends to identify pillar clusters so
that a vertex that belongs to the same cluster in many layers may get clustered with
the same nodes also in layers where they are not well connected. More generally, in
a temporal network, we may expect some vertices to belong to some clusters only
at some times, while generalized Louvain would force all vertices to belong to a
cluster in each slice. Recurrent clusters (appearing and disappearing) are also not
supported well by the ordered version of generalized Louvain. Figure 21.9 shows a
slicing where clear communities are visible, but that does not correspond to a max-
imum value of corrected modularity. This is expected, as modularity tries to cluster
all the vertices and looks for inter-layer consistency.
As a final consideration, in this chapter, we assume that the two modularity compo-
nents, one that increases with better clusterings and one that increases just because the
data size increases, are additive. The assumption seems to hold for identifying clearly
planted clusters in synthetic data but also gives unstable results for low numbers of
slices in a case where we repeat the same network in every slice (see Fig. 21.10); in

Fig. 21.10 The same network is replicated on all the slices: a scenario where the corrected modularity is expected to be constant but shows unexpected behavior for small numbers of slices

this case, we could expect corrected modularity not to show any difference between
different numbers of slices.

Acknowledgements This work has been partly funded by eSSENCE, an e-Science collaboration
funded as a strategic research area of Sweden, by STINT initiation grant IB2017-6990 “Mining
temporal networks at multiple time scales”, and by EU CEF grant number 2394203 (NORDIS –
NORdic observatory for digital media and information DISorder). P.H. was supported by JSPS
KAKENHI Grant Number JP 21H04595.

References

T. Aynaud, J.L. Guillaume, in 5th SNA-KDD Workshop, vol. 11 (2011)
M. Berlingerio, F. Pinelli, F. Calabrese, Data Mining Knowl. Discov. 27(3), 294 (2013)
S. Boccaletti, G. Bianconi, R. Criado, C.I. del Genio, J. Gómez-Gardeñes, M. Romance, I. Sendiña-
Nadal, Z. Wang, M. Zanin, Phys. Rep. 544(1), 1 (2014)
C. Bothorel, J. Cruz, M. Magnani, B. Micenková, Netw. Sci. 3(3) (2015)
O. Boutemine, M. Bouguessa, ACM Trans. Knowl. Disc. Data 11(4), 1 (2017)
A. Chaintreau, P. Hui, J. Crowcroft, C. Diot, R. Gass, J. Scott, IEEE Trans. Mob. Comput. 6(6),
606 (2007)
J.S. Coleman, Introduction to Mathematical Sociology (Free Press of Glencoe, 1964)
M. Coscia, F. Giannotti, D. Pedreschi, Stat. Anal. Data Min.: ASA Data Sci. J. 4(5), 512 (2011)
C. De Bacco, E.A. Power, D.B. Larremore, C. Moore, Phys. Rev. E 95(4) (2017)
M. De Domenico, A. Lancichinetti, A. Arenas, M. Rosvall, Phys. Rev. X 5(1) (2015)
M.E. Dickison, M. Magnani, L. Rossi, Multilayer Social Networks (Cambridge University Press,
Cambridge, 2016)
S. Fortunato, Phys. Rep. 486(3), 75 (2010)
L. Gauvin, M. Génois, M. Karsai, M. Kivelä, T. Takaguchi, E. Valdano, C.L. Vestergaard, (2018).
arXiv:1806.04032v1
J. He, D. Chen, C. Sun, Y. Fu, W. Li, Physica A: Stat. Mech. Appl. 469 (2017)
L. Isella, J. Stehlé, A. Barrat, C. Cattuto, J.F. Pinton, W.V.D. Broeck, J. Theor. Biol. 271(1), 166
(2011)
M. Karsai, M. Kivelä, R. Pan, K. Kaski, J. Kertész, A.L. Barabási, J. Saramäki, Phys. Rev. E 83 (2011)
M. Kivelä, A. Arenas, M. Barthelemy, J.P. Gleeson, Y. Moreno, M.A. Porter, J. Complex Netw. 2(3), 203 (2014)
M. Magnani, O. Hanteer, R. Interdonato, L. Rossi, A. Tagarelli, ACM Comput. Surv. 54(3) (2021)
C. Matias, V. Miele, J. R. Stat. Soc.: Ser. B (Statistical Methodology) 79 (2017)
P.J. Mucha, T. Richardson, K. Macon, M.A. Porter, J.P. Onnela, Science 328(5980), 876 (2010)
G. Palla, A.L. Barábasi, T. Vicsek, Nature 446 (2007)
G. Rossetti, L. Pappalardo, D. Pedreschi, F. Giannotti, Mach. Learn. 106 (2017)
M. Rosvall, C. Bergstrom, Proc. Natl. Acad. Sci. USA 105 (2008)
A. Tagarelli, A. Amelio, F. Gullo, Data Mining Knowl. Discov. 31(5), 1506 (2017)
N.A. Tehrani, M. Magnani, in Social Informatics—10th International Conference, vol. 11186
(Springer, 2018), pp. 15–28
Chapter 22
A Frequency-Structure Approach
for Link Stream Analysis

Esteban Bautista and Matthieu Latapy

Abstract A link stream is a set of triplets (t, u, v) indicating that u and v interacted
at time t. Link streams model numerous datasets and their proper study is crucial in
many applications. In practice, raw link streams are often aggregated or transformed
into time series or graphs where decisions are made. Yet, it remains unclear how the
dynamical and structural information of a raw link stream carries into the transformed
object. This work shows that it is possible to shed light on this question by studying
link streams via algebraically linear graph and signal operators, for which we intro-
duce a novel linear matrix framework for the analysis of link streams. We show that,
due to their linearity, most methods in signal processing can be easily adopted by
our framework to analyze the time/frequency information of link streams. However,
the availability of linear graph methods to analyze relational/structural information
is limited. We address this limitation by developing (i) a new basis for graphs that
allow us to decompose them into structures at different resolution levels and (ii)
filters for graphs that allow us to change their structural information in a controlled
manner. By plugging in these developments and their time-domain counterpart into
our framework, we are able to (i) obtain a new basis for link streams that allow us to
represent them in a frequency-structure domain; and (ii) show that many interesting
transformations to link streams, like the aggregation of interactions or their embed-
ding into a euclidean space, can be seen as simple filters in our frequency-structure
domain.

Keywords Temporal networks · Link streams

E. Bautista · M. Latapy (B)
CNRS, Laboratoire d'Informatique de Paris 6, LIP6, F-75005 Paris, France
e-mail: Matthieu.Latapy@lip6.fr


22.1 Introduction

Financial transactions, phone calls, or network traffic are examples of data that can
be very well modeled as a link stream (Latapy et al. 2018; Latapy et al. 2019): a set
of possibly weighted triplets (t, u, v) indicating that u and v interacted at time t. For
example, a triplet can model that a bank account u made a transaction to an account
v at a time t, or that a computer u sent a packet to a computer v at a time t (weights
may represent amounts or sizes). Link streams have gained considerable attention
in recent years as numerous phenomena of crucial interest, such as financial frauds,
network attacks, or fake news, correspond in them to clusters of interactions with
some distinctive dynamical and structural signature. For this reason, the develop-
ment of techniques allowing a precise understanding of the dynamical and structural
properties of link streams has become a subject of utmost importance.
Traditionally, link streams are seen as collections of time series (one for each
relation u, v) or as sequences of graphs (one for each time t). These interpretations
then allow using the frameworks of signal processing and graph theory to study
their dynamical and structural information, respectively. However, the very sparse
nature of most real-world link streams causes these approaches to be frequently
inconclusive. Namely, the graphs from the sequence often contain so few edges that
it is hard to extract useful structural information from them, while the time series of
relations are often too spiky to allow the extraction of insightful dynamical patterns.
For these reasons, it is a very common practice to transform a sparse link stream
into a denser one by aggregating all interactions contained within pre-defined time
windows, which results in a new link stream with properties that are simpler to
study. Yet, this comes at the price of potentially no longer containing the information
necessary to detect an event of interest. For example, if the appearance of two links in close succession is key to spotting an event, then the aggregation process is likely to destroy this information.
Interestingly, aggregation is not the only situation where the above problem may
occur, as the signal and graph interpretations are often used to transform link streams
into new objects where decisions are made. Such new objects may be another link
stream (Chiappori and Cazabet 2021; Ribeiro et al. 2013; Paranjape et al. 2017), a
time series whose samples summarize structural information (Fontugne et al. 2017;
Özcan and Öğüdücü 2017; Bhatia et al. 2020; Kodali et al. 2020), or a graph whose
edges summarize dynamical information (Peng et al. 2019; Zhang et al. 2019; Chang
et al. 2021). In all these situations, information about the raw link stream is poten-
tially destroyed during the transformation processes, making it pertinent to raise the
question: what dynamical and structural information of the raw link stream carries
into the new object? Addressing this question is the fueling force of this work.
This work shows that, in order to address the aforementioned question, it is nec-
essary that operators from signal processing and graph theory can be algebraically
interchanged, which can be achieved by restricting to the subset of linear signal and
graph operators. Based on this insight, we introduce a new linear framework for the
analysis of link streams. In this framework, we represent a link stream by a simple

matrix and its processing amounts to multiply the link stream by other matrices that
encode for linear graph and signal operators. Notably, the linearity of most signal-
processing methods implies that they can be readily incorporated into our framework
as a means to process the time properties of link streams. However, the processing
of the relational properties poses a challenge as the availability of linear graph oper-
ators is scarce, raising the question of how to design meaningful graph methods that
satisfy linearity.
To address the aforementioned challenge, we interpret graphs as functions and
then adapt signal processing methods to process such functions. In particular, we
leverage this methodology to (i) develop a new basis for graphs that allow us to
represent them in a structural domain; and (ii) develop filters for graphs that allow us
to suppress structural information from them in a controlled manner. These results
can therefore be seen as graph analogs of the Fourier/wavelet transform and of fre-
quency filters. By combining such developments with their time domain counterpart
via our framework, we are able to (i) develop a new basis for link streams that allow
us to represent them in a frequency-structure domain; and (ii) address our motivat-
ing question by showing that transformations to link streams, like aggregation, can
be seen as simple filters in our frequency-structure domain. Indeed, we show that
our results may have other interesting applications, like the extraction of patterns
with specific structure and frequency signatures and the quantification of the data
regularity, which paves the way to do machine learning directly on link streams.

22.2 Definitions and Problem Statement

22.2.1 Definitions

We formally define a link stream as a quadruplet $L = (T, V, D, L)$, where $T$ is a set of times, $V$ is a set of vertices, $D \subseteq T \times V \times V$ is the set of link stream triplets, and $L : T \times V \times V \to \mathbb{R}$ is a function associating a weight to triplets, so that $L(t, u, v) = 0$ if $(t, u, v) \notin D$ and, for the case of unweighted link streams, $L(t, u, v) = 1$ if $(t, u, v) \in D$. We restrict to discrete-time link streams, thus we set $T = \mathbb{Z}$. The infinite size of $T$ is assumed for theoretical simplicity; when processing link streams numerically, $T$ is restricted to a bounded interval.
We denote the space of all possible relations between the vertices of the link stream by $E = V \times V$. Relations are considered directed, hence $(u, v) \neq (v, u)$. We also assume the members of $E$ to be indexed so that $e_k \in E$ refers to its $k$-th element. The index can be assigned arbitrarily; Sect. 22.4.2 covers techniques that re-index the elements in useful manners. We denote $|E| = M$ and assume, without loss of generality, that $|E|$ and $|V|$ are powers of two. Subsets $E' \subseteq E$ are systematically referred
to as structures or motifs since we assume they have some structural significance:
they are made of spatially close relations that may form cliques, communities, stars,
etc.

The time series associated to $e_k$ is denoted as $e_k(t)$, where $e_k(t) = L(t, e_k)$. The graph associated to time $t$ is denoted as $G_t = (V, E_t, f_{G_t})$, where $E_t = \{e_k \in E : (t, e_k) \in D\}$ refers to its set of edges and $f_{G_t} : E \to \mathbb{R}$ refers to a weight function defined as $f_{G_t}(e_k) = L(t, e_k)$. We recall that $E$ refers to the set of all possible relations, while $E_t$ refers to only those that exist in $G_t$. Thus, we have that $E_t \subseteq E$. We consider unweighted graphs as graphs with unit weights, thus we also include the weight function when referring to unweighted graphs.
Given two unweighted graphs $G_1 = (V_1, E_1, f_{G_1})$ and $G_2 = (V_2, E_2, f_{G_2})$, the distance of $G_1$ with respect to $G_2$ is given as $dist(G_1, G_2) = |E_1| - |E_1 \cap E_2|$, which counts the number of edges in $G_1$ that are not in $G_2$. The edit distance between $G_1$ and $G_2$ is given as $edit(G_1, G_2) = dist(G_1, G_2) + dist(G_2, G_1)$, which counts the number of different edges between $G_1$ and $G_2$.
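These two quantities are straightforward to compute when graphs are stored as sets of directed edges. The small Python example below (with made-up edge sets) illustrates them:

```python
def dist(E1, E2):
    """Number of edges of G1 that are not edges of G2 (edge sets of directed pairs)."""
    return len(E1 - E2)

def edit(E1, E2):
    """Edit distance between two unweighted graphs on the same relation space."""
    return dist(E1, E2) + dist(E2, E1)

E1 = {(0, 1), (1, 0), (1, 2)}      # made-up example edge sets
E2 = {(0, 1), (2, 1)}
print(dist(E1, E2), edit(E1, E2))  # 2 3
```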

22.2.2 Problem Statement

As mentioned in the introduction, a link stream is frequently transformed into another link stream, a time series, or a graph. In this subsection, we give two motivating
examples where such transformations are employed. The goal of the examples is to
highlight that, in order to understand how the dynamics and structure of a link stream
carry into the new object, signal and graph operators must be represented in the latent
space of each other, which may be hard due to their different nature. Then, we show
that one solution to this problem is by restricting to linear graph and signal operators.
For our discussion, we study the link stream through its weight function, which we
recall is a two-variable function of time and relations, i.e., L(t, e). Moreover, we
denote signal and graph operators by s and g, respectively. We stress that a signal
operator, when applied to L(t, e), acts on all t with e fixed and the graph operator
does the converse.
Motivating example (aggregation). A very common example where a signal
operator is used to transform a link stream is the aggregation of interactions, which
is a process that can drastically change the properties of a link stream. To illustrate
this, consider the example of Fig. 22.1, where we show a link stream consisting of
alternating triangle and claw graphs (i.e., they oscillate at frequency 1/2). Then,
this link stream is aggregated by means of a 2-sample window, resulting in a new
link stream that consists of a constant clique (i.e., it has zero frequency). Hence,
the aggregation process makes the oscillating claw and triangle disappear in order
to make emerge a new structure and a frequency that were not initially in the link
stream, highlighting the importance of understanding how the properties of a link
stream change when it is aggregated.
Mathematically, the aggregation process can be modeled as an operator s that
adds interactions in a sliding window as

Fig. 22.1 Illustration of the aggregation of interactions in link streams. A four-vertex raw link
stream consisting of out-of-phase claw and triangle structures is aggregated via a 2-sample window
that makes emerge a constant clique


$$\hat{L}(t, e) = s(L(t, e)) = \sum_{k=0}^{K-1} L(t - k, e). \qquad (22.1)$$

where L̂ refers to the aggregated link stream. To faithfully model the process of
Fig. 22.1, a sub-sampler should also be applied to L̂ in order to only retain aggre-
gates from non-overlapping windows. Sub-sampling L̂ amounts to simply dilating its
frequency spectrum. Thus, we omit it for simplicity and focus on the real challenge
which is to relate the properties of L with those of L̂.
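In matrix terms (anticipating the representation of Sect. 22.3), the aggregation of Eq. (22.1) followed by this sub-sampling step amounts to summing blocks of rows. A minimal NumPy sketch, assuming the link stream is stored as a T × M array, could be:

```python
import numpy as np

def aggregate(L, K):
    """Sum interactions over non-overlapping windows of K time samples.

    L : array of shape (T, M) with L[t, k] = weight of relation e_k at time t.
    Returns an array of shape (T // K, M); an incomplete final window is dropped."""
    T, M = L.shape
    T_full = (T // K) * K
    return L[:T_full].reshape(T // K, K, M).sum(axis=1)

# Shape check on random toy data (6 time steps, 16 possible relations, window of 2).
L = (np.random.rand(6, 16) < 0.3).astype(float)
print(aggregate(L, 2).shape)   # (3, 16)
```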
We begin by relating the dynamical information of L and L̂, which can be done
by relating their frequency content. Since s is a linear time-invariant operator, the
convolution theorem indicates that the frequency content of L̂ is that of L multiplied
by the frequency response of the operator s. Thus, since s acts on the time axis, its
impact on the frequency information is straightforward to assess with standard signal
processing results.
A more difficult task is, however, to relate how the structural information of L̂
relates to that of L. To show this, let us assume that we dispose of an operator g that
measures some structural property of the graphs from the sequence. For example, g
may be an operator that counts their number of triangles. In this case, if we aim to
know how g( L̂(t, e)) relates to g(L(t, e)), then it is necessary to know the equivalent
of the aggregation operator s in the latent space of the g operator. This is, we must
find an alternative representation of s, denoted s′, that transforms the number of triangles in the graphs of L into the number of triangles in the graphs of L̂, so that g(L̂(t, e)) = s′(g(L(t, e))). Due to their different nature, finding the equivalent
representation of s in the latent space of g is a difficult task.
One possible solution to the above problem consists in looking for graph operators
g that commute with s, as g(L̂(t, e)) = g(s(L(t, e))). Notably, this can be achieved by restricting to linear graph operators of the form $g(L(t, e))(t, i) = \sum_e q(i, e) L(t, e)$,
where i is a dummy index that may refer to a coordinate (if g transforms a graph into

a feature vector) or to an edge (if g transforms the graph into another graph). In this
case, the linearity of g and s nicely combine to obtain

$$g(\hat{L}(t, e)) = \sum_e q(i, e)\,\hat{L}(t, e) \qquad (22.2)$$
$$= \sum_e q(i, e) \sum_{k=0}^{K-1} L(t - k, e) \qquad (22.3)$$
$$= \sum_{k=0}^{K-1} \sum_e q(i, e)\,L(t - k, e) \qquad (22.4)$$
$$= \sum_{k=0}^{K-1} g(L(t - k, e)), \qquad (22.5)$$

which expresses the structural information of L̂ as a combination of the structural information of L.
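The commutation expressed by Eqs. (22.2)-(22.5) can be checked numerically on toy data; in the sketch below, q is a random matrix standing in for an arbitrary linear graph operator, and the boundary of the sliding window is simply truncated:

```python
import numpy as np

rng = np.random.default_rng(0)
T, M, K = 12, 16, 3
L = rng.integers(0, 2, size=(T, M)).astype(float)   # toy link-stream matrix
q = rng.random((M, 5))                               # arbitrary linear graph operator

# g applied to the aggregated stream ...
L_hat = np.stack([L[max(t - K + 1, 0): t + 1].sum(axis=0) for t in range(T)])
lhs = L_hat @ q
# ... equals the aggregation of the per-time features g(L(t, .)).
G = L @ q
rhs = np.stack([G[max(t - K + 1, 0): t + 1].sum(axis=0) for t in range(T)])
print(np.allclose(lhs, rhs))   # True
```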
Motivating example (embedding). A popular approach to transforming a link
stream into a time series is by means of graph embedding methods. Graph embedding
refers to the process of mapping a graph to a scalar or a point in a Euclidean space
so that structurally similar graphs map to close points in the embedded space. These
techniques, which are commonly based on neural networks (Yu et al. 2018; Yu et al.
2018; Pareja et al. 2020), spectral methods (Li et al. 2017; Zhu et al. 2018), or
diffusion processes (Lun et al. 2018; Hoang Nguyen et al. 2018), are hence applied
to the graph sequence in order to obtain a time series (or a group of them) that reflects
the structural evolution of the link stream. Since the dynamics of such time series
are usually studied to look for anomalies (Mahdavi et al. 2018; Zheng et al. 2019)
or events (Kumar et al. 2019; Trivedi et al. 2017) in the link stream, it is natural to
ask to what extent the dynamics of such time series reflect the dynamics of the original
link stream.
In order to address this question, let us assume that the graph embedding method
is represented by the operator g so that the embedding into the i-th time series is
represented as L̂(t, i) = g(L(t, e)). Then, our goal is to relate the frequency content
of L̂ with that of L. For this, let us assume that s refers to the Fourier transform
operator. In this case, if we aim to relate s( L̂(t, i)) with s(L(t, e)), then we run into
a similar challenge as in our previous example: it is necessary to know the equivalent
of the graph embedding in the Fourier transform domain, which is unclear.
Nonetheless, as in our previous example, we stress that if the embedding method was
linear, then its linearity combines well with the linearity of the Fourier transform to
obtain



$$s(\hat{L}(t)) = \sum_{t=-\infty}^{\infty} \hat{L}(t)\,e^{-j\omega t} \qquad (22.6)$$
$$= \sum_{t=-\infty}^{\infty} \Big( \sum_e q(i, e)\,L(t, e) \Big) e^{-j\omega t} \qquad (22.7)$$
$$= \sum_e q(i, e) \sum_{t=-\infty}^{\infty} L(t, e)\,e^{-j\omega t} \qquad (22.8)$$
$$= \sum_e q(i, e)\,s(L(t, e)) \qquad (22.9)$$

which expresses the frequency content of L̂ as a combination of the frequency content of L.
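Again, this identity is easy to verify numerically: the spectrum of a linearly embedded time series equals the same linear combination of the per-relation spectra. In the sketch below, q is an arbitrary random embedding vector:

```python
import numpy as np

rng = np.random.default_rng(1)
T, M = 32, 16
L = rng.random((T, M))          # toy link-stream matrix
q = rng.random(M)               # arbitrary linear embedding into one time series

lhs = np.fft.fft(L @ q)                 # spectrum of the embedded series
rhs = np.fft.fft(L, axis=0) @ q         # same combination of per-relation spectra
print(np.allclose(lhs, rhs))            # True
```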
From our two motivating examples, it can be seen that, as long as we stick to linear
signal and graph operators, it is possible to characterize the impact of transforma-
tions on a link stream. This observation clearly opens the question of what signal and
graph operators satisfying linearity allow a meaningful characterization of transfor-
mations like aggregation or embedding. The remainder of this chapter investigates this
problem. For this, we first introduce a linear matrix framework for link stream anal-
ysis (Sect. 22.3). Then, we develop linear operators to analyze graphs (Sect. 22.4).
Lastly, we combine the new linear graph methods with signal processing ones to
soundly represent transformations as simple filters in a frequency-structure domain
(Sect. 22.5).

22.3 A Linear Framework for Link Stream Analysis

In this section, we formalize our insights from Sect. 22.2 by proposing a linear matrix
framework for the processing of link streams. This matrix framework offers two main
advantages: (i) it allows to unify the signal and graph points of view described above;
and (ii) it allows to easily study the joint impact that operators (or combinations of
them) have on the time and relational properties of link streams. To do this, we
simply represent link streams and operators as matrices that get multiplied. Namely,
we represent a link stream $L$ by the matrix $\mathbf{L} \in \mathbb{R}^{|T| \times |E|}$ given as

$$\mathbf{L}_{t,k} = L(t, e_k). \qquad (22.10)$$

This matrix representation unifies the signal and graph interpretations, as the k-th
column of L corresponds to ek (t) and the t-th row corresponds to the graph f G t (e).
This, therefore, implies that the application of any linear signal processing operator
can be simply modelled as a matrix H that multiplies L from the left as


$$\tilde{\mathbf{L}} = \mathbf{H}\mathbf{L}, \qquad (22.11)$$

where the notation $\tilde{\mathbf{L}}$ represents a processed link stream. Similarly, the rows of $\mathbf{L}$ coding for the graph sequence imply that the application of any linear graph operator can be modeled as a matrix $\mathbf{Q}$ multiplying $\mathbf{L}$ from the right as

$$\tilde{\mathbf{L}} = \mathbf{L}\mathbf{Q}. \qquad (22.12)$$

From Eqs. (22.11) and (22.12) it can be clearly seen that (i) any chain of operators $\mathbf{H}_1 \ldots \mathbf{H}_n$ or $\mathbf{Q}_1 \ldots \mathbf{Q}_n$ can be reduced to an equivalent one that consists of their product; and (ii) signal and graph operators can be straightforwardly combined as

$$\tilde{\mathbf{L}} = \mathbf{H}\mathbf{L}\mathbf{Q}. \qquad (22.13)$$

Equation (22.13) constitutes our proposed linear matrix framework for link stream
analysis. Notice that it allows us to formalize our research question as the one of
finding matrices H and Q which characterize or model interesting transformations
to link streams.
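As a toy illustration of Eq. (22.13), the following sketch builds a small link-stream matrix and applies an arbitrary linear time operator H (a short moving average) and an arbitrary linear graph operator Q (a projection onto groups of relations); both matrices are invented for the example and carry no special meaning:

```python
import numpy as np

rng = np.random.default_rng(2)
T, M = 8, 16
L = rng.random((T, M))                            # link-stream matrix (Eq. 22.10)

H = (np.eye(T) + np.eye(T, k=-1)) / 2.0           # linear time operator: 2-sample average
Q = np.kron(np.eye(4), np.ones((4, 1)))           # linear graph operator: sum 4 groups of relations

L_tilde = H @ L @ Q                               # Eq. (22.13)
print(L_tilde.shape)                              # (8, 4)
```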
We stress that the field of signal processing has established a large number of
powerful linear concepts to analyze signals, such as Fourier and wavelet transforms,
convolutions, sampling theorems, correlations, or filters. Thus, due to their linearity,
these methods can readily take the role of the matrix H and be used to analyze the time
properties of link streams. Yet, to analyze the relational properties, the availability
of linear graph methods is scarce. Indeed, while various useful graph concepts like
degree measures, edge counts, or cut metrics are linear, it is often necessary to
rely on more advanced techniques based on motif/community searches, spectral
decompositions, or graph embeddings. However, these more advanced approaches
are adaptive and non-linear, meaning that they cannot be modeled as a single matrix
Q that can be plugged into our framework. Section 22.4 aims to address this issue by
developing new linear methods for graphs. Section 22.5 then revisits our framework
by incorporating such developments.

22.4 Linear Methods for Graphs

The goal of this section is to develop methods that can take the role of the matrix
Q in (22.13). We exploit the fact that signal processing offers a powerful set of
linear methods to process functions. Thus, we approach the problem by interpreting
graphs as functions and by adapting signal processing techniques to process such
functions. From this perspective, the only difference between a time series and a
graph is that the former is a function of the form f s : R → R while the latter is of the
form f G : E → R. The main challenge lies in adapting signal processing methods
to take into account the non-ordered nature of the domain E of f G .
In particular, this section leverages the above approach to develop a structural
decomposition for f G and thoroughly explores its implications. The motivation for
developing a decomposition is to establish, for graphs, a tool that offers benefits that

are similar to those offered by the Fourier transform in signal processing. Namely, by
representing signals in a frequency domain, the Fourier transform allows seeing that
many transformations to and between signals amount to simply amplify or attenuate
some of their frequency information. Thus, we aim for an analog for graphs: to have
a representation that permits to study what transformations to and between graphs
can be seen as simple ways to emphasize or suppress some of their structures.
The section is structured as follows. Section 22.4.1 develops a multi-scale structural decomposition for graphs. Section 22.4.2 investigates two techniques to automatically construct the basic elements of the decomposition. Section 22.4.3 shows that the decomposition can be interpreted as a linear graph embedding method. Section 22.4.4 introduces structural filters for graphs.

22.4.1 A New Decomposition for Graphs

Our aim here is to develop a structural decomposition for f_G. In signal processing, a decomposition refers to the process of fixing a set of elementary signals and then
expressing an arbitrary signal as a weighted combination of them. The weighting
coefficients then allow seeing the importance of an elementary signal in the decom-
posed signal. For example, in the Fourier case, the elementary signals are complex
exponentials of varying frequencies, meaning that the Fourier transform coefficients
reflect the importance of frequencies in the decomposed signal. In our case, we aim
to track the importance of structures in f G . Thus we search for a set of structurally-
meaningful functions that allow us to express f G as a linear combination of them.
To find such functions, we begin noticing that many interesting structures in graphs
can be of different scales. For example, a structure of interest may consist of a large
group of vertices forming a community, or it may consist of a small clique confined
to the boundaries of the graph. Interestingly, if the vertices of such community or
clique are referred to by the set Vs ⊂ V , then we have that, in both cases, the function
f G is dense on the set Vs × Vs . This is, for most edges e ∈ Vs × Vs we have that
f G (e ) = 0. If we further consider that non-zero edge weights are often very similar
in magnitude (equal for unweighted graphs), then we can clearly see that the function
f G , on the set Vs × Vs , can be very well approximated by a constant function also
supported on the set Vs × Vs . Thus, building from this observation, we propose to use
constant functions supported on sets representing structures of interest as our group of
elementary functions to decompose f G . This is, we propose to do a multi-resolution
analysis of f G by piece-wise constant functions as our decomposition.
In the signal processing literature, a decomposition of signals by piece-wise con-
stant functions corresponds to the classical Haar wavelet transform. In it, a set of
piece-wise constant functions (known as scaling functions) and a set of tripolar func-
tions (known as wavelet functions) are constructed so that they form an orthonor-
mal basis that can be used to expand any signal. Hence, our aim is to adapt Haar
wavelets to the graph setting in a way that scaling and wavelet functions now acquire
structural significance. Yet, we stress that this adaptation is not straightforward. In

signals, the scaling and wavelet functions are built by shifting and dilating a prim-
itive pulse-shaped signal. However, the notions of shifting and scaling are not well
defined for functions supported on E due to its unordered nature. In the following, we
demonstrate that we can still construct a Haar multi-resolution analysis for graphs
by recursively partitioning the set E.
To begin, let us set $E_0^{(\log_2(M))} = E$ and recursively partition this set according to the following rule: $E_k^{(j+1)} = E_{2k}^{(j)} \cup E_{2k+1}^{(j)}$, with $E_{2k}^{(j)} \cap E_{2k+1}^{(j)} = \emptyset$ and $|E_{2k}^{(j)}| = |E_{2k+1}^{(j)}|$, until we obtain singletons $E_k^{(0)} = e_k$. Then, based on this partitioning, we can define a set of scaling functions as:

$$\phi_k^{(j)}(e) = \begin{cases} \sqrt{2^{-j}} & e \in E_k^{(j)} \\ 0 & \text{otherwise} \end{cases} \qquad (22.14)$$

and a set of wavelet functions as:

$$\theta_k^{(j)}(e) = \begin{cases} \sqrt{2^{-j}} & e \in E_{2k}^{(j-1)} \\ -\sqrt{2^{-j}} & e \in E_{2k+1}^{(j-1)} \\ 0 & \text{otherwise} \end{cases} \qquad (22.15)$$

It is easy to verify that our scaling and wavelet functions satisfy the following properties: (i) scaling and wavelet functions are pair-wise orthonormal; (ii) wavelet functions are pair-wise orthonormal; (iii) scaling functions, for a fixed $j$, are pair-wise orthonormal; and (iv) for level $j = \log_2(M) - \ell$, there are $2^\ell$ associated scaling functions and $2^\ell$ associated wavelet functions. These properties imply that, for a fixed $j$, the collection $\{\phi_k^{(j)}\} \cup \{\theta_k^{(\ell)}\}$, for all $k$ and $\ell \le j$, forms a set of $M$ orthonormal functions that constitute a basis for functions supported on $E$ (i.e., graphs).
This basis, which constitutes an adaptation of the Haar multi-resolution analysis to
graphs, allows us to decompose a graph in terms of the scaling and wavelet functions
at different levels of resolution as indicated by ( j). This is done as follows:
$$f_G(e) = \sum_k s_k^{(j)} \phi_k^{(j)}(e) + \sum_{\ell \le j} \sum_k w_k^{(\ell)} \theta_k^{(\ell)}(e), \qquad (22.16)$$

where

$$s_k^{(j)} = \langle f_G, \phi_k^{(j)} \rangle \qquad (22.17)$$

refers to a scaling coefficient and

$$w_k^{(j)} = \langle f_G, \theta_k^{(j)} \rangle \qquad (22.18)$$

refers to a wavelet coefficient. Equation (22.16) shows that a graph f G can be split
into two main parts. On the one hand, the first term on the right-hand side consti-
tutes a coarse grain approximation of f G at resolution level ( j). To see this, recall

Fig. 22.2 Illustration of the proposed graph decomposition on graphs with four vertices. The left
panel displays the recursive partition of the relation space. The right panel shows the application of
the decomposition on the clique, claw and triangle graphs. Horizontally, the panel illustrates (i) how
such graphs can be decomposed into coarse-grain and detailed parts; and (ii) their decomposition
coefficients. Vertically, the panel shows how the aggregation process can also be studied in the
decomposition domain. For the figure, edges are shown as undirected, albeit the formalism considers
such edges as two directed ones pointing in opposite directions. Colored nodes represent self-loops
and colors encode edge-weights

that scaling functions represent structures of interest at some resolution level, like
the aforementioned community or clique. Thus, such first term constitutes the best
possible approximation of f G given by such structures. The second term on the
right-hand side contains, consequently, all the necessary details to recover f G from
its coarse-grain approximation. Indeed, by simple algebra it can be shown that the
wavelet coefficients at level $(\ell)$ contain the information to recover the coarse-grain approximation of $f_G$ at level $(\ell - 1)$ from the one at level $(\ell)$.
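To make the construction concrete, the sketch below builds the scaling and wavelet functions of Eqs. (22.14)-(22.15) for a small edge space and computes the coefficients of Eqs. (22.17)-(22.18) by inner products. It assumes the relations have already been indexed so that every set $E_k^{(j)}$ is a contiguous block of indices, which is the role of the re-indexing techniques mentioned in Sect. 22.4.2; the toy graph f_G is invented for illustration.

```python
import numpy as np

def haar_basis(M):
    """Scaling and wavelet functions of Eqs. (22.14)-(22.15) for an edge space of size M
    (a power of two), assuming each E_k^{(j)} is a contiguous block of indices."""
    levels = int(np.log2(M))
    phi, theta = {}, {}
    for j in range(levels + 1):
        size = 2 ** j                                    # |E_k^{(j)}|
        for k in range(M // size):
            start = k * size
            f = np.zeros(M)
            f[start:start + size] = np.sqrt(2.0 ** (-j))
            phi[(j, k)] = f
            if j > 0:                                    # wavelet over the two halves
                w = np.zeros(M)
                w[start:start + size // 2] = np.sqrt(2.0 ** (-j))
                w[start + size // 2:start + size] = -np.sqrt(2.0 ** (-j))
                theta[(j, k)] = w
    return phi, theta

# Coefficients of a toy graph f_G over M = 16 relations (Eqs. 22.17-22.18).
M = 16
f_G = np.zeros(M)
f_G[:6] = 1.0                      # some structure concentrated on the first relations
phi, theta = haar_basis(M)
s = {key: float(f_G @ v) for key, v in phi.items()}
w = {key: float(f_G @ v) for key, v in theta.items()}
print(s[(4, 0)], w[(4, 0)])        # coarsest scaling and wavelet coefficients
```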
Illustrating example (claw, triangle, clique). To better illustrate these concepts,
let us give a practical example of our decomposition in Fig. 22.2. For our example,
we choose to work with the space of graphs defined on four vertices. This allows us
to revisit our claw, triangle, and clique graphs discussed in Sect. 22.2. Since we aim
to work with graphs of four vertices, the first step in our decomposition consists in
partitioning the edge space associated with such graphs (which contains 16 directed
edges, including self-loops). This partitioning procedure is depicted in the left panel
of Fig. 22.2. We begin with a single structure $E_0^{(4)}$ which contains all the relation-space. This structure can thus be considered a clique with self-loops (as illustrated by the colored vertices). Then, we generate the structures at the next resolution level, that is, $E_0^{(3)}$ and $E_1^{(3)}$. To do this, we split $E_0^{(4)}$ into two equal-sized parts, which, for convenience, we pick as a claw and triangle with self-loops. These structures are then recursively partitioned until they cannot be further divided.
Based on the structures found above, we can then define our scaling and basis
functions according to (22.14) and (22.15). Since they form a basis, we can use
them to decompose any graph living in the space of graphs of four vertices, like
the claw, triangle, and clique graphs shown in the right panel. For our example,

we select to decompose them at resolution level $j = 4$, which is the coarsest one. This means that we will find the best approximations of such graphs by means of a clique with self-loops of constant weight (first term of (22.16)), where the edge weights are determined by the coefficient $s_0^{(4)}$. Naturally, the clique admits a very good approximation, as its only difference with $E_0^{(4)}$ lies in the self-loops, which is why the edge-weights of the approximation graph are large and the detail graph concentrates its large-magnitude edges on the self-loops. Since the claw and triangle are structurally farther away, their approximation graphs have smaller weights, and a substantial amount of structural detail must be added to them to recover the initial graphs, which is why their detail graphs have large weights on most edges.
Even though the claw and triangle approximations are not as accurate, their decomposition at this resolution level is still useful to better understand what occurs when such graphs are aggregated. Namely, due to the linearity of our decomposition, the decomposition of the sum of two graphs equals the sum of their individual decompositions. This implies that the decomposition of the clique is equal to the sum of the claw and triangle decompositions. Thus, we can see that, by adding them, most of the details cancel out while the approximation graphs accumulate, showing that a large-scale structure emerges. This can be seen more easily by looking at the decomposition coefficients, which are also shown in the right panel. The scaling coefficients of the claw and triangle add up to produce a new, larger coefficient, while the largest-scale wavelet coefficients, $w_0^{(4)}$, cancel out, indicating that the details of this new graph no longer lie at a large scale but at finer scales (the self-loops in this case). Indeed, this cancellation effect between decomposition coefficients can be exploited to study the scales at which two graphs differ. Our next example illustrates this.
Illustrating example (comparing graphs). One common way to assess the difference between two unweighted graphs is their edit distance, which counts the number of edges on which they differ. However, the edit distance is not entirely satisfactory in all situations, as two graphs may be considered structurally equal and yet have a very large edit distance. This is, for instance, the case of two realizations of a stochastic block model with equal parameters. In such a case, the two realizations have very few common edges with high probability, meaning that their edit distance is large, although the two graphs are considered equal at the community level.
Interestingly, our decomposition coefficients can help us better spot the scales at which two graphs are equal or not, while still keeping the interpretation in terms of
edit distance. To show this, let us denote by $(s_{G_1}^{(j)})_k$ and $(w_{G_1}^{(\ell)})_k$ the decomposition coefficients of an unweighted graph $G_1$, and by $(s_{G_2}^{(j)})_k$ and $(w_{G_2}^{(\ell)})_k$ the coefficients of another unweighted graph $G_2$ (defined on the same relation-space). Then, our decomposition coefficients satisfy the following property:

$$\sum_{k} \left( (s_{G_1}^{(j)})_k - (s_{G_2}^{(j)})_k \right)^2 + \sum_{\ell \leq j} \sum_{k} \left( (w_{G_1}^{(\ell)})_k - (w_{G_2}^{(\ell)})_k \right)^2 = \mathrm{edit}(G_1, G_2) \qquad (22.19)$$

Fig. 22.3 Distribution of the edit distance across decomposition coefficients for two realizations of the stochastic block model. Left-most coefficients refer to coarse-grain structures, and right-most coefficients refer to detailed structures. Each coefficient captures a fraction of the edit distance. Coarse-grain structures contribute little to the edit distance, confirming the intuition that both realizations are similar at the community scale even though their distance is large

The proof of this property is given in Sect. 22.4.3. This equation indicates that our decomposition coefficients allow studying how the edit distance of two graphs distributes across resolution scales. Namely, since all the terms in the sums are non-negative, each term contributes a fraction of the edit distance. This allows us to track the terms that contribute the most to the edit distance. Naturally, if we observe that the edit distance is mostly determined by the coefficients associated with a specific resolution level, then we can conclude that the differences between the graphs arise at that resolution level, which also implies that the graphs are similar at all other resolution levels.
In Fig. 22.3, we show how these insights can be used to spot that two realizations of a stochastic block model are still equal at the community scale despite their large edit distance. For the figure, we generate two realizations of a block model with two blocks of 16 vertices each, with within-class probability 0.5 and between-class probability 0.01. Then, we partition the relation-space so that, at resolution level $j = 8$, the sets $E_k^{(8)}$ coincide with the community structure. We then apply our decomposition to such graphs based on this partitioning and display, in Fig. 22.3, the squared differences of the coefficients. As can be seen, the coefficients associated with large-scale structures contribute little to the edit distance, contrary to the detailed structures. This is consistent with our intuition that both graphs are equal at the community scale and that one needs to go to a finer scale in order to differentiate one from the other. In sum, the previous examples illustrate that our decomposition provides a useful tool to analyze the structural information of graphs, provided that we appropriately choose our basis elements. In the next subsection, we investigate this crucial subject.
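To make property (22.19) concrete, the following minimal sketch reproduces the spirit of this experiment in Python. It is not the authors' code: the block sizes, probabilities, and the hand-rolled orthonormal Haar analysis are our own illustrative choices, and we assume relations are ordered so that each block pair occupies a contiguous segment of the edge vector.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 32                                    # 2 blocks of 16 vertices
blocks = [np.arange(16), np.arange(16, 32)]
P = {(0, 0): 0.5, (1, 1): 0.5, (0, 1): 0.01, (1, 0): 0.01}

def sample_sbm():
    """One SBM realization, flattened so that each block pair is a contiguous segment."""
    A = np.zeros((n, n))
    for (a, b), p in P.items():
        A[np.ix_(blocks[a], blocks[b])] = rng.random((16, 16)) < p
    return np.concatenate([A[np.ix_(blocks[a], blocks[b])].ravel()
                           for a in (0, 1) for b in (0, 1)])

def haar(f, level):
    """Orthonormal Haar analysis: returns (scaling coefficients, [details per level])."""
    a, details = f.astype(float), []
    for _ in range(level):
        a, d = (a[0::2] + a[1::2]) / np.sqrt(2), (a[0::2] - a[1::2]) / np.sqrt(2)
        details.append(d)
    return a, details

f1, f2 = sample_sbm(), sample_sbm()
(s1, w1), (s2, w2) = haar(f1, 8), haar(f2, 8)

edit = np.sum((f1 - f2) ** 2)                               # edit distance of 0/1 graphs
coarse = np.sum((s1 - s2) ** 2)                             # scaling-coefficient share
detail = sum(np.sum((a - b) ** 2) for a, b in zip(w1, w2))  # wavelet-coefficient share
print(edit, coarse + detail, coarse / edit)                 # equality (22.19); small coarse share
```

As in Fig. 22.3, the coarse (community-level) coefficients account for only a small fraction of the total edit distance, even though the edit distance itself is large.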

22.4.2 Partitioning of the Relation-Space

While the derivations of the previous subsection show that a multi-resolution analysis of graphs is possible, the crucial step of how to partition the relation-space was not addressed. This subsection investigates this point, for which we develop two techniques tailored to two common scenarios: (i) graphs with community-like structures; and (ii) activity graphs arising from fixed infrastructures. Independently of the scenario, we recall that our decomposition works best when the functions supported on the sets $E_k^{(j)}$ are constant. Thus, our techniques essentially aim to find sets $E_k^{(j)}$ of spatially close elements that mostly consist of either active or inactive edges in the graph under analysis.
SVD-based partitioning. When the graph to study has a community structure, a natural and adaptive approach to find the sets $E_k^{(j)}$ consists in looking for rank-1 patterns in the adjacency matrix. The rationale is the following: since rank-1 patterns correspond to regions of the matrix that have similar states, we can use such regions as our sets $E_k^{(j)}$. Matrix factorization is the standard technique to find rank-1 patterns in a matrix. Therefore, we propose a methodology based on the second largest singular vector of the adjacency matrix to construct the sets $E_k^{(j)}$. We stress that even though matrix factorization is a non-linear procedure, it is only used here to fix the basis with which the entire graph sequence is analyzed; it therefore does not impact our link stream framework.
Our partitioning algorithm works as follows. We first assign the set $\alpha_1^{(\log_2 N)} = V$. Then, we recursively partition this set according to the following rule. For the set $\alpha_k^{(j)}$, we take the sub-matrix (of the adjacency matrix) formed by the rows indexed by $\alpha_k^{(j)}$ (we keep all the columns). Then, we extract the second largest left singular vector of this sub-matrix and sort it in descending order. The elements of $\alpha_k^{(j)}$ associated with the top half of the entries of the singular vector form the set $\alpha_{2k-1}^{(j-1)}$ and the remainder form the set $\alpha_{2k}^{(j-1)}$. We iterate this procedure until the sets $\alpha_k^{(0)}$ are singletons. For example, if $\alpha_1^{(2)} = \{v_2, v_8, v_{12}, v_{20}\}$, then our procedure selects the rows associated with $v_2, v_8, v_{12}, v_{20}$ and forms a flat sub-matrix of size $4 \times |V|$. Assuming that the two largest entries of its second left singular vector are those associated with $v_2$ and $v_{20}$, the procedure sets $\alpha_1^{(1)} = \{v_2, v_{20}\}$ and $\alpha_2^{(1)} = \{v_8, v_{12}\}$. Based on the retrieved sets, we construct the motifs $E_k^{(j)}$ according to the tree diagram shown in Fig. 22.4 and illustrated by the code sketch below. Essentially, a node contains a set $E_k^{(j)}$ and its child nodes contain its division at the next resolution level. For odd levels, the division is done by separating edges according to their origin vertex, while for even levels, the separation is done according to the destination vertex. For both cases, the sets $\alpha_k^{(j)}$ rule the splitting.
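The following is a minimal sketch of this recursive bisection in Python. It is not the chapter's reference implementation: we assume the number of vertices is a power of two, and we simply return the hierarchy-preserving leaf order of the vertices (the nested sets $\alpha_k^{(j)}$ can be read off as contiguous slices of that order).

```python
import numpy as np

def svd_leaf_order(A, rows=None):
    """Recursively bisect the vertex set: sort `rows` by the 2nd largest left singular
    vector of A[rows, :] (all columns kept), split into halves, and recurse until
    singletons. Returns the leaf order, i.e. a hierarchy-preserving permutation."""
    if rows is None:
        rows = np.arange(A.shape[0])
    if len(rows) == 1:
        return list(rows)
    U, _, _ = np.linalg.svd(A[rows, :], full_matrices=False)
    ordered = rows[np.argsort(-U[:, 1])]          # descending 2nd left singular vector
    half = len(rows) // 2
    return svd_leaf_order(A, ordered[:half]) + svd_leaf_order(A, ordered[half:])

# Example: the leaf order tends to group vertices of the same community together.
rng = np.random.default_rng(0)
A = (rng.random((8, 8)) < 0.05).astype(float)
A[:4, :4] = A[4:, 4:] = 1.0                       # two planted communities
print(svd_leaf_order(A))
```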
Interestingly, it is not necessary to build the tree and basis functions in practice in order to compute the coefficients of our decomposition. It suffices to know the sets $\alpha_k^{(0)}$ for all $k$. Namely, notice that the set $E_i^{(0)}$, which contains a single relation, implies a mapping of that relation to the integer $i$. We thus have that the leaf nodes of the tree define a hierarchy-preserving mapping of the relation-space to the interval $[1, M]$. That is, all the elements of the root node $E_0^{(j_{\max})}$ get mapped to the entire interval

Fig. 22.4 SVD-based procedure to partition the relation-space. The root node contains the entire
relation-space which is recursively partitioned based on the splits found by the SVD procedure

$[1, M]$, while those of the left child $E_0^{(j_{\max}-1)}$ get mapped to $[1, M/2]$ and those of the right child to $[M/2 + 1, M]$, and so on. The implication of this result is that the scaling and wavelet functions defined over the sets $E_k^{(j)}$, when mapped based on this rule, coincide with the classical Haar scaling and wavelet functions for time series defined on the interval $[1, M]$. Thus, we can transform our graph problem into a time series problem, which carries one main advantage: there exist fast implementations of the Haar wavelet transform for time series that avoid computing the basis elements and use simple filter banks instead, achieving time and space complexity of $O(M)$.
Notably, we can also avoid the construction of the tree, as it is possible to find an analytic function that returns the position that a given relation has among the leaf nodes of the tree. To see this, notice that our SVD-based partitioning procedure implies a mapping of each vertex $u \in \alpha_k^{(0)}$ to a new unique index $k$. Then, a relation $(u, v)$, relabelled as $(x, y)$, has a position among the leaf nodes determined by the following recursive function

$$z(x, y) = \begin{cases} p^2 + z(x, y - p) & x \leq p \text{ and } y > p \\ 2p^2 + z(x - p, y) & x > p \text{ and } y \leq p \\ 3p^2 + z(x - p, y - p) & x > p \text{ and } y > p \end{cases} \qquad (22.20)$$

where $p$ is the largest power of two smaller than $\max(x, y)$. Based on this function, we can directly recover the index $i$ of the set $E_i^{(0)}$ that contains $(u, v)$, implying that we can directly compute our decomposition coefficients by feeding the edges of the graph to this function and efficiently analyzing the resulting time series with a classical Haar filter bank.
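A short sketch of this mapping is given below. The base case $z(1,1) = 1$ and the helper computing $p$ are our own reading of the recursion; they are assumptions, not taken from the chapter.

```python
def prev_pow2(m):
    """Largest power of two strictly smaller than m (assumes m >= 2)."""
    p = 1
    while 2 * p < m:
        p *= 2
    return p

def z(x, y):
    """Leaf position of the relabelled relation (x, y), following Eq. (22.20).
    Indices are 1-based; z(1, 1) = 1 is our assumed base case."""
    if x == 1 and y == 1:
        return 1
    p = prev_pow2(max(x, y))
    if x <= p and y > p:
        return p * p + z(x, y - p)
    if x > p and y <= p:
        return 2 * p * p + z(x - p, y)
    return 3 * p * p + z(x - p, y - p)    # x > p and y > p

# The mapping is a bijection from the N x N relation-space onto [1, N^2]:
N = 4
print(sorted(z(x, y) for x in range(1, N + 1) for y in range(1, N + 1)))
```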
BFS-based partitioning. We now propose another adaptive procedure to partition the relation-space, tailored to situations in which the graph models activity on a fixed infrastructure. For instance, in a road network, the graph can model whether a road is being used, or in a wired computer network, whether a computer is communicating with another. In such situations, due to the fixed nature of the underlying infrastructure, some edges can never appear in the activity graphs because there is no physical connection between the corresponding vertices. Thus, excluding them from the relation-space, and hence from the motifs $E_k^{(j)}$, can help to construct more meaningful basis elements. We therefore focus here on the study of functions reduced to a domain $E_{\mathrm{active}} \subset E$, where $E_{\mathrm{active}}$ represents the set of relations where activity may occur. Our goal remains to meaningfully partition $E_{\mathrm{active}}$ to build the sets $E_k^{(j)}$. For our discussion, we assume that $|E_{\mathrm{active}}| = M'$ is still a power of two.
Since structures of interest in these graphs have a strong notion of spatial locality, we address the partitioning of $E_{\mathrm{active}}$ by means of the breadth-first search (BFS) algorithm. Our procedure works as follows. We first set $E_0^{(\log_2 M')} = E_{\mathrm{active}}$. Then, we recursively partition this set according to the following rule: for the graph with edge-set $E_k^{(j)}$, we pick a node (which can be done at random) and we run the BFS algorithm from it. Once the algorithm has explored $|E_k^{(j)}|/2$ edges, we assign the explored edges to the set $E_{2k}^{(j-1)}$ and the non-explored edges to the set $E_{2k+1}^{(j-1)}$. We continue until the sets $E_k^{(0)}$ are singletons; a sketch of one splitting step is given after the next paragraph.
Similarly to the SVD-based case, the resulting sets $E_k^{(j)}$ can be organized in a tree whose leaf nodes define a hierarchy-preserving mapping of the relation-space to the interval $[1, M']$. As explained above, this implies that we can transform our graph decomposition problem into one of decomposing a time series, which can be efficiently done via filter banks that avoid the construction of the basis elements. Thus, while our BFS-based method still requires building the tree, this remains simple to do thanks to the efficiency of the BFS algorithm.
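The following Python sketch implements one splitting step of this rule. It is an illustration under our own assumptions (directed edges stored as pairs, BFS restricted to the sub-graph spanned by the edge set), not the authors' implementation; the full partition is obtained by applying the split recursively on each half.

```python
from collections import deque

def bfs_split(edge_set, start):
    """Split `edge_set` (a set of directed edges (u, v)) into two halves:
    the first len(edge_set)//2 edges reached by a BFS from `start` in the
    graph spanned by `edge_set`, and the remaining edges."""
    adj = {}
    for u, v in edge_set:
        adj.setdefault(u, []).append(v)
    half = len(edge_set) // 2
    explored, visited, queue = [], {start}, deque([start])
    while queue and len(explored) < half:
        u = queue.popleft()
        for v in adj.get(u, ()):
            if len(explored) == half:
                break
            explored.append((u, v))          # each directed edge is explored once
            if v not in visited:
                visited.add(v)
                queue.append(v)
    explored = set(explored)
    return explored, set(edge_set) - explored

# Example on a small path-like infrastructure:
edges = {(1, 2), (2, 3), (3, 4), (4, 5)}
print(bfs_split(edges, start=1))
```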

22.4.3 Interpretation as Graph Embedding

In this subsection, we show that our decomposition coefficients can be used as a linear
graph embedding technique. This is an important property, as classical embedding
methods, which are based on neural networks or matrix factorizations, cannot be used
as the matrix Q in our framework due to their non-linearity. To see our decomposition
as an embedding method, it simply suffices to arrange the scaling coefficients into a
vector

$$s = [s_0^{(j)}, s_1^{(j)}, \ldots] \qquad (22.21)$$

and the wavelet coefficients into another one

$$w = [w_0^{(\ell)}, w_1^{(\ell)}, \ldots] \qquad (22.22)$$

to form a unique vector

$$x = [s, w] \qquad (22.23)$$

that represents the embedding of the graph into a Euclidean space of dimension $M$. While it is clear that the coefficients are individually related to the structural information of the decomposed graph, it is less clear that the geometry of points in this space also reflects the structural information of the embedded graphs. Our next result demonstrates this.

Lemma 22.1 Let $G_1(V, E_1, f_{G_1})$ and $G_2(V, E_2, f_{G_2})$ denote two unweighted graphs and $x_1$, $x_2$ their embedded vectors, respectively, under the same dictionary. Then, we have that:
1. $\|x_1\|_2^2 = |E_1|$
2. $\langle x_1, x_2 \rangle = |E_1 \cap E_2|$
3. $\|x_1 - x_2\|_2^2 = \mathrm{edit}(G_1, G_2)$.

The proof of the lemma is deferred to Appendix 22.7.1. It says that, for an unweighted graph, the size of the graph is preserved in the squared length of its embedding vector. Additionally, it says that the projection of one vector onto another equals the number of edges common to the graphs represented by such vectors. Thus, two graphs with no common edges are orthogonal in the embedding space. Lastly, the lemma says that the squared distance between points in the embedded space equals the edit distance between the graphs represented by such points. Thus, two nearby points in the embedded space necessarily correspond to graphs with a small edit distance.
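These three identities are easy to check numerically. The sketch below uses a hand-rolled orthonormal Haar dictionary as the embedding; this is our own stand-in for the dictionary of the previous subsections, chosen only to make the block self-contained.

```python
import numpy as np

def haar_embed(f):
    """Embed an edge-indicator vector by its full set of orthonormal Haar
    coefficients x = [s, w] (an orthonormal change of basis, so geometry is preserved)."""
    a, details = f.astype(float), []
    while len(a) > 1:
        a, d = (a[0::2] + a[1::2]) / np.sqrt(2), (a[0::2] - a[1::2]) / np.sqrt(2)
        details.append(d)
    return np.concatenate([a] + details[::-1])

rng = np.random.default_rng(1)
M = 16                                        # relation-space of a 4-vertex graph
f1 = (rng.random(M) < 0.4).astype(float)      # two random unweighted graphs
f2 = (rng.random(M) < 0.4).astype(float)
x1, x2 = haar_embed(f1), haar_embed(f2)

print(np.isclose(x1 @ x1, f1.sum()))                               # ||x1||^2 = |E1|
print(np.isclose(x1 @ x2, f1 @ f2))                                # <x1,x2> = |E1 n E2|
print(np.isclose(((x1 - x2) ** 2).sum(), np.abs(f1 - f2).sum()))   # edit distance
```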
Yet, our last result also implies that two graphs that are structurally similar (like two realizations of the stochastic block model) are still embedded far apart as long as their edit distance is large. This shows that our embedding method adopts a graph similarity measure that may be too strict in some situations. To amend this issue, we can exploit the fact that the scaling coefficients capture the best coarse-grain approximation of a graph. Thus, by omitting the wavelet coefficients, we can obtain a graph similarity measure that looks for similarities at the structural scale. Our next result confirms this intuition. To state it, we first formalize the notion of structurally equal graphs and then introduce a lemma that establishes the properties of an embedding restricted to the sub-space spanned by $s$.
Definition 22.1 Let $G_1(V, E_1, f_{G_1})$ and $G_2(V, E_2, f_{G_2})$ be two unweighted graphs and $E = \bigcup_k E_k^{(j)}$ be a partitioning of the relation-space at resolution level $j$. Then, $G_1$ and $G_2$ are said to be structurally equal at this resolution level if it holds that $|E_1 \cap E_k^{(j)}| = |E_2 \cap E_k^{(j)}|$ for all $k$.

Lemma 22.2 Let $C_1$ and $C_2$ denote two classes of structurally equal graphs at a resolution level $j$ and assume that $G_1(V, E_1, f_{G_1}), \tilde{G}_1(V, \tilde{E}_1, f_{\tilde{G}_1}) \in C_1$ and $G_2(V, E_2, f_{G_2}), \tilde{G}_2(V, \tilde{E}_2, f_{\tilde{G}_2}) \in C_2$ are chosen uniformly at random. If we denote the scaling vectors of $G_1$ and $G_2$ by $s_1$ and $s_2$, respectively, then they satisfy:
1. $\|s_1\|_2^2 = \mathbb{E}\left[ |E_1 \cap \tilde{E}_1| \right]$
2. $\langle s_1, s_2 \rangle = \mathbb{E}\left[ |E_1 \cap E_2| \right]$
3. $\|s_1 - s_2\|_2^2 = \mathbb{E}\left[ \mathrm{edit}(G_1, G_2) \right] - \mathbb{E}\left[ \mathrm{dist}(G_1, \tilde{G}_1) + \mathrm{dist}(G_2, \tilde{G}_2) \right]$

The proof of Lemma 22.2 is deferred to Appendix 22.7.2. It says that our new embedding no longer reflects the structure of individual graphs but rather the statistical properties of classes of graphs. For instance, previously, the squared length of a vector in the embedded space reflected the number of edges of the graph mapped to that vector. Now, it counts the expected number of edges that the graph has in common with a graph drawn uniformly from its class of structurally equal graphs. Similarly, the inner product of two vectors now counts the expected intersection between two classes of graphs rather than between two individual graphs. Notice that the last property also indicates that the distance between points in this new embedding space is zero as long as the graphs are structurally equal. This therefore allows us to embed structurally similar graphs to nearby points even though they may have a large edit distance.
Compared to classical embedding techniques, our graph decomposition, when seen as an embedding, provides several advantages: (i) it is linear; (ii) it expresses embedding vectors in analytic form; (iii) it links the geometry of the space to the structure of graphs; (iv) it can be inverted; (v) it allows analyzing graphs at different resolution levels; (vi) it characterizes the details lost when considering coarse-grain information; and (vii) it can be computed very efficiently with a simple filter bank.

22.4.4 Filters for Graphs

Our decomposition allows representing graphs in a new structural domain. This


naturally opens the door to study operators by characterizing how they change the
structural-domain representation of a graph. Here, we perform this structural-domain
study of graph operators. In particular, we focus on operators that can be represented
as simple structural filters: operators that essentially amount to amplify or attenuate
the coefficients of our decomposition. In more precise terms, we formalize the notion
of a structural filter as follows.

Definition 22.2 Let $f_G$ represent a graph that is decomposed according to Eq. (22.16). Then, the structural filtering of $f_G$ is defined as

$$\hat{f}_G(e) = \sum_{k} \sigma_k^{(j)} s_k^{(j)} \phi_k^{(j)}(e) + \sum_{\ell \leq j} \sum_{k} \nu_k^{(\ell)} w_k^{(\ell)} \theta_k^{(\ell)}(e) \qquad (22.24)$$

where $\{\sigma_k^{(j)}\}$ and $\{\nu_k^{(\ell)}\}$ are called the coefficients of the filter.

Thus, a structural filter transforms a graph $f_G$ into another graph $\hat{f}_G$ by tuning the importance given to the scaling and wavelet functions. Equation (22.24) provides the structural-domain definition of a filter; however, it leaves unclear what type of transformations, expressed in the initial domain of relations, correspond to such structural filters. To answer this question, let

$$f_G(E_k^{(j)}) := \sum_{e \in E_k^{(j)}} f_G(e). \qquad (22.25)$$

Then, after simple algebraic manipulations of (22.24), it is possible to show that transformations of $f_G$ of the following form behave like structural filters:

$$\hat{f}_G(e) = \frac{\sigma_k^{(j)}}{2^j} f_G(E_k^{(j)}) + \sum_{\ell \in \mathcal{L}_1} \frac{\nu_k^{(\ell)}}{2^\ell} \left[ f_G(E_{2k}^{(\ell-1)}) - f_G(E_{2k+1}^{(\ell-1)}) \right] + \sum_{\ell \in \mathcal{L}_2} \frac{\nu_k^{(\ell)}}{2^\ell} \left[ f_G(E_{2k+1}^{(\ell-1)}) - f_G(E_{2k}^{(\ell-1)}) \right] \qquad (22.26)$$

where $e \in E_k^{(j)}$, $\mathcal{L}_1 = \{\ell : e \in E_{2k}^{(\ell-1)}\}$ and $\mathcal{L}_2 = \{\ell : e \in E_{2k+1}^{(\ell-1)}\}$. Therefore, transformations that take averages of coarse-grain information and differences at smaller scales correspond to filters.
Structural filters open the door to interpret any graph as the result of filtering a pre-
defined template graph that contains all possible structures with equal importance.
Namely, consider a graph in which all its decomposition coefficients are equal to one.
Then, this graph, which we call template graph, contains all the structures and can
be transformed into any other graph by using filters that shape its structural-domain
representation into a desired form. This is illustrated in Fig. 22.5, where a template
graph on four vertices is filtered to generate the claw, triangle and clique graphs. In
terms of the graph embedding point-of-view of our decomposition, filters can be seen
as transformations of the embedding vector of a graph into the embedding vector of
another graph. In particular, filters can map a vector into any other vector that lives
in the same sub-space. Thus, they model transformations between graphs living in
the sub-space. In our example, the template graph lives on the largest sub-space and
hence any graph can be produced from it.

Fig. 22.5 Illustration of the graph filtering process. A graph of four vertices that contains all
structures with equal importance can be filtered out to produce any graph of four vertices with a
desired structure

Coarse-grain pass filters. From (22.24), two particular cases of interest arise. The first is the case in which the filter is chosen to preserve the scaling coefficients and to suppress the wavelet ones, that is, $\sigma_k^{(j)} = 1$ for all $k$ and $\nu_k^{(\ell)} = 0$ for all $k$ and $\ell$. We coin this the coarse-grain pass filtering of $f_G$, which we denote $\hat{f}_G^{(c)}$. By revisiting Fig. 22.2, it can be clearly seen that $\hat{f}_G^{(c)}$ corresponds to the coarse-grain approximations shown in the right panel. From (22.26), we trivially have that these graphs are given in closed form as

$$\hat{f}_G^{(c)}(e) = \frac{f_G(E_k^{(j)})}{2^j} \qquad (22.27)$$
An interesting application of coarse-grain pass filters is the generation of graphs that are structurally similar to an input unweighted graph. To see this, notice that $\hat{f}_G^{(c)}$ corresponds to a new graph where all edges $e \in E_k^{(j)}$ have a constant value in $[0, 1]$. If this value is taken as the success probability of a Bernoulli trial, then by drawing such a trial for each $e \in E_k^{(j)}$ and interpreting the successes as the edges of a new graph, we obtain a graph that, in expectation, is structurally equal to $f_G$. Indeed, this procedure can be seen as a generalization of the stochastic block model to blocks determined by the sets $E_k^{(j)}$ and with probabilities given by the input graph.
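A minimal sketch of this generator is given below. It assumes, as in our earlier sketches, that relations are ordered so that each set $E_k^{(j)}$ is a contiguous block of the edge vector; the block size and input are illustrative.

```python
import numpy as np

rng = np.random.default_rng(2)

def coarse_grain_sample(f, block_size):
    """Coarse-grain-pass filter followed by Bernoulli sampling: each block of
    `block_size` relations is replaced by its mean (Eq. 22.27 with 2^j = block_size),
    and one Bernoulli trial per relation yields an unweighted graph that is,
    in expectation, structurally equal to f."""
    f = np.asarray(f, dtype=float)
    means = f.reshape(-1, block_size).mean(axis=1)     # f_G(E_k^(j)) / 2^j
    probs = np.repeat(means, block_size)               # the coarse-grain graph
    return (rng.random(f.size) < probs).astype(float)

f = np.array([1, 1, 1, 0, 0, 0, 0, 1], dtype=float)    # two blocks of 4 relations
print(coarse_grain_sample(f, block_size=4))            # ~3 edges in block 1, ~1 in block 2
```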
Detail pass filters. The second case of interest consists in choosing the filter to suppress the scaling coefficients while preserving the wavelet ones, that is, $\sigma_k^{(j)} = 0$ for all $k$ and $\nu_k^{(\ell)} = 1$ for all $k$ and $\ell$. We coin this the detail pass filtering of $f_G$, which we denote $\hat{f}_G^{(d)}$. Since this filter retains the information not captured by the coarse-grain one, it corresponds to the detailed structures shown in the right panel of Fig. 22.2. From (22.27), we readily have that

$$\hat{f}_G^{(d)}(e) = f_G(e) - \frac{f_G(E_k^{(j)})}{2^j} \qquad (22.28)$$
where $e \in E_k^{(j)}$. This equation indicates that a detail filter provides a measure of the local variability of $f_G$, as it compares the weight of edge $e$ with the mean weight of the edges in $E_k^{(j)}$. It can therefore be interpreted as the graph analog of the differentiation operator for time series. Namely, in time series, differentiation consists in comparing a signal sample at time $t$ with the sample at time $t-1$, owing to the ordered nature of time. In graphs, however, there is no such notion of ordered edges, and it makes more sense to define differentiation by comparing the weight of an edge with the other weights in its local vicinity. This is precisely what our detail filters do, for which we define

$$\frac{d f_G}{d e}(e) := \hat{f}_G^{(d)}(e) \qquad (22.29)$$
A useful consequence of having a notion of graph differentiation is that it opens
the door to investigating regularity measures. Regularity measures are a key concept
in machine learning, as they allow reducing the space of admissible solutions to
functions satisfying certain regularity properties. Essentially, the intuition is that a
highly regular or smooth function (hence of small regularity metric) can be described
with just a few parameters. Thus, it is easy to reconstruct or guess by just knowing
a few of its samples. From this perspective, classical machine learning algorithms
look for functions that fit the known samples and that are as regular as possible. In
mathematical terms, the regularity of functions is often defined as the squared norm
of the derivative of the function. Thus, adapted to our setting, we can define the
regularity of $f_G$ as

$$reg(f_G) = \sum_{e \in E} \left( \frac{d f_G}{d e}(e) \right)^2 \qquad (22.30)$$

Our next result demonstrates that this regularity metric indeed quantifies the complexity of guessing $f_G$. To do so, it measures, in terms of graph distances, how hard it is to recover an unweighted graph from the sole knowledge of its number of edges per motif.

Lemma 22.3 Let $G(V, E, f_G)$ be an unweighted graph and $G^*(V, E^*, f_{G^*})$ denote another unweighted graph selected uniformly at random from the class of graphs structurally equal to $G$. Then, we have that

$$reg(f_G) = \mathbb{E}\left[ \mathrm{dist}(G, G^*) \right]. \qquad (22.31)$$

The proof of Lemma 22.3 is deferred to Appendix 22.7.3. It says that if we try to recover $G$ by generating a random graph $G^*$ that has the same structure (which can be done by knowing the number of active edges in each set $E_k^{(j)}$), then this regularity metric quantifies the expected error. Notice that zero error is only attained when the motifs $E_k^{(j)}$ correspond to either empty or complete sub-graphs of $G$. This is because the classes containing such graphs have them as their only members, so any random selection from such classes recovers them. On the other hand, the metric attains its maximum value when each motif has $|E_k^{(j)}|/2$ active edges in $G$. This can be intuitively verified from the fact that the class containing such graphs is the largest possible, so the probability of selecting a graph $G^*$ that is close to $G$ is the smallest possible.

22.5 Link Stream Analysis

In this section, we combine our graph developments of Sect. 22.4 with their signal processing counterparts. We do this via the link stream analysis framework presented in Sect. 22.3. Firstly, we combine the proposed graph decomposition with classical signal decompositions, such as the Fourier or wavelet transforms. We show that this results in a new basis for link streams that allows us to represent them in a frequency-structure domain. We then use the framework to combine frequency and structural filters, allowing us to filter out specific frequency and structural information from a link stream. Notably, we show that interesting transformations of link streams, like aggregation or embedding, correspond to simple filters in our frequency-structure domain. We finish by showing that filters naturally give rise to a notion of regularity for link streams, which opens the door to the application of machine learning.

22.5.1 Frequency-Structure Representation of Link Streams

We begin our analysis of link streams by combining our graph decomposition with classical time series ones. As a first step, we must fix a basis for graphs that is relevant for analyzing the entire graph sequence. In the absence of extra information, the natural way to do this consists in aggregating the whole link stream into a single graph. This graph reveals all the regions of the relation-space where activity occurs and that are therefore relevant to track. Based on this aggregated graph, we can then use one of our procedures from Sect. 22.4.2 (SVD-based or BFS-based) to partition the relation-space and fix the graph basis. As a second step, we encode the basis elements in a matrix $\Phi$ whose rows contain, from top to bottom, the scaling functions and the wavelet functions:

$$\Phi = [\phi_0^{(j)}, \phi_1^{(j)}, \ldots, \theta_0^{(\ell)}, \theta_1^{(\ell)}, \ldots]. \qquad (22.32)$$

Then, returning to the matrix representation of link streams $L$, we notice that the coefficients of our decomposition can be computed, for the entire link stream, as

$$L = X \Phi, \qquad (22.33)$$

where row $t$ of the matrix $X$ contains the decomposition coefficients associated with graph $G_t$. From (22.33) we observe that (i) $L$ can be exactly recovered from $X$; and (ii) the rows and columns of $X$ are indexed by time and structures, respectively. Therefore, $X$ can be interpreted as a time-structure representation of $L$.
Similarly, we can represent $L$ in a frequency-relational domain by defining the matrix $\Psi = [\psi_1, \psi_2, \ldots]$, whose columns are the atoms of a signal dictionary, like Fourier or wavelets. Focusing on the Fourier case, we can represent the frequency analysis of $L$ as the simple matrix product

$$L = \Psi F \qquad (22.34)$$

where column $j$ of the matrix $F$ contains the Fourier transform of $e_j(t)$. Since $F$ is indexed by frequency and relations, and $L$ can be entirely recovered from it, we can consider $F$ as a frequency-relational representation of $L$.
It can easily be seen that the matrices $\Psi$ and $\Phi$ take the roles of $H$ and $Q$ in (22.13), respectively, implying that they can be readily combined to express $L$ in a frequency-structure domain. That is, by decomposing $F$ into the graph dictionary (to extract its structural information), and $X$ into the signal dictionary (to extract its frequencies), we obtain in both cases the same matrix of coefficients $C$:

$$L = \Psi C \Phi. \qquad (22.35)$$

The matrix $C$ contains all the frequency and structure information of $L$, as entry $C_{uk}$ quantifies the importance of structure $k$ oscillating at frequency $u$ in $L$. Equation (22.35) constitutes our proposed decomposition for link streams.
(22.35) constitutes our proposed decomposition for link streams.
Two important points to highlight from (22.35) are that (i) $\Phi$ and $\Psi$ are orthonormal matrices, so the inverse transformations are given by their conjugate transposes, which we denote $\Phi^\dagger$ and $\Psi^\dagger$ for simplicity; and (ii) the expression can be rewritten as $L = \sum_{u,k} C_{uk} Z_{uk}$, where $Z_{uk} = \psi_u \phi_k$ is a link stream consisting of structure $k$ oscillating at frequency $u$, which is furthermore of unit norm and orthogonal to any other $Z_{u'k'}$ for $u' \neq u$ or $k' \neq k$, thus forming an orthonormal basis of link streams.
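The decomposition (22.35) is straightforward to compute in practice. The sketch below uses the unitary FFT along time and a hand-rolled orthonormal Haar analysis along relations as stand-ins for $\Psi^\dagger$ and $\Phi^\dagger$; the normalizations and the assumption that the number of relations is a power of two are ours.

```python
import numpy as np

def haar_rows(A):
    """Apply the orthonormal Haar analysis to each row of A, i.e. right-multiply
    by the graph dictionary's conjugate transpose (our stand-in for A Phi^dagger)."""
    a, details = A.astype(complex), []
    while a.shape[1] > 1:
        a, d = (a[:, 0::2] + a[:, 1::2]) / np.sqrt(2), (a[:, 0::2] - a[:, 1::2]) / np.sqrt(2)
        details.append(d)
    return np.hstack([a] + details[::-1])

def frequency_structure(L):
    """Frequency-structure coefficients C = Psi^dagger L Phi^dagger of a link
    stream matrix L (rows = time, columns = relations), cf. Eq. (22.35)."""
    F = np.fft.fft(L, axis=0) / np.sqrt(L.shape[0])   # unitary DFT along time
    return haar_rows(F)                                # Haar analysis along relations

# Example: a structure present at even times only shows up at frequencies 0 and 1/2.
L = np.zeros((4, 8))
L[0::2, :4] = 1.0
print(np.round(np.abs(frequency_structure(L)), 2))
```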
Illustrating example (oscillating link stream). To better illustrate our link stream
decomposition, Fig. 22.6 gives a practical example by revisiting the oscillating link
stream introduced in Sect. 22.2. This link stream is composed of two clear structures
(the claw and the triangle) oscillating at frequency 1/2 (i.e., they repeat every two
samples). Therefore, we expect our decomposition to be able to reveal this informa-
tion.
We begin by encoding the link stream into the matrix $L$ shown on the left side of Fig. 22.6. This matrix constitutes a time-relational representation of the data, as its entries encode the importance of a given relation at a given time. Since the rows of this matrix encode the graph sequence, we can obtain a time-structure representation by taking the product $X = L \Phi^\dagger$. To do this, we must fix a basis $\Phi$ which, for this example, is selected in the same way as in the example of Fig. 22.2. The only difference is that we choose $j = 3$, since this resolution is more appropriate to study

Fig. 22.6 Illustration of the decomposition of link streams into their frequencies and structures. The oscillating claw and triangle link stream is encoded into a matrix (left panel). This matrix can then be projected onto the graph basis (top-right), the Fourier basis (center-right), or both the graph and Fourier bases (bottom-right). The graph basis is chosen as in Fig. 22.2. The frequency-structure representation clearly reveals the importance of the claw and triangle structures oscillating at frequency 1/2

triangles and claws. Based on $\Phi$, we compute $X$, which is displayed at the top of the right-hand side of Fig. 22.6. This representation shows that the claw and triangle motifs, represented by the coefficients $s_0^{(3)}$ and $s_1^{(3)}$, are very important in the link stream, yet they do not appear simultaneously.
On the other hand, since the columns of $L$ encode the activation times of individual relations, we can obtain a frequency-relational representation by taking the product $F = \Psi^\dagger L$, where $\Psi$ is the discrete Fourier transform matrix. For our example, we display the magnitude of $F$ in the middle plot of the right-hand side of Fig. 22.6. The figure clearly shows that interactions in $L$ possess only two frequencies: 0 and 1/2. The 0 frequency reflects the mean value of $e_k(t)$, which is clearly large due to the positivity of edge weights. The 1/2 frequency indicates that $e_k(t)$ displays the same behavior every two samples.
In sum, $X$ highlights the importance of the claw and triangle structures while $F$ indicates the importance of the 0 and 1/2 frequencies. Yet, these disjoint representations are not fully satisfactory, as they leave unclear which structures are related to which frequencies. To obtain a unique frequency-structure representation, we simply decompose $L$ from both sides as $C = \Psi^\dagger L \Phi^\dagger$, whose magnitude is shown in the bottom plot of the right side of Fig. 22.6. The figure clearly highlights that the most important structures are indeed the claw and the triangle and that both only oscillate at frequency 0 (they have positively weighted edges) and frequency 1/2 (they appear every two samples). This confirms that our decomposition effectively reveals the oscillating and structural nature of the link stream.

22.5.2 Filters in Link Streams

We now show that frequency and structural filters can be combined in order to filter
out specific frequencies and structures from a link stream. We begin by adapting
the structural filters to our matrix formalism. This is achieved by encoding the filter
coefficients in the following diagonal matrix
$$Q = \mathrm{diag}(\sigma_0^{(j)}, \sigma_1^{(j)}, \ldots, \nu_0^{(\ell)}, \nu_1^{(\ell)}, \ldots). \qquad (22.36)$$

This way, the filtering of the structural information of $L$ can be represented as

$$\hat{X} = X Q. \qquad (22.37)$$

To represent the filter in terms of $L$, we have that

$$\hat{L} = \hat{X} \Phi \qquad (22.38)$$
$$= X Q \Phi \qquad (22.39)$$
$$= L \Phi^\dagger Q \Phi \qquad (22.40)$$
$$= L Q^{(\mathrm{filt})} \qquad (22.41)$$

where $Q^{(\mathrm{filt})} = \Phi^\dagger Q \Phi$. Hence, matrices that are diagonalized by $\Phi$ constitute structural filters for link streams.
tural filters for link streams.
Similarly, we can adapt frequency filters to our matrix formalism by encoding the frequency response of the filter in the diagonal matrix

$$H = \mathrm{diag}(\chi_0, \chi_1, \ldots), \qquad (22.42)$$

where $\chi_i$ denotes the frequency response of the filter at frequency $i$. This way, the frequency filtering of $L$ can be represented as

$$\hat{F} = H F. \qquad (22.43)$$

By derivations similar to (22.38), the frequency filter can be expressed in terms of $L$ as

$$\hat{L} = H^{(\mathrm{filt})} L \qquad (22.44)$$

where $H^{(\mathrm{filt})} = \Psi H \Psi^\dagger$.


We can then easily combine $H^{(\mathrm{filt})}$ and $Q^{(\mathrm{filt})}$ as

$$\hat{L} = H^{(\mathrm{filt})} L Q^{(\mathrm{filt})} \qquad (22.45)$$
$$= \Psi H \Psi^\dagger L \Phi^\dagger Q \Phi \qquad (22.46)$$
$$= \Psi H C Q \Phi. \qquad (22.47)$$

Fig. 22.7 Illustration of the use of frequency and structural filters to recover specific information
from the link stream. In the example, filters are designed to recover the backbone activity: coarse-
grain structures with low frequencies

From (22.45), we can see that $Q$ suppresses or amplifies the columns of $C$ while $H$ does the same for its rows. Thus, $Q$ and $H$ can be chosen to let only specific ranges of structures and frequencies pass. For example, they can be chosen to let only slowly oscillating coarse-grain structures pass, as illustrated in Fig. 22.7. Our next examples explore the potential of these filters in more detail.
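Before turning to these examples, the following sketch shows how (22.45) can be evaluated numerically. The explicit construction of the Haar dictionary, the unitary FFT normalization, and the parameter names are our own choices; for the output to be real, the frequency response `chi` should be conjugate-symmetric.

```python
import numpy as np

def haar_matrix(M):
    """Orthonormal graph dictionary Phi (rows = scaling then wavelet functions),
    built by running the Haar analysis on the identity matrix."""
    def analysis(a):
        details = []
        while a.shape[1] > 1:
            a, d = (a[:, 0::2] + a[:, 1::2]) / np.sqrt(2), (a[:, 0::2] - a[:, 1::2]) / np.sqrt(2)
            details.append(d)
        return np.hstack([a] + details[::-1])
    return analysis(np.eye(M)).T

def filter_link_stream(L, chi, q):
    """Joint filtering L_hat = Psi H Psi^dagger L Phi^dagger Q Phi (Eq. 22.45),
    with `chi` the frequency response and `q` the structural filter coefficients."""
    T, M = L.shape
    Phi = haar_matrix(M)
    C = (np.fft.fft(L, axis=0) / np.sqrt(T)) @ Phi.T           # C = Psi^dagger L Phi^dagger
    C_hat = np.diag(chi) @ C @ np.diag(q)                      # H C Q
    return np.real(np.fft.ifft(C_hat * np.sqrt(T), axis=0) @ Phi)  # Psi (H C Q) Phi

# Example: keep only the zero-frequency, coarsest-structure content of a link stream.
T, M = 4, 8
L = np.zeros((T, M))
L[0::2, :] = 1.0                                 # everything blinks on and off
chi = np.zeros(T); chi[0] = 1.0                  # low-pass (DC only)
q = np.zeros(M); q[0] = 1.0                      # coarse-grain pass
print(np.round(filter_link_stream(L, chi, q), 2))  # constant 0.5 everywhere
```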
Illustrating example (aggregation and embedding). In Sect. 22.2, we stressed
that the aggregation or embedding of link streams changes their information in ways
that are hard to characterize. Notably, our developments above allow us to show that
aggregation or embedding can be seen as simple filters in our frequency-structure
domain. To show this, let us notice that the k-sample aggregation operator can be
represented by the matrix

(agg) 1 if j − k < i ≤ j
Hi j = (22.48)
0 otherwise

so that 
L = H(agg) L denotes the aggregated link stream. H(agg) is a circulant matrix
and therefore is diagonalizable by the Fourier basis, meaning that it constitutes a
frequency filter.
In Fig. 22.8 (left), we display the frequency response of this filter for the $k = 2$ case already used in our example of Sect. 22.2. As can be seen, this filter lets the low frequencies pass while entirely suppressing the information at frequency 1/2. This implies that when it is applied to the oscillating link stream (see Fig. 22.6), only the content at frequency 0 is retained, resulting in a new link stream where the claw and triangle are constant over time. Since the simultaneous presence of the claw and the triangle constitutes a clique, this explains why the 2-sample aggregation of our oscillating link stream results in a constant clique.
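This frequency response is easy to verify: the eigenvalues of a circulant matrix are the DFT of its first column. The sketch below is a small numerical check under our conventions (wrap-around aggregation, T = 8 samples).

```python
import numpy as np

T, k = 8, 2
c = np.zeros(T)
c[:k] = 1.0                                   # first column of the circulant H^(agg)
H_agg = np.column_stack([np.roll(c, s) for s in range(T)])

# Eigenvalues of a circulant matrix = DFT of its first column.
response = np.fft.fft(c)
print(np.round(np.abs(response), 3))          # vanishes at frequency 1/2 (index T // 2)
```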
Similarly, if we embed the link stream by applying the methodology proposed in Sect. 22.4.3, this procedure corresponds to applying a coarse-grain pass filter to $L$. In Fig. 22.8 (right), we display the response of such a filter, which only lets the scaling coefficients pass. By applying this filter to our oscillating link stream (see Fig. 22.6), we can see that its effect is to retain only the coefficients associated with $s_0^{(3)}$ and $s_1^{(3)}$ at frequencies 0 and 1/2. Thus, this shows that the

Fig. 22.8 Aggregation and embedding processes modeled as filtering link streams. The two-sample
aggregation of interactions corresponds to a low-pass filter in the frequency-domain. The embedding
of graphs via the scaling coefficients corresponds to a coarse-grain filter in the structural-domain

embedding method effectively maps the link stream into two time series ($s_0^{(3)}(t)$ and $s_1^{(3)}(t)$) whose frequencies reflect the frequencies of the initial link stream. Of course, we may combine the aggregation and embedding filters and see that their combination amounts to retaining only the coarse-grain information at zero frequency.
Illustrating example (backbone of a link stream). In this example, our aim is twofold. Firstly, we aim to show that, even though a raw link stream may be extremely sparse, our decomposition soundly reveals its characterizing frequencies and structures. Secondly, we aim to show that our filters can be used to retrieve the backbone of the link stream: its fundamental activity pattern. To show this, let us consider the example of Fig. 22.9, which shows (top) a link stream reflecting a typical communication pattern: two communities whose members sporadically communicate during daytime and are inactive during night-time. The challenge is that spotting this communication pattern from the raw data is difficult. Namely, the graphs of the sequence are too sparse to infer the community structure and the edge time series are too spiky to infer the day-night periodicity. Normally, to see the pattern it is necessary to take aggregates of the link stream, which are unsatisfactory as they involve a loss of information, and different patterns may result in equal aggregates.
Notably, our decomposition is able to effectively spot the pattern from the raw data. To show this, we employ our SVD-based partitioning procedure and report the magnitude of the frequency-structure coefficients of the link stream in the lower-left part of the figure. As can be seen, four coefficients contain most of the information: $s_0^{(4)}$ and $s_3^{(4)}$, each at frequencies 0 and 1/20. These coefficients effectively indicate the presence of two communities whose activity repeats periodically every twenty samples. Interestingly, having four significantly large coefficients and many others of very small magnitude allows us to give a new interpretation of our link stream: it consists of two oscillating communities (large coefficients) plus details (small coefficients). We can therefore retain only the large coefficients in order to preserve the non-detailed version of the link stream. This is what we call its backbone. To do this, we simply employ our frequency and structural filters to suppress the coefficients outside the red box shown in the plot. The resulting link stream is displayed in the lower-right part of Fig. 22.9. As can be seen, it recovers the periods of large activity during daytime and periods of no activity during night-time. Moreover, the large-activity periods consist of graphs with the two types of community structure. Thus, this effectively reflects the backbone of the original link stream.

Fig. 22.9 Example of the recovery of the backbone of a link stream. The raw link stream (top)
consists of two-communities with day/night activity periods. The graphs of the sequence are too
sparse to spot the community-like structure while the time series of edges are too spiky to identify
the day/night nature of the activity. The frequency-structure representation of the data (bottom
left) reveals the importance of two communities with twenty-sample periodicity. Frequency and
structural filters are therefore used to just retain the coefficients contained within the red box. The
resulting link stream (bottom right) reflects the data backbone: two fully active communities during
the day and empty graphs during the night

Illustrating example (regularity of link streams). In our previous example, we showed that low-pass frequency and coarse-grain structure filters allow us to recover the backbone of the link stream. In this example, our aim is to show that high-pass frequency filters and detail structure filters allow defining a notion of regularity for link streams, which paves the way for machine learning on them. We recall that regularity metrics aim to measure the variability of a function, which is a good indicator of how hard it is to interpolate (predict) or to encode with a few parameters (compress). Since a link stream is a two-dimensional function of time and relations, it is natural to define its regularity by differentiating it with respect to time and with respect to relations. To do this, let us notice that the time differentiation operator can be represented by the matrix


$$H^{(\mathrm{diff})}_{uv} = \begin{cases} 1 & \text{if } u = v \\ -1 & \text{if } u = v + 1 \\ 0 & \text{otherwise,} \end{cases} \qquad (22.49)$$

so that the differentiation of $L$ with respect to time can be simply expressed as

$$\frac{\partial L}{\partial t} = H^{(\mathrm{diff})} L. \qquad (22.50)$$

Interestingly, H(diff) is a circulant matrix and therefore is diagonalizable by the Fourier


basis, meaning that it constitutes a frequency filter. In particular, it is a high-pass fre-
quency filter as it measures differences between successive time samples. Concerning
the relational dimension, we showed in Sect. 22.4.4 that detail pass structural filters
admit an interpretation as the graph analog of the differentiation operator. Therefore,
we employ them to compute the differentiation of L with respect to the relational
axis. To do this, notice that detail pass filters can be represented, in the time-relational
domain, by the matrix
 
$$Q^{(\mathrm{diff})} = I - \frac{1}{2^j} \sum_k \mathbf{1}_{E_k^{(j)}} \mathbf{1}_{E_k^{(j)}}^\top, \qquad (22.51)$$

where $\mathbf{1}_{E_k^{(j)}}$ denotes the indicator vector of $E_k^{(j)}$. Thus, the differentiation of $L$ with respect to relations can be expressed as

$$\frac{\partial L}{\partial e} = L\, Q^{(\mathrm{diff})}. \qquad (22.52)$$
Based on these definitions, we can define the regularity of $L$ along the relational axis as $reg_e(L) = \left\| \frac{\partial L}{\partial e} \right\|_F^2$ and along the temporal axis as $reg_t(L) = \left\| \frac{\partial L}{\partial t} \right\|_F^2$, which can be combined to obtain a measure of the total variation of the link stream:

$$reg(L) = reg_t(L) + reg_e(L). \qquad (22.53)$$

To show that our regularity metric effectively measures the complexity of the link
stream, our next result shows that, for unweighted link streams, the time regularity
term measures the number of edge state changes over time and the relational regularity
term measures the expected error when approximating the link stream by another
random one that is structurally equal.
Lemma 22.4 Let $L$ be an unweighted link stream. Then, we have that $reg_t(L) = \sum_t \mathrm{edit}(G_t, G_{t-1})$ and that $reg_e(L) = \mathbb{E}\left[ \sum_t \mathrm{dist}(G_t, G_t^*) \right]$, where $G_t^*$ is a graph drawn at random from the class of graphs structurally equal to $G_t$.
The proof of Lemma 22.4 is given in Appendix 22.7.4. From the lemma, it can be seen that zero regularity is only attained when two criteria are met: the link stream does not evolve at all, and the sequence is formed of graphs in which every set $E_k^{(j)}$ is either empty or complete. Therefore, only trivial link streams have zero regularity. On the other hand, maximum regularity is attained when two criteria are satisfied: two successive graphs never intersect, and the structures $E_k^{(j)}$ have half of their edges active at all times. In sum, our regularity metric is sensitive to the slightest temporal evolution and to the slightest deviation of the structures of $L$ from being fully active or fully inactive.
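A compact sketch of this computation is given below. It assumes, as before, that relations are ordered so that each set $E_k^{(j)}$ is a contiguous block of columns, and it uses the circular form of the time-difference operator so that $H^{(\mathrm{diff})}$ is indeed circulant.

```python
import numpy as np

def regularity(L, block_size):
    """Total variation reg(L) = reg_t(L) + reg_e(L) of a link stream matrix L
    (rows = time, columns = relations), Eqs. (22.50)-(22.53)."""
    L = np.asarray(L, dtype=float)
    # Temporal term: squared Frobenius norm of the (circular) time differences.
    reg_t = np.sum((L - np.roll(L, 1, axis=0)) ** 2)
    # Relational term: squared norm of L Q_diff, i.e. of the per-block detail.
    block_means = L.reshape(L.shape[0], -1, block_size).mean(axis=2)
    detail = L - np.repeat(block_means, block_size, axis=1)
    reg_e = np.sum(detail ** 2)
    return reg_t + reg_e

# A non-evolving stream of complete blocks has zero regularity:
L = np.tile(np.array([1.0, 1.0, 1.0, 1.0, 0.0, 0.0, 0.0, 0.0]), (5, 1))
print(regularity(L, block_size=4))   # 0.0
```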
Clearly, the above time regularity term, which counts individual edge changes, may be too strict for applications involving a sequence of structurally similar graphs, like a sequence of realizations of a stochastic block model that one may consider regular due to its stable community structure, yet to which the above metric assigns a large regularity value. To address such situations, we can leverage our insights from Lemma 22.2 to propose a notion of time regularity at a larger structural resolution. For this, we notice that $X$ can be seen as the concatenation of two matrices $X = [S, W]$, where $S$ contains the scaling coefficients and $W$ the wavelet ones. Since the scaling coefficients act as an embedding method that maps structurally similar graphs to similar vectors, we can define time differentiation at the structural resolution as $\frac{\partial S}{\partial t} = H^{(\mathrm{diff})} S$. Based on this, we can define a more relaxed temporal regularity term as $reg_t(S) = \left\| \frac{\partial S}{\partial t} \right\|_F^2$, which is zero as long as all the graphs of the sequence are structurally equal.
We finish by stressing that these definitions open the door to machine learning directly on link streams. To see this, recall that the classical machine learning problem, in the supervised setting, is to find a target function that is only partially known. Since the space of functions fitting the known values is potentially infinite, it is standard to reduce the space of admissible functions by imposing the prior that the target function should be highly regular (a property usually held by natural functions). Therefore, metrics assessing the regularity of functions are crucial in machine learning applications. In this vein, our regularity metric makes it possible to do machine learning in situations where the link stream, which is a two-dimensional function, is only partially known. For this problem, and under the assumption that the link stream being sought should be regular, our derivations above allow us to propose regularization problems of the form

$$\mathrm{argmin}_{L^*} \left\{ \| L - L^* \|_F^2 + reg(L^*) \right\}, \qquad (22.54)$$

where the first term looks for a link stream that fits the known data and the second
one penalizes irregular solutions. We leave the study of regularization problems on
link streams as future work.

22.6 Conclusion

In this work, we presented a frequency-structure analysis of link streams. Our analysis is based on a novel linear matrix framework that represents link streams as simple matrices unifying their classical time series and graph interpretations. This representation allows processing link streams by means of simple matrix products with matrices representing linear signal and graph operators. We showed that most signal processing operators can be readily adapted to our framework as a means to process time, and we also developed a set of novel graph-based techniques to process the structural information. In particular, we developed a multi-resolution analysis for graphs and structural filters that allow us to spot and tune their structural information. These developments were made possible by interpreting graphs as functions and then adapting signal processing methods to process such functions. We showed that these results permit a novel representation of link streams in a frequency-structure domain that reveals the important structures and frequencies they contain. Moreover, we showed that various interesting processing tasks can be seen as simple filters in this domain. In particular, our decomposition and filters open the door to extracting better-quality features when searching for events of interest in link streams and also pave the way for machine learning directly on them.

22.7 Appendix

22.7.1 Proof of Lemma 22.1

Proof Let $Q$ denote the orthonormal matrix stacking the scaling and wavelet functions as its rows. Then, we have that $x_i = f_{G_i} Q^\top$. The proof of (2) follows from the fact that $\langle x_1, x_2 \rangle = (f_{G_1} Q^\top)(f_{G_2} Q^\top)^\top = f_{G_1} Q^\top Q f_{G_2}^\top = f_{G_1} f_{G_2}^\top = |E_1 \cap E_2|$, where we used the orthonormality of $Q$ and the unweighted graph assumption. The proof of (1) follows as a particular case of (2) for $f_{G_2} = f_{G_1}$. To prove (3), notice that $\| f_{G_1} - f_{G_2} \|_2^2 = \mathrm{edit}(G_1, G_2)$ for unweighted graphs. Then, by developing the left-hand side and using the orthonormality of $Q$, we have that $\| f_{G_1} - f_{G_2} \|_2^2 = (f_{G_1} - f_{G_2})(f_{G_1} - f_{G_2})^\top = (x_1 - x_2) Q Q^\top (x_1 - x_2)^\top = (x_1 - x_2)(x_1 - x_2)^\top$.

22.7.2 Proof of Lemma 22.2


Proof We begin by proving (2). To do so, let us notice that $\langle s_1, s_2 \rangle = \sum_k (s_1^{(j)})_k (s_2^{(j)})_k$. From the definition of structurally equal graphs, we have that $|E_1 \cap E_k^{(j)}| = |\tilde{E}_1 \cap E_k^{(j)}|$ and that $|E_2 \cap E_k^{(j)}| = |\tilde{E}_2 \cap E_k^{(j)}|$. This implies that, irrespective of the sampled $G_1$ and $G_2$, the product $(s_1^{(j)})_k (s_2^{(j)})_k$ is invariant. By letting $|E_1 \cap E_k^{(j)}| = m_1$ and $|E_2 \cap E_k^{(j)}| = m_2$, we have that $(s_1^{(j)})_k (s_2^{(j)})_k = m_1 m_2 / 2^j$. Our goal now is to show that this quantity equals the expected number of common active tuples between $G_1$ and $G_2$ in the set $E_k^{(j)}$. We do this by noticing that the intersection between $G_1$ and $G_2$, restricted to the tuples of the set $E_k^{(j)}$, can be modeled as a sampling process without replacement. Namely, we can see the tuples of $E_k^{(j)}$ as our total population, consisting of $2^j$ possible tuples. From this population, the edges of $G_2$ mark $m_2$ of the tuples with the property of being active. Then, we take a sample of $m_1$ tuples from this population, where the sampled tuples are dictated by the edges of $G_1$. Our interest is to measure, among the sampled tuples, how many have been marked as active, as such tuples are active in both $G_1$ and $G_2$; that is, they are the common edges of both graphs. The process of sampling $m_1$ elements out of a population of $2^j$ elements where $m_2$ of them have some property is modeled by the hypergeometric distribution, for which the expected number of sampled elements having the property is $m_1 m_2 / 2^j$. Since the sets $E_k^{(j)}$ are disjoint and span the entire edge-space, applying this argument to all $k$ proves (2).
The proof of (1) follows as a particular case of (2) where C1 = C2 . To prove (3), let
us notice that $\|s_1 - s_2\|_2^2 = \sum_k ((s_1^{(j)})_k - (s_2^{(j)})_k)^2$. By using the same assumptions as above, we have that

$$((s_1^{(j)})_k - (s_2^{(j)})_k)^2 = \left( \frac{m_1 - m_2}{\sqrt{2^j}} \right)^2 \qquad (22.55)$$
$$= \frac{m_1^2}{2^j} + \frac{m_2^2}{2^j} - 2 r_{1,2} \qquad (22.56)$$
$$= \frac{m_1^2}{2^j} + \frac{m_2^2}{2^j} - 2 r_{1,2} + \gamma_1 + \gamma_2 - \gamma_1 - \gamma_2 \qquad (22.57)$$
$$= m_1 - r_{1,2} + m_2 - r_{1,2} - m_1 + r_{1,1} - m_2 + r_{2,2} \qquad (22.58)$$

where $r_{i,j} = m_i m_j / 2^j$ and $\gamma_i = (1 - m_i/2^j) m_i$. The term $m_1 - r_{1,2}$ encodes the number of edges of $G_1$ minus the expected number of edges of $G_1$ that are also in $G_2$ (restricted to $E_k^{(j)}$). Therefore, if we let $G_1^{(k)}(V, E_1 \cap E_k^{(j)})$ and $G_2^{(k)}(V, E_2 \cap E_k^{(j)})$ denote the restrictions of $G_1$ and $G_2$ to $E_k^{(j)}$, we have that

$$m_1 - r_{1,2} = \mathbb{E}\left[ \mathrm{dist}(G_1^{(k)}, G_2^{(k)}) \right] \qquad (22.59)$$

This means that (22.58) can be rewritten as

$$((s_1^{(j)})_k - (s_2^{(j)})_k)^2 = \mathbb{E}\left[ \mathrm{dist}(G_1^{(k)}, G_2^{(k)}) + \mathrm{dist}(G_2^{(k)}, G_1^{(k)}) - \mathrm{dist}(G_1^{(k)}, \tilde{G}_1^{(k)}) - \mathrm{dist}(G_2^{(k)}, \tilde{G}_2^{(k)}) \right]. \qquad (22.60)$$

The proof is finished after summing over $k$.

The proof is finished after summing over k.

22.7.3 Proof of Lemma 22.3

Proof For simplicity, let us denote $|E \cap E_k^{(j)}| = |E^* \cap E_k^{(j)}| = m_k$. Then, we have that

$$reg(f_G) = \sum_k \sum_{e \in E_k^{(j)}} \left( f_G(e) - \frac{m_k}{2^j} \right)^2 \qquad (22.61)$$
$$= \sum_k \sum_{e \in E_k^{(j)}} \left( f_G(e)^2 + \frac{m_k^2}{2^{2j}} - \frac{2 m_k f_G(e)}{2^j} \right) \qquad (22.62)$$
$$= \sum_k \left( m_k + \frac{m_k^3}{2^{2j}} - \frac{2 m_k^2}{2^j} + \frac{(2^j - m_k) m_k^2}{2^{2j}} \right) \qquad (22.63)$$
$$= \sum_k \left( m_k - \frac{m_k^2}{2^j} \right) \qquad (22.64)$$
$$= \mathbb{E}\left[ \mathrm{dist}(G, G^*) \right] \qquad (22.65)$$

where the last step follows from the same argument used in the proof of property (3) of Lemma 22.2.

22.7.4 Proof of Lemma 22.4


 
Proof We have that $\left\| \frac{\partial L}{\partial t} \right\|_F^2 = \sum_t \sum_e (f_{G_t}(e) - f_{G_{t-1}}(e))^2$. Due to the unweighted nature of the link stream, each term of the inner sum equals 1 if $f_{G_t}(e) \neq f_{G_{t-1}}(e)$ and 0 otherwise. Therefore, $\sum_e (f_{G_t}(e) - f_{G_{t-1}}(e))^2 = \mathrm{edit}(G_t, G_{t-1})$. The proof for the relational regularity term follows directly from Lemma 22.3 and the linearity of expectation.

Index

A
Accessibility, 133, 139, 140, 142, 143, 146, 423, 424
Averaging, 10, 56, 58, 145, 270, 299, 306, 308

B
Backbone, 6, 203–216, 220, 221, 244, 475, 476
Bayesian inference, 65, 66
Behavioral synchrony, 381, 382, 400
Betweenness, 53, 62, 128, 150, 158, 206–209, 217, 221, 336
Blinking networks, 275–280

C
Centrality, 9, 10, 53, 90, 91, 95, 128, 150, 158, 162, 221, 335–346, 348, 350, 352, 353, 359
Change points, 65, 66, 76–81, 187, 250
Clustering, 34, 39, 49, 50, 53, 55, 56, 58–62, 75, 84, 186, 187, 191, 393, 435–439, 442, 445, 446
Communication networks, 25–27, 29, 30, 33, 41, 111, 126, 149–152, 158, 159, 278, 279, 318, 324, 381, 399
Communication system, 149, 150, 161, 162
Community detection, 8, 30, 39, 40, 128, 158, 162, 186, 188, 192–195, 198–200, 226, 440
Community structure, 8, 14, 65, 66, 72, 81, 90, 186, 195–199, 313, 319, 329, 392, 438, 462, 475, 478
Complex networks, 85, 88, 161, 203, 386, 404
Complex systems, 120, 146, 159, 165, 176, 275, 353, 386
Concurrency, 14, 131–138, 141, 142, 145, 146, 259–261, 268, 273

D
Disease spread, 10, 12, 13, 38, 234, 244
Distance measure, 149, 161
Dynamic networks, 14, 32, 36, 42, 73, 76, 186, 195, 197–199, 386, 387

E
Electronic communication, 19, 211
Epidemic spreading, 37, 38, 65–67, 69, 78, 79, 81, 132, 203, 221, 243, 255, 259, 261, 315, 319, 324
Epidemic threshold, 134, 241, 243, 248, 254, 257, 259, 261, 262, 264, 266–271, 273, 319, 322, 324, 327, 329–331, 403, 406, 416, 417, 419, 420, 425–428, 430, 433
Event graphs, 7, 107, 109–119, 121–124, 126–128, 190

F
Fastest path, 108, 203, 204, 206, 211, 220

H
Higher-order Markov chains, 65
Human communication, 26, 108, 149, 150, 166, 169, 381, 399

I
Infection propagator, 241, 417, 418
Information, 1–7, 10, 11, 19, 35, 38, 39, 50, 52, 54, 56, 58, 60, 68, 77, 83, 84, 86, 88, 89, 91, 93–95, 99–101, 107–110, 112, 127, 131, 133, 135–137, 140, 141, 145, 149–152, 155–159, 161, 162, 171, 174, 176, 179, 188, 193, 203, 204, 211, 215–218, 220, 226, 234, 235, 260, 261, 278, 279, 329, 357–362, 368, 377, 378, 382, 385, 387, 392, 394, 400, 430, 435, 436, 444, 450, 451, 453, 454, 457, 459, 461, 465, 467, 468, 470, 471, 473–475
Information diffusion, 150, 203–205, 211, 213, 221, 234, 357, 360, 382

L
Link prediction, 18, 36, 37, 44, 381, 382, 384–392, 394, 395, 399, 400

M
Master stability function, 296, 301, 302, 304, 307
Multilayer, 140, 337, 438
Multilayer networks, 7, 40, 140, 337, 339, 340, 435, 436, 438
Multiplex networks, 337–340, 352

N
Network connectivity, 107
Network epidemiology, 11, 242
Network evolution, 14, 66, 84, 95, 382, 400
Network science, 1, 2, 8, 18, 33, 41, 49, 84, 128, 161, 185, 226, 404
Non-backtracking matrix, 254, 423
Number of slices, 435–445

P
Pagerank, 226, 335–338, 340, 341, 343–346, 348, 350, 353
Path, 4–6, 34, 41, 42, 51, 62, 86, 96, 108, 109, 111, 113–115, 117–119, 127, 128, 131–134, 141, 143, 146, 149, 151, 155–162, 204–207, 209–211, 249, 257, 336, 337, 360, 362, 410, 411, 420, 423–425, 432
Poisson process, 228
Precedence, 149, 155
Proximity networks, 69, 75, 381, 399

R
Random walks, 12–14, 17, 96, 98, 134, 191, 225, 226, 230–232, 234, 235, 237, 341, 439
Reachability, 62, 115, 131–135, 137–146, 425, 428, 429, 432

S
Sexually Transmitted Diseases (STD), 126, 133, 134, 137
Shortest path, 108, 158–160, 203–207, 209, 217, 220, 221, 336
Spectral radius, 250, 254, 418
Stochastic stability, 277, 282, 295, 301
Stream graphs, 49–51, 53–55, 57, 58, 61, 62, 186
Structural cohesion, 134, 135, 142, 144, 146

T
Temporal contacts, 19, 132, 133, 135, 136, 140, 141, 145, 178, 213
Temporal networks, 1–20, 25, 27, 29, 32–42, 49, 50, 61, 65, 66, 69, 70, 73, 74, 78, 80, 83, 86–88, 91–93, 98, 101, 102, 107–111, 113–118, 120–123, 126–128, 131–136, 139, 142, 145, 146, 149–154, 157, 159, 161, 165, 166, 175, 179–181, 185–188, 190, 195–197, 200, 203–206, 210, 211, 213–215, 217, 218, 225, 234–237, 241–243, 250, 251, 255–257, 259–261, 264, 265, 273, 275, 335–339, 342, 345, 352, 353, 357–359, 362–368, 378, 379, 381–383, 386, 391, 394, 403, 404, 406, 408, 427, 435–438, 440, 442–444, 446, 447
Temporal text network, 149–157, 161
Text, 26, 27, 33, 36, 37, 44, 75, 110, 149–155, 157–161, 391
Theoretical epidemiology, 241, 243, 251
Time decay, 382, 386–388, 390–392, 394, 395, 400
Time-varying networks, 14, 66, 259, 313, 315, 330

V
Visualization, 5, 338, 345, 392, 393

W
Weighted network, 6, 203–207, 209, 214, 220, 221, 386
Window of opportunity, 280, 281, 295, 304, 306