Abstract Interpretation of Distributed Network Control Planes
RYAN BECKETT, Microsoft Research, USA
AARTI GUPTA, Princeton University, USA
RATUL MAHAJAN, University of Washington, USA and Intentionet, USA
DAVID WALKER, Princeton University, USA
The control plane of most computer networks runs distributed routing protocols that determine if and
how traffic is forwarded. Errors in the configuration of network control planes frequently knock down
critical online services, leading to economic damage for service providers and significant hardship for users.
Validation via ahead-of-time simulation can help find configuration errors but such techniques are expensive
or even intractable for large industrial networks. We explore the use of abstract interpretation to address this
fundamental scaling challenge and find that the right abstractions can reduce the asymptotic complexity of
network simulation. Based on this observation, we build a tool called ShapeShifter for reachability analysis.
On a suite of 127 production networks from a large cloud provider, ShapeShifter provides an asymptotic
improvement in runtime and memory over the state-of-the-art simulator. These gains come with a minimal
loss in precision. Our abstract analysis accurately predicts reachability for all destinations for 95% of the
networks and for most destinations for the remaining 5%. We also find that abstract interpretation of network
control planes not only speeds up existing analyses but also facilitates new kinds of analyses. We illustrate
this advantage through a new destination "hijacking" analysis for the border gateway protocol (BGP), the
globally-deployed routing protocol.
CCS Concepts: • Networks → Protocol testing and verification; • Software and its engineering →
Abstraction, modeling and modularity; Software verification; Automated static analysis.
Additional Key Words and Phrases: Network Verification, Network Simulation, Network Reliability, Abstract
Interpretation, Network Control Plane, Distributed Routing Protocols, Router Configuration Verification
ACM Reference Format:
Ryan Beckett, Aarti Gupta, Ratul Mahajan, and David Walker. 2020. Abstract Interpretation of Distributed
Network Control Planes. Proc. ACM Program. Lang. 4, POPL, Article 42 (January 2020), 27 pages. https:
//doi.org/10.1145/3371110
1 INTRODUCTION
Computer networks are the connective tissue that provides access to services from online banking
to retail to flight reservations to movies to local and federal governments. When they malfunction,
these services often become inaccessible. The real-world consequences can range from revenue
losses at cloud providers reaching $100K/minute [Tweney 2013] to mass groundings of flights
[Sverdlik 2017] to customers being unable to withdraw funds from their bank accounts [Roberts 2018].
Authors’ addresses: Ryan Beckett, Microsoft Research, USA, rbeckett@princeton.edu; Aarti Gupta, Princeton University,
USA, aartig@princeton.edu; Ratul Mahajan, University of Washington, USA and Intentionet, USA, ratul@cs.washington.edu;
David Walker, Princeton University, USA, dpw@princeton.edu.
Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee
provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and
the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses,
contact the owner/author(s).
© 2020 Copyright held by the owner/author(s).
2475-1421/2020/1-ART42
https://doi.org/10.1145/3371110
Proc. ACM Program. Lang., Vol. 4, No. POPL, Article 42. Publication date: January 2020.
A fundamental hurdle to operating computer networks correctly is that they usually run
distributed protocols, and the behavior of these protocols is controlled via hundreds of thousands,
or sometimes even millions, of lines of low-level router configuration code.1 At this scale, it is
impossible to manually analyze configurations and certify their correctness. Moreover, any one-time
manual analysis is woefully insufficient because the network is in constant flux—link and router
failures change the paths used; new route announcements from a neighboring network change
traffic patterns; and network operators change configurations to defend against security threats,
engage in maintenance activities, reduce costs, or optimize performance. Such changes can lead to
unexpected results and network outages [Sharwood 2016; Sverdlik 2012].
Naturally then, researchers have turned to automated analysis methods to address this network
reliability challenge. In a first wave of research, researchers developed methods to analyze the
network data plane, i.e., the set of rules that govern how network traffic is forwarded from point A
to B right now. Systems such as Anteater [Mai et al. 2011], Header Space Analysis (HSA) [Kazemian
et al. 2012], Veriflow [Khurshid et al. 2013], NetPlumber [Kazemian et al. 2013], NetKAT [Anderson
et al. 2014], Network Optimized Datalog (NoD) [Lopes et al. 2015] and others model the data plane
and check properties such as reachability, absence of black holes, equivalence and more. This
groundbreaking research has produced methods that can scale to the world’s largest networks.
However, networks have a meta component, the control plane, composed of distributed routing
protocols that collectively decide which paths to use at any given time. When a failure happens,
for instance, the control plane computes alternate paths that avoid the failure and installs them
in the data plane. Hence, even if the data plane was correct 10 seconds ago, it may not be correct
now. Indeed, in practice, faults in control planes can lie dormant only to be triggered when devices
fail or during unexpected interactions with peer networks. A control plane analysis examines the
configurations of the distributed routing protocols and attempts to verify that some, or all, of the
possible future data planes that may be produced by a control plane satisfy a given property.
Research on control plane analysis is also well under way [Beckett et al. 2017, 2018; Fayaz et al.
2016; Feamster 2005; Fogel et al. 2015; Gember-Jacobson et al. 2016; Narain et al. 2016; Prabhu
et al. 2017; Wang et al. 2012; Weitz et al. 2016]. These analyses typically consider one or more
network environments (i.e., failure scenarios and routing announcements from neighbors), predict
the network behavior, and check that it satisfies desired connectivity and other invariants. They
have been able to find a range of configuration errors in real networks.
Scalability remains a key challenge for control plane analysis, however. The fundamental issue
is that the time and memory it takes to compute the behavior of routing protocols can grow highly
non-linearly in the size of the network, even for fixed concrete environments [Fabrikant et al. 2011;
L. Daggitt and Griffin 2018], let alone for general verification that considers an exhaustive set
of environments. This observation is not merely theoretical. Tools that model the control plane
precisely, such as Minesweeper [Beckett et al. 2017] and Bagpipe [Weitz et al. 2016], exhibit highly
non-linear performance curves and are thus unable to analyze large networks. Other tools, such as
ARC [Gember-Jacobson et al. 2016] and ERA [Fayaz et al. 2016], scale better by sacrificing precision,
i.e., they craft specialized abstract representations that are either unsound or incomplete with
respect to protocol features. Despite such approximations, their scaling behavior is still highly
non-linear (e.g., ERA must enumerate all pairs of network paths). Topological abstractions, in which
an equivalent smaller network is crafted from a large network by exploiting symmetry [Beckett
et al. 2018], can also help scale. But these lossless abstractions too bring limited relief and do not
work when network topology or routing policy is asymmetric.
1 Despite the emergence of software-defined networking (SDN), where the control plane is centralized, most large networks
still use distributed protocols. We focus on such networks.
Tackling the grand challenge of scalable control plane analysis requires systematically using
abstraction to combat complexity. To do so, we build the first framework for analysis of network
control planes via abstract interpretation [Cousot and Cousot 1977]. There are, of course, hundreds
of papers on abstract interpretation of conventional programs. However, distributed control planes
have a unique structure, semantics, and scaling properties, and they deserve special attention
because they play a crucial role in the world’s computing infrastructure. Our work does not
contribute to the general literature on abstract interpretation but focuses on the use of abstract
interpretation in the network control plane. It considers which abstractions to choose in this domain;
why those abstractions are expected to work; what kinds of performance gains are expected; when
these abstractions are sound; and how to implement a practical tool that exploits them.
Our most surprising and impactful result is that the right abstractions lead to asymptotic gains
in performance for reachability analysis—a key property of interest to network operators—often
with no loss of precision. Figure 1 plots the analysis time for our abstract simulator, ShapeShifter,
against Batfish [Fogel et al. 2015], a state-of-the-art (concrete) network simulator, as the number
of routers in the data center increases. Consistent with our complexity analysis, Batfish scales
slightly worse than quadratically with network size whereas ShapeShifter scales slightly worse
than linearly. Similar scaling trends hold for memory. The end result is that, in our experiments,
Batfish runs out of memory on a machine with 32 GB of RAM when network sizes exceed 1000
devices, not an uncommon situation for a large network. On the other hand, ShapeShifter can
continue to analyze these networks, and larger ones, in under a minute. We also found that our
abstract reachability analysis was perfectly accurate on 95% of the 127 industrial networks we
studied from a large cloud provider. On the remaining 5% of networks, we found the reachability
analysis accurate for the majority of destinations (i.e., IP prefixes).
Paper roadmap. In §2, we explain the basics of control plane protocols and our key insights,
including why certain network abstractions can retain precision while speeding analysis and how
abstract interpretation can enable new kinds of network analysis. In §3, we model network protocols
as routing algebras [Griffin and Sobrinho 2005; Sobrinho 2005] and introduce the idea of abstract
routing algebras. In §4, we list conditions relating concrete and abstract algebras that are sufficient
for soundness. In §5, we enumerate, via examples, the space of control plane message abstractions
that one may consider when crafting control plane analyses. In §6 and §7, we describe the design
and implementation of ShapeShifter, the world’s fastest control plane reachability analysis. Finally,
in §8, we present results from running ShapeShifter on a range of synthetic and real networks.
2 KEY IDEAS
Networks with distributed control planes run routing protocols that compute paths to different
destinations. While the protocols differ widely in the details, they share a common structure that
can be formalized using routing algebras [Griffin and Sobrinho 2005; Sobrinho 2005]. In this section,
we first give an informal introduction to the process of modeling network control planes using
routing algebras and then illustrate how to build abstractions atop these models. We then highlight
three key observations that underpin our research: (1) despite only approximating control plane
behavior, network abstractions often enable precise analysis for important properties, (2) they can
improve analysis performance dramatically, and (3) they enable new kinds of analyses.
While non-convergence can be problematic, it is not a major source of bugs in practice. We ignore
issues of convergence in this paper and focus on other properties of interest.
Modeling routing protocols. To model a routing protocol (or multiple interacting protocols) on
a network, we need to know: (1) the topology of the network, (2) the types of messages exchanged
between nodes, (3) the transfer function (f ), which describes how messages are transformed (or
discarded) as they move between nodes, and (4) the merge function (written as a ⊕ b), which
describes how nodes combine messages to the same destination from multiple neighbors to select
best messages.
With this information, one can simulate the execution of the routing protocol and find its stable
state, by starting in the initial state, extracted from router configurations, and repeatedly exchanging
messages between nodes. When node i sends a message to j, the transfer function f along the (i, j)
edge will transform the message and the message will be merged with node j’s current state. If j’s
current state is s and i’s current state is m, when j receives a new message from i, its new state will
be f (m) ⊕ s. If s = f (m) ⊕ s then this part of the network is stable.
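The simulation loop described above can be sketched in a few lines of Python. This is a minimal illustration rather than the implementation of any tool discussed in this paper: the function name `simulate`, the worklist strategy, and the hop-count instance at the end are all assumptions made for the example, and the loop terminates only if the protocol converges.

```python
from collections import deque

def simulate(edges, initial, transfer, merge):
    """Exchange messages until no node's state changes; return the stable state.

    edges    -- iterable of directed (i, j) pairs, meaning i sends to j
    initial  -- dict mapping each node to its initial state
    transfer -- transfer(i, j, m): the message j receives when i sends state m
    merge    -- merge(a, b): the state a node keeps when offered a and b
    """
    state = dict(initial)
    out_edges = {}
    for i, j in edges:
        out_edges.setdefault(i, []).append(j)

    work = deque(state)  # nodes whose current state still has to be sent
    while work:          # assumes the protocol converges
        i = work.popleft()
        for j in out_edges.get(i, []):
            new = merge(transfer(i, j, state[i]), state[j])  # f(m) ⊕ s
            if new != state[j]:  # s != f(m) ⊕ s, so j is not yet stable
                state[j] = new
                if j not in work:
                    work.append(j)
    return state

# Hypothetical instance: hop-count routing on a three-node line a - b - c,
# where node a is the destination and INF plays the role of "no route".
INF = float("inf")
line = [("a", "b"), ("b", "a"), ("b", "c"), ("c", "b")]
stable = simulate(
    line,
    {"a": 0, "b": INF, "c": INF},
    transfer=lambda i, j, m: m + 1,
    merge=min,
)
```

Here the stable state assigns each node its distance to `a`. Swapping in a different `transfer` and `merge` changes the protocol being simulated without touching the loop.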
A simplified BGP example. Consider Figure 2(a), which presents the initial state for a 5-node
BGP network. Each node is an independent autonomous system (AS), a configuration that is
common in data centers [Lapukhov et al. 2015].2 For simplicity, assume that the network has a
single destination, R1, and messages are represented as (lp, path, comm) triples, where lp is a local
preference that BGP routers can set to prefer one route over another, path contains the AS path that
routing advertisements traverse as they pass between devices, and comm is a bit vector that tracks
the community values attached to the message. We also assume that there is only one possible
community value, c1, which is either present (1) or absent (0). R1 starts by advertising the route
(100,[],[0]), and all other nodes start with state ∞, meaning they have no route.
The default BGP transfer function along each edge (i, j) adds the AS number of the router (j) to
the AS path and sets the local preference to 100. Network configurations can modify this default.
In the figure, the network has been configured to add the community c1 when a message passes
between R1 and R3 and to update the local preference to 200 if the community c1 is attached to
messages passing between R3 and R4. The BGP merge function selects a route with the highest
local preference, discarding messages with lower local preference. If two messages have the same
local preference, it will choose the route with the shorter AS path.
Figure 2(b) records the stable state to which Figure 2(a) converges. The arrows indicate the flow
of routing messages. More specifically, the (lp, path, comm)-triple printed in black next to each node
is the final message chosen by that node (ignore the pairs printed in orange for now). For instance,
router R3 converges to (100,[R1],[1]), a route with local preference (lp) 100, an AS path that leads
directly to the destination and the community c1. Router R4 converges to (200,[R3,R1],[1]); it
prefers the path through R3 over R2 because the lp of the message along that path is higher.
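The simplified merge function used in this example can be written down directly. The sketch below is illustrative only: `bgp_merge` and `NO_ROUTE` are hypothetical names, the route that R4 hears via R2 is assumed to be (100,[R2,R1],[0]) since the text does not spell it out, and ties on both local preference and path length are broken in favor of the first argument, whereas real BGP applies further tie-breakers.

```python
NO_ROUTE = None  # stands for the "no route" state ∞

def bgp_merge(a, b):
    """Select the better of two (lp, path, comm) routes.

    Higher local preference wins; on equal local preference the shorter
    AS path wins. Remaining ties keep `a` (an assumption of this sketch).
    """
    if a is NO_ROUTE:
        return b
    if b is NO_ROUTE:
        return a
    lp_a, path_a, _ = a
    lp_b, path_b, _ = b
    if lp_a != lp_b:
        return a if lp_a > lp_b else b
    return a if len(path_a) <= len(path_b) else b

# The two candidate routes at R4 (the second one is assumed, see above).
via_r3 = (200, ("R3", "R1"), (1,))
via_r2 = (100, ("R2", "R1"), (0,))
chosen = bgp_merge(via_r2, via_r3)
```

With these inputs `chosen` is the route through R3, matching the stable state reported for R4.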
2 In other words, this network is using eBGP everywhere and does not use iBGP. But iBGP can be similarly modeled (§6).
Fig. 2. Example BGP network. (a) The network initial state; non-default transfer functions shown above edges
(R1,R3) and (R3,R4), (b) the stable state for a concrete execution and an abstract execution (orange).
field is again a vector of communities, but this time each community may be 0 (definitely absent), 1
(definitely present) or ∗ (unknown). These messages are significantly more compact because they
do not have localpref or path.
Figure 2(b) also presents the result of analyzing the network using these abstract messages. The
final state is shown in orange. This abstract analysis cannot model the effect of assigning the local
preference along the (R3,R4) edge. Consequently, it cannot determine whether the message from
R3 (with community c1) should be preferred over that from R2 (without c1). This uncertainty is
represented at R4 using the message (R,[*]), indicating it has a route to the destination which
may or may not carry community c1. When the analysis completes, all routers have selected some
message, and we can conclude that all routers have a path to the destination. If our goal is to
guarantee reachability, this abstract analysis for this network yields precise answers.
Of course, such abstract analysis will not always produce precise answers. Suppose the edge
between R4 and R5 drops messages without c1 attached. Then, for soundness, the message (R,[*])
must be dropped when it is transmitted between R4 and R5 because the ∗ indicates c1 may or may
not be present. Upon termination, the analysis will report that it cannot guarantee R5 can reach the
destination. However, R5 can actually reach the destination, because R4 prefers the message with
the community attached.
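The loss of precision in this scenario comes from how a filter must treat the unknown value ∗. A minimal sketch follows, assuming abstract messages are encoded as ("R", communities) pairs whose communities are the strings "0", "1" and "*"; the names `require_community` and `r4_to_r5` are hypothetical.

```python
NO_ROUTE = None  # abstract "no route"

def require_community(index):
    """Build an abstract transfer function for an edge that drops every
    message not known to carry the community at position `index`."""
    def transfer(msg):
        if msg is NO_ROUTE:
            return NO_ROUTE
        _marker, comms = msg
        # Only "1" (definitely present) passes. "*" might stand for an
        # absent community, so the message is dropped to stay sound.
        return msg if comms[index] == "1" else NO_ROUTE
    return transfer

r4_to_r5 = require_community(0)
dropped = r4_to_r5(("R", ("*",)))  # the uncertain message held by R4
kept = r4_to_r5(("R", ("1",)))     # what a more precise analysis would send
```

The first call yields no route, which is why the analysis cannot guarantee reachability for R5 even though the concrete network delivers the route.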
We have found that, in practice, policies that lose precision are uncommon in real networks
(§8), even for simple abstractions. Our example of path-sensitive routing, where messages along
different paths are modified differently in a way that matters downstream, is uncommon because it
decreases fault tolerance. A fault along the R1, R3, R4 path will unnecessarily disconnect R5 from
R1. Another reason such a policy is uncommon is that there is a simpler implementation, e.g.,
not exporting the route from R1 to R2 in the first place (R4 would learn the route (R,[1])).
Fig. 3. Example of a data center running eBGP where abstraction improves simulation performance.
(2) Abstraction reduces the number of states a router can inhabit. Hence, analysis requires fewer
iterations to converge. For example, BGP can explore O(2^n) paths before converging [Fabrikant
et al. 2011], but a reachability abstraction will converge in O(n) iterations (§6.3).
(3) Abstraction enables collapsing of concrete routing messages for many destinations into a single
abstract message, which allows for simultaneous, bulk processing of those destinations.
As an example of the last point, consider the datacenter in Figure 3. It runs BGP as before, though
it does not use communities and has multiple destinations. To model the data center, routing
messages are represented as finite maps that take destination subnets to routes, which are (local
preference, AS path) pairs. The figure shows the top-of-rack router T01 advertising the destination
prefix 10.0.0.0/31 (written as 0/31) with local preference 100 and the empty AS path.
A concrete simulation might proceed with T01 , T02 , and T03 advertising their prefixes to T11 .
When T11 next propagates the routes it learned from the three T0 switches to the spine, it will send
the route for each destination prefix separately, and receiving routers will process each message
separately. While this processing may appear negligible, as the size of a data center grows, so too do
the number of prefixes and processing edges. If the data center has a shortest path routing policy, d
destination prefixes, and e edges, then computing the final network state will take O(tde) operations,
where t is the time it takes to process a single destination at a router (typically constant)—each
prefix will propagate across all edges. For a fattree,3 a common topology in data centers, this results
in O(n^2 ∗ √n) operations (and memory) for n devices [Dimitropoulos and Riley 2006].
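One way to see where this bound comes from is to instantiate d and e for a standard k-ary fattree. The derivation below is a sketch under assumptions not stated in the text: the fattree is built from k-port switches in the usual three-layer arrangement, only switch-to-switch links are counted, and each top-of-rack switch originates one prefix.

```latex
\begin{align*}
n &= \tfrac{5}{4}k^2 \quad\text{(switches)}
  &&\Rightarrow\; k = \Theta(\sqrt{n}) \\
e &= \tfrac{1}{2}k^3 \quad\text{(switch-to-switch links)}
  &&\Rightarrow\; e = \Theta(n\sqrt{n}) \\
d &= \tfrac{1}{2}k^2 \quad\text{(one prefix per top-of-rack switch)}
  &&\Rightarrow\; d = \Theta(n) \\
O(t\,d\,e) &= O\!\left(t \cdot n \cdot n\sqrt{n}\right) = O\!\left(n^2\sqrt{n}\right)
  &&\text{for constant } t
\end{align*}
```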
Now consider an abstract analysis in Figure 3 (orange) that replaces the AS path with an integer
representing the path length and increments the length at each hop. When T01 , T02 , and T03
advertise their routes to T11 , each route advertisement is the same (length 1) and thus can be
combined as shown. When T11 advertises its routes further, recipients can process all destinations
simultaneously by simply bumping the path length to 2. If we process nodes in a breadth-first
manner (see more about processing order in §6.3), this analysis will complete in O(te) steps. All
routes from all destinations at top-of-rack devices will propagate up to the data center spine layer,
which learns of all destinations at once, and then propagates back down.
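The bulk processing described above can be illustrated with a small sketch. It assumes an abstract message is encoded as a dictionary from path length to the set of prefixes reachable at that length; the function names and the prefixes 1/31 and 2/31 attributed to T02 and T03 are hypothetical. Python's built-in sets are used for brevity, so this sketch does not reproduce the sublinear set operations that an efficient representation would provide.

```python
from functools import reduce

def advertise(prefixes):
    """Abstract message of an originating router: path length 0."""
    return {0: frozenset(prefixes)}

def transfer(msg):
    """One hop: bump every path length by 1. The prefix sets are passed
    along untouched, so the cost does not depend on how many prefixes
    each set holds."""
    return {length + 1: prefixes for length, prefixes in msg.items()}

def merge(a, b):
    """Keep each prefix at its shortest known path length, using only
    whole-set union and difference."""
    merged, seen = {}, frozenset()
    for length in sorted(set(a) | set(b)):
        here = (a.get(length, frozenset()) | b.get(length, frozenset())) - seen
        if here:
            merged[length] = here
            seen |= here
    return merged

# T11 hears one advertisement from each of the three top-of-rack routers.
t11 = reduce(merge, (transfer(advertise([p])) for p in ["0/31", "1/31", "2/31"]), {})
# Forwarding to the spine handles all three destinations in one step.
spine = transfer(t11)
```

After the merge, `t11` holds a single entry for length 1 that covers all three prefixes, and `spine` holds the same set at length 2.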
Although it may appear that the time to process messages (t) is now proportional to the number
of destinations in the set (d), there are many data structures for processing sets efficiently
(e.g., with union and intersection that are sublinear in d when the sets are mostly disjoint or
overlapping [Liljenzin 2013]). For example, for fattrees, the complexity appears to be closer to
O(n ∗ √n) in practice since destinations advertised by different top-of-rack routers are usually disjoint.
Indeed, recall Figure 1 from the introduction. It showed the scaling trends of the Batfish [Fogel
et al. 2015] concrete simulator compared to the simple abstract reachability analysis on variable
sized fattrees using an efficient set representation (§6.2). Hence, the empirical results back up our
complexity analysis: As topology size increases, the time for concrete vs abstract simulation diverges
due to asymptotic differences in complexity.
3 Fattrees generalize ordinary tree-shaped graphs such that, instead of just one link from child to parent, there are multiple
links from a child to different parents. The additional links allow networks to tolerate some number of faults before
becoming disconnected. Fattree topologies are also called Clos networks, after Charles Clos, who formalized them in the
1950s [Al-Fares et al. 2008; Clos Network 2019].
Fig. 4. Example of a new BGP hijacking analysis. Dashed lines are iBGP connections.
behavior. The figure shows a single iteration of the analysis focused on router R3. The peer filter is
applied to the external route, resulting in a new route that reflects the dropped prefix: [1/31 ↦ ∞,
* ↦ {E}]. When the internal routes from R1 and R2 and those from the external peer are merged,
the result is the larger map shown by R3. The resulting state indicates that the 1/31 prefix cannot
be hijacked, while it may be possible for peers to hijack the 0/31 and 2/31 prefixes. The analysis
would continue until a fixed point is reached.
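The single iteration at R3 can be mimicked with a small sketch. Much of it is assumed for illustration: abstract routes are encoded as maps from prefix to a set of possible origins, the label "E" is read as an external origin, "I" is a hypothetical label for an internal one, an empty set plays the role of ∞, the key "*" covers every prefix not listed, and the internal routes are assumed to cover 0/31, 1/31 and 2/31.

```python
DEFAULT = "*"  # covers every prefix that has no explicit entry

def lookup(route, prefix):
    """Possible origins of `prefix`, falling back to the default entry."""
    return route.get(prefix, route.get(DEFAULT, frozenset()))

def drop_prefix(route, prefix):
    """Peer filter: `prefix` is not accepted from this neighbor."""
    filtered = dict(route)
    filtered[prefix] = frozenset()  # empty set stands for ∞ (no route)
    return filtered

def merge(a, b):
    """Pointwise union of the possible origins of each prefix."""
    keys = set(a) | set(b) | {DEFAULT}
    return {k: lookup(a, k) | lookup(b, k) for k in keys}

def may_be_hijacked(route, prefix):
    return "E" in lookup(route, prefix)

external = drop_prefix({DEFAULT: frozenset({"E"})}, "1/31")
internal = {p: frozenset({"I"}) for p in ("0/31", "1/31", "2/31")}  # assumed
r3 = merge(internal, external)
```

Under these assumptions `may_be_hijacked(r3, "1/31")` is false while the 0/31 and 2/31 entries both contain "E", in line with the conclusion drawn above.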
3 NETWORK MODEL
To reason about the correctness of abstract analyses, we must define a network model and its
routing semantics. To this end, we first briefly review routing algebras [Sobrinho 2005], an algebraic
model of distributed routing protocols. We then take inspiration from previous work [Daggitt et al.
2018b] to define an asynchronous routing semantics for these algebras. In §3.3, we build on these
technical definitions and formally connect abstract routing algebras to their concrete counterparts.
In other words, routes are ordered lexicographically, first by local preference and second by the
length of their paths.
In BGP, users can customize their transfer functions using BGP configurations. Hence, each
transfer function f ∈ F (one per edge) applies user-defined policy before appending the current
hop to the BGP path. The initial route (0) is the route (100,[],[0,0,...]) with the empty
path and all communities set to 0.
Property                     Definition
(1) ⊕, ⊕♯ commutative        c ⊕ d = d ⊕ c
(2) ⊕, ⊕♯ associative        c ⊕ (d ⊕ e) = (c ⊕ d) ⊕ e
(3) ⊕♯ monotone              a ⊑♯ c ∧ b ⊑♯ d ⟹ a ⊕♯ b ⊑♯ c ⊕♯ d
(4) ⊕, ⊕♯ sound for α        α(c ⊕ d) ⊑♯ α(c) ⊕♯ α(d)
(5) F, F♯ sound for α        α(f(c)) ⊑♯ f♯(α(c))
(6) F♯ monotone              a ⊑♯ b ⟹ f♯(a) ⊑♯ f♯(b)
are merged pointwise with: (C1 ⊕♯ C2)_i = (C1)_i ⊕♯ (C2)_i and individual communities are merged as:
    c1 ⊕♯ c2 = c1   if c1 = c2
    c1 ⊕♯ c2 = ∗    otherwise
Merging (R,[0,1]) ⊕♯ (R,[1,1]) results in (R,[*,1]). Now, we can relate this abstract routing
algebra to the concrete BGP algebra by defining a suitable abstraction function α. We define
α(∞) = ∞♯
α((lp, path, C)) = (R, C)
The abstraction maps ∞ to ∞♯, replaces the BGP local preference value and path with a reachability
marker R, and keeps the communities C intact. The partial order ⊑♯ over S♯ is defined as follows.
(R, C1) ⊑♯ (R, C2) ⇐⇒ C1 ⊕♯ C2 = C2
The abstraction of no route (∞♯) is incomparable to the abstraction of any valid route (R, C).
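The definitions above are small enough to check by brute force. The sketch below encodes abstract routes as ("R", communities) pairs with communities drawn from "0", "1" and "*", and enumerates all routes with two communities; the function names are hypothetical, and the no-route element ∞♯ is left out because its merge is not defined in this passage.

```python
from itertools import product

def merge_comm(c1, c2):
    """Merge of individual communities: agree, or become unknown."""
    return c1 if c1 == c2 else "*"

def merge_abs(a, b):
    """(R, C1) ⊕♯ (R, C2): merge the community vectors pointwise."""
    (_, ca), (_, cb) = a, b
    return ("R", tuple(merge_comm(x, y) for x, y in zip(ca, cb)))

def leq_abs(a, b):
    """a ⊑♯ b  iff  a ⊕♯ b = b."""
    return merge_abs(a, b) == b

def alpha(route):
    """Abstraction of a concrete (lp, path, C) route with 0/1 communities."""
    _lp, _path, comms = route
    return ("R", tuple(str(c) for c in comms))

example = merge_abs(("R", ("0", "1")), ("R", ("1", "1")))

# Exhaustive checks over every abstract route with two communities.
routes = [("R", c) for c in product("01*", repeat=2)]
commutative = all(merge_abs(a, b) == merge_abs(b, a)
                  for a, b in product(routes, repeat=2))
associative = all(merge_abs(a, merge_abs(b, c)) == merge_abs(merge_abs(a, b), c)
                  for a, b, c in product(routes, repeat=3))
monotone = all(leq_abs(merge_abs(a, b), merge_abs(c, d))
               for a, b, c, d in product(routes, repeat=4)
               if leq_abs(a, c) and leq_abs(b, d))
```

The value of `example` reproduces the merge worked out in the text, and the three flags correspond to commutativity, associativity and monotonicity of ⊕♯ on this finite domain. Such a check is evidence for small instances only and does not replace a proof.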
4 SOUNDNESS
In this section, we prove two theorems concerning the soundness of abstract routing algebras with
respect to their concrete counterparts. These soundness results depend upon a set of algebraic
conditions laid out in §4.1.
Our first theorem states that if one fixes a concurrent schedule s = (τ , ω) then an abstract
execution using schedule s is a sound overapproximation of a concrete execution with the same
schedule s. Using this theorem, we know a single execution of the abstract algebra is sound with
respect to an execution of the concrete algebra that uses the same schedule, but we do not know
if the execution of the abstract algebra is sound with respect to other executions of the concrete
algebra. Hence, if an abstract algebra has many solutions, we must execute it many times (at least
once per solution) in order to cover all possible concrete solutions. Moreover, because we know
of no method that can predict which interleavings of messages will give rise to new and different
solutions, we may have to consider all possible interleavings, which is prohibitively expensive. In
general, abstract interpretation of highly concurrent systems (e.g., with many threads) can lead to a
combinatorial explosion in complexity [Monat and Miné 2017].
Fortunately, in practice, the situation is promising. Routing algebras, unlike most concurrent
systems, typically have a single unique solution. For instance, Lopes and Rybalchenko observed
that Microsoft’s networks have sufficient monotonicity properties to imply convergence to a unique
solution [Lopes and Rybalchenko 2019]. Likewise, Gao and Rexford famously proved that the
internet’s economic structure disincentivizes network operators from configuring their networks to
generate unstable behavior [Gao and Rexford 2000]. Last, our discussions with network operators
have not turned up situations in which these operators fear non-convergence or multiple solutions.
Conditions for convergence and uniqueness of solutions are well-understood and have been
studied extensively [Daggitt et al. 2018a; Sobrinho 2005]. Hence, even in the unlikely case that the
concrete algebra has multiple solutions, we can craft abstract routing algebras that (provably) have a
single solution. Consequently, any single abstract execution will suffice to soundly overapproximate
all concrete executions. The specific abstractions we suggest in this paper have that property (see
§6.1) as they abstract features such as the BGP local preference that can lead to multiple solutions.
Hence, our second theorem states the strong property that one wants and that applies in practice:
If the abstract algebra has a unique solution, then it is sound with respect to all concrete simulations
(even if the concrete system has many solutions). Consequently, a single run of the abstract simulator,
with any concurrent schedule, suffices.
We state and prove our two theorems in §4.2 from first principles. In §4.3, we discuss the
relationship to the standard Galois connection construction.
Proof. By strong induction on the time t. See the Appendix for details. □
Theorem 4.1 shows that the abstract analysis is sound for any fixed schedule. If the abstract
algebra is guaranteed to converge to a unique solution, then we can show that its answer is sound
for any possible concrete schedule.
Definition 4.2. We say algebra A with semantics σ is uniquely converging for topology (V, E), and
states (I, X) if there is a stable state X∗ such that for all schedules (τ, ω), there exists a convergence
time t such that for all k, σ^(t+k)(π) = X∗ for π = (V, E, I, X, τ, ω).
Theorem 4.2. Consider algebra A with semantics σ, and abstract algebra A♯ with semantics σ♯,
related by sound abstraction (α, ⊑♯). If A♯ is uniquely converging for topology (V, E) and states
(α(I), α(X)), then for all schedules (τ1, ω1) and (τ2, ω2), there exists a time t such that for all k,
α(σ^(t+k)(π)) ⊑♯ σ♯^(t+k)(π♯) for π = (V, E, I, X, τ1, ω1), and π♯ = (V, E, α(I), α(X), τ2, ω2).
Fig. 6. The concrete (left) and abstract (right) routing state lattices from Figure 2. Concrete route s1 =
(200, [R3, R1], [0]) and s2 = (200, [R3, R1], [1]).
There are a few points worth noting. First, unlike previous work [Daggitt et al. 2018b], this
theorem does not identify conditions under which a routing algebra converges uniquely. But if
an abstract algebra does, then any single abstract execution is sound with respect to all concrete
executions—a powerful result. Second, a concrete algebra need not converge uniquely. For instance,
a BGP configuration may not converge uniquely, perhaps due to inconsistent use of local preference,
but our abstractions of BGP (those that elide local preference, for instance) could!
The theorems above are general in that they consider abstract/concrete executions for a fixed
schedule and for all schedules, respectively. They can be applied in various settings for control
plane analysis, such as:
• For concrete simulation of control planes as in Batfish [Fogel et al. 2015], Theorem 4.1 guarantees
that an abstract simulation is sound with respect to a concrete simulation.
• In addition, if an abstract algebra converges uniquely, then a single abstract simulation performs
verification, since Theorem 4.2 ensures soundness over all schedules.
• For control plane verification, our abstractions can be used by a tool like Minesweeper [Beckett
et al. 2017]. Since Minesweeper implicitly searches over stable paths due to all possible schedules,
our abstractions could improve performance while guaranteeing soundness (i.e., no bug in an
abstract model implies no bug in the concrete model). These can also be used in counterexample
guided abstraction refinement [Clarke et al. 2000].
• Verification based on model checking (e.g., Plankton [Prabhu et al. 2017]) can check invariants
over transient states in a network, i.e., before the network reaches convergence. Our abstractions
would enable checking executions over abstract transient states. Again, uniquely converging
abstract algebras would ensure soundness over all possible schedules.
concrete algebra (σ ) is monotone, then the standard abstract interpretation theory would apply.
Figure 6 shows an example of the concrete and abstract lattices corresponding to the algebra used
in the example from Figure 2 (restricted to a network with 2 nodes for simplicity).
There are several notable differences between this “more standard” formulation and that of
§4. First, unlike in abstract interpretation of traditional programs, where a collecting semantics
overapproximates the set of concrete values reachable at each program location by merging results
from different program paths, in routing algebras there is no equivalent notion of branching. As a
result, the full machinery of abstract interpretation is not needed – we do not require a complete
lattice with a join (⊔) defined. In particular, one will never obtain values that extend above the
second bottom level of the concrete lattice for any concrete execution. Second, the formulation
of abstractions in §4 as a routing algebra makes it easy to leverage existing results from the
routing algebra literature on unique convergence to obtain soundness for all possible asynchronous
schedules without having to account for asynchrony in the abstraction itself (Theorem 4.2). This
both simplifies the theory and improves performance since the implementation can fix any particular
schedule. Further, defining an abstraction as an abstract routing algebra has the practical advantage
of enabling reuse of existing routing simulation engines, which we exploit in §7.
5 ABSTRACTIONS
While we showed example abstractions earlier, we now showcase a rich collection of routing
abstractions. Several of the abstractions are similar to ideas in previous work, while many others
are completely new in the context of network verification. Figure 7 shows an overview. The columns
denote: (1) the object being abstracted, (2) the idea behind the abstraction, (3) an example of a
concrete instance of the object, and finally, (4) the corresponding abstract instance for the concrete
example. The combinators in the last two rows help compose a richer abstraction from simpler
ones, where the soundness claims carry through.
Path abstractions. Rather than keeping only information about the existence of a path (R), one
can keep any number of other pieces of information with tradeoffs in terms of representation size
and precision. Several such examples are listed in Figure 7 (Rows 2-5). One example is to abstract a
path as its length. For instance, BGP carries around a path of AS numbers that a route has traversed.
In this case, the abstraction function is α(path) = |path|.
Alternatively, one could track other information about the path such as the potential origin of
the path or the particular external neighbors through which the route might have been learned.
More generally, one could track a set of waypoints of interest that the route has gone through.
Another interesting abstraction is to encode a path using a regular expression. A regular
expression can overapproximate the set (or sequence) of elements in the path. In Row 5, the regex
"(1|2|3)(0)(0|1)" covers the set {100,200,301,101,201}; because it also matches 300, it denotes a
superset of the actual concrete values. Regular expressions afford a great deal of flexibility in the
granularity at which path elements are tracked. For instance, one can use regexes to enumerate all
possible values in a path.
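As a small illustration of how a regular expression over-approximates a set of paths, the following Python sketch enumerates the language of the Row 5 regex and compares it with the set listed in the text. Treating each path as a three-character string over the digits 0 to 3, and taking the listed set to be the concrete values, are assumptions made only for this example.

```python
import itertools
import re

# Abstract element: the regex from Row 5, matched against whole paths.
ABSTRACT = re.compile(r"(1|2|3)(0)(0|1)")

# Set of paths listed in the text, assumed here to be the concrete values.
concrete = {"100", "200", "301", "101", "201"}

# Concretization: every three-digit string over {0,1,2,3} the regex accepts.
language = {
    "".join(digits)
    for digits in itertools.product("0123", repeat=3)
    if ABSTRACT.fullmatch("".join(digits))
}

# Soundness: every concrete path is covered by the abstract element.
covered = concrete <= language

# Imprecision: paths the regex admits that are not concrete paths.
extra = sorted(language - concrete)

print(covered, extra)
```

In this toy setting the regex covers every listed path and admits one additional path, which is the precision lost in exchange for a compact representation.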
Environment. The BGP hijacking example from Figure 4 can be captured with a simple
internal/external environment abstraction of the BGP AS path (Row 6).
Tags and bitfields. Protocols often keep a collection of tags or bit fields (e.g., BGP communities).
Given a finite set of tags or bits where each value is either 1 or 0, a simple way to abstract these
tags is to use a ternary representation (Row 7), where each element is abstracted as either 1, 0, or *.
To retain more information when representing concrete tag sets, one can use a powerset
abstraction [Filé and Ranzato 1999] (Row 8), which explicitly tracks the set of possible tag sets. The
powerset abstraction gives more precision at the cost of representation space. For example, suppose
two tags x and y are only ever set together depending on which route was chosen, and policy
downstream applies an action only if the two values are the same. The powerset abstraction retains
the set {[0,0],[1,1]} and loses less precision downstream. In contrast, the ternary representation
would be [*,*].
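The difference between the two tag abstractions can be sketched in a few lines of Python. The function names and the downstream policy below are hypothetical and chosen only to mirror the x/y example above; they are not taken from the ShapeShifter implementation.

```python
def ternary_join(tag_sets):
    """Abstract a set of tag vectors position-wise as 0, 1, or '*'."""
    columns = zip(*tag_sets)
    return tuple(col[0] if len(set(col)) == 1 else "*" for col in columns)


def ternary_concretize(abstract):
    """All concrete tag vectors described by a ternary vector."""
    result = {()}
    for symbol in abstract:
        choices = (0, 1) if symbol == "*" else (symbol,)
        result = {prefix + (bit,) for prefix in result for bit in choices}
    return result


def policy_fires(tags):
    """Hypothetical downstream policy: act only if x and y are equal."""
    x, y = tags
    return x == y


# Tags x and y are only ever set together.
concrete = {(0, 0), (1, 1)}

powerset_abs = frozenset(concrete)    # tracks the exact set of tag vectors
ternary_abs = ternary_join(concrete)  # position-wise summary

# Powerset: the policy fires on every tag vector the abstraction describes.
powerset_always = all(policy_fires(t) for t in powerset_abs)

# Ternary: the summary also describes (0, 1) and (1, 0), so the analysis
# can no longer conclude that the policy always fires.
ternary_always = all(policy_fires(t) for t in ternary_concretize(ternary_abs))
```

Here the powerset element lets the analysis conclude that the policy always applies, while the ternary element does not, at the cost of storing a set of vectors rather than a single one.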
BGP Local Preference. The BGP local preference is a policy attribute used to hard code preferences
for certain routes over others. Local preference is responsible for a number of problems in BGP
such as route oscillation [Griffin et al. 2002; Varadhan et al. 1996]. However, BGP preferences are
often used in a safe way by ordering routes between customers, peers, and providers. One way
to abstract local preference would be to translate the local preference value directly into a
smaller, finite set of values corresponding to these preferences (Row 9). For instance, it is common
for wide-area networks to assign local preferences in ranges based on cost, e.g., values between 100
and 200 might be used for peers (P), while 201 to 300 is used for providers (R).
Decision Variables. Alternatively, one can drop values used only in a protocol’s decision process
such as local preference, origin-type, and others (Row 10). It may not be important what route is
chosen so long as the destination can be reached. Given a route that is a tuple of values, one can
project away one or more of the fields.
Destination IP. The destination IP address (Rows 11, 12) can be viewed as a collection of bits, and
thus any of the abstractions for collections of bits can be used for the IP address. For instance, one can
use a ternary representation or a powerset construction. Since IP addresses are often matched by
wildcards on the least significant bits, another useful abstraction may be an interval abstraction
(Row 12), which abstracts an IP address as a line segment.
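A minimal sketch of the interval idea, using Python's standard `ipaddress` module: a prefix becomes the closed integer interval of addresses it covers. The `interval_join` helper is an assumption added for illustration (the text does not specify how intervals are combined); it shows where an interval domain can lose precision.

```python
import ipaddress


def prefix_to_interval(prefix):
    """Abstract an IPv4 prefix as the closed integer interval it covers."""
    net = ipaddress.ip_network(prefix)
    return int(net.network_address), int(net.broadcast_address)


def interval_join(a, b):
    """Smallest interval containing both inputs (may add addresses)."""
    return min(a[0], b[0]), max(a[1], b[1])


def contains(interval, address):
    """Check whether a dotted-quad address falls inside an interval."""
    lo, hi = interval
    return lo <= int(ipaddress.ip_address(address)) <= hi


a = prefix_to_interval("10.0.0.0/24")
b = prefix_to_interval("10.0.2.0/24")
joined = interval_join(a, b)

# 10.0.1.5 lies in neither prefix, yet it falls inside the joined interval.
gap_in_a = contains(a, "10.0.1.5")
gap_in_b = contains(b, "10.0.1.5")
gap_in_joined = contains(joined, "10.0.1.5")
```

A single prefix is represented exactly, whereas joining two non-adjacent prefixes yields a segment that also covers the addresses between them.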
Edge abstraction. The BGP protocol maintains a set of concrete paths when used in a multipath
setting, for example in production data centers [Lapukhov et al. 2015]. This can lead to a large
amount of redundancy in routing tables when many similar, but overlapping, paths exist to the
same destination. It is possible to abstract the set of paths, which may contain overlap, into a set of
edges (Row 13) thus eliminating the redundancy at the cost of forgetting the exact set of paths.
Keeping the set of edges an advertisement may traverse can also be useful because it provides an
abstract view of the “flow” of control plane information in the network. Such information can be
used, for instance, to quickly identify the degree of failure needed to disconnect a source from a
destination—an insight leveraged in ARC [Gember-Jacobson et al. 2016].
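The trade-off in the edge abstraction can be made concrete with a short Python sketch. The topology and node names are invented for this example, and the concretization shown (all simple paths over the stored edges) is one plausible reading rather than the paper's formal definition.

```python
def edges_of(path):
    """Directed edges traversed by a single path."""
    return set(zip(path, path[1:]))


def abstract_paths(paths):
    """Edge abstraction: forget the paths, keep the union of their edges."""
    result = set()
    for path in paths:
        result |= edges_of(path)
    return result


def concretize(edges, src, dst):
    """All simple paths from src to dst that use only the given edges."""
    found = set()

    def walk(node, path):
        if node == dst:
            found.add(tuple(path))
            return
        for (u, v) in edges:
            if u == node and v not in path:
                walk(v, path + [v])

    walk(src, [src])
    return found


# Two concrete multipath routes that share the middle node M.
paths = {("S", "A", "M", "B", "T"), ("S", "C", "M", "D", "T")}

edges = abstract_paths(paths)
described = concretize(edges, "S", "T")
```

The eight stored edges describe both original paths, plus paths such as S, A, M, D, T that were never advertised, which is the "forgetting" that the abstraction accepts in exchange for removing redundancy.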
on tags applied upstream, and discarding community information could lead to high precision
loss. We found that a simple ternary abstraction often suffices but we use a powerset abstraction
since community sets can be represented with little additional cost using BDDs (§6.2).
• Discard all decision variables. Since we do not keep track of what path is used, we decided to
discard all route fields related to protocol decision processes. This includes fields such as BGP
local preference, multi-exit discriminator, path length, administrative distance, and so on.
• Remember the protocol. Routers can run more than one protocol, and determining which
protocol is used for forwarding (i.e., which route is stored in the forwarding table, or FIB) is
complex. We maintain a set of possible chosen protocols: e.g., {CONNECTED, STATIC, OSPF,
BGP, IBGP}. We found that losing information about exactly which protocol was used to learn a
route is often fine in the context of reachability, since it may not affect the presence of a route
(e.g., OSPF vs. BGP). On the other hand, it is possible to lose precision since one may not be able
to redistribute a route from one protocol to another if the protocol of the chosen route stored in
the FIB is not known.
• Map abstraction. We use the Map abstraction to soundly lift the aforementioned routes to be
functions from destination prefix to abstract route.
An example abstract routing message in the final reachability domain would be [168/8 ↦
([0,*,0], {R1}, {BGP})], which denotes that the second community could be attached or not
on a BGP route originating from R1.
Fig. 8. Network size vs. analysis time (left) and peak memory usage (right), for different abstractions.
starts at 32 and ends at 0. From these routes, we then existentially quantify away the prefix length
and the last 32 − k bits of the prefix. This converts the prefix in the route to a symbolic set-of-packets
representation, and we accumulate this set of packets matched so far. At each step, we must then
remove any values in this set that are already matched by any route with prefix length k′ > k. This
is done using set difference (BDD negation). The final result is a new BDD representing a map from
packet to route. Determining the set of packet headers that will reach different locations is then
simple since we stored the origin as part of the route. Interestingly, this allows the analysis to reuse
almost all the work done in executing the control plane to give a sound answer to the all pairs, all
packets data plane reachability problem (i.e., if the analysis says that a packet is “reachable”, then it
really is reachable).
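The longest-prefix-match step described above can be sketched with plain Python sets standing in for BDDs. The 4-bit address width, the route labels, and the function names are assumptions chosen to keep the example small; the text works with 32-bit prefixes and symbolic sets.

```python
WIDTH = 4  # toy address width; the text uses 32-bit IPv4 prefixes


def packets_of(prefix, length):
    """Addresses whose first `length` bits equal those of `prefix`."""
    shift = WIDTH - length
    base = (prefix >> shift) << shift
    return set(range(base, base + (1 << shift)))


def packet_to_route(routes):
    """Build a packet-to-route map, visiting longer prefixes first.

    `routes` maps (prefix, length) to a route label.
    """
    matched = set()  # packets already claimed by a longer prefix
    result = {}
    for length in range(WIDTH, -1, -1):
        for (prefix, plen), route in routes.items():
            if plen != length:
                continue
            # Set difference removes packets matched by longer prefixes.
            fresh = packets_of(prefix, plen) - matched
            result.update((packet, route) for packet in fresh)
            matched |= fresh
    return result


routes = {
    (0b0000, 0): "default",
    (0b1000, 1): "via R1",
    (0b1100, 2): "via R2",
}
table = packet_to_route(routes)
```

In this example, address 13 resolves to the length-2 route, address 9 to the length-1 route, and address 3 to the default route, matching longest-prefix-match semantics.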
Access control lists. It is possible to configure access control lists (ACLs) on individual router
interfaces that can block different types of packets (e.g., do not allow packets with destination port
53 unless the protocol is UDP). While ACLs do not affect control plane reachability, they do affect
data plane reachability. To address this, when checking data plane reachability, we track the set of
packet headers that might be blocked by ACLs. When a set of routes is propagated from one router
to another, if there are ACLs configured between the devices, then we convert the set of destinations
to a header space (again using BDDs) and intersect it with the header space represented by the
ACLs (also as BDDs). We then accumulate the set of potentially blocked packets due to ACLs.
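The ACL handling can be illustrated in the same set-based style. This is only a sketch: headers are modeled as (destination, port, protocol) tuples, the small port and protocol universes are invented, and the ACL is the port 53 example from the text, whereas the implementation described here uses BDDs.

```python
PORTS = {53, 80}             # hypothetical universe of destination ports
PROTOCOLS = {"UDP", "TCP"}   # hypothetical universe of protocols


def blocked_by_acl(header):
    """Example ACL from the text: drop port 53 unless the protocol is UDP."""
    _dst, port, proto = header
    return port == 53 and proto != "UDP"


def propagate(destinations, acl, blocked_so_far):
    """Accumulate headers that may be blocked when routes cross an ACL."""
    # Convert the set of destinations into a header space.
    space = {
        (dst, port, proto)
        for dst in destinations
        for port in PORTS
        for proto in PROTOCOLS
    }
    # Intersect with the headers that the ACL blocks.
    newly_blocked = {header for header in space if acl(header)}
    return blocked_so_far | newly_blocked


blocked = propagate({"10.0.0.1"}, blocked_by_acl, set())
```

With these inputs, the only header recorded as potentially blocked is the TCP packet to port 53; UDP traffic to the same port and all traffic to port 80 remain unaffected.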
7 IMPLEMENTATION
We implemented ShapeShifter as a simulation engine following the network semantics laid out in §3.
ShapeShifter uses Batfish [Fogel et al. 2015] to parse network configurations into a vendor-neutral
data model, and then operates over this model. The implementation uses BDDs to represent, batch,
and process a set of routing messages at a time (§6.2). ShapeShifter currently implements several
reachability abstractions (§8.2) with varying degrees of precision.
8 EVALUATION
The primary goals of our experiments are to illustrate tradeoffs between performance and precision
for abstract routing analysis and to find abstractions that work well for a broad class of industrial
networks. Our goal is not an exhaustive exploration of the space of all possible abstractions. More
specifically, we are interested in (1) how analysis time scales with network size, (2) whether the
precision of our abstractions suffices for real networks, and (3) whether abstract analyses can scale
well for real networks.
Fig. 9. CDF of analysis precision (left) and abstraction speedup (right) for production networks.
Fig. 10. Abs_ROC (blue) and Abs_DP (orange) performance distributions for production networks. Networks
are sorted by size (total configuration size).
Batfish routes that were obtained through the abstract analysis. The left graph of Figure 9 shows a
CDF of the abstract analysis precision for the production networks. We find that for about 95% of
the production networks, the abstract analysis computes the data plane with 100% accuracy.
Manual inspection of the remaining 5% of the networks for which we lose precision revealed that
the main cause is the use of BGP regular expressions, e.g., DC border devices that filter for WAN
AS numbers from internal DC routes. Since the abstraction does not track the AS path, it must
drop all BGP routes for soundness. Such imprecision could be addressed by keeping additional
information in the abstraction, e.g., by tracking if a WAN device has been encountered, or using a
regex domain as in Figure 7.
The right graph of Figure 9 shows a CDF of the speedup for the abstract analysis over Batfish
for the production networks. For most networks, the speedup ranges between 1 and 2 orders of
magnitude. However, we observe that the speedup over Batfish grows rapidly with network size.
Finally, to test the absolute scalability of the abstract analysis, we run the Abs_ROC and Abs_DP
analyses on the collection of production networks. Figure 10 shows the analysis time for all networks,
where networks are sorted from smallest to largest in terms of total lines of configuration along
the x-axis. The largest network is over half a million lines of configuration. The graph shows the
time both for control plane reachability (blue) as well as all pairs, all packets data plane reachability
(orange). For all networks studied, the abstract analysis completes in under 40 seconds.
9 RELATED WORK
Abstraction in program analysis. The theory of abstract interpretation [Cousot and Cousot
1977] has been widely used in program verification, with successful applications in industry [Ball
et al. 2001; Blanchet et al. 2003]. Different abstract domains, such as intervals [Cousot and Cousot
1976], octagons [Miné 2001], and polyhedra [Cousot and Halbwachs 1978], provide tradeoffs
between precision and scalability. We apply ideas from abstract interpretation to network control
planes and formalize a theoretical abstraction framework via routing algebras [Griffin and Sobrinho
2005; Sobrinho 2005]. In some ways, networks are simpler than software programs since there are
no loops and thus no need for widening techniques. However, one must still address the challenges
of designing and implementing appropriate abstractions based on domain-specific characteristics.
Abstraction in network analysis. The use of abstraction in network analysis is also widespread.
Plotkin et al. 2016 and Beckett et al. 2018 use abstraction to factor out topological symmetries and
speed up data plane and control plane analysis, respectively. However, these works focus on lossless
topological abstractions. We focus on routing message abstraction and provide a continuum of
possibly lossy abstractions that trade off scalability and precision for any network.
Alpernas et al. 2018 explored the use of abstract interpretation to analyze isolation properties of
stateful data planes. Their stateful models, defined in a relational language called AMDL, represent
middleboxes, which include devices such as load balancers, proxies and stateful firewalls. These
devices are data plane devices, which forward packets, but record the sets of packets they have
seen so far (therein lies the state) and potentially alter their forwarding decisions based on past
packets seen. For instance, a stateful firewall may observe a trusted party A sending a message
to untrusted party B and henceforth allow B to send messages back to A. Middleboxes usually
operate independently and hence such computations are quite different from the distributed control
plane protocols that we model. The kinds of abstractions used by Alpernas et al. include (1) abstracting
away the order in which packets arrive, (2) abstracting away the number of packets, (3) abstracting away
the correlation between states of different middleboxes, and (4) abstracting away the correlation
between states for different packets within a middlebox. None of these abstractions are similar to
our message abstractions and they generally do not make sense in our context.
A number of prior control plane analysis tools use a form of message abstraction (Figure 11).
ARC [Gember-Jacobson et al. 2016] abstracts the network by replacing traditional routing messages
with a single, global “path cost” value. ARC analyzes graphs that capture usable edges, somewhat
like the “edge set” abstraction of Figure 7 (Row 13). Bagpipe [Weitz et al. 2016] formalizes BGP’s
semantics and uses an SMT encoding to check control plane properties after abstracting away
the underlying IGP and abstracting BGP regex patterns as booleans. Minesweeper [Beckett et al.
2017] also encodes the network using SMT and abstracts the BGP AS path into a cartesian product
of several abstractions such as the path length, a loop flag, and more. Additionally, rather than
model all community flags that a neighbor can send, Minesweeper abstracts the set of possible
external communities into a single abstract value (i.e., “some/none external communities attached”).
ERA [Fayaz et al. 2016] and FastPlane [Lopes and Rybalchenko 2019] perform no explicit abstraction
but scale by assuming that the network lacks certain complex features. They can be viewed as
using an abstraction that is precise only for such networks and imprecise otherwise. All the tools
above use a fixed abstraction that works well for some networks and fails for others. In contrast, we
develop a theory and practical framework that can accommodate a variety of different abstractions.
11 CONCLUSION
We provide the first formal and systematic account of the role of sound abstraction (with one-sided
error) in the analysis of network control planes. We define sufficient conditions for the construction
of new sound analyses, and we explore the resulting tradeoff between performance and precision.
The notion of abstraction is general, and we show that it is capable of capturing many ideas from
the existing verification literature as well as enabling new kinds of analyses. Guided by these ideas,
we design a new reachability analysis tool called ShapeShifter that can compute all reachable
packets between all pairs of devices, orders of magnitude faster than existing network simulators.
A SOUNDNESS PROOFS
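Theorem 4.1. Consider algebra $A$ with semantics $\sigma$ and abstract algebra $A^\sharp$ with semantics $\sigma^\sharp$,
where $A$ and $A^\sharp$ are related by sound abstraction $(\alpha, \sqsubseteq^\sharp)$. For all times $t$, concrete instantiations
$\pi = (V, E, I, X, \tau, \omega)$, and abstract instantiations $\pi^\sharp = (V, E, \alpha(I), \alpha(X), \tau, \omega)$, we have
$\alpha(\sigma^{t}(\pi)) \sqsubseteq^\sharp \sigma^{\sharp\,t}(\pi^\sharp)$.
Proof. By strong induction on the time $t$.
case ($t = 0$): Evaluate $\alpha(\sigma^{0}(\pi)) = \alpha(X)$ and also $\sigma^{\sharp\,0}(\pi^\sharp) = \alpha(X)$, and $\alpha(X) \sqsubseteq^\sharp \alpha(X)$ since $\sqsubseteq^\sharp$ is
reflexive.
case $t$: To show $\alpha(\sigma^{t}(\pi)) \sqsubseteq^\sharp \sigma^{\sharp\,t}(\pi^\sharp)$, we must show $\alpha(\sigma^{t}(\pi))_i \sqsubseteq^\sharp \sigma^{\sharp\,t}(\pi^\sharp)_i$ for any $i$, since these
are routing states (vectors). There are two cases to consider.
(1) The first is that $i \notin \tau(t)$. In this case: $\alpha(\sigma^{t}(\pi))_i = \alpha(\sigma^{t}(\pi)_i) = \alpha(\sigma^{t-1}(\pi)_i) = \alpha(\sigma^{t-1}(\pi))_i$. From
the inductive hypothesis, we know $\alpha(\sigma^{t-1}(\pi))_i \sqsubseteq^\sharp \sigma^{\sharp\,t-1}(\pi^\sharp)_i$, which is exactly what we obtain from
evaluating $\sigma^{\sharp\,t}(\pi^\sharp)_i$ under the same schedule.
(2) The second case is where $i \in \tau(t)$. In this case, we have
\[ \alpha(\sigma^{t}(\pi)_i) = \alpha\Bigl( \bigoplus_{(i,j) \in E} f_{ij}(\sigma^{\omega(t,i,j)}(\pi)_j) \;\oplus\; I_i \Bigr) \]
We repeatedly apply the soundness condition $\alpha(a \oplus b) \sqsubseteq^\sharp \alpha(a) \oplus^\sharp \alpha(b)$, together with commutativity
and associativity of $\oplus$ and $\oplus^\sharp$, to show a chain of inequalities. Consider:
\[ \alpha\Bigl( \bigoplus_{(i,j) \in E} f_{ij}(\sigma^{\omega(t,i,j)}(\pi)_j) \;\oplus\; I_i \Bigr) \]
By repeatedly applying soundness of $\oplus$ for each neighbor $j$, we get that this is less than ($\sqsubseteq^\sharp$):
\[ \bigoplus^{\sharp}_{(i,j) \in E} \alpha(f_{ij}(\sigma^{\omega(t,i,j)}(\pi)_j)) \;\oplus^\sharp\; \alpha(I_i) \]
Since $\alpha(I_i)$ is $\alpha(I)_i$ by definition, this is just:
\[ \bigoplus^{\sharp}_{(i,j) \in E} \alpha(f_{ij}(\sigma^{\omega(t,i,j)}(\pi)_j)) \;\oplus^\sharp\; \alpha(I)_i \]
Again, we want to show this is less than ($\sqsubseteq^\sharp$)
\[ \sigma^{\sharp\,t}(\pi^\sharp)_i = \bigoplus^{\sharp}_{(i,j) \in E} f^\sharp_{ij}(\sigma^{\sharp\,\omega(t,i,j)}(\pi^\sharp)_j) \;\oplus^\sharp\; \alpha(I)_i \]
Because $\oplus^\sharp$ is monotone for abstract algebra $A^\sharp$, it suffices to show that each element is smaller
pointwise. We easily observe that $\alpha(I)_i \sqsubseteq^\sharp \alpha(I)_i$, so we need to show:
\[ \bigoplus^{\sharp}_{(i,j) \in E} \alpha(f_{ij}(\sigma^{\omega(t,i,j)}(\pi)_j)) \;\sqsubseteq^\sharp\; \bigoplus^{\sharp}_{(i,j) \in E} f^\sharp_{ij}(\sigma^{\sharp\,\omega(t,i,j)}(\pi^\sharp)_j) \]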
ACKNOWLEDGEMENTS
This work was supported in part by the National Science Foundation awards NeTS 1704336 and
FMitF 1837030, DARPA Dispersed Computing program under award number HR0011-17-C-0047
and a Facebook Research Award on “Network control plane verification at scale.”
REFERENCES
Mohammad Al-Fares, Alexander Loukissas, and Amin Vahdat. 2008. A Scalable, Commodity Data Center Network Architecture.
In SIGCOMM.
Kalev Alpernas, Roman Manevich, Aurojit Panda, Mooly Sagiv, Scott Shenker, Sharon Shoham, and Yaron Velner. 2018.
Abstract Interpretation of Stateful Networks. In Static Analysis Symposium.
Carolyn Jane Anderson, Nate Foster, Arjun Guha, Jean-Baptiste Jeannin, Dexter Kozen, Cole Schlesinger, and David Walker.
2014. NetKAT: Semantic Foundations for Networks. In POPL.
Thomas Ball, Rupak Majumdar, Todd D. Millstein, and Sriram K. Rajamani. 2001. Automatic Predicate Abstraction of C
Programs. In PLDI. 203–213.
Ryan Beckett, Aarti Gupta, Ratul Mahajan, and David Walker. 2017. A General Approach to Network Configuration
Verification. In SIGCOMM.
Ryan Beckett, Aarti Gupta, Ratul Mahajan, and David Walker. 2018. Control Plane Compression. In SIGCOMM. 476–489.
Bruno Blanchet, Patrick Cousot, Radhia Cousot, Jérôme Feret, Laurent Mauborgne, Antoine Miné, David Monniaux, and
Xavier Rival. 2003. A static analyzer for large safety-critical software. In PLDI. 196–207.
Randal E. Bryant. 1986. Graph-Based Algorithms for Boolean Function Manipulation. IEEE Trans. Computers 35, 8 (1986),
677–691.
Edmund M. Clarke, Orna Grumberg, Somesh Jha, Yuan Lu, and Helmut Veith. 2000. Counterexample-Guided Abstraction
Refinement. In Computer Aided Verification, 12th International Conference, CAV, Proceedings. 154–169.
Sanjay Narain, Dana Chee, Brian Coan, Ben Falchuk, Samuel Gordon, Jaewon Kang, Jonathan Kirsch, Aditya Naidu, Kaustubh
Sinkar, and Simon Tsang. 2016. A Science of Network Configuration. Journal of Cyber Security and Information Systems
1, 4 (2016).
Gordon D. Plotkin, Nikolaj Bjørner, Nuno P. Lopes, Andrey Rybalchenko, and George Varghese. 2016. Scaling Network
Verification Using Symmetry and Surgery. In POPL.
Santhosh Prabhu, Ali Kheradmand, Brighten Godfrey, and Matthew Caesar. 2017. Predicting Network Futures with Plankton.
In Proceedings of the First Asia-Pacific Workshop on Networking (APNet’17). 92–98.
Redistributing Routing Protocols 2012. Redistributing Routing Protocols. https://www.cisco.com/c/en/us/support/docs/ip/enhanced-interior-gateway-routing-protocol-eigrp/8606-redist.html.
D Roberts. 2018. It’s been a week and customers are still mad at BB&T. https://www.charlotteobserver.com/news/business/banking/article202616124.html.
Simon Sharwood. 2016. Google cloud wobbles as workers patch wrong routers. http://www.theregister.co.uk/2016/03/01/google_cloud_wobbles_as_workers_patch_wrong_routers/.
João Luís Sobrinho. 2005. An Algebraic Theory of Dynamic Network Routing. IEEE/ACM Trans. Netw. 13, 5 (October 2005),
1160–1173.
Yevgenly Sverdlik. 2012. Microsoft: misconfigured network device led to Azure outage. http://www.datacenterdynamics.com/content-tracks/servers-storage/microsoft-misconfigured-network-device-led-to-azure-outage/68312.fullarticle.
Y Sverdlik. 2017. United Says IT Outage Resolved, Dozen Flights Canceled Monday. https://www.datacenterknowledge.com/archives/2017/01/23/united-says-it-outage-resolved-dozen-flights-canceled-monday.
Dylan Tweney. 2013. 5-minute outage costs Google $545,000 in revenue. https://venturebeat.com/2013/08/16/3-minute-outage-costs-google-545000-in-revenue/.
Kannan Varadhan, Ramesh Govindan, and Deborah Estrin. 1996. Persistent route oscillations in inter-domain routing. Technical
Report. Computer Networks.
Anduo Wang, Limin Jia, Wenchao Zhou, Yiqing Ren, Boon Thau Loo, Jennifer Rexford, Vivek Nigam, Andre Scedrov, and
Carolyn L. Talcott. 2012. FSR: Formal Analysis and Implementation Toolkit for Safe Inter-domain Routing. IEEE/ACM
Trans. Networking 20, 6 (2012).
Konstantin Weitz, Doug Woos, Emina Torlak, Michael D. Ernst, Arvind Krishnamurthy, and Zachary Tatlock. 2016. Formal
Semantics and Automated Verification for the Border Gateway Protocol. In NetPL.