
Lecture Notes in Computer Science 3188

Commenced Publication in 1973


Founding and Former Series Editors:
Gerhard Goos, Juris Hartmanis, and Jan van Leeuwen

Editorial Board
David Hutchison
Lancaster University, UK
Takeo Kanade
Carnegie Mellon University, Pittsburgh, PA, USA
Josef Kittler
University of Surrey, Guildford, UK
Jon M. Kleinberg
Cornell University, Ithaca, NY, USA
Friedemann Mattern
ETH Zurich, Switzerland
John C. Mitchell
Stanford University, CA, USA
Moni Naor
Weizmann Institute of Science, Rehovot, Israel
Oscar Nierstrasz
University of Bern, Switzerland
C. Pandu Rangan
Indian Institute of Technology, Madras, India
Bernhard Steffen
University of Dortmund, Germany
Madhu Sudan
Massachusetts Institute of Technology, MA, USA
Demetri Terzopoulos
New York University, NY, USA
Doug Tygar
University of California, Berkeley, CA, USA
Moshe Y. Vardi
Rice University, Houston, TX, USA
Gerhard Weikum
Max-Planck Institute of Computer Science, Saarbruecken, Germany
Frank S. de Boer, Marcello M. Bonsangue,
Susanne Graf, Willem-Paul de Roever (Eds.)

Formal Methods
for Components
and Objects

Second International Symposium, FMCO 2003


Leiden, The Netherlands, November 4-7, 2003
Revised Lectures

Volume Editors

Frank S. de Boer
Centre for Mathematics and Computer Science, CWI
Kruislaan 413, 1098 SJ Amsterdam, The Netherlands
E-mail: F.S.de.Boer@cwi.nl

Marcello M. Bonsangue
Leiden University, Leiden Institute of Advanced Computer Science
P.O. Box 9512, 2300 RA Leiden, The Netherlands
E-mail: marcello@liacs.nl

Susanne Graf
VERIMAG
2 Avenue de Vignate, Centre Équation, 38610 Grenoble-Gières, France
E-mail: Susanne.Graf@imag.fr

Willem-Paul de Roever
Christian-Albrechts-University of Kiel
Institute of Computer Science and Applied Mathematics
Hermann-Rodewald-Straße 3, 24118 Kiel, Germany
E-mail: wpr@informatik.uni-kiel.de

Library of Congress Control Number: 2004112623

CR Subject Classification (1998): D.2, D.3, F.3, F.4

ISSN 0302-9743
ISBN 3-540-22942-6 Springer Berlin Heidelberg New York

This work is subject to copyright. All rights are reserved, whether the whole or part of the material is
concerned, specifically the rights of translation, reprinting, re-use of illustrations, recitation, broadcasting,
reproduction on microfilms or in any other way, and storage in data banks. Duplication of this publication
or parts thereof is permitted only under the provisions of the German Copyright Law of September 9, 1965,
in its current version, and permission for use must always be obtained from Springer. Violations are liable
to prosecution under the German Copyright Law.
Springer is a part of Springer Science+Business Media
springeronline.com
© Springer-Verlag Berlin Heidelberg 2004
Printed in Germany
Typesetting: Camera-ready by author, data conversion by Scientific Publishing Services, Chennai, India
Printed on acid-free paper SPIN: 11315810 06/3142 543210
Preface

Large and complex software systems provide the necessary infrastructure in all in-
dustries today. In order to construct such large systems in a systematic manner,
the focus in the development methodologies has switched in the last two decades
from functional issues to structural issues: both data and functions are encap-
sulated into software units that are integrated into large systems by means of
various techniques supporting reusability and modifiability. This encapsulation
principle is essential to both the object-oriented and the more recent component-
based software engineering paradigms.
Formal methods have been applied successfully to the verification of medium-
sized programs in protocol and hardware design. However, their application to
large systems requires a further development of specification and verification
techniques supporting the concepts of reusability and modifiability.
In order to bring together researchers and practitioners in the areas of soft-
ware engineering and formal methods, we organized the 2nd International Sym-
posium on Formal Methods for Components and Objects (FMCO) in Leiden,
The Netherlands, from November 4 to 7, 2003. The program consisted of in-
vited tutorials and technical presentations given by leading experts in the fields
of theoretical computer science and software engineering. The symposium was
attended by more than 80 people from all over the world.
This volume contains the contributions of the invited speakers to
FMCO 2003. We believe that the presented material provides a unique com-
bination of ideas on software engineering and formal methods which we hope
will form an inspiration for those aiming at further bridging the gap between
the theory and practice of software engineering.
The very idea to organize FMCO arose out of the NWO/DFG bilateral
project Mobi-J. In particular we acknowledge the financial support of the NWO
funding of Mobi-J. Additional financial support was provided by the Lorentz
Center, the IST project Omega (2001-33522), the Dutch Institute for Program-
ming Research and Algorithmics (IPA), the Royal Netherlands Academy of Arts
and Sciences (KNAW), the Centrum voor Wiskunde en Informatica (CWI), and
the Leiden Institute of Advanced Computer Science (LIACS).

July 2004 F.S. de Boer


M.M. Bonsangue
S. Graf
W.-P. de Roever

The Mobi-J Project


Mobi-J is a project funded by a bilateral research program of the Dutch Orga-
nization for Scientific Research (NWO) and the Central Public Funding Orga-
nization for Academic Research in Germany (DFG).
The partners of the Mobi-J project are:

– Centrum voor Wiskunde en Informatica (F.S. de Boer)


– Leiden Institute of Advanced Computer Science (M.M. Bonsangue)
– Christian-Albrechts-Universität Kiel (W.-P. de Roever)
This project aims at the development of a programming environment which
supports component-based design and verification of Java programs annotated
with assertions. The overall approach is based on an extension of the Java lan-
guage called Mobi-J with a notion of component which provides for the encapsu-
lation of its internal processing of data and composition in a network by means
of mobile asynchronous channels.
The activities of Mobi-J include the organization of international symposia
funded by the NWO and Ph.D. research funded by the DFG. By means of regular
meetings the partners discuss intensively Ph.D. research involving Mobi-J related
topics. Mobi-J also maintains contacts with other German universities, including
the universities of Oldenburg and Munich, and a close collaboration with the
European IST project OMEGA.

The Omega Project


The overall aim of the European IST project Omega (2001-33522) is the defini-
tion of a development methodology in UML for embedded and real-time systems
based on formal verification techniques. The approach is based on a formal se-
mantics of a suitable subset of UML, adapted and extended where needed with
a special emphasis on time-related aspects.
The Omega project involves the following partners: VERIMAG (France, Co-
ordinator), Centrum voor Wiskunde en Informatica (The Netherlands), Christ-
ian-Albrechts-Universität (Germany), University of Nijmegen (The Netherlands),
Weizmann Institute (Israel), OFFIS (Germany), EADS Launch Vehicles (France),
France Telecom R&D (France), Israeli Aircraft Industries (Israel), and National
Aerospace Laboratory (The Netherlands).
Table of Contents

Causality and Scheduling Constraints in Heterogeneous Reactive


Systems Modeling
Albert Benveniste, Benoît Caillaud, Luca P. Carloni, Paul Caspi,
Alberto L. Sangiovanni-Vincentelli . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1

Machine Function Based Control Code Algebras


Jan A. Bergstra . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17

Exploiting Abstraction for Specification Reuse. The Java/C Case Study


Egon Börger, Robert F. Stärk . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 42

On the Verification of Cooperating Traffic Agents


Werner Damm, Hardi Hungar, Ernst-Rüdiger Olderog . . . . . . . . . . . . . . 77

How to Cook a Complete Hoare Logic for Your Pet OO Language


Frank S. de Boer, Cees Pierik . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 111

Behavioural Specification for Hierarchical Object Composition


Răzvan Diaconescu . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 134

Consistency Management Within Model-Based Object-Oriented


Development of Components
Jochen M. Küster, Gregor Engels . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 157

CommUnity on the Move: Architectures for Distribution and Mobility


José Luiz Fiadeiro, Antónia Lopes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 177

TulaFale: A Security Tool for Web Services


Karthikeyan Bhargavan, Cédric Fournet, Andrew D. Gordon,
Riccardo Pucella . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 197

A Checker for Modal Formulae for Processes with Data


Jan Friso Groote, Tim A.C. Willemse . . . . . . . . . . . . . . . . . . . . . . . . . . . . 223

Semantic Essence of AsmL


Yuri Gurevich, Benjamin Rossman, Wolfram Schulte . . . . . . . . . . . . . . . 240

An MDA Approach to Tame Component Based Software Development


Jean-Marc Jézéquel, Olivier Defour, Noël Plouzeau . . . . . . . . . . . . . . . . . 260

An Application of Stream Calculus to Signal Flow Graphs


J.J.M.M. Rutten . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 276

Synchronous Closing and Flow Analysis for Model Checking Timed


Systems
Natalia Ioustinova, Natalia Sidorova, Martin Steffen . . . . . . . . . . . . . . . 292

Priority Systems
Gregor Gössler, Joseph Sifakis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 314

Preserving Properties Under Change


Heike Wehrheim . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 330
Tools for Generating and Analyzing Attack Graphs
Oleg Sheyner, Jeannette Wing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 344

Author Index . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 373


Causality and Scheduling Constraints in
Heterogeneous Reactive Systems Modeling

Albert Benveniste1 , Benoît Caillaud1 , Luca P. Carloni2 , Paul Caspi3


and Alberto L. Sangiovanni-Vincentelli2
1
Irisa/Inria, Campus de Beaulieu, 35042 Rennes cedex, France
Albert.Benveniste@irisa.fr
http://www.irisa.fr/sigma2/benveniste/
2
U.C. Berkeley, Berkeley, CA 94720
{lcarloni,alberto}@eecs.berkeley.edu
http://www-cad.eecs.berkeley.edu/HomePages/{lcarloni,alberto}
3
Verimag, Centre Équation, 2, rue de Vignate, F-38610 Gières
Paul.Caspi@imag.fr
http://www.imag.fr/VERIMAG/PEOPLE/Paul.Caspi

⋆ This research was supported in part by the European Commission under the projects
COLUMBUS, IST-2002-38314, and ARTIST, IST-2001-34820, by the NSF under the
project ITR (CCR-0225610), and by the GSRC.

Abstract. Recently we proposed a mathematical framework offering


diverse models of computation and a formal foundation for correct-by-
construction deployment of synchronous designs over distributed
architectures (such as GALS or LTTA). In this paper, we extend our
framework to model explicitly causality relations and scheduling con-
straints. We show how the formal results on the preservation of seman-
tics hold also for these cases and we discuss the overall contribution in
the context of previous work on desynchronization.

1 Introduction
Embedded systems are intrinsically heterogeneous since they are based on pro-
cessors that see the world digitally and an environment that is analog. In ad-
dition, the processing elements are heterogeneous too since they often include
micro-controllers and digital signal processors in addition to special purpose com-
puting engines. These parts must communicate and exchange information in a
consistent and reliable way over media that are often noisy and unreliable. Some
of the tasks that embedded systems must carry out are safety-critical, e.g., in
medical and transportation systems (cars, airplanes, trains) and for this reason
have hard constraints on timing and reliability. As technology advances, increas-
ingly more computing power becomes available thus offering the opportunity of
adding functionality to the system to such an extent that the complexity of the
design task is unmanageable without a rigorous design methodology based on
sound principles. In particular, the need for fast time-to-market and reasonable
development cost imposes design re-use. And design re-use implies the use of
software for as many parts of the functionality as possible given size, production
cost and power consumption constraints. Consequently, software accounts for
most of the design costs today and it is responsible for delays in product deliv-
ery because of the lack of a unified design process that can guarantee correct
behavior.
Today, designers face a very diversified landscape when it comes to method-
ologies, supporting tools, and engineering best practices. This would not neces-
sarily be a problem if it were not for the fact that transitioning between tools
that are based on different paradigms is increasingly becoming a design produc-
tivity bottleneck as it has been underlined by the road map work performed
in the framework of the ARTIST network of excellence [3]. A solution to this
problem would be to impose a “homogeneous” design policy, such as the fully
synchronous approach. However, implementation costs in terms of performance
and components require a more diversified view. Heterogeneity will manifest it-
self at the component level where different models of computation may be used
to represent component operation and, more frequently, at different levels of
abstraction, where, for example, a synchronous-language specification of the de-
sign may be refined into a globally asynchronous locally synchronous (GALS)
architecture, thus alleviating some of the cost issues outlined above. Having
a mathematical framework for the heterogeneous modeling of reactive systems
gives freedom of choice between different synchronization policies at different
stages of the design process and provides a solid foundation to handle formally
communication and coordination among heterogeneous components. Interest-
ing work along similar lines has been the Ptolemy project [13, 15], the MoBIES
project [1], the Model-Integrated Computing (MIC) framework [16], and Inter-
face Theories [14].
This paper is an extension of [7] where we proposed Tagged Systems, a vari-
ation of Lee and Sangiovanni-Vincentelli’s (LSV) Tagged-Signal Model [22], as
a mathematical model for heterogeneous systems. Originally, we restricted our-
selves to Tagged Systems in which parallel composition is by intersection, mean-
ing that unifiable events of each component must have identical variable, data,
and tag. While this restriction has allowed us to handle GALS models of design, it
does not cover all cases of interest. For example, causality relations and schedul-
ing constraints are not compatible with parallel composition by intersection.
Neither are earliest execution times. Yet causality and scheduling constraints
are very important to include when implementing an embedded system. Hence,
it is sometimes useful to have a notion of parallel composition that accepts the
unification of events having different tags (while the data that they carry must
still be equal). In this work, we propose an extension of Tagged Systems where
the unification rule for tags is itself parameterized. We show that this model cap-
tures important properties such as causality and scheduling constraints. Then,
we extend the general theorems of [7] on the preservation of semantics during
distributed deployment.

2 Tagged Systems and Heterogeneous Systems


2.1 Tagged Systems and Their Parallel Composition
Throughout this paper, N = {1, 2, . . .} denotes the set of positive integers; N
is equipped with its usual total order ≤. X → Y denotes the set of all partial
functions from X to Y . If (X, ≤X ) and (Y, ≤Y ) are partial orders, f ∈ X → Y
is called increasing if f (≤X ) ⊆ ≤Y , i.e., ∀x, x′ ∈ X : x ≤X x′ ⇒ f (x) ≤Y f (x′ ).

Tag Structures. A tag structure is a triple (T , ≤, ⊑), where T is a set of tags,
and ≤ and ⊑ are two partial orders. Partial order ≤ relates tags seen as time
stamps. Call a clock any increasing function (N, ≤) → (T , ≤). Partial order ⊑,
called the unification order, defines how to unify tags and is essential to express
coordination among events. Write τ1 ⋈ τ2 to mean that there exists τ ∈ T
such that τi ⊑ τ . We assume that any pair (τ1 , τ2 ) of tags, such that τ1 ⋈ τ2
holds, possesses an upper bound. We denote it by τ1 ⊔ τ2 . In other words, (T , ⊑)
is a sup-semi-lattice. We call ⋈ and ⊔ the unification relation and unification
map, respectively.
We assume that unification is causal with respect to the partial order of time
stamps: the result of the unification cannot occur prior in time to its con-
stituents. Formally, if τ1 ⋈ τ2 is a unifiable pair then τi ≤ (τ1 ⊔ τ2 ), for i = 1, 2.
Equivalently:

∀τ, τ′ : τ ⊑ τ′ ⇒ τ ≤ τ′ . (1)
Condition (1) has the following consequence: if τ1 ≤ τ1′ , τ2 ≤ τ2′ , τ1 ⋈ τ2 ,
and τ1′ ⋈ τ2′ together hold, then (τ1 ⊔ τ2 ) ≤ (τ1′ ⊔ τ2′ ) must also hold. This
ensures that the system obtained via parallel composition preserves the agreed
order of its components.

Tagged Systems. Let 𝒱 be an underlying set of variables with domain D. For
V ⊂ 𝒱 finite, a V -behaviour, or simply behaviour, is an element:
σ ∈ V → N → (T × D), (2)
meaning that, for each v ∈ V , the n-th occurrence of v in behaviour σ has tag
τ ∈ T and value x ∈ D. For v a variable, the map σ(v) ∈ N → (T × D) is called
a signal. For σ a behaviour, an event of σ is a tuple (v, n, τ, x) ∈ V × N × T × D
such that σ(v)(n) = (τ, x). Thus we can regard behaviours as sets of events. We
require that, for each v ∈ V , the first projection of the map σ(v) (it is a map
N → T ) is increasing. Thus it is a clock and we call it the clock of v in σ. A
tagged system is a triple P = (V, T , Σ), where V is a finite set of variables, T is
a tag structure, and Σ a set of V -behaviours.
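
To fix intuition, the following small Python sketch (purely illustrative and not part of the formal development; all identifiers are ours) encodes finite prefixes of behaviours as maps from variables to lists of (tag, value) pairs, and checks the requirement that each variable's clock is increasing.

    # Illustrative encoding of (finite prefixes of) behaviours, eq. (2): a behaviour
    # maps each variable to its signal, i.e. the list of (tag, value) pairs
    # indexed by occurrence number n = 1, 2, ...
    from typing import Any, Dict, List, Tuple

    Tag = Any                                  # an element of the tag set T
    Behaviour = Dict[str, List[Tuple[Tag, Any]]]

    def events(sigma: Behaviour):
        """View a behaviour as a set of events (v, n, tag, value)."""
        for v, signal in sigma.items():
            for n, (tag, val) in enumerate(signal, start=1):
                yield (v, n, tag, val)

    def has_increasing_clocks(sigma: Behaviour) -> bool:
        """Check that, for each variable, the map n -> tag is increasing,
        i.e. that it is a clock (here w.r.t. Python's <= on tags)."""
        return all(
            all(signal[i][0] <= signal[i + 1][0] for i in range(len(signal) - 1))
            for signal in sigma.values()
        )

    # A finite prefix of the synchronous behaviour (8): tags are reaction indices.
    sigma_P: Behaviour = {
        "b": [(1, True), (2, False), (3, True), (4, False)],
        "x": [(1, 1), (3, 1)],
    }
    assert has_increasing_clocks(sigma_P)
    assert len(list(events(sigma_P))) == 6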

Homogeneous Parallel Composition. Consider two tagged systems P1 = (V1 , T1 ,


Σ1 ) and P2 = (V2 , T2 , Σ2 ) with identical tag structures T1 = T2 = T . Let ⊔ be the
unification function of T . For two events e = (v, n, τ, x) and e′ = (v′ , n′ , τ′ , x′ ),
define
e ⋈ e′ iff v = v′ , n = n′ , τ ⋈ τ′ , x = x′ , and
e ⋈ e′ ⇒ e ⊔ e′ =def (v, n, τ ⊔ τ′ , x).

The unification map ⊔ and relation ⋈ extend point-wise to behaviours.
Then, for σ a V -behaviour and σ′ a V ′ -behaviour, define, by abuse of nota-
tion: σ ⋈ σ′ iff σ|V ∩V ′ ⋈ σ′ |V ∩V ′ , and then

σ ⊔ σ′ =def (σ|V ∩V ′ ⊔ σ′ |V ∩V ′ ) ∪ σ|V \V ′ ∪ σ′ |V ′ \V .

where σ|W denotes the restriction of behaviour σ to the variables of W . Finally,
for Σ and Σ′ two sets of behaviours, define their conjunction

Σ ∧ Σ′ =def {σ ⊔ σ′ | σ ∈ Σ, σ′ ∈ Σ′ and σ ⋈ σ′ } (3)

The homogeneous parallel composition of P1 and P2 is

P1 ∥ P2 =def (V1 ∪ V2 , T , Σ1 ∧ Σ2 ) (4)
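
The next sketch (again ours, with made-up identifiers, for finite prefixes only) implements this composition: a parameter standing for the unification map ⊔ decides whether two tags unify, shared events must carry equal data, and private variables are copied. Instantiating the parameter with tag equality gives the composition by intersection discussed next.

    from typing import Any, Callable, Dict, List, Optional, Tuple

    Tag = Any
    Behaviour = Dict[str, List[Tuple[Tag, Any]]]
    # unify_tag(t1, t2) returns t1 ⊔ t2, or None if t1 ⋈ t2 does not hold
    UnifyTag = Callable[[Tag, Tag], Optional[Tag]]

    def unify_behaviours(s1: Behaviour, s2: Behaviour,
                         unify_tag: UnifyTag) -> Optional[Behaviour]:
        """Return s1 ⊔ s2 on shared variables, union elsewhere; None if not unifiable."""
        out: Behaviour = {}
        for v in set(s1) | set(s2):
            if v in s1 and v in s2:
                e1, e2 = s1[v], s2[v]
                if len(e1) != len(e2):         # simplification: finite prefixes of equal length
                    return None
                signal = []
                for (t1, x1), (t2, x2) in zip(e1, e2):
                    t = unify_tag(t1, t2)
                    if t is None or x1 != x2:  # tags not unifiable, or data disagree
                        return None
                    signal.append((t, x1))
                out[v] = signal
            else:
                out[v] = list(s1.get(v, s2.get(v)))
        return out

    # Parallel composition by intersection (Sect. 2.2): tags unify iff they are equal.
    by_intersection: UnifyTag = lambda t1, t2: t1 if t1 == t2 else None

    def compose(Sigma1: List[Behaviour], Sigma2: List[Behaviour],
                unify_tag: UnifyTag) -> List[Behaviour]:
        """Conjunction (3): keep every unifiable pair of behaviours."""
        result = []
        for s1 in Sigma1:
            for s2 in Sigma2:
                u = unify_behaviours(s1, s2, unify_tag)
                if u is not None:
                    result.append(u)
        return result

    s1 = {"a": [(1, 0)], "b": [(1, True)]}
    s2 = {"b": [(1, True)], "c": [(2, 7)]}
    assert compose([s1], [s2], by_intersection) == [
        {"a": [(1, 0)], "b": [(1, True)], "c": [(2, 7)]}
    ]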

2.2 Theme and Variations on the Pair (T , ⊔)


Parallel Composition by Intersection. This is the usual case, already in-
vestigated in [7]. It corresponds to:

– The tag set T is arbitrary.


– The unification function ⊔ is such that τ ⋈ τ′ iff τ = τ′ , and τ ⊔ τ′ =def τ .

Modeling synchronous systems, asynchronous systems, timed systems, with


this framework, is extensively discussed in [7]. We summarize here the main
points.
To represent synchronous systems with our model, take for T a totally or-
dered set (e.g., T = N), and require that all clocks are strictly increasing. The
tag index set T organizes behaviours into successive reactions, as explained next.
Call reaction a maximal set of events of σ with identical τ . Since clocks are
strictly increasing, no two events of the same reaction can have the same vari-
able. Regard a behaviour as a sequence of global reactions: σ = σ1 , σ2 , . . ., with
tags τ1 , τ2 , . . . ∈ T . Thus T provides a global, logical time basis.
As for asynchronous systems, we take a very liberal interpretation for them.
If we interpret a tag set as a constraint on the coordination of different signals of
a system and the integer n ∈ N as the basic constraint of the sequence of events
of the behaviour of a variable, then the most “coordination unconstrained” sys-
tem, the one with most degrees of freedom in terms of choice of coordination
mechanism, could be considered an ideal asynchronous system. Then an asyn-
chronous system corresponds to a model where the tag set does not give any
information on the absolute or relative ordering of events. In a more formal way,
take T = {.}, the trivial set consisting of a single element. Then, behaviours
identify with elements σ ∈ V → N → D.
Capturing Causality Relations and Scheduling Specifications. How can
we capture causality relations or scheduling specifications in the above intro-
duced asynchronous tagged systems? The intent is, for example, to state

that “the 2nd occurrence of x depends on the 3rd occurrence of b”. Define
N0 =def N ∪ {0}. Define a dependency to be a map:

δ ∈ V → N0 .

We denote by Δ the set of all dependencies, and we take T = Δ. Thus an


event has the form:

e = (v, n, δ, x), (5)

with the following interpretation: event e has v as associated variable, it is ranked


n among the events with variable v, and it depends on the event of variable w
that is ranked δ(w). The special case δ(w) = 0 is interpreted as the absence of
dependency. We take the convention that, for e as in (5), δ(v) = n − 1. Thus,
on σ(v), the set of dependencies reproduces the ranking. Δ is equipped with
the partial order defined by δ ≤ δ′ iff ∀v : δ(v) ≤ δ′ (v). Then we define the
unification map ⊔ for this case:

dom (⊔) = T × T and δ ⊔ δ′ =def max(δ, δ′ ). (6)

With this definition, behaviours become labelled preorders as explained next.
For σ a behaviour, and e, e′ two events of σ, write:

e →σ e′ iff e = (v, n, δ, x), e′ = (v′ , n′ , δ′ , x′ ), and δ′ (v) = n (7)

Note that, since n > 0, the condition δ′ (v) = n makes this dependency
effective. Definition (7) makes σ a labeled directed graph. Denote by ≼σ the
transitive reflexive closure of →σ ; it is a preorder¹.
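
The following sketch (illustrative only; the encoding choices are ours) represents dependencies as finite maps from variable names to ranks, with missing entries standing for 0, unifies them by pointwise max as in (6), and checks the relation →σ of (7) on a few events of Pc.

    from typing import Any, Dict, Tuple

    Dep = Dict[str, int]               # a dependency delta : V -> N0 (missing key = 0)
    Event = Tuple[str, int, Dep, Any]  # an event (v, n, delta, x) as in (5)

    def unify_dep(d1: Dep, d2: Dep) -> Dep:
        """Rule (6): dependencies are unified by taking the pointwise max."""
        return {v: max(d1.get(v, 0), d2.get(v, 0)) for v in set(d1) | set(d2)}

    def depends_on(e: Event, e2: Event) -> bool:
        """e ->sigma e2 in the sense of (7): e2's dependency points at e's rank."""
        v, n, _, _ = e
        _, _, delta2, _ = e2
        return n > 0 and delta2.get(v, 0) == n

    # A finite prefix of the behaviour of Pc (reaction tags omitted): the first two
    # occurrences of b, and the first x, which depends on the 1st occurrence of b.
    b1: Event = ("b", 1, {"x": 0}, True)
    b2: Event = ("b", 2, {"x": 0, "b": 1}, False)   # delta(b) = n - 1 by convention
    x1: Event = ("x", 1, {"b": 1}, 1)

    assert depends_on(b1, x1) and depends_on(b1, b2) and not depends_on(b2, x1)
    assert unify_dep({"b": 1, "x": 0}, {"b": 2}) == {"b": 2, "x": 0}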

Capturing Earliest Execution Times. Here we capture earliest timed execu-


tions of concurrent systems. Take T = R+ , the set of non-negative real numbers.
Thus a tag τ ∈ T assigns a date, and we define

τ ⊔ τ′ =def max(τ, τ′ ).

Hence ⊔ is here a total function. Composing two systems has the effect that
the two components wait for the latest date of occurrence for each shared vari-
able. For example, assume that variable v is an output of P and an input of
Q in P ∥ Q. Then the earliest possible date of every event of variable v in Q is
by convention 0, whereas each event associated to v has a certain date of pro-
duction in P . In the parallel composition P ∥ Q, the dates of production by P
prevail.

¹ We insist: “preorder”, not “partial order”—this should not be a surprise, since the
superposition of two partial orders generally yields a preorder.

Capturing Timed Systems. Various classes of timed automata models have


been proposed since their introduction by Alur and Dill in [2]. In timed automata,
dates of events are subject to constraints of the form C : τ ∈ ∪i∈I [[si , ti ]], where
I is some finite set whose cardinality depends on the considered event, and
[[ = [ or (, and symmetrically for ]]. The classes of timed automata differ by
the type of constraint that can be expressed, and therefore they differ in their
decidability properties. Nevertheless, they can all be described by the following
kind of Tagged System².
Take T = Pow (R+ ), where Pow denotes powerset. Thus, a tag τ ∈ T assigns
to each event a constraint on its possible dates of occurrence. Then, several
definitions are of interest:
– τ1 ⋈ τ2 iff τ1 ∩ τ2 ≠ ∅, and τ1 ⊔ τ2 = τ1 ∩ τ2 . This is the straightforward
definition, it consists in regarding tags as constraints and combining them
by taking their conjunction.
– the unification of tags is a total function, defined as follows: τ1 ⊔ τ2 =
{max(t1 , t2 ) | t1 ∈ τ1 , t2 ∈ τ2 }. In this case, events are synchronized by waiting
for the latest one.
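
As a small illustration of these two options (ours, restricted to finite sets of dates for simplicity), the sketch below implements both unification maps and shows that they differ on the same pair of tags.

    from typing import FrozenSet, Optional

    DateSet = FrozenSet[float]      # a tag: the set of admissible dates of the event

    def unify_conjunction(t1: DateSet, t2: DateSet) -> Optional[DateSet]:
        """First option: tags are constraints, unified by intersection."""
        t = t1 & t2
        return t if t else None     # not unifiable when the intersection is empty

    def unify_wait_for_latest(t1: DateSet, t2: DateSet) -> DateSet:
        """Second option: total unification, wait for the latest of the two dates."""
        return frozenset(max(a, b) for a in t1 for b in t2)

    t1, t2 = frozenset({1.0, 3.0}), frozenset({2.0, 3.0})
    assert unify_conjunction(t1, t2) == frozenset({3.0})
    assert unify_wait_for_latest(t1, t2) == frozenset({2.0, 3.0})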

Hybrid Tags. Define the product (T , ⊔) =def (T ′ , ⊔′ ) × (T ′′ , ⊔′′ ) in a standard


way. This allows us to combine different tags into a compound, heterogeneous,
tag. For instance, one can consider synchronous systems that are timed and en-
hanced with causality relations. Such systems can be “desynchronized”, meaning
that their reaction tag is erased, but their causality and time tags are kept.
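
A minimal sketch of this product construction (ours) combines two unification maps componentwise; here the first component is unified by equality (reaction indices) and the second by max (dates).

    from typing import Any, Callable, Optional, Tuple

    Unify = Callable[[Any, Any], Optional[Any]]   # returns None when not unifiable

    def product_unify(u1: Unify, u2: Unify) -> Unify:
        """Unification on a product tag structure: componentwise, and defined
        only when both components unify."""
        def unify(t1: Tuple[Any, Any], t2: Tuple[Any, Any]) -> Optional[Tuple[Any, Any]]:
            a = u1(t1[0], t2[0])
            b = u2(t1[1], t2[1])
            return None if a is None or b is None else (a, b)
        return unify

    by_equality: Unify = lambda a, b: a if a == b else None
    by_max: Unify = lambda a, b: max(a, b)
    hybrid = product_unify(by_equality, by_max)
    assert hybrid((3, 1.0), (3, 2.5)) == (3, 2.5)
    assert hybrid((3, 1.0), (4, 2.5)) is None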

2.3 Running Example


The Base Case: Synchronous Systems. Let P and Q be two synchronous systems
involving the same set of variables: b of type boolean, and x of type integer. Each
system possesses only a single behaviour, shown on the right hand side of P : . . .
and Q : . . ., respectively. Each behaviour consists of a sequence of successive
reactions, separated by vertical bars. Each reaction consists of an assignment of
values to a subset of the variables; a blank indicates the absence of the considered
variable in the considered reaction.

P :  b : t f t f t f ...
     x : 1   1   1   ...

Q :  b : t f t f t f ...
     x :   1   1   1 ...

The single behavior of P can be expressed formally in our framework as


σ(b)(2n − 1) = (2n − 1, t) , σ(b)(2n) = (2n, f )
(8)
σ(x)(n) = (2n − 1, 1)
² Our framework of Tagged Systems handles (infinite) behaviours and is not suited
to investigate decidability properties; this explains why we can subsume all variants
of timed automata into a unique Tagged Systems model.

where we take T = N to index the successive reactions. Note the absence of x


at tag 2n. Similarly, for Q we have the following where x is absent at tag 2n − 1:

σ(b)(2n − 1) = (2n − 1, t) , σ(b)(2n) = (2n, f )


(9)
σ(x)(n) = (2n, 1)

Now, the synchronous parallel composition of P and Q, defined by intersec-


tion: P ∥ Q =def P ∩ Q, is empty. The reason is that P and Q disagree on where
to put absences for the variable x. Formally, they disagree on their respective
tags.

Desynchronizing the Base Case. The desynchronization of a synchronous system


like P or Q consists in (i ) removing the synchronization barriers separating the
successive reactions, and, then, (ii ) compressing the sequence of values for each
variable, individually. This yields:
Pα = Qα :  b : t f t f t f ...
           x : 1 1 1 ...
where the subscript α refers to asynchrony. The reader may think that events
having identical index for different variables are aligned, but this is not the
case. In fact, as the absence of vertical bars in the diagram suggests, there is no
alignment at all between events associated with different variables.
Formally, we express asynchrony by taking T = {.}, the trivial set with a
single element. The reason is that we do not need any additional time stamping
information. Thus, the single behavior of Pα = Qα is written as

σα (b)(2n − 1) = t, σα (b)(2n) = f, and σα (x)(n) = 1. (10)

Regarding desynchronization, the following comments are in order. Note that


P ≠ Q but Pα = Qα . Next, the synchronous system R defined by R = P ∪ Q, the
nondeterministic choice between P and Q, possesses two behaviours. However, its
desynchronization Rα equals Pα , and possesses only one behaviour. Now, since
Pα = Qα , then Pα ∥ Qα =def Pα ∩ Qα = Pα = Qα ≠ ∅. Thus, for the pair (P, Q),
desynchronization does not preserve the semantics of parallel composition, in
any reasonable sense.
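
The following sketch (ours, on finite prefixes only) makes this computation concrete: it builds the behaviours (8) and (9), desynchronizes them as in (10), and checks that P and Q differ while Pα and Qα coincide.

    # Finite prefixes (N reactions) of the single behaviours of P and Q, eqs. (8)-(9),
    # and their desynchronizations, eq. (10).  Purely illustrative.
    N = 6

    def behaviour_P(n_reactions):
        b = [(k, k % 2 == 1) for k in range(1, n_reactions + 1)]      # t at odd tags
        x = [(2 * n - 1, 1) for n in range(1, n_reactions // 2 + 1)]  # x present at odd tags
        return {"b": b, "x": x}

    def behaviour_Q(n_reactions):
        b = [(k, k % 2 == 1) for k in range(1, n_reactions + 1)]
        x = [(2 * n, 1) for n in range(1, n_reactions // 2 + 1)]      # x present at even tags
        return {"b": b, "x": x}

    def desynchronize(sigma):
        """Erase the reaction tags, keeping only the per-variable value sequences."""
        return {v: [val for (_, val) in signal] for v, signal in sigma.items()}

    P, Q = behaviour_P(N), behaviour_Q(N)
    assert P != Q                                 # they disagree on the tags of x
    assert desynchronize(P) == desynchronize(Q)   # but Pα = Qα ≠ ∅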

Adding Causality Relations. Suppose that some analysis of the considered pro-
gram allows us to add the following causality relations to P and Q:

Pc :  b : t f t f t f ...
          ↓   ↓   ↓
      x : 1   1   1   ...

Qc :  b : t f t f t f ...
            ↓   ↓   ↓
      x :   1   1   1 ...

For example, in accordance with the above causality relations, the meaning of
P could be: if b = t then get x (and similarly for Q). By using the dependency
relation defined in (6), we can express formally the behavior of Pc as

σ(b)(2n − 1) = ([2n − 1, (x, 0)], t) , σ(b)(2n) = ([2n, (x, 0)], f )
σ(x)(n) = ([2n − 1, (b, 2n − 1)], 1)

and the behavior of Qc as

σ(b)(2n − 1) = ([2n − 1, (x, 0)], t) , σ(b)(2n) = ([2n, (x, 0)], f )
σ(x)(n) = ([2n, (b, 2n)], 1)

As for the base case and for the same reason, Pc ∥ Qc = ∅.

Then, Desynchronizing. Removing the synchronization barriers from Pc and Qc


yields
Pcα :  b : t f t f t f ...
           ↓   ↓   ↓
       x : 1   1   1   ...

Qcα :  b : t f t f t f ...
             ↓   ↓   ↓
       x :   1   1   1 ...

We insist that, again, desynchronizing consists in (i ) removing the synchro-


nization barriers, and then (ii ) compressing the sequence of values for each
variable, individually—this last step is not shown on the drawing, just because
it is a lot easier to draw vertical arrows. Formally, for Pcα we have
σ(b)(2n − 1) = ((x, 0), t) , σ(b)(2n) = ((x, 0), f )
σ(x)(n) = ((b, 2n − 1), 1)
and, for Qcα we have

σ(b)(2n − 1) = ((x, 0), t) , σ(b)(2n) = ((x, 0), f )
σ(x)(n) = ((b, 2n), 1)
These two behaviours are unifiable and yield the dependency (b, 2n), by the
max rule (6). In fact, the reader can check that Pcα ∥ Qcα = Qcα . Thus Pc and
Qc did not include enough causality relations for desynchronization to properly
preserve the semantics.
Adding More Causality Relations. Suppose that “oblique” causality relations are
added, from each previous occurrence of x to the current occurrence of b:

Pcc :  b : t f t f t f ...
           ↓↗  ↓↗  ↓↗
       x : 1   1   1   ...

Qcc :  b : t f t f t f ...
             ↓↗  ↓↗  ↓
       x :   1   1   1 ...

These supplementary causality relations conform to the synchronous model


since they agree with the increasing reaction index. Formally, the single behavior
of Pcc is written
σ(b)(2n − 1) = ([2n − 1, (x, 0)], t) , σ(b)(2n) = ([2n, (x, n)], f )
σ(x)(n) = ([2n − 1, (b, 2n − 1)], 1)
and the one of Qcc is

σ(b)(2n − 1) = ([2n − 1, (x, n − 1)], t) , σ(b)(2n) = ([2n, (x, 0)], f )
σ(x)(n) = ([2n, (b, 2n)], 1)

Again, Pcc ∥ Qcc = ∅.

Then, Again Desynchronizing. Removing the synchronization barriers from Pcc


and Qcc yields
Pccα :  b : t f t f t f ...
            ↓↗  ↓↗  ↓↗
        x : 1   1   1   ...

Qccα :  b : t f t f t f ...
              ↓↗  ↓↗  ↓
        x :   1   1   1 ...

In our framework, for Pccα we have
σ(b)(2n − 1) = ((x, 0), t) , σ(b)(2n) = ((x, n), f )
σ(x)(n) = ((b, 2n − 1), 1)
and, for Qccα we have

σ(b)(2n − 1) = ((x, n − 1), t) , σ(b)(2n) = ((x, 0), f )
σ(x)(n) = ((b, 2n), 1)
However, now the composed behavior does not coincide with Qccα but is

Pccα ∥ Qccα =  b : t f t f t f ...
                   ⇅ ↗ ⇅ ↗ ⇅
               x :   1   1   1 ...

The reason for the double causality between x and f -occurrences of b is that
the n-th x causes the (2n)-th b (i.e. the n-th f -occurrence of b) in Pcc whereas the
(2n)-th b causes the n-th x in Qcc . Formally, by the max rule (6), the composed
behavior of Pccα ∥ Qccα is written

σ(b)(2n − 1) = ((x, n − 1), t) , σ(b)(2n) = ((x, n), f )
σ(x)(n) = ((b, 2n), 1)
In conclusion, Pccα ∥ Qccα possesses causality loops and may be considered
pathological and thus “rejected” in accordance with the original semantics
P ∥ Q = ∅.
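
The causality loop can also be exhibited mechanically. The sketch below (ours; the graph encoding is a simplification) builds the dependency edges of a finite prefix of the composed behaviour and searches for a cycle.

    # Detect causality loops in (a finite prefix of) the composed behaviour of
    # Pccα ∥ Qccα: x(n) depends on b(2n) and b(2n) depends on x(n).
    from typing import Dict, Tuple

    Node = Tuple[str, int]          # an event identified by (variable, rank)

    def composed_dependencies(n_max: int) -> Dict[Node, Dict[str, int]]:
        deps: Dict[Node, Dict[str, int]] = {}
        for n in range(1, n_max + 1):
            deps[("b", 2 * n - 1)] = {"x": n - 1}   # t-occurrence of b depends on previous x
            deps[("b", 2 * n)] = {"x": n}           # f-occurrence of b depends on the n-th x
            deps[("x", n)] = {"b": 2 * n}           # the n-th x depends on the 2n-th b
        return deps

    def has_causality_loop(deps: Dict[Node, Dict[str, int]]) -> bool:
        """Follow -> edges; a loop exists iff some event is reachable from itself."""
        def successors(node):
            v, n = node
            return [e for e, d in deps.items() if d.get(v, 0) == n]
        for start in deps:
            stack, seen = successors(start), set()
            while stack:
                node = stack.pop()
                if node == start:
                    return True
                if node not in seen and node in deps:
                    seen.add(node)
                    stack.extend(successors(node))
        return False

    assert has_causality_loop(composed_dependencies(3))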

2.4 Heterogeneous Systems


Assume a functional system specification using a synchronous approach, for sub-
sequent deployment over a distributed asynchronous architecture (synchronous
and asynchronous are considered in the sense of subsection 2.1). When we de-
ploy the design on a different architecture, we must make sure that the original
intent of the designer is maintained. This step is non trivial because the infor-
mation on what is considered correct behaviour is captured in the synchronous
specifications that we want to relax in the first place. We introduce the no-
tion of semantic-preserving transformation to identify precisely what is a correct
deployment. We present the idea with our running example:

Running Example, Cont’d. Regarding semantics preserving deployment, the fol-


lowing comments can be stated on our running example. The synchronous paral-
lel composition of P and Q, defined by intersection: P ∥ Q =def P ∩ Q, is empty.
The reason is that P and Q disagree on where to put absences for the variable x.
On the other hand, since Pα = Qα , then Pα ∥ Qα =def Pα ∩ Qα = Pα = Qα ≠ ∅.
Thus, for the pair (P, Q), desynchronization does not preserve the semantics of
parallel composition, in any reasonable sense. □
How to model that semantics is preserved when replacing the ideal syn-
chronous broadcast by the actual asynchronous communication? An elegant so-
lution was proposed by Le Guernic and Talpin for the former GALS case [21].
We cast their approach in the framework of tagged systems and we generalize
it.

Tag Morphisms. For T , T ′ two tag structures, call morphism a map ρ : T →
T ′ which is increasing w.r.t. ≤ and ≤′ , surjective, and such that

ρ ◦ ⊔ = ⊔′ ◦ (ρ, ρ) (11)

holds, where ◦ denotes the composition of functions. As expected from their
name, morphisms compose. For ρ : T → T ′ a morphism, and σ ∈ V → N →
(T × D) a behaviour, replacing τ by ρ(τ ) in σ defines a new behaviour having
T ′ as tag set. We denote it by

σρ , or by σ ◦ ρ. (12)

Performing this for every behaviour of a tag system P yields the tag system

Pρ . (13)

For T1 −ρ1→ T ←ρ2− T2 two morphisms, define:

T1 ρ1×ρ2 T2 =def { (τ1 , τ2 ) | ρ1 (τ1 ) = ρ2 (τ2 ) } . (14)

A case of interest is Ti = Ti′ × T , i = 1, 2, and the Ti′ are different. Then
T1 ρ1×ρ2 T2 identifies with the product T1′ × T × T2′ . For example, the desynchro-
nization of synchronous systems is captured by the morphism α : Tsynch → {.},
which erases all global timing information (see Equations (8), (9), and (10)).

Heterogeneous Parallel Composition. In this subsection we define the com-


position of two tagged systems Pi = (Vi , Ti , Σi ), i = 1, 2, when T1 ≠ T2 . Assume
two morphisms T1 −ρ1→ T ←ρ2− T2 . Write:

σ1 ⋈ρ1 ρ2 σ2 iff σ1 ◦ ρ1 ⋈ σ2 ◦ ρ2 . (15)

For (σ1 , σ2 ) a pair satisfying (15), define

σ1 ⊔ρ1 ρ2 σ2 (16)

as being the set of events (v, n, (τ1 , τ2 ), x) such that ρ1 (τ1 ) = ρ2 (τ2 ) =def τ
and (v, n, τ, x) is an event of σ1 ◦ ρ1 ⊔ σ2 ◦ ρ2 . We are now ready to define the
heterogeneous conjunction of Σ1 and Σ2 by:

Σ1 ρ1∧ρ2 Σ2 =def { σ1 ⊔ρ1 ρ2 σ2 | σ1 ⋈ρ1 ρ2 σ2 } . (17)

Finally, the heterogeneous parallel composition of P1 and P2 is defined by:

P1 (ρ1∥ρ2 ) P2 = ( V1 ∪ V2 , T1 ρ1×ρ2 T2 , Σ1 ρ1∧ρ2 Σ2 ) . (18)

We simply write (ρ1∥ instead of (ρ1∥ρ2 ) when ρ2 is the identity.
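
The sketch below (ours, for finite prefixes, and assuming that unification on the common tag set T is by intersection) mirrors definitions (15)-(16): shared events must have equal data and equal images under the two morphisms, and the unified event keeps the pair of original tags.

    from typing import Any, Callable, Dict, List, Optional, Tuple

    Behaviour = Dict[str, List[Tuple[Any, Any]]]
    Morphism = Callable[[Any], Any]          # rho_i : T_i -> T

    def het_unify(s1: Behaviour, rho1: Morphism,
                  s2: Behaviour, rho2: Morphism) -> Optional[Behaviour]:
        """Unify s1 and s2 modulo rho1, rho2 (sketch of (16)).  Shared events must
        have equal data and equal images rho1(tau1) = rho2(tau2); the resulting
        event keeps the pair (tau1, tau2) as its tag."""
        out: Behaviour = {}
        for v in set(s1) | set(s2):
            if v in s1 and v in s2:
                if len(s1[v]) != len(s2[v]):     # simplification for finite prefixes
                    return None
                signal = []
                for (t1, x1), (t2, x2) in zip(s1[v], s2[v]):
                    if x1 != x2 or rho1(t1) != rho2(t2):
                        return None              # (15) fails: not unifiable
                    signal.append(((t1, t2), x1))
                out[v] = signal
            else:                                # private variables are kept as they are
                out[v] = list(s1.get(v, s2.get(v)))
        return out

    # Desynchronization morphism alpha : Tsynch -> {.} erases the reaction index.
    alpha: Morphism = lambda tau: "."
    # For the running example of Sect. 3.1, applying het_unify to the finite
    # prefixes of P and Q with rho1 = rho2 = alpha yields pair tags of the form
    # (n, n) for b and (2n-1, 2n) for x, as in (20).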

GALS, Hybrid Timed/Untimed Systems, and More. To model the in-


teraction of a synchronous system with its asynchronous environment, take the
heterogeneous composition P (α∥ A, where P = (V, Tsynch , Σ) is a synchronous
system, A = (W, {.}, Σ′ ) is an asynchronous model of the environment, and
α : Tsynch → {.} is the trivial morphism, mapping synchrony to asynchrony
(hence the special notation).
For GALS, take T1 = T2 = Tsynch , where Tsynch is the tag set of synchronous
systems. Then, take T = {.}, the tag set of asynchronous ones. Take α :
Tsynch → {.}, the trivial morphism, and consider P1 (α∥α) P2 .
For timed/untimed systems, consider P (ρ∥ Q, where P = (V, Tsynch × Tϕ , Σ)
is a synchronous timed system, Q = (W, Tsynch , Σ′ ) is a synchronous but untimed
system, and ρ : Tsynch × Tϕ → Tsynch is the projection morphism.
This machinery of morphisms provides a framework for the different manip-
ulations of tags that were performed in Section 2.3 on our running example.

3 Application to Correct Deployment


In this section we apply our framework to the formalization of the practically
important—but generally informal—requirement of “correct deployment”.

3.1 Preserving Semantics: Formalization


We are given a pair Pi = (Vi , Ti , Σi ), i = 1, 2, such that T1 = T2 , and a pair
T1 −ρ→ T ←ρ− T2 of identical morphisms. We can consider two semantics:

The strong semantics : P1 ∥ P2
The weak semantics : P1 (ρ∥ρ) P2 .

We say that ρ is semantics preserving with respect to P1 ∥ P2 if

P1 (ρ∥ρ) P2 ≡ P1 ∥ P2 . (19)

Running Example, Cont’d. The reader can check the following as an exercise:
P ∥ Q = ∅, and, as we already discussed, Pα ∥ Qα = Pα . Now we compute
P (α∥α) Q. From (16) we get that, using obvious notations, (σP , σQ ) is a pair of
behaviours that are unifiable modulo desynchronization, i.e., σP ⋈α α σQ . Then,
unifying these yields the behaviour σ such that:

∀n ∈ N : σ(b)(n) = ((n, n), vb ) and σ(x)(n) = ((2n − 1, 2n), 1) (20)

where vb = t if n is odd, and vb = f if n is even. In (20), the expression for
σ(b)(n) reveals that desynchronizing causes no distortion of logical time for b,
since (n, n) attaches the same tag to the two behaviours for unification. On
the other hand, the expression for σ(x)(n) reveals that desynchronizing actually
causes distortion of logical time for x, since (2n − 1, 2n) attaches different tags
to the two behaviours for unification. Thus P ∥ Q = ∅, but P (α∥α) Q consists
of the single behaviour defined in (20). Hence, P (α∥α) Q ≢ P ∥ Q in this case:
semantics is not preserved. □

3.2 A General Result on Correct Deployment


Here we analyse requirement (19). The following theorem holds (see (13) for the
notation Pρ used in this theorem):
Theorem 1. The pair (P1 , P2 ) satisfies condition (19) if it satisfies the follow-
ing two conditions:

∀i ∈ {1, 2} : (Pi )ρ is in bijection with Pi (21)


(P1 ∥ P2 )ρ = (P1 )ρ ∥ (P2 )ρ (22)

Comments. The primary application of this general theorem is when P and Q


are synchronous systems, and ρ = α is the desynchronization morphism. This
formalizes GALS deployment. Thus, Theorem 1 provides sufficient conditions
to ensure correct GALS deployment. Conditions (21) and (22) are not effective
because they involve (infinite) behaviours. In [5, 6], for GALS deployment, con-
dition (21) was shown equivalent to some condition called endochrony, expressed
in terms of the transition relation, not in terms of behaviours. Similarly, con-
dition (22) was shown equivalent to some condition called isochrony, expressed
in terms of the pair of transition relations, not in terms of pairs of sets of be-
haviours. Endochrony and isochrony are model checkable and synthesizable, at
least for synchronous programs involving only finite data types (see [5, 6] for a
precise statement of these conditions).
Proof. Inclusion ⊇ in (19) always holds, meaning that every pair of behaviours
unifiable in the right hand side of (19) is also unifiable in the left hand side.
Thus it remains to show that, if the two conditions of Theorem 1 hold, then
inclusion ⊆ in (19) does too. Now, assume (21) and (22). Pick a pair (σ1 , σ2 )
of behaviours which are unifiable in P1 (ρ∥ρ) P2 . Then, by definition of (ρ∥ρ) ,
the pair ((σ1 )ρ , (σ2 )ρ ) is unifiable in (P1 )ρ ∥ (P2 )ρ . Next, (22) guarantees that
(σ1 )ρ ⊔ (σ2 )ρ is a behaviour of (P1 ∥ P2 )ρ . Hence there must exist some pair
(σ1′ , σ2′ ) unifiable in P1 ∥ P2 , such that (σ1′ ⊔ σ2′ )ρ = (σ1 )ρ ⊔ (σ2 )ρ . Using the
same argument as before, we derive that ((σ1′ )ρ , (σ2′ )ρ ) is also unifiable with re-
spect to its associated (asynchronous) parallel composition, and (σ1′ )ρ ⊔ (σ2′ )ρ =
(σ1 )ρ ⊔ (σ2 )ρ . But (σ1′ )ρ is the restriction of (σ1′ )ρ ⊔ (σ2′ )ρ to its events labeled by
variables belonging to V1 , and similarly for (σ2′ )ρ . Thus (σi′ )ρ = (σi )ρ for i = 1, 2
follows. Finally, using (21), we know that if (σ1′ , σ2′ ) is such that, for i = 1, 2:
(σi′ )ρ = (σi )ρ , then: σi′ = σi . Hence (σ1 , σ2 ) is unifiable in P1 ∥ P2 . □
Corollary 1. Let P1 and P2 be synchronous systems whose behaviors are
equipped with some equivalence relation ∼, and assume that P1 and P2 are closed
with respect to ∼. Then, the pair (P1 , P2 ) satisfies condition (19) if it satisfies
the following two conditions:
∀i ∈ {1, 2} : (Pi )ρ is in bijection with Pi / ∼ (23)
(P1 ∥ P2 )ρ = (P1 )ρ ∥ (P2 )ρ (24)
where Pi / ∼ is the quotient of Pi modulo ∼.
Proof. Identical to the proof of Theorem 1 until the paragraph starting with
“Finally”. Finally, using (23), we know that if (σ1′ , σ2′ ) is such that, for i = 1, 2:
(σi′ )ρ = (σi )ρ , then: σi′ ∼ σi . Hence (σ1 , σ2 ) is unifiable in P1 ∥ P2 , since all
synchronous systems we consider are closed under ∼. □
This result is of particular interest when ∼ is the equivalence modulo stut-
tering, defined in [7].

Running Example, Cont’d. Since P and Q possess a single behaviour, they


clearly satisfy condition (21). However, the alternative condition (22) is vio-
lated: the left hand side is empty, while the right hand side is not. This explains
why semantics is not preserved by desynchronization, for this example. In fact,
it can be shown that the pair (P, Q) is not isochronous in the sense of [5, 6].

More Examples and Counter-Examples. Our running example was a counter-


example where condition (22) is violated.
For the following counter-example, condition (21) is violated: P emits to Q
two signals with names x and y. Signal y is emitted by P if and only if x > 0
(assuming, say, x integer). Signals x and y are awaited by Q. Formally:

P : σ(x)(n) = (n, −)
    σ(y)(n) = (m(n), −), where m(n) = min{m | m > m(n − 1) ∧ σ(x)(m) > 0}
                                                                         (25)
Q : σ(x)(n) = (k(n), −)
    σ(y)(n) = (l(n), −)

In (25), symbol − denotes an arbitrary value in the domain D, and k(.), l(.)
are arbitrary strictly increasing maps, from N to N. As the reader can check, P
satisfies (21) but Q does not. The desynchronization α is not semantics preserv-
ing for this pair (P, Q).
Now, consider the following modification of (P, Q): P′ emits to Q′ two signals
with names x and y. Signal y is emitted by P′ if and only if x > 0 (assuming, say,
x integer). In addition, P′ emits a boolean guard b simultaneously with x, and b
takes the value true iff x > 0. Signals x and y are awaited by Q′ . In addition, Q′
awaits the boolean guard b simultaneously with x, and Q′ knows that it should
receive y simultaneously with the true occurrences of b. Formally:


P′ : σ(x)(n) = (n, −)
     σ(b)(n) = if σ(x)(n) > 0 then (n, t) else (n, f )
     σ(y)(n) = (m(n), −), where m(n) = min{m | m > m(n − 1) ∧ σ(x)(m) > 0}
                                                                         (26)
Q′ : σ(x)(n) = (k(n), −)
     σ(b)(n) = (k(n), −)
     σ(y)(n) = (l(n), −), where l(n) = min{k(m) | k(m) > l(n − 1) ∧ σ(b)(m) = t}

The guard b explicitly says when y must be awaited by Q′ ; this guarantees
that Q′ satisfies (21) (and so does P′ ). On the other hand, the pair (P′ , Q′ )
satisfies (22). Thus the modified pair (P′ , Q′ ) is semantics preserving, for desyn-
chronization. The modification, from (P, Q) to (P′ , Q′ ), has been obtained by adding the
explicit guard b. This can be made systematic, as outlined in [6].
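
The role of the guard can be illustrated operationally. In the sketch below (ours; stream and function names are hypothetical), Q′ consumes the asynchronous y-stream exactly at those x-instants where the guard b is true, thereby reconstructing the synchronization that desynchronization erased.

    from typing import Any, Iterator, List, Tuple

    def resynchronize(xs: List[int], bs: List[bool],
                      ys: List[Any]) -> Iterator[Tuple[int, bool, Any]]:
        """Illustration of the modified pair (P', Q'): x and b arrive together,
        y arrives on a separate asynchronous stream, and the guard b tells Q'
        at which x-instants a y must be consumed (b is true iff x > 0)."""
        y_iter = iter(ys)
        for x, b in zip(xs, bs):
            y = next(y_iter) if b else None      # wait for the next y only when b is true
            yield (x, b, y)

    xs = [3, -1, 5, 0, 2]
    bs = [x > 0 for x in xs]                     # the guard emitted by P' together with x
    ys = ["y1", "y2", "y3"]                      # one y per positive x, in order
    assert list(resynchronize(xs, bs, ys)) == [
        (3, True, "y1"), (-1, False, None), (5, True, "y2"),
        (0, False, None), (2, True, "y3"),
    ]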

4 Discussion and Perspectives


In [11] the following result was proved. For P and Q two synchronous systems
such that both P , Q, and P ∥ Q are functional, clock-consistent, and with loop-
free combinational part, then P ∥ Q can be seen as a Kahn network—for our
purpose, just interpret Kahn networks as functional asynchronous systems. This
result applies to functional systems with inputs and outputs; it gives no help
for partial designs or abstractions. Our conditions of endochrony and isochrony
allow us to deal even with partial designs, not only with executable programs.
Hence, they do represent effective techniques that can be used as part of the
formal foundation for a successive-refinement design methodology.
As said before, this paper extends the ideas of [21] on desynchronization. A
more naive “token-based” argument to explain GALS deployment is also found
in [8], Sect. V.B. This argument is closely related to the use of Marked Graphs
in [12] to justify GALS desynchronization in hardware.
Another example can be found in the theory of latency-insensitive design [10]:
here, if P ∥ Q is a specification of a synchronous system and P and Q are stallable
processes, then it is always possible to automatically derive two corresponding

patient processes Pp and Qp that seamlessly compose to give a system imple-
mentation Pp ∥ Qp that preserves semantics while being also robust to arbitrary,
but discrete, latency variations between P and Q. Again, Pp ∥ Qp is a correct
deterministic executable system made of endochronous sub-systems. In fact, as
the notions of stallable system and patient system correspond respectively to the
notions of stuttering-invariant system and endochronous system, Corollary 1 sub-
sumes the result presented in [10] on the compositionality of latency-insensitivity
among patient processes.
Now, the remaining key challenge is to make the theory of this paper effective.
In this respect, Theorem 1 and its corollary are not enough, since they involve
(infinite) behaviours. What is needed is a sort of counterpart of “automata”
for our Tagged Systems. Synchronous Transition Systems as used in [6] are an
example. Order automata are another example, which can be associated with Tagged
Systems with causality relations. How to define such machines for general Tagged
Systems is our next objective. Then, having this at hand, we will have to properly
extend the “endochrony” and “isochrony” results of [6], thus providing effective
algorithms to generate adaptors that ensure correct-by-construction deployment
for general Tagged Systems.

References
1. R. Alur, T. Dang, J. Esposito, Y. Hur, F. Ivancic, V. Kumar, I. Lee, P. Mishra,
G. J. Pappas and O. Sokolsky. Hierarchical Modeling and Analysis of Embedded
Systems. Proc. of the IEEE, 91(1), 11–28, Jan. 2003.
2. R. Alur and D. L. Dill. A Theory of Timed Automata. Theor. Comp. Science,
126(2), 183–235, Apr. 1994.
3. ARTIST Network of Excellence. Roadmap on Hard Real-Time Development
Environments. Available in May 2003 from http://www.systemes-critiques.org/ARTIST/.
4. A. Benveniste. Some synchronization issues when designing embedded systems
from components. In Proc. of 1st Int. Workshop on Embedded Software, EM-
SOFT’01, T.A. Henzinger and C.M. Kirsch Eds., LNCS 2211, 32–49, Springer.
5. A. Benveniste, B. Caillaud, and P. Le Guernic. From synchrony to asynchrony. In
J.C.M. Baeten and S. Mauw, Eds., CONCUR’99, Concurrency Theory, 10th Intl.
Conference, LNCS 1664, pages 162–177. Springer, Aug. 1999.
6. A. Benveniste, B. Caillaud, and P. Le Guernic. Compositionality in dataflow syn-
chronous languages: specification & distributed code generation. Information and
Computation, 163, 125-171 (2000).
7. A. Benveniste, L. P. Carloni, P. Caspi, and A. L. Sangiovanni-Vincentelli. Het-
erogeneous reactive systems modeling and correct-by-construction deployment. In
R. Alur and I. Lee, Eds., Proc. of the Third Intl. Conf. on Embedded Software,
EMSOFT 2003, LNCS 2855, Springer, 2003.
8. A. Benveniste, P. Caspi, S. Edwards, N. Halbwachs, P. Le Guernic, and R. de Si-
mone. The Synchronous Language Twelve Years Later. Proc. of the IEEE,
91(1):64–83, January 2003.
9. G. Berry. The Foundations of Esterel. MIT Press, 2000.

10. L. P. Carloni, K. L. McMillan, and A. L. Sangiovanni-Vincentelli. Theory of


Latency-Insensitive Design. IEEE Transactions on Computer-Aided Design of In-
tegrated Circuits and Systems, 20(9):1059–1076, September 2001.
11. P. Caspi, “Clocks in Dataflow languages”, Theor. Comp. Science, 94:125–140, 1992.
12. J. Cortadella, A. Kondratyev, L. Lavagno, and C. Sotiriou. A concurrent model
for de-synchronization. In Proc. Intl. Workshop on Logic Synthesis, May 2003.
13. J. Eker, J.W. Janneck, E.A. Lee, J. Liu, J. Ludwig, S. Neuendorffer, S. Sachs,
and Y. Xiong. Taming heterogeneity—The Ptolemy approach. Proc. of the IEEE,
91(1), 127–144, Jan. 2003.
14. L. de Alfaro and T.A. Henzinger. Interface Theories for Component-Based Design.
In Proc. of 1st Int. Workshop on Embedded Software, EMSOFT’01, T.A. Henzinger
and C.M. Kirsch Eds., LNCS 2211, 32–49, Springer, 2001.
15. E.A. Lee and Y. Xiong. System-Level Types for Component-Based Design. In
Proc. of 1st Int. Workshop on Embedded Software, EMSOFT’01, T.A. Henzinger
and C.M. Kirsch Eds., LNCS 2211, 32–49, Springer, 2001.
16. G. Karsai, J. Sztipanovits, A. Ledeczi, and T. Bapty. Model-Integrated Develop-
ment of Embedded Software. Proc. of the IEEE, 91(1), 127–144, Jan. 2003.
17. N. Halbwachs, P. Caspi, P. Raymond, and D. Pilaud. The Synchronous Data Flow
Programming Language LUSTRE. Proc. of the IEEE, 79(9):1305–1320, Sep. 1991.
18. D. Harel. Statecharts: A visual formalism for complex systems. Science of Com-
puter Programming, 8(3):231–274, June 1987.
19. H. Kopetz, Real-Time Systems: Design Principles for Distributed Embedded Ap-
plications. Kluwer Academic Publishers. 1997. ISBN 0-7923-9894-7.
20. P. Le Guernic, T. Gautier, M. Le Borgne, and C. Le Maire. Programming real-time
applications with SIGNAL. Proc. of the IEEE, 79(9):1326–1333, Sep. 1991.
21. P. Le Guernic, J.-P. Talpin, J.-C. Le Lann, Polychrony for system design. Journal
for Circuits, Systems and Computers. World Scientific, April 2003.
22. E.A. Lee and A. Sangiovanni-Vincentelli. A Framework for Comparing Models of
Computation. IEEE Trans. on Computer-Aided Design of Integrated Circuits and
Systems, 17(12), 1217–1229, Dec. 1998.
23. M. Mokhtari and M. Marie. Engineering Applications of MATLAB 5.3 and
SIMULINK 3. Springer, 2000.
Machine Function Based Control Code Algebras

Jan A. Bergstra1,2
1
University of Amsterdam, Programming Research Group
janb@science.uva.nl
2
Utrecht University, Department of Philosophy, Applied Logic Group
janb@phil.uu.nl

Abstract. Machine functions have been introduced by Earley and Stur-


gis in [6] in order to provide a mathematical foundation of the use of the
T-diagrams proposed by Bratman in [5]. Machine functions describe the
operation of a machine at a very abstract level. A theory of hardware and
software based on machine functions may be called a machine function
theory, or alternatively when focusing on inputs and outputs for machine
functions a control code algebra (CCA). In this paper we develop some
control code algebras from first principles. Machine function types are
designed specifically for various applications such as program compilation,
assembly, interpretation, managed interpretation and just-in-time com-
pilation. Machine function dependent CCA’s are used to formalize the
well-known compiler fixed point, the managed execution of JIT compiled
text and the concept of a verifying compiler.

1 Introduction
Machine models can be given at a very high level of abstraction by using so-
called machine functions, a concept due to [6] as a basis. Machine functions
are hypothetical functions, which may be classified by machine function types.
Machine function types provide information about the expected inputs and out-
puts, or more general the behavior of a machine. Machine functions are named
elements of machine function types. Machine functions are used as the primitive
operators of a control code algebra. Machine functions may be viewed as black
box containers of behavior. It is not expected that machine functions are actu-
ally either formally specified or algorithmically given in any detail. Important,
however is the realization that different machine architectures may use different
machine functions as realizations of the same machine function type. A number
of issues is can be clarified using machine functions only: the so-called compiler
fixed point, the distinction between compilation and interpretation, the role of
intermediate code, managed execution and JIT compilation, and finally verifying
compilation.

1.1 Motivation
The identification of machine function types and the description of machine
models as composed from named but otherwise unspecified machine functions
may be helpful for the following reasons:


– A multitude of different machine models and instantiations thereof can be


described in some formal detail while at the same time ignoring the massive
complexities of processor architecture and program execution.
– Machine function theory can be used to obtain logically complete machine
models. A machine model is logically complete if all of its concepts are ex-
plained in internal technical terms in a bottom up fashion, permitting a
reader to understand the model and stories about it without the use of sub-
ject related external knowledge and without understanding in advance the
whole story of modern computer architecture.
– By giving names to machine functions simple calculations are made possible
which may produce insights unavailable otherwise. In addition system spec-
ifications can be given in terms of combinations of requirements on a family
of named machine functions. Machine function theory is a very elementary
axiomatic theory of machine behavior, not claiming that the essential prop-
erties of machine functions are captured by axioms. The more limited claim,
however, is that the role that machine functions play in a larger architectural
framework can be analyzed in some useful detail.
– Pseudo-empirical semantics explains the meaning of codes and concepts re-
garding codes by means of hypothetical experiments with the relevant ma-
chine functions. The phrase ‘the meaning of a code is given via its compiler’
belongs to the dogmas of pseudo-empirical semantics. Pseudo-empirical se-
mantics provides definitions close to what computer users without a back-
ground in formal semantics may have in mind when working with machines.

1.2 Control Code Algebras


Each machine function gives rise to an algebra with codes as its domain. Codes
are represented as finite sequences of bits. The codes play the role of input data
and output data as well as of control codes. As the main focus of the use of
machine functions is in analyzing semantic generalities of control codes at a
very abstract level, these algebras will be referred to as control code algebras
(CCA’s).¹ Control code algebra (CCA) is the meta theory of the control code
algebras. It might just as well be called machine function theory.
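
To make the setting concrete, the following toy Python sketch (ours; it is not Bergstra's formalism, just an illustration of the vocabulary) represents codes as bit strings and a machine function as a map from a list of input codes and an output index to either a code, the error outcome M, or the divergence outcome D.

    from typing import List, Union

    Code = str                       # a code: a finite bit sequence, e.g. "010110"

    class Meaningless:               # the outcome M: this output is not produced
        pass

    class Divergent:                 # the outcome D: the computation does not terminate
        pass

    M, D = Meaningless(), Divergent()
    Outcome = Union[Code, Meaningless, Divergent]

    def toy_machine_function(inputs: List[Code], n: int) -> Outcome:
        """f_n(inputs): the n-th output of a made-up machine function that
        simply concatenates its first two input codes into one output."""
        if len(inputs) < 2:
            return M                 # missing arguments yield M rather than D
        if n == 1:
            return inputs[0] + inputs[1]
        return M                     # only finitely many outputs; beyond them, M

    assert toy_machine_function(["01", "10"], 1) == "0110"
    assert isinstance(toy_machine_function(["01"], 1), Meaningless)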

1.3 Scope of CCA


The simplest CCA may be viewed as another form of theory of T-diagrams from
[5], and implicitly present in [7], which has been further developed in [6]. Similar
work appeared in [1]. Clearly the limits of CCA are reached when modeling a
phenomenon requires significant information concerning the details of machine
code execution at the lowest level of abstraction. An extreme view might be that
each insight in computer software processing which is independent of the details

¹ The acronym CCA is a mere abbreviation, having well over a million hits on Google
and in no way specific. Its expansion ‘Control Code Algebra’ generates no single hit,
however, at the time of this writing.
Machine Function Based Control Code Algebras 19

of code may lie within the scope of CCA. Here are some interesting issues for
which that is unclear:
– Can the phenomenon of computer viruses be modeled at the level of
CCA? Many computer users are supposed to use their computers in such a
way that infections are prevented, while no plausible argument can be given
that these users must understand the details of how a machine works. The
question may therefore be rephrased as follows: can an explanation via an
appropriate CCA be given of what machine users must do in order to prevent
virus infections of their systems? And, moreover, can it be formulated what
requirements systems must satisfy if these prescriptions for user behavior are
to guarantee that systems remain clear of infections?
– Another question that may be answered at the level of an appropriate CCA
is what an operating system is. Textbooks on operating systems never intro-
duce the concept of an operating system as a concept that merits a definition
in advance. Substantial questions concerning what minimum might be called
an OS and whether or not an OS is a computer program or a control code
are impossible to answer without proper definitions, however.
– A third question which may or may not lie within the scope of CCA is the
question as to what constitutes a software component. By its nature the
concept of a software component abstracts from the details of computation.
But a first attempt to define software components in the framework of CCA
reveals that some non-algorithmic knowledge about software component in-
ternals may be essential and outside the scope of CCA at the same time.

2 External Functionality Machine Functions


A way to capture the behavior of a machine is to assume that it has a simple
input output behavior, taking a number of bit sequences as its input and
producing similar outputs. Taking multiple inputs into account is a significant
notational overhead but it will be indispensable for instance when discussing
the notion of an interpreter (for the case of multiple inputs) whereas multiple
outputs arise with a compiler that may produce a list of warnings in addition
to its compilation result. A list (with length k > 0) of bit sequences (codes) is
taken as an input and the result takes the form of one or more bit sequences. If f
is the name of a machine function and n is a natural number then fn names the
mapping that yields the n-th result. The result of f on an input list y may be
undefined (M , for meaningless) or divergent (D). In case of divergence none of
the outputs gets computed; in the case of convergence, from some index onwards
all outputs are M , indicating that no output is provided at those positions.
are imposed:
∀n, m (fn (y) = D → fm (y) = D)
∀n, m (fn (y) = M & m > n → fm (y) = M )
∀n (fn (y) ≠ D → ∃m > n fm (y) = M )
∀n (fn (y ⌢ z) = M → fn (y) = M )
∀n (fn (y) = D → fn (y ⌢ z) = D)
(here y ⌢ z denotes the concatenation of the input lists y and z)
These rules express that
– The sequence of outputs is computed in a single computation. A divergence
implies that no bit sequences can be produced, whereas an error (M ) only
indicates that a certain output is not produced. Only finitely many output
bit sequences are produced, and beyond that number an error occurs.
– There is one algorithm doing the computation accessing more arguments
when needed. When arguments are lacking an error (M ) will occur, rather
than D. Providing more inputs (y ⌢ z instead of y) cannot cause an error
(M ). In other words if a computation on a list of inputs runs into an error
then each computation on a shorter list must run into an error as well.
– A divergence taking place on a given sequence of inputs cannot be prevented
by supplying more inputs. (This in contrast with an error which may result
from a lack of inputs.)
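As an illustration only (this sketch and the names nth_output and mf_compile are not part of the paper's development), the first axioms can be spot-checked on a toy bit sequence machine function in Python:

M = "M"   # meaningless: no output at this position
D = "D"   # divergence of the whole computation

def nth_output(f, n, inputs):
    """Realizes f_n: the n-th result (counting from 1) of machine function f,
    which maps a list of bit strings to either D or a finite list of bit strings."""
    result = f(inputs)
    if result == D:
        return D
    return result[n - 1] if n <= len(result) else M

def mf_compile(inputs):
    """Toy machine function: expects one input and 'compiles' it by reversing the
    bits; the second output is an (empty) error listing. A missing argument yields
    no outputs at all (every index is M), not divergence."""
    if len(inputs) < 1:
        return []
    return [inputs[0][::-1], ""]

y = ["1011"]
assert nth_output(mf_compile, 1, y) == "1101"
assert nth_output(mf_compile, 3, y) == M            # beyond the last output: M
assert nth_output(mf_compile, 1, []) == M           # a lacking argument gives M, not D
assert nth_output(mf_compile, 1, y + ["0"]) != M    # extra inputs do not introduce M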

2.1 External Functionality


A machine function produces external functionality if the transformations it
achieves are directly contributing to what users intend to do. A typical example
may be a formatting/typesetting functionality. Often user commands directly
invoke the operation of an external functionality.
External functionality machine functions describe external functionalities.
Their operation can (but need not) be exclusively under the control of manu-
ally entered user commands. Here is a somewhat artificial example involving a
compiler and a cross-assembler producing executable code for another machine.
Because the produced code must be moved elsewhere before use, the machine's
operation may be regarded as external functionality.
Using Bit Sequence Machine Functions, an Example. One may imagine
a machine P powered by the bit sequence machine function Fc, requiring one
input and producing three outputs for each input, and the bit sequence machine
function Fd also requiring a single input and producing 2 outputs in all cases.
The inputs are taken from a stack, the first input from the top and so on, while
the outputs are placed on the stack in reversed order, after removing the inputs.
For instance Fc takes a program as input and compiles it in the context of a list of
programs (encoded in the second argument), producing the output code, a listing
with error messages and warnings and an assembled version of the compiled code.
Fd applies a disassembler to its first argument producing the result as well as a
listing of errors and/or warnings.
A manual operation of the system is assumed with the following instructions:
s.load:f , pushing file f on the stack and s.store:g, placing the top of the stack
in file g while subsequently popping the stack. Here s stands for the machine
command interface able to accept and process manual user commands. s.Fc is
the command that invokes the operation of the first of the two machine functions

and produces three output sequences that are placed on top of the stack. s.Fd
is the command for the second one, which places two results on the top of the
stack consecutively. How such commands are entered is left unspecified in the
model at hand.
Consider the command sequence
CS = s.load:f;s.Fc;s.store:g;s.pop;
s.pop;s.load:g;s.Fd;s.store:h;s.pop
If the content of f before operation is x then after performing CS the content
of h is Fd1 (Fc1 (x)). This output is probably empty if either one of the two
commands leads to error messages, whereas if the error message output is empty
one may assume that the other outputs are ‘correct’. It also follows from this
model that the operation of the system is deterministic for command sequences of
this kind. Several potentially relevant system properties may now be formulated:
Code is produced unless errors are found: Fc2 (x) = [ ] → Fc1 (x) ≠ [ ]
Disassembly succeeds unless errors are found: Fd2 (x) = [ ] → Fd1 (x) ≠ [ ]
Disassembly inverts assembly: Fc2 (x) = [ ] → Fd1 (Fc3 (x)) = Fc1 (x)
The use of machine functions in the examples mentioned above is in making
these requirement descriptions more readable than would be possible without
them. With these few primitives there appears to be no significant reasoning
involving calculation, however.
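A small executable sketch of the stack-operated machine described above may make the bookkeeping explicit; the toy Fc and Fd below, the command strings, and all function names are ad hoc assumptions, not the paper's definitions:

def Fc(x):
    # toy 'compiler': output code, error listing (empty), 'assembled' code
    return [x + "-code", "", x + "-asm"]

def Fd(x):
    # toy 'disassembler': disassembled code and an empty error listing
    return [x + "-dis", ""]

def run(commands, files):
    stack = []
    for cmd in commands:
        if cmd.startswith("load:"):
            stack.append(files[cmd[5:]])      # push the file content
        elif cmd.startswith("store:"):
            files[cmd[6:]] = stack.pop()      # store the top of the stack, then pop
        elif cmd == "pop":
            stack.pop()
        elif cmd == "Fc":
            stack.extend(reversed(Fc(stack.pop())))   # outputs pushed so Fc1 ends on top
        elif cmd == "Fd":
            stack.extend(reversed(Fd(stack.pop())))
    return files

files = {"f": "1011"}
CS = ["load:f", "Fc", "store:g", "pop", "pop", "load:g", "Fd", "store:h", "pop"]
run(CS, files)
assert files["h"] == Fd(Fc("1011")[0])[0]     # the content of h is Fd1(Fc1(x))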

Non-programmable Machines. If a machine is given by means of a finite
list of machine functions one may imagine its user to be able to enter inputs,
select a function to be applied and to retrieve outputs thereafter. We will not
provide syntax or notation for the description of these activities. As it stands a
non-programmable machine may be given in terms of finite listing of machine
functions. As the number of machine functions of a machine increases it becomes
increasingly useful to represent machine functions as codes and to use a single
universal machine function from which the other machine functions are derived.
In this way a control code dependent machine emerges. This universal machine
function is used to bring other codes into expression. Therefore the phrase code
expression machine functions will be used rather than the term universal machine
function, which is somewhat unclear regarding the scope of its universality.

3 Code Expression Machine Functions


It is now assumed that the first argument of a machine function consists of a bit
sequence which is viewed as control code for a machine. Through the application
of the machine function the control code finds its expression. Formally there is
no difference with the case of an external functionality machine function. But the
idea is that without a useful control code (i.e. a first argument for the machine
function) no significant external functionality may be observed. It is a reasonable
assumption for the simplest model that a machine which can be controlled via
exchangeable data must be controlled that way. At this stage the important
question of what makes a bit sequence a control code (rather than just its place in
the argument listing of a machine function) is left for later discussion.
For code expression machine functions another notation is introduced which
separates the first and other arguments. A notation for the n’th result of a
code expression function taking k arguments (except the code argument x) is as
follows:
x • •n y1 , .., yk
By default the first result is meant if no superscript is given:

x • •y1 , .., yk = x • •1 y1 , .., yk

If the name f is given this can be made explicit with x • •nf y1 , .., yk . The
semantics of a control code x w.r.t. a machine function is the family of all
machine functions taking x fixed, denoted |x|•• . Thus semantic equivalence for
control codes reads

x ≡beh z ↔ |x|•• = |z|•• ↔ ∀n, k, y1 , .., yk (x • •n y1 , .., yk = z • •n y1 , .., yk ).

With an explicit name of the machine function this reads:

x ≡beh z ↔ |x|••f = |z|••f ↔ ∀n, k, y1 , .., yk (x • •nf y1 , .., yk = z • •nf y1 , .., yk ).
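As a hedged illustration (the names cemf and beh_equiv, and the toy codes, are assumptions), behavioral equivalence of control codes can be approximated over a finite set of test argument lists:

def cemf(x, ys):
    # toy code expression machine function: the control code x selects the operation
    # applied to the remaining inputs ys; unknown codes yield no outputs
    if x == "00":                       # concatenate all inputs
        return ["".join(ys)]
    if x in ("01", "10"):               # two distinct codes with identical behaviour:
        return ["".join("1" if b == "0" else "0" for b in ys[0])] if ys else []
    return []

def beh_equiv(x, z, tests):
    # x ≡beh z, approximated over a finite collection of test argument lists
    return all(cemf(x, ys) == cemf(z, ys) for ys in tests)

tests = [["1010"], ["0", "1"], ["111"]]
assert beh_equiv("01", "10", tests)       # syntactically different, behaviourally equal
assert not beh_equiv("00", "01", tests)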

Bit sequence generating machine functions are less useful when it comes to
the description of interactive systems. But code expression machine functions
are very simple as a concept and they can be used until a lack of expressive
power or flexibility forces one to move to a different model.2

3.1 Split Control Code Machine Models


A code expression machine function − ••f − determines all by itself a machine
model. For an execution, which takes a single step, a triple consisting of the code, a
sequence of inputs, and the machine function is needed. This may be formalized
as mf (x, y). The code is not integrated in the machine in any way. Thus it is

2 The discussion may be made more general by using a terminology consistent with
the possibility that a machine function produces an interactive behavior (i.e. a pro-
cess). A bit sequence generating machine function is just a very simple example of a
behavior. If the behavior of a machine is described in terms of a polarized process its
working may be determined through a function that produces a rational polarized
behavior over a given collection A of basic actions from the codes that have been
placed in the machine as an input or as a control code. There is reason to consider such a
mapping, say FM : BS × BS × BS → BPPA(A), as the description of a
machine if a control code dependent machine is to be compared with a programmable
machine. BPPA is the algebra of basic polarized processes from [2]. As will become
clear below polarized processes are well-suited for the specification of programs and
programmed machine behavior. In case such a machine needs to be viewed as a
control code dependent machine a polarized process machine function results.

implausible to speak of a stored code. For that reason the model is classified
as a split control code machine model. This is in contrast with a stored code
machine model for which it is required that code storage is achieved by means of
the same mechanisms as any storage during computation. As nothing is known
about these mechanisms due to the abstract nature of machine functions, being
no more than formal inhabitants of their types, the model cannot be classified
as a stored control model.
Having available the notion of a split control code machine, it is possible
to characterize when a code is an executable for it: x is an executable for ••f
if for some y, x ••f y ≠ M . Because this criterion depends on an application
of the code in an execution it is a dynamic criterion. In practice that may be
useless and entirely undecidable. For that reason a subset (or predicate) Ec of the
executables may be put forward as the collection of ‘certified’ executables. Here
it is understood that certification can be checked efficiently. A pair (••f , Ec ) is
a split control machine with certified executables.
It is always more or less taken for granted that modern computer program-
ming is based on the so-called stored program computer model. In the opinion
of the author a bottom up development of the concept of a program starts with
a split program machine model, however, the register machine constituting a
prime example of a split program machine model. Having available a split pro-
gram machine model one may then carry on to develop stored program machine
models. The usual objection against this argument is that a Turing machine ([9])
constitutes a stored program machine model already. That, however, is debat-
able because it is not clear which part of the tape content or state space of a
Turing machine might be called a program on compelling grounds. Usually some intuition
from an implicit split machine model is used to provide the suggestion that a
part of the tape contains a program in encoded form. That, however, fails to be
a compelling argument for it being a program.

3.2 Two Conceptual Issues on Split Code Machine Models


The split control code machine model aims at providing a very simple account of
control code dependent machines while ignoring the aspect of the control code
being either a program or being cast in software. There are two philosophical
issues that immediately emerge from the introduction of the concept of a split
control code machine.
The Control Code Identification Problem. An explanation is needed for
why the first code argument is said to contain the control code. It seems to be a
rather arbitrary choice to declare the first argument the control code argument
rather than for instance the second argument. This is the control code iden-
tification problem for an external machine function. The question concerns the
machine function itself, irrespective of any of its actual arguments. So the problem
is: determine, given a code expression machine function, which of its arguments
contains the control code, if any.

Below in 5.2 we will see that the situation is not symmetric indeed and sound
reasons may exist for taking one code argument to play the role of containing
the control code rather than another argument.3

The Program Recognition Problem. An obvious question raised by the split


control code machine model is this: under which circumstances is it appropriate
to view an executable control code as a program? This is the program recognition
problem. It presumes that the control code identification has successfully led to
the classification of an argument position as the (unique) position for control
code. This question is important for an appreciation of the distinction between
control code an programs. This question can be answered first by means of the
hypothetical program explanation principle which has been proposed in [3].

4 Split Program Machine Models


Complementary to split control code machine models there are split program
machine models. For the development of CCA it is in fact immaterial how the
concept of a program is defined. What matters is that for some code to be called
a program some definition of what it is to be a program is needed. Let some
theory of programming be given which provides one with a reliable definition of
the concept of a program.
A split code machine model (i.e. a code expression machine function) quali-
fies as a split program machine model if there is for each code a mapping to an
acknowledged program notation (which may be hypothetical) for a (hypotheti-
cal) machine model such that the machine function describes the behavior of the
(hypothetical) machine as programmed by a code (viewed as a program). Thus
each code may be read as a program for a machine with a well-understood op-
erational meaning which produces results that happen to coincide with those of
the given machine function. In other words the code expression machine function
is an abstraction of a split program machine model.
As a notation for a split program machine model with name f we take

x • •nspm−f y1 , .., yk

The semantics of a program u in a split program machine model is derived in
a completely similar style to the split control code case with notation |u|••spm−f .
A split program machine model will also be called an analytical architecture
because it focuses on the analysis and explanation of behavior, whereas a split
control code machine model may be called a synthetic architecture because it
focuses on the components from which the machine is physically synthesized,
without any commitment to the explanation of behavior.

3 When investigating stored control code machines or stored program machines this
question reappears in the following form: which memory compartment contains con-
trol code or program code?

4.1 Hypothetical Architectures and Program Recognition


For this section it will be assumed that a split program machine is like a split
control code machine with the additional information that the control code con-
stitutes a program irrespective of the definition of a program that has been used.
Further it is assumed that the behavior for a split program machine can be found
in such a way that the program helps to understand the code expression machine
function (or process machine function) at hand. That is to say, the split program
machine is considered more easily understood or explained because its program
and the working of that program is known and it is also known how the program
is transformed into the produced behavior by means of interaction with other
relevant parts of the machine.
A control code is just data without any indication that it has the logical
structure of a program.
Having available the concept of a split control code machine model one may
then investigate under which conditions a split control code may appropriately
be called a control code and even a program. This is the program recognition
problem that was mentioned above.

The Program Recognition Problem: An Informal Solution. A code is a
program if it can be viewed as the product of some computable transformation
translating (machine) programs for some conceivable programmable architecture
to code for a split code machine. This state of affairs indicates that the reasons
for classifying code as a program lie in the possibility of reverse engineering, or
rather disassembling, the code into a program for some programmable machine,
which may serve as an explanation for the given code.

The Program Recognition Problem: A Formalized Solution. Given the
split control code machine ••f and a collection R of relevant functionalities for
••f , a control code x is a program if there exists a split program machine model4
••spm−g and a computable code generation mapping5 ψg2f from programs for
••spm−g to codes for ••f such that

– Hypothetical program existence: x is in the range of ψg2f (i.e. for some z,
x = ψg2f (z)),
– Hypothetical assembler soundness: for all programs u (for ••spm−g )
|ψg2f (u)|••f = |u|••spm−g , and:
– Functional assembler completeness: for all control codes x for ••f : if the
behavior |x|••f belongs to the collection R, which covers the relevant func-
tionalities for which the machine model has been designed, then either
• x has a disassembled version (z with ψg2f (z) = x) (which qualifies as
a program for the split program model architecture providing the same
functionality), or otherwise,

4 The hypothetical programmed machine architecture.
5 Also called assembler mapping, the inverse being an assembly projection.

• there is a program z such that |z|••spm−g = |x|••f and consequently
|ψg2f (z)|••f = |x|••f .
This last condition guarantees that the split program machine has not been
concocted specifically to turn the particular control code x into a program (ac-
cording to our proposed definition) but instead that this hypothetical machine
provides an explanation of the full operational range of ••f by providing a pro-
gram for each relevant behavior.

4.2 Control Code Need Not Originate Through Programming


One may wonder whether split control code is in all cases a transformation prod-
uct of programs. If that were true the conceptual distinction between programs
and code is marginal, a matter of phase in the software life-cycle only, and as a
consequence the concept of control code is only seemingly independent of that
of a program. We give two examples of control code that fails to qualify as a
program. The conclusion is drawn that there are (at least) two entirely different
strategies for computer software engineering, only one of which involves pro-
gramming. Machine functions provide an abstraction level that both strategies
have in common.
Two Examples of Non-programmed Control Code. A counterexample to
the hypothesis that all control code originates as the result of program produc-
tion may be as follows: one imagines a neural network in hardware form, able to
learn while working on a problem thereby defining parameter values for many fir-
ing thresholds for artificial neurons. Then the parameter values are downloaded
to a file. Such a file can be independently loaded on a machine, depending on
the particular problem it needs to address. These problem dependent parameter
files can be considered control code by all means. In all likelihood they are not
programs in any sense supported by our theory of programming, however. The
particular property of neural networks is their ability to acquire control code
along another path than human based computer programming.
Another example of control code which may rather not be understood as a
program is the geographical information downloaded in a purely hardware made
robot together with a target location it is supposed to find. The robot will apply
its fixed processing method to these data, but the data determine the behavior of
the robot. The loaded data constitute control code (this follows from the absence
of other software in the robot). But programming (compliant with the assumed
theory of computer programming) has played no role in the construction of this
control code.
In both mentioned examples of control code which is not the result of pro-
gram transformation artificial intelligence plays a role. In the case of the neural
network the learning process itself is an example of artificial intelligence, whereas
in the case of the robot the processing performed by the robot must involve sig-
nificant AI applications. In the robot case the preparation of control code is
similar to the briefing a human being might get when confronted with the same
task.

Two Software Engineering Strategies. Machine learning may altogether
be understood as an alternative form of control code production, contrasting
with the currently more usual control code development starting with computer
programming and followed by compilation and assembly phases.
Examples of non-programming based software engineering outside artificial
intelligence seem to be hard to find. Both examples above embody different
aspects of artificial intelligence based software engineering: control code con-
struction by means of artificial intelligence and control code construction for
an artificially intelligent system play a role in non-programming based software
engineering. Therefore software engineering, in terms of the software construction
techniques it covers, is strictly larger than computer programming, as it also covers
these AI based techniques.

A Third Option: Genetic Control Code Evolution. A third option for
software engineering that may avoid programming as well as neural network
training lies in the application of genetic algorithms. This involves a number of
operators for constructing new control codes from (pairs of) known ones. Then
by randomly applying these operations on some initial codes and by filtering out
codes that are bad solutions for the software engineering problem at hand an
evolutionary process may produce adequate solutions.

5 The Code Identification Problem


The essential simplification of split control code in comparison to split programs
lies in the decision to view code as binary data without any need to explain why
these data play the role of programs or may be best understood as programs.
This, however, leaves open the question as to why a certain code is classified as
a control code.

5.1 Splitting Control Code from Inputs


Given a split control code machine, one may take its first argument as just one
of the arguments, thus obtaining a code expression machine function. Looking at
the split control code machine from this level of abstraction, a question similar
to the program recognition problem appears: why has the first of the arguments
been split off to play the role of the control code? The notion of overruling is now
proposed to deal with this issue.

Symmetry Prevents Control Code Identification. It is useful to experi-
ment for a while with one simple design for the split control code machine, by
assuming that there is just a single input. In particular consider the split control
code machine ••f . By forgetting the control code role of the first argument a
control code independent machine is found (say Sf ) and its behavior is given by
Sf (x, y) = x ••f y. The only possible justification for making the first argument
play the role of a control code and the second code argument the role of an
input code must lie in properties of the code expression machine function Sf .

Indeed suppose, hypothetically that for all x and y, Sf (x, y) = Sf (y, x). Then
the symmetry is complete and a justification for putting the first argument of
Sf in the role of control code and the other argument in the role of input data
cannot be found.
A justification for putting the first argument in the control code role can only
be found if the code expression machine function is asymmetric and if, moreover,
the first code argument (x) can be said to be ‘more in control’ than the second
one (y) or any other argument. The control code identification problem will now
be simplified to the case that the code expression machine function has exactly
two inputs. The task is to select the control code input, if that can be done.
Here are three informal criteria that may be applied to argue that indeed, the
first argument has the role of a control code:

Overruling Argument Positions. For a two place function F : BS × BS →
BS ∪ {D, M } the first argument overrules the second if there can be found
different sequences e.g. O1 = ”0” and O2 = ”1” and codes x1 and x2 such that
for all code arguments y, F (x1 , y) = O1 and F (x2 , y) = O2 .
If the first argument overrules the second, the second one cannot overrule
the first argument at the same time. Indeed suppose that O3 ≠ O4 and that y3
and y4 are found such that for all x, F (x, y3 ) = O3 and F (x, y4 ) = O4 . Then
it follows that O3 = F (x1 , y3 ) = O1 = F (x1 , y4 ) = O4 , which contradicts the
assumptions.
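The overruling criterion lends itself to a small experiment; the following sketch (the toy F, the sample of codes and the name overrules_first are assumptions) checks it over a finite sample:

def F(x, y):
    # toy code expression machine function in which the first argument plays the
    # control role: code "0" always outputs "0", code "1" always outputs "1",
    # any other code simply echoes the second argument
    if x == "0":
        return "0"
    if x == "1":
        return "1"
    return y

def overrules_first(F, codes, O1="0", O2="1"):
    # over a finite sample: do witnesses x1, x2 force the distinct constant outputs O1, O2?
    return any(
        all(F(x1, y) == O1 for y in codes) and all(F(x2, y) == O2 for y in codes)
        for x1 in codes for x2 in codes
    )

codes = ["", "0", "1", "0110"]
assert overrules_first(F, codes)                          # the first position overrules
assert not overrules_first(lambda x, y: F(y, x), codes)   # hence the second one cannot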

5.2 Control Code Overrules Non-control Code


Consider Sf (x, y); the first argument is said to be at the control code
position (for Sf ) if the first argument overrules the second argument. If this
condition is met, it solves the control code identification problem
and justifies the notation x ••f y = Sf (x, y).
The criterion of overruling becomes more interesting if there are more than
two code arguments needed to have a successful (not yielding M ) computation
of the split control code function. Below the concept of a split interpreter con-
trol machine model will be outlined. In the case of split interpreter control two
argument positions have great influence on machine operation: the control code
position where the interpreter control code is located and the position where the
code to be interpreted is located. In this setting the first position overrules the
second position and the second position overrules the third position. But the
dependence of behavior from the second argument is more flexible than for the
first argument.

6 Control Code Assembly Notations


In the sequel the assumption is made that for the class of split control code
machines of interest all control codes are actually transformed programs in the
sense of 4.1. This assumption justifies the jargon used.

6.1 Executables
Given a split control code machine the control codes may as well be called exe-
cutables. This phrase is often used if control codes are transformed programs. It
is common to make a distinction, however, between arbitrary control codes and
executable control codes, or simply ‘executables’ where executables are those
control codes really meant for execution on the model. Lacking any intrinsic
criterion as to which codes are to be considered executable it is reasonable to
assume that some collection Ec of codes comprises the executables. This collec-
tion is specific to the code expression machine function of the machine model
under consideration.

6.2 Assembly and Executable


One may consider the notion of an assembly notation for a given split control
code machine with certified executables Ec . An assembly code should be a useful
tool for a human author in the production of control code. It allows the produc-
tion of readable and meaningful texts which may be automatically transformed
into control codes by means of an assembler which is given in the form of an-
other control code. In this discussion the question who wrote the assembler and
how this may have been done without the help of an appropriate design code is
ignored. The simplest view is that producing the assembler in the absence of a
suitable control code design notation has been an enormous amount of work that
may be seen as a part of the investment of the development of a split control
code machine, which is useless otherwise.
A control code assembly notation (say A) is simply viewed as a subset of the
possible codes. An assembler for A is an executable control code n:a2e:e. Here
the colon is part of the name, which thereby carries the following information:
code reference n including a version number, functionality a2e (from assembly
to executable) and code class e (for executable). The assembler must satisfy this
criterion: for each code x:a ∈ A, n:a2e:e •• x:a ∈ Ec . Thus, an assembler transforms
each control code design into an executable.6

6.3 Assembling an Assembler in Assembly


An interesting thought experiment now runs as follows. Let an assembler x1:a2e:e
for A be given. Assume that a version of an assembler for A is made available,
written in the assembly notation A itself.7 The following name
will be used for this new version of the assembler: u:a2e:a. The name combines a
local name (u), a functionality (a2e) and a notation that is used (A). The new
assembler cannot be used without preparatory steps. It needs to be assembled

6 It is a common additional requirement that for codes outside A a non-executable
output is produced together with a second output which contains a list of so-called
errors allowing the control code designer to find out what is wrong.
7 This is not implausible as the assembler is a complex piece of control code which may
profit from advanced design technology.

itself by means of the existing assembler which is available in an executable form
already. With that in mind first of all the correctness of this new assembler can
be formalized as follows:
(1) It can be assembled successfully by means of the given assembler, i.e.,
x1:a2e:e •• u:a2e:a ∈ Ec , which permits one to write x2:a2e:e for
x1:a2e:e •• u:a2e:a.
(2) x2:a2e:e is a control code that transforms each A code into an executable.
(3) For all control code designs y:a the two executable assemblers x1:a2e:e and
x2:a2e:e produce equivalent executables. That is,
|x1:a2e:e •• y:a|•• = |x2:a2e:e •• y:a|•• .
Assembling an Assembler Once More. Now the following well-known ques-
tion can be raised: is it useful to use x2:a2e:e to assemble the code u:a2e:a once
more? Let x3:a2e:e = x2:a2e:e • •u:a2e:a. This must be an executable because
x2:a2e:e is an assembler for A. Due to the assumption that x2:a2e:e is a correct
assembler this second new assembler must also be a correct one because it deter-
mines a semantics preserving code transformation, i.e. x3:a2e:e =behavior x2:a2e:e,
or equivalently: |x3:a2e:e|•• = |x2:a2e:e|•• .
Why may it be an advantage to make this second step? The second step takes
the new executable assembler (obtained after assembling with the old one) to
assemble the new non-executable. A useful criterion here is code compactness,
measuring the length of control codes. Now suppose that the new assembler
produces much shorter code than the old one, then the second step is useful to
obtain a more code compact executable for the new assembler. The executable
version of the new assembler x2:a2e:e has already the virtue of producing short
codes but its own code may still be ‘long’ because it was produced by the old
assembler.
The Assembler Fixed Point. The obvious next question is whether it is useful
to repeat this step once more, i.e. to use x3:a2e:e once more to assemble the new
assembler. It is now easy to prove that this will bring no advantage, because it
will produce exactly the code of x3:a2e:e. This is the so-called compiler fixed
point phenomenon (but phrased in terms of an assembler). In more detail, let
x4:a2e:e = x3:a2e:e • •u:a2e:a. Because |x3:a2e:e|•• = |x2:a2e:e|•• , which has
been concluded above, x3:a2e:e • •u:a2e:a = x2:a2e:e • •u:a2e:a = x3:a2e:e.
This leads to the following conclusion: if a new assembler is brought in for a
notation for which an assembler in an executable form is available and if the new
code is written in the assembly notation itself then it is necessary to translate
the new assembler with the old executable assembler first. Thereafter it may be
an advantage to do this once more with the executable assembler just obtained.
Repeating this step a second time is never needed. This is a non-trivial folk-lore
insight in systems administration/software configuration theory. It exists exactly
at the level of abstraction of machine functions.
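The fixed point argument can be replayed in a few lines of Python. In this sketch (all names and the miniature 'assembly notation' are assumptions, not the paper's constructions) executables are Python source texts defining a function f, the old assembler pads its output with a banner, and the new assembler is simply the identity on its toy notation:

def run(executable_text, argument):
    env = {}
    exec(executable_text, env)      # load the executable
    return env["f"](argument)       # apply it to one input

# The assembly notation A is, for this toy, itself Python source defining f.
# The old executable assembler turns A-code into a *verbose* executable
# (it prepends a long banner); behaviour is unchanged.
def x1_assemble(a_code):
    banner = "# " + "=" * 60 + "\n# produced by the old assembler\n"
    return banner + a_code

# u:a2e:a -- a new assembler, written in the assembly notation A itself.
# It produces *compact* executables (no banner).
u_code = "def f(a_code):\n    return a_code\n"

x2 = x1_assemble(u_code)        # assemble the new assembler with the old one
x3 = run(x2, u_code)            # re-assemble it with the freshly obtained executable
x4 = run(x3, u_code)            # one step further brings nothing new:
assert x3 == x4                 # the assembler fixed point
assert x2 != x3                 # but the second step did shorten the executable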
The Assembler Application Notation. The machine function of an assem-
bly code merits its own notation: x:a ••a y:d = (x1:a2e:e •• x:a) •• y:d.

Using this notation the versions of the executable code that occur in the
description of the fixed point are: x3 : a2e : e = u : a2e : a • •a u : a2e : a, and
x4:a2e:e = (u:a2e:a • •a u:a2e:a) • •u:a2e:a (= x3:a2e:e • •u:a2e:a). The fixed point
fact itself reads: (u:a2e:a • •a u:a2e:a) • •u:a2e:a = u:a2e:a • •a u:a2e:a.
Correct Assembly Code. An assembly code is correct if its image under the
assembler is an executable, thus z ∈ Ac if x1 : a2e : e • •z ∈ Ec .8 The correct
codes depend on the available assembler codes. Improved assemblers may accept
(turn into an executable) more codes, however, thereby increasing the extension
of ‘correct assembly codes’.

7 More Dedicated Codes


It is reasonable to postulate the availability of an executable code which allows
one to test the certified executability of codes. Let a1:t4e:e be the application
code testing for executability (functionality t4e) in executable form. Then
a1:t4e:e • •u:d produces an empty file on input u:d (data file u which may or may
not admit the stronger typing u:e) if u:d is a certified executable (i.e. ∈ Ec ), and
a non-empty result e.g. containing a listing of warnings or mistakes otherwise.

7.1 Source Level Code


A code notation for source level code S may be modeled as a collection of
bit sequences. Its codes are denoted e.g. with x: s. A special notation for the
application of a source level code is useful. The action of source level code on a
sequence of inputs y is given by x:s • •s y. Given a compiler u:s2a:a for S this
can be defined by: x:s • •s y = (u:s2a:a • •a x:s) • •a y.
It is assumed that u:s2a:a • •a x:s never produces D and that the first output
is a correct assembly code provided the second result (u:s2a:a ••2a x:s) is the
empty sequence. The correct S codes9 are denoted with Sc .
Compiling a ‘New’ Compiler. Given a compiler u1:s2a:a for S a new compiler
u:s2a:s for S written in S may be provided. Then to make this new compiler
effective it should itself be compiled first and subsequently be used to compile
itself. Then the compiler fixed point is found entirely similar to the case of the
assembler fixed point mentioned before. The new compiler may be compiled into
assembly: u2:s2a:a = u1:s2a:a ••a u:s2a:s. It is then required that for all y:s ∈ Sc ,
|u1:s2a:a • •a y:s|••a = |u2:s2a:a • •a y:s|••a .
The Compiler Fixed Point. The next phases are given using source level
application notation: x:s ••s y:d = (u1:s2a:a ••a x:s) ••a y:d. Using this notation
the versions of the assembly code that occur in the description of the fixed point
are: u3:s2a:a = u:s2a:s ••s u:s2a:s, and u4:s2a:a = (u:s2a:s ••s u:s2a:s) ••a u:s2a:s
(= u3:s2a:a ••a u:s2a:s). The fixed point fact itself reads:
(u:s2a:s ••s u:s2a:s) ••a u:s2a:s = u:s2a:s ••s u:s2a:s.

8 Ac may be called the pseudo-empirical syntax of A.
9 Compiler based pseudo-empirical syntax of S.

7.2 Interpreted Intermediate Code


Taking three or more inputs and two outputs it becomes possible to view the first
input as the control code for an interpreter, the second input as a control code to
be interpreted and the third input (and further inputs) as an ordinary argument,
while the first output serves as the standard output and the second output serves
as error message listing when needed. In actual computer technology compilation
and interpretation are not alternatives for the same codes. A plausible setting
uses an interpreted intermediate code notation I. Correct intermediate codes
must be defined via some form of external criterion. Thus Ic may be the collection
of certified intermediate codes. The functionality of interpreters for I is denoted
int4i; an interpreter for I is an executable u:int4i:e. The application of I code
is then defined as x:i ••int4i y = u:int4i:e •• x:i ⌢ y.

Compiling to Interpreted Code. A compiler from source code notation S
to I is an executable u:s2i:e. Having available a compiler to I an alternative
definition of the correct codes for S is given by: x ∈ Sc:int if and only if u:s2i:
e • •x ∈ Ic .10
In the presence of a compiler for S to I as well as to A (which has a definitional
status for S by assumption) the following correctness criterion emerges: (i) for
all x:s ∈ Sc ∩ Sc:int , and for all data files y: x:s • •s y = (u:s2i:e • •x:s) • •int4i y
and (ii) Sc ⊆ Sc:int .
It is also possible that S is primarily defined in terms of its projection to I.
Then the second condition becomes Sc ⊇ Sc:int , thus expressing that the com-
piler which now lacks a definitional status can support all required processing.

7.3 Managed Execution of Compiled Intermediate Code


The intermediate code is managed (executed in a managed way) if it is both
compiled and interpreted. Thus an interpreter, called a run-time (system),
vm:rt4ae:e, acts as an interpreter for a compiled version of the intermediate
code. rt4ae is the role of a run-time for an almost executable code. This code is
also called a virtual machine, which has been reflected in its mnemonic name. The
intermediate code is compiled by c:i2ae:e to a stage between the assembly level
and the executable level, denoted AE. This leads to the definition x:i ••i:m y =
vm:rt4ae:e •• (c:i2ae:e •• x:i) ⌢ y.
Managed code execution allows the virtual machine to perform activities like
garbage collection in a pragmatic way, not too often and not too late, depending
on statistics regarding the actual machine. Other activities may involve various
forms of load balancing in a multi-processor system. Also various security checks
may be incorporated in managed execution.

JIT Compiled Intermediate Code. At this stage it is possible to indicate
in informal terms what is actually supposed to happen in a modern computer
system. Source codes are compiled to an intermediate code format. That code

10 Interpreter based pseudo-empirical syntax for S.

format is assumed to be widely known thus permitting execution on a large
range of machines. Its execution involves managed execution after compilation.
However, compilation is performed in a modular fashion and the run-time
system calls the compiler which then compiles only parts of the intermediate code
when needed, throwing the resulting codes away when not immediately needed
anymore, and recompiling again when needed again. This is called ‘just in time
compilation’ (or JITting for short). The use of JIT compiler based managed
execution has the following advantages:
(i) The accumulation of a machine specific (or rather a machine architecture
specific) code base is avoided, in favor of a code base consisting of intermediate
code which can be processed on many different machine architectures. Espe-
cially if the code is very big and during an execution only a small fraction of it
is actually executed integral compilation before execution becomes unfeasible.
Therefore mere code size implies that a decision to design an intermediate code
for compilation forces one to opt for JIT compilation.
(ii) The advantages of managed execution are as mentioned above. Only
managed execution can provide a high quality garbage collection service, because
a naive approach where all garbage is removed immediately after its production
(i.e. becoming garbage) is impractical and inefficient.
(iii) The advantages of (JIT) compilation in comparison to intermediate code
interpretation are classical ones. Compiled code execution may be faster than
code interpretation, and, perhaps more importantly, because the intermediate
code will be (JIT) compiled, it may be more abstract and therefore shorter, more
comprehensible and more amenable to maintenance than it would be were it to
be interpreted.

8 Machine Functions for JIT Compilation


The formalization of JIT compilation requires code to consist of parts that can
be independently compiled. The syntax of multi-part code below will admit
this aspect of formalization. For the various parts of multi-part code to work
together intermediate results of a computation need to be maintained. That
can be modeled by using machine functions that produce a state from a state
rather than a code from one or more codes. State machine functions (also termed
state transition functions) will be introduced to enable the formalization of the
sequential operation of different program parts. Finally program parts may be
executed several times during a computation in such a way that different entry
points in the code are used. Natural numbers will be used as code entry points.

8.1 Multi-part Code


Code can be extended to code families in various ways. Here we will consider so-
called flat multi-part code. A flat multi-part code is a sequence of codes separated
by colons such as for instance 10011:1110:0:0001. The collection of these codes
is denoted with MPC (multi-part codes). It will now be assumed that the input
of a machine function is a code family. The case dealt with up to now is the

special case where multi-part codes have just one part (single part codes: SPC).
Access to a part can be given via its rank number in the flat multipart code
listing, starting with 1 for the first part. MPCN (multi-part code notation)
admits bit sequences and the associative operator ‘:’. Brackets are admitted
in MPCN but do not constitute elements of the code families, for instance:
(10:011):00 = 10:(011:00) = 10:011:00. When multi-part codes are used it is
practical to modify the type of machine functions in order to admit multi-part
inputs.
The n’th part of a multi-part code x is denoted partn (x). This notation pro-
duces empty codes if n is too large.
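A minimal sketch (the name part_n is an assumption) of part access on flat multi-part code:

def part_n(n, mpc):
    parts = mpc.split(":")                       # flat multi-part code, parts separated by ':'
    return parts[n - 1] if 1 <= n <= len(parts) else ""   # empty code when n is out of range

assert part_n(2, "10011:1110:0:0001") == "1110"
assert part_n(9, "10011:1110:0:0001") == ""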
Non-separate Assembly and Compilation. An assembler or a compiler for
multipart code that produces a single executable will be called a non-separate
assembler or a non-separate compiler. To use a positive jargon a non-separate
assembler or compiler may be called a gluing assembler or compiler. It is assumed
that a gluing assembler/compiler detects itself which parts of a code must be
used, for instance by looking at a code that contains some marker and using some
form of cross referencing between the parts. Let MPA be a multipart assembly
code notation. c:mpa2e:e is an executable which transforms multi-part control
codes to single-part executables. A gluing compiler can be used to define the
meaning of multi-part control codes as follows:
x:mpa • •mpa y = (c:mpa2e:e • •x:mpa) • •y.
Similarly for a notation s, mps may be its multi-part extension. Then one
may define its meaning given a non-separate compiler by
x:mps • •mps y = (c:mps2a:e • •x:mps) • •a y.
Separate Compilation. Separate compilation holds for code families of some
code notation if a compiler ψ distributes over the code part separator ‘:’, i.e. ψ(x:y) =
ψ(x):ψ(y). A compiler with this property is called separation modular. So it
may now be assumed that mS (modular S) consists of sequences of S separated
by ‘:’ and that u:ms2e:e is a separation modular compiler for mS i.e., using
MPCN, u:ms2e:e • •y:z = (u:ms2e:e • •y):(u:ms2e:e • •z) for all y and z which
may themselves be code families recursively. For the second output component
there is a similar equation: u:ms2e:e • •2 y:z = (u:ms2e:e • •2 y):(u:ms2e:e • •2 z);
corresponding equations are assumed for the other output components.
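The distribution property is easy to state operationally; the sketch below (ad hoc names, toy per-part compiler) realizes a separation modular compiler and checks the equation on an example:

def compile_single(part):
    # toy per-part 'compiler': here it just reverses the bits of one part
    return part[::-1]

def compile_mpc(mpc):
    # separation modular: psi(x:y) = psi(x):psi(y)
    return ":".join(compile_single(p) for p in mpc.split(":"))

x, y = "10011:1110", "0:0001"
assert compile_mpc(x + ":" + y) == compile_mpc(x) + ":" + compile_mpc(y)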

8.2 State Machine Functions


Machine functions, as used above, transform sequences of (possibly multi-part)
input files into output files. This is adequate as long as intermediate stages of
a computation need not be taken into account. Modeling separate compilation
typically calls for taking into account intermediate stages of a computation. In
order to do so we will use state machine functions rather than machine functions.
State machine functions provide a new state given a control code and a state.
At the same time it becomes necessary to deal with code entry points. A code
entry point is a natural number used to indicate where in (the bit sequence

of) a code execution is supposed to start. Code entry points will arise if code
has been produced via imperative programming using a program notation with
subroutine calls and subsequent compilation and/or assembly. In addition a state
machine function must produce information concerning which part of the multi-
part code is to be executed next. It is now assumed that ST denotes the collection
of states of a machine model. A multi-part code state machine function has
the following type:
(SPC × CEP × ST) → ((ST ∪ {D, M}) × CPN × CEP)
Here CPN is the collection of code part numbers and CEP is the collection
of code entry points. The CEP argument indicates where in the control code
execution is supposed to start. These two numbers at the output side indicate
the code part that must be executed in the next phase of the execution and the
entry point within that code part which will be used to start the computation
for the next phase. If the code part number is outside the range of parts of the
MPC under execution (in particular if it equals 0), the computation terminates,
and if the entry point is outside the range of entry points of a part, termination will
also result.
Relating Machine Functions with State Machine Functions. The con-
nection with machine functions can be given if input files can be explicitly incor-
porated in the state (using a function ‘in’) and output files can be extracted
(by means of ‘out’). State machine functions will be denoted with a single bul-
let. A CEP argument will be represented as a superscript; a CPN argument may
be given as another superscript (preceding the CEP superscript), and other
information may be given in subscripts. For single part code x one may write:
x:e • •y = out(state(x:e •1 in(y, s0 ))).
Here out produces a sequence of outputs different from M , unless the com-
putation diverges. This equation provides a definition that works under the as-
sumptions that the computation starts in state s0 and that at the end of the
single part computation the next part result equals 0: i.e. for some s′ and p′,
x:e •1 in(y, s0) = (s′, 0, p′). Moreover it must be noticed that in the notation
used the separator in the code argument separates name parts for a single part
code rather than code parts of a multi-part code.
State Machine Functions for Multi-part Control Code. Assuming that
multi-part code execution starts by default with the first code part at its first
entry point the following pair of equations (start equation and progress equation)
defines the machine function for multi-part executable code (tagged with mpe).
Start equation:
x:mpe • •y = out(state(x:mpe •1,1 (in(y, s0 ))))
expressing that execution starts with the first code part at entry point 1, and
with the following progress equation:
partp (x:mpe) •q s = (p′, q′, s′) → x:mpe •p,q s = (s′ ◁ p′ = 0 ▷ (x:mpe •p′,q′ s′))
(with a ◁ c ▷ b denoting a if the condition c holds and b otherwise)

where the progress equation is valid under the assumption that the only returned
code part number outside the code part range of x:mpe equals 0.

8.3 Unmanaged JIT


The JIT equation for (almost) unmanaged11 multi-part intermediate code exe-
cution connects the various ingredients to a semantic definition of JIT execution
for a multi-part intermediate code x:i. The start equation reads

x:mpi ••jit y = out(state(x:mpi •1,1 jit (in(y, s0 ))))

with the progress equation:

(c:i2e:e •• partp (x:mpi)) •q s = (p′, q′, s′) →
x:mpi •p,q jit s = (s′ ◁ p′ = 0 ▷ (x:mpi •p′,q′ jit s′))

For the progress equation it is assumed that a compiler/assembler c:i2e:e for
the intermediate code is available that translates it into executable code.

Limited Buffering of Compiled Parts. A buffer with the most recent compi-
lations (i.e. files of the form c:i2e:e • •partp (x:i)) of code parts can be maintained
during the execution of the multi-part intermediate code. As this buffer has a
bounded size, some code fragments will probably have to be deleted during a
run12 and may subsequently need to be recompiled. The idea is that a compila-
tion is done only when absolutely needed, which justifies the phrase JIT.
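To make the JIT regime concrete, here is a small sketch (every name, the toy 'intermediate code', and the eval-based 'compiler' are assumptions, not the paper's machinery) of executing multi-part code with a bounded buffer of compiled parts:

from collections import OrderedDict

def jit_run(parts, compile_part, state, buffer_size=2):
    cache = OrderedDict()               # bounded buffer of compiled parts
    p, q = 1, 1                         # execution starts at part 1, entry point 1
    while 1 <= p <= len(parts):
        if p not in cache:              # compile only when needed ('just in time')
            if len(cache) >= buffer_size:
                cache.popitem(last=False)   # evict the oldest compiled part
            cache[p] = compile_part(parts[p - 1])
        # a compiled part maps (entry point, state) to (state, next part, next entry);
        # a next part number outside the range of parts terminates the run
        state, p, q = cache[p](q, state)
    return state

def compile_part(src):
    return eval(src)                    # the 'compiler' for this toy is eval

parts = [
    "lambda q, s: (s + 1, 2, 1)",                        # part 1: increment, go to part 2
    "lambda q, s: (s, 1, 1) if s < 5 else (s, 0, 1)",    # part 2: loop until s == 5
]
assert jit_run(parts, compile_part, 0) == 5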

8.4 Managed (and Interpreted) Multi-part Code Execution


Managed code execution involves the interpretation of (JIT) compiled interme-
diate code. The start equation reads

x:mpi ••manjit y = out(state(x:mpi •1,1 manjit (in(y, s0 ))))

The corresponding progress equation reads

vm:rt4ae:e •q in(c:i2ae:e •• partp (x:mpi), s) = (p′, q′, s′) →
x:mpi •p,q manjit s = (s′ ◁ p′ = 0 ▷ (x:mpi •p′,q′ manjit s′))

The virtual machine vm:rt4ae:e computes the intermediate state reached
after execution has exited from code part p after incorporating the code to be
executed in the state. The interpreter takes into account that the code to be
interpreted has to be started at entry point q.

11 The only aspect of execution management is taking care of JIT compiling an inter-
mediate code part when needed and starting its execution subsequently.
12 Code garbage collection.

A Requirement Specification for Managed JITting. Provided a compiler
c:mpi2a:e is known the following identity must hold:

x:mpi ••manjit y = (c:mpi2a:e •• x:mpi) ••a y.

This equation may be used alternatively as a correctness criterion for the
definitions using managed JIT compiled execution. The compiler based definition
will not capture any of the features of execution management but it may be very
useful as a semantic characterization of the execution of multi-part intermediate
code.

9 Verifying Compilers
Recent work of Hoare (e.g. [4]) has triggered a renewed interest in the idea that
program verification may be included in the tasks of a compiler. The notion of
a verifying compiler has been advocated before by Floyd in 1967, [8], but it has
not yet been made practical, and according to Hoare it might well be consid-
ered a unifying objective which may enforce systematic cooperation from many
parts of the field. Getting verifying compilers used in practice at a large scale is
certainly a hard task and Hoare refers to this objective as a ‘grand challenge’ in
computing. Turning quantum computing into practice, resolving P versus NP,
proofchecking major parts of mathematics and the theory of computing in the
tradition of de Bruijn’s type theory, removing worms and viruses from the inter-
net, and (more pragmatically) the .NET strategy for using the same intermediate
program notation for all purposes may be viewed as other grand challenges. In
computational science the adequate simulation of protein unfolding stands out
as a grand challenge; in system engineering the realization of the computing grid
and the design of open source environments of industrial strength for all major
application areas, constitute grand challenges as well.
Program verification has been advocated by Dijkstra (e.g. in EWD 303) be-
cause testing and debugging cannot provide adequate certainty. Indeed, if human
certainty is to be obtained proofs may be unavoidable. For the design of com-
plex systems, however, the need for proofs is harder to establish. The biological
evolution of the human mind has produced a complex system, at least from
the perspective of current computer technology, through a design process using
natural selection (which seems to be a form of testing rather than a form of
verification) as its major methodology for the avoidance of design errors. This
might suggest a way around program verification as well: rather than verify-
ing program X, one may design a far more complex system C using X and
many variations of it which is then tested. A test of X may involve millions
of runs of X and its variations. Usability of C, as demonstrated using tests,
will provide confidence in components like X, as well as the possibility to use
these components in a complex setting. Whether confidence or certainty is what
people expect from computing systems is not an obvious issue, however. As a
consequence this particular grand challenge has a subjective aspect regarding

the value of its objective that the other examples mentioned above seem not to
have.
In this section the phenomenon of a verifying compiler will be discussed at
the level of (state) machine functions and control code algebra.

9.1 Requirement Codes and Satisfaction Predicates


REQ will represent a collection of predicates on behaviors that will be viewed as
descriptions of properties of codes under execution. r ∈ REQ may serve as a
requirement on a behavior. A given predicate satreq (B satreq r, for behavior B
and requirement r ∈ REQ) determines the meaning of requirements. Having
available the satisfaction predicate at the level of behaviors, it may be gradually
lifted to source codes.
If B is the behavior of executable code x on machine m (B = |x|••m ), then
x satm:e r if B satreq r. Further for an assembly code x:a one may write x:a satm:a r
if (c:a2e:e • •m x:a) satm:e r, given an assembler c:a2e:e for A.
For a high-level and machine independent program notation s we define
x:s sats r if (v:s2a:e • •m x:s) satm:a r for a machine m and a compiler v:s2a:e
on the same machine m. In the sequel the machine superscripts will be omitted,
because machine dependence is not taken into account.
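The lifting of satisfaction from behaviors to assembly and source code can be sketched as function composition. In the toy below (all names are ad hoc, and source, assembly and executable codes are conflated into Python lambda texts) a requirement is just an input/output pair:

def sat_req(behaviour, r):
    inp, expected = r                   # toy requirement: an input and the expected output
    return behaviour(inp) == expected

def sat_exe(run, x_exe, r):             # x sat_{m:e} r
    return sat_req(lambda y: run(x_exe, y), r)

def sat_asm(run, assemble, x_asm, r):   # x:a sat_{m:a} r, via the available assembler
    return sat_exe(run, assemble(x_asm), r)

def sat_src(run, assemble, compile_, x_src, r):   # x:s sat_s r, via compiler then assembler
    return sat_asm(run, assemble, compile_(x_src), r)

# toy instantiation: 'running' an executable means evaluating a lambda text
run = lambda exe, y: eval(exe)(y)
identity = lambda code: code
assert sat_src(run, identity, identity, "lambda y: y[::-1]", ("10", "01"))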

9.2 Proofs and Proof Checking


Given r it may be viewed as a design task to find a control code x:s such that
x:s sats r. A proof for this fact is a code p in a collection PROOFS of codes
such that p ⊢ x:s sats r, where ⊢ is a relation such that p ⊢ x:s sats r implies
that x:s sats r. This implication represents the so-called soundness of the proof
method. Validating ⊢ requires a proof checking tool w:ch4p4s:e (check for being
a proof for an s code) such that w:ch4p4s:e • •p, x:s, r always terminates, never
produces an error, and produces the code ”0” exactly if the code p constitutes
a proof of x:s sats r.
Non-automatic Proof Generation. Proving the correctness of control codes
by hand, as a control code production strategy, amounts to the production of a
pair (x, p) such that w:ch4p4s:e••p, x, r = ”0”. Here it is taken for granted that if
there is to be any chance of obtaining a proof for the control code that code may
have to be specifically designed with the objective to deliver the required proof
in mind. It seems to be a folk-lore fact in system verification practice that hand
made proofs can only be given if the code to be verified is somehow prepared
for that purpose, though in theory that need not be the case. A complete proof
method guarantees the existence of a proof for each required fact (concerning
a code that satisfies a requirement). But actually finding a proof which theory
predicts to exist is by no means an easy matter.
Infeasibility of Proof Search. The task to prove code correctness manually
(and have it machine checked thereafter) may be considered an unrealistic bur-
den for the programmer. Therefore control code production methods are needed
which can take this task out of the hands of human designers. The most obvious
strategy is automated proof search.
Having available a proof method and a proof checker, one may design an
automated proof generator as a code u:pg4s:e, such that u:pg4s:e••x:s, r produces
a proof p with p  x:s sats r if such a proof exists with the computation diverging
otherwise (or preferably producing a negative answer, i.e. a code outside REQ).
Even if such a proof generator exists in theory, its realization as an executable code with
useful performance seems to be elusive. Therefore this strategy to avoid the need
for people to design proofs has been considered infeasible and a different strategy
is needed.

9.3 Verifying Compilers


Given the high-level control code notation s a version as of it that admits an-
notations is designed. A code in as is considered an annotated code for s. Us-
ing a tool strip : as2s : e the annotations can be removed and a code in s is
found which determines the meaning of an annotated s code. In other words:
x:as • •as y = (strip:as2s:e • •x:as) • •s y.
A requirement r will be included as a part of the annotation. The require-
ment can be obtained (extracted) from the annotated code by means of the ap-
plication of a tool u:as2r:e. The computation u:as2r:e••x:as either produces a re-
quirement r ∈ REQ or it produces an error (M for meaningless). If a requirement
is produced that implies that as a part of the extraction the computation has suc-
ceeded in generating a proof for the asserted program and checking that proof.
Thus in this case it is guaranteed that strip:as2s:e • •x:as sats (u:as2r:e • •x:as).
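For concreteness, a toy Java rendering of stripping and requirement extraction is given below, treating annotations as comment lines starting with //@ (a JML-like convention chosen here only for illustration); the representation of codes as lists of lines and all names are assumptions of this sketch, and the proof generation and checking performed by a real extraction tool is left out.

import java.util.ArrayList;
import java.util.List;

// Toy model of an annotated source: annotations are lines starting with "//@".
final class AnnotatedCode {
    // Analogue of strip:as2s:e — removing the annotations yields the plain s code.
    static List<String> strip(List<String> annotated) {
        List<String> plain = new ArrayList<>();
        for (String line : annotated)
            if (!line.trim().startsWith("//@")) plain.add(line);
        return plain;
    }

    // Analogue of u:as2r:e — here it only collects the "//@ requires ..." lines;
    // a real verifying compiler would in addition generate and check a proof,
    // returning an error if that fails.
    static List<String> extractRequirement(List<String> annotated) {
        List<String> req = new ArrayList<>();
        for (String line : annotated)
            if (line.trim().startsWith("//@ requires")) req.add(line.trim());
        return req;
    }
}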
The production of adequately annotated control code may be simpler than
the production of proofs as it allows the designer to distribute requirements over
the code thus simplifying the proofs that the extraction tool needs to find by
itself. In addition it is held by some that for each construction of a source code
a ‘natural’ assertion may be found, which the conscious control code author
should have in mind. This allows one to provide a canonical annotation for any
source code, which can be proved without much trouble. Then follows a stage
of logic that helps to derive the ‘required’ requirement from a requirement that
has been automatically synthesized from the canonical annotation. This piece of
logic needs careful attention, because by itself it is as unfeasible as P = NP. The
asserted code needs a good deal of guidance cast in the form of annotations for
this matter.

Verifying Compilers and Multi-part Code. Large amounts of code will be


organized as multi-part codes, admitting JIT compilation strategies. A compu-
tation in this setting uses the JIT compiler to transform part after part in an
(almost) executable form. It may be assumed that verifying compilers are used
to obtain the various code parts in the code base at hand. It is reasonable to
assume that these requirements have been stored in a database annex to the
code base. In that context it is preferable to refer to the stored requirements
as specifications. Indeed, because the JIT compiler using the various parts must
rely on the use of valid code it cannot accept the possibility that a code part
defeats the extraction of a requirement.
Now, complementary to JIT compilation, there is JIT requirements integration,
a process that allows the computation to dynamically synthesize a requirement
from the requirements known to be satisfied by the various parts it uses. The
JIT compiler should combine requirements synthesis and verification. This is a
difficult task that by itself leads to different solution architectures.

Trivial Requirement Synthesis JIT Compilation. A straightforward way


to obtain the requirements integration functionality is to assume that all speci-
fications for parts used during a computation are identical (which is plausible if
these are understood as global system invariants), thereby shifting the engineer-
ing burden to the production of the code base. This solution architecture will be
called the trivial requirements synthesis JIT architecture (TRS-JIT).
For TRS-JIT it becomes helpful to admit storing a collection of requirements
for the various code parts in the annex database. It should be noticed, however,
that the previous discussion of verifying compilers provides no technique for
finding a multitude of validated requirements for the same (compiled) code. In
addition, as the code base contains code parts with different requirements (stored
in the annex data base), the JIT compiler now faces the possibility of running
into a code part that is not equipped with a suitable requirement. It may be the
case that the requirement for the part is logically stronger than needed, but that
is impossible to check dynamically. Thus it must be accepted that executions may
always stop in an error condition indicating that a part had to be JIT compiled
for which the needed requirement was absent. This is a severe drawback because
it is useless for real time control applications. If this drawback is unacceptable
the JIT compiler must be shown always to ask for a code that exists in the code
base and that is equipped with the required specification.
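A schematic Java sketch of the annex-database lookup in the TRS-JIT architecture may be helpful; the class and method names are invented for illustration, and verification is reduced here to a plain equality test between the stored and the required specification.

import java.util.Map;

// TRS-JIT sketch: before JIT compiling a part, look up its stored specification
// in the annex database and compare it with the required (global) specification.
final class TrsJit {
    private final Map<String, String> annexDatabase; // part name -> stored specification

    TrsJit(Map<String, String> annexDatabase) { this.annexDatabase = annexDatabase; }

    byte[] jitCompile(String partName, String requiredSpec, byte[] partCode) {
        String storedSpec = annexDatabase.get(partName);
        if (storedSpec == null || !storedSpec.equals(requiredSpec)) {
            // the feared dynamic error condition: a part without a suitable requirement
            throw new IllegalStateException("no suitable specification for part " + partName);
        }
        return partCode; // stand-in for the actual compilation to (almost) executable form
    }
}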

A Verifying JIT Compiler. Guaranteeing that a computation will not halt at


an ill-specified or even absent code part is the task of the verifying JIT compiler
in the case of the trivial requirements synthesis architecture. This leads to a two-
phase strategy that first checks this latter property using a dataflow analysis on
the initial code, and thereafter checks that all needed parts are equipped with
the required specification. In this stage a limited amount of proof generation
may be used to allow parts that have logically stronger specifications to be used
if the required specifications can be derived by these limited means.

10 Conclusion
Machine functions have been used to formalize several software processing mech-
anisms at a high level of abstraction, by means of the formation of code algebras.
This abstract formalization has been proposed in particular for compilation, as-
semblage, interpretation, and for managed and unmanaged, interpreted and just
in time compiled multi-part code execution, and for verifying compilers. While
the notion of a verifying compiler can be captured to some extent at the ab-
straction level of CCA, this seems not to be the case for the unavoidable concept
of a verifying JIT compiler, however.

References
1. A.W. Appel. Axiomatic bootstrapping, a guide for compiler hackers. ACM TOPLAS,
16(6):1699–1719, 1994.
2. J.A. Bergstra and M.E. Loots. Program algebra for sequential code. Journal of
Logic and Algebraic Programming, 51(2):125–156, 2002.
3. J.A. Bergstra and S.F.M. van Vlijmen. Theoretische Software-Engineering. ZENO-
Institute, Leiden, Utrecht, The Netherlands, 1998. In Dutch.
4. C.A. Hoare. The verifying compiler, a grand challenge for computer research. JACM,
50(1):63–69, 2003.
5. H. Bratman. An alternate form of the UNCOL diagram. CACM, 4(3):142, 1961.
6. J. Earley and H. Sturgis. A formalism for translator interactions. CACM, 13(10):607–
617, 1970.
7. M.I. Halpern. Machine independence: Its technology and economics. CACM,
8(12):782–785, 1965.
8. R.W. Floyd. Assigning meanings to programs. Proc. Amer. Soc. Symp. Appl. Math.,
19:19–31, 1967.
9. A. Turing. On computable numbers, with an application to the Entscheidungsproblem.
Proc. London Math. Soc. Ser. 2, 42,43:230–265,544–564, 1936.
Exploiting Abstraction for Specification Reuse.
The Java/C# Case Study

Egon Börger1 and Robert F. Stärk2

1 Dipartimento di Informatica, Università di Pisa
boerger@di.unipi.it
2 Computer Science Department, ETH Zürich
staerk@inf.ethz.ch

Abstract. From the models provided in [14] and [4] for the semantics of
Java and C# programs we abstract the mathematical structure that
underlies the semantics of both languages. The resulting model reveals the
kernel of object-oriented programming language constructs and can be
used for teaching them without being bound to a particular language. It
also allows us to identify precisely some of the major differences between
Java and C#.

1 Introduction
In this work the models developed in [14] and in [4] for a rigorous definition
of Java and C# and their implementations on the Java Virtual Machine (JVM)
resp. in the Common Language Runtime (CLR) of .NET are analyzed to ex-
tract their underlying common mathematical structure. The result is a platform-
independent interpreter of high-level programming language constructs which
can be instantiated to concrete interpreters of specific languages like Java, C#,
C++. It is structured into components for imperative, static, object-oriented,
exception handling, concurrency, pointer related, and other special language fea-
tures (like delegates in C#) and thus can be used in teaching to introduce step
by step the basic concepts of modern programming languages and to explain the
differences in their major current implementations.
The task is supported by the fact that the models in [14, 4] have been defined
in terms of stepwise refined Abstract State Machines (ASMs), which

– separate the static and the dynamic parts of the semantics,
– capture the dynamics by ASM rules, one rule set for each cluster of language
constructs1, describing their run-time effect on the abstract program state,
guided by a walk through the underlying attributed abstract syntax tree.

1 A related modular approach, to define the semantics of a language by a collection
of individual language construct descriptions, now named “incremental semantics”,
appears in [12, 6].


The stepwise refined definitions unfold in particular the following layered
modules of orthogonal language features, which are also related to the historical
development of programming concepts from say FORTRAN, via PASCAL and
MODULA, SMALLTALK and EIFFEL, to JAVA and C#:
– imperative constructs, related to sequential control by while programs, built
from statements and expressions over simple types,
– classes with so-called static class features, namely procedural abstraction
with class initialization and global (module) variables,
– object-orientation with class instances, instance creation, instance methods,
inheritance,
– exception handling,
– concurrency (threads),
– special features like delegates, events, etc.,
– so-called unsafe features like pointers with pointer arithmetic.
This leads us to consider a sequence of sublanguages LI ⊂ LC ⊂ LO ⊂ LE ⊂
LT ⊂ LD ⊂ LU of a general language L, which can be instantiated to the corre-
sponding sublanguages of Java and C# defined in [14, 4]. The interpreter ExecLS
of each language LS in the sequence conservatively (purely incrementally) ex-
tends its predecessor. We show how it can be instantiated to an interpreter of
JavaS or C#S by variations of well-identified state or machine components. The
interpreter ExecL of the entire language L is the parallel composition of those
submachines:

ExecL ≡
ExecLI
ExecLC
ExecLO
ExecLE
ExecLT
ExecLD
ExecLU

Delegates and unsafe code are peculiar features of C# and are not included at
all in Java; therefore we refer for the two corresponding submachines to [4].
Since the thread models of Java and C# have been analyzed and compared ex-
tensively in [14, 1, 13], we refrain here from reformulating the interpreter ExecLT . The
static semantics of most programming languages can be captured appropriately
by mainly declarative descriptions of the relevant syntactical and compile-time
checked language features, e.g., typing rules, control-flow analysis, name reso-
lution, etc.; as a consequence we concentrate our attention here on the more
involved language dynamics for whose description the run-time oriented ASM
framework turns out to be helpful. So we deal with the static language features
in the form of conditions on the attributed abstract syntax tree, resulting from
parsing and elaboration and taken as starting point of the language interpreter
ExecL.

This paper does not start from scratch. We tailor the exposition for a reader
who has some basic knowledge of (object-oriented) programming. A detailed
definition of ASMs and their semantics, as originally published in [11], is skipped
here, because ASMs can be correctly understood as pseudo-code operating over
abstract (domains of) data. A textbook-style definition is available in Chapter 2
of the AsmBook [5].

2 The Imperative Core LI


In this section we define the sequential imperative core LI of our general lan-
guage L together with a model for its semantics. The model takes the form of
an interpreter ExecLI , which defines the basic machinery for the execution of
the single language constructs. LI provides structured while-programs consist-
ing of statements to be executed (appearing in method bodies), which are built
from expressions to be evaluated, which in turn are constructed using predefined
operators over simple types. The computations of our interpreter are supposed
to start with an arbitrary but fixed L-program. As explained above, syntax and
compile-time matters are separated from run-time issues by assuming that the
program is given as an attributed abstract syntax tree, resulting from parsing
and elaboration.

2.1 Static Semantics of LI


Expressions and statements of the sublanguage LI are defined as usual by a
grammar, say the one given in Fig. 1. We view this figure as defining also the
corresponding ASM domains, e.g., the set Exp of expressions built from Literals
and variable expressions using the provided operators (unary, binary, condi-
tional) and including besides some possibly language-specific expressions the set
Sexp of statement expressions, i.e., expressions that can be used on the top-level
like an assignment to a variable expression using ‘=’ (or an assignment operator
from a set Aop or one of the typical prefix/postfix operators ‘++’ or ‘--’). In
this model the set Vexp of variable expressions (lvalues) consists of the local
variables only and will be refined below. The prefix operators ‘++’ and ‘--’ are
syntactically reduced to assignment operators, e.g., ++v is reduced to v += 1.
The auxiliary sets, like Uop of unary operators, which one may think of as in-
cluding also operators to construct type cast expressions of form ‘(’ Type ‘)’ Exp,
vary from language to language. SpecificExp(L) may include expressions that are
specific for the language L, like ‘checked’ ‘(’ Exp ‘)’ and ‘unchecked’ ‘(’ Exp ‘)’
in the model C#I in [4, Fig. 1]. In the model JavaI in [14, Fig. 3.1] the set
SpecificExp(L) is empty. Similarly, the set JumpStm of jump statements may
vary from language to language; in JavaI it consists of ‘break’ Lab ‘;’ and
‘continue’ Lab ‘;’, in C#I of ‘break’ ‘;’ | ‘continue’ ‘;’ | ‘goto’ Lab ‘;’.
SpecificStm(L) may contain statements that are specific to the language L, e.g.,
‘checked’ Block | ‘unchecked’ Block for the language C#I . In JavaI it is empty.
Bstm may also contain block statements for the declaration of constant expres-
sions whose value is known at compile time, like ‘const’ Type Loc ‘=’ Cexp ‘;’
in C#I .

Exp ::= Lit | Vexp | Uop Exp | Exp Bop Exp | Exp ‘?’ Exp ‘:’ Exp
| Sexp | SpecificExp(L)
Vexp ::= Loc
Sexp ::= Vexp ‘=’ Exp | Vexp Aop Exp | Vexp ‘++’ | Vexp ‘--’
Stm ::= ‘;’ | Sexp ‘;’ | Lab ‘:’ Stm | JumpStm
| ‘if’ ‘(’ Exp ‘)’ Stm ‘else’ Stm | ‘while’ ‘(’ Exp ‘)’ Stm
| SpecificStm(L) | Block
Block ::= ‘{’ {Bstm} ‘}’
Bstm ::= Type Loc ‘;’ | Stm

Fig. 1. Grammar of expressions and statements in LI
Not to burden the exposition with repetitions of similar arguments, we do
not list here statements like do, for, switch, goto case, goto default, etc.,
which do appear in real-life languages and are treated analogously to the cases
we discuss here. When referring to the set of sequences of elements from a set
Item we write Items. We usually write lower case letters e to denote elements
of a set E , e.g., lit for elements of Lit. For expository purposes, in Fig. 1 we
also neglect that in C# labeled statements are only allowed as block statements,
whereas in Java, every statement (also embedded statements) can have a label.
Different languages usually exhibit not only differences in syntax, but above
all different notions of types with their conversion and promotion rules (sub-
type or compatibility definition), different type constraints on the operand and
result values for the predefined operators, different syntactical constraints for
expressions and statements like scoping rules, definite assignment and reachabil-
ity rules, etc. As a consequence, the static analysis differs, e.g., to establish the
correctness of the definite assignment conditions or more generally of the type
safety of well-typed programs (for Java see the type safety proof in [14, Ch. 8],
for C# see the proof of definite assignment correctness in [8]). Since this paper is
focused on modeling the dynamic semantics of a language, we omit here any gen-
eral discussion of standard static semantics issues and come back to them only
where needed to explain how the interpreter uses the attributed abstract syntax
tree of a well-typed program. E.g., we will use that each expression node exp
in the attributed syntax tree is annotated with its compile-time type type(exp),
that type casts are inserted in the syntax tree if necessary (reflecting implicit
type conversions at compile-time), etc.

2.2 Dynamic Semantics of LI


The dynamic semantics for LI describes the effect of statement execution and of
expression evaluation upon the program execution state, so that the transition
rule for the LI interpreter (the same for its extensions) has the form

ExecLI ≡
ExecLExpI
ExecLStmI

The first subrule defines one execution step in the evaluation of expressions;
the second subrule defines one step in the execution of statements.

Syntax Tree Walk. To facilitate further model refinements by purely incre-


mental extensions, the definition proceeds by walking through the abstract syn-
tax tree, starting at pos = root-position, to compute at each node the effect of
the program construct attached to the node. We formalize the walk by a cursor
, whose position in the tree – represented by a dynamic function pos: Pos – is
updated using static tree functions, leading from a node in the tree down to its
first child, from there to the next brother or up to the parent node (if any), as
illustrated by the following self-explanatory example. Pos is the set of positions
in the abstract syntax tree. A function label : Pos → Label decorates nodes with
the information which identifies the grammar rule associated with the node. For
the sake of notational succinctness and in adherence to widespread program-
ming notations, we use some concrete syntax from Java or C# to describe the
labels, thus hiding the explicit introduction of auxiliary non-terminals2 . In the
example the label of the root node is the auxiliary non-terminal If , identifying
the grammar rule which produces the construct if (exp) stm1 else stm2 —the
‘occurrence’ of which here constitutes what we are really interested in when con-
sidering the tree. As explained below, this construct determines what we will
call the context of the root node or of its children nodes.

[Tree diagram: the root node, labeled If, encodes the construct if (exp) stm1 else stm2 and has the three children exp, stm1, stm2; first leads from the root to exp, next from exp to stm1 and from stm1 to stm2, and up leads from each child back to the root.]
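One possible concrete representation of positions, labels and the tree functions first, next, up, together with the cursor pos, is sketched below in Java; the class and field names are our own illustrative choice and are not prescribed by the model.

// Sketch of an attributed syntax tree node and the cursor used for the walk.
final class Node {
    final String label;   // identifies the grammar rule attached to the node
    Node first;           // first child (or null)
    Node next;            // next brother (or null)
    Node up;              // parent (or null for the root)
    Object value;         // intermediate result recorded by values: Pos -> Result

    Node(String label) { this.label = label; }
}

final class Walker {
    Node pos; // the cursor: current position in the tree

    Walker(Node root) { this.pos = root; }

    void down()  { if (pos.first != null) pos = pos.first; } // to the first child
    void right() { if (pos.next  != null) pos = pos.next;  } // to the next brother
    void upTo()  { if (pos.up    != null) pos = pos.up;    } // back to the parent
}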

Local Variable Values. The next question is what are the values computed
for expressions and how they are stored as current values of local variables,
namely upon executing an assignment statement or as side effect of an expres-
sion evaluation. The answer to the second question depends upon whether such
values are stored directly, as for example in Java, or indirectly via an addressing
mechanism, as for example in C#. To capture both possibilities we introduce two
domains, namely of values and of addresses, and use the following two dynamic
functions

locals: Loc → Adr , mem: Adr → SimpleValue ∪ {Undef }

2 In [12] an abstract syntax notation is proposed which avoids Java- or C#-like notation.

which can be used to assign memory addresses to local variables and to store the
values there. To simplify the formulation of how to instantiate our interpreter
for Java or C and to prepare the way for later refinements, we use a macro
WriteMem(adr , t, val ) to denote writing a value of given type t to a given
address. For the sublanguage LI (as for Java) the macro is only an abbreviation
for mem(adr ) := val , which will be refined in the model for LO .
One possible instance of this scheme, namely for Java, is to identify Loc
and Adr so that locals becomes mem. It goes without saying and will not be
mentioned any more in the sequel that a similar simplification applies to all other
functions, predicates, and macros introduced below in connection with handling
the values stored at addresses.
Since the values we consider in LI are of simple types, in this model the
equation
Value = SimpleValue ∪ Adr
holds, which will be refined for LO to include references (and structs, which
appear in C#). The fact that local variables have to be uniquely identified can
be modeled by stipulating Loc = Identifier × Pos. For the initialization of the
interpreter it is natural to require that an address has been assigned to each
local variable, but that the value stored there is still undefined.
locals(x ) ∈ Adr for every variable x
mem(i ) = Undef for every i ∈ Adr
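The indirect storage model can be mimicked in Java roughly as follows, with addresses modelled as integers, values as objects and Undef as null; all names are illustrative. For the Java instantiation, where Loc and Adr are identified, locals and mem collapse into a single map.

import java.util.HashMap;
import java.util.Map;

// Sketch of the two-level storage: locals: Loc -> Adr, mem: Adr -> Value.
final class Store {
    private final Map<String, Integer> locals = new HashMap<>(); // local variable -> address
    private final Map<Integer, Object> mem    = new HashMap<>(); // address -> value (null = Undef)
    private int nextAdr = 0;

    int declare(String loc) {            // assign a fresh address; the value is still Undef
        int adr = nextAdr++;
        locals.put(loc, adr);
        return adr;
    }
    void writeMem(int adr, Object val) { mem.put(adr, val); }      // plays the role of WriteMem(adr, t, val)
    Object memValue(int adr)           { return mem.get(adr); }    // plays the role of memValue(adr, t)
    int address(String loc)            { return locals.get(loc); } // plays the role of locals(loc)
}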

Recording Intermediate Values. During the walk through the tree, also in-
termediate results of the elaboration of syntactical constructs appear, which have
to be recorded somewhere, namely values of evaluated (sub-) expressions, but
also possible results of the execution of statements. Statements may terminate
normally, but also abruptly due to jumps (in LI ) or returns from method calls
(in LC ) or to the occurrence of exceptions (in LE ). There are many ways to keep
track of such temporary items, e.g., using a stack (as do many virtual machines,
see for example the Java Virtual Machine operand stack opd in [14, pg. 140]),
or replacing directly the elaborated syntactical constructs by their intermedi-
ate result (as do SOS-based formalisms, see for example the restbody concept
in [14, pg. 38]), or via some dynamic functions defined on the static syntax tree.
We choose here to use a partial function to record with each node the values
computed for the syntactic construct labeled by that node.
values: Pos → Result
For LI , the range Result of this function contains a) Undef , to signal that
no value is defined yet, b) simple values, resulting from expression evaluation,
c) Norm, for normal termination of statement execution, and d) reasons for
abruption of statement execution. The set Abr of abruptions derives here from
the jump statements (see below) and will be refined in successive models to also
contain statement returns and exceptions.
Result = Value ∪ Abr ∪ {Undef , Norm}

As intermediate values at a position p the cursor is at or is passing to, the


computation may yield directly a simple value; at AddressPositions as defined
below it may yield an address; but it may also yield a memValue which has to be
retrieved indirectly via the given address (where for LI the memory value of a
given type t at a given address adr is defined by memValue(adr , t) = mem(adr );
the parameter t will become relevant only in the refinements of memValue in
LO and LU ). This is described by the following two macros:

Yield(val , p) ≡
values(p) := val
pos := p

YieldIndirect(adr , p) ≡
if AddressPos(p) then Yield(adr , p)
else Yield(memValue(adr , type(p)), p)

We will use the macros in the two forms Yield(val ) ≡ Yield(val , pos) and
YieldUp(val ) ≡ Yield(val , up(pos)), similarly for YieldIndirect(adr ) and
YieldUpIndirect(adr ).
A context where an address and not a value is required characterizes the con-
text of first children of parent nodes labeled with an assignment or prefix/postfix
operator. It can thus be defined as follows:

AddressPos(α) ⇐⇒ FirstChild(α) ∧ label (up(α)) ∈ {++, --} ∪ Aop


FirstChild(α) ⇐⇒ first(up(α)) = α

Notational Conventions. To reduce any notational overhead not needed by


the human reader, in spelling out the ASM rules below we identify positions
with the occurrences of the syntactical constructs nodes encode via their labels
and those of their children. This explains updates like pos := exp or pos := stm,
which are used as shorthand for updating pos to the node labeled with the
corresponding occurrence of exp respectively stm.3
For a succinct formulation of the interpreter rules we use a macro context(pos)
to describe the context of the expression or statement currently to be handled in
the syntax tree. context(pos) has to be matched against the cases appearing in
the ASM rules below, choosing for the next computation step the first possible
match following the textual order of the rules. If the elaboration of the subtree at
the position pos has not yet started, then context(pos) is the construct encoded
by the labels of pos and of its children. Otherwise, if pos carries already its
result in values, context(pos) is the pseudo-construct encoded by the labels of
the parent node of pos and of its children after replacing the already evaluated
constructs by their values in the corresponding node. This explains notations like
uop  val to describe the context of pos, where pos is marked with the cursor

3 An identification of this kind, which is common in mathematics, has clearly to be
resolved in an executable version of the model.

(), resulting from the successful evaluation of the argument exp of the construct
uop exp (encoded by up(pos) and its child pos), just before uop is applied to val
to YieldUp(Apply(uop, val )).

Expression Evaluation Rules. We are now ready to define ExecLExpI , the


machine for expression evaluation. We do this in a compositional way, namely
proceeding expression-wise: for each group of structurally similar expressions,
defined by an appropriate parameterization described in Fig. 1,4 there is a set
of rules covering each intermediate phase of their evaluation. SpecificExpressions
of L are separately discussed below. The machine passes control from uneval-
uated expressions to the appropriate subexpressions until an atom (a literal or
a local variable) is reached. It can continue its computation only as long as
no operator exception occurs (see below for the definition of UopException and
BopException). When an operator has to be applied, we use a static function
Apply to determine the value the operator provides for the given arguments.
This function can be separately described, as is usually done in the language
manual. Similarly for the static function defining the ValueOfLiteral s.
ExecLExpI ≡ match context(pos)
lit → Yield(ValueOfLiteral (lit))
loc → YieldIndirect(locals(loc))
uop exp → pos := exp
uop  val → if ¬UopException(uop, val ) then
YieldUp(Apply(uop, val ))
exp1 bop exp2 → pos := exp1

val bop exp → pos := exp
val1 bop  val2 → if ¬BopException(bop, val1 , val2 ) then
YieldUp(Apply(bop, val1 , val2 ))
exp0 ? exp1 : exp2 → pos := exp0

val ? exp1 : exp2 → if val then pos := exp1 else pos := exp2
True ?  val : exp → YieldUp(val )
False ? exp :  val → YieldUp(val )
loc = exp → pos := exp
loc =  val → {WriteMem(locals(loc), type(loc), val ), YieldUp(val )}
vexp op= exp → pos := vexp

adr op= exp → pos := exp

adr op= val → let t = type(up(pos)) and v = memValue(adr , t) in
if ¬BopException(op, v , val ) then
let w = Apply(op, v , val ) in
let result = Convert(t, w ) in
WriteMem(adr , t, result)
YieldUp(result)
4 The desired specializations can be obtained expression-wise by mere parameter ex-
pansion, a form of refinement that is easily proved to be correct.

vexp op → pos := vexp // for postfix operators op ∈ {++, --}



adr op → let old = memValue(adr , type(pos)) in
if ¬UopException(op, old ) then
WriteMem(adr , type(up(pos)), Apply(op, old ))
YieldUp(old )
SpecificExpI

Note that in an assignment operator op= the result of the operation has to
be converted back to the type of the variable expression, e.g., if c is a variable
of type char, then c += ’A’ is evaluated as c = (char)(c + ’A’), since the
operands of + are promoted to the type int in Java as well as in C#.
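The implicit conversion noted above can be observed directly in Java (the expanded form needs an explicit cast):

public class CompoundAssignDemo {
    public static void main(String[] args) {
        char c = 'a';
        // c = c + 'A';          // would not compile: c + 'A' has type int
        c = (char) (c + 'A');    // explicit conversion back to char
        char d = 'a';
        d += 'A';                // compiles: evaluated as d = (char)(d + 'A')
        System.out.println((int) c + " " + (int) d); // both print 162
    }
}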

Language-Specific Expressions. In JavaI the set of SpecificExpressions and


therefore the submachine SpecificExpI is empty, whereas in the model for C#I
the set contains checked and unchecked expressions ‘checked’ ‘(’ Exp ‘)’ and
‘unchecked’ ‘(’ Exp ‘)’. The notion of Checked positions serves to define when
an operator exception occurs due to arithmetical Overflow (for which case a rule
will be added in the model for LE ). The principle is that operators for integral
types only throw overflow exceptions in a checked context except for the division
by zero; operators for the type decimal always throw overflow exceptions. By
default every position is unchecked, unless explicitly declared otherwise. This is
formally expressed as follows.

UopException(uop, val ) ⇐⇒ Checked (pos) ∧ Overflow (uop, val )


BopException(bop, val1 , val2 ) ⇐⇒
DivisionByZero(bop, val2 ) ∨ DecimalOverflow (bop, val1 , val2 ) ∨
(Checked (pos) ∧ Overflow (bop, val1 , val2 ))
Checked (α) ⇐⇒ label (α) = Checked ∨
(label (α) = Unchecked ∧ up(α) = Undef ∧ Checked (up(α)))

As a consequence of these definitions and of the fact that the extension by


rules to handle exceptions is defined in the model extension ExecLE , the fol-
lowing SpecificExpI rules of ExecC#I do not distinguish between checked and
unchecked expression evaluation.
match context(pos)
checked(exp) → pos := exp
checked( val ) → YieldUp(val )
unchecked(exp) → pos := exp
unchecked( val ) → YieldUp(val )
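Java itself has neither checked nor unchecked expressions: its integer operators silently wrap around on overflow. As a rough analogue of a checked context one may use the exact arithmetic methods of java.lang.Math, as the following small program illustrates:

public class OverflowDemo {
    public static void main(String[] args) {
        int big = Integer.MAX_VALUE;
        System.out.println(big + 1);                   // wraps to Integer.MIN_VALUE ("unchecked" behaviour)
        try {
            System.out.println(Math.addExact(big, 1)); // throws on overflow ("checked" behaviour)
        } catch (ArithmeticException e) {
            System.out.println("overflow detected: " + e.getMessage());
        }
    }
}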

Statement Execution Rules. The machine ExecLStmI is defined statement-


wise. It transfers control from structured statements to the appropriate substate-
ments, until the current statement has been computed normally or abrupts the
computation. Abruptions trigger the control to propagate through all the enclos-
ing statements up to the target labeled statement. The concept of propagation is
defined for LI in such a way that in the refined model LE the concept of finally
blocks can easily be specified. In case of a new execution of the body of a while
statement, the previously computed intermediate results have to be cleared.5
Since we formulate the model for the human reader, we use the . . .-notation,
for example in the rules for abruption or for sequences of block statements.
This avoids having to fuss with an explicit formulation of the context, typically
determined by a walk through a list.

ExecLStmI ≡ match context(pos)


; → Yield(Norm)
exp; → pos := exp

val ; → YieldUp(Norm)
JumpStm(L)
if (exp) stm1 else stm2 → pos := exp
if ( val ) stm1 else stm2 → if val then pos := stm1 else pos := stm2
if (True)  Norm else stm → YieldUp(Norm)
if (False) stm else  Norm → YieldUp(Norm)
while (exp) stm → pos := exp
while ( val ) stm → if val then pos := stm
else YieldUp(Norm)
while (True)  Norm → {pos := up(pos), ClearValues(up(pos))}
PropagateJump(L)
type loc; → Yield(Norm)
lab: stm → pos := stm
lab:  Norm → YieldUp(Norm)
SpecificStmI

... abr . . . → if up(pos) = Undef ∧ PropagatesAbr (up(pos)) then
YieldUp(abr )
{} → Yield(Norm)
{stm . . . } → pos := stm
{ . . .  Norm} → YieldUp(Norm)
{ . . .  Norm stm . . . } → pos := stm
JumpOutOfBlockStm
{ . . .  abr . . . } → YieldUp(abr )

In JavaI the set JumpStm consists of the jump statements break lab; and
continue lab;, so that the set of abruptions is defined as Abr = Break (Lab) |

5 ClearValues is needed in the present rule formulation due to our decision to have
a static function label and a dynamic function to record the intermediate values
associated to nodes. In a more syntax-oriented SOS-style, as used for the Java model
in [14] where a function restbody combines the two functions label and values into
one, ClearValues results automatically from re-installing the body of the while
statement as new rest program.

Continue(Lab). In C#I the set JumpStm contains the jump statements break; |
continue; | goto lab;, so that Abr = Break | Continue | Goto(Lab). The
differences in the scoping rules for break; and continue; statements in the two
languages are reflected by differences in the corresponding interpreter rules.
JumpStm(Java) ≡ match context(pos)
break lab; → Yield(Break (lab))
continue lab; → Yield(Continue(lab))

PropagateJump(Java) ≡ match context(pos)


lab:  Break (labb ) → if lab = labb then YieldUp(Norm)
else YieldUp(Break (labb ))
lab:  Continue(labc ) → if lab = labc then
{pos := up(pos), ClearValues(up(pos))}
else YieldUp(Continue(labc ))
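The Java rules above correspond to the familiar labeled break and continue statements; a small Java program makes the abruptions Break(lab) and Continue(lab) and their target visible:

public class LabeledJumpDemo {
    public static void main(String[] args) {
        outer:
        for (int i = 0; i < 3; i++) {
            for (int j = 0; j < 3; j++) {
                if (j == 2) continue outer; // abruption Continue(outer): next iteration of the outer loop
                if (i == 2) break outer;    // abruption Break(outer): terminates the labeled statement
                System.out.println(i + "," + j);
            }
        }
    }
}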
JumpStm(C#) ≡ match context(pos)
break; → Yield(Break )
continue; → Yield(Continue)
goto lab; → Yield(Goto(lab))

PropagateJump(C#) ≡ match context(pos)


while (True)  Break → YieldUp(Norm)
while (True)  Continue → {pos := up(pos), ClearValues(up(pos))}
while (True)  abr → YieldUp(abr )

Due to the differences in using syntactical labels to denote jump statement


scopes, the definitions of how abruptions are propagated upwards differ slightly
for JavaI and for C#I , though the conceptual content is the same, namely to pre-
vent propagation at statements which are relevant for determining the abruption
target. For JavaI we have the simple definition6
PropagatesAbr (α) ⇐⇒ label (α) = LabeledStm
whereas for C#I we have the following definition:
PropagatesAbr (α) ⇐⇒ label (α) ∈
/ {Block , While, Do, For , Switch}
Since Java has no goto statements, it has an empty JumpOutOfBlockStm
rule, whereas ExecC#StmI contains the rule
JumpOutOfBlockStm ≡ match context(pos)
{ . . .  Goto(l ) . . . } → let α = GotoTarget(first(up(pos)), l )
if α = Undef then
{pos := α, ClearValues(up(pos))}
else YieldUp(Goto(l ))
6 We disregard here the minor difference in the formulation of PropagatesAbr in [14],
where the arguments are not positions, but syntactical constructs or intermediate
values.

where an auxiliary function is needed to compute the target of a label in a list


of block statements, recursively defined as follows:
GotoTarget(α, l ) =
if label (α) = Lab(l ) then α
elseif next(α) = Undef then Undef
else GotoTarget(next(α), l )
Analogously to ExecC#ExpI also ExecC#StmI has checked contexts and
therefore the following submachine (which in ExecJavaStmI is empty):
SpecificStmI ≡ match context(pos)
checked block → pos := block
checked  Norm → YieldUp(Norm)
unchecked block → pos := block
unchecked  Norm → YieldUp(Norm)
The auxiliary macro ClearValues(α) to clear all values in the subtree at
position α can be defined by recursion as follows, proceeding from top to bottom
and from left to right7 :
ClearValues(α) ≡
values(α) := Undef
if first(α) = Undef then ClearValuesSeq(first(α))

ClearValuesSeq(α) ≡
ClearValues(α)
if next(α) = Undef then ClearValuesSeq(next(α))

3 Extension LC of LI by Procedures (Static Classes)


In LC the concept of procedures (also called subroutines or functions) is added
to the purely imperative instructions of LI . We introduce the basic mechanism
of procedures first for so-called static methods, which belong to classes playing
the role of modules. Different languages have different mechanisms to pass the
parameters to a procedure call. In Java parameters are passed by-value, whereas
in C# also call-by-reference is possible. Classes8 come with variables which play
the role of global module variables, called class or static variables or fields to
distinguish them from instance fields provided in LO . Usually classes come with
some special methods, so-called static initializers or static constructors, which
are used to ‘initialize’ the class. The initialization concepts of different languages
usually differ, in particular through different policies of when a class is initial-
ized. In the extension ExecLC of ExecLI we illustrate these differences for Java

7 Intuitively it should be clear that the execution of this recursive ASM provides
simultaneously – in one step – the set of all updates of all its recursive calls, as is
needed here for the clearing purpose; see [3] for a precise definition.
8 We disregard here the slight variations to be made for interfaces.

and C#. Normally classes are also put into a hierarchy, which is used to inherit
methods among classes to reduce the labor of rewriting similar code fragments.
As is to be expected, different languages come with different inheritance mech-
anisms related to their type concepts. Since the definition of the inheritance
mechanism belongs mainly to the static semantics of the language, we mention
it only where it directly influences the description of the dynamics.
We present the extension as a conservative (purely incremental) refinement
of the ASM ExecLI , which is helpful for proving properties of the extended
machine on the basis of properties of the basic machine. Conservative refinement
means that we perform the following three tasks (see [2] for a general description
of the ASM refinement method).
Extension of the ASM universes and functions, or introduction of new ones,
for example to reflect the grammar extensions for expressions and state-
ments. This goes together with the addition of the appropriate constraints
needed for the static analysis of the new items (like type constraints, definite
assignment rules, etc.).
Extension of some of the definitions or macros, here for example the predicate
PropagatesAbr (α), to make them work also for the newly occurring cases.
Addition of new ASM rules, in the present case to define the semantics of
the new expressions and statements.

3.1 Static Semantics of LC


In LC a program is a set of compilation units, each coming with declarations of
namespaces (also called packages), using directives (import declarations), type
declarations (classes, interfaces, structs and enumerations), conditions on class
extension, accessibility, visibility, etc. Since in this paper the focus is on dynamic
semantics, we assume nested namespaces to be realized by the adoption of fully
qualified names. We also do not discuss here the rules for class extensions (in-
heritance), for overriding of methods, for the accessibility of types and members
via access modifiers like public, private, etc. This allows us to make use, for
example, of a static function body(m) which associates to a method its code.
The extension of the grammars for Vexp, Sexp, Stm and thereby of the corre-
sponding ASM domains reflects the introduction of Classes with static Field s and
static Methods, which can be called with various arguments and upon returning
may pass a computed value to the calling method. The new set Arg of arguments
appearing here foresees that different parameters may be used. For example, Java
provides value parameters (so that Arg ::= Exp), whereas C# allows also ref and
out parameters (in which case Arg ::= Exp | ‘ref’ Vexp | ‘out’ Vexp). We do
not discuss here the different static constraints (on types, definite assignment,
reachability, etc.) which are imposed on the new expressions and statements in
different languages.9

9 See for example [8] for a detailed analysis of the extension of the definite assignment
rules needed when allowing besides by-value parameter passing (as does Java) also
call-by-reference (as does C#).

Vexp ::= . . . | Field | Class ‘.’ Field


Sexp ::= . . . | Meth ( [Args] ) | Class ‘.’ Meth ( [Args] )
Args ::= Arg {‘,’ Arg}
Stm ::= . . . | ‘return’ Exp ‘;’ | ‘return’ ‘;’
The presence of method calls and of to-be-initialized classes makes it nec-
essary to introduce new universes to denote multiple methods (pairs of type
and signature), the initialization status of a type (which may have additional
elements in specific languages, e.g., Unusable for the description of class initial-
ization errors in Java, see below) and the sequence of still active method calls
(so-called frame stack of environments of method executions). One also has to
extend the set Abr of reasons for abruption by returns from a method, with or
without a computed value which has to be passed to the caller.
Meth = Type × Msig
TypeState = Linked | InProgress | Initialized
Frame = Meth × Pos × Locals × Values
where Values = (Pos → Result) and Locals = (Loc → Adr )
A method signature Msig consists of the name of a method plus the sequence
of types of the arguments of the method. A method is uniquely determined by
the type in which it is declared and its signature. The reasons for abruptions are
extended by method return:
Abr = . . . | Return | Return(Value)

3.2 Dynamic Semantics of LC


To dynamically handle the (addresses of) static fields, the initialization state of
types, the current method and the execution stack, we use the following new
dynamic functions:
globals: Type × Field → Adr frames: List(Frame)
typeState: Type → TypeState meth: Meth
To allow us to reuse without any rewriting the ExecLI rules as part of the
ExecLC rules, we provide a separate notation (meth, pos, locals, values) for the
current frame, instead of having it on top of the frame stack. We extend the
stipulations for the initial state as follows:
typeState(c) = Linked for each class c
meth = EntryPoint::Main() [EntryPoint is the main class]
pos = body(meth) [The root position of the body]
locals = values = ∅ and frames = []
The submachine ExecLC extends the interpreter ExecLI for LI by addi-
tional rules for the evaluation of the new expressions and for the execution of
return statements. In the same way the further refinements in the sections below
consist in the parallel addition of appropriate submachines.

ExecLC ≡
ExecLExpC
ExecLStmC

Expression Evaluation Rules. The rules in ExecLExpC for class field evalu-
ation are analogous to those for the evaluation of local variables in ExecLExpI ,
except for using globals instead of locals and for the additional clause for class ini-
tialization. The rules for method calls use the macro InvokeStatic explained
below, which takes care of the class initialization. The submachine ArgEval
for the evaluation of sequences of arguments depends on the evaluation strat-
egy of L. The definition of the submachine ParamExp for the evaluation of
special parameter expressions depends on the parameter types provided by the
language L. If Arg = Exp as in Java, this machine is empty; for the case of C#,
where Arg ::= Exp | ‘ref’ Vexp | ‘out’ Vexp, we show below its definition.

ExecLExpC ≡ match context(pos)


c.f → if Initialized (c) then YieldIndirect(globals(c::f ))
else Initialize(c)
c.f = exp → pos := exp
c.f =  val → if Initialized (c) then
WriteMem(globals(c::f ), type(c::f ), val )
YieldUp(val )
else Initialize(c)
c.m(args) → pos := (args)
c.m  (vals) → InvokeStatic(c::m, vals)
ArgEval
ParamExp

Once the arguments of a method call are computed, InvokeStatic invokes


the method if the initialization of its class is not triggered, otherwise it initial-
izes the class. In both Java and C#, the initialization of a class is not triggered
if the class is already initialized.10 For methods which are not declared exter-
nal or native, InvokeMethod updates the frame stack and the current frame
in the expected way (the same in both Java and C), taking care also of the
initialization of local variables, which includes passing the call parameters. Con-
sequently the definition of the macro InitLocals depends on the parameter
passing mechanism of the considered language L, which is different for Java
and for C. Since we will also have to deal with external (native) methods –
whose declaration includes an extern (native) modifier and which may be im-
plemented using a language other than L – we provide here for their invocation

10 See [9] for other cases where the initialization is not triggered in C#, in partic-
ular the refinement for classes which are marked with the implementation flag
beforefieldinit to indicate that the reference of the static method does not trigger
the class initialization.

a submachine InvokeExtern, to be defined separately depending on the class


of external/native (e.g. library) methods. The predicate StaticCtor recognizes
static class constructors (class initialization methods); their implicit call inter-
rupts the member access at pos, to later return to the evaluation of pos instead
of up(pos).

InvokeStatic(c::m, vals) ≡
if triggerInit(c) then Initialize(c) else InvokeMethod(c::m, vals)
where triggerInit(c) = ¬Initialized (c) ∧ ¬BeforeFieldInit(c)

InvokeMethod(c::m, vals) ≡
if extern ∈ modifiers(c::m) then InvokeExtern(c::m, vals)
else let p = if StaticCtor (c::m) then pos else up(pos) in
frames := push(frames, (meth, p, locals, values))
meth := c::m
pos := body(c::m)
values := ∅
InitLocals(c::m, vals)
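A straightforward Java data structure for the frame stack manipulated by InvokeMethod and ExitMethod could look as follows; the components mirror Frame = Meth × Pos × Locals × Values, and all concrete types and names are illustrative assumptions of this sketch.

import java.util.ArrayDeque;
import java.util.Deque;
import java.util.Map;

// Sketch of the frame stack: each frame stores the caller's method, position,
// local environment and intermediate values, the data saved by InvokeMethod.
final class Frame {
    final String meth;                 // current method (e.g. "c::m")
    final Object pos;                  // position to resume at in the caller
    final Map<String, Integer> locals; // Loc -> Adr
    final Map<Object, Object> values;  // Pos -> Result

    Frame(String meth, Object pos, Map<String, Integer> locals, Map<Object, Object> values) {
        this.meth = meth; this.pos = pos; this.locals = locals; this.values = values;
    }
}

final class FrameStack {
    private final Deque<Frame> frames = new ArrayDeque<>();

    void push(Frame caller) { frames.push(caller); } // InvokeMethod: save the caller
    Frame pop()             { return frames.pop(); } // ExitMethod: restore the caller
    boolean isEmpty()       { return frames.isEmpty(); }
}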

The definition of the macro InitLocals for the initialization of local vari-
ables depends on the parameter passing mechanism. In Java the macro simply
defines locals (which assumes the role of mem in our general model) to take as
first arguments the actual values of the call parameters (the ValueParams for
call-by-value). In C# one has to add a mechanism to pass reference parameters,
including so-called out parameters, which can be treated as ref parameters ex-
cept that they need not be definitely assigned until the function call returns.
In the following definition of InitLocals for C#, all (also simultaneous) appli-
cations of the external function new during the computation of the ASM are
supposed to provide pairwise different fresh elements from the underlying do-
main Adr .11 paramIndex (c::m, x ) yields the index of the formal parameter x in
the signature of c::m.

InitLocals(c::m, vals)(C#) ≡
forall x ∈ LocalVars(c::m) do // addresses for local variables
locals(x ) := new (Adr , type(x ))
forall x ∈ ValueParams(c::m) do // copy value arguments
let adr = new (Adr , type(x )) in
locals(x ) := adr
WriteMem(adr , type(x ), vals(paramIndex (c::m, x )))
forall x ∈ RefParams(c::m) ∪ OutParams(c::m) do
locals(x ) := vals(paramIndex (c::m, x )) // ref and out arguments

The difference between ref and out parameters at function calls and in
function bodies of C# is reflected by including as AddressPositions all nodes

11 See [10] and [5, 2.4.4] for a justification of this assumption. See also the end of Sect. 4
where we provide an abstract specification of the needed memory allocation.

whose parent node is labeled by ref or out and by adding corresponding definite
assignment constraints (listed in [4]):

AddressPos(α) ⇐⇒ FirstChild(α) ∧ label (up(α)) ∈ {ref, out, ++, --} ∪ Aop

Therefore the following rules of ParamExp for C# can ignore ref and out:
ParamExp(C#) ≡ match context(pos)
ref vexp → pos := vexp
ref  adr → YieldUp(adr )
out vexp → pos := vexp
out  adr → YieldUp(adr )
For the sake of illustration we provide here a definition for the submachine
ArgEval with left-to-right evaluation strategy for sequences of arguments. The
definition has to be modified in case one wants to specify another evaluation
order for expressions, involving the use of the ASM choose construct if some
non-deterministic choice has to be formulated. For a discussion of such model
variations we refer to [15] where an ASM model is developed which can be
instantiated to capture the different expression evaluation strategies in Ada95,
C, C++, Java, C# and Fortran.
ArgEval ≡ match context(pos)
() → Yield([])
(arg, . . . ) → pos := arg
(val1 , . . . , valn ) → YieldUp([val1 , . . . , valn ])
( . . .  val ,arg . . . ) → pos := arg
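For Java the left-to-right strategy is fixed by the language, which a one-line experiment makes visible:

public class ArgOrderDemo {
    static int traced(String tag, int v) { System.out.println("evaluating " + tag); return v; }
    static int add(int a, int b) { return a + b; }

    public static void main(String[] args) {
        // Java evaluates argument lists strictly from left to right.
        System.out.println(add(traced("first", 1), traced("second", 2)));
        // prints: evaluating first, evaluating second, 3
    }
}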

Statement Execution Rules. The semantics of static initialization is lan-


guage dependent and is further discussed below for Java and C#. The rules
for method return in ExecLStmC trigger an abruption upon returning from a
method. Via the ReturnPropagation submachine defined below, an abrup-
tion Return or Return(val ) due to method return is propagated to the beginning
of the body of the method one is returning from. There an execution of the sub-
machine ExitMethod is triggered, which restores the environment of the caller.
This abruption propagation mechanism allows one an elegant refinement for LE ,
where the method exit is subject to the prior execution of so-called finally
code which may be present in the method. The rule to YieldUp(Norm) does
not capture falling off the method body, but yields up the result of the normal
execution of the invocation of a method with void return type in an expression
statement.

ExecLStmC ≡ match context(pos)


StaticInitializer(L)
return exp; → pos := exp
return  val ; → YieldUp(Return(val ))
return; → Yield(Return)
ReturnPropagation(L)

Norm; → YieldUp(Norm)
The return propagation machine for C# is simpler than (in fact part of) that
for Java due to static differences (including the different use of labels) in the
two languages. As mentioned above, both machines, instead of transferring the
control from a return statement directly to the invoker, propagate the return
abruption up to the starting point of the current method body, from where the
method is exited.
ReturnPropagation(C#) ≡ match context(pos)
Return → if pos = body(meth) ∧ ¬Empty(frames) then
ExitMethod(Norm)
Return(val ) → if pos = body(meth) ∧ ¬Empty(frames) then
ExitMethod(val )
ReturnPropagation(Java) ≡ match context(pos)
lab :  Return → YieldUp(Return)
lab :  Return(val ) → YieldUp(Return(val ))
ReturnPropagation(C#)
To complete the return propagation in Java one still has to treat the special
case of a return from a class initialization method. In [14, Fig. 4.5] this has
been formulated as part of the StaticInitializer machine, which also realizes
the condition for the semantics of Java that before initializing a class, all its
superclasses have to be initialized. To stop the return propagation at the point of
return from a class initialization, in the case of Java the predicate PropagatesAbr
has to be refined as follows:
PropagatesAbr (α) ⇐⇒ label (α) ∈
/ {LabeledStm, StaticBlock }
In C# the initialization of a class does not trigger the initialization of its direct
base class, so that StaticInitializer(C#) is empty.
StaticInitializer(Java) ≡ match context(pos)
static stm → let c = classNm(meth) in
if c = Object ∨ Initialized (super (c)) then pos := stm
else Initialize(super (c))
static Return → YieldUp(Return)
The machine ExitMethod, which is the same for Java and for C# (modulo
the submachine FreeLocals), restores the frame of the invoker and passes
the result value (if any). Upon normal return from a static constructor it also
updates the typeState of the relevant class as Initialized . We also add a rule
FreeLocals to free the memory used for local variables and value parameters,
using an abstract notion FreeMemory of how addresses of local variables and
value parameters are actually de-allocated.12

12 Under the assumption of a potentially infinite supply of addresses, which is often
made when describing the semantics of a programming language, one can dispense
with FreeLocals.

ExitMethod(result) ≡
let (oldMeth, oldPos, oldLocals, oldValues) = top(frames) in
meth := oldMeth
pos := oldPos
locals := oldLocals
frames := pop(frames)
if StaticCtor (meth) ∧ result = Norm then
typeState(type(meth)) := Initialized
values := oldValues
else
values := oldValues ⊕ {oldPos → result}
FreeLocals

FreeLocals ≡
forall x ∈ LocalVars(meth) ∪ ValueParams(meth) do
FreeMemory(locals(x ), type(x ))

For both Java and C#, a type c is considered as initialized if its static con-
structor has terminated normally, as is expressed by the update of typeState(c)
to Initialized in ExitMethod above. In addition, c is considered as initialized
already if its static constructor has been invoked, to guarantee that during the
execution of the static constructor accesses to the fields of c or invocations of
methods of c do not trigger a new initialization of c. This explains the update
of typeState(c) to InProgress in the definition of Initialize and the following
definition of Initialized :

Initialized (c) ⇐⇒ typeState(c) = Initialized ∨ typeState(c) = InProgress

To initialize a class its static constructor is invoked (denoted <clinit> in


Java and .cctor in C#). All static fields of the class are initialized with their
default value. The typeState of the class is updated to prevent further invocations
of Initialize(c) during the execution of the static constructor of c. The macro
will be further refined in LE to account for exceptions during an initialization.

Initialize(c) ≡
if typeState(c) = Linked then
typeState(c) := InProgress
forall f ∈ staticFields(c) do
let t = type(c::f ) in WriteMem(globals(c::f ), t, defaultValue(t))
InvokeMethod(c::.cctor, [])

With respect to the execution of initializers of static class fields the ECMA
standard [7, §17.4.5.1] for C# says that the static field initializers of a class cor-
respond to a sequence of assignments that are executed in the textual order in
which they appear in the class declaration. If a static constructor exists in the
class, execution of the static field initializers occurs immediately prior to execut-
ing that static constructor. Otherwise, the static field initializers are executed at
an implementation-dependent time prior to the first use of a static field of that
class.
Our definitions above for C# express the decision taken by Microsoft’s cur-
rent C# compiler, which in the second case creates a static constructor and
adds the beforefieldinit flag to the class. If one wants to reflect also the
non-determinism suggested by the ECMA formulation, one can formalize the
implementation-dependent initialization of beforefieldinit types by the fol-
lowing rule:13
ExecC#C ≡ choose x ∈ {0, 1} do
if x = 0 then {ExecC#ExpC , ExecC#StmC }
if x = 1 then
choose c ∈ Class with BeforeFieldInit(c) ∧ ¬Initialized (c) do
Initialize(c)
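The Java policy that the first active use of a class triggers its initialization, and that the static constructor runs at most once, can be observed directly:

class Config {
    static { System.out.println("initializing Config"); } // the <clinit> body
    static int value = 42;
}

public class ClassInitDemo {
    public static void main(String[] args) {
        System.out.println("before first use");
        System.out.println(Config.value); // first access: triggers Initialize(Config)
        System.out.println(Config.value); // class already Initialized: initializer does not run again
    }
}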

4 Extension LO of LC by Object-Oriented Features


In this section we extend LC to an object-oriented language LO by adding ob-
jects for class instances, formally represented as elements of a new set Ref of
references. The extension provides new expressions, namely for instance fields,
instance methods and constructors, and for the dynamic creation of new class
instances. The inheritance mechanism we consider supports overriding and over-
loading of methods, dynamic type checks, and type casts. We skip the (via
syntactical differences partly language-specific) treatment of arrays; their de-
scription for Java and C# can be found in [14, 4]. The interpreter ExecLO is
defined as a refinement of ExecLC , obtained from the latter by extending its
universes, functions, macros and rules to make them work also for the new ex-
pressions.

4.1 Static Semantics of LO


The first extension concerns the sets Exp, Vexp, Sexp where the new reference
types appear. ‘null’ denotes an empty reference, ‘this’ is interpreted as the
current reference. A RefExp is an expression of a reference type. We use ‘pred’
to denote a predecessor class, in Java written ‘super’ and in C# ‘base’.
Exp ::= . . . | ‘null’ | ‘this’ | Exp ‘.’ Field | ‘(’ Type ‘)’ Exp | SpecificExp(L)
Vexp ::= . . . | Vexp ‘.’ Field | RefExp ‘.’ Field | ‘pred’ ‘.’ Field
Sexp ::= . . . | ‘new’ Type ( [Args] ) | Exp ‘.’ Meth ( [Args] )
| ‘pred’ ‘.’ Meth ( [Args] )

13
This is discussed in detail in [9]. The reader finds there also a detection of further
class initialization features that are missing in the ECMA specification, related to
the definition of when a static class constructor has to be executed and to the
initialization of structs.

The specific expressions of JavaI and CI are extended by specific object-
oriented expressions of these languages as follows:

SpecificExp(Java) ::= . . . | Exp ‘instanceof’ Type


SpecificExp(C#) ::= . . . | ‘typeof’ ‘(’ RetType ‘)’ | Exp ‘is’ Type
| Exp ‘as’ RefType
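
For orientation, the following Java fragment (ours, purely illustrative) uses most of the new expression forms: null and this, instance field access, a type cast, instance creation, instance method invocation, a call via the predecessor class (super), and the Java-specific instanceof test.

// Illustrative Java fragment exercising the expression forms added in LO.
class A {
    int f = 1;
    int m() { return f; }
}

class B extends A {
    int f = 2;                                         // hides A.f
    @Override int m() { return super.m() + this.f; }   // 'pred'.Meth and 'this'

    static int demo() {
        A x = new B();                     // 'new' Type(Args)
        int viaCast = ((B) x).f;           // '(' Type ')' Exp, then Exp.Field
        boolean isB = x instanceof B;      // Java-specific expression
        A none = null;                     // 'null'
        return x.m() + viaCast + (isB ? 1 : 0) + (none == null ? 0 : -1);
    }

    public static void main(String[] args) {
        System.out.println(demo());        // prints 6 (3 + 2 + 1 + 0)
    }
}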

Type Structure. To be able to explain by variations of our interpreter ExecL the major differences between Java and C#, we need to mention here some of the major differences in the type structure underlying the two languages. For efficiency reasons C# distinguishes between value types and reference types. When a compiler encounters the declaration of a variable of value type, it directly allocates the memory to that variable, whereas for declarations of variables of reference type it creates a pointer to an object on the heap. A mediation between the two kinds of types is provided, known under the name of boxing, which converts values into references, together with an inverse operation called unboxing. At the level of LO , besides the new type of References (the set Ref ) present in both languages, C# also introduces so-called Struct types, a value-type restricted version of classes, to circumvent the overhead associated with class instances.

Therefore, to be prepared to instantiate our L-interpreter to an interpreter for both Java and C#, the domain of values of LI is extended to contain not only References (with a special value null ∈ Ref to denote a null reference), as would suffice for interpreting JavaO , but also struct values. For the case of C# we assume furthermore references to be different from addresses, i.e., Ref ∩ Adr = ∅.
Value = SimpleValue ∪ Adr ∪ Ref ∪ Struct
The set Struct of struct values can be defined as the set of mappings from
StructType::Field to Value. The value of an instance field of a value of struct
type T can then be extracted by applying the map to the field name, i.e.,
structField (val , T , f ) = val (f ). We abstract from the implementation-dependent
layout of structs and objects and use a function
fieldAdr : (Adr ∪ Ref ) × Type::Field → Adr
to record addresses of fields. This function is assumed to satisfy the following
properties, where the static function
instanceFields: Type → Powerset(Type::Field )
yields the set of instance fields of any given type t; if t is a class type, it includes
the fields declared in all pred(ecessor) classes of t:
If t is a struct type, then fieldAdr (adr , t::f ) is the address of field f of a value
of type t stored in mem at address adr .
A value of struct type t at address adr occupies the following addresses:
{fieldAdr (adr , f ) | f ∈ instanceFields(t)}

[Fig. 2 is a tree diagram classifying the types of C#: return types comprise the void type and the (value and reference) types; value types comprise enum types and struct types, the struct types including the simple types (bool, the numeric types, i.e. the integral types sbyte, byte, short, ushort, int, uint, long, ulong, char, the floating-point types float and double, and decimal); reference types comprise class types, interface types, array types, delegate types and the null type.]

Fig. 2. The classification of types of C#

If runTimeType(ref ) is a class type, then fieldAdr (ref , t::f ) is the address of field t::f of the object referenced by ref, and the object represented by ref occupies the addresses
{fieldAdr (ref , f ) | f ∈ instanceFields(c)}
where c = runTimeType(ref ).
For our language L we do not specify further types. For the sake of illustration see Fig. 2 with the extended type classification of C#, where the simple types of LI became aliases for struct types.

Syntax-Tree Information. According to our assumption that the attributed syntax tree has exact information, for the formulation of our model we assume
as result of field and method resolution that each field access has the form T ::f ,
where f is a field declared in the type T . Similarly, each method call has the
form T ::m(args), where m is the signature of a method declared in type T .
Moreover, for the access of fields and methods via the current instance or the
predecessor class we know the following:
pred.f in class C has been replaced by this.B ::f , where B is the first
predecessor class of C where the field f is declared.
pred.m(args) in class C has been replaced by this.B ::M (args), where
B ::M is the method signature of the method selected by the compiler (the
set of applicable methods is constructed starting in the pred class of C ).

If f is a field, then f has been replaced by this.T ::f , where f is declared in T .
Instance creation expressions are treated like ordinary method invocations,
splitting an instance creation expression into a creation part and an invocation
of an instance constructor. To make the splitting correctly reflect the intended
meaning of new T ::M (args), we assume in our model without loss of generality
that class instance constructors return the value of this.14
Let T be a class type. Then the instance creation expression new T ::M (args)
is replaced by new T .T ::M (args).
Also for constructors of structs we assume that they return the value of this.
For instance constructors of structs one has to reflect that in addition they need
an address for this. Let S be a struct type. Then:
vexp = new S ::M (args) has been replaced by vexp.S ::M (args). This reflects
that such a new triggers no object creation or memory allocation since structs
get their memory allocated at declaration time.
Other occurrences of new S ::M (args) have been replaced by x .S ::M (args),
where x is a new temporary local variable of type S .
For automatic boxing we have:
vexp = exp is replaced by vexp = (T )exp if type(exp) is a value type, T = type(vexp) and T is a reference type. In this case we must have type(exp) ≼ T , where ≼ denotes the (here not further specified) subtype relation (standard implicit conversion) resulting from the inheritance and the ‘implements’ relation between classes and interfaces.
arg is replaced by (T )arg if type(arg) is a value type, the selected method expects an argument of type T and T is a reference type. In this case we must have type(arg) ≼ T .
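
Java's automatic boxing of primitive values is the closest analogue of these compile-time insertions of (T )exp; the following example (ours) shows both cases, an assignment and an argument position:

import java.util.ArrayList;
import java.util.List;

// Java analogue (illustrative): an int is boxed wherever a reference type is expected.
class BoxingDemo {
    static void expectsReference(Object o) { System.out.println(o.getClass()); }

    public static void main(String[] args) {
        Object boxed = 42;          // assignment: value on the right, reference type on the left
        expectsReference(7);        // argument position: the method expects a reference type
        List<Integer> xs = new ArrayList<>();
        xs.add(3);                  // boxed on insertion
        int unboxed = xs.get(0);    // un-boxing yields a copy of the value
        System.out.println(boxed + ", " + unboxed);
    }
}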

4.2 Dynamic Semantics of LO


Two new dynamic functions are needed to keep track of the runTimeType: Ref →
Type of references and of the type object typeObj : RetType → Ref of a given
type, where RetType ::= Type | ‘void’. The memory function is extended to store
also references:
mem: Adr → SimpleValue ∪ Ref ∪ {Undef }
For boxing we need a dynamic function valueAdr : Ref → Adr to record
the address of a value in a box. If runTimeType(ref ) is a value type, then
valueAdr (ref ) is the address of the struct value stored in the box.

14
The result of a constructor invocation with new is the newly created object, which
is stored in the local environment as value for this. Therefore one can equivalently
refine the macro ExitMethod for constructors to pass the value of this upon
returning from a constructor, see [14, pg. 82].

The this reference is treated as first parameter with index zero. It is passed by value in instance methods of classes. It is passed by reference in struct methods: as an out parameter in constructors and as a ref parameter in instance methods.
For the refinement of the ExecLC transition rules it suffices to add the new
machine ExecLExpO for evaluating the new expressions, since LO introduces
no new statements.

ExecLO ≡
ExecLExpO

In ExecLExpO the type cast rule contains three clauses concerning value types, which are needed for C# but are not present for Java. In fact for Java the first RefType subrule suffices, the one where both the declared type and the target type are compatible reference types and the reference is passed through. FieldExpO contains the rules for field access and assignment as needed for C#, where for Java the additional access rule for value types is not needed (and the macros for getting and setting field values are simplified correspondingly). NewO differs for Java and C#, reflecting the different scheduling for the initialization, as specified below. The rules for instance method invocation are the same for Java and C# modulo different definitions for the macro Invoke and except that for C# an additional clause is needed for StructValueInvocations. A struct value invocation is a method invocation on a struct value which is not stored in a variable. For such struct values a temporary storage area (called ‘home’) has to be created which is passed in the invocation as value of this. The submachine SpecificExpO is specified below for Java and C#.

ExecLExpO ≡ match context(pos)


null → Yield(null )
this → YieldIndirect(locals(this))
(t)exp → pos := exp
(t) val →
if type(pos) ∈ RefType then
if t ∈ RefType ∧ (val = null ∨ runTimeType(val ) ≼ t) then
YieldUp(val ) // pass reference through
if t ∈ ValueType ∧ val ≠ null ∧ t = runTimeType(val ) then
// un-box a copy of the value
YieldUp(memValue(valueAdr (val ), t))
if type(pos) ∈ ValueType then
if t = type(pos) then YieldUp(val ) // compile-time identity
if t ∈ RefType then YieldUpBox(type(pos), val ) // box value
FieldExpO
NewO
exp.T ::M (args) → pos := exp

val .T ::M (args) →
pos := (args)

if StructValueInvocation(up(pos)) then
let adr = new (Adr , type(pos)) in // create home for struct value
WriteMem(adr , type(pos), val )
values(pos) := adr
val .T ::M  (vals) → Invoke(val , T ::M , vals)
SpecificExpO

The following definition formalizes that a struct value invocation is a method invocation on a struct value which is not stored in a variable.

StructValueInvocation(exp.T ::M (args)) ⇐⇒ type(exp) ∈ StructType ∧ exp ∉ Vexp

The rules for instance field access and assignment in FieldExpO are equivalent for Java and C# modulo two differences. The first difference comes through the different definitions for the macro SetField explained below. The second difference consists of the fact that C# needs the struct type clause formulated below (in the second rule for field access), which is not needed for Java.15 We use type(exp.t::f ) = type(t::f ).

FieldExpO ≡ match context(pos)


exp.t::f → pos := exp

val .t::f → if type(pos) ∈ ValueType ∧ val ∉ Adr then
YieldUp(structField (val , type(pos), t::f ))
elseif val ≠ null then
YieldUpIndirect(fieldAdr (val , t::f ))
exp1 .t::f = exp2 → pos := exp1

val .t::f = exp → pos := exp
val1 .t::f =  val2 → if val1 ≠ null then
SetField(val1 , t::f , val2 )
YieldUp(val2 )

The different schedules for the initialization of classes in Java and C# appear in the different definitions for their submachines NewO and Invoke. When creating a new class instance, Java checks whether the class is initialized. If not, it initializes the class. Otherwise it does what also the machine NewO (C#)
does, namely it creates a new class instance on the heap, initializing all in-
stance fields with their default values. See below for the detailed definition of
HeapInit.

15
As in most parts of this paper, we disregard merely notational differences between
the two models, here the fact that due to the presence of both memory addresses
and values, the CO model uses the machine YieldUpIndirect(fieldAdr (val , t::f ))
where the JavaO model has the simpler update YieldUp(getField (val , t::f )).

NewO (Java) ≡ match context(pos)


new c → if Initialized (c) then
let ref = new (Ref , c) in
HeapInit(ref , c)
Yield(ref )
else Initialize(c)

NewO (C#) ≡ match context(pos)


new c → let ref = new (Ref , c) in
runTimeType(ref ) := c
forall f ∈ instanceFields(c) do
let adr = fieldAdr (ref , f ) and t = type(f ) in
WriteMem(adr , t, defaultValue(t))
Yield(ref )

The Invoke rule for Java is paraphrased from [14, Fig. 5.2]. The compile-
time computable static function lookup yields the class where the given method
specification is defined in the class hierarchy, depending on the run-time type of
the given reference.

Invoke(val , T ::M , vals)(Java) ≡


let S = case callKind (up(pos)) of
Virtual → lookup(runTimeType(val ), T ::M )
Super → lookup(super (classNm(meth)), T ::M )
Special → T
InvokeMethod(S ::M , [val ]vals)
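
In Java source terms the three call kinds correspond to ordinary virtual calls, super calls, and compile-time bound calls such as private methods; a small illustration of ours:

// Illustrative Java example of the call kinds distinguished by Invoke.
class Base {
    String who() { return "Base"; }
}

class Derived extends Base {
    @Override String who() { return "Derived"; }

    private String secret() { return "private"; }   // callKind Special: bound at compile time

    String demo() {
        String v = who();         // Virtual: lookup(runTimeType(this), who) yields Derived.who
        String s = super.who();   // Super: lookup starts in super(Derived) = Base
        return v + "/" + s + "/" + secret();
    }

    public static void main(String[] args) {
        System.out.println(new Derived().demo());   // prints Derived/Base/private
    }
}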

C# performs the initialization test only in the very moment of performing
Invoke, after the evaluation of the constructor arguments. Thus the invocation
of an instance constructor of a class may trigger the class initialization (see the
detailed analysis in [9]). The split into virtual and non-virtual method calls is
reflected in the submachine InvokeInstance.

Invoke(val , T ::M , vals)(C#) ≡


if InstanceCtor (M ) ∧ triggerInit(T ) then Initialize(T )
elseif val ≠ null then InvokeInstance(T ::M , val , vals)

InvokeInstance(T ::M , val , vals) ≡


if callKind (T ::M ) = Virtual then // indirect call, val ∈ Ref
let S = lookup(runTimeType(val ), T ::M ) in
let this = if S ∈ StructType then valueAdr (val ) else val in
InvokeMethod(S ::M , [this] · vals)
if callKind (T ::M ) = NonVirtual then // direct call, val ∈ Adr ∪ Ref
InvokeMethod(T ::M , [val ] · vals)

The machines SpecificExpO define the semantics of the language-specific expressions listed above, which are all related to type checking.

SpecificExpO (Java) ≡ match context(pos)


exp instanceof t → pos := exp

val instanceof t → YieldUp(val ≠ null ∧ runTimeType(val ) ≼ t)
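
Note that the rule makes the result false whenever val is the null reference; for instance (example ours):

// instanceof is false on null references, true when the run-time type is compatible.
class InstanceOfDemo {
    public static void main(String[] args) {
        Object o = null;
        System.out.println(o instanceof String);   // false: null is not an instance of any type
        o = "hi";
        System.out.println(o instanceof String);   // true: runTimeType(o) is compatible with String
    }
}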
SpecificExpO (C#) contains SpecificExpO (Java) as a submachine (modulo
notational differences), namely consisting of the first and the third rule for the is-
instruction. In addition we have rules to yield the type of an object and for type
conversion between compatible types, which needs a new macro YieldUpBox
defined below for yielding the reference of a newly created box.
SpecificExpO (C#) ≡ match context(pos)
typeof(t) → Yield(typeObj (t))
exp is t → pos := exp

val is t → if type(pos) ∈ ValueType then
YieldUp(type(pos) ≼ t) // compile-time property
else
YieldUp(val ≠ null ∧ runTimeType(val ) ≼ t)
exp as t → pos := exp

val as t → if type(pos) ∈ ValueType then
YieldUpBox(type(pos), val ) // box a copy of the value
elseif (val ≠ null ∧ runTimeType(val ) ≼ t) then
YieldUp(val ) // pass reference through
else YieldUp(null ) // convert to null reference

Memory Refinement. Due to the appearance of reference (and in C# also struct) types an extension of the memory notion is needed. To model the dynamic
state of objects, storage is needed for all instance variables and to record to which
class an object belongs. The model for JavaO in [14] provides for this reason a
dynamic function heap: Ref → Heap to record every class instance together with
the values of its fields. Heap can be considered as an abstract set of elements
of form Object(t, fields), where fields is a map associating a value with each
field in instanceFields(t). One can then define two simple macros SetField and
GetField to manipulate references on this abstract heap as follows (where ⊕
denotes adding a new (argument,value)-pair to a function, or overwriting an
existing value by a new one):
GetField(ref , f )(Java) ≡ case heap(ref ) of
Object(t, fields) → fields(f )
SetField(ref , f , val )(Java) ≡ let Object(t, fields) = heap(ref ) in
heap(ref ) := Object(t, fields ⊕ {(f , val )})
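
A naive executable rendering of this abstract heap, with Object(t, fields) as a type name plus a field map, could look as follows in Java (all names are ours and only meant to mirror the macros above):

import java.util.HashMap;
import java.util.Map;

// Sketch of the abstract heap of the Java model: heap: Ref -> Object(t, fields).
class AbstractHeap {
    static final class Obj {                              // Object(t, fields)
        final String type;
        final Map<String, Object> fields = new HashMap<>();
        Obj(String type) { this.type = type; }
    }

    private final Map<Integer, Obj> heap = new HashMap<>();  // the dynamic function heap
    private int nextRef = 1;

    int newObject(String type) {                          // new(Ref, c); HeapInit would set defaults
        int ref = nextRef++;
        heap.put(ref, new Obj(type));
        return ref;
    }

    Object getField(int ref, String f) {                  // GetField(ref, f)
        return heap.get(ref).fields.get(f);
    }

    void setField(int ref, String f, Object val) {        // SetField(ref, f, val): fields ⊕ {(f, val)}
        heap.get(ref).fields.put(f, val);
    }

    public static void main(String[] args) {
        AbstractHeap h = new AbstractHeap();
        int r = h.newObject("Point");
        h.setField(r, "x", 3);
        System.out.println(h.getField(r, "x"));           // prints 3
    }
}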
For modeling CO a further refinement of both reading from and writing to
memory is needed, due to the presence of struct types and call-by-reference. The
notion of reading from the memory is refined by extending the simple equation
memValue(adr , t) = mem(adr ) of CI to fit also struct types, in addition to
reference types. This is done by the following simultaneous recursive definition
of memValue and getField along the given struct type.

memValue(adr , t) =
if t ∈ SimpleType ∪ RefType then mem(adr )
elseif t ∈ StructType then
{f → getField (adr , t::f ) | f ∈ instanceFields(t)}
getField (adr , t::f ) = memValue(fieldAdr (adr , t::f ), type(t::f ))

Similarly, writing to memory is refined from


WriteMem(adr , t, val ) ≡ mem(adr ) := val
in CI , recursively together with SetField along the given struct type:

WriteMem(adr , t, val ) ≡
if t ∈ SimpleType ∪ RefType then mem(adr ) := val
elseif t ∈ StructType then
forall f ∈ instanceFields(t) do SetField(adr , t::f , val (f ))

SetField(adr , t::f , val ) ≡ WriteMem(fieldAdr (adr , t::f ), type(t::f ), val )
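
The recursion can be made concrete with a flat memory map and struct values represented as field maps; the following Java sketch (ours, with a naive offset-based fieldAdr) mirrors the two definitions:

import java.util.HashMap;
import java.util.Map;

// Sketch of the refined memValue/WriteMem: struct values are read and written field by field.
class StructMemory {
    record Field(int offset, String type) {}                          // per-field layout entry

    final Map<Integer, Object> mem = new HashMap<>();                 // mem : Adr -> value
    final Map<String, Map<String, Field>> structs = new HashMap<>();  // struct type -> its fields

    boolean isStruct(String t) { return structs.containsKey(t); }

    int fieldAdr(int adr, String t, String f) { return adr + structs.get(t).get(f).offset(); }

    Object memValue(int adr, String t) {                  // simple/reference types: one cell,
        if (!isStruct(t)) return mem.get(adr);            // struct types: reassemble the field map
        Map<String, Object> val = new HashMap<>();
        structs.get(t).forEach((f, info) -> val.put(f, memValue(fieldAdr(adr, t, f), info.type())));
        return val;
    }

    @SuppressWarnings("unchecked")
    void writeMem(int adr, String t, Object val) {        // WriteMem / SetField combined
        if (!isStruct(t)) { mem.put(adr, val); return; }
        Map<String, Object> fields = (Map<String, Object>) val;
        structs.get(t).forEach((f, info) -> writeMem(fieldAdr(adr, t, f), info.type(), fields.get(f)));
    }

    public static void main(String[] args) {
        StructMemory m = new StructMemory();
        m.structs.put("Pair", Map.of("a", new Field(0, "int"), "b", new Field(1, "int")));
        m.writeMem(100, "Pair", Map.of("a", 1, "b", 2));
        System.out.println(m.memValue(100, "Pair"));      // prints {a=1, b=2} (map order may vary)
    }
}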

The notion of AddressPos from CC is refined to include also lvalue nodes of
StructType, so that address positions are of the following form:
ref □, out □, □++, □--, □ op= exp, □.f , □.m(args)

AddressPos(α) ⇐⇒ FirstChild(α) ∧
label (up(α)) ∈ {ref, out, ++, --} ∪ Aop ∨
(label (up(α)) = ’.’ ∧ α ∈ Vexp ∧ type(α) ∈ StructType)

YieldUpBox creates a box for a given value of a given type and returns its
reference. The run-time type of a reference to a boxed value of struct type t is
defined to be t. The struct is copied in both cases, when it is boxed and when
it is un-boxed.

YieldUpBox(t, val ) ≡ let ref = new (Ref ) and adr = new (Adr , t) in
runTimeType(ref ) := t
valueAdr (ref ) := adr
WriteMem(adr , t, val )
YieldUp(ref )

5 Extension LE of LO by Exceptions
LE extends LO with exceptions, designed to provide support for recovering from
abnormal situations, separating normal program code from exception handling
code. When an L-program violates certain semantic constraints at run-time, the
interpreter signals this as an exception. The control is transferred, from the point
where the exception occurred, to a point that can be specified by the program-
mer. An exception is said to be thrown from the point where it occurred, and it
is said to be caught at the point to which control is transferred. The model for LE

makes explicit how jump statements from LI , return statements from LC and the
initialization of classes from LO interact with catching and handling exceptions.
Technically, exceptions are represented as objects of predefined system excep-
tion classes (in Java java.lang.Throwable and in C# System.Exception) or of
user-defined application exception classes. Once ‘thrown’, these objects trigger
an abruption of the normal program execution to ‘catch’ the exception – in case
it is compatible with one of the exception classes appearing in the program in an
enclosing try-catch-finally statement. Optional finally statements are guaranteed
to be executed independently of whether the try statement completes normally
or is abrupted. We consider run-time exceptions, which correspond to invalid
operations violating the semantic constraints of the language (like an attempt
to divide by zero or to index an array outside its bounds) and user-defined ex-
ceptions. We do not treat errors, which are failures detected by the underlying virtual machine (JVM or CLR).

5.1 Static Semantics of LE


For the refinement of ExecLO by exceptions, it suffices to extend the static
semantics and to add the new rules for exception handling. The set of statements
is extended by throw and try-catch-finally statements as defined by the following grammar (where the throw statement without expression and so-called general catch clauses of form catch block are present only in C#, not in Java):

Stm ::= . . . | ‘throw’ Exp ‘;’ | ‘throw’ ‘;’


| ‘try’ Block {Catch} [‘catch’ Block] [‘finally’ Block]
Catch ::= ‘catch’ ‘(’ ClassType [Loc] ‘)’ Block

Various static constraints are imposed on try-catch-finally statements in L-programs, like the following ones that we need to explain the correctness of the transition rules below:
– every try-catch-finally statement contains at least one catch clause, general catch clause, or finally block
– the exception classes in the catch clauses appear there in a non-decreasing type order; more precisely, for every try-catch statement
try block catch (E1 x1 ) block1 . . . catch (En xn ) blockn
it holds that i < j =⇒ Ej ⋠ Ei
Some static constraints on try-catch-finally statements are language-specific. We only list the following three specific constraints of C# which will be needed to justify the correctness of the transition rules below:
– no return statements are allowed in finally blocks
– a break, continue, or goto statement is not allowed to jump out of a finally block
– a throw statement without expression is only allowed in catch blocks

To simplify the exposition we assume that general catch clauses ‘catch block ’
are replaced at compile-time by ‘catch (Object x ) block ’ with a new variable x
and that try-catch-finally statements have been reduced to try-catch and try-
finally statements, e.g., as follows:
try TryBlock
catch (E1 x1 ) CatchBlock1
. . .
catch (En xn ) CatchBlockn
finally FinallyBlock

=⇒

try {
  try TryBlock
  catch (E1 x1 ) CatchBlock1
  . . .
  catch (En xn ) CatchBlockn
} finally FinallyBlock
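
In Java source code the reduction amounts to the following hand-written nesting (example ours); both forms behave identically, in particular the finally block runs whether the try block completes normally or abrupts:

// Illustrative Java version of the try-catch/try-finally reduction.
class TryReduction {
    static String combined(int n) {
        try {
            return "try " + (10 / n);
        } catch (ArithmeticException e) {
            return "caught";
        } finally {
            System.out.println("finally runs in every case");
        }
    }

    static String reduced(int n) {            // the equivalent nested form
        try {
            try {
                return "try " + (10 / n);
            } catch (ArithmeticException e) {
                return "caught";
            }
        } finally {
            System.out.println("finally runs in every case");
        }
    }

    public static void main(String[] args) {
        System.out.println(combined(2));      // try 5
        System.out.println(reduced(0));       // caught
    }
}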
Since throwing an exception completes the computation of an expression or
a statement abruptly, we introduce into the model a new type of reasons of
abruptions and type states, namely references Exc(Ref ) to an exception object.
Due to the presence of throw statements without expression in C#, also a stack
of references is needed to record exceptions which are to be re-thrown.
Abr = . . . | Exc(Ref ), TypeState = . . . | Exc(Ref ), excStack : List(Ref )

5.2 Dynamic Semantics of LE


The transition rules for ExecLE are defined by adding two submachines to
ExecLO . The first one provides the rules for handling the exceptions which
may occur during the evaluation of expressions, the second one describes the
meaning of the new throw and try-catch-finally statements.
ExecLE ≡
ExecLExpE
ExecLStmE

Expression Evaluation Rules. ExecLExpE contains rules for each of the


forms of run-time exceptions foreseen by L. We give here some characteristic
examples and group them for the ease of presentation into parallel submachines
by the form of expression they are related to, namely for arithmetical exceptions
and for those related to cast and reference expressions. The notion of FailUp we
use is supposed to execute the code throw new E () at the parent position, which
allocates a new object for the exception and throws the exception (whereby the
execution of the corresponding finally code starts, if there is any, together with
the search for the appropriate exception handler). Therefore one can define the
macro by invoking an internal method ThrowE with that effect for each of the
exception classes E used as parameter of FailUp.
In the formulation of the following rules we use the exception class names
from C#, which are often slightly different from those of Java. A binary expression
throws an arithmetical exception, if the operator is an integer division or remain-
der operator and the right operand is 0. The overflow-clause for unary or binary
operators is expressed using the above defined Checked predicate from C#.

ExecLExpE ≡ match context(pos)


val1 bop  val2 → if DivisionByZero(bop, val2 ) then
FailUp(DivideByZeroException)
elseif DecimalOverflow (bop, val1 , val2 )∨
(Checked (pos) ∧ Overflow (bop, val1 , val2 ))
then FailUp(OverflowException)
uop  val → if Checked (pos) ∧ Overflow (uop, val ) then
FailUp(OverflowException)
CastExceptions
NullRefExceptions
In Java, a reference type cast expression throws a ClassCastException, if
the value of the direct subexpression is neither null nor compatible with the
required type. This is the first clause in the rule below which is formulated
for C#, where additional clauses appear due to value types.
CastExceptions ≡ match context(pos)
(t) val →
if type(pos) ∈ RefType then
if t ∈ RefType ∧ val ≠ null ∧ runTimeType(val ) ⋠ t then
FailUp(InvalidCastException)
if t ∈ ValueType then // attempt to unbox
if val = null then FailUp(NullReferenceException)
elseif t ≠ runTimeType(val ) then
FailUp(InvalidCastException)
if type(pos) ∈ SimpleType ∧ t ∈ SimpleType ∧
Checked (pos) ∧ Overflow (t, val )
then FailUp(OverflowException)
An instance target expression throws a NullReferenceException, if the
operand is null (in Java a NullPointerException).
NullRefExceptions ≡ match context(pos)

ref .t::f → if ref = null then
FailUp(NullReferenceException)
ref .t::f =  val → if ref = null then
FailUp(NullReferenceException)
ref .T ::M ( vals) → if ref = null then
FailUp(NullReferenceException)

Statement Execution Rules. The statement execution submachine splits naturally into submachines for throw, try-catch, try-finally statements and a rule for the propagation of an exception (from the root position of a method body) to the method caller. We formulate the machine below for C# and then
explain its simplification for the case of Java (essentially obtainable by deleting
every exception-stack-related feature).
When the exception value ref of a throw statement has been computed,
and if it turns out to be null , a NullReferenceException is reported to the

enclosing phrase using FailUp, which allocates a new object for the exception
and throws the exception. If the exception value ref of a throw statement is
not null , the abruption Exc(ref ) is passed up to the (position of the) throw
statement, thereby abrupting the control flow with the computed exception as
reason. The semantics of the parameterless throw; statement is explained as
throwing the top element of the exception stack excStack .
Upon normal completion of a try statement, the machine passes the con-
trol to the parent statement, whereas upon abrupted completion the machine
attempts to catch the exception by one of the catch clauses. The catching con-
dition is the compatibility of the class of the exception with one of the catcher
classes. If the catching fails, the exception is passed to the parent statement, as is
every other abruption which was propagated up from within the try statement.
Otherwise the control is passed to the execution of the relevant catch statement,
recording the current exception object in the corresponding local variable and
pushing it on the exception stack (thus recording the last exception in case it
has to be re-thrown). Upon completion of this catch statement, the machine
passes the control up and pops the current exception object from the exception
stack—the result of this statement execution may be normal or abrupted, in the
latter case the new exception is passed up to the parent statement. No special
rules are needed for general catch clauses ‘catch block ’ in try-catch statements,
due to their compile-time transformation mentioned above.
For a finally statement, upon normal or abrupted completion of the first
direct substatement, the control is passed to the execution of the second di-
rect substatement, the finally statement proper. Upon normal completion of this statement, the control is passed up, together with the possible reason of abruption, namely the one which was present when the execution of the finally statement proper was started and which in this case has to be resumed after execution of
statement proper abrupt, then this new abruption is passed to the parent state-
ment and a possible abruption of the try block is discarded. The constraints
listed above for C restrict the possibilities for exiting a finally block to normal
completion or triggering an exception, whereas in Java also other abruptions
may occur here.
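
A small Java example (ours) of the last case: an abruption raised inside the finally block replaces the pending one.

// The exception thrown in the finally block discards the pending Exc(ref) of the try block.
class FinallyOverrides {
    static void demo() {
        try {
            throw new IllegalStateException("from try");
        } finally {
            throw new RuntimeException("from finally");   // new abruption cancels the old one
        }
    }

    public static void main(String[] args) {
        try {
            demo();
        } catch (RuntimeException e) {
            System.out.println(e.getMessage());           // prints "from finally"
        }
    }
}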
In Java there is an additional rule for passing exceptions when they have
been propagated to the position directly following a label, namely:

lab : Exc(ref ) → YieldUp(Exc(ref ))

If the attempt to catch a thrown exception in the current method fails, the
exception is propagated to the caller using the submachine explained below.

ExecCStmE ≡ match context(pos)


throw exp; → pos := exp
throw  ref ; → if ref = null then FailUp(NullReferenceException)
else YieldUp(Exc(ref ))
throw; → Yield(Exc(top(excStack )))

try block catch (E x ) stm . . . → pos := block


try  Norm catch (E x ) stm . . . → YieldUp(Norm)
try  Exc(ref ) catch(E1 x1 ) stm1 . . . catch(En xn ) stmn →
if ∃i ∈ [1 . . n] runTimeType(ref ) ≼ Ei then
let j = min{i ∈ [1 . . n] | runTimeType(ref ) ≼ Ei } in
pos := stmj
excStack := push(ref , excStack )
WriteMem(locals(xj ), object, ref )
else YieldUp(Exc(ref ))
try  abr catch(E1 x1 ) stm1 . . . catch(En xn ) stmn → YieldUp(abr )
try Exc(ref ) . . . catch( . . . )  res . . . →
{excStack := pop(excStack ), YieldUp(res)}
try tryBlock finally finallyBlock → pos := tryBlock
try  res finally finallyBlock → pos := finallyBlock
try res finally  Norm → YieldUp(res)
try res finally  Exc(ref ) → YieldUp(Exc(ref ))
PropagateToCaller(Exc(ref ))
If the attempt to catch a thrown exception in the current method fails,
the exception is passed by PropagateToCaller(Exc(ref)) to the invoker of
this method (if there is some), to continue the search for an exception han-
dler there. In case an exception was thrown in the static constructor of a type,
in C# its type state is set to that exception to prevent its re-initialization and
instead to re-throw the old exception object, performed by an extension of
Initialize(c) by the clause if typeState(c) = Exc(ref ) then Yield(Exc(ref )).
In Java, the corresponding type becomes Unusable, meaning that its initial-
ization is not possible, which is realized by the additional Initialize(c)-clause
if typeState(c) = Unusable then Fail(NoClassDefFoundErr).
PropagateToCaller(Exc(ref )) ≡ match context(pos)
Exc(ref ) →
if pos = body(meth) ∧ ¬Empty(frames) then
if StaticCtor (meth) then typeState(type(meth)) := Exc(ref )
ExitMethod(Exc(ref ))
The model ExecJavaStmE in [14, Fig. 6.2] has the following rule for un-
caught exceptions in class initializers, which is inserted before the general rule
PropagateToCaller(Exc(ref )). Java specifies the following strategy for this
case. If during the execution of a static initializer an exception is thrown, and if
this is not an Error or one of its subclasses, an ExceptionInInitializerError
is thrown. If the exception is compatible with Error, then the exception is
rethrown in the directly preceding method on the frame stack.
match context(pos)
static Exc(ref ) →
if runTimeType(ref ) ≼ Error then YieldUp(Exc(ref ))
else FailUp(ExceptionInInitializerError)
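
For instance (example ours), a RuntimeException raised in a Java static initializer surfaces at the first active use of the class as an ExceptionInInitializerError wrapping the original exception:

// Java wraps non-Error exceptions from static initializers into ExceptionInInitializerError.
class BadInit {
    static int x = fail();
    static int fail() { throw new RuntimeException("boom"); }
}

class InitErrorDemo {
    public static void main(String[] args) {
        try {
            System.out.println(BadInit.x);                    // triggers initialization of BadInit
        } catch (ExceptionInInitializerError err) {
            System.out.println(err.getCause().getMessage());  // prints "boom"
        }
    }
}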

An alternative treatment appears in the model ExecCStmE in [4] where unhandled exceptions in a static constructor are wrapped into an exception of
type TypeInitializationException by translating the static constructor
static T () { BlockStatements }
into
static T () {
try { BlockStatements }
catch (Exception e) {
throw new TypeInitializationException(T ,e);
}
}
The interpreter for JavaE needs also a refinement of the definition of propa-
gation of abruptions, to the effect that try statements suspend jump and return
abruptions for execution of relevant finally code. For C# this is not needed due to the constraints cited above for finally code in C#. As explained above, after the
execution of this finally code, that abruption will be resumed (unless during
the finally code a new abruption did occur which cancels the original one).
PropagatesAbr (α) ⇐⇒
label (α) ∉ {LabeledStm, StaticBlock , TryCatchStm, TryFinallyStm}

6 Conclusion
We have defined hierarchically structured components of an interpreter for a
general object-oriented programming language. In doing this we have identified
a certain number of static and dynamic parameters and have shown that they
can be instantiated to obtain an interpreter for Java or C#. As a by-product this
pinpoints in a precise and explicit way the main differences the two languages
have in their dynamic semantics. The work confirms the idea that one can use
ASMs to define in an accurate way appropriate abstractions to support the
development of precise patterns for fundamental computational concepts in the
fields of hardware and software, reusable for design-for-change and useful for
communicating and teaching design skills.

Acknowledgement. We gratefully acknowledge partial support of this work by a Microsoft grant within the ROTOR project during the years 2002–2003.

References
1. V. Awhad and C. Wallace. A unified formal specification and analysis of the new
Java memory models. In E. Börger, A. Gargantini, and E. Riccobene, editors,
Abstract State Machines 2003–Advances in Theory and Applications, volume 2589
of Lecture Notes in Computer Science, pages 166–185. Springer-Verlag, 2003.

2. E. Börger. The ASM refinement method. Formal Aspects of Computing, 15:237–257, 2003.
3. E. Börger and T. Bolognesi. Remarks on turbo ASMs for computing functional
equations and recursion schemes. In E. Börger, A. Gargantini, and E. Riccobene,
editors, Abstract State Machines 2003 – Advances in Theory and Applications, vol-
ume 2589 of Lecture Notes in Computer Science, pages 218–228. Springer-Verlag,
2003.
4. E. Börger, N. G. Fruja, V. Gervasi, and R. Stärk. A high-level modular definition
of the semantics of C#. Theoretical Computer Science, 2004.
5. E. Börger and R. F. Stärk. Abstract State Machines. A Method for High-Level
System Design and Analysis. Springer, 2003.
6. K. G. Doh and P. D. Mosses. Composing programming languages by combining
action-semantics modules. Science of Computer Programming, 47(1):3–36, 2003.
7. C# Language Specification. Standard ECMA–334, 2001.
http://www.ecma-international.org/publications/standards/ECMA-334.HTM.
8. N. G. Fruja. The correctness of the definite assignment analysis in C#. In V. Skala
and P. Nienaltowski, editors, Proc. 2nd International Workshop on .NET Tech-
nologies 2004, pages 81–88, Plzen, Czech Republic, 2004. ISBN 80–903100–4–4.
9. N. G. Fruja. Specification and implementation problems for C#. In W. Zimmer-
mann and B. Thalheim, editors, 11th International Workshop on Abstract State
Machines, ASM 2004, Wittenberg, Germany, pages 127–143. Springer-Verlag, Lec-
ture Notes in Computer Science 3052, 2004.
10. N. G. Fruja and R. F. Stärk. The hidden computation steps of turbo Abstract State
Machines. In E. Börger, A. Gargantini, and E. Riccobene, editors, Abstract State
Machines 2003 – Advances in Theory and Applications, volume 2589 of Lecture
Notes in Computer Science, pages 244–262. Springer-Verlag, 2003.
11. Y. Gurevich. Evolving algebras 1993: Lipari Guide. In E. Börger, editor, Specifi-
cation and Validation Methods, pages 9–36. Oxford University Press, 1995.
12. P. Mosses. Definitive semantics. Version 0.2 of Lecture Notes made available at
http://www.mimuw.edu.pl/~mosses/DS-03, May 2003.
13. R. F. Stärk and E. Börger. An ASM specification of C# threads and the .NET
memory model. In W. Zimmermann and B. Thalheim, editors, 11th International
Workshop on Abstract State Machines, ASM 2004, Wittenberg, Germany, pages
38–60. Springer-Verlag, Lecture Notes in Computer Science 3052, 2004.
14. R. F. Stärk, J. Schmid, and E. Börger. Java and the Java Virtual Machine—
Definition, Verification, Validation. Springer-Verlag, 2001.
15. W. Zimmermann and A. Dold. A framework for modeling the semantics of ex-
pression evaluation with Abstract State Machines. In E. Börger, A. Gargantini,
and E. Riccobene, editors, Abstract State Machines 2003–Advances in Theory and
Applications, volume 2589 of Lecture Notes in Computer Science, pages 391–406.
Springer-Verlag, 2003.
On the Verification of Cooperating
Traffic Agents

Werner Damm¹,², Hardi Hungar², and Ernst-Rüdiger Olderog¹

¹ Carl von Ossietzky University, Oldenburg, Germany
² OFFIS, Oldenburg, Germany

Abstract. This paper exploits design patterns employed in coordinating autonomous transport vehicles so as to ease the burden in verifying cooperating hybrid systems. The presented verification methodology is equally applicable for avionics applications (such as TCAS), train applications (such as ETCS), or automotive applications (such as platooning). We present a verification rule explicating the essence of employed design patterns, guaranteeing global safety properties of the kind “a collision will never occur”, and whose premises can either be established by off-line analysis of the worst-case behavior of the involved traffic agents, or by purely local proofs, involving only a single traffic agent. In a companion paper we will show how such local proof obligations can be discharged automatically.

1 Introduction
Automatic collision avoidance systems form an integral part of ETCS-compatible train systems, are appearing or about to appear in cars, and – in the somewhat weaker form of only providing recommendations to the pilot – are required to be installed on any aircraft with more than 30 passengers. The verification of the correctness of such collision avoidance systems has been studied extensively, e.g. within the PATH project (see [12]), by Leveson et al. [10], Sastry et al. [17] and Lynch et al. [11] for various versions of the TCAS system, or by Peleska et al. [6] and Damm et al. [4] for train system applications. Shankar et al. present in [12] a general approach to developing such distributed hybrid systems.
Our paper aims at reducing the complexity of the verification of collision
avoidance systems. To this end, we explicate typical design approaches for such
protocols, and cast this into a proof rule reducing the global safety requirement
“a collision will never occur” to simpler proof tasks, which can either be es-
tablished off line, involve purely local safety- or real-time properties, or pure
protocol verification. We illustrate our approach by showing how the correctness


This work was partly supported by the German Research Council (DFG) as part
of the Transregional Collaborative Research Center “Automatic Verification and
Analysis of Complex Systems” (SFB/TR 14 AVACS). See www.avacs.org for more
information.


of TCAS and a protocol for wireless safe railroad crossings can be established
by instantiating the proposed verification rule. While the methodology as such
is applicable to any bounded number of agents, we will illustrate our approach
considering only collision avoidance for two agents.
The approach centers around the following key design concepts.
Each traffic agent is equipped with a safety envelope, which is never to be
entered by another agent. The safety envelope of an agent is an open ball around
the agent’s current three-dimensional position, with an agent-specific diameter.
We allow the extent of the diameter to be mode dependent, thus e.g. enforcing
different degrees of safety margins during different flight phases, or allowing a
train to actually cross a railroad crossing if this is in mode “secured”. The global
safety property we want to prove is, that safety envelopes of agents are always
kept apart.
To enforce separation of safety envelopes, the controllers coordinating collision avoidance offer a choice of corrective actions, or maneuvers. E.g. TCAS distinguishes “steep descent”, “descent”, “maintain level”, “climb”, “steep climb”, thus restricting the maneuvers to adjustment of the height of the involved aircraft (with future versions expected to also offer lateral avoidance maneuvers).
In the train-system application, the maneuvers of the crossing are simple (flash
light and lower barriers), and the train will be brought to a complete stop, trying
first the service brake, and resorting to an emergency brake as backup option.
From the point of view of our methodology, we will assume a set of explicitly
designated states for such collision avoidance maneuvers, henceforth called cor-
rective modes. It is the task of the collision-avoidance protocol to select matching
pairs of corrective modes: the joint effect of the strategies invoked by the chosen
states ensures, that the agents will pass each other without intruding safety en-
velopes – a condition, which we refer to as adequacy. E.g. TCAS will never ask
both aircraft to climb – the decision of which aircraft should climb, and which
descend, is taken based on the relative position and the identity of the aircraft
(to break symmetry, identities are assumed to be totally ordered).
A key design aspect of collision avoidance protocols is the proper determina-
tion of invocation times for corrective maneuvers. Due to the safety criticality of
the application, such triggering points must be determined taking into account
the most adversary behavior of traffic agents, within the overall limits stipulated
by traffic regulation authorities. As an example, triggering a mode switch to en-
force safe braking of the train must be done in time to allow the train to come
to a complete stop even when driving the maximal allowed speed for the cur-
rent track segment, as well as the maximally allowed slope for track segments.
This in fact simplifies the analysis, because it both requires as well as allows
to perform the analysis using an over-approximation of the agents’ behavior.
Conformance to such upper bounds on velocity must be established separately
– note, that this is a verification task local to the agent. We can then use off-line
analysis based on the over-approximated behavior of the plant and knowledge
of the corrective maneuvers to determine the need for a mode switch to one of
the corrective modes.

We will show that we can determine off line which combinations of corrective states will satisfy the adequacy condition (we call these matching corrective
states), by exploiting guarantees about invocation time for corrective actions,
and analyzing the possible evolutions from the plant regions characterized by
the predicates inducing mode switching, thus reducing the global verification
of adequacy to the verification, that only matching corrective states are se-
lected.
This paper is structured as follows. The next section explicates our mathe-
matical model of cooperating traffic agents, enriching the well known model of
cooperating hybrid automata with sufficient structure to perform our analysis.
The ingredients of our verification methodology are elaborated in Section 3. We
discuss to which extent the analysis from [11] differs from our decomposition ap-
proach in Section 4, and perform likewise for the railroad crossing in Section 5.
We will discuss possible extensions of the methodology in the conclusion.

2 Model of Cooperating Traffic Agents


This section introduces our mathematical model of cooperating traffic agents, us-
ing a variant of communicating hybrid automata from Tomlin et al [16] allowing
general ordinary differential equations as specifications of continuous evolutions.
An agent’s controller is typically organized hierarchically. A cooperation layer
is responsible for the protocol with other agents, allowing each agent to acquire
approximate knowledge about the state of other agents, and exchanging infor-
mation on the chosen strategies for collision avoidance. At the lowest level, which
we will refer to as the reflex layer, basic control laws are provided for all normal
modes of operations, such as maintaining the speed of the train at a given set
point. We abstract from the concrete representation of the control laws, assum-
ing that they can be cast into a set of differential equations, expressing how
actuators change over time. The state space of the controller is spanned from
a finite set of discrete modes M and a set of (real-valued) state variables V ,
subsuming sensors and actuators of the controlled plant. We assume a special
variable id to store the identity of an agent. We explicitly designate a set of
modes as corrective modes, implementing the control laws for collision avoidance
maneuvers. E.g. in the TCAS context, this would include the control laws for
the various degrees of strength for climb, respectively descent maneuvers, or in
the train system context, the selection of either activating the service brake, or
the emergency brake. A coordination layer is responsible for mode switching in
general, and in particular for the selection of the collision avoidance maneuvers,
by entering the control mode activating the control law associated with the cho-
sen maneuver. Mode switching can be triggered by communication events, by
timeouts, and by conditions on the plant state becoming true. We allow to cap-
ture typical assumptions on discrete and continuous variables to be expressed
as global invariances or mode invariances. Typical usages of invariants for dis-
crete modes will relate to the cooperation protocol, as well as the stipulation,
that corrective maneuvers will be maintained until potentially critical situations

have been resolved.1 Typical usages of invariances for continuous variables will
be the enforcement of regulatory stipulations such as regarding maximal allowed
speed, or tolerated degrees of acceleration resp. deceleration.
As to the model of the traffic agent’s plant, we assume, that the set of plant
variables subsumes the traffic agent's space coordinates x , y, z , its velocity in each
dimension vx , vy , vz , as well as its acceleration ax , ay , az in each dimension, a set
of actuator variables A, as well as a set of disturbances D. A subset S of the sys-
tem variables, the sensor variables, are observable by the plant’s controller, which
in turn can influence the plant by the setting of actuators. We assume that space
coordinates, velocity, and acceleration are directly observable by the controller.
For the current version, we consider simple plant models without discrete state
components.2 The dynamics of the system is specified by giving for each non-
input system variable an ordinary differential equation, whose right hand side
may contain all variables of the plant. This includes the standard dependencies
between acceleration, velocity, and space coordinates. Plant invariances allow to
place bounds on disturbances. In particular, we will assume for each disturbance
upper and lower bounds under which the agent is expected to operate.
A traffic-agent model combines an agent controller with its plant model, us-
ing standard parallel composition of hybrid automata. Communication between
these automata is based solely on the actuator and sensor variables shared be-
tween these. We thus abstract in this paper from noise in sensor readings and
inaccuracy of actuators when observing resp. controlling its own plant. Note,
that the parallel composition used is just a standard parallel composition of au-
tomata, hence we can view a traffic agent model again as a hybrid automaton.
A distributed traffic system consists of the parallel composition of N traf-
fic agents. We abstract in this paper from modeling communication channels,
and assume, that all communication between traffic agents is based on variables
shared between their controllers, of mode “output” on the one side and of mode
“input” on the other. As noted before, mode-switches can be initiated by com-
munication events, which in this setting is expressed by allowing as trigger to
test for a condition involving communication variables to become true. We make
the simplifying assumption, that the variables shared between traffic-agent con-
trollers include the sensor variables for its coordinates and its speed, as well
as the identity of the controllers. It is straightforward to extend our approach
to message-passing based information and accordingly delayed knowledge about
the other agent’s whereabouts, by interpolating from the last known readings,
using worst-case bounds dictated by regulations to safely over-approximate the
actual readings.

1
The TCAS protocol actually allows to withdraw a given recommendation in re-
stricted situations, complicating the analysis, see [11]. The above assumption indeed
eases the off-line analysis required in our approach, but is not inherent. We leave it
to a later paper to consider more flexible strategies.
2
In the formal development, we view a plant as a hybrid automaton with a single
state and no discrete transitions.

Since a distributed traffic system is again just built by parallel composition of


hybrid automata, we can safely identify it with the well-defined hybrid automa-
ton resulting from the parallel composition of its agents. We can thus focus the
remainder of this chapter on defining a sufficiently rich class of hybrid automata
offering all constructs elaborated above for the agents controller and its plant,
and closed under parallel composition.
Definition 1 (Hybrid Automaton). A hybrid automaton is a tuple
HA = (M , V , R d , R c , m0 , Θ)
where
– M is a finite set of modes,
– V is a real-valued set of variables, partitioned into input, local and output
variables, respectively:
V = V i ∪ V ℓ ∪ V o (a disjoint union),
– m0 is the initial mode,
– Θ associates with each mode m a local invariant Θ(m) (a quantifier free
boolean formula over V ),
– R d is the discrete transition relation with transitions (m, ↑ ϕ, A, m  ), also
↑ϕ/A
written as m −−−→ m  , where
• m, m  ∈ M ,
• A is a (possibly empty) set of (disjoint) assignments of the form v := ev
with v ∈ V ℓ ∪ V o and ev an expression over V ,
• the trigger ↑ ϕ is the event that a quantifier-free boolean formula ϕ over
V becomes true.
– R c is the continuous transition relation associating with each mode m and
each non-input variable v an expression R c (m)(v ) over V .
Intuitively, R c thus defines for mode m and each v the differential equation
v̇ = R c (m)(v )
governing the evolution of v while HA is in mode m.
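
Purely as an illustration of the ingredients of this tuple (and not as part of the formal development), the components can be written down as a Java data structure, with one right-hand-side function per mode and variable for R c; all names below are ours:

import java.util.List;
import java.util.Map;
import java.util.function.Predicate;
import java.util.function.ToDoubleFunction;

// Rough sketch of HA = (M, V, R^d, R^c, m0, Θ) over valuations V -> R (here Map<String, Double>).
record Transition(String from, String to,
                  Predicate<Map<String, Double>> trigger,                            // condition φ of the event ↑φ
                  Map<String, ToDoubleFunction<Map<String, Double>>> assignments) {} // the assignment set A

record HybridAutomaton(
        List<String> modes,                                                          // M
        List<String> variables,                                                      // V
        List<Transition> discrete,                                                   // R^d
        Map<String, Map<String, ToDoubleFunction<Map<String, Double>>>> dynamics,    // R^c: mode -> (v -> right-hand side of v̇)
        String initialMode,                                                          // m0
        Map<String, Predicate<Map<String, Double>>> invariants) {}                   // Θ

class TimerExample {
    public static void main(String[] args) {
        // A one-mode automaton: a timer x with derivative 1 that is reset to 0
        // by an urgent self-loop transition as soon as x reaches 5.
        HybridAutomaton timer = new HybridAutomaton(
                List.of("run"), List.of("x"),
                List.of(new Transition("run", "run", v -> v.get("x") >= 5.0, Map.of("x", v -> 0.0))),
                Map.of("run", Map.of("x", v -> 1.0)),
                "run",
                Map.of("run", v -> v.get("x") <= 5.0));
        System.out.println(timer.modes() + " " + timer.variables());
    }
}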

Additionally we require of the discrete transition relation, that the execution


of one transition does not immediately enable a further transition.
Definition 2 (Transition Separation). Let σ : V → R be a valuation of the
variables V . Then A(σ) denotes the update of σ according to the assignments in
A, i.e.
∀ v ∈V : (∃ ev : v := ev ∈ A) ⇒ A(σ)(v ) = σ(ev )
∧ ¬(∃ ev : v := ev ∈ A) ⇒ A(σ)(v ) = σ(v ).
The discrete transitions in a hybrid automaton are separated, if for any two
transitions (m1 , ↑ ϕ1 , A1 , m′1 ) and (m2 , ↑ ϕ2 , A2 , m′2 ) with m′1 = m2 it holds that
∀ σ : V → R : (σ |= ϕ1 ⇒ A1 (σ) ⊭ ϕ2 ).

Separation implies that at any given point in time during a run, at most one
discrete transition fires. I.e., our models have dense time, not superdense time,
where a sequence of discrete transitions is permitted to fire at one instant in time.
Discrete variables may be included into hybrid automata according to our
definition via an embedding of their value domain into the reals, and associating
a derivative of constantly zero to them (locals and outputs). Timeouts are easily
coded via explicit local timer variables with a derivative in {−1, 0, 1}, as required
respectively.
Note, that indeed this general model subsumes both controller and plant mod-
els, by choosing the set of variables appropriately and enforcing certain modeling
restrictions. For our plant models, we require the absence of discrete transitions.
This entails in particular, that plant variables only evolve continuously, and can-
not be changed by discrete jumps. This is convenient for the formulation of our
approach but not essential.
We will in the definition of runs of a hybrid automaton interpret all transi-
tions as urgent, i.e. a mode will be left as soon as the triggering event occurs.
This can either be the expiration of a time-out, or a condition on e.g. the plant
sensors becoming true. Valid runs also avoid Zeno behavior and time-blocks, i.e.
each run provides a valuation for each positive point in time. We did not take
provisions to ensure the existence of such a run, nor the property that each ini-
tial behavior segment can be extended to a full run. Such might be added via
adequate modeling guidelines (e.g. by including the negation of an invariant as
trigger condition on some transition leaving the mode). As these properties are
not essential to the purpose of this paper we left them out.
We now give the formal definition of runs of a hybrid automaton HA. To this
end we consider continuous time and let Time = R≥0 . Further on, we use the
notation
previous(v̂ , t) = lim u↗t v̂ (u)
for some v̂ : Time → R and 0 < t ∈ Time. Satisfaction of a condition containing previous entails that the respective limit does exist.3
Definition 3 (Runs of a Hybrid Automaton). A tuple of trajectories
π = (m̂, (v̂ )v ∈V ) , with
m̂ : Time → M
v̂ : Time → R, v ∈ V
capturing the evolution of modes and valuations of continuous variables is called
a run of HA iff
∃ (τi )i∈ω ∈ Time ω : τ0 = 0 ∧ ∀ i : τi < τi+1 ,
a strictly increasing sequence of discrete switching times s.t.

3
In fact, our definition of a run implies that these limits do exist for all local and
output variables in any run.

1. non-Zeno

∀ t ∈ Time ∃ i : t ≤ τi

2. mode switching times

∀ i ∀ t ∈ [τi , τi+1 ) : m̂(t) = m̂(τi ).

3. continuous evolution

∀ i ∀ t ∈ [τi , τi+1 ) ∀ v ∈ V ℓ ∪ V o :
d v̂ (t)/dt = R c (m̂(τi ))(v )[ŵ (t)/w | w ∈ V ]
4. invariants

∀ t ∈ Time : (λ v . v̂ (t)) |= Θ(m̂(t)),

5. urgency

∀ i ∀ t ∈ [τi , τi+1 ) ∀ (m, ↑ ϕ, A, m′ ) ∈ R d :
m̂(t) = m ⇒ (λ v . v̂ (t)) ⊭ ϕ

6. discrete transition firing

∀ i : ( m̂(τi+1 ) = m̂(τi ) ∧ ∀ v ∈ V ℓ ∪ V o : v̂ (τi+1 ) = previous(v̂ , τi+1 ) )
∨
( ∃ (m, ↑ ϕ, A, m′ ) ∈ R d :
m̂(τi ) = m ∧ m̂(τi+1 ) = m′
∧ ∃ σ ∈ V → R :
(∀ v ∈ V ℓ ∪ V o : σ(v ) = previous(v̂ , τi+1 ))
∧ σ |= ϕ
∧ ∀ v ∈ V i : v̂ (τi+1 ) = σ(v )
∧ ∀ v ∈ V ℓ ∪ V o : v̂ (τi+1 ) = A(σ)(v ) )

The time sequence (τi )i∈ω identifies the points in time, at which mode-
switches may occur, which is expressed in Clause (2). Only at those points dis-
crete transitions (having a noticeable effect on the state) may be taken. On the
other hand, it is not required that any transition fires at some point τi , which
permits to cover behaviors with a finite number of discrete switches within the
framework above. Our simple plant models with only one mode provide exam-
ples. As usual, we exclude non-zeno behavior (in Clause (1)). As a consequence
of the requirement of transition separation, after each discrete transition some
time must elapse before the next one can fire. Clause (4) requires, for each mode,
the valuation of continuous variables to meet the local invariant while staying
in this mode. Clause (3) forces all local and output variables (whose dynamics

is constrained by the set of differential equations associated with this mode) to


actually obey their respective equation. Clause (5) forces a discrete transition to
fire when its trigger condition becomes true. The effect of a discrete transition
is described by Clause (6). Whenever a discrete transition is taken, local and
output variables may be assigned new values, obtained by evaluating the right-
hand side of the respective assignment using the previous value of locals and
outputs and the current values of the input. If there is no such assignment, the
variable maintains its previous value, which is determined by taking the limit of
the trajectory of the variable as t converges to the switching time τi+1 . Values
of inputs may change arbitrarily. They are not restricted by the clauses, other
that they obey mode invariants and contribute to the satisfaction of discrete
transitions when those fire.
The parallel composition of two such hybrid automata HA1 and HA2 pre-
supposes the typical disjointness criteria for modes, local variables, and output
variables. Output variables of HA1 which are at the same time input variables of
HA2 , and vice versa, establish communication channels with instantaneous com-
munication. Those variables establishing communication channels become local
variables of HA1 ∥ HA2 (in addition to the local variables of HA1 and HA2 ), for
other variable sets we simply take the union of those not involved in communi-
cation. Modes of HA1  HA2 are the pairs of modes of the component automata.
One may define the set of runs of HA as those tuples of trajectories which project
to runs of HA1 and HA2 , respectively. It is not always possible to give a hybrid
automaton for HA1  HA2 , because of problems with cycles of instantaneous
communications. Therefore, we impose the following additional condition on the
composability of hybrid automata.
Definition 4 (Composable Hybrid Automata). Let two hybrid automata HAi, i = 1, 2, with discrete transition relations R^d_i, i = 1, 2, be given. For a pair of transitions si = (mi, ↑ϕi, Ai, m'i) ∈ R^d_i, i = 1, 2, the transition s1 is unaffected by s2 if each variable for which there is an assignment in A2 appears neither in ϕ1 nor in A1 (on any of the right-hand sides).
The two transition relations are composable if for each pair of transitions si ∈ R^d_i, i = 1, 2, either s1 is unaffected by s2 or vice versa.
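The unaffectedness test of Definition 4 is purely syntactic and easy to mechanize. The sketch below is our own encoding (the Transition record and variable names are assumptions): a transition is represented by the variables its trigger reads, the variables it assigns, and the variables read on the right-hand sides of its assignments.

from dataclasses import dataclass

@dataclass
class Transition:
    # Syntactic footprint of a transition (m, trigger, A, m'); the modes are
    # omitted because unaffectedness only depends on the variables involved.
    trigger_reads: set   # variables occurring in the trigger
    assigned: set        # variables assigned by A
    rhs_reads: set       # variables occurring on the right-hand sides of A

def unaffected(s1: Transition, s2: Transition) -> bool:
    """s1 is unaffected by s2: no variable assigned by s2 appears in s1's
    trigger or on any right-hand side of s1's assignments."""
    return s2.assigned.isdisjoint(s1.trigger_reads | s1.rhs_reads)

def composable(r1, r2) -> bool:
    """Two transition relations are composable if for every pair of
    transitions one is unaffected by the other."""
    return all(unaffected(s1, s2) or unaffected(s2, s1) for s1 in r1 for s2 in r2)

# Example: s1 writes an output o1 read by s2, but not vice versa.
s1 = Transition(trigger_reads={"i1"}, assigned={"o1"}, rhs_reads={"i1"})
s2 = Transition(trigger_reads={"o1"}, assigned={"o2"}, rhs_reads={"o1"})
assert composable([s1], [s2])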
Composability essentially establishes a direction on instantaneous communications – a communication may have an immediate effect on the output and thus on the partner automaton, but it must not immediately influence the originator of the information. Assuming composability, the rest of the construction of the parallel composition automaton is rather standard.
For a mode (m1, m2), the associated invariant condition is the conjunction of the invariance conditions associated with m1 and m2. Similarly, the set of differential equations governing the continuous evolution while in mode (m1, m2) is obtained by simply conjoining the sets of differential equations attached to m1 and m2, respectively – note that the disjointness conditions on variables ensure that this yields a consistent set of differential equations. Finally, the discrete transition relation consists of the following transitions:
1. ((m1, m2), ↑(ϕ1 ∧ A1(ϕ2)), A1 ∪ A1(A2), (m'1, m'2))
   for each pair of transitions si = (mi, ↑ϕi, Ai, m'i) ∈ R^d_i, i = 1, 2, where s1 is unaffected by s2,
2. ((m1, m2), ↑(ϕ1 ∧ ⋀ {¬A1(ϕ2) | ϕ2 trigger in R^d_2}), A1, (m'1, m2))
   for each (m1, ↑ϕ1, A1, m'1) ∈ R^d_1, and
3. transitions of the forms (1) and (2) with the roles of HA1 and HA2 interchanged,

where

1. A(ϕ) denotes the substitution into ϕ of ev for v for each assignment v := ev ∈ A, and
2. A1(A2) denotes the substitution of the assignments of A1 into the right-hand terms of A2.
Composability ensures that the simultaneous transitions of Clause (1) indeed
capture the combined effect of both transitions. The separation of transitions in
the resulting automaton is inherited from separation in the component automata
by the way single-automata transitions (Clause (2)) are embedded.
In the sequel, we will analyse two-agent systems

(C1 ∥ P1) ∥ (C2 ∥ P2),

consisting of a controller and a plant automaton each. We will establish a proof rule which allows us to reduce the global requirement of collision freedom between these two traffic agents to safety requirements of a single agent Ci ∥ Pi. A follow-up paper will show how such local verification tasks can be fully automated by a combination of first-order model checking and Lyapunov's approach of establishing stability for the individual modes.

3 A Proof Rule for Cooperating Traffic Agents


This section develops a generic proof rule to establish collision freedom between cooperating traffic agents. In doing so, we contribute to the verification of this class of properties in two ways. First, by extracting a generic pattern, we identify the key ingredients involved in such classes of cooperation protocols. We are guided in this process by in-depth knowledge of a spectrum of applications, notably from the train system and the avionics domain, and will use examples from each of these domains to demonstrate how the generic approach we develop can be specialized to both concrete protocol instances, although they differ in a significant number of design decisions. Jointly, the two examples show that we can cover with one generic scheme a design space ranging from completely symmetric solutions without fail-safe states to asymmetric solutions involving heterogeneous traffic agents with fail-safe states. Secondly, the proof rule demonstrates how the verification problem for collision freedom, which involves a hybrid system composed of two traffic agents – each a pair Cj ∥ Pj of the agent's controller and its plant model – can be reduced to simpler verification tasks of the following classes:
(A) Off-line analysis of the dynamics of the plant assuming worst-case dynamics
(B) Mode invariants for C1 ∥ C2
(C) Real-time properties for Cj
(D) Local safety properties, i.e. hybrid verification tasks for Cj ∥ Pj

Type (A) verification tasks capture how we can capitalize on rigorous design principles for safety-critical systems, where a high degree of robustness must be guaranteed in cooperation procedures for collision avoidance. This entails that potentials for collision must be analysed assuming only a minimum of well-behavedness of the other agent's behavior, as expressed in invariance properties guaranteed by the controller (these will lead to proof obligations of type (D)). As a concrete example, when analyzing a two-aircraft system, given their current position and speed, we will bound changes in speed, height and direction when analyzing their worst-case behavior in detecting potentials for collision. These invariances reflect typical requirements for flight phases – the aircraft must stay within safety margins of its current flight corridor, and maintain a separation from aircraft traveling in the same flight corridor ahead of and behind the considered aircraft. Type (A) verification tasks are simple, because they do not involve reasoning about hybrid systems. They consider the evolution of all possible trajectories from a subregion of P1 ∥ P2 and ask about the reachability of forbidden plant regions. In this analysis, trajectories evolve according to the differential equations of P1 ∥ P2,4 with input changes restricted by constants (such as the maximal allowed setting of actuators), by invariances guaranteed by the controller (such as "velocity is within the interval [vmin, vmax]"), or by invariances bounding disturbances specified as operating conditions (such as "the maximal slope of a track for high-speed trains is 5‰"), which must be stated as plant invariances in order to enter the analysis.
Type (B) verification tasks ensure that both agents take decisions which cooperate in avoiding potential collision situations. To this end, we assume that each controller has a designated set of modes called correcting modes, which define the agent's capabilities for performing collision avoidance maneuvers. Examples of maneuvers are the resolution advisories issued by TCAS, such as "climb", "descend", etc. Clearly, the combined system of two aircraft will only avoid collisions if the advisories of the two controllers match. E.g., in most situations, asking both aircraft to climb might actually increase the likelihood of a collision. It is the task of the designer of the cooperation protocol to designate matching pairs of modes, and to characterize the plant states of P1 ∥ P2 under which invoking a matching pair of maneuvers is guaranteed to resolve a potential collision situation. Demonstrating that two matching maneuvers avoid collision when the maneuvers are invoked in a state meeting the characterizing predicate is a type (A) verification task. Demonstrating that only matching correcting modes are selected by the protocol leads to simple invariance properties for C1 ∥ C2.

4
Recall that plant models are assumed to have only a single discrete state. Note that we can still model so-called switched dynamical systems, where the dynamics of the system depends on the setting of switch variables, typically occurring as inputs to the plant model. Worst-case analysis is then performed for all valuations of switch variables conforming to invariant properties. Here it will be sufficient to work with over-approximations (such as the flow-pipe approximation technique developed in [5]).
Finally, type (C) verification conditions stem from the need to enforce timeliness properties for correcting maneuvers. At protocol design time, upper bounds on the time interval between detecting a potentially hazardous plant state and the actual invocation of correcting maneuvers must be determined. Establishing these timeliness properties allows us to place further bounds on the state-space exploration in type (A) verification tasks.
We start the formal development of our proof rule by formalizing our notion of collision freedom as maintaining disjointness of the safety envelopes associated with each traffic agent. A safety envelope of an agent is a vicinity of its position which must not overlap the other agent's safety envelope, for otherwise a collision cannot be safely excluded. This vicinity may depend on the agent's state. For instance, a railroad crossing may be passed by trains if the barriers are closed and the crossing is considered safe. Then its safety envelope is empty; otherwise it covers the width of the crossing. The safety envelope of the train encloses all of the train.
In our second example, that of the TCAS system, we have a more symmetrical situation. Both aircraft have a nonempty safety envelope when in flight, which must be sufficient to exclude any dangerous interference (direct collisions or turbulence due to near passage) between the planes.
In the formal definition, safety envelopes are convex subspaces of R^3 surrounding the current position, whose extent may depend both on the mode and on the current valuation of other plant variables, including in particular the current velocity.
Definition 5. The safety envelope of an agent is a function

SE : M × R^V → P(R^3),

whose value in our applications is a convex subset of R^3 including the current position, i.e. if SE(m, σ) = S then (σ(x), σ(y), σ(z)) ∈ S. Given a run π and a point in time t, the current safety envelope is given by SE(π(t)).
Collision freedom of traffic agents is satisfied if in all trajectories of the composed traffic system (C1 ∥ P1) ∥ (C2 ∥ P2) the safety envelopes associated with C1 ∥ P1 and C2 ∥ P2 have an empty intersection (assuming that trajectories start in plant states providing "sufficient" distance between the agents, a predicate to be made precise below).

Definition 6. Two runs π1 and π2 are collision free if

∀ t ∈ Time : SE1(π1(t)) ∩ SE2(π2(t)) = ∅.

Two sets of runs are collision free if each pair of runs of the respective sets is collision free.
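As a hedged illustration of Definition 6, the following sketch checks collision freedom of two sampled runs when each safety envelope is over-approximated by an axis-aligned box around the agent's position; the box representation and the sampling are our own simplifications, and any convex representation with a decidable disjointness test would serve equally well.

def boxes_disjoint(b1, b2):
    """b = ((xmin, xmax), (ymin, ymax), (zmin, zmax)); two boxes are disjoint
    iff they are separated along at least one axis."""
    return any(hi1 < lo2 or hi2 < lo1 for (lo1, hi1), (lo2, hi2) in zip(b1, b2))

def collision_free(run1, run2, envelope1, envelope2, times):
    """Definition 6, discretized: the safety envelopes must be disjoint at
    every sampled point in time."""
    return all(boxes_disjoint(envelope1(run1(t)), envelope2(run2(t))) for t in times)

# Example: two agents moving along the x-axis with a 10 m box around each position.
box = lambda state: ((state["x"] - 5, state["x"] + 5), (-1, 1), (-1, 1))
run1 = lambda t: {"x": 0.0 + 10.0 * t}
run2 = lambda t: {"x": 100.0 - 10.0 * t}   # closing head-on
times = [0.1 * k for k in range(40)]       # up to t = 3.9, the gap stays above 10 m
assert collision_free(run1, run2, box, box, times)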
We will now introduce a set of verification conditions which jointly allow us to infer collision freedom, and start by outlining the "essence" of collision avoidance protocols, as depicted in the phase-transition diagram of Fig. 1.

Fig. 1. Phase transition diagram for proof rule (phases FAR, NEGOTIATING, CORRECTING, RECOVERY and SAFE, connected by transitions (1)-(6))

Fig. 1 distinguishes the following phases of such protocols. Phase FAR subsumes all controller modes which are not pertinent to collision avoidance. The protocol may only be in phase FAR if it is known to the controller that the two agents are "far apart" – the agents' capabilities for executing maneuvers are such that maintaining collision freedom can be guaranteed without yet invoking correcting actions. Determining conditions for entering and leaving phase FAR is thus safety critical. The NEGOTIATING phase is initiated as soon as the two-agent system might evolve into a potentially hazardous situation. The derivation of the predicate ϕN guarding transition (1) is non-trivial and discussed below. Within the negotiation phase, the two agents determine the set of maneuvers to be performed. The CORRECTING phase is entered via transition (2) when matching correcting modes have been identified and no abnormal conditions – discussed below – have occurred. During the correcting phase, the control laws associated with the correcting modes will cause the distance between the traffic agents to increase, eventually allowing them to (re-)enter the FAR phase via transition (3).
The cycle of transitions numbered (1) to (3) described above characterizes successful collision avoidance maneuvers. The other phases and transitions shown in Fig. 1 increase the robustness of the protocol by providing recovery actions in case of protocol failures in the negotiation phase (e.g. because of disturbed communication channels). A key concept here is that of fail-safe states: we say that a traffic agent offers a fail-safe state if there is a stable state of Cj ∥ Pj, i.e. a valuation of the plant variables and actuators such that their derivative is zero and, in particular, the traffic agent maintains its current position, as is the case e.g. for train-system applications. For such traffic agents, any failure encountered during the negotiation or correcting phase will lead to the activation of recovery modes, whose control law is guaranteed to drive the agent into a fail-safe state, as long as the recovery mode is entered in a state satisfying a predicate characterizing its application condition. For simplicity, we assume only a single designated recovery mode for such traffic agents. Correct collision avoidance even in the case of failures can only be guaranteed if both agents offer fail-safe states, and the recovery modes driving the system to a fail-safe state are entered while the plant is (still) in a state meeting the application condition of both recovery modes. Transitions (4) and (5) are guarded by conditions catering for
– an inconsistent selection of correcting modes,
– a timely invocation of recovery actions so as to allow the system to reach the fail-safe state.
As mentioned above, we assume the existence of an upper bound on the time to conclude the negotiation phase, and exploit this in showing that correcting maneuvers are initiated in time for the maneuver to be performed successfully. To guarantee the success of recovery actions, we similarly exploit the fact that either the negotiation phase is concluded within the above bound, or a failure is detected and recovery mechanisms are initiated. Transition (5) provides an additional safeguard, in that the transition to the recovery mode will be taken as soon as its application condition applies. The activation of such recovery mechanisms should only be performed if, in addition, the duration allowed for a successful conclusion of the negotiation phase has expired. Once the recovery mode is entered, the system will then eventually be driven to a fail-safe state via transition (6), causing it to remain in phase SAFE.
For systems not offering fail-safe states, no formal analysis can be performed as to the effects of initiating a recovery strategy. Recovery modes for such systems reflect the "never-give-up" design strategy for safety-critical systems, where "best guesses" as to the possible state of the other agent are made in the determination of the control laws associated with such recovery modes.
The phase-transition diagram defines a set of type (C) verification tasks appearing as premises of our proof rule for establishing collision freedom. To instantiate the proof rule for a concrete conflict-resolution protocol, the phases discussed above must be defined as (derived) observables of the concrete protocol. Often, phases can be characterized by being in a set of modes of the concrete protocol. In general, we assume that phases are definable by quantifier-free first-order formulas in terms of the variables and mode of the concrete system. We denote by Φobs the set of equivalences defining phases and transition guards in terms of the entities of the concrete protocol. It is then straightforward to generate verification conditions enforcing compliance of the concrete protocol with the phase-transition system of Fig. 1, for instance by using the subclass of so-called "implementables" of the Duration Calculus (cf. [15]) or some variant of real-time temporal logic. Such verification conditions take the general form "when in phase p then remain in phase p unless the trigger condition occurs" and "when in phase p then switch to phase p' if the trigger condition is met".5 We denote the set of temporal logic formulas jointly characterizing the phase-transition diagram of Fig. 1 assuming Φobs by Φphase.
This leads to the first set of verification conditions for establishing collision freedom. Here and in the remainder of this section we assume a fixed system (C1 ∥ P1) ∥ (C2 ∥ P2) as given.

(VC 1) Controllers observe the phase-transition diagram

Ci |= Φphase (for i = 1, 2)

Prior to deriving trigger conditions for phase transitions, let us first elaborate on three verification conditions to be established by what we refer to as "off-line analysis". These verification conditions serve to establish at design time that the invocation of matching maneuvers will guarantee collision freedom, as long as these are initiated while the global plant state satisfies a predicate characterizing the application condition of the pair of maneuvers. A similar off-line analysis must be carried out for the recovery mechanism, where it must be verified that any plant state meeting the application condition for the recovery modes will drive both agents into their fail-safe states prior to reaching a collision situation.
As discussed in Section 2, we assume as given a subset of correcting modes CMj of the modes Mj of Cj. If Cj has fail-safe states, we denote by mrec2fss,Cj the unique mode of Cj defining the control strategy for reaching fail-safe states.
We assume a binary relation MATCH ⊆ CM1 × CM2 to be given, which captures the designer's understanding of which maneuvers are compatible and hence expected to resolve potential collision situations. To this end, she will also determine application conditions Φ(m1,m2), where (m1, m2) ∈ MATCH, under which these maneuvers are expected to be successful. Typically, these application conditions will involve the relative position of the traffic agents, as well as their velocities. Maneuvers are typically executed under assumptions on some of the system variables (e.g. bounding speed or lateral movement), which must be established separately. Similarly, let Φrec2fss denote the application condition for recovery to fail-safe states (see the following sections for examples).

5
We assume in this paper that the successor phase is uniquely determined by the
trigger condition. Time-bounds on taking the transition when the trigger is enabled
are discussed below.
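One possible concrete representation of MATCH and the application conditions Φ(m1,m2) – with entirely hypothetical mode names and altitude thresholds, chosen only for illustration – is a mapping from mode pairs to predicates on the joint plant state:

# Hypothetical encoding of MATCH and the application conditions; the mode
# names and the -50 m threshold are invented for illustration only.
MATCH = {
    ("climb", "descend"): lambda plant: plant["alt1"] - plant["alt2"] > -50,
    ("descend", "climb"): lambda plant: plant["alt2"] - plant["alt1"] > -50,
}

def applicable_pairs(plant_state):
    """All matching mode pairs whose application condition holds in the given state."""
    return [pair for pair, phi in MATCH.items() if phi(plant_state)]

print(applicable_pairs({"alt1": 9000.0, "alt2": 9020.0}))
# -> both pairs are applicable; the negotiation phase must pick one of them.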
The first verification condition of this kind requires a proof that matching maneuvers indeed avoid collision. It states that all trajectories obeying the set of differential equations governing the maneuvers and the plants will be free of collisions, as long as they are initiated in a state satisfying the application condition of the matching modes. The runs to be considered are assumed to meet the plant invariants on boundary conditions for the plants' variables. Moreover, any local invariances associated with the correcting modes may be assumed during off-line analysis – separate verification conditions cater for these.

(VC 2) Adequacy of matching modes

∀ (m1, m2) ∈ MATCH :
   (R^c_C1(m1) ∥ R^c_P1(·)) ∥ (R^c_P2(·) ∥ R^c_C2(m2))
   |= □(Θ(m1) ∧ Θ(m2) ∧ Θ(P1) ∧ Θ(P2) ∧ Φ(m1,m2)) ⇒ □(collision-free)

A key point to note is that the above verification condition does not involve state-based reasoning. By abuse of notation, we have replaced the hybrid automata by their differential equations in the parallel composition above. To establish the verification condition, we must analyze the possible evolution of trajectories of the plant variables in the subspace characterized by the application condition. We can do so by over-approximation, using boundary conditions on plant variables (such as maximal speed, maximal acceleration, maximal disturbance) and actuators (as determined by the control laws associated with the correcting modes), and invariance conditions associated with the correcting modes. See the next section for a concrete example, demonstrating in particular the ease of establishing such verification conditions. For more involved maneuvers, we can resort to classical methods from control theory, such as Lyapunov functions (cf. [9]), for establishing collision freedom, since no state-based behaviour must be analyzed. A follow-up paper will solve the constraint system induced by (VC 2) generically, establishing constraints on the involved design parameters under which (VC 2) is guaranteed to hold.

(VC 3) Adequacy of recovery maneuvers

(R^c_C1(mrec2fss,1) ∥ R^c_P1(·)) ∥ (R^c_P2(·) ∥ R^c_C2(mrec2fss,2))
|= □(Θ(mrec2fss,1) ∧ Θ(mrec2fss,2) ∧ Θ(P1) ∧ Θ(P2) ∧ Φrec2fss)
   ⇒ (◇ fail-safe ∧ □ collision-free)

So far we have only discussed the adequacy of the maneuvers for avoiding collisions. This must be complemented by a proof of completeness of the proposed methods. More specifically, for any trajectory of the unconstrained plants (respecting only global invariances on maximal disturbances, maximal rate of acceleration, etc.) leading to a collision situation, there must be at least one pair of matching maneuvers whose application condition is reached during the trajectory.

(VC 4) Completeness of maneuvers

∀ π ∈ [[P1 ∥ P2]] : π |= Θ(P1) ∧ Θ(P2) ⇒
   ( ∀ t : π(t) |= collision ∧ π[0, t) |= collision-free
     ⇒ ∃ t' < t ∃ (m, m') ∈ MATCH : π(t') |= Φ(m,m') )
We will now derive a sufficient condition for the predicate ΦN guarding the transition to the negotiation phase. This predicate must ensure that in any potentially hazardous situation the negotiation phase is entered early enough that correcting maneuvers can still be executed successfully. To derive this condition, we perform the following simple Gedankenexperiment. Assume that the negotiation phase can be performed in zero time. Then, by completeness of maneuvers, it would be sufficient to trigger the transition into the negotiation phase as soon as at least one activation condition of the set of possible maneuvers is met. Indeed, by completeness, any potentially hazardous "legal" trajectory would have to pass at least one of the activation conditions of the set of maneuvers; hence, by selecting one of these, the maneuver will ensure that the potentially hazardous situation is resolved (by the adequacy of matching correcting modes).
To complete the Gedankenexperiment, we now drop the assumption that negotiation is performed in zero time. Instead, we assume that at protocol design time a time window ΔN is determined within which the negotiation phase will be left, either successfully by entering phase CORRECTING, or otherwise by invoking a recovery mode. To determine the trigger condition ΦN, we must derive the set of states which can evolve during this time window ΔN into a potentially hazardous situation, i.e. into a state meeting at least one of the activation conditions of matching maneuvers. When performing this pre-image computation, we must take into account the dynamics of both traffic agents. In this paper, we assume a cooperative approach: the pre-image computation is performed assuming that the speed of the traffic agents is not increased while performing the negotiation phase – an assumption which must then be established separately as a verification condition.
Let pre_{P1∥P2}(Δ, [v1,l, v1,u], [v2,l, v2,u], Φ) be a predicate which over-approximates the set of plant states which within Δ time units can evolve into a state satisfying Φ, assuming that the speed vi of agent i is bounded by [vi,l, vi,u] (a vector of bounds for all three dimensions, if applicable) and restricted by Θ(Pi). (VC 5) below expresses the restriction on the condition ΦN triggering the transition to phase NEGOTIATING, given the time bound ΔN for the negotiation to complete.

(VC 5) Negotiation is initiated in time

|= ΦN ⇐ ⋁_{(m1,m2) ∈ MATCH} pre_{P1∥P2}(ΔN, [v1,l, v1,u], [v2,l, v2,u], Φ(m1,m2))
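As a minimal sketch of how pre_{P1∥P2} could be realized, assume both agents move on a line and the activation condition Φ(m1,m2) is characterized by an upper bound on the gap between their positions; then interval widening by the maximal closing speed gives a sound over-approximation. The function name pre_gap and the parameter values below are our own assumptions.

def pre_gap(delta, v1_bounds, v2_bounds, gap_bound):
    """Over-approximate pre-image for a predicate of the form gap <= gap_bound,
    where gap is the distance between two agents moving on a line.  Within
    delta time units the gap can shrink by at most the maximal closing speed
    times delta, so it suffices to widen the bound accordingly."""
    v1_lo, v1_hi = v1_bounds
    v2_lo, v2_hi = v2_bounds
    max_closing_speed = v1_hi + v2_hi   # worst case: both agents move towards each other
    return lambda gap: gap <= gap_bound + max_closing_speed * delta

# Example: activation condition "gap <= 1000 m", negotiation window of 10 s,
# both speeds bounded by 80 m/s.  Then Phi_N must already hold at gap = 2600 m.
phi_n = pre_gap(delta=10.0, v1_bounds=(0.0, 80.0), v2_bounds=(0.0, 80.0), gap_bound=1000.0)
assert phi_n(2600.0) and not phi_n(2601.0)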
That ΔN is met by the agents' negotiation is to be verified separately. We assume that the necessary communication can be performed by the controllers on their own.

(VC 6) Negotiation completes in time

C1 ∥ C2 |= □( NEGOTIATING(C1) ∨ NEGOTIATING(C2)
              ⇒ ◇≤ΔN (¬NEGOTIATING(C1) ∧ ¬NEGOTIATING(C2)) )

To cover the assumptions made in the pre-image computation, we require the control laws active during the negotiation phase to maintain the speed of the traffic agents within the bounds used in (VC 5).

(VC 7) Speed stays bounded during negotiation

Ci ∥ Pi |= □(NEGOTIATING(Ci) ⇒ vi ∈ [vi,l, vi,u]) (for i = 1, 2)

For traffic systems allowing recovery to fail-safe states, we additionally ensure that negotiation is initiated in system states which still allow recovery actions to be invoked if negotiation fails.

(VC 8) Recovery actions will be possible

|= ΦN ⇒ pre_{P1∥P2}(ΔN, [v1,l, v1,u], [v2,l, v2,u], Φrec2fss)

Jointly, (VC 5), (VC 6) and (VC 7) guarantee that the activation conditions of the maneuvers remain true during negotiation, hence are still valid when the maneuvers are actually invoked, and (VC 7) in conjunction with (VC 6) and (VC 8) guarantees that the activation condition for recovery to fail-safe states is met if the negotiation phase fails.
That the negotiation results in matching modes being selected is stated in the following condition.
(VC 9) Only matching modes with satisfied activation condition are selected

C1 ∥ C2 |= □( ↑(CORRECTING(C1) ∧ CORRECTING(C2))
              ⇒ (Mode(C1), Mode(C2)) ∈ MATCH ∧ Φ(Mode(C1),Mode(C2)) )

We must finally ensure that a selected maneuver is not abandoned until the hazardous situation has been resolved, i.e., until phase FAR can be re-entered. Hence, while in phase CORRECTING, the maneuver may not be changed.
(VC 10) Correcting modes are not changed

C1 ∥ C2 |= ∀ (m1, m2) ∈ MATCH :
   □( ⋀_{i=1,2} (CORRECTING(Ci) ∧ Mode(Ci) = mi)
      ⇒ ⋀_{i=1,2} (Mode(Ci) = mi while CORRECTING(Ci)) )
The condition ΦF guarding the transition to phase FAR is easily derived: by completeness of the set of maneuvers, the plant state is no longer potentially hazardous if all activation conditions of the maneuvers are false. In this case, the controllers are permitted to leave the correcting mode.
(VC 11) Termination of Maneuver

|= ΦF ⇒ ⋀_{(m1,m2) ∈ MATCH} ¬Φ(m1,m2)

Jointly, these verification conditions are sufficient to establish collision freedom. A later version of this paper will prove this formally by performing the required off-line analysis and pre-image computation symbolically on parameterized models. In the following, we give two examples of proofs of collision freedom and discuss how they can be rephrased in terms of our proof rule.

4 Decomposition in the Verification of Safety of the TCAS System

4.1 A Short Description of the TCAS System
The Traffic Alert and Collision Avoidance System (TCAS) serves to detect and avoid the danger of aircraft collisions. It is realized by electronic units on board the aircraft. Its task is to detect potential threats, to alert the pilot and to give advice on maneuvers which resolve the conflicts. To achieve this, the system gathers information from nearby aircraft to predict the vertical separation at the point of closest horizontal approach. If the vertical separation is deemed insufficient, it issues "Resolution Advisories" (RAs) to the pilot ("climb", "descend", . . . ) which, assuming an unchanged horizontal course, shall avoid that the aircraft come dangerously close. The TCAS units of different aircraft communicate to alert each other of their classification as a threat, to ensure consistency of RAs, and to avoid unnecessary alarms by taking intended course changes into account. Each TCAS unit resorts to deciding on its own if communication cannot be established. Fig. 2 from [11] gives an overview of the components involved in conflict avoidance in the case of two aircraft.
There have been several versions of the system. We will discuss TCAS II-7. This version is in widespread use, and has undergone considerable analysis, both internally during its development and externally, e.g. in [1]. The precise analysis from [11] comes closest to our approach, and we will point out similarities and differences to highlight to which extent the formulated goal of establishing safety has been proven to be achieved by the TCAS system.

4.2 The Analysis of TCAS by Livadas, Lygeros and Lynch


The general approach of [11] is to use hybrid automata to model all components
involved in conflict avoidance for a two-aircraft instance, and to analyse the runs of the combined automata to check for the absence of collisions. The real system would then have to be shown to be a refinement of the respective components (or would have to be assumed to act as if it were a refinement – the model includes components for pilots, which would hardly be provable refinements of an automaton). Though the automaton format differs (hybrid I/O automata with superdense time are used), this is very much in line with our approach.

Fig. 2. TCAS-System for two aircraft (components per aircraft: TCAS unit, sensor, pilot, conflict detection, conflict resolution and advisories, linked by two communication channels)
Several assumptions are stated under which the analysis is performed. Some of them serve to reduce the computational complexity of the trajectory analysis, which may limit the practical relevance but is conceptually reasonable. Some are even indispensable, such as the one restricting pilot behavior to not oppose traffic advisories. Others limit the relevance of the derived safety result when compared to our stated goal. Our discussion will center around these latter issues.
Disregarding simplifying assumptions, the safety theorem of [11] can be stated
as follows:

All assumption-compliant runs where a conflict is detected will either


evolve to retraction of the conflict declaration before passage, or to a
passage where sufficient vertical separation is kept at the point of closest
approach.

This can be rephrased as saying that the proof establishes that the interplay
between discrete and continuous components will ensure that TCAS’ operational
goals are met. An “operational” goal is formulated via the system’s own defini-
tions:
– A "conflict" is defined as the system's conflict-detection component defines it.
– "Undeclaration" of a conflict is similarly taken from the respective system criterion.
– Collision-freedom of two trajectories is given if passages have sufficient vertical separation.

This theorem tells a lot about the effects of TCAS – the vertical separation will be achieved as required in the system specification. However, it does not imply that two aircraft guided by TCAS will not collide. The latter is an extensional property; the former is intensional and presupposes that the operational specification is consistent (which is, considering the rather successful history of guaranteeing safety in air traffic, not unreasonable). But our verification goal is more ambitious. So let us have a look at the verification conditions of our rule which have been established in [11], and at those which would have to be added.

4.3 Items for Completing the Analysis


The extensional goal is formulated in terms of safety envelopes. Such notions do
exist in air traffic regulations – there is for instance a safety distance between
landing aircraft which depends on the weight class of the leading aircraft. Similar
separation requirements can be derived for mid-air traffic. So we may assume
adequate definitions of safety envelopes as given.
Our approach presupposes compliance of the resolution procedure with the phase-transition diagram of Fig. 1. In the case of TCAS, there are no fail-safe states, which reduces the picture to the three phases FAR, NEGOTIATING and CORRECTING. On the other hand, as TCAS follows the "never give up" strategy, there are further situations not taken care of in the general picture. We will have to extend our approach to cater for these. In the meantime, we resort to excluding further behavior from our analysis by adequate assumptions. Then one may indeed view TCAS II-7 as adhering to the three-phase cycle, if attention is restricted to the two-aircraft case. Since adherence to the behavior pattern is implicitly shown in [11], we may conclude that (VC 1) does not pose a problem.
As there are no fail-safe states, the verification conditions (VC 3) and (VC 8)
do not apply here. We will go through the list of remaining verification conditions
in the following.
A central part of the adequacy of matching modes (VC 2) is indeed established by the essence of the proof from [11]: trajectories resulting from the selected correcting strategies will result in sufficient separation. To complete the proof of (VC 2), we have to add that the intensional (TCAS-internal) criterion of "vertical separation at closest approach" implies disjointness of safety envelopes. This will (likely) be a consequence of the bounds on climb and descent speeds of aircraft following TCAS resolution advisories. Of course, a precise definition of the conditions Φ(m1,m2) on matching modes will also be required to formally complete the proof.
The completeness of maneuvers (VC 4) is also partly addressed: after a conflict has been detected, a pair of maneuvers will be selected. To be added is the important argument that each conflict is detected, and detected in time for some maneuver to be successful. This is again the (missing) part distinguishing intensional and extensional arguments.
The conditions regarding timing, (VC 5) and (VC 6), have been left out of consideration by adding appropriate assumptions. The speed bound (VC 7) is rendered trivial by assumption. Adding proofs (and weakened assumptions) for these three verification conditions would be rather easy.
The selection of matching modes with satisfied activation conditions (VC 9) is addressed implicitly by the proof, since for all cases of selections it is shown that the corrections will be successful. Our proof rule requires a separation of the proof via the introduction of MATCH and Φ(m1,m2). A case split according to the matching mode pair is actually performed in [11], and can be extended to the additional proof obligations.
Adherence to a selection of correcting modes (VC 10) is a condition simplifying our proof rule and restricting the solution space. TCAS II-7 matches this restriction if the definition of correction is made adequately and "never give up" is left out of consideration. The reversal of correcting actions in one aircraft – one of the new features of version II-7 – seems to violate this requirement. But it fits into the picture, as it reverses a correcting maneuver which has started before the end of negotiations. The reversal caters (among other things) for the not unlikely situation that one aircraft's pilot ignores resolution advisories.
Termination of maneuvers (VC 11) is of course also not addressed exten-
sionally in [11]. We think that it will follow easily from the definition of the
application conditions Φ(m1 ,m2 ) .

5 Decomposition in the Verification of Safety of a Railroad Crossing
Railroad crossings are often taken as case studies for real-time systems [3, 7].
Here we consider a railroad crossing as a hybrid system taking the continuous
dynamics of train and gate into account. Our formal model extends that in [14] by
some aspects of the radio controlled railroad crossing, a case study of the priority
research program “Integration of specification techniques with applications in
engineering”6 of the German Research Council (DFG) [8].
We start from the domains Position = R≥0, with typical element p, for the position of the train on the track and Speed = R≥0, with typical element v, for the speed of the train. Let vmax denote the maximal speed of the train. As part of the track atlas we find the positions of all crossings along the track. In particular, the function

next : Position → Position

6
http://tfs.cs.tu-berlin.de/projekte/indspec/SPP/index.html
yields for each position of the train the start position of the next crossing ahead, such that ∀ p ∈ Position : p < next(p) holds. Furthermore,

inCr : Position → B

is a predicate describing all positions in the crossing and

afterCr : Position → B

is a predicate describing the positions in a section immediately after the crossing.


For simplicity we shall not be explicit about the extension of the train and the
crossing. We use inCr to describe all positions where the train (or its safety
envelope) overlaps with the crossing.
There is a section near the crossing in which the train has to request permission from the gate to enter the next crossing. The extension of this section depends on the train's speed. The predicate

nearCr : Speed → (Position → B)

describes for a given speed all positions that are in such a section. The positions satisfying these predicates are illustrated in Fig. 3. The predicate

farCr : Speed → (Position → B)

describes for a given speed v all positions that are far away from the next cross-
ing, i.e. the complement of all positions satisfying nearCr (v ) or inCr or afterCr .
Note that we expect the following implications to hold:

∀ v1 ≤ v2 ∈ Speed : (nearCr (v1 ) ⇒ nearCr (v2 )) ∧ (farCr (v2 ) ⇒ farCr (v1 ))

We assume that subsequent crossings on the track are far apart so that even
for the maximal speed vmax of the train there is a nonempty section farCr(vmax )
between each section afterCr and the section inCr of the next crossing. We model
the gate by its angle α ranging from 0 to 90 degrees.

Fig. 3. Parameters of the railroad crossing

The railroad crossing will now be modelled by four components: a train plant
and controller interacting with a gate plant and controller. Fig. 4 shows how
these plants are represented by continuous variables pos(ition), speed and α,
and which variables the four components share for communication with each
other. For simplicity we shall ignore the communication times in the subsequent
analysis.
Fig. 4. Communication in the train gate system (the train plant holds pos and speed and receives sc from the train controller; the gate plant holds α, receives Dir and Fail, and signals Op and Cl to the gate controller; the train controller reads pos and speed and exchanges Req and OK with the gate controller)

Train Plant
We assume that the train has knowledge of its position on the track and controls
its speed depending on requests from the train controller. It will react to speed
control commands from the train controller. Thus we consider the variables
below. We do not distinguish between the (syntactic) variables of the automaton
and the corresponding trajectories in runs. So we take for the type of a variable
the type of its time-dependent trajectory, and we permit variables with discrete
ranges without explicitly coding them in reals.

Variables:

input sc : Time → {Normal , Keep, Brake} (speed control)


output pos : Time → Position (position of the train)
speed : Time → Speed (speed of the train)

Dynamics. Let −areg be the deceleration of the train when braking (in regu-
lar operational mode, not in emergency mode which is not modeled). Thus we
assume that the following invariants hold:

pos • = speed
−areg ≤ speed •
speed ≤ vmax
Here we are interested only in the change of speed during braking:

speed• = 0       if sc = Keep ∨ (sc = Brake ∧ speed = 0)
speed• = −areg   if sc = Brake ∧ speed > 0
speed• = . . .   if sc = Normal

With these characteristics we can calculate for which speeds v and positions p the predicate nearCr(v)(p) should hold, taking the maximal closing time εmax of the gate into account:

nearCr(v)(p) ⇐ next(p) − v²/(2 · areg) − εmax · v ≤ p

Here v²/(2 · areg) is the maximal distance it takes for the train with an initial speed of v to stop, and εmax · v is the maximal distance the train can travel while the gate is closing. Remember that we do not take the time for communication into account here. We assume that initially the train is far away from the next crossing, i.e. the predicate farCr(speed)(pos) holds for a while. This, and the definition of nearCr(v)(p), is needed for establishing (VC 4).
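The threshold in the definition of nearCr can be evaluated directly. The sketch below uses invented parameter values (crossing position, deceleration and gate speed) and treats next as a constant for a single crossing; it only illustrates the arithmetic of the implication above.

# Illustrative evaluation of the nearCr threshold (all parameter values invented).
A_REG = 1.0                 # regular deceleration in m/s^2
C = 3.0                     # gate speed in degrees per second
EPS_MAX = 90.0 / C + 1.0    # maximal closing time of the gate (31 s)
NEXT_CROSSING = 5000.0      # position of the next crossing in m

def near_cr(v):
    """nearCr(v)(p): the braking distance v^2/(2*a_reg) plus the distance
    eps_max * v travelled while the gate closes already reaches the crossing."""
    return lambda p: NEXT_CROSSING - v * v / (2.0 * A_REG) - EPS_MAX * v <= p

v = 40.0                                                           # current speed in m/s
threshold = NEXT_CROSSING - v * v / (2.0 * A_REG) - EPS_MAX * v    # 2960 m
assert near_cr(v)(threshold) and not near_cr(v)(threshold - 1.0)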

Train Controller
The train controller monitors the position and speed of the train, sends requests
to enter the next crossing, and waits for an OK signalling permission by the
crossing to enter. If this OK signal does not occur within some time bound
measured by a clock x , the train controller will request the train to enforce
braking in order to stop before entering the crossing. Thus the train controller
has the following time dependent variables.

Variables:

input pos : Time → Position (position of the train)


speed : Time → Speed (speed of the train)
OK : Time → B (permission to enter crossing)
local x : Time → Time (clock)
output Req : Time → B (request to enter crossing)
sc : Time → {Normal ,
Keep, Brake} (speed control)
Modes: Far, Appr, SafeAppr, FailSafe

Dynamics. The dynamics of the train controller is described by the hybrid automaton in Fig. 5. Initially, the controller is in the mode Far. When for the current values of pos and speed the predicate nearCr(speed)(pos) becomes true, the controller switches to the mode Appr, setting Req to true and starting the clock x. When things proceed as expected, the crossing should respond with an OK signal within εmax time, the maximal time the gate needs to close. On occurrence of OK the controller switches to the mode SafeAppr, indicating that the train can safely approach the crossing. However, if the OK signal does not arrive within εmax time, the controller enters the mode FailSafe, where the train is forced to brake until it stops. Only if an OK signal arrives later may it resume its safe approach to the crossing.

Fig. 5. Train controller (modes Far, Appr, SafeAppr, FailSafe; Far switches to Appr on nearCr(speed)(pos) with Req := tt, x := 0, sc := Keep; Appr switches to SafeAppr on OK with sc := Normal, or to FailSafe when x = εmax with sc := Brake; FailSafe switches to SafeAppr on OK with sc := Normal; SafeAppr switches back to Far on afterCr(pos) with Req := ff; initially Far with Req := ff, sc := Normal; in Appr the clock runs with x• = 1)

The controller states correspond nicely to the phases from Fig. 1: Far is
FAR, Appr is NEGOTIATING, FailSafe is RECOVERY, and SafeAppr is
CORRECTING. The only action the train does in correcting is keeping the Req
signal up.7
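For simulation purposes, the automaton of Fig. 5 can be transcribed into executable form. The following sketch is our own discretized rendering (a fixed sampling step approximates the urgent transitions); plant and gate are abstracted into the inputs pos, speed and OK, and nearCr and afterCr are passed in as the curried track predicates introduced above.

class TrainController:
    """Discretized rendering of the train controller of Fig. 5 (modes Far,
    Appr, SafeAppr, FailSafe).  near_cr and after_cr are the track predicates,
    eps_max the maximal gate closing time."""

    def __init__(self, near_cr, after_cr, eps_max):
        self.near_cr, self.after_cr, self.eps_max = near_cr, after_cr, eps_max
        self.mode, self.req, self.sc, self.x = "Far", False, "Normal", 0.0

    def step(self, pos, speed, ok, dt):
        if self.mode == "Far" and self.near_cr(speed)(pos):
            self.mode, self.req, self.sc, self.x = "Appr", True, "Keep", 0.0
        elif self.mode == "Appr":
            self.x += dt                      # clock x runs in mode Appr
            if ok:
                self.mode, self.sc = "SafeAppr", "Normal"
            elif self.x >= self.eps_max:      # timeout: gate did not confirm in time
                self.mode, self.sc = "FailSafe", "Brake"
        elif self.mode == "FailSafe" and ok:
            self.mode, self.sc = "SafeAppr", "Normal"
        elif self.mode == "SafeAppr" and self.after_cr(pos):
            self.mode, self.req, self.sc = "Far", False, "Normal"
        return self.req, self.sc

The sampling step dt only approximates the urgency of the automaton's transitions: a transition may fire up to one step later than in the idealized hybrid model.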

Gate Plant
The gate controls its angle α depending on the direction requested via the variable Dir by the gate controller. If Dir is Up the gate should open, and if it is Down the gate should close. The gate signals whether it is completely open via Op or completely closed via Cl. We take into account that the gate may fail to open or close when requested. This is modelled by the following time-dependent variables.

7
Thus, one might also view the mode Appr as already CORRECTING.

Variables:

input Dir : Time → {Up, Down} (direction of the gate)


Fail : Time → B (gate failure)
local α : Time → [0, 90] (angle of the gate)
output Op : Time → B (gate is open)
Cl : Time → B (gate is closed)

Dynamics. We assume that initially the gate is open, i.e. α(0) = 90, that c is the speed with which the gate can change its angle α, and that the gate plant is characterised by the following differential equation:

α• = c    if Dir = Up ∧ α < 90 ∧ ¬Fail
α• = −c   if Dir = Down ∧ α > 0 ∧ ¬Fail
α• = 0    otherwise

Thus, when there is no gate failure, the maximal closing time εmax of the gate (plus one extra second) is calculated as follows:

εmax = 90/c + 1

The outputs Op and Cl are coupled with the angle of the gate by the following invariants:

Op ⇔ α = 90 and Cl ⇔ α = 0
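A simple Euler integration suffices to simulate the gate plant and to confirm that, in the absence of failures, the gate closes within εmax = 90/c + 1 seconds. The step size and the value c = 3 degrees per second below are our own illustrative choices.

def simulate_gate(dir_signal, fail_signal, c, dt=0.01, horizon=60.0):
    """Euler simulation of the gate plant: alpha follows the piecewise ODE,
    Op and Cl are derived from alpha as in the invariants above."""
    alpha, t, trace = 90.0, 0.0, []
    while t < horizon:
        fail, direction = fail_signal(t), dir_signal(t)
        if direction == "Up" and alpha < 90.0 and not fail:
            alpha = min(90.0, alpha + c * dt)
        elif direction == "Down" and alpha > 0.0 and not fail:
            alpha = max(0.0, alpha - c * dt)
        trace.append((t, alpha, alpha == 90.0, alpha == 0.0))   # (t, alpha, Op, Cl)
        t += dt
    return trace

# With c = 3 degrees per second and no failure, the gate is closed well within
# eps_max = 90 / c + 1 = 31 seconds after Dir switches to Down.
trace = simulate_gate(lambda t: "Down", lambda t: False, c=3.0)
closed_at = next(t for t, _, _, cl in trace if cl)
assert closed_at <= 90.0 / 3.0 + 1.0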

Gate Controller
The gate controller reacts to the presence or absence of requests by the train
(controller) to enter the crossing by issuing GoUp and GoDown signals to the
gate plant. Depending on the messages Op and Cl of the gate plant it can
signal an OK to the train controller as a permission to enter the crossing. This
motivates the following time dependent variables.

Variables:

input Op : Time → B (gate open)


Cl : Time → B (gate closed)
Req : Time → B (request to enter crossing)
output Dir : Time → {Up, Down} (direction of the gate)
OK : Time → B (permission to enter crossing)
Modes: CrUnsafe, CrSafe, CloseGate, OpenGate
Dynamics. The dynamics of the gate controller is described by the automaton in Fig. 6. Initially, the controller is in the mode CrUnsafe, where it does not grant any permission to enter the crossing. When it senses a request by the train to enter, it orders the gate to go down and switches to the mode CloseGate. It stays there until the gate signals that it is completely closed. Only then does the controller give permission to the train to enter the crossing by issuing an OK signal, and it switches to the mode CrSafe. It stays there until the request to enter the crossing is withdrawn. Then the controller switches to the mode OpenGate, withdraws the permission to enter the crossing and orders the gate to go up. When the gate is completely open, the controller switches back to its initial mode.

Fig. 6. Gate controller (modes CrUnsafe, CloseGate, CrSafe, OpenGate; CrUnsafe switches to CloseGate on Req with Dir := Down; CloseGate switches to CrSafe on Cl with OK := tt; CrSafe switches to OpenGate on ¬Req with Dir := Up, OK := ff; OpenGate switches back to CrUnsafe on Op; initially CrUnsafe with Dir := Up, OK := ff)

The mode CrUnsafe belongs to the gate's phase FAR, CloseGate and CrSafe form the phase CORRECTING, and OpenGate is again FAR. The phase NEGOTIATING has no corresponding mode in this model: the gate does not negotiate, and one may view the transition from CrUnsafe to CloseGate as passing NEGOTIATING in an instant.
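Analogously to the train controller, the gate controller of Fig. 6 can be transcribed as a small state machine; the rendering below is our own sketch and, like the model itself, abstracts from communication delays.

class GateController:
    """Rendering of the gate controller of Fig. 6 (modes CrUnsafe, CloseGate,
    CrSafe, OpenGate); inputs are Req, Op and Cl, outputs Dir and OK."""

    def __init__(self):
        self.mode, self.dir, self.ok = "CrUnsafe", "Up", False

    def step(self, req, op, cl):
        if self.mode == "CrUnsafe" and req:
            self.mode, self.dir = "CloseGate", "Down"
        elif self.mode == "CloseGate" and cl:
            self.mode, self.ok = "CrSafe", True
        elif self.mode == "CrSafe" and not req:
            self.mode, self.dir, self.ok = "OpenGate", "Up", False
        elif self.mode == "OpenGate" and op:
            self.mode = "CrUnsafe"
        return self.dir, self.ok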

Correctness Proof
Desired Safety Property. We wish to show that whenever the train is in the critical section the gate is closed. This can be formulated in terms of safety envelopes as follows. For SETrain(pos) we choose an extension around the current position pos which encompasses the extension of the train, independent of mode and speed. The crossing's safety envelope is

SECr(m) = ∅ if m = CrSafe, and SECr(m) = Q otherwise,

for some adequate set of positions Q. The choice of ∅ as the safety envelope in mode CrSafe permits the train to pass the crossing when the bars are closed without a safety violation. We assume that inCr(pos) includes all train positions which could, if the crossing is not safe, violate safety, i.e.

SETrain(pos) ∩ Q ≠ ∅ ⇒ inCr(pos).

To perform the proof, we use the State Transition Calculus [18], an extension of the Duration Calculus [19, 15] to deal with instantaneous transitions. The Duration Calculus itself is a temporal logic and calculus for expressing and proving real-time interval properties of trajectories or observables of the form obs : Time → Data. As a consequence of our assumption above, the desired safety property can then be expressed as follows:

□(⌈inCr(pos)⌉ ⇒ ⌈Cl⌉)

This Duration Calculus formula states that for every time interval, whenever the state assertion inCr(pos) about the train holds throughout this interval, then also the state assertion Cl holds throughout the same interval.
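On a sampled joint trace of train position and gate status, the safety property can be approximated by a point-wise check: whenever inCr(pos) holds, Cl must hold as well. The sketch below is our own discretization of the Duration Calculus statement and is only meant to illustrate its reading; it is not a substitute for the interval-based proof.

def safety_holds(trace, in_cr):
    """Discretized check of 'always (in the crossing implies gate closed)':
    at every sample where inCr(pos) holds, Cl must hold as well."""
    return all(cl for pos, cl in trace if in_cr(pos))

# Example: the crossing occupies positions [5000, 5050]; the gate is closed on
# all samples where the train is inside the crossing.
in_cr = lambda pos: 5000.0 <= pos <= 5050.0
trace = [(4990.0, False), (5005.0, True), (5040.0, True), (5060.0, False)]
assert safety_holds(trace, in_cr)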
This property is established in a complete, self-contained proof. This proof
is presented below, with added comments on which parts correspond to the
verification conditions of our rule and the argument of the rule itself.

Notation. Duration formulas are evaluated in time intervals whereas state assertions are evaluated in time points. In the following D, D1, D2 denote duration formulas, S denotes a state assertion, t ∈ Time, and ℓ denotes the length of the current interval.

D1 ; D2 (chop operator): holds in a given time interval if first D1 and then D2 holds
◇D ⇔ true; D; true: holds in a given time interval if D holds on some subinterval
□D ⇔ ¬◇¬D: holds in a given time interval if D holds for all subintervals
⌈⌉ ⇔ ℓ = 0: holds in a given time interval if it is a point interval (i.e. has length 0)
⌈S⌉ ⇔ ℓ > 0 ∧ ∫S = ℓ: holds for a non-point interval in which S holds throughout
⌈S⌉_t ⇔ ℓ = t ∧ ∫S = ℓ: holds for an interval of length t in which S holds throughout
D −→0 ⌈S⌉ ⇔ ¬(D; ⌈¬S⌉): holds in an interval starting at 0 if D is followed by ⌈S⌉
D −→ ⌈S⌉ ⇔ □¬(D; ⌈¬S⌉): holds in a given time interval if always D is followed by ⌈S⌉
D −→^t ⌈S⌉ ⇔ (D ∧ ℓ = t) −→ ⌈S⌉: holds in a given time interval if whenever D is true for t time units it is followed by ⌈S⌉
↑S (start transition): holds for a point interval where S switches from false to true
↓S (end transition): holds for a point interval where S switches from true to false

To avoid unnecessary brackets, we assume that the chop operator ; binds stronger than the binary Boolean connectives ∧, ∨, ⇒, ⇔ and the derived binary operators −→0, −→, −→^t, −→^{≤t}. In state assertions, equations obs = d are abbreviated to d if it is clear to which observable obs the data value d belongs. For instance, ⌈Keep⌉ abbreviates ⌈sc = Keep⌉. For Boolean observables obs we abbreviate obs = true to obs. For instance, ⌈Cl⌉ abbreviates ⌈Cl = true⌉.

Train Plant. The properties of the train plant are expressed in the State Transition Calculus as follows:

Invariants:
□⌈pos• = speed⌉
□⌈speed ≤ vmax⌉

Initial state of the train on the track:
⌈⌉ ∨ ⌈farCr(speed)(pos)⌉; true

The assumption about the initial state is needed for (VC 4).

Assertions about the train movements:
⌈farCr(speed)(pos)⌉ −→ ⌈farCr(speed)(pos) ∨ nearCr(speed)(pos)⌉
⌈nearCr(speed)(pos)⌉ −→ ⌈nearCr(speed)(pos) ∨ inCr(pos)⌉
⌈inCr(pos)⌉ −→ ⌈inCr(pos) ∨ afterCr(pos)⌉
⌈afterCr(pos)⌉ −→ ⌈afterCr(pos) ∨ farCr(speed)(pos)⌉

These assertions establish the necessary properties for reasoning about trajectories for (VC 2) and (VC 3) on the abstract level with the predicates farCr, nearCr, inCr and afterCr.

The speed is stable under sc = Keep:
∀ v ∈ Speed : ⌈speed ≤ v ∧ sc = Keep⌉ −→ ⌈speed ≤ v⌉

Stability of speed directly establishes (VC 7).

Braking:
∀ v ∈ Speed : ⌈speed ≤ v⌉; ⌈sc = Brake⌉_{v/areg} −→ ⌈speed = 0⌉
⌈speed = 0 ∧ Brake⌉ −→ ⌈speed = 0⌉
∀ v ∈ Speed : ↑nearCr(v)(pos); ⌈speed ≤ v⌉_{εmax} −→ ⌈nearCr(v)(pos)⌉
∀ v ∈ Speed : (↑nearCr(v)(pos); ⌈speed ≤ v⌉_{εmax}; ⌈(speed• = −areg) ∨ speed = 0⌉_{v/areg}) −→ ⌈nearCr(v)(pos)⌉

The behavior while braking is needed for the adequacy and feasibility of the recovery actions ((VC 3) and (VC 8)), while the remaining properties support the abstract logical level, insofar as the interaction with the train's movement is concerned.

The train does not move at speed = 0:
∀ p ∈ Position : ⌈speed = 0⌉ ∧ ⌈pos = p⌉; true ⇒ □⌈pos = p⌉

Train Controller. The following characterization of the train controller of Fig. 5 in terms of Duration Calculus formulas is not related to the decomposition proof rule. It only serves to lift reasoning about the automaton to the level of the Duration Calculus.

Initial mode:
⌈⌉ ∨ ⌈Far⌉; true

Mode sequencing:
⌈Far⌉ −→ ⌈Far ∨ Appr⌉
⌈Appr⌉ −→ ⌈Appr ∨ SafeAppr ∨ FailSafe⌉
⌈SafeAppr⌉ −→ ⌈SafeAppr ∨ Far⌉
⌈FailSafe⌉ −→ ⌈FailSafe ∨ SafeAppr⌉

Stabilities:
⌈Far⌉; ⌈farCr(speed)(pos)⌉ ⇒ ⌈Far⌉
⌈Appr⌉; (⌈¬OK⌉ ∧ ℓ < εmax) ⇒ ⌈Appr⌉
⌈SafeAppr⌉; ⌈¬afterCr(pos)⌉ ⇒ ⌈SafeAppr⌉
⌈FailSafe⌉; ⌈¬OK⌉ ⇒ ⌈FailSafe⌉

Transitions upon input events:
⌈Far⌉; ↑nearCr(speed)(pos) −→ ⌈Appr⌉
⌈Appr⌉; ↑OK −→ ⌈SafeAppr⌉
⌈SafeAppr⌉; ↑afterCr(pos) −→ ⌈Far⌉
⌈FailSafe⌉; ↑OK −→ ⌈SafeAppr⌉

Timeout transition:
↑Appr; ⌈Appr⌉ −→^{εmax} ⌈FailSafe⌉

The output variables Req and sc depend on the mode only:
□⌈Far ⇔ ¬Req⌉
□⌈(Far ∨ SafeAppr) ⇔ sc = Normal⌉
□⌈Appr ⇔ sc = Keep⌉
□⌈FailSafe ⇔ sc = Brake⌉

Gate Plant. The following assumptions serve to characterize the trajectories


resulting from the hybrid automaton in terms of the Duration Calculus formulas,
similar to the characterization of the train plant above.
Initial state of the gate:
⌈⌉ ∨ ⌈α = 90⌉; true

Assertions on the dynamics of the gate:
⌈α = 90 ∧ Dir = Up⌉ −→ ⌈α = 90⌉
⌈Dir = Up ∧ ¬Fail⌉ −→^{εmax} ⌈α = 90⌉
⌈α = 0 ∧ Dir = Down⌉ −→ ⌈α = 0⌉
⌈Dir = Down ∧ ¬Fail⌉ −→^{εmax} ⌈α = 0⌉

The output variables Op and Cl depend on the state of the gate only:
□⌈Op ⇔ α = 90⌉
□⌈Cl ⇔ α = 0⌉

Gate Controller. We formalise the properties of the automaton in Fig. 6.

Initial mode:
⌈⌉ ∨ ⌈CrUnsafe⌉; true

Mode sequencing:
⌈CrUnsafe⌉ −→ ⌈CrUnsafe ∨ CloseGate⌉
⌈CloseGate⌉ −→ ⌈CloseGate ∨ CrSafe⌉
⌈CrSafe⌉ −→ ⌈CrSafe ∨ OpenGate⌉
⌈OpenGate⌉ −→ ⌈OpenGate ∨ CrUnsafe⌉

Stabilities:
⌈CrUnsafe⌉; ⌈¬Req⌉ ⇒ ⌈CrUnsafe⌉
⌈CloseGate⌉; ⌈¬Cl⌉ ⇒ ⌈CloseGate⌉
⌈CrSafe⌉; ⌈Req⌉ ⇒ ⌈CrSafe⌉
⌈OpenGate⌉; ⌈¬Op⌉ ⇒ ⌈OpenGate⌉

Transitions upon input events:
⌈CrUnsafe⌉; ↑Req −→ ⌈CloseGate⌉
⌈CloseGate⌉; ↑Cl −→ ⌈CrSafe⌉
⌈CrSafe⌉; ↑¬Req −→ ⌈OpenGate⌉
⌈OpenGate⌉; ↑Op −→ ⌈CrUnsafe⌉

The output variables Dir and OK depend on the mode only:
□⌈(CrUnsafe ∨ OpenGate) ⇔ Dir = Up⌉
□⌈(CloseGate ∨ CrSafe) ⇔ Dir = Down⌉
□⌈CrSafe ⇔ OK⌉

Proof of the Safety Property

The proof of the safety property is sketched by the following timing diagrams, showing interpretations of the time-dependent variables that satisfy the above formulas of the State Transition Calculus.

Fig. 7 shows the normal case, where on the interval in which inCr(pos) holds we have ⌈Cl⌉ as desired. For the adequacy (VC 2) and completeness (VC 4) arguments, we need that the Req signal is sent in time to close the gate. As NEGOTIATING is rather trivial here, and MATCH consists of just one pair of modes, (VC 9) and (VC 10) are rather trivial. Maneuvers are terminated on reaching the noncritical Far again, which, due to the assumption on the separation of crossings, dissatisfies the application condition of correcting maneuvers. The timing diagram in Fig. 7 sketches the combined behavior of controllers and plants, and thus subsumes (one part of) the discrete phase transitions addressed in (VC 1) and the dynamic trajectories from (VC 3).

Fig. 7. Sketch of proof: normal case (timing diagram over pos, speed, the train controller mode with outputs Req and sc, the gate controller mode with outputs Dir and OK, the gate angle α and Fail; pos passes through farCr, nearCr(v), inCr, afterCr and farCr while the gate closes and reopens within εmax)
Fig. 8 shows the failure case, where upon the train's request to enter the crossing the gate continues to show ¬Cl for more than εmax time, and thus the train is prevented from entering the crossing. Here the condition on the definition of nearCr(v)(p) becomes essential in all its parts. It directly translates into a condition Φrec2fss for the train, with resulting proof obligations (VC 3), (VC 7) and (VC 8), which sum up to a maximal delay in the timing diagram from sending Req to coming to a full stop when the gate does not close.

Fig. 8. Sketch of proof: failure case (timing diagram over pos, speed, the train controller mode with outputs Req and sc, the gate controller mode with output Dir, the gate angle α, Fail and OK: the gate fails to close within εmax, so the train controller enters FailSafe and brakes to a standstill while still in nearCr(v))

We may note that, if there were more matching modes, each would require a corresponding timing diagram, enforcing the use of a global diagram fixing the relations between the cases. This argumentation pattern would, by and large, have to follow our proof rule, only that the rule further decomposes the (somewhat informal) timing diagrams into completely formalized pieces, with well-defined interrelations.

6 Conclusion
We have presented an approach to the verification of cooperating traffic agents
and shown its applicability to two radically different examples. Future work
will complement this paper by formally proving the soundness of the proposed
verification rule, formally instantiating this to the examples of this paper, and
by demonstrating how the derived verification conditions can be discharged by
automatic verification methods.

How to Cook a Complete Hoare Logic
for Your Pet OO Language

Frank S. de Boer1,2,3 and Cees Pierik3

1 Centre of Mathematics and Computer Science, The Netherlands
2 Leiden Institute of Advanced Computer Science, The Netherlands
3 Utrecht University, The Netherlands
F.S.de.Boer@cwi.nl, cees@cs.uu.nl

Abstract. This paper introduces a general methodology for obtaining


complete Hoare logics for object-oriented languages. The methodology
is based on a new completeness result of a Hoare logic for a procedu-
ral language with dynamically allocated variables. This new result in-
volves a generalization of Gorelick’s seminal completeness result of the
standard Hoare logic for recursive procedures with simple variables. We
show how this completeness result can be generalized to existing Hoare
logics for typical object-oriented concepts like method calls, sub-typing
and inheritance, and dynamic binding, by transforming an encoding of
these concepts into this procedural language with dynamically allocated
variables.

1 Introduction
The information technology industry is still in need of more reliable techniques that
guarantee the quality of software. This need is recognized by C.A.R. Hoare as
a Grand Challenge for Computing Research in his proposal of the Verifying
Compiler [14].
A verifying compiler resembles a type checker in the sense that it statically
checks properties of a program. The properties that should be checked have to be
stated by the programmer in terms of assertions in the code. The first sketch of
such a tool is given by Floyd [10]. In recent years, several programming languages
(e.g., EIFFEL [16], Java [9]) have been extended to support the inclusion of
assertions in the code. These assertions can be used for testing but they can also
be used as a basis for a proof outline of a program. Hoare logic [13] can be seen as
a systematic way of generating the verification conditions which ensure that an
annotated program indeed constitutes a proof outline. A verifying compiler then
consists of a front-end tool to a theorem prover which checks the automatically
generated verification conditions (see for example [7]).
Obviously, to be useful in practice Hoare logics should, first of all, be sound,
i.e., only correct programs should be provably correct. Conversely, completeness
means that all correct programs are provably correct. Its practical relevance,
apart from providing a firm formal justification, is that in case we cannot prove


our program correct, we know that this is due to incorrectness of our program
(which in practice is most frequently the case). An incomplete proof method
does not allow this simple inference.
Furthermore, to be useful in practice Hoare logics should allow the specifica-
tion and verification of a program at the same level of abstraction as the pro-
gramming language itself. This requirement has some important consequences
for the specification of programs which involve dynamically allocated variables
(e.g., ‘pointers’ or ‘objects’). Firstly, this means that, in general, there is no
explicit reference in the assertion language to the heap which stores the dynam-
ically allocated variables at run-time. Moreover, it is only possible to refer to
variables in the heap that do exist. Variables that do not (yet) exist never play
a role.
The main contribution of this paper is a recipe for complete Hoare logics for
reasoning about closed object-oriented programs at an abstraction level which
coincides with that of object-oriented languages in general. This recipe is based
on the transformational approach as introduced and applied in [18]: we first
present a new completeness result of a Hoare logic for what is basically a proce-
dural language with dynamically allocated variables. This result itself involves a
non-trivial generalization of Gorelick’s basic completeness result of the standard
Hoare logic for recursive procedures with simple variables only ([11]). Then we
show how this completeness result can be further generalized to existing Hoare
logics for typical object-oriented concepts like method calls, sub-typing and in-
heritance by transforming an encoding of these concepts into this procedural
language with dynamically allocated variables.

Proving Completeness
Below we summarize Gorelick’s seminal completeness proof for recursive pro-
cedures with simple variables to clarify the context of the crucial steps in the
enhanced proof.
Consider a simple sequential programming language with (mutually) recur-
sive (parameterless) procedures. Let

p1 ⇐ S1 , . . . , pn ⇐ Sn

be a set of mutually recursive procedure definitions, where p ⇐ S indicates that


the statement S is the body of procedure p. Central to the completeness proof
is the Most General (correctness) Formula (MGF) of a statement S, which is a
formula of the form
{x̄ = z̄}S{SP(S, x̄ = z̄)}.
Here, z̄ = z1 , . . . , zn is a sequence of logical variables (not occurring in state-
ments) that correspond with the program variables x̄ = x1 , . . . , xn of S. The
precondition x̄ = z̄, which abbreviates the conjunction of the equalities xi = zi ,
for i = 1, . . . , n, states that the (initial) values of the program variables are stored
in the corresponding logical variables. For this reason, these logical variables are
called ‘freeze’ variables. The postcondition SP(S, x̄ = z̄) describes the strongest
postcondition of the statement S with respect to the precondition x̄ = z̄.
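As a small concrete illustration (our own example, not part of Gorelick's development): for a single program variable x and the statement S ≡ x := x + 1, the most general formula is

{x = z} x := x + 1 {x = z + 1},

since x = z + 1 expresses the strongest postcondition of x := x + 1 with respect to the precondition x = z.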

The completeness proof in [11] shows that every valid correctness formula
{P }S{Q} can be derived if we first prove the most general correctness formula
{x̄ = z̄}pi {SP(pi , x̄ = z̄)} of every procedure pi . In the sequel, we denote this
formula by Φi (for i = 1, . . . , n). The above claim is proved by induction on
the complexity of S. The most interesting case where S equals pi , for some
i = 1, . . . , n, requires an application of the invariance axiom:

{P [z̄/x̄]}pi {P [z̄/x̄]}.

Here P [z̄/x̄] denotes the result of replacing the program variables by their
corresponding logical variables. By the conjunction rule this is combined with
Φi to obtain
{P [z̄/x̄] ∧ x̄ = z̄}pi {P [z̄/x̄] ∧ SP(pi , x̄ = z̄)}.
From the validity of the given correctness formula {P }pi {Q} and the defini-
tion of the strongest postcondition one then proves the validity of the assertion

P [z̄/x̄] ∧ SP(pi , x̄ = z̄) → Q.

This formula and the consequence rule allow us to replace the postcondition
by Q. By applying a substitution rule and the rule of consequence we finally
change the precondition to P and obtain {P }pi {Q}.
The final step in the proof consists of deriving the MGF of every procedure
call pi (that is, Φi ) in the logic by means of the recursion rule

F1 , . . . , Fn ⊢ {P1 }S1 {Q1 }, . . . , {Pn }Sn {Qn }
─────────────────────────────────────────────────
F1

Here, Fi denotes {Pi }pi {Qi }, for i = 1, . . . , n. Since

{x̄ = z̄}Si {SP(pi , x̄ = z̄)}

(with Si the body of pi ) is by definition a valid correctness formula, we have as


a particular case of the above, its derivability from Φ1 , . . . , Φn . The above rule
then allows us to conclude Φi , for i = 1, . . . , n, thus finishing the completeness
proof.
Our generalization of the above requires an intricate analysis of some general
characteristics of correctness proofs for a procedural language with dynamically
allocated variables. First, we have to formulate an assertion corresponding to
x̄ = z̄ which ‘freezes’ an initial state. In our setting, a state describes the heap
which cannot be captured statically by a finite number of variables. Next, we ob-
serve that the above invariance axiom as such is also not valid because the
creation of new variables (or objects in object-oriented jargon) affects the scope
of quantified logical variables. Therefore, creation of new variables in general
will affect the validity of an assertion even if it does not refer to program vari-
ables at all. Summarizing, dynamically allocated variables basically require a
proof-theoretical reconsideration of the concepts of initialization and invariance.

Below, the operator op is an element of Op, the language-specific set of op-


erators, and m is an arbitrary identifier.

Table 1. The syntax of the programming language

e ∈ Expr    ::= y | op(e1 , . . . , en )
y ∈ Var     ::= u | e.x
s ∈ SExpr   ::= new C() | m(e1 , . . . , en )
S ∈ Stat    ::= y = e ; | y = s ; | while (e) { S } | S S | if (e) { S } else { S }
meth ∈ Meth ::= m(u1 , . . . , un ) { S return e ; }
π ∈ Prog    ::= class∗

The results of the transformational approach then justify our conclusion that
Hoare logic for object-oriented programs basically boils down to a formalization
of dynamically allocated variables.

Plan of the Paper


This paper is organized in the following way. Sections 2 and 3 introduce a proce-
dural programming language with dynamically allocated variables and the asser-
tion language. In Section 4 we outline the corresponding Hoare logic. Section 5
contains the completeness proof. In Section 6 we introduce the transformational
approach and sketch briefly some of its applications to typical object-oriented
concepts. In the last section we draw some conclusions.

2 A Procedural Language with Dynamically Allocated


Variables
The syntax of the basic strongly typed programming language considered in this
paper is described in Table 1. We only consider the built-in types boolean and
int. We assume given a set C of class names, with typical element C. The set
of (basic) types T is the union of the set {int, boolean} and C. In the sequel, t
will be a typical element of T .
We assume given a set TVar of typed temporary (or local) variables and a set
IVar of typed instance variables. Instance variables belong to a specific instance
of a class and store its internal state. Temporary variables belong to a method
and last as long as this method is active. A method’s formal parameters are also
temporary variables. We use u and x as typical elements of the set of temporary
variables and the set of instance variables, respectively. We denote by [[u]] and [[x]]
the (static) type of the temporary variable u and instance variable x, respectively.
A variable y is either a temporary variable or a navigation expression e.x. A
navigation expression e.x, with C the static type of e, refers to the value of the
variable x of the instance e of the class C.
A program is a finite set of classes. A class defines a finite set of procedures
(or methods in object-oriented jargon). A method m consists of its formal pa-

rameters u1 , . . . , un , a statement S, and an expression e without side effects


which denotes the return value. We have the usual assignments y = e without
side effect and y = s with side effect. A variable y of type C, for some class C,
can be dynamically allocated by means of an assignment y = new C() which
involves the creation of an instance of class C. An assignment y = m(e1 , . . . , en )
involves a call of method m with actual parameters e1 , . . . , en .
Observe that our language does not support user-defined constructor methods. An
expression like new C() therefore invokes the default constructor, which assigns
to all instance variables their default values.
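To fix intuition, the following fragment (our own illustration, not taken from the paper; it is written in Java, whose concrete syntax the language of Table 1 closely resembles) exercises the main constructs: instance variables, a method with formal parameters, object creation, a navigation expression and a result expression.

    class Node {
        int val;                  // instance variables (IVar) of class Node
        Node next;

        // meth ::= m(u1, ..., un) { S return e ; }
        Node prepend(Node head, int u) {
            Node n;               // temporary (local) variable (TVar)
            n = new Node();       // y = new C(): creation; fields get default values
            n.val = u;            // assignment to a navigation expression e.x
            n.next = head;
            return n;             // side-effect-free result expression
        }
    }

A call such as list = prepend(list, 5) then corresponds to an assignment y = m(e1 , e2 ) with side effect.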

2.1 Semantics
In this section, we will only describe the overall functionality of the semantics
of the presented language because this suffices to understand the completeness
proof.
Each instance of a class (or object in object-oriented jargon) has its own
identity. For each class C ∈ C we introduce therefore an infinite set dom(C) of
identities of instances of class C, with typical element o. For different classes
these sets of object identities are assumed to be disjoint. By O we denote the
union of the sets dom(C). For t = int, boolean, dom(t) denotes the set of
integer and boolean values, respectively.
A configuration σ (also called the heap) is a partial function that maps each
existing object to its internal state (a function that assigns values to the instance
variables of the object). Formally, σ is an element of the set
Σ = O ⇀ ∏_{x ∈ IVar} dom([[x]]).

Here ∏_{i ∈ I} denotes a generalized cartesian product over an index set I. In
this way, σ(o), if defined, denotes the internal state of an object, i.e., σ(o)(x)
denotes the value of the instance variable x of the object o. It is not defined for
objects that do not exist in a particular configuration σ. Thus the domain of
σ specifies the set of existing objects. We will only consider configurations that
are consistent. We say that a configuration is consistent if no instance variable
of an existing object refers to a non-existing object.
On the other hand, a temporary context γ specifies the values of the tempo-
rary variables of the executed method: γ is a typical element of the set
Γ = ∏_{u ∈ TVar} dom([[u]]).

The value of a temporary variable u is denoted by γ(u).


A state is a pair (σ, γ), where the temporary context γ is required to be
consistent with the configuration σ, which means that all temporary variables
refer to objects that exist in σ.
Given a set of class definitions the semantics of statements is given by the
(strict) function:
S : Stat → (Σ × Γ )⊥ → (Σ × Γ )⊥ ,

such that S(S)(σ, γ) = (σ′, γ′) indicates that the execution of S in the configuration
(σ, γ) terminates in the configuration (σ′, γ′). Divergence is denoted by ⊥.
A compositional characterization of S – which is fairly standard – can be given
following [5].
In this paper we only need the following semantic definition of a method call
y = m(e1 , . . . , en ), with method body S. We have that

S(y = m(e1 , . . . , en ))(σ, γ) = (σ′′, γ′′)

if
S(S)(σ, γ0 ) = (σ′, γ′)

where
– γ0 results from γ by assigning to the formal parameters of m the values of
the actual parameters e1 , . . . , en in (σ, γ);
– (σ′′, γ′′) results from (σ′, γ) by assigning the value of the result expression
e in (σ′, γ′) to the variable y (which either will affect γ, in case of a local
variable, or σ′, in case of a navigation expression).

3 The Assertion Language


In this section we present an assertion language for the specification of properties
of dynamically evolving object structures. We want to reason about these struc-
tures on an abstraction level that is at least as high as that of the programming
language. In more detail, this means the following:
– The only operations on “pointers” (references to objects) are
• testing for equality
• dereferencing (looking at the value of an instance variable of the refer-
enced object)
– In a given state of the system, it is only possible to mention the objects that
exist in that state. Objects that do not (yet) exist never play a role.
The above restrictions have quite severe consequences for the proof system.
The limited set of operations on pointers implies that first-order logic is too
weak to express some interesting properties of pointer structures like reachability.
Therefore we have to extend our assertion language to make it more expressive.
We will do so by allowing the assertion language to reason about finite sequences
of objects.
The set of logical expressions in the assertion language is obtained by sim-
ply extending the set of programming language expressions without side effect
(Expr) with logical variables. Logical variables are used for quantification and
for reasoning about the constancy of program expressions. In the sequel, z will
be a typical element of the set of logical variables LVar. Logical variables can
also have a sequence type t∗ , for some t ∈ T . In such a case, its value is a finite
sequence of elements of type t.

The syntax of the assertion language is summarized as follows.

l ∈ LExpr ::= u | z | l.x | if l0 then l1 else l2 fi | op(l1 , . . . , ln )


P, Q ∈ Ass ::= l1 = l2 | ¬P | P ∧ Q | ∃z : t(P )

(we omit typing information: e.g., the expression l in l.x should refer to an object
of some class and only boolean expressions can be used as assertions).
The conditional expression is introduced in order to reason about aliases, i.e.,
variables which refer to the same object. To reason about sequences we assume
the presence of notations to express the length of a sequence (denoted by |l|)
and the selection of an element of a sequence (denoted by l[n], where n is an
integer expression). More precisely, we assume in this paper that the elements of
a sequence are indexed by 1, . . . , n, for some integer value n ≥ 0 (the sequence is
of zero length, i.e., empty, if n = 0). Accessing a sequence with an index which
is out of its bounds results in the value ⊥.
Only equations are allowed as basic assertions. This allows us to remain
within the realm of a two-valued boolean logic, as explained in more detail
below.
As stated above, quantification over finite sequences allows us to specify
interesting properties of the heap. For example, that two objects hd and tl are
connected in a linked list. This is asserted by the formula

∃ z : C ∗ (z[1] = hd ∧ z[|z|] = tl ∧
∀ i : int(1 ≤ i ∧ i < |z| → z[i].next = z[i + 1])).

Here z is a logical variable ranging over finite sequences of objects in class C


and next is an instance variable of class C.
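Operationally, the property expressed by this formula is reachability of tl from hd along the next fields. The following Java sketch (our own illustration, not part of the assertion language or the paper) checks it at run time, also for cyclic structures:

    import java.util.HashSet;
    import java.util.Set;

    class C {
        C next;

        // Run-time counterpart of the assertion: is there a finite chain of
        // next-links (the sequence z of the formula) leading from hd to tl?
        static boolean connected(C hd, C tl) {
            Set<C> visited = new HashSet<>();   // guards against cycles
            for (C cur = hd; cur != null && visited.add(cur); cur = cur.next) {
                if (cur == tl) {
                    return true;                // cur plays the role of z[|z|]
                }
            }
            return false;
        }
    }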
Logical expressions are evaluated in a configuration σ, a temporary context γ,
and a logical environment ω, which assigns values to the logical variables. The
logical environment should also be consistent with the configuration σ in the
sense that no logical variable should refer to a non-existing object. The resulting
value is denoted by L(l)(σ, γ, ω). By L(l)(σ, γ, ω) = ⊥ we indicate that the value
of the expression is undefined (e.g., dereferencing the null-pointer, division by
zero, etc.). The definition of L is rather standard and therefore omitted.
The resulting boolean value of the evaluation of an assertion P is denoted by
A(P )(σ, γ, ω). We define

A(l1 = l2 )(σ, γ, ω) = true if L(l1 )(σ, γ, ω) = L(l2 )(σ, γ, ω), and false otherwise.

Assuming the constant nil for denoting the value ⊥, we can assert that
the value of an expression e is undefined simply by the equation e = nil.
As already explained above, a formula ∃z : C(P ) states that P holds for
an existing instance of class C. Thus the quantification domain of a variable
depends not only on the (static) type of the variable but also dynamically on the
configuration. Similarly, a formula ∃z : C ∗ (P ) states the existence of a sequence
of existing objects (in class C).

The notation σ, γ, ω |= P means that A(P )(σ, γ, ω) = true. An assertion


P is valid, denoted by |= P , if σ, γ, ω |= P for every state (σ, γ), and logical
environment ω which are consistent w.r.t. the existing objects of σ, i.e., instance
variables of existing objects, temporary variables and logical variables refer only
to existing objects.
It is worthwhile to observe that quantification over finite sequences implies
that the logic is not compact [8] (and as such transcends the realm of first-order
logic) because we can express that there exists a finite number of objects (in a
given class C): The assertion

∃z : C ∗ ∀v : C∃n : int(v = z[n])

states that there exists a finite sequence that stores all (existing) objects in class
C.
Furthermore, we observe that the domain of discourse of our assertion lan-
guage consists only of the built-in data types of the integers and booleans, and
the program variables, i.e., it excludes the methods. This allows for a clear sep-
aration of concerns between what a program is supposed to do as expressed by
the assertion language and how this is implemented as described by its methods.

4 The Proof System


Given a set of class definitions, correctness formulas are written in the usual form
{P }S{Q}, where P and Q are assertions and S is a statement. We say that a cor-
rectness formula {P }S{Q} is true w.r.t. a state (σ, γ) and a logical environment
ω, written as σ, γ, ω |= {P }S{Q}, if σ, γ, ω |= P and S(S)(σ, γ) = (σ′, γ′) implies
σ′, γ′, ω |= Q. This corresponds with the standard partial correctness interpre-
tation of correctness formulas. By |= {P }S{Q}, i.e., the correctness formula
{P }S{Q} is valid, we denote that σ, γ, ω |= {P }S{Q}, for every state (σ, γ)
and logical environment ω which are consistent. Finally, ⊢ {P }S{Q} denotes
derivability of {P }S{Q} in the logic.
We only discuss the rules for method invocations in detail. The axioms for the
other statements of our basic language are already introduced in, for example,
[7]. Here, we will only give a brief description of these axioms. They all involve
substitution operations that compute the weakest precondition of a statement.
By a standard argument this implies that we can derive all valid specifications
of such statements. For this reason, we focus on reasoning about method invo-
cations in this paper.
For an assignment to temporary variables, we have the usual axiom

{P [e/u]} u = e {P },

where P [e/u] denotes the result of replacing every occurrence of the temporary
variable u in P by the expression e. Soundness of this axiom is stated by the
following theorem.

Theorem 1. We have

σ, γ, ω |= P [e/u] if and only if σ, γ′, ω |= P ,

where γ′ results from γ by assigning L(e)(σ, γ, ω) to u.

Proof. Standard induction on the complexity of P .
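As a small worked instance of this axiom (our own example): for P ≡ (u = 5) and e ≡ u + 1 it yields

{u + 1 = 5} u = u + 1 {u = 5},

whose precondition is equivalent to u = 4, as expected.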

We have a similar theorem for the substitution of logical variables.

Theorem 2. We have

σ, γ, ω |= P [e/z] if and only if σ, γ, ω′ |= P ,

where ω′ results from ω by assigning L(e)(σ, γ, ω) to z.

In the axiom

{P [e′/e.x]} e.x = e′ {P }

for assignments to instance variables the substitution operation [e′/e.x] differs
from the standard notion of structural substitution. An explicit treatment of
possible aliases is required for this type of assignment. Possible aliases of the
variable e.x are expressions of the form l.x: After the assignment it is possible
that l refers to the object denoted by e, so that l.x is the same variable as e.x
and should be substituted by e′. It is also possible that, after the assignment, l
does not refer to the object denoted by e, and in this case no substitution should
take place. Since we cannot decide between these possibilities by the form of the
expression only, a conditional expression is constructed which decides “dynam-
ically”. Let l′ denote the expression l[e′/e.x]. If [[l]] = [[e]], i.e., the expressions l
and e refer to objects of the same class, we define (l.x)[e′/e.x] inductively by

if l′ = e then e′ else l′.x fi

In case the types of the expressions l and e do not coincide, i.e., they do not
refer to objects of the same class, aliasing does not occur, and we can simply
define (l.x)[e′/e.x] inductively by l′.x.
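For a small worked example of this substitution (ours, not from the paper), let u and v be temporary variables of the same class type whose class has an integer instance variable x. Then

(u.x = 0)[5/v.x] ≡ (if u = v then 5 else u.x fi) = 0,

so the corresponding instance of the axiom is {(if u = v then 5 else u.x fi) = 0} v.x = 5 {u.x = 0}: the precondition can only hold if u and v are not aliases and u.x is already 0.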
Soundness of the above axiom for assignments to instance variables is stated
by the following theorem.

Theorem 3. We have that

σ, γ, ω |= P [e′/e.x] if and only if σ′, γ, ω |= P ,

where σ′(o)(x) = L(e′)(σ, γ, ω), for o = L(e)(σ, γ, ω), and in all other cases σ
agrees with σ′.

Proof. Induction on the complexity of P (for details see [21]).



In [7] we also discuss the substitution operation [new C/u], which computes
the weakest precondition of an assignment u = new C() involving a temporary
variable u. The definition of this substitution operation is complicated by the
fact that the newly created object does not exist in the state just before its
creation, so that in this state we cannot refer to it! We are, however, able to
carry out the substitution due to the fact that the variable u can essentially
occur only in a context where either one of its instance variables is referenced,
or it is compared for equality with another expression. In both of these cases
we can predict the outcome without having to refer to the new object. Another
complication dealt with by this substitution operation is the changing scope of
a bound occurrence of a variable z ranging over objects in class C which is
induced by the creation of a new object (in class C). For example, we define
∃z : C(P )[new C/u] by
∃z : C(P [new C/u]) ∨ (P [u/z][new C/u]).
The first disjunct ∃z : C(P [new C/u]) represents the case that P holds for
an ‘old’ object (i.e. which exists already before the creation of the new object)
whereas the second disjunct P [u/z][new C/u] represents the case that the new
object itself satisfies P . Since a logical variable does not have aliases, the sub-
stitution [u/z] consists of simply replacing every occurrence of z by u.
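As a small worked example (our own): for P ≡ ∃z : C(z = u) we obtain

(∃z : C(z = u))[new C/u] ≡ ∃z : C((z = u)[new C/u]) ∨ (u = u)[new C/u] ≡ ∃z : C(false) ∨ true ≡ true,

since a logical variable z ranging over the objects existing before the creation can never equal the newly created object, whereas u = u trivially holds for it. This matches the intuition that after u = new C() there certainly exists an object equal to u.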
Given this substitution we have the following axiom for object creation in-
volving a temporary variable:
{P [new C/u]} u = new C() {P }.
Soundness of this axiom is stated by the following theorem.
Theorem 4. We have
σ, γ, ω |= P [new C/u] if and only if σ′, γ′, ω |= P ,
where σ′ is obtained from σ by extending the domain of σ with a new object o
and initializing its instance variables. Furthermore, the resulting local context γ′
is obtained from γ by assigning o to the variable u.
Proof. The proof proceeds by induction on the complexity of P (for the details
we refer to [21]).
Observe that an assignment e.x = new C() can be simulated by the sequence
of assignments u = new C(); e.x = u. Therefore we have the axiom
{P [u/e.x][new C/u]} e.x = new C() {P },
where u is a fresh temporary variable which does not occur in P and e. Note
that the substitution [u/e.x] makes explicit all possible aliases of e.x. Soundness
of this axiom follows from the above.
It is worthwhile to observe that aliasing and object creation are phenomena
characteristic of dynamically allocated variables. Furthermore, the above substi-
tutions provide a formal account of aliasing and object creation at an abstraction

level which coincides with that of the programming language, e.g., without any
explicit reference to the heap.
We now turn to method invocations. Suppose we have in class C a method
m(u1 , . . . , un ) { S return e }. The following rule for method invocation (MI)
allows us to derive a correctness specification of a call y = m(e1 , . . . , en ) from a
correctness specification of the body S of m.

{P }S{Q[e/return]}     Q[f̄/z̄] → R[return/y]
──────────────────────────────────────────────  (MI)
{P [ē/ū][f̄/z̄]} y = m(e1 , . . . , en ) {R}

Here we do not allow temporary variables in Q. Except for the formal pa-
rameters u1 , . . . , un , no other temporary variables are allowed in P .
The simultaneous substitution [ē/ū] models the assignment of the actual
parameters ē = e1 , . . . , en to the formal parameters ū = u1 , . . . , un .
The substitution [e/return] applied to the postcondition Q of S in the first
premise models a (virtual) assignment of the result value to the logical variable
return, which must not occur in the assertion R. The substitution [return/y]
applied to the postcondition R of the call models the actual assignment of the
return value to y. It corresponds to the usual notion of substitution if y is
a temporary variable. Otherwise, it corresponds to the substitution operation
[e /e.x] for reasoning about aliases.
It is worthwhile to observe that the use of the additional return variable
allows us to resolve possible clashes between the temporary variables of the post-
condition R of the call and those possibly occurring in the return expression e.
As a typical example, consider the object-oriented keyword this, which denotes
the currently active object, as an additional parameter of each method. If a
method returns the identity of the callee, then in the postcondition R[this/y]
of the caller the variable this would refer to the caller, not the callee.
Next, we observe that a temporary expression f of the caller generated by
the following abstract syntax

f ::= u | op(f1 , . . . , fn )

is not affected by the execution of S by the receiver. A sequence of such ex-


pressions f¯ can be substituted by a corresponding sequence of logical variables
z̄ of exactly the same type in the specification of the body S. Thus the rule re-
flects the fact that the values of such expressions do not change by the execution
of the call. Without these substitutions one cannot prove anything about the
temporary state of the caller after the method invocation.
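To see the role of these substitutions in a minimal example of our own (not taken from the paper), suppose m is a parameterless method whose body S is the single assignment w = 0, for a temporary variable w, with return expression 0, and suppose we want to derive {u = 3} v = m() {u = 3} for temporary variables u and v distinct from each other and from w. Choosing P ≡ Q ≡ (z = 3) for a fresh logical variable z and f ≡ u, the premises of (MI) become

{z = 3} w = 0 {(z = 3)[0/return]}   and   (z = 3)[u/z] → (u = 3)[return/v],

the first of which is derivable by the assignment axiom and the second of which is trivially valid; the conclusion {(z = 3)[u/z]} v = m() {u = 3}, i.e., {u = 3} v = m() {u = 3}, is exactly the desired triple. Without the substitution [f̄/z̄] the caller-side fact u = 3 could not be transported across the call, since temporary variables may not occur in Q.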
The generalization of the rule for non-recursive method invocations to one for
(mutually) recursive method invocations follows the standard pattern. The idea
is to assume correctness of the specification of the method call while proving
the correctness formula of the body. In case of mutually recursive methods, we
may assume correctness of the specification of several method calls. But this also
forces us to prove the corresponding specifications of the bodies of all these calls.

The rule schema for mutually recursive method invocations (MRMI) is as


follows.

I1 , . . . , Ik     A1 , . . . , Ak ⊢ F1 , . . . , Fk
──────────────────────────────────────────────  (MRMI)
A1
where
– Ai is a specification of a call, i.e., for i = 1, . . . , k, Ai denotes a correctness
formula
{Pi [ēi /ūi ][f¯i /z̄i ]} yi = mi (ēi ) {Ri }
with ūi the formal parameters of method mi and ēi a corresponding sequence
of actual parameters;
– Fi is a correctness formula
{Pi } Si {Qi [ei /returni ]}
with Si the body of method mi and ei its return expression;
– Ii denotes the implication
Qi [f̄i /z̄i ] → Ri [returni /yi ]
which relates the postconditions of the call and the body.
The syntactical restrictions of rule (MI) also apply to all assertions in this
rule.

5 Completeness
In this section, we will prove (relative) completeness of the logic [4] (soundness
follows from a standard inductive argument). That is, given a finite set of class
definitions, we prove that |= {P }S{Q} implies ⊢ {P }S{Q}, assuming as addi-
tional axioms all valid assertions. We do so following the structure of the proof
that is given in the introduction. In the following sections we give solutions for
each of the issues that were mentioned in the introduction.
First we show how to formulate an assertion that freezes the initial state. Sub-
sequently, we give an enhanced version of the invariance axiom. As a counterpart
of the substitution [z̄/x̄] in the invariance axiom, we define a substitution that
modifies an assertion in such a way, that its validity is not affected by execution
of any object-oriented program. In particular, it is immune to object-creation.
Next, we show that our technique for freezing the initial state leads to Most
General Formulas about method calls that enable us to derive any valid correct-
ness formula. Finally, we show how to derive these MGFs for method calls.
Definition 1. Our completeness proof is based on the following standard se-
mantic definition of the strongest postcondition SP(S, P ) as the set of triples
(σ, γ, ω) such that for some initial state (σ′, γ′) we have S(S)(σ′, γ′) = (σ, γ)
and σ′, γ′, ω |= P .
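As a small concrete instance (our own example): for the assignment u = u + 1 and the precondition u = z, the strongest postcondition SP(u = u + 1, u = z) is expressed by the assertion u = z + 1.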

It can be shown in a straightforward although rather tedious manner that


SP(S, P ) is expressible in the assertion language (see [6, 23]). The main idea fol-
lowed in [6] is based on a logical encoding of states as described in the next section
which heavily relies on the presence of quantification over finite sequences.
At several points in the completeness proof, we have to refer to the set M (S)
of method calls that might be executed as a result of executing S. This set M (S)
is defined as the smallest set which satisfies:

– y = m(e1 , . . . , en ) ∈ M (S) if y = m(e1 , . . . , en ) occurs in S


– M (S′) ⊆ M (S) if y = m(e1 , . . . , en ) ∈ M (S), where S′ is the body of
method m.

5.1 Freezing the Initial State


The first problem is to freeze the initial state which consists of the internal
states of the existing objects and the values of the temporary variables. The set
of existing objects is not statically given. Therefore we store the existing objects
of a class C ‘dynamically’ in a logical variable seqC of type C ∗ . For storing the
values of instance variables, we introduce for each class C and instance variable

x ∈ IVar a corresponding logical variable Θ(C, x) of type [[x]]∗ such that the
value of the instance variable x of an existing object of class C is stored in the
sequence denoted by Θ(C, x) at the position of the object in seqC . Storing the
initial values of the temporary variables of the active object in logical variables
is straightforward. We simply introduce a logical variable Θ(u) of type [[u]], for
each temporary variable u. Figure 1 pictures the relation between the heap and
its representation in the logical state (here C represents the sequence of existing
objects in class C and X a sequence storing all the values of the instance variable
x).
The above encoding of a configuration can be captured logically by a finite
(we assume given a finite set of class definitions) conjunction of the following
assertions:

– ∀z : C(∃i(z = seqC [i]))


which states that the sequence seqC stores all existing objects of class C.
– ∀i(seqC [i].x = Θ(C, x)[i])
which states that for each position in seqC the value of the instance variable
x of the object at that position is stored in the sequence Θ(C, x) at the same
position.
– u = Θ(u)
which simply states that the value of u is stored in Θ(u).

The resulting conjunction we denote by init.


In the remainder of this section, we will work towards a new invariance axiom
that replaces the old axiom

{P [z̄/x̄]}S{P [z̄/x̄]}.

Fig. 1. Freezing the Initial State (relating the heap to its encoding in the logical state by the sequences seqC and Θ(C, x))

We start by introducing a counterpart of the substitution [z̄/x̄]. Note that


this substitution replaces program variables by their logical counterparts as in-
troduced above. We therefore want to extend the mapping Θ to assertions such
that it replaces all references to program variables by their logical counterparts.
Since in general we cannot statically determine the position of an object of class
C in the sequence seqC we ‘skolemize’ the assertion ∀z : C(∃i(z = seqC [i])) by
introducing for every class C a function posC (l) which satisfies the assertion
∀z : C(z = seqC [posC (z)]).
Note that in practice assertion languages should allow the introduction of
user-defined functions and predicates.
We have the following main case of the transformation Θ on logical expres-
sions l (using postfix notation).
(l.x)Θ ≡ Θ(C, x)[posC (lΘ)]
where C = [[l]]. As a simple example we have that (u.x)Θ yields the expression
Θ(C, x)[posC (Θ(u))], where C is the static type of the temporary variable u.
Quantification requires additional care when extending Θ to assertions be-
cause the scope of the quantifiers in general will be affected by object creation.

This can be solved by restricting quantification to the objects that exist in the
initial state. That is, by restricting quantification to objects in seqC . Therefore
we introduce the following bounded form of quantification:

(∃z : C(P ))Θ ≡ ∃z : C(z ∈ seqC ∧ P Θ)


(∃z : C ∗ (P ))Θ ≡ ∃z : C ∗ (z ⊆ seqC ∧ P Θ)

The assertions z ∈ seqC and z ⊆ seqC stand for ∃i(z = seqC [i]) and
∀i∃j(z[i] = seqC [j]), respectively. Note that the assertion ∃i(z = seqC [i]) states that the
object denoted by z appears in the sequence denoted by seqC . The assertion
∀i∃j(z[i] = seqC [j]) states that all objects stored in the sequence denoted by z
are stored in the sequence seqC . We thus restrict quantification to the sequences
seqC . (For quantification over basic types or sequences of basic types we simply
have (∃zP )Θ = ∃z(P Θ).)
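As a small worked example (ours): for the assertion ∃z : C(z.x = u) the transformation yields

(∃z : C(z.x = u))Θ ≡ ∃z : C(z ∈ seqC ∧ Θ(C, x)[posC (z)] = Θ(u)),

an assertion that no longer mentions program variables and whose quantifier is bounded by seqC , so that its truth cannot be affected by assignments or by the creation of new objects of class C.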
The transformation Θ is truth-preserving in the following manner.

Theorem 5. For every assertion P , and any state (σ, γ) and logical environ-
ment ω which are consistent, we have that σ, γ, ω |= init implies σ, γ, ω |= P iff
σ, γ, ω |= P Θ.

Proof. By structural induction on P .

The following theorem states that we can prove that P Θ is an invariant for
every statement S and assertion P . Thus this theorem can replace the invariance
axiom.

Theorem 6. For any statement S and assertion P we have ⊢ {P Θ} S {P Θ}.

Proof. The main idea behind the proof is that P Θ is immutable under all substitu-
tions in the proof rules, since the transformation Θ replaces all program variables
by logical variables. Let P ′ denote P Θ.
Since P ′ does not contain any program variables, clearly we have that P ′[e/u]
and P ′[e′/e.x] both equal (syntactically) P ′, for any substitutions [e/u] and
[e′/e.x].
Moreover, since all quantification in P ′ is bounded we have that P ′[new C/u]
is logically equivalent to P ′: By the soundness of the axiom for object creation
(as stated in Theorem 4) we have that σ, γ, ω |= P ′[new C/u] if and only if
σ′, γ′, ω |= P ′, where σ′ results from adding to the domain of σ a new instance of
class C and γ′ results from assigning the identity of this newly created instance
to the temporary variable u. Since u does not appear in P ′, it follows that
σ′, γ′, ω |= P ′ if and only if σ′, γ, ω |= P ′. Finally, since the newly created object
in class C does not appear in ω(seqC ) (note that by definition ω(seqC ) may refer
only to objects existing in σ) and all quantification in P ′ involving this newly
created object is restricted to ω(seqC ), it follows that σ′, γ, ω |= P ′ if and only if
σ, γ, ω |= P ′.

5.2 Most General Correctness Formula


In the previous section, we described a way to freeze the initial state of an
object-oriented program. The information about the initial state was captured
by the formula init. With this formula we can define the most general correctness
formula (MGF) of an object-oriented statement. It is the formula
{init}S{SP(S, init)}.
In this section, we will show that assuming that the MGF holds for every
method invocation in the program suffices to derive every valid correctness for-
mula, as stated by the following theorem.
Theorem 7. If |= {P }S{Q} we have
A1 , . . . , Ak ⊢ {P }S{Q},
where Ai = {init}Si {SP (Si , init)}, for i = 1, . . . , k, and M (S) = {S1 , . . . , Sk }.
The proof proceeds by structural induction on S. The theorem holds for
assignments y = e and y = new C() because the soundness proofs of their axioms in fact
show that the corresponding substitutions compute the weakest precondition.
Complex control structures are treated in a standard manner ([3]). The following
lemma describes the most interesting case of a method call.
Lemma 1. For every call S ≡ y = m(e1 , . . . , en ) we have
|= {P }S{Q} implies {init}S{SP(S, init)} ⊢ {P }S{Q}
Proof. For technical convenience only, we assume that P and Q do not contain
free occurrences of the logical variables Θ(u) and Θ(C, x) (otherwise we first
have to rename them). By Theorem 6 we have ⊢ {P Θ}S{P Θ}. An application
of the conjunction rule gives us the correctness formula

⊢ {P Θ ∧ init}S{P Θ ∧ SP(S, init)}.

Our next step is to prove

|= P Θ ∧ SP(S, init) → Q.

Let σ, γ, ω |= P Θ ∧ SP(S, init). By the definition of SP there exists an initial
configuration (σ0 , γ0 ) such that both σ0 , γ0 , ω |= init and S(S)(σ0 , γ0 ) = (σ, γ)
hold.
Next, we show that σ0 , γ0 , ω |= ¬P Θ leads to a contradiction, in order to obtain
σ0 , γ0 , ω |= P Θ. So let σ0 , γ0 , ω |= ¬P Θ. By Theorem 6 we have ⊢ {¬P Θ}S{¬P Θ}.
Soundness of the proof system ensures that we also have |= {¬P Θ}S{¬P Θ}.
This implies that σ, γ, ω |= ¬P Θ, which contradicts an earlier assumption.
By σ0 , γ0 , ω |= P Θ and Theorem 5 we arrive at σ0 , γ0 , ω |= P . Since |=
{P }S{Q} we conclude that σ, γ, ω |= Q. So we can proceed by an application of
the consequence rule to obtain

⊢ {P Θ ∧ init}S{Q}.

Next, we can apply the standard rules which allow one to replace in the
precondition every logical variable Θ(u) by the temporary variable u and to ex-
istentially quantify all the logical variables Θ(C, x) (x an arbitrary instance
variable in an arbitrary class C). Finally, we existentially quantify the variables
seqC , for every C. It is not difficult to prove that the assertion P logically im-
plies the resulting precondition. Therefore an application of the consequence rule
finishes the proof.
The final step in the completeness proof is to show that we can derive the
correctness formula
{init}S{SP(S, init)}
for any assignment S ≡ y = m(e1 , . . . , en ) that involves a method call.
Theorem 8. For any assignment S ≡ y = m(e1 , . . . , en ) we have

⊢ {init}S{SP(S, init)}.

Proof. We use the following notational conventions in this proof. Let

M (S) = {S1 , . . . , Sk },

with Si ≡ yi = mi (ei1 , . . . , eini ), for i = 1, . . . , k. Let S0 denote y = m(e1 , . . . , en ).


Let Ai , for i = 0, . . . , k, denote the correctness formula

{init}Si {SP(Si , init)}.

Let Si′ , for i = 0, . . . , k, denote the body of mi and let ei′ be its return
expression. The expression ēi abbreviates the sequence ei1 , . . . , eini . We denote
the formal parameters ui1 , . . . , uini of method mi by ūi .
We must prove ⊢ {init}S{SP(S, init)}. By the rule (MRMI) and Theorem 7
it suffices to find, for i = 1, . . . , k, valid correctness formulas

{Pi } Si′ {Qi [ei′ /returni ]}, (1)

such that
|= init → Pi [ēi /ūi ][f¯i /z̄i ], (2)
and
|= Qi [f¯i /z̄i ] → SP(Si , init)[returni /yi ], (3)

for some substitutions [f¯i /z̄i ]. Observe that Theorem 7 requires us to prove
validity, not derivability. The substitutions [f¯i /z̄i ] involve temporary expressions
f¯i and corresponding logical variables satisfying the conditions of rule (MRMI).
Obvious candidates for the assertions Pi and Qi are the formulas init and
SP(Si , init)[returni /yi ], respectively. However, for (2) and (3) to be valid we
introduce a renaming function Φ which transforms any temporary variable u
into a (new) logical variable Φ(u) of the same type. Note that applying Φ to any
assertion effectively neutralizes the passing of the actual parameters. That is,

(P Φ)θ ≡ P Φ
for every assertion P and substitution θ which only transforms temporary
variables. So candidates for Pi and Qi are initΦ and SP(Si , init)[returni /yi ]Φ,
respectively. Using the inverse Φ−1 of Φ for [f¯i /z̄i ], we trivially obtain (2) and (3).
However, to prove the validity of the correctness formulas in (1) we have to
strengthen initΦ with additional information about the actual parameters spec-
ified by the call Si . After the invocation of the method the formal parameters
should have the values of the actual parameters. This information is given by
the conjunction of the equations uij = (eij Φ), for j = 1, . . . , ni . Observe that (2)
still holds because
(uij = (eij Φ))[ēi /ūi ]Φ−1 yields eij = eij .
Summarizing, for the assertion Pi defined by

initΦ ∧ (ui1 = (ei1 Φ)) ∧ · · · ∧ (uini = (eini Φ))

we can now prove the validity of the correctness formula


{Pi } Si′ {Qi [ei′ /returni ]}.
The proof proceeds as follows: let σ, γ, ω |= Pi and S(Si′ )(σ, γ) = (σ′, γ′). We
extend this computation of the method body Si′ to one of the call Si : we define
the initial local context γ0 of the call Si by assigning ω(Φ(u)) to every temporary
variable u, i.e., γ0 (u) = ω(Φ(u)). From σ, γ, ω |= initΦ we derive σ, γ0 , ω |= init
(formally this follows from Theorem 2). Furthermore, from σ, γ, ω |= uij = (eij Φ)
it follows that the value of the actual parameter eij , with j = 1, . . . , ni , in the
configuration (σ, γ0 ) of the call equals the value of the corresponding formal pa-
rameter uij in the configuration (σ, γ) of the method body (formally this follows
from Theorem 1).
The final state of the call (σ′′, γ0′ ) can be obtained by assigning the value
of the result expression ei′ to the variable yi . We must consider two options,
because yi can be either a navigation expression or a temporary variable. If yi
is a navigation expression of the form e.x, we have γ0′ = γ0 , i.e., the initial local
context of the call is not affected by the return, and σ′′ is obtained from σ′ by
assigning the value of the return expression ei′ in the state (σ′, γ′) to the instance
variable x of the object denoted by e. If yi is a temporary variable u, we define
σ′′ = σ′, i.e., the return does not affect the heap, and γ0′ is obtained from γ0 by
assigning the value of the return expression ei′ to the temporary variable u.
From the semantics of method calls as described in Section 2.1 it follows that
S(Si )(σ, γ0 ) = (σ′′, γ0′ ).
Since σ, γ0 , ω |= init, we have by Definition 1 that σ′′, γ0′ , ω |= SP(Si , init).
Next, let ω′ be obtained from ω by assigning the result value, i.e., the value
of yi in the configuration (σ′′, γ0′ ) of the call, to the logical variable returni . It

follows that σ′′, γ0′ , ω′ |= SP(Si , init)[returni /yi ]. Observe that the truth of the
assertion SP(Si , init)[returni /yi ]Φ does not depend on the local context or the
variable yi due to the substitutions [returni /yi ]Φ. So we have

σ′, γ′, ω′ |= SP(Si , init)[returni /yi ]Φ.

Finally, because of the definition of ω′ (note that ω′(returni ) equals the value
of the return expression ei′ in state (σ′, γ′)) we conclude

σ′, γ′, ω |= SP(Si , init)[returni /yi ]Φ[ei′ /returni ].

6 The Transformational Approach


In this section we introduce a general methodology for generalizing our complete-
ness result to more advanced object-oriented concepts. Our methodology is based
on the transformational approach introduced in [18]. Given an object-oriented
extension L+ of our very simple object-oriented language L, this transforma-
tional approach consists of the definition of a translation

T : L+ → L

such that for every statement S in L+ we have

|= {P }S{Q} iff |= {P }T (S){Q},

for every precondition P and postcondition Q. A complete Hoare logic for L+


thus can be derived by transforming the translation T into corresponding proof
rules.
Let us apply this approach first to the proof rule MI for method calls. Consider
the translation of statements in our language L which transforms every object-
oriented method call
e0 .m(e1 , . . . , en )
into the procedural call
m(e0 , e1 , . . . , en ),
where the typical object-oriented concept of this is simply viewed as an addi-
tional formal parameter of the method m. We thus obtain the following new rule
for typical object-oriented method invocations y = e0 .m(e1 , . . . , en ).

{P }S{Q[e/return]}     Q[f̄/z̄] → R[return/y]
──────────────────────────────────────────────  (MI’)
{P [ē/ū][f̄/z̄]} y = e0 .m(e1 , . . . , en ) {R}

where [ē/ū] abbreviates the substitution [e0 , e1 , . . . , en /this, u1 , . . . , un ]. The


syntactic restrictions of rule MI also apply to this rule MI’.

Next we apply the transformational approach to dynamic binding (in the


context of inheritance and sub-typing as defined in Java [12]). In general, dy-
namic binding destroys the static connection between a method call and the
implementation that is bound to the call. This implies that every implementa-
tion that might possibly be bound to a call must satisfy the specification of the
call. That is the most significant change that has to be made to the rule for
reasoning about method calls. In order to obtain such a rule we first add to
each class its inherited methods. Moreover, in order to avoid name clashes, we
rename every method of a class C by C.m. We extend the assertion language
with for every class C a monadic predicate C(e), which holds if and only if the
run-time type of the object denoted by e is C.
In the context of dynamic binding of method we then translate every call
e0 .m(e1 , . . . , en ) into the (nested) conditional statement:
if C1 (e0 )
then e0 .C1 .m(e1 , . . . , en )
else if C2 (e0 )
     then e0 .C2 .m(e1 , . . . , en )
          ...
          if Ck (e0 )
          then e0 .Ck .m(e1 , . . . , en )
          fi
          ...
     fi
fi
where {C1 , . . . , Ck } are all the subclasses of C1 and C1 is the (static) type of the
expression e0 . This translation forms the basis of the existing logical formaliza-
tions of dynamic binding as described, for example, in [1] and [22]. We refer to
[20] for the details of a transformation of this translation into a corresponding
proof rule for the given method call e0 .m(e1 , . . . , en ).
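In Java terms the translation amounts to an explicit case distinction on the run-time class of the receiver. The following sketch (our own illustration; the helper methods c1m and c2m are hypothetical stand-ins for the renamed, statically bound methods C1.m and C2.m) shows the pattern for a superclass C1 with a single subclass C2:

    class C1 { }
    class C2 extends C1 { }

    class Dispatch {
        // Hypothetical stand-ins for the renamed methods C1.m and C2.m.
        static int c1m(C1 self, int u) { return u; }
        static int c2m(C2 self, int u) { return u + 1; }

        // Translation of y = e0.m(e1): dispatch by case distinction on the
        // run-time class of e0. instanceof approximates the exact-class
        // predicates Ci(e0), so the most specific class is tested first.
        static int call(C1 e0, int e1) {
            if (e0 instanceof C2) {
                return c2m((C2) e0, e1);
            } else {
                return c1m(e0, e1);
            }
        }
    }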
Of particular interest is an application of the transformational approach to
Hoare logics for various object-oriented concurrency models. We are already
working on a completeness proof of a Hoare logic for a simple extension with
shared-variable concurrency of our object-oriented language: We assume that
each class contains a so-called start-method denoted by start. The body of the
start method is executed by a new thread and the calling thread continues its
own execution. Like in Java we assume that the start method can be called upon
an object only once (consequently the number of threads is less than or equal
to the number of existing objects). A thread is formally defined as a stack of
temporary configurations of the form (S, γ), where S denotes the statement to
be executed and γ specifies the values of the temporary variables in S and the
identity of the active object (denoted by this).
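In Java itself this convention is realized by the Thread API: the body of run() plays the role of the body of the start method, and Thread.start() may be invoked on a given object at most once, a second invocation throwing an IllegalThreadStateException. A minimal sketch (our own illustration):

    class Worker extends Thread {
        // The body of the authors' start method: it is executed by the newly
        // spawned thread, while the caller of start() continues on its own.
        @Override
        public void run() {
            System.out.println("running in " + Thread.currentThread().getName());
        }

        public static void main(String[] args) {
            Worker w = new Worker();
            w.start();       // spawns a new thread executing run()
            // w.start();    // a second call would throw IllegalThreadStateException
        }
    }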
The assertions in an annotated program describe the top configurations of the
existing threads. The verification conditions underlying the Hoare logic presented
in Section 4 describe the sequential flow of control of a thread.

In order to characterize the interference between different threads we assume


that each method has a formal parameter thread which is used to identify the
executing thread. Every method invocation not involving the start method is
assumed to be of the form
e0 .m(thread, e1 , . . . , en ).
That is, we pass the identity of the thread to which the caller belongs to the
callee. On the other hand, the invocation of a start method is assumed to be of
the form
e0 .start(e0 , e1 , . . . , en ),
because this invocation starts a new thread originating from the callee denoted
by e0 .
We can now generalize the interference freedom test introduced in [19] to
threads as follows. Let P be the precondition of an assignment y := e and
P ′ be the precondition of a statement S. We define P ′ to be invariant over
the execution of the assignment y := e by a different thread if the following
verification condition holds:

(P ∧ P ′′ ∧ thread ≠ thread′ ) → P ′′ [e/y],

where P ′′ is obtained from P ′ by replacing every temporary variable u by a fresh
u′ and this by this′ , in order to avoid name clashes between the temporary
variables and the keyword this in P and P ′ . Note that thread is assumed to be
a temporary variable too and is therefore also renamed to thread′ .
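As a small worked instance (our own example): let the assignment be u.x := 1 with precondition P ≡ true, and let the statement S of another thread have precondition P ′ ≡ (this.x = 0), so that P ′′ ≡ (this′ .x = 0), where this′ and u are of the same class. The verification condition reads

(true ∧ this′ .x = 0 ∧ thread ≠ thread′ ) → (if this′ = u then 1 else this′ .x fi) = 0,

which is violated precisely when the two threads may refer to the same object (this′ = u); in that case the assignment invalidates the other thread's precondition and the annotation has to be strengthened.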
This simple extension of our Hoare logic for shared variable concurrency
will form the basis of an application of the transformational approach to the
concurrency model of Java involving threads and coordination via reentrant
synchronization monitors (as described in [2]).

7 Conclusions
In recent years, many formal analysis techniques for object-oriented program-
ming languages have been proposed. The large amount of interest can be ex-
plained by the wide-spread use of object-oriented languages. However, the for-
mal justification of many existing Hoare logics for object-oriented programming
languages is still under investigation. Notably, the problem of completeness has until
now defied clear solutions. For example, the logic of object-oriented programs as
given by Abadi and Leino was not complete [1]. They suspect that their use of
a “global store” model is the source of the incompleteness.
Von Oheimb showed that his logic for reasoning about a substantial subset
of Java is (relatively) complete [24]. However, he uses the logic of Isabelle/HOL
(higher order logic) as specification language. This trivializes the matter of ex-
pressiveness of the intermediate assertions. Therefore his result does not auto-
matically carry over to assertion languages that are closer to the programming
language. This point is further discussed in [17].

Future work concerns first of all the further development of the transforma-
tional approach to Hoare logics of object-oriented programming. Of particular
interest is an application of the transformational approach to Hoare logics for
various object-oriented concurrency models.
Another interesting line of research concerns the systematic development of
compositional Hoare logics for object-oriented programs. Note that the Hoare
logic presented in this paper is compositional only with respect to the flow of
control structures. However, it is not compositional in the sense of being class-based.
Also our Hoare logic does not provide any formalization of the basic concept of
an object as a unit of data encapsulation. We think that such a logic requires
a formalization of the external observable behavior of an object in terms of
its traces of events which indicate the sending and reception of messages (as
described for example in [15]). It is worthwhile to observe that such a notion
of external observable behavior (which abstracts from the internal state space
of an object) in fact involves a concurrency view even if the object-oriented
programming language is purely sequential.
Finally, we remark that most existing logics for object-oriented programs deal
with closed programs. Currently we are also investigating trace semantics as a
possible semantic basis of Hoare logics for open object-oriented programs.

References
1. M. Abadi and R. Leino. A logic of object-oriented programs. In M. Bidoit
and M. Dauchet, editors, TAPSOFT ’97: Theory and Practice of Software Devel-
opment, 7th International Joint Conference CAAP/FASE, Lille, France, volume
1214, pages 682–696. Springer-Verlag, 1997.
2. E. Abraham-Mumm, F. de Boer, W. de Roever, and M. Steffen. Verification for
Java’s reentrant multithreading concept. In Proc. of FoSSaCS 2002, volume 2303
of LNCS, pages 5–20, 2002.
3. K. R. Apt. Ten Years of Hoare’s Logic: A Survey - Part I. ACM Transactions on
Programming Languages and Systems, 3(4):431–483, October 1981.
4. S. A. Cook. Soundness and completeness of an axiom system for program verifi-
cation. Siam Journal of Computing, 7(1):70–90, February 1978.
5. J. de Bakker. Mathematical theory of program correctness. Prentice-Hall, 1980.
6. F. de Boer. Reasoning about dynamically evolving process structures. PhD thesis,
Vrije Universiteit, 1991.
7. F. de Boer and C. Pierik. Computer-aided specification and verification of anno-
tated object-oriented programs. In B. Jacobs and A. Rensink, editors, FMOODS
V, pages 163–177. Kluwer Academic Publishers, 2002.
8. H.-D. Ebbinghaus and J. Flum. Finite Model Theory. Springer-Verlag, 1995.
9. C. Flanagan, K. R. M. Leino, M. Lillibridge, G. Nelson, J. B. Saxe, and R. Stata.
Extended static checking for Java. In Proceedings of the ACM SIGPLAN 2002
Conference on Programming Language Design and Implementation (PLDI), pages
234–245, 2002.
10. R. W. Floyd. Assigning meaning to programs. In Proc. Symposium on Applied
Mathematics, volume 19, pages 19–32. American Mathematical Society, 1967.

11. G. Gorelick. A complete axiomatic system for proving assertions about recursive
and non-recursive programs. Technical Report 75, Dep. Computer Science, Univ.
Toronto, 1975.
12. J. Gosling, B. Joy, and G. Steele. The Java Language Specification. Addison-
Wesley, 1996.
13. C. A. R. Hoare. An axiomatic basis for computer programming. Communications
of the ACM, 12(10):576–580, 1969.
14. T. Hoare. Assertions. In M. Broy and M. Pizka, editors, Models, Algebras and Logic
of Engineering Software, volume 191 of NATO Science Series, pages 291–316. IOS
Press, 2003.
15. A. Jeffrey and J. Rathke. A fully abstract may testing semantics for concurrent
objects. In Proceedings of Logics in Computer Science, pages 101–112, 2002.
16. B. Meyer. Eiffel: The Language. Prentice-Hall, 1992.
17. T. Nipkow. Hoare logics for recursive procedures and unbounded nondeterminism.
In J. Bradfield, editor, Computer Science Logic (CSL 2002), volume 2471 of LNCS,
pages 103–119. Springer, 2002.
18. E.-R. Olderog and K. R. Apt. Fairness in parallel programs: The transformational
approach. TOPLAS, 10(3):420–455, 1988.
19. S. Owicki and D. Gries. An axiomatic proof technique for parallel programs I.
Acta Informatica, 6:319–340, 1976.
20. C. Pierik and F. S. de Boer. A syntax-directed Hoare logic for object-oriented
programming concepts. In E. Najm, U. Nestmann, and P. Stevens, editors, Formal
Methods for Open Object-Based Distributed Systems (FMOODS) VI, volume 2884
of LNCS, pages 64–78, 2003.
21. C. Pierik and F. S. de Boer. A syntax-directed Hoare logic for object-oriented
programming concepts. Technical Report UU-CS-2003-010, Institute of Informa-
tion and Computing Sciences, Utrecht University, The Netherlands, March 2003.
Available from http://www.cs.uu.nl/research/techreps/UU-CS-2003-010.html.
22. A. Poetzsch-Heffter and P. Müller. A programming logic for sequential Java. In
S. D. Swierstra, editor, ESOP ’99, volume 1576 of LNCS, pages 162–176, 1999.
23. J. Tucker and J. Zucker. Program correctness over abstract data types with error-
state semantics. North-Holland, 1988.
24. D. von Oheimb. Hoare logic for Java in Isabelle/HOL. Concurrency and Compu-
tation: Practice and Experience, 13(13):1173–1214, 2001.
Behavioural Specification for Hierarchical
Object Composition

Răzvan Diaconescu

Institute of Mathematics “Simion Stoilow”, PO-Box 1-764,


Bucharest 014700, Romania
Razvan.Diaconescu@imar.ro

Abstract. Behavioural specification based on hidden (sorted) algebra


constitutes one of the most promising recently developed formal specifi-
cation and verification paradigms for system development.
Here we formally introduce a novel concept of behavioural object
within the hidden algebra framework. We formally define several object
composition operators on behavioural objects corresponding to the hi-
erarchical object composition methodology introduced by CafeOBJ. We
study their basic semantical properties and show that our most gen-
eral form of behavioural object composition with synchronisation has
final semantics and a composability property of behavioural equivalence
supporting a high reusability of verifications. We also show the commu-
tativity and the associativity of parallel compositions without synchro-
nisation.

1 Introduction
The current Internet/Intranet technologies have led to an explosive increase in
demand for the construction of reliable distributed systems. Among the new
technologies proposed for meeting this new technological challenge, component-
based software engineering is one of the most promising. If we have an adequate
set of components and a good design pattern, a system development process
may become easier and the quality of the product may be greatly improved.
However such development process raises some serious problems. How can we
get an adequate set of components or how can we know the components we get
are adequate for our systems?
A good solution seems to be given by formal specifications supporting the
following characteristics:
– can specify the interface of components,
– can specify the behaviour of components,
– supports a precise semantics of composition, and
– be executable or have tools supporting testing and verification.
Here we adopt the behavioural algebraic specification framework [1, 2, 3, 4].
Due to its simple logical foundations and to its efficient specification and verifi-


cation methodologies, behavioural algebraic specification provides a good frame-


work for such formal specifications. Informally, behavioural algebraic specifica-
tion describes both data types and states of abstract machines by axioms based
on strict equality, in the case of the data types, and behavioural equality (i.e.
equality under ‘observations’ to data), in the case of the states of abstract ma-
chines. Implementations of behavioural specifications are formalised as (many
sorted) ‘hidden’ algebras interpreting two kinds of sorts as sets, one kind for the
data types, another kind for the states of the abstract machines, and interpreting
operations as functions. The operations which determine the behavioural equiv-
alence between states are specified as special ‘behavioural’ operations. Such a
‘hidden’ algebra is a model of a specification when all sentences (axioms) of the
specification are valid for that algebra.
The work around the CafeOBJ algebraic specification language [5, 6] has
proposed a hierarchical object composition methodology (see [5, 7, 8]) based
on behavioural specification. The behavioural specification paradigm is reflected
rather directly in the definition of CafeOBJ, this being maybe the most distinctive
feature of this language among other modern algebraic specification languages
such as CASL [9] or Maude [10].
Here we formally define the novel concept of behavioural object within the
hidden algebra framework, which is the logic of CafeOBJ behavioural specifi-
cation. Informally, a behavioural object is just a special kind of behavioural
specification which emphasises a special ‘hidden’ sort for the space of the states
of the object and special operations for modelling method invocations and at-
tributes. One of the most important novel related concepts introduced is that of
equivalence between behavioural objects, which plays an important role in the
study of the semantical properties of hierarchical object composition.
This concept is the basis for a precise definition of several types of compo-
sition operators on behavioural objects, such as parallel composition (without
synchronisation), dynamic composition (in which component objects gets cre-
ated and deleted dynamically), and composition with synchronisation generalis-
ing both the former operators. Informally, these composition operators are based
on specifications of projections from the state space of the compound object to
the state spaces of the components. Our definitions give mathematical founda-
tions for the corresponding methodological definitions of object composition in
[5, 11, 8]. Our composition operators support hierarchical composition processes
in the sense that the result of a composition is still a behavioural object which
can be therefore used in another composition.
Our framework permits a clear formulation of semantical properties of the
composition operators, such as associativity and commutativity, and final
semantics (i.e. the existence of final composition models). We show that the
basic parallel composition operator is commutative and associative (modulo ob-
ject equivalence). For the general composition with synchronisation operator we
prove a compositionality result for the behavioural equivalence relation, a result
which constitutes the foundation for automation of the verification process at the
level of a compound object (see [5, 11, 8]), and the existence of final semantics.

The paper is structured as follows. The first section recalls briefly the basic
mathematical notions necessary for this work. We present first general algebra
notions, and then we give a very brief overview of hidden algebra. At the end
of this section we define the concept of behavioural object. The next section
introduces briefly the CafeOBJ notation for behavioural objects, which will be
used for the examples, and which is illustrated with two simple examples. The
main section of the paper is dedicated to the composition operators and contains
their mathematical definitions together with their main semantic properties. All
composition operators are illustrated with examples. The final section develops
an example showing how the compositionality of behavioural equivalence result
for compositions with synchronisation can be applied for reusing verifications
within the framework of the CafeOBJ system.
Here we only give the statements of the mathematical results and omit their
proofs. These will appear in an extended full version of the paper.

Acknowledgement
The author is grateful to all people who contributed to the development of
the CafeOBJ object composition methodology based on projection operations,
especially to Shusaku Iida and Kokichi Futatsugi.

2 The Logic of Behavioural Specification


The semantics of behavioural specification is based on hidden algebra [2, 4,
3] which is a refinement of general many-sorted algebra. Although the hidden
algebra formalism accommodates well (and even gets more power from) the
order-sorted approach (see [12]), for reasons of simplicity of presentation, we
develop all formal definitions and results in a many-sorted framework.

2.1 General Algebra


We review here the basic general algebra concepts, notations, and terminology,
which constitute now the folklore of algebraic specification.
Given a sort set S, an S-indexed (or sorted ) set A is a family {As }s∈S of sets
indexed by the elements of S. In this context, a ∈ A means that a ∈ As for some
s ∈ S. Similarly, A ⊆ B means that As ⊆ Bs for each s ∈ S, and an S-indexed (or
sorted ) function f : A → B is a family {fs : As → Bs }s∈S . Also, we let S ∗ denote
the set of all finite sequences of elements from S, with [] the empty sequence.
Given an S-indexed set A and w = s1 ...sn ∈ S ∗ , we let Aw = As1 × · · · × Asn ; in
particular, we let A[] = {∗}, some one point set. Also, for an S-sorted function
f : A → B, we let fw : Aw → Bw denote the function product mapping a tuple
of elements (a1 , . . . , an ) to the tuple (fs1 (a1 ), . . . , fsn (an )).
A (n S-sorted ) signature (S, F ) is an S ∗ × S-indexed set F = {Fw→s | w ∈
S ∗ , s ∈ S} of operation symbols. Note that this definition permits overload-

ing, in that the sets Fw→s need not be disjoint. Call σ ∈ F[]→s (sometimes
denoted simply F→s ) a constant symbol of sort s. A signature morphism ϕ from
a signature (S, F ) to a signature (S′, F′) is a pair (ϕsort , ϕop ) consisting of a
map ϕsort : S → S′ of sorts and of a map ϕop : F → F′ on operation symbols
such that ϕop_{w→s}(Fw→s ) ⊆ F′_{(ϕsort)∗(w)→ϕsort(s)}, where (ϕsort )∗ : S ∗ → S′∗ is the
extension of ϕsort to strings.
A (S, F )-algebra A consists of an S-indexed set A and a function Aσ : Aw →
As for each σ ∈ Fw→s ; the set As is called the carrier of A of sort s. If σ ∈ F→s
then Aσ determines a point in As which may also be denoted Aσ . An (S, F )-
homomorphism from one (S, F )-algebra A to another B is an S-indexed function
h : A → B such that hs (Aσ (a)) = Bσ (hw (a)) for each σ ∈ Fw→s and a ∈ Aw .
A (S, F )-homomorphism h : A → B is a (S, F )-isomorphism if and only if
each function hs : As → Bs is bijective (i.e., one-to-one and onto, in an older
terminology). The category (class) of algebras of a signature (S, F ) is denoted
Alg(S, F ).
An F -congruence on a (S, F )-algebra A is an S-sorted family of relations, ≡s on
As , each of which is an equivalence relation, and which also satisfy the congruence
property, that given any σ ∈ Fw→s and any a, a′ ∈ Aw , then Aσ (a) ≡s Aσ (a′)
whenever a ≡w a′.1
Given a signature morphism ϕ : (S, F ) → (S′, F′) and a (S′, F′)-algebra A′,
we can define the reduct of A′ to (S, F ), denoted A′↾ϕ , or simply A′↾(S,F ) when
ϕ is an inclusion of signatures, to have carriers A′ϕ(s) for s ∈ S, and to have
operations (A′↾ϕ )σ = A′ϕ(σ) for σ ∈ F . Then A′ is called an expansion of A′↾ϕ
along ϕ. Reducts can also be easily extended to algebra homomorphisms.
For any signature (S, F ) and (S, F )-algebra A, let (S, FA ) be the elementary
extension of (S, F ) via A which adds the elements of A as new constants, i.e.
(FA )→s = F→s ∪ As and (FA )w→s = Fw→s when w is not empty. Let AA
denote the expansion of A to (S, FA ) interpreting each element of A by itself,
i.e. (AA )a = a for each a ∈ A.
Any F -term t = σ(t1 . . . tn ), where σ ∈ Fw→s is an operation symbol and
t1 , . . . , tn are F -(sub)terms corresponding to the arity w, gets interpreted as an
element At ∈ As in a (S, F )-algebra A by At = Aσ (At1 . . . Atn ). When each
element a of A can be denoted as a = At for some term t, then we call A a
reachable algebra.
For any set X of new constants, called variables, the (F ∪ X)-terms can be
regarded as F -derived operations by defining the arity ar(t) by the following
procedure:
– consider the set var(t) ⊆ X of all variables occurring within t,
– transform var(t) into a string by fixing an arbitrary order on this set, and
– finally, replace the variables in the string previously obtained by their sorts.
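For instance (a throw-away illustration with operation symbols σ and τ of suitable ranks, not used later), if x has sort s1 and y has sort s2, then for t = σ(x, τ (y, x)) we have var(t) = {x, y}; fixing the order x, y yields ar(t) = s1 s2, so t denotes a derived operation of arity s1 s2.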
Any F -derived operation t with arity w and sort s determines a function
At : Aw → As such that for each string a of elements corresponding to ar(t),
At (a) is the evaluation of t in the expansion of A to F ∪ X which interprets the
variables of X by the corresponding elements of a.
1
Meaning ai ≡si a′i for i = 1, ..., n, where w = s1 . . . sn , a = (a1 , . . . , an ), and
a′ = (a′1 , . . . , a′n ).
An F -context c[z] is any F -term c with a marked variable z occurring only
once in c.
Given a signature (S, F ), the set of (S, F )-sentences is the least set of sen-
tences containing the (quantifier-free) equations and which is closed under log-
ical connectives and quantification. An equation is an equality t = t′ between
F -terms t and t′. For ρ1 and ρ2 any (S, F )-sentences, let ρ1 ∧ ρ2 be their conjunc-
tion which is also a (S, F )-sentence. Other logical connectives are the disjunction,
implication, negation, etc. For any set X of variables for a signature (S, F ), then
(∀X)ρ is a (S, F )-sentence for each (S, F ∪ X)-sentence ρ. Similar definition can
be applied to the existential quantification.
Given an algebraic signature morphism ϕ : (S, F ) → (S′, F′), each (S, F )-
sentence ρ can be translated to a (S′, F′)-sentence ρ′, denoted ϕ(ρ), by replacing
any symbol of (S, F ) from ρ by its corresponding symbol from (S′, F′) given by
ϕ.2
The satisfaction between algebras and sentences is the Tarskian satisfaction
defined inductively on the structure of sentences. Given a fixed arbitrary signa-
ture (S, F ) and an (S, F )-algebra A,

– A |= t = t′ if At = At′ for equations,


– A |= ρ1 ∧ ρ2 if A |= ρ1 and A |= ρ2 and similarly for the other logical
connectives, and
– for each (S, F ∪ X)-sentence ρ, A |= (∀X)ρ if A′ |= ρ for each expansion A′ of
A along the signature inclusion (S, F ) → (S, F ∪ X).
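As a small illustrative instance (a toy signature invented here and not used later), take S = {Nat}, a constant 0 and a binary operation + in F, and let A interpret Nat as the natural numbers with the usual zero and addition. Then A |= (∀{x}) x + 0 = x, since every expansion A′ of A along (S, F ) → (S, F ∪ {x}) satisfies A′x+0 = A′x + 0 = A′x.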

2.2 Hidden Algebra


Hidden algebra (abbreviated HA) is the logical formalism underlying behavioural
specification. It extends ordinary general algebra with sorts representing ‘states’
of objects, or abstract machines, rather than data elements and also introduces
a new satisfaction between algebras and sentences, called ‘behavioural satisfac-
tion’. In the literature there are several versions of hidden algebra, with only
slight technical differences between them [2, 3, 4]. In the following we review the
basic concepts of hidden algebra.
A hidden algebraic signature (H, V, F, F b ) consists of a set H of hidden sorts,
a set V of ordinary visible sorts, a set F of (H ∪V )-sorted operation symbols, and
a distinguished subset F b ⊆ F of behavioural operations. Behavioural operations
are required to have at least one hidden sort in their arity. An operation symbol
which has visible arity and sort is called data operation.
From an object-oriented methodological perspective, the hidden sorts denote
sets of ‘states of objects’, the visible sorts denote data types, and the operations
σ ∈ (F b )w→s can be thought of as 'methods' whenever w has exactly one hidden sort

2
In the particular case of quantifications, notice that this changes the sorts of the
variables.

and s is hidden also, and as ‘attributes’ whenever w has exactly one hidden
sort and s is visible. This object-oriented interpretation of behavioural logic will
be formally clarified in the section below by the introduction of the concept of
‘behavioural object’.
A (H, V, F, F b )-algebra is just an (H ∪ V, F )-algebra. A homomorphism of
hidden algebras h : A → B for a signature (H, V, F, F b ) is just a (H ∪ V, F )-
algebra homomorphism preserving the behavioural equivalence, i.e. such that
h(∼A ) ⊆∼B .
Given a (H, V, F, F b )-algebra A, a hidden congruence ∼ on A is just an
F b -congruence which is identity on the visible sorts. The largest hidden F b -
congruence ∼A on A is called behavioural equivalence. The following is probably
the most fundamental result in hidden algebra, providing the foundations for
the so-called ‘coinduction’ proof method.
Theorem 1. Behavioural equivalence always exists.
Hence in order to prove by coinduction that two elements are behaviourally
equivalent it is enough to prove that they are congruent3 for some arbitrarily
but conveniently chosen hidden congruence.
An operation symbol σ is coherent for an algebra A when it preserves the
behavioural equivalence, i.e. Aσ (a) ∼A Aσ (a′) whenever a ∼A a′ (possibly
component-wise).
A hidden algebra signature morphism ϕ : (H, V, F, F b ) → (H′, V′, F′, F′b ) is
a signature morphism (H ∪ V, F ) → (H′ ∪ V′, F′) such that
– ϕ(V ) ⊆ V′ and ϕ(H) ⊆ H′,
– ϕ(F b ) = F′b and ϕ−1 (F′b ) ⊆ F b .

These conditions say that hidden sorted signature morphisms preserve vis-
ibility and invisibility for both sorts and operations, and the object-oriented
intuition behind the inclusion F′b ⊆ ϕ(F b ) is the encapsulation of classes (in

the sense that no new ‘methods’ or ‘attributes’ can be defined on an imported


class)4 . However, this last inclusion condition applies only to the case when sig-
nature morphisms are used as module imports (the so-called horizontal signature
morphisms); when they model specification refinement this condition might be
dropped (this case is called vertical signature morphism).
Algebra reducts along hidden algebra signature morphisms are instances of
the ordinary general algebra reducts along algebraic signature morphisms.
Given a hidden algebraic signature (H, V, F, F b ), a behavioural equation t ∼ t′
consists of a pair of F -terms of the same sort. An (H, V, F, F b )-algebra A satisfies
it, i.e. A |= t ∼ t′, when At ∼A At′ .
Full first order behavioural sentences are constructed from strict and be-
havioural equations by iteration of logical connectives and first order quantifica-
tion in a way similar to the case of ordinary general algebra.

3
In a hidden congruence relation.
4
For the model theoretic relevance of this condition see [1].

A behavioural presentation (Σ, E) consists of a hidden algebraic signature Σ


and a set E of Σ-sentences. A presentation morphism ϕ : (Σ, E) → (Σ′, E′) is
just a signature morphism ϕ : Σ → Σ′ such that E′ |= ϕ(E).5
An operation symbol σ is coherent with respect to a presentation (Σ, E)
when it is coherent in each algebra of the presentation.

2.3 Behavioural Objects


The definition below introduces the novel concept of ‘behavioural object’ as a
stylised way of structuring a hidden algebra based behavioural specification,
so that it models the behaviour of objects from object-oriented programming.
Notice that our hidden algebra approach to object semantics is more general than
corresponding co-algebraic approaches (such as [13], for example) because of the
much greater specification power of hidden algebras, which is due to the smooth
integration between state and data types, and to the possibility of operations
with multiple hidden sorts in the arity.
Definition 1. A behavioural object B is a pair consisting of a behavioural pre-
sentation ((HB , VB , FB , FBb ), EB ) and a hidden sort hB ∈ HB such that each be-
havioural operation in FBb is monadic, i.e. it has only one hidden sort in its arity.
The hidden sort hB denotes the states of B. The visible sorted behavioural
operations on hB are called B-observations and the hB -sorted behavioural oper-
ations on hB are called B-actions. The hB -sorted operations with visible sorted
arity are called constant states.6
For the sake of simplifying notation, without loss of generality, we may as-
sume that the arity of any behavioural operation of an object is of the form hw
with h hidden sort.
A derived behavioural operation is any derived operation of the form σ(τ, t)
such that σ is a behavioural operation or coherent with respect to the behavioural
object and τ is a variable or a derived behavioural operation. Notice that ordinary
behavioural operations can be regarded as special cases of derived behavioural op-
erations.
The following expresses the fact that any object is essentially defined by its
state space, its actions, and the behavioural equivalence between the states.
Definition 2. For any behavioural object B, a B-algebra is just an algebra for
the signature of B satisfying the sentences EB of the presentation of the object
B. The category of B-algebras is denoted by Alg(B).
Two B-algebras A and A′ are equivalent, denoted A ≡ A′, when
– AhB = A′hB and ∼A = ∼A′ (on the sort hB ), and
– Aσ = A′σ for each B-action σ.

Definition 3. Two behavioural objects B and B′ are equivalent, denoted B ≡
B′, when there exists a pair of functors Φ : Alg(B) → Alg(B′) and Ψ : Alg(B′) →
Alg(B) such that A ≡ Ψ (Φ(A)) for each B-algebra A and A′ ≡ Φ(Ψ (A′)) for each
B′-algebra A′.
5
Any hidden algebra satisfying E′ satisfies ϕ(E) too.
6
They should be considered as parameterised by the data arguments of the arity.
Therefore behavioural objects are equivalent when they admit the same ‘im-
plementations’. Notice that this defines indeed an equivalence relation and that
isomorphic objects are equivalent.
We may also extend the concept of reduct of algebras from behavioural pre-
sentation morphisms to behavioural object morphisms.

3 The CafeOBJ Notation


CafeOBJ (whose definition is given by [5] and logical foundations in [6]) is a
modern successor of the OBJ [14] language incorporating several new major
developments in algebraic specification theory and practice. CafeOBJ is aimed
both at researchers and (industry) practitioners.
Behavioural specification might be the most distinctive feature of CafeOBJ
within the broad family of algebraic specification languages. This is incorporated
into the design of the language in a rather direct way.
CafeOBJ methodologies introduce a graphical notation extending the classical
ADJ-diagram notation for data types to behavioural objects in which
G1. Sorts are represented by ellipsoidal disks with visible (data) sorts represented
in white and hidden (state) sorts represented in grey, and
G2. Operations are represented by multi-source arrows with the monadic part
from the hidden sort thickened in case of actions and observations.
As an example let us consider the signature of a bank account behavioural object
ACCOUNT which uses a pre-defined data type of natural numbers (represented
in this figure by the sort Nat and in the equations below by the operation _+_ ):

[Signature diagram of ACCOUNT: hidden sort Account (grey) with constant state init, actions deposit and withdraw (taking a Nat argument), and observation balance into Int; visible sorts Int and Nat (white).]

and with the following universally quantified equations


eq balance(init-account) = 0 .
eq balance(deposit(A, N)) = balance(A) + N .
ceq balance(withdraw(A, N)) = balance(A) - N if N <= balance(A) .
ceq balance(withdraw(A, N)) = balance(A) if N > balance(A) .
Notice that the last two equations are conditional, i.e. they are universally
quantified implications between a condition and an equation. In this example the
conditions are just binary relations, however they can be regarded as equations
between Boolean-valued terms. More generally, a condition of a conditional equa-
tion can be thought of as a quantifier-free formula formed from equations by iterative
applications of conjunction, disjunction, and negation.
We can also easily prove that in any ACCOUNT-algebra A,

a ∼A a′ if and only if Abalance (a) = Abalance (a′)
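For the non-trivial direction this can be established by coinduction; a minimal CafeOBJ proof score might look as follows (the module name, the candidate relation _R_ and the constants a1, a2, n are our own illustrative choices, not taken from the paper):

mod COIND-ACCOUNT { protecting(ACCOUNT)
  -- candidate hidden congruence: equality of balances
  op _R_ : Account Account -> Bool
  vars A1 A2 : Account
  eq A1 R A2 = (balance(A1) == balance(A2)) .
  -- two arbitrary states with equal balance, and an arbitrary amount
  ops a1 a2 : -> Account
  op n : -> Nat
  eq balance(a1) = balance(a2) . }
-- congruence with respect to the action deposit:
red in COIND-ACCOUNT : deposit(a1, n) R deposit(a2, n) .

The reduction is expected to return true via the equation for balance(deposit(A, N)); the withdraw case is handled similarly after a case split on whether the withdrawal condition holds.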

A more sophisticated example is provided by a behavioural object of sets


BSET in which the sets of elements appear as states of this object. The signature
of BSET is given by the diagram below and which uses a sort Elt for elements of
sets and a pre-defined Boolean type (represented in the figure by the sort Bool
and in the equations below by several standard Boolean operations):

[Signature diagram of BSET: hidden sort Set with constant empty, singleton operation {_}, union _U_ and difference _-_, and the membership observation _in_ into Bool; visible sorts Bool and Elt.]

Notice that there is only one behavioural operation, the membership obser-
vation _in_, the operations {_} standing for regarding elements as singleton sets,
_U_ standing for union of sets, and _-_ standing for the difference between sets not
being specified as behavioural. The equations are as follows:
eq E in empty = false .
eq E in { E' } = (E == E') .
eq E in (S U S') = (E in S) or (E in S') .
eq E in (S - S') = (E in S) and not(E in S') .

Notice that the behavioural equivalence is given by the element membership


observation only and that in all algebras the interpretation of the other opera-
tions preserve the behavioural equivalence, hence these operations are coherent
with respect to this specification.
The reader is invited to compare the level of complexity of the behavioural
specification of sets with that of the classical data type specification based on ini-
tial semantics. This gap in favour of the behavioural style is even bigger in the case
of proofs, such as the distributivity of the set difference _-_ over the set union _U_ .
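For instance, a distributivity property of _-_ over _U_ reduces, under the single observation _in_, to a Boolean tautology; a proof score sketch (with our own arbitrary constants s1, s2, s3, e and module name) is:

mod CHECK-BSET { protecting(BSET)
  ops s1 s2 s3 : -> Set   -- arbitrary sets
  op e : -> Elt           -- arbitrary element
}
red in CHECK-BSET : (e in ((s1 U s2) - s3)) == (e in ((s1 - s3) U (s2 - s3))) .

Both sides unfold by the membership equations above into Boolean combinations of e in s1, e in s2 and e in s3, so the reduction should close by the built-in Boolean simplifications; since _in_ is the only observation, this establishes the behavioural equality of the two set expressions.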

4 Hierarchical Object Composition


4.1 General Considerations
Our methodology for behavioural object composition has been defined informally
within the framework of the CafeOBJ language [5, 7]. Here by formally defining
composition operations on behavioural objects (see Definition 1), we can export
the CafeOBJ behavioural object composition methodology to any specification
and verification language implementing a form of behavioural logic close to our
hidden algebra formalism.
Our behavioural object composition methodology is hierarchical in the sense
that the composition of behavioural objects yields another behavioural object
in the sense of Definition 1, which can also be used for another composition.
Hierarchical behavioural object composition can be represented in UML notation
as follows:

[UML diagram of a hierarchical composition: Object A is composed of Object B and Object C; Object B is in turn composed of Object D and Object E; D, E, and C are the base level objects.]

In the above UML figure, B is composed of D and E, A of B and C, and


non-compound objects (i.e., objects with no components) are called base level
objects. A composition is represented in UML by lines tipped by diamonds, and
if necessary, qualified by the numbers of components (1 for one and * for many).

Projection operations from the hidden sort of the states of the compound
object to the hidden sorts of the states of the component objects constitute the
main technical concept underlying the CafeOBJ composition method; projection
operations are related to the lines of UML figures. Projection operations are
subject to several mathematical conditions which will be formally clarified later.
These are in essence as follows:
1. all actions of the compound object are related via projection operations to
actions in each of the components,
2. each observation of the compound object is related via the projection oper-
ations to an observation of some component, and
3. each constant state of the compound object is projected to a constant state
on each component.
In the compound objects we only define communication between the compo-
nents; this means that the only equations at the level of the specification of the
compound objects are the ones relating the actions and observations of the com-
pound objects to those of the components as described above. All the equations
for the projection operations are strict rather than behavioural, however we may
also define them behaviourally without affecting our semantics and methodology.
The components of a compound object are connected in parallel if there is no
synchronisation between them. In order to define the concept of synchronisation,
we have to introduce the following concept.
Definition 4. Two actions of a compound object are in the same action group
when they change the state of the same component object via a projection oper-
ation.
Synchronisation appears when:
– there exists an overlapping between some action groups, in the sense that
some action of the compound object is projected to at least two components
affecting their state changing simultaneously, or
– the projected state of the compound object (via a projection operation)
depends on the state of a different (from the object corresponding to the
projection operation) component.
The first case is sometimes called broadcasting and the second case is some-
times called client-server computing. In the unsynchronised case, we have full
concurrency between all the components, which means that all the actions of the
compound object can be applied concurrently, therefore the components can be
implemented as distributed processes or multi-threaded concurrent processes
which are based on asynchronous communications.
In the case of synchronised compositions, the equations for the projection
operations are conditional rather than unconditional. Informally, their conditions
are subject to the following conditions:
– each condition is a quantifier-free formula formed from equations by itera-
tion of negation, conjunction, and disjunction, the terms in the equations

being compositions between a projection and a composition chain of ac-


tions/observations (at the level of the component) or terms in the data sig-
nature, and
– the disjunction of all the conditions corresponding to a given left hand side
(of equations regarded as a rewrite rule) is true.

4.2 Parallel Composition


Parallel composition (i.e. without synchronisation) is the most fundamental form
of behavioural object composition. As an example we consider a very simple bank
account system which consists of a fixed number of individual accounts; let us
actually consider the case of just two accounts. The specification of an account
can be obtained just by renaming the specification ACCOUNT of a counter object
with integers. In CafeOBJ notation this is achieved as follows

mod* ACCOUNT1 { protecting(ACCOUNT *{ hsort Account -> Account1,
                                      op init-account -> init-account1 })}
mod* ACCOUNT2 { protecting(ACCOUNT *{ hsort Account -> Account2,
                                      op init-account -> init-account2 })}

We then compose these two account objects as in the following double fig-
ure containing both the UML and the CafeOBJ graphical7 representation of this
composition, where deposit1 and withdraw1 are the actions for the first account,
balance1 is the observation for the first account, account1 is the projection oper-
ation for the first account, and deposit2, withdraw2, balance2, and account2 are
the corresponding actions, observation, and projection operation for the second
account:

[UML and CafeOBJ graphical representation of the composition: compound object AccountSys with projection operations account1 and account2 (multiplicity 1 each) to the components Account1 and Account2, actions deposit1, deposit2, withdraw1, withdraw2 (taking a Nat argument), and observations balance1, balance2 into Int.]

7
The CafeOBJ graphical representation corresponds to the module defining this object
composition rather than to the “flattened” specification, hence the operations of the
components are not included in the figure.

The equations for this parallel composition are as follows:

eq balance1(AS) = balance(account1(AS)) .
eq balance2(AS) = balance(account2(AS)) .
eq account1(deposit1(AS, N)) = deposit(account1(AS), N) .
eq account1(deposit2(AS, N)) = account1(AS) .
eq account1(withdraw1(AS, N)) = withdraw(account1(AS), N) .
eq account1(withdraw2(AS, N)) = account1(AS) .
eq account2(init-account-sys) = init-account2 .
eq account2(deposit1(AS, N)) = account2(AS) .
eq account2(deposit2(AS, N)) = deposit(account2(AS), N) .
eq account2(withdraw1(AS, N)) = account2(AS) .
eq account2(withdraw2(AS, N)) = withdraw(account2(AS), N) .

Notice that besides the first two equations relating the observations on the
compound object to those on the components, the other equations relate the ac-
tions of the account system to the actions of the components. Remark that the
actions corresponding to one component do not change the state of the second
component (via the projection operation), hence this composition is unsynchro-
nised. In fact these equations expressing the concurrency of composition need
not be specified by the user; in their absence they may be generated internally
by the system, thus reducing the specification of the composition to the essential
information which should be provided by the user.
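For concreteness, the whole composition could be collected in a CafeOBJ module along the following lines (the module name ACCOUNT-PAIR and the exact form of the declarations are our own sketch, inferred from the figure and the equations above):

mod* ACCOUNT-PAIR { protecting(ACCOUNT1 + ACCOUNT2)
  *[ AccountSys ]*
  op init-account-sys : -> AccountSys
  -- projections to the two components
  bop account1 : AccountSys -> Account1
  bop account2 : AccountSys -> Account2
  -- actions and observations of the compound object
  bops deposit1 deposit2 withdraw1 withdraw2 : AccountSys Nat -> AccountSys
  bops balance1 balance2 : AccountSys -> Int
  var AS : AccountSys
  var N : Nat
  -- followed by the projection equations listed above
}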
The following provides a formal definition for parallel composition of be-
havioural objects. Another parallel composition concept as operators on speci-
fications has been defined in [1] within a more restricted hidden algebra frame-
work.

Definition 5. A behavioural object B is a parallel composition of behavioural


objects B1 and B2 when
– HB = HB1 ⊎ HB2 ⊎ {hB },8
– VB = VB1 ∪ VB2 ,
– (FB )w→s = (FB1 )w→s ∪ (FB2 )w→s when all sorts in ws are visible,
– (FB )w→s = (FBi )w→s when ws contains hidden sorts from HBi only, for
i ∈ {1, 2},
– (FB )w→s = ∅ when ws contains hidden sorts from both HB1 and HB2 only,
– (FB )hB →hBi = {πi } for i ∈ {1, 2},
– (FB )hB w→hB = {σi | σ ∈ (FBi )hBi w→hBi Bi -action, i ∈ {1, 2}},
– the behavioural operations FBb of FB are those from FBb1 and FBb2 , π1 , π2 ,
and the actions and the observations on hB ,
– EB = EB1 ∪ EB2 ∪
{(∀{x} ∪ W ) πi (σi (x, W )) = σ(πi (x), W ) | σ Bi -action, i ∈ {1, 2}}
∪ {(∀{x} ∪ W ) πj (σi (x, W )) = πj (x) | σ Bi -action, {i, j} = {1, 2}}
∪ {e(σ) | σ B-observation} ∪ ⋃{E(c) | c B-state constant}

8
By ⊎ we denote the disjoint union.

where e(σ) is a derived observational definition of σ and E(c) is a derived


constant set of definitions for c.
For each B-observation σ we say that an equation (∀{x} ∪ W )σ(x, W ) =
τσ (πi (x), W ) is a derived observational definition of σ when i ∈ {1, 2} and where
τσ is a (possibly derived) Bi -observation.
For each B-state constant c we say that E(c) = {πi (c) = ci | i ∈ {1, 2}}, where
each ci is a Bi -state constant, is a derived constant set of definitions for c.
Let us denote by B1 ‖ B2 the class of behavioural objects B which are parallel
compositions of behavioural objects B1 and B2 .

This definition can be easily extended to any finite number of objects.


The following shows that in the case of parallel composition without synchro-
nisation the behavioural equivalence on the compound object is compositional
with respect to the behavioural equivalences of its components.

Proposition 1. For any behavioural objects B1 and B2 , for each parallel com-
position B ∈ B1 ‖ B2 , we have that

a ∼A a′ if and only if Aπ1 (a) ∼A1 Aπ1 (a′) and Aπ2 (a) ∼A2 Aπ2 (a′)

for each B-algebra A, elements a, a′ ∈ AhB , and where Ai = A↾Bi for each
i ∈ {1, 2}.

The following shows that parallel composition is unique up to object equiv-


alence.

Proposition 2. Let B1 and B2 be behavioural objects. Then all B, B′ ∈ B1 ‖ B2


have isomorphic classes of algebras.

Corollary 1. For all behavioural objects B1 and B2 , all B, B′ ∈ B1 ‖ B2 are
equivalent objects, i.e. B ≡ B′.

Notice that we cannot expect two parallel compositions to be isomorphic (as


presentations) because observations on the compound objects can be defined
differently, hence their signatures need not be isomorphic. However, modulo the
definition of the observations on the compound objects, parallel composition
without synchronisation is uniquely determined. This permits a high degree of
automation of the specification of parallel composition.

Definition 6. Let B ∈ B1 ‖ B2 and let Ai be algebras of Bi for i ∈ {1, 2} such
that they are consistent on the common data part.9 A B-algebra A expands A1
and A2 when A↾Bi = Ai for each i ∈ {1, 2}. A B-algebra homomorphism
f : A → A′ expands A1 and A2 when f↾Bi = 1Ai for each i ∈ {1, 2}.
9
B1↾(V,FV ) = B2↾(V,FV ) where V = VB1 ∩ VB2 and FV is the set of all data operation
symbols in FB1 ∩ FB2 .

The following shows that parallel composition admits final semantics:

Theorem 2. Let B ∈ B1 ‖ B2 and let Ai be algebras of Bi for i ∈ {1, 2} such
that they are consistent on the common data part. Then there exists a B-algebra
A expanding A1 and A2 such that for any other B-algebra A′ expanding A1 and
A2 there exists a unique B-algebra homomorphism A′ → A expanding A1 and
A2 .

Parallel composition has several expected semantic properties such as asso-


ciativity and commutativity.

Theorem 3. For all behavioural objects B1 and B2 and B3

1. B1 ‖ B2 = B2 ‖ B1 , and
2. B(12)3 ≡ B1(23) for all B(12)3 ∈ B12 ‖ B3 and all B1(23) ∈ B1 ‖ B23 , where Bij
is any composition in Bi ‖ Bj .

[Diagram of the two composition hierarchies of Theorem 3: B(12)3 built from B12 (projections π1, π2) and B3 via projections π12 and π3, and B1(23) built from B1 and B23 (projections π′2, π′3) via projections π′1 and π′23, both over the base objects B1, B2, B3.]

4.3 Dynamic Composition


Let us extend the previous bank account system example to support an arbi-
trary number of accounts as a ‘dynamic’ object ACCOUNT-SYS. The accounts
are created or deleted dynamically, so we call such architecture pattern dy-
namic composition and we call the objects composed dynamically as dynamic
objects.
The actions add-account and del-account maintain the user accounts. add-
account creates accounts while del-account deletes the accounts; both of them
are parameterised by the user identifiers UID (represented by the sort UId). Each
of deposit and withdraw are also parameterised by the user identifiers. Most
notably, the projection operation for ACCOUNT is also parameterised by UID.
The structure of the new bank account system can be represented in UML and
CafeOBJ graphical notation as follows:

[UML and CafeOBJ graphical representation of the dynamic composition: compound object AccountSys with actions add-account, del-account, deposit and withdraw (parameterised by UId and, where appropriate, Nat), and a projection operation account parameterised by UId (multiplicity * on the Account side) to the component Account, with constant states init-account and no-account.]
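A corresponding CafeOBJ signature sketch of this compound object could look as follows (the import of a data module UID providing the sort UId, the placement of the constant no-account, and the exact declarations are our assumptions):

mod* ACCOUNT-SYS { protecting(ACCOUNT + UID)
  *[ AccountSys ]*
  -- actions maintaining the accounts, parameterised by user identifiers
  bop add-account : AccountSys UId -> AccountSys
  bop del-account : AccountSys UId -> AccountSys
  bop deposit : AccountSys UId Nat -> AccountSys
  bop withdraw : AccountSys UId Nat -> AccountSys
  -- projection to the Account component, parameterised by UId
  bop account : AccountSys UId -> Account
  -- error constant for deleted or non-existing accounts, as in the figure
  op no-account : -> Account
  -- followed by the projection equations given below
}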

Finally, the equations relate the actions of ACCOUNT-SYS to those of AC-


COUNT via the projection operation only when they correspond to the specified
user account. Here is the essential part of the CafeOBJ equations for the dynamic
system of accounts specification:
ceq account(add-account(AS, U'), U) = init-account if U == U' .
ceq account(add-account(AS, U'), U) = account(AS, U) if U =/= U' .
ceq account(del-account(AS, U'), U) = no-account if U == U' .
ceq account(del-account(AS, U'), U) = account(AS, U) if U =/= U' .
ceq account(deposit(AS, U', N), U) = deposit(account(AS, U), N) if U == U' .
ceq account(deposit(AS, U', N), U) = account(AS, U) if U =/= U' .
ceq account(withdraw(AS, U', N), U) = withdraw(account(AS, U), N) if U == U' .
ceq account(withdraw(AS, U', N), U) = account(AS, U) if U =/= U' .

Notice that dynamic object compositions generalise the ordinary projections


to projections which are parameterised by the data types (UId) and also that
dynamic compound objects might add new actions (add-account and del-account)
which do not correspond to actions of the components.

4.4 Synchronised Parallel Composition


In this section we define the most general form of object composition of our
approach. This supports dynamic compositions and synchronisation both in the
broadcasting and client-server computing forms.
As an example let us add to the parallel system of two accounts specified above
a transfer action transfer from the first account to the second one. This is of
course parameterised by the amount of money to be transferred. The signature
of this composition looks now as follows:

[Signature diagram of the synchronised composition: AccountSys with the additional action transfer (taking a Nat argument), projections account1 and account2 to Account1 and Account2, actions deposit1, deposit2, withdraw1, withdraw2 and observations balance1, balance2 into Int.]

and the equations for the transfer are as follows:


eq account1(transfer(AS, N)) = withdraw(account1(AS), N) .
ceq account2(transfer(AS, N)) = account2(AS) if N > balance1(AS) .
ceq account2(transfer(AS, N)) = deposit(account2(AS), N) if N <= balance1(AS) .
This example of transfer between accounts, although very simple, contains
both the broadcasting and the client-server computing cases. Broadcasting ap-
pears because the transfer changes the states of both account components. Client-
server computing appears because transfer is related to deposit of ACCOUNT2
and uses information of ACCOUNT1.
The following is the formal definition of composition with synchronisation
generalising the Definition 5 of parallel composition without synchronisation.
Definition 7. A behavioural object B is a synchronised composition of be-
havioural objects B1 and B2 when
– HB = HB1 ⊎ HB2 ⊎ {hB },
– VB ⊇ VB1 ∪ VB2 ,
– (FB )w→s ⊇ (FB1 )w→s ∪ (FB2 )w→s when all sorts in ws are visible,
– (FB )w→s = (FBi )w→s when ws contains hidden sorts from HBi only, for
i ∈ {1, 2},
– (FB )w→s = ∅ when ws contains hidden sorts from both HB1 and HB2 only,
– for each i ∈ {1, 2}, there exists a unique string wi of visible sorts such that
(FB )hB wi →hBi is not empty, and it contains only one operation symbol πi ,
– (FB )hB w→hB ⊇ {σi | σ ∈ (FBi )hBi w→hBi Bi -action, i ∈ {1, 2}},
– the behavioural operations FBb of FB are those from FBb1 and FBb2 , π1 , π2 ,
and the actions and the observations on hB ,
– EB = EB1 ∪ EB2 ∪ ⋃{Eσ | σ B-action} ∪ {e(σ) | σ B-observation}
∪ ⋃{E(c) | c B-state constant}

where Eσ is a complete set of derived action definitions for σ, e(σ) is a derived


observational definition for σ, and E(c) is a derived constant set of definitions
for c.
For any B-action σ,
{(∀{x} ∪ W ∪ Wi ) πi (σ(x, W ), Wi ) = τ^i_{σ,k}[x, W, Wi ] if C^i_{σ,k}[x, W, Wi ] |
τ^i_{σ,k} term, i ∈ {1, 2}, k ∈ {1, . . . , ni }} is a complete set of derived action defini-
tions for σ when
1. each τ^i_{σ,k}[x, W, Wi ] is a hBi -sorted term of behavioural or coherent Bi -opera-
tions applied either to πi (x, Wi ) or to a Bi -state constant, and
2. each C^i_{σ,k}[x, W, Wi ] is a quantifier-free formula formed by iterations of nega-
tions, conjunctions, and disjunctions, from equations formed by terms which
are either data signature terms or visible sorted terms of the form c[πj (x, Wj )]
for c some derived behavioural Bj -operation with Wj ⊆ W ∪ Wi , and such
that
(2.1) the disjunction (∀{x} ∪ W ∪ Wi ) ∨{C^i_{σ,k} | k ∈ {1, . . . , ni }} is true for
each i ∈ {1, 2},
(2.2) for a given i, the conditions C^i_{σ,k} are disjoint, i.e. (∀{x} ∪ W ∪ Wi )
C^i_{σ,k} ∧ C^i_{σ,k′} is false whenever k ≠ k′.

The meaning of condition (2.1) is that of completeness in the sense that all
cases are covered, while the meaning of (2.2) is that of non-ambiguity in the sense
that each case falls exactly within the scope of only one conditional equation.
Let us denote by B1 ⊗ B2 the class of behavioural objects B which are syn-
chronised compositions of behavioural objects B1 and B2 .
The above definition for composition with synchronisation can be extended
easily to the case of any finite number of objects.
Notice that the example of dynamic account system presented above is a
special case of Definition 7.
The following result showing that the behavioural equivalence on the com-
pound object is compositional with respect to the behavioural equivalences of
its components extends Proposition 1.
Theorem 4. For any behavioural objects B1 and B2 , for each composition with
synchronisation B ∈ B1 ⊗ B2 , we have that
a ∼A a′ if and only if (∀Wi ) Aπi (a, Wi ) ∼Ai Aπi (a′, Wi ) for i ∈ {1, 2}
for each B-algebra A, elements a, a′ ∈ AhB , and where Ai = A↾Bi for each
i ∈ {1, 2}.
Our object composition with synchronisation has final semantics shown by
the result below generalising Theorem 2:
Theorem 5. Let B ∈ B1 ⊗ B2 and let Ai be algebras of Bi for i ∈ {1, 2} such
that they are consistent on the common data part. Then there exists a B-algebra
A expanding A1 and A2 such that for any other B-algebra A′ expanding A1 and
A2 there exists a unique B-algebra homomorphism A′ → A expanding A1 and
A2 .

5 Compositionality of Verifications
In object-oriented programming, reusability of the source code is important, but
in object-oriented specification, reusability of the proofs is also very important
because of the complexity of the verification process. We call this composition-
ality of verifications of components. Our approach supports compositionality of
verifications via Theorem 4.
Let us specify a dynamic bank account system having a user management
mechanism given by a user database (USER-DB) which enables querying whether
a user already has an account in the bank account system. The user database
is obtained just by reusing (renaming) the behavioural sets object BSET.
mod* USER-DB { protecting(BSET *{ hsort BSet -> UserDB, hsort Elt -> UId})}
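As a quick sanity check that the renaming behaves as expected (the module name and the constant u below are our own), membership can be tested directly:

mod CHECK-USER-DB { protecting(USER-DB)
  op u : -> UId   -- an arbitrary user identifier
}
red in CHECK-USER-DB : u in ({ u } U empty) .

which should reduce to true by the membership equations inherited from BSET.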
The following is the UML and CafeOBJ graphical representation of this dy-
namic bank account system specification:

[UML and CafeOBJ graphical representation of the dynamic bank account system with user management: compound object AccountSys with actions add-account, del-account, deposit, withdraw and transfer, projection user-db (multiplicity 1) to the component UserDB (with operations {_}, _U_, _-_) and projection account (multiplicity *) to the component Account (with deposit, withdraw, init-account, no-account); data sorts UId, Nat and Int.]

and here are the CafeOBJ equations for the projection operation for UserDB:

eq user-db(add-account(AS, U)) = { U } U user-db(AS) .


eq user-db(del-account(AS, U)) = user-db(AS) - { U } .
eq user-db(transfer(AS, U, U’, N)) = user-db(AS) .
eq user-db(deposit(AS, U, N)) = user-db(AS) .
eq user-db(withdraw(AS, U, N)) = user-db(AS) .

The following is the CafeOBJ code for the equations for the projection oper-
ation for Account:
ceq account(add-account(AS, U'), U) = init-account
if (U == U') and not(U in user-db(AS)) .
ceq account(add-account(AS, U'), U) = account(AS, U)
if (U =/= U') or (U in user-db(AS)) .
ceq account(del-account(AS, U'), U) = no-account
if (U == U') .
ceq account(del-account(AS, U'), U) = account(AS, U)
if (U =/= U') .
ceq account(transfer(AS, U', U'', N), U) = account(AS, U)
if (U' == U'') .
ceq account(transfer(AS, U', U'', N), U) = account(AS, U)
if (U' =/= U'') and (U' =/= U) and (U'' =/= U) .
ceq account(transfer(AS, U', U'', N), U) = withdraw(account(AS, U), N)
if (U' =/= U'') and (U' == U) .
ceq account(transfer(AS, U', U'', N), U) = account(AS, U)
if (U' =/= U'') and (U'' == U) and (balance(account(AS, U')) < N) .
ceq account(transfer(AS, U', U'', N), U) = deposit(account(AS, U), N)
if (U' =/= U'') and (U'' == U) and (N <= balance(account(AS, U'))) .
ceq account(deposit(AS, U', N), U) = deposit(account(AS, U), N)
if (U == U') .
ceq account(deposit(AS, U', N), U) = account(AS, U)
if (U =/= U') .
ceq account(withdraw(AS, U', N), U) = withdraw(account(AS, U), N)
if (U == U') .
ceq account(withdraw(AS, U', N), U) = account(AS, U)
if (U =/= U') .
By iterative application of Theorem 4, in the case of a hierarchic object
composition, the behavioural equivalence for the whole system is just the con-
junction of the behavioural equivalences of the base level objects, which are
generally rather simple.
For example, the behavioural equivalence for the bank account system is a
conjunction of the behavioural equivalences of Account (indexed by the user iden-
tifiers) and of UserDB, and these two are checked automatically by the CafeOBJ
system. This means that behavioural proofs for the bank account system are
almost automatic, without having to go through the usual coinduction process.
Therefore, the behavioural equivalence _R[_]_ of AccountSys can be defined by
the following CafeOBJ code:

mod BEQ-ACCOUNT-SYSTEM { protecting(ACCOUNT-SYSTEM)


op _R[_]_ : AccountSys UId AccountSys -> Bool
vars AS1 AS2 : AccountSys
var U : UId
eq AS1 R[U] AS2 = account(AS1, U) =b= account(AS2, U) and
user-db(AS1) =b= user-db(AS2) . }

Notice the use of the parameterized relation for handling the conjunction
indexed by the user identifiers, and we use =b= to denote the behavioural equiv-
alence on the components. We may recall that the definition of =b= for ACCOUNT
is just equality under the observation balance and that of =b= for USER-DB is just
equality under arbitrary membership, thus both of them are coded very easily.
Now, we will prove the true concurrency of deposit operations of two (possibly
but not necessarily different) users, which can be considered as a safety prop-
erty for this system of bank accounts and which is formulated as the following
behavioural commutativity property:

deposit(deposit(as, u2, n2), u1, n1) ∼ deposit(deposit(as, u1, n1), u2, n2)
The following CafeOBJ code builds the proof tree containing all possible cases
formed by orthogonal combinations of atomic cases for the users with respect to
their membership to the user accounts data base. The basic proof term is TERM.
The automatic generation of the proof tree (RESULT ) is done by a meta-level
encoding in CafeOBJ by using its rewrite engine for one-directional construction
of the proof tree (this process uses the rewriting logic feature of CafeOBJ, hence
the use of transitions (trans) rather than equations).
mod PROOF-TREE { protecting(BEQ-ACCOUNT-SYSTEM)
ops n1 n2 : -> Nat -- arbitrary amounts for deposit
ops u u1 u1' u2 u2' : -> UId -- arbitrary user identifiers
op as : -> AccountSys -- arbitrary state of the account system
eq u1 in user-db(as) = true . -- first user is in the data base
eq u2 in user-db(as) = true . -- second user is in the data base
eq u1' in user-db(as) = false . -- first user is not in the data base
eq u2' in user-db(as) = false . -- second user is not in the data base
vars U U1 U2 : UId
op TERM : UId UId UId -> Bool -- basic proof term
trans TERM(U, U1, U2) => deposit(deposit(as, U2, n2), U1, n1) R[U]
deposit(deposit(as, U1, n1), U2, n2) .
op TERM1 : UId UId -> Bool
trans TERM1(U, U1) => TERM(U, U1, u2) and TERM(U, U1, u2') .
op TERM2 : UId -> Bool
trans TERM2(U) => TERM1(U, u1) and TERM1(U, u1') .
op RESULT : -> Bool -- final proof term
trans RESULT => TERM2(u1) and TERM2(u1') and TERM2(u) . }

The execution of the proof term RESULT gives true.
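Concretely, since the proof tree is built with transitions (trans), it would be run with CafeOBJ's execute command rather than red; the paper does not show the session, so the following invocation is only a sketch:

exec in PROOF-TREE : RESULT .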


The same problem for withdrawals rather than deposits is a bit more subtle.
If we run the system for the behavioural commutativity property
withdraw (withdraw (as, u2, n2), u1, n1) ∼ withdraw (withdraw (as, u1, n1), u2, n2)
in the same manner as for the deposit case, we do not get true because for the
case when the users are not different, two withdrawals are not necessarily commu-
tative. This is due to the relation between the amount required for withdrawals
and the actual balance of the account. However we still get useful information
consisting of the list of cases which cannot be reduced to true. This shows the
debugging power of this verification methodology.
As further exercise the reader is invited to check other behavioural properties
of the dynamic bank account system with user data base, such as
transfer (transfer (as, u1, u2, n), u2, u3, n) ∼ transfer (as, u1, u3, n)

6 Conclusions and Future Research


Based on a novel formalisation of the concept of behavioural object in hid-
den algebra, we have formally defined several composition operators underlying

the object composition methodology of CafeOBJ, including parallel composition


(without synchronisation), dynamic composition, and a most general form of
composition with synchronisation. We have shown the associativity and com-
mutativity of parallel composition (without synchronisation), the existence of
final semantics and a compositionality result for the behavioural equivalence in
the most general case of composition with synchronisation. This latter result is
the basis for making the verification process almost automatic and also leads to
easy debugging.
Within this framework we plan to investigate sufficient conditions on synchro-
nisation allowing final associativity and/or commutativity of the composition
operator.
The concepts introduced in this paper can also be used for the definition of
an object-oriented algebraic specification language supporting hierarchical object
composition on top of existing algebraic specification languages. For example,
any specification in such object-oriented extension of CafeOBJ could be compiled
into a CafeOBJ specification.

Consistency Management Within Model-Based
Object-Oriented Development of Components

Jochen M. Küster and Gregor Engels


Faculty of Computer Science, Electrical Engineering and Mathematics,
University of Paderborn, Germany
{jkuester,engels}@upb.de

Abstract. The Unified Modeling Language (UML) favors the construction of models composed of several submodels, modeling the system com-
ponents under development at different levels of abstraction and from
different viewpoints. Currently, consistency of object-oriented models ex-
pressed in the UML is not defined in the UML language specification.
This allows the construction of inconsistent UML models. Defining con-
sistency of UML models is complicated by the fact that UML models
are applied differently, depending on the application domain and devel-
opment process. As a consequence, a form of consistency management
is required that allows the software engineer to define, establish and
manage consistency, tailored specifically to the development context. In
recent years, we have developed a general methodology and tool sup-
port to overcome this problem. The methodology is based on a thorough
study of the notion of consistency and has led to a generic definition of
the notion of consistency. Our methodology itself aims at a step-wise sys-
tematic construction of a consistency management process, by providing
a number of activities to be performed by the software engineer. It is
complemented by a tool called Consistency Workbench which supports
the software engineer in performing the methodology. In this paper, we
provide an overview and summary of our approach.

1 Introduction
A model-based approach to the development of component-based systems fa-
vors the construction of models prior to the coding of components. Benefits of
such an approach are the ability to study properties of the system early in the
development process on the model level and the idea that components can be
deployed more easily to different target platforms.
Currently, the Unified Modeling Language [27] is the de-facto industrial stan-
dard for object-oriented modeling of components. In the UML a model is
composed of several submodels for modeling the system at different levels of abstrac-
tion and from different viewpoints. As the UML language specification does not
sufficiently define consistency of UML models, inconsistent UML models can be
constructed. This may lead to a situation where no common implementation con-
forming to all submodels exists. Further, with the UML being applied in diverse contexts, the ability to define and check consistency conditions depending on the application domain, development process, and platform is of increasing importance.
Besides well-formedness rules in OCL as part of user-defined UML profiles,
little support is available for customized specification and checking of consistency
conditions; this applies both to their definition and to their checking. In particular, no support is provided to the developer for specifying behavioral consistency conditions, like specific notions of compatibility between
statecharts and sequence diagrams. The general problem of defining and estab-
lishing consistency in UML is complicated by a missing formal semantics.
In [12, 8] we have developed a general methodology for consistency manage-
ment in UML-based development. Our approach to defining consistency concepts
is by means of partial translations of models into a formal language (called se-
mantic domain) that provides a language and tool support to formulate and
verify consistency conditions. For a given consistency concept, within a con-
sistency check one or more submodels are translated into a semantic domain
and the specified consistency conditions are verified. The result may then be
translated back into a UML notation or simply expressed in a message to the
modeller.
Given a development process and application domain, our approach system-
atically constructs a consistency management process in several activities. First,
consistency problem types are identified and then formalized in consistency con-
cepts. The formalization includes the choice of a suitable semantic domain, the
specification of partial translations and consistency conditions. For each consis-
tency concept, consistency checks are defined and integrated into the develop-
ment process. Primary ideas of our approach are to define consistency concepts
based on the development context, i. e. depending on application domain, devel-
opment process and platform, and further to abstract from unnecessary details
of the model not relevant to the consistency condition.
In order to make our approach applicable in practice, tool support is required
both for the definition of translations and consistency checks and for their auto-
mated execution. This has led to the development of the Consistency Workbench,
a research prototype for demonstrating the feasibility of our approach.
In this paper, we give an overview and summary of our approach. We first dis-
cuss the issue of consistency of models made up of different submodels, introduc-
ing a generic definition of consistency and the notion of consistency management.
We then present a general methodology for consistency management. Based on
this methodology, we summarize the tool support we have developed. We finally
sketch the application of the methodology to an example consistency problem.
This paper summarizes contributions previously published: The concepts of the
methodology have first been presented in [12] and then been elaborated in [10]
and [11]. The concepts of the consistency management tool have been published
in [9]. In [22], the methodology is described in detail, together with an application
to a simplified development process.

2 Concepts of Consistency and Consistency Management


In this section, we first introduce the main notions of our definition of consistency.
We then explain the idea of consistency management and briefly discuss related
approaches.

2.1 Consistency
The use of models consisting of different submodels within software development
has numerous advantages. Different persons may work on different submodels
simultaneously driving forward the development of the system. Different types
of submodels allow the separation of different aspects of the system to be built
such as structural aspects or dynamic behavior of system components.
However, making use of different submodels also involves drawbacks. In a
traditional approach, there exists only one model of the system. This model
can then be transformed during coding into a running software product. In the
case of a collection of submodels, this is not as easy anymore because one needs
to describe which submodel is transformed into which part of the code. This
gives rise to the problem that different parts of the code may not work together as intended, leading to a system that does not function. In order not to run into such
problems, it has to be ensured that different submodels are compatible with each
other or consistent on the model level.
Different submodels of a model are usually called consistent if they can be
integrated into a single model with a well-defined semantics. The required form
of integration is dependent on the type of models, the modeling process and
the application domain. One important aspect of consistency is to ensure the
existence of an implementation conforming to all submodels. If consistency of a
model is ensured, an implementation of submodels is obtained by implementing
the integrated model. Otherwise, such an integrated model and implementation
might not exist.
Technically, consistency of a model is defined by establishing a set of con-
sistency conditions. A consistency condition is a predicate that defines whether
or not a model is consistent with respect to this consistency condition. We can
distinguish between consistency conditions defined on the syntax and on the se-
mantics of models, leading to syntactic and semantic consistency conditions. In
case of defining consistency conditions on the semantics of models, typically also
a semantic mapping is required. This can be the semantic mapping of the mod-
eling language definition but can also involve an abstraction of this semantics.
Different related consistency conditions can be grouped to form a consistency
concept. Such a consistency concept consists of a set of syntactic and semantic
consistency conditions and a semantic mapping of each submodel type into a
common semantic domain if applicable. The definition of a consistency concept
is illustrated in Figure 1. Mappings of submodel types are called m_i in the figure, consistency conditions are called c_i. As a consistency concept consists of a semantic mapping into a common semantic domain and conditions specified on the syntax and within the semantic domain, a consistency concept can be viewed as a sort of integration of submodels.

Fig. 1. Visualization of consistency concept: submodel types 1, ..., n are related by partial mappings m_1, ..., m_n to a common semantic domain, over which consistency conditions c_1, ..., c_m are formulated
Within our approach, we can define different consistency concepts for a mod-
eling language. As a consequence, a model can be consistent with respect to one
consistency concept but inconsistent with respect to another consistency con-
cept. This is motivated by the different application domains, development processes and platforms to which a modeling language can be applied, and it contrasts with the idea that a modeling language comes with pre-defined consistency concepts in the language specification.
The motivation that gives rise to defining a consistency concept is called a
consistency property. Such a consistency property is a model-independent de-
scription of a property that a model should have. A consistency property can
be informally characterized by stating what it ensures, i. e. what characteristics a model that conforms to the consistency property must have. Examples of consistency properties include syntactic correctness, trace consistency or timing
consistency.
On the basis of consistency properties and consistency concepts, we can define
consistency checks. A consistency check is a description of how, given a model and a consistency condition, to decide whether or not the model fulfills the consistency condition. As a consequence, consistency checks can be thought of as defining the fulfillment relation between a model and a consistency condition.
Consistency checks can be performed using theorem proving, model checking [5]
or by executing specialized algorithms [6].
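To make these notions concrete, the following minimal sketch (illustration only, not something prescribed by the methodology or implemented in any particular tool; all type and function names are hypothetical) represents a consistency concept as partial mappings into a common semantic domain together with consistency conditions, and a consistency check as the evaluation of those conditions over the mapped submodels:

# Illustrative sketch only; hypothetical names, not an actual tool API.
from dataclasses import dataclass
from typing import Any, Callable, Dict, List

Model = Dict[str, Any]      # a submodel, represented abstractly
SemanticImage = Any         # the image of a submodel in the semantic domain

@dataclass
class ConsistencyConcept:
    # partial mappings m_1, ..., m_n, keyed by submodel type
    mappings: Dict[str, Callable[[Model], SemanticImage]]
    # consistency conditions c_1, ..., c_m over the mapped submodels
    conditions: List[Callable[[Dict[str, SemanticImage]], bool]]

def consistency_check(concept: ConsistencyConcept,
                      submodels: Dict[str, Model]) -> bool:
    # Map every submodel covered by the concept into the semantic domain,
    # then evaluate each consistency condition; the model is consistent with
    # respect to the concept if all conditions hold.
    images = {kind: concept.mappings[kind](m)
              for kind, m in submodels.items()
              if kind in concept.mappings}
    return all(condition(images) for condition in concept.conditions)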
Given a consistency condition and a concrete model, we can identify those
submodel types of the larger model that lead to the inconsistency. This allows
the distinction between consistency problem types and consistency problem in-
stances: A consistency problem type is a configuration of submodel types that
may give rise to the violation of a consistency condition. By contrast, a
consistency problem instance is a configuration of submodels such that each
submodel corresponds to a submodel type of a given consistency problem type
violating the consistency condition. This distinction between consistency problem type and instance is similar to the type-instance notions commonly known from object-orientation. Note that a consistency concept can also be thought of as one solution of a consistency problem type.

Fig. 2. Layers of consistency: consistency checks (Check 1, ..., Check m) are defined on top of consistency concepts (Concept 1, ..., Concept n), which in turn are driven by consistency properties (such as syntactic correctness, deadlock freedom, timing consistency, implementability and process properties); all layers rest on the modeling language, application domain and development process
In this section, we have introduced a generic definition of consistency. The
terms in our definition of consistency lead to a layered approach to consistency,
illustrated in Figure 2. Given a modeling language, in the property layer, dif-
ferent properties exist that drive the definition of the consistency concept. A
consistency concept comprises a number of consistency conditions and a seman-
tic domain (not shown in the figure but cf. Figure 1). Once the conditions are
determined, consistency checks are defined which verify or validate a consistency
condition.

2.2 Characteristics of Consistency Problems


Consistency problems can be characterized on the one hand according to the
situation in which they occur and on the other hand depending on the consistency con-
ditions.
One problem of consistency arises in cases where a model consists of dif-
ferent submodels because a system is modeled from different viewpoints [2, 13].
This allows the concentration on different aspects in different submodels. How-
ever, different viewpoint specifications must be consistent and not contradictory,
because the implementation of such an inconsistent model would otherwise be
infeasible. This type of consistency problem we will call horizontal consistency.
Another quite different problem of consistency arises when a model is trans-
formed into another model by replacing one or more submodels. It is then desirable that the new submodel is a refinement of the submodel it replaces, in order to keep the overall model consistent. This type of consistency problem we
will call vertical consistency. Vertical consistency problems are often induced by
a development process which prescribes how and when models are iteratively
refined.
A quite different characterization is obtained by looking at the consistency
conditions for a consistency problem. Here we can distinguish between syntac-
tic consistency conditions and semantic consistency conditions. In general, con-
sistency can be considered a semantic property. However, in order to ensure
consistency, a number of inconsistent models can already be detected by inspecting their syntax, which means that the semantic property of consistency can partly be established by formulating syntactic consistency conditions.
Additionally, we can make a distinction between syntactic consistency and se-
mantic consistency. Concerning horizontal consistency problems, syntactic con-
sistency ensures that the overall model consisting of submodels is syntactically
correct. With regards to vertical consistency problems, syntactic consistency en-
sures that changing of one part of the model within the development process still
results into a syntactically correct model. With respect to a horizontal consis-
tency problem, semantic consistency requires models of different viewpoints to
be semantically compatible with regards to the aspects of the system which are
described in the submodels. For vertical consistency problems, semantic consis-
tency requires that a refined model is semantically consistent with the model it
refines.
In this section, we have introduced a characterization of consistency prob-
lems, using our notion of consistency. We have clarified the notions of syntactic
and semantic consistency and horizontal and vertical consistency. Such charac-
terizations will prove helpful when identifying consistency problem types in a
given development process. In the next section, we will introduce the idea of
an explicit consistency management on the basis of consistency properties and
consistency concepts.

2.3 The Notion of Consistency Management


Given a set of models and a development process, the question arises of how to ensure consistency of such models within the development process. Obviously, this requires specific activities, including the definition of consistency conditions, the specification of when consistency conditions are checked, and what to do in the case of inconsistencies in order to resolve them. We thus need a sort
of management of consistency and introduce the term consistency management.
The importance of consistency management has been apparent in other disci-
plines of computer science such as databases and programming languages (see
e. g. Tarr and Clarke [28]). In general, a consistency management process is a process
in the larger software engineering process. The goal of a consistency management
process is to define and ensure a certain form of consistency of models.
In order to generally ensure the consistency of models, the foundation of any
consistency management is the ability to decide whether a model composed of
submodels is consistent or not. As a consequence, consistency management relies on a so-called constituent layer consisting of consistency properties, consistency concepts and consistency checks. In addition to these basic constituents, there might also be further specialized constituents.

Fig. 3. Layers of Consistency Management: a consistency management approach comprises a consistency management process with its activities and stakeholders (the process layer) and the consistency checks, consistency concepts and consistency properties it builds on (the constituent layer), all defined with respect to the development process, modeling language and application domain
Consistency management also involves dealing with consistency within a de-
velopment process. This leads to a so-called process layer of consistency man-
agement. Here it must be determined how to organize the constituents within a
consistency management process i. e. how to make use of the constituents to form
a consistency management process. For example, it must be prescribed which
consistency checks should be performed when and in which order. This includes
the description of when, given a concrete model consisting of submodels, to look for
potential inconsistencies. If an inconsistency occurs, then it must be prescribed
within the process when to handle and resolve it, if possible.
A consistency management process is described by activities and stakehold-
ers performing the activities. Activities include when to locate potential incon-
sistencies within models and when to perform a consistency check associated
to a consistency condition. In addition to these general activities, consistency
management may also involve the description of how to avoid rechecking of con-
sistency conditions and how to organize consistency checks in order to achieve
consistency with respect to all consistency conditions.
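As an illustration of these process-layer concerns (a hypothetical sketch, not part of the approach described in this paper), consistency checks can be organized by recording which submodel types each check depends on, so that after a change only the affected checks need to be re-run:

# Hypothetical sketch of organizing consistency checks within a development process.
from typing import Callable, Dict, Set, Tuple

CheckFn = Callable[[Dict[str, object]], bool]

class CheckManager:
    def __init__(self) -> None:
        # check name -> (check function, submodel types the check depends on)
        self.checks: Dict[str, Tuple[CheckFn, Set[str]]] = {}
        self.verdicts: Dict[str, bool] = {}   # cached results

    def register(self, name: str, fn: CheckFn, depends_on: Set[str]) -> None:
        self.checks[name] = (fn, depends_on)

    def submodel_changed(self, submodel_type: str) -> None:
        # Invalidate only the verdicts of checks that depend on the changed submodel type.
        for name, (_, deps) in self.checks.items():
            if submodel_type in deps:
                self.verdicts.pop(name, None)

    def run_pending(self, model: Dict[str, object]) -> Dict[str, bool]:
        # Re-run only the checks whose verdict has been invalidated.
        for name, (fn, _) in self.checks.items():
            if name not in self.verdicts:
                self.verdicts[name] = fn(model)
        return dict(self.verdicts)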
In Figure 3 consistency management is illustrated. In the lower part, the
development process, modeling language and application domain are visualized.
On top of them, a consistency management approach is shown being composed of
consistency properties, consistency concepts, consistency checks and the consis-
tency management process. Note that we distinguish between such a specific con-
sistency management approach and the overall field of consistency management.

2.4 Related Approaches


Existing approaches can be grouped into several categories. The first cat-
egory contains approaches where a particular consistency problem is tackled.
For instance, Fradet et al. [15] propose an approach to consistency checking of
diagrams consisting of nodes and edges with multiplicities. They distinguish be-
tween generic and instance graphs and define the semantics of a generic graph
to be the set of instance graphs that fulfill the constraints of the generic graph.
Consistency is then defined to be equivalent to the semantics of the generic
graph being not an empty set. Consistency checking is then performed by solv-
ing a system of linear inequalities derived from the generic graph. Also in this
category falls the approach by Li and Lilius [24], who analyze timing constraints of sequence diagrams for their consistency by solving systems of linear inequalities.
Another category can be seen in approaches that achieve consistency of
object-oriented models by completely formalizing them, thereby integrating all
models into one semantic domain. Moreira and Clark [25] translate object-
oriented analysis models to LOTOS in order to detect inconsistencies. Cheng
et al. [4] formalize OMT models in terms of LOTOS specifications. Using these
specifications, they perform consistency analysis using tools for state exploration
and concurrency analysis. Grosse-Rhode [18] integrates all models by translating
them into transformation systems. The problem involved with completely for-
malizing models is that the application is then restricted to certain consistency
problems mirrored by the choice of semantic domain. For example, a formal-
ization in terms of LOTOS is not capable of dealing with consistency problems
involving the aspect of time because LOTOS does not support this aspect. For
a general-purpose modeling language such as UML, different application do-
mains may give rise to quite different consistency problems which are difficult
to treat within one formalization, not to speak of the numerous problems of for-
malizing UML itself. As an example, for some applications modeled with UML,
consistency problems involving the aspect of time might be of no relevance at
all whereas in other applications (e. g. real-time applications), timing consis-
tency is of high importance. As a consequence, approaches involving a complete
formalization are currently not capable of dealing with all the consistency prob-
lems arising within the development with UML in various application domains
involving quite different sets of consistency problems.
A third category can be seen in approaches that deal with consistency of
models that are not object-oriented. Zave and Jackson [31] define consistency based on
a translation into logics and define a set of partial specifications to be consistent
if and only if its composition is satisfiable. Their approach therefore requires
that models are given a semantics in the form of logic. Boiten et al. [3] define
consistency on the basis of development relations and define a set of partial
specifications to be consistent if and only if a common development relation
exists. This approach requires the existence of a formal semantics for models
and the concept of development relations defined for models used within the
development process.
Another, quite different, category of related work can be seen in approaches
that deal with inconsistency management [26, 17]. Rather than trying to achieve
complete consistency, these approaches tackle the problem of managing incon-
sistencies. This management is based on the location of inconsistencies and the
appropriate handling (resolving of inconsistency or tolerating it). Concentrating
on the process of consistency management, they assume that the foundation of
consistency management in terms of consistency conditions is already in place.
From our discussion of related work we can see that our generic definition
of consistency is applicable: In the first category, we are dealing with quite
different semantic domains such as systems of linear inequalities or the set of
instance graphs. In the second category, semantic domains used are LOTOS
or transformation systems and in the third category first-order logic is used
as semantic domain. In contrast to existing approaches, we concentrate on the technical mechanisms that are used to define consistency in different scenarios. This will enable us to describe a general methodology for dealing with
consistency in a situation as currently encountered by the UML: consistency
is not defined as part of the standard and further, quite different consistency
concepts are needed depending on the development context.
In the following section, we present an approach to consistency management
based on the idea of partially formalizing the model for enabling consistency
checks. As a consequence, our approach can be regarded as a combination of the
approaches in the first and the second category. As our approach also comprises
the idea of consistency management within a development process, it is also
related to the fourth category although the idea of tolerating inconsistencies is
not in focus.

3 A General Methodology for Consistency Management


Due to the unclear semantics and different employments of UML within the
development process, a general concept of consistency management for UML
models is missing [7]. Concerning syntactic consistency, this type of consistency
is partially achieved by defining well-formedness rules on the metamodel. Due
to the absence of a generally-accepted formal semantics, semantic consistency
is currently not well-supported. Nevertheless, semantic consistency is of great
importance and must be dealt with. In our opinion, waiting for a complete for-
malization of UML models is not feasible as we need consistency management
now and the existence of a complete formalization of UML models applicable to
all usages of UML models is doubtful. Another problem is the informal develop-
ment process followed, which leads to the need for flexible notions of consistency.
Our approach to consistency management is based on the following obser-
vation: The fundamental question to answer when given a model consisting of
submodels and a consistency property is whether there exists an integration of the submodels fulfilling the consistency property. Although a formally defined semantics does not exist, it is still possible to restrict oneself to certain aspects of models and then determine whether an integration fulfilling the consistency property exists. The concept of our approach is illustrated in Figure 4. Submodels used within development overlap in a number of aspects. For ensuring consistency, submodels are integrated into a common semantic domain that supports these overlapping aspects. Note that our approach also applies to a single submodel type with overlapping aspects, in order to deal with consistency problem types within a submodel type. If a concrete model cannot be integrated, then it is inconsistent.

Fig. 4. Concept of our approach: a consistency concept relates either a single submodel type (partial mapping m, consistency conditions c_1, ..., c_k) or several submodel types (mappings m_1 and m_2, consistency conditions c_1, ..., c_l) to a semantic domain supporting their overlapping aspects
Our idea of partially formalizing models for the purpose of consistency man-
agement yields a better degree of consistency than without any formalization
and overcomes the problems associated with complete formalizations. It allows suitable consistency checks to be conducted for consistency problems within quite dif-
ferent application domains by using different semantic domains. Our approach
of partially formalizing models therefore uses the strength of completely formal-
izing approaches because it allows precisely stated consistency conditions. On
the other hand, it also overcomes the disadvantage of restricted applicability by
the idea of partial formalization within a suitable semantic domain. Further-
more, partial formalizations are often easier to handle than one large complex
complete formalization trying to capture all possible aspects.
Given the goal of achieving consistency by constructing suitable partial for-
malizations, each providing a consistency concept, we are faced with the problem
of how to come up with suitable consistency concepts, consistency conditions,
consistency checks and how to integrate all these into a consistency management
process.
Fig. 5. The consistency management methodology: taking the development process, modeling language and application domain as input, the methodology's activities and techniques define a consistency management approach consisting of consistency properties, consistency concepts, consistency checks and a consistency management process

Our approach to consistency management is to describe a methodology for consistency management. By a methodology, we understand a set of methods
and techniques [16] that are used for consistency management. A method con-
tains a set of activities and general guidelines how to execute the activities.
In this sense, our methodology contains activities and techniques that yield a partic-
ular consistency management process, taking into account different application
domains and development processes. The methodology can be applied to differ-
ent problem situations and therefore constitutes a set of methods rather than
one particular method.
In Figure 5, the idea of defining a methodology is illustrated, building on the
explanation of consistency management in the previous section. On the right,
the ingredients of a consistency management approach are shown which are con-
sistency properties, consistency concepts, consistency checks and a consistency
management process. On the left, we introduce the methodology, which takes as
input the development process, the modeling language and the application do-
main and then produces the consistency management approach. In the following,
we describe the activities of our methodology.
Activity 1: Identification of Consistency Problem Types. The goal of this activ-
ity is the identification of all relevant consistency problem types. The basis for
the identification of consistency problem types is obtained by discovering which
submodel types model the same aspects of the system and which aspects con-
sistency properties affect. Due to the common description of aspects in different
submodel types or the aspectual overlap of a submodel type and a consistency
property, consistency problems may occur. These consistency problems are identified, categorized into consistency problem types, and informally defined, including an informal description of the consistency condition that should hold. Each consistency problem type must be documented.

Activity 2: Formalization of a Consistency Problem Type. This activity aims at establishing a formal consistency concept for each consistency problem type.
For each consistency problem type identified, we choose an appropriate seman-
tic domain. In this semantic domain, those aspects that lead to the consistency
problem type must be expressible (i. e. the aspects where submodels overlap).
Furthermore, tool support should be available for the semantic domain in or-
der to facilitate consistency checks. All aspects of the model that lead to the
identified consistency problem type must be mapped into the semantic domain.
Formal consistency conditions must be formulated that correspond to the infor-
mal description of consistency conditions. The definition of the partial mapping
is crucial for the correctness of the consistency conditions defined later, and no aspects of the model that influence its consistency should be left out.
On the other hand, only those aspects of the model should be mapped into the
semantic domain that are important for the consistency because otherwise anal-
ysis may get too complex.
Activity 3: Operational Specification of Model Transformations. Each formal con-
sistency concept must be transformed such that models can be mapped into the
semantic domain in an automated way. For that purpose, model transformations
for the mappings of the consistency concept will be introduced. Further, care must be taken that consistency conditions can also be generated automatically.
Activity 4: Specification of Consistency Checks. For each consistency problem
type, a consistency check is defined that validates the formal consistency con-
ditions. For each consistency problem type, it must also be determined what to
do in case of an inconsistency. The handling of such an inconsistency involves
either the resolution or tolerance of the inconsistency.
Activity 5: Embedding of Consistency Management into Development Process.
For each consistency problem type, it must be specified when to deal with it in an
existing development process. The order of consistency checks to be performed
by grouping the consistency conditions must be determined and fixed within
the development process. The existing development process must be adapted insofar as concrete activities are introduced that define when to perform
which consistency check. These activities include the location of inconsistencies
in a concrete model and the handling of inconsistencies.
The final result of having performed all these meta-level activities is a devel-
opment process that contains a concrete consistency management process. It is concrete insofar as it defines consistency management for a given development
process, application domain and modeling language. The concrete consistency
management process, together with a concrete development problem, is simulta-
neously the starting situation for the activities performed on the concrete level
which are the location and handling of inconsistencies.
Fig. 6. Consistency problem example: structured objects caps1 and caps2, whose behaviour is given by the statecharts SC1 and SC2, are joined through ports p1 and p2 by a connector con; the protocol statechart SCProt of the attached collaboration constrains the exchange of the events b, c and d (occurring as SC2_p2.b, SC1_p1.c and SC2_p2.d)

In the following section, we sketch the application of the activities to an example consistency problem type. Then, we discuss how the methodology can be supported by tools so that it can be performed by the software engineer.

4 Application
In this section, we sketch the application of our methodology to a sample con-
sistency problem type.
In Figure 6, two structured objects caps1 and caps2 are shown, joined by a connector via two ports p1 and p2. Attached to this connector is a collaboration with its behavior modeled in the protocol statechart SCProt. The behavior of the structured objects is specified in two statecharts, named SC1 and SC2.
Intuitively, the interaction arising from executing the statecharts of the struc-
tured objects should conform to the protocol specified in the protocol statechart.
Activity 1 of the methodology will yield as outcome that there exists a consis-
tency problem type in this case (called protocol consistency), together with the
informal consistency condition formulated above.
Activity 2 aims at constructing a formal consistency concept. In this case, we
will choose CSP [21] as a semantic domain. The formal method CSP is supported
by the model checker FDR [14] for evaluating consistency conditions formulated
in CSP. We then have to define partial mappings of the statecharts and the
protocol statechart into the semantic domain of CSP. In other words, the consis-
tency concept consists of the submodel types protocol statechart, the statechart
of structured objects and the collaboration, mappings of these submodel types
into CSP and a set of consistency conditions formalizing the informally noted form of consistency.

Fig. 7. Two rules for statechart translation: rule p1 matches a StateMachine with its top composite state and simple substates and generates a CSP process <SMName>(state, static) that branches on the current state; rule p2 matches a simple state with its outgoing transitions and trigger events and generates a process State(<SName>, static) offering the events followed by the exit action and the corresponding transition, or the static reaction
For the consistency problem of protocol consistency, we can define two dif-
ferent consistency conditions: For weak protocol consistency we require that all
traces of the interaction of the structured object statecharts must be contained
in the set of traces of the protocol statechart. For strong protocol consistency
we additionally require that all the traces of the protocol statechart must occur
in the system. Extending the statechart SC1 by introducing another transition
sending another event will violate the condition of weak protocol consistency.
Removing the last transition of SC2 will violate the condition of strong protocol
consistency. In previous work (e.g. [22]), we have reported on the details of such
a consistency concept which are beyond the scope of this work.
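In the machine-readable CSP accepted by FDR, these two conditions correspond to trace-refinement assertions. The sketch below shows how they might be generated, assuming, purely for illustration, that the translation of Activity 3 yields a process SYSTEM for the composed statecharts of the structured objects and a process PROTOCOL for the protocol statechart; these process names and the surrounding code are hypothetical and not part of the consistency concept reported in [22].

# Hypothetical sketch: generating FDR trace-refinement assertions for protocol consistency.
def protocol_consistency_assertions(system: str = "SYSTEM",
                                    protocol: str = "PROTOCOL") -> str:
    return "\n".join([
        # Weak protocol consistency: every trace of the system is a trace of the
        # protocol, i.e. the system trace-refines the protocol.
        f"assert {protocol} [T= {system}",
        # Strong protocol consistency additionally requires the converse containment:
        # every trace of the protocol actually occurs in the system.
        f"assert {system} [T= {protocol}",
    ])

In FDR, an assertion P [T= Q holds exactly when every trace of Q is also a trace of P, which matches the two containment requirements stated above.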
Activity 3 aims at making the consistency concept operational. In our case,
the partial translations of the submodel types must be defined in such a way that
they are executable automatically. We have recently explored a graph transfor-
mation approach [19, 8, 23], which allows the translation to be specified by a set
of compound graph transformation rules. In our case, such a compound graph
transformation rule consists of two parts, a source production rule specified by a
UML metamodel extract and a target production rule in the semantic formalism,
here CSP. As we do not want to change the source model, the source production
is the identical production, with equal left- and right-hand sides. In Figure 7, two com-
pound graph transformation rules are shown for translating statecharts to CSP,
inspired by existing work of Hiemer [20].
Graph transformation rules of this form can then be used to specify a model
transformation from a given source UML model to a target CSP model. The
semantics of rule applications is briefly described as follows: Given a concrete
UML model, a match for the UML metamodel extract is searched for in the
concrete UML model. Once such a match is found, a match of the left side of
the target production is searched for in the CSP model. Once the two matches
have been found, the match of the CSP model is replaced by the right side of
the target production. Note that using this kind of graph transformation rules, no additional specification language for describing the transformation is needed: each rule is basically expressed in terms of the source and target language, in our case in UML and CSP, enriched with mechanisms for common variables and
placeholders. A detailed explanation of this model transformation approach can
be found in [19] and [8]. The problem of ensuring termination and confluence of
such rule-based transformations is treated in [23]. For this activity, related model transformation approaches such as the one by Whittle [30] or by Varró et al. [29] could also be used.
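The rule-application semantics just described can be paraphrased in a few lines of heavily simplified code. The sketch below is an illustration only, not the graph transformation machinery of [19, 8, 23]: it assumes that the UML model is a list of attribute dictionaries, that the CSP model is a list of text fragments, and that the actual matching is delegated to caller-supplied functions.

# Simplified, hypothetical sketch of applying one compound graph transformation rule.
from typing import Callable, Dict, List, Optional

UmlElement = Dict[str, str]   # e.g. {"type": "SimpleState", "name": "s1"}
Match = Dict[str, str]        # variable bindings shared by source and target productions

def apply_compound_rule(uml_model: List[UmlElement],
                        csp_model: List[str],
                        match_source: Callable[[List[UmlElement]], Optional[Match]],
                        match_target_lhs: Callable[[List[str], Match], Optional[int]],
                        build_target_rhs: Callable[[Match], str]) -> bool:
    # 1. Search for a match of the UML metamodel extract in the concrete UML model.
    binding = match_source(uml_model)
    if binding is None:
        return False
    # 2. Search for a match of the left side of the target production in the CSP model,
    #    using the variable bindings obtained from the source match.
    position = match_target_lhs(csp_model, binding)
    if position is None:
        return False
    # 3. Replace the matched CSP fragment by the right side of the target production.
    #    The source production is the identity, so the UML model stays unchanged.
    csp_model[position] = build_target_rhs(binding)
    return True

In the approach of [19, 8], matching is of course performed over the UML metamodel and the grammar of the target language rather than over flat dictionaries and strings, and rule application is controlled by a simple control flow language (cf. Section 5).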
In Activity 4, the consistency check must be defined, on the basis of the
previously developed transformation units. Typically, such a consistency check
can be specified by using an activity diagram for modeling the overall workflow.
Such an activity diagram will define for example that given a situation like in
Figure 6, first the statecharts of the structured objects are translated to CSP
and then the protocol statechart. The overall result will then be fed into a model
checker and the result will be interpreted.
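For the example of Figure 6, such a workflow could be orchestrated roughly as follows. This is a hypothetical sketch: the translated CSP texts are taken as inputs, and the process names, the parallel composition and the model-checker command (fdr-check) are placeholders rather than the Consistency Workbench's actual interfaces.

# Hypothetical orchestration of the protocol-consistency check for the example of Fig. 6.
import subprocess
import tempfile

def check_protocol_consistency(sc1_csp: str, sc2_csp: str, protocol_csp: str) -> str:
    # 1. Combine the translations of the statecharts of caps1 and caps2 and of the
    #    protocol statechart into one CSP script (composition is a placeholder).
    script = "\n\n".join([
        sc1_csp,
        sc2_csp,
        protocol_csp,
        "SYSTEM = SC1 [| Events |] SC2  -- placeholder parallel composition",
        "assert PROTOCOL [T= SYSTEM     -- weak protocol consistency",
        "assert SYSTEM [T= PROTOCOL     -- strong protocol consistency",
    ])
    # 2. Feed the result into the model checker; the command name is a placeholder,
    #    as the way of invoking FDR depends on the installation.
    with tempfile.NamedTemporaryFile("w", suffix=".csp", delete=False) as f:
        f.write(script)
        path = f.name
    result = subprocess.run(["fdr-check", path], capture_output=True, text=True)
    # 3. Interpret the result and report it to the modeller.
    return "consistent" if result.returncode == 0 else "inconsistent:\n" + result.stdout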
On the basis of such consistency checks, an overall development process can
be modified by introducing consistency checks. For that, the order of consistency
checks must be determined and also how inconsistency handling influences the
overall consistency of a model. The details of these tasks are defined in Activity
5. In our sample application we have not described a development process but
concentrated on one consistency problem. For a more detailed example of this
activity, the reader is referred to [22].

5 The Consistency Workbench


In this section, we provide an overview of the consistency workbench, a research
prototype that has been developed for consistency management. In principle, the
software engineer needs support for all activities of the methodology. Neverthe-
less, one quickly realizes that adequate tool support for all activities is difficult
to provide. For example, the formalization process in Activity 2 cannot be sup-
ported by tools. In the following, we describe the main functionalities of the
consistency workbench:
1. Definition of Consistency Problem Catalogue. Using a template that contains
a name, classification, and informal description of the problem, a pattern of the
meta model to localize potential occurrences, and an example, the Consistency Workbench allows the software engineer to define a catalogue of problem types that may be reused in different development processes (a sketch of such a catalogue entry is given after this list).

Fig. 8. Defining a Transformation Unit
2. Definition of Transformations. By a set of graph transformation rules, con-
trolled by a simple control flow language based on regular expressions, the soft-
ware engineer can define translations of models into a semantic domain. Each
such transformation rule consists of two parts, a source and a target transfor-
mation rule (see Figure 8), coupled by the use of common variables. The source
transformation rule is specified by providing a UML metamodel extract, repre-
sented by an XMI description which is an existing exchange format for UML
models. Note that, in contrast to the concept explained in Figure 7, the tool also
supports full source productions which also enables changes of the UML model
(not implemented at the moment). Rather than writing an XMI description by
hand, we currently use existing UML CASE tools for designing the UML meta-
model extract. The generated XMI description can then be used after slight
modifications. The target transformation rule is defined by providing a trans-
formation rule in a context-free grammar notation for textual languages such
as CSP [21]. Here, additional iteration constructs are provided for looping over
multi-objects matched at the UML side.
3. Definition of Consistency Checks. Based on previously defined transformation
units, the software engineer can define consistency checks for a problem type in
the catalogue. Such a consistency check is modeled as a workflow consisting of
several activities, like the localization of potential problems based on the pat-
tern defined in 1., the translation defined in 2., the generation of a consistency
condition in the target language defined by a transformation in 2., and its ver-
ification by an external tool (e.g., a model checker). In Figure 9, the definition of a consistency check is illustrated.

Fig. 9. Defining a Consistency Check
4. Execution of Consistency Checks. Consistency checks can then be executed on
a concrete UML model following the workflows defined. For that purpose, UML
models constructed with UML CASE tools such as Poseidon [1] can be loaded
into the consistency workbench. Currently, such models can be visualized but
not edited. For a given model, a consistency check can be manually triggered.
Intermediate results of the consistency check such as models constructed dur-
ing execution of the model transformations can be accessed in the consistency
workbench. The result of a consistency check is currently displayed in the con-
sistency workbench by showing the result of the model checker, together with a
predefined explanation.
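The catalogue template mentioned in functionality 1 above could be represented, for illustration, along the following lines; the field names and the example values (taken from the protocol consistency problem of Section 4) are guesses for the purpose of illustration rather than the Workbench's actual format.

# Hypothetical representation of one entry of the consistency problem catalogue.
from dataclasses import dataclass

@dataclass
class ConsistencyProblemType:
    name: str               # e.g. "Protocol consistency"
    classification: str     # e.g. horizontal/vertical, syntactic/semantic
    description: str        # informal description of the problem
    metamodel_pattern: str  # pattern over the UML metamodel to localize occurrences
    example: str            # reference to an example model

protocol_consistency = ConsistencyProblemType(
    name="Protocol consistency",
    classification="horizontal, semantic",
    description="The interaction of the statecharts of structured objects connected "
                "by a connector must conform to the protocol statechart attached to it.",
    metamodel_pattern="two ports joined by a connector with an attached collaboration",
    example="caps1/caps2 with SC1, SC2 and SCProt (Fig. 6)",
)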
Being a research prototype, the consistency workbench has currently several
limitations: The capability of illustrating an inconsistency by means of a coun-
terexample expressed in an additional UML sequence diagram has not yet been
implemented. Further, complex transformation units have been developed for
translating statecharts to CSP but, apart from that, no other semantic domain
has been used in the consistency workbench so far. Nevertheless, the main idea
of our approach can be considered feasible: by systematically developing par-
tial translations of UML models into suitable semantic domains, the consistency
workbench could evolve into a technical support tool for the software engineer.

6 Conclusion
In this paper, we have presented our approach to consistency management of
object-oriented models. Motivated by the situation that currently a general ap-
proach for consistency management is not provided by the UML, we have first
introduced the concepts for consistency management such as consistency condi-
tion, consistency concept, consistency check and consistency management. Using
this thorough investigation of consistency, we have explained how our general
methodology builds a consistency management approach, depending on the mod-
eling language, application domain and development process. Activities of this
methodology have been discussed. The overall methodology has been demon-
strated by applying it to an example consistency problem type and sketching
the outcome of each activity. Finally, we have reported on the tool Consistency
Workbench which is a research prototype designed for supporting the software
engineer in the complex task of consistency management. Due to the nature of
this paper being a summary and overview paper, we have not been able to pro-
vide the full details of all activities. For that the reader is referred to the existing
publications.
Future work can be performed into several directions: With regards to our
general methodology, by applying it to real-world development processes, it can
be validated and refined. Furthermore, more consistency problems occurring in
UML-based development will be discovered and treated which will lead to a
number of predefined model transformations. Further, also suitable abstraction
mechanisms must be developed. In the context of our work, it has turned out
that this issue is vital for being able to perform consistency checking on larger,
real-world models. This is due to the underlying approach of model checking
which suffers from the well-known state explosion problem. With regards to tool
support, we envisage that our consistency workbench could be integrated into
an existing CASE tool.

References
1. M. Boger, T. Sturm, E. Schildhauer, and E. Graham. Poseidon for UML Users
Guide. Gentleware AG, 2003. Available under http://www.gentleware.com.
2. E. Boiten, H. Bowman, J. Derrick, P. Linington, and M. Steen. Viewpoint Consis-
tency in ODP. Computer Networks, 34(3):503–537, August 2000.
3. E. Boiten, H. Bowman, J. Derrick, and M. Steen. Viewpoint consistency in Z
and LOTOS: A case study. In J. Fitzgerald, C. B. Jones, and P. Lucas, editors,
FME’97: Industrial Applications and Strengthened Foundations of Formal Methods
(Proc. 4th Intl. Symposium of Formal Methods Europe, Graz, Austria, September
1997), volume 1313 of Lecture Notes in Computer Science, pages 644–664. Springer-
Verlag, Heidelberg, September 1997.
4. B. Cheng, L. Campbell, and E. Wang. Enabling Automated Analysis Through
the Formalization of Object-Oriented Modeling Diagrams. In Proceedings of IEEE
International Conference on Dependable Systems and Networks, pages 433–442.
IEEE Computer Society, 2000.
5. E. M. Clarke, O. Grumberg, and D. A. Peled. Model Checking. The MIT Press,
Cambridge, MA, 1999.
6. A. Egyed. Heterogenous View Integration and its Automation. Dissertation, Uni-
versity of Southern California, 2000.
7. G. Engels and L. Groenewegen. Object-Oriented Modeling: A Roadmap. In Anthony Finkelstein, editor, Future Of Software Engineering 2000, pages 105–116. ACM, June 2000.
8. G. Engels, R. Heckel, and J. M. Küster. Rule-Based Specification of Behavioral
Consistency Based on the UML Meta-model. In M. Gogolla and C. Kobryn, edi-
tors, UML 2001 - The Unified Modeling Language. Modeling Languages, Concepts,
and Tools., 4th International Conference, Toronto, Canada, October 1-5, 2001,
Proceedings, volume 2185 of LNCS, pages 272–287. Springer-Verlag, 2001.
9. G. Engels, R. Heckel, and J. M. Küster. The Consistency Workbench - A Tool for
Consistency Management in UML-based Development. In P. Stevens, J. Whittle,
and G. Booch, editors, UML 2003 - The Unified Modeling Language. Modeling
Languages and Applications. 6th International Conference, San Francisco, October
20-24, USA, Proceedings, volume 2863 of LNCS, pages 356–359. Springer-Verlag,
2003.
10. G. Engels, J. M. Küster, and L. Groenewegen. Consistent Interaction of Software
Components. In Proceedings of Sixth International Conference on Integrated Design
and Process Technology (IDPT 2002), 2002.
11. G. Engels, J. M. Küster, and L. Groenewegen. Consistent Interaction of Software
Components. Transactions of the SDPS: Journal of Integrated Design and Process
Science, 6(4):2–22, December 2002.
12. G. Engels, J. M. Küster, L. Groenewegen, and R. Heckel. A Methodology for
Specifying and Analyzing Consistency of Object-Oriented Behavioral Models. In
V. Gruhn, editor, Proceedings of the 8th European Software Engineering Conference
(ESEC), pages 186–195. ACM Press, 2001.
13. A. Finkelstein, D. Gabbay, A. Hunter, J. Kramer, and B. Nuseibeh. Inconsistency
Handling in Multi-Perspective Specifications. In Ian Sommerville and Manfred
Paul, editors, Proceedings of the Fourth European Software Engineering Confer-
ence, pages 84–99. Springer-Verlag, 1993.
14. Formal Systems Europe (Ltd). Failures-Divergence-Refinement: FDR2 User Man-
ual, 1997.
15. P. Fradet, D. Le Métayer, and M. Périn. Consistency Checking for Multiple View
Software Architectures. In O. Nierstrasz and M. Lemoine, editors, ESEC/FSE ’99,
volume 1687 of Lecture Notes in Computer Science, pages 410–428. Springer-
Verlag/ ACM Press, 1999.
16. C. Ghezzi, M. Jazayeri, and D. Mandrioli. Fundamentals of Software Engineering.
Prentice-Hall, 1991.
17. C. Ghezzi and B. A. Nuseibeh. Special Issue on Managing Inconsistency in Software
Development (2). IEEE Transactions on Software Engineering, 25(11), November
1999.
18. M. Grosse-Rhode. Integrating Semantics for Object-Oriented System Models. In
F. Orejas, P. G. Spirakis, and J. van Leeuwen, editors, Proceedings of ICALP’01,
LNCS 2076, pages 40–60. Springer-Verlag, 2001.
19. R. Heckel, J. M. Küster, and G. Taentzer. Towards Automatic Translation of
UML Models into Semantic Domains. In H.-J. Kreowski and P. Knirsch, editors,
Proceedings of the Appligraph Workshop on Applied Graph Transformation, pages
11–22, March 2002.
20. J.-J. Hiemer. Statecharts in CSP: Ein Prozessmodell in CSP zur Analyse von
STATEMATE-Statecharts. Dr. Kovač Verlag, Hamburg, 1999.
21. C. A. R. Hoare. Communicating Sequential Processes. Prentice Hall, 1985.
22. J. M. Küster. Consistency Management of Object-Oriented Behavioral Models.
PhD thesis, University of Paderborn, March 2004.
23. J. M. Küster, R. Heckel, and G. Engels. Defining and Validating Transformations of UML Models. In J. Hosking and P. Cox, editors, IEEE Symposium on Human
Centric Computing Languages and Environments (HCC 2003) - Auckland, October
28 - October 31 2003, Auckland, New Zealand, Proceedings, pages 145–152. IEEE
Computer Society, 2003.
24. X. Li and J. Lilius. Timing Analysis of UML Sequence Diagrams. In Robert France
and Bernhard Rumpe, editors, UML’99 - The Unified Modeling Language. Beyond
the Standard. Second International Conference, Fort Collins, CO, USA, October
28-30. 1999, Proceedings, volume 1723 of LNCS, pages 661–674. Springer-Verlag,
1999.
25. A. Moreira and R. Clark. Combining Object-Oriented Modeling and Formal De-
scription Techniques. In M. Tokoro and R. Pareschi, editors, Proceedings of the 8th
European Conference on Object-Oriented Programming (ECOOP’94), pages 344 –
364. LNCS 821, Springer-Verlag, 1994.
26. B. Nuseibeh, S. Easterbrook, and A. Russo. Making Inconsistency Respectable in
Software Development. Journal of Systems and Software, 58(2):171–180, Septem-
ber 2001.
27. Object Management Group (OMG). OMG Unified Modeling Language Specifica-
tion, Version 1.5. OMG document formal/03-03-01, March 2003.
28. P. Tarr and L. A. Clarke. Consistency Management for Complex Applications.
Technical report, Technical Report 97-46, Computer Science Department, Univer-
sity of Massachusetts at Amherst, 1997.
29. D. Varró, G. Varró, and A. Pataricza. Designing the Automatic Transformation
of Visual Languages. Science of Computer Programming, 44(2):205–227, August
2002.
30. J. Whittle. Transformations and Software Modeling Languages: Automating
Transformations in UML. In J.-M. Jezequel, H. Hussmann, and S. Cook, edi-
tors, UML 2002 - The Unified Modeling Language. 5th International Conference,
Dresden, Germany, September 30 - October 4, 2002, Proceedings, volume 2460 of
LNCS, pages 227–242. Springer-Verlag, 2002.
31. P. Zave and M. Jackson. Conjunction as Composition. ACM Transactions on
Software Engineering and Methodology, 2(4):379–411, October 1993.
CommUnity on the Move:
Architectures for Distribution and Mobility

José Luiz Fiadeiro¹ and Antónia Lopes²

¹ Department of Computer Science, University of Leicester
University Road, Leicester LE1 7RH, UK
jose@fiadeiro.org
² Department of Informatics, Faculty of Sciences, University of Lisbon
Campo Grande, 1749-016 Lisboa, PORTUGAL
mal@di.fc.ul.pt

Abstract. Mobility has become a new factor of complexity in the construction and evolution of software systems. In this paper, we report on the extensions
that we have made to CommUnity, a prototype language for architectural de-
scription, with modelling techniques that support the incremental and composi-
tional construction of location-aware systems. We illustrate, around an exam-
ple, how the proposed extensions lead to a true separation of concerns between
computation, coordination and distribution in architectural models.

1 Introduction

The evolution of the internet and wireless communication is inducing an unprecedented and unpredictable variety and complexity in the roles that software can play. Now that software development methods and techniques are finally starting to cope
with the building of distributed applications over static configurations, mobility is
introducing an additional factor of complexity due to the need to account for the
changes that can occur, at run time, at the level of the topology over which compo-
nents perform computations and interact with one another.
Architectural modelling techniques [16] have helped to tame the complexity of
building distributed applications over static networks by enforcing a strict separation
of concerns. On the one hand, we have what in systems can account for the opera-
tional aspects (what we call “computations” in general) that are responsible for the
behaviour that individual components ensure locally, e.g. the functionality of the
services that they offer. On the other hand, we have the mechanisms that control the
behaviour of individual components and coordinate the interconnections among
groups of components, so that global properties of systems emerge.
This separation between “Computation” and “Coordination” [11] supports the ex-
ternalisation, and definition as first-class citizens, of the rules according to which the
joint behaviour of given components of a system needs to be controlled. As a conse-
quence, one can build complex systems from simpler components by superposing the
architectural connectors that coordinate their interactions. This gross modularisation of systems can also be progressively refined, in a compositional way, by adding detail to the way computations execute in chosen platforms and the communication proto-
cols that support coordination. Compositionality means that refinements over one of
the dimensions can be performed without interfering with the options made already
on the other one.
The levels of compositionality that architectural approaches can bring to software
development also apply to evolution [2]. On the one hand, connectors can be changed
or replaced without interfering with the code that components execute locally to per-
form the computations required by system services. On the other hand, the code run-
ning in the core components of the system can itself be evolved, e.g. optimised, with-
out interfering with the connectors, for instance with the communication protocols
being used for interconnecting components.
The major challenge that we face, and that justifies this paper, is to take this sepa-
ration of concerns one step further and address distribution/mobility aspects as a first-
class architectural dimension. On the one hand, it seems clear that, when we
(re)configure a system, we need to take into account the support that locations provide
for the operational/computational aspects of the individual components, and the abil-
ity for the interconnections to be effective over the communication network. For
instance, it is essential that a system as a whole may self-adapt to changes occurring
in the network topology, either to maintain agreed levels of quality of service, or to
take advantage of new services that may become available. On the other hand, we
need to be able to understand and refine global properties of a system separately in
each of the three dimensions.
In this paper, we report on work that we are pursuing within the IST-2001-32747
project AGILE – Architectures for Mobility – with the aim of extending the level of
separation and compositionality that has been obtained for computation and coordina-
tion to distribution/mobility. By focusing on an example – an airport luggage han-
dling system – that, in past papers, we handled both in a location-transparent way [19]
and in a preliminary experiment of the new primitives [3], we show how we can sup-
port the construction and evolution of location-aware architectural models by super-
posing explicit connectors that handle the mobility aspects. In this sense, this paper
can be used as a companion to [13], which is where we have formalised the new ar-
chitectural modelling techniques that we shall be discussing.

2 Designing Location-Aware Components in CommUnity

Location-transparency is usually considered to be an important abstraction principle
for the design of distributed systems. It assumes that the infrastructure masks the
physical and logical distribution of the system, and provides location-transparent
communication and access to resources: components do not need to know where the
components to which they are interconnected reside and execute their computations,
nor how they themselves move across the distribution network.
Traditionally, architectural approaches to software design also adhere to this prin-
ciple; essentially, they all share the view that system architectures are structured in
terms of components and architectural connectors. Components are computation loci
while connectors, superposed on certain components or groups of components, ex-
plicitly define the way these components interact. In this section, we focus on the
way individual components are designed in CommUnity with a special emphasis on
the primitives that capture distribution and mobility aspects.

2.1 Location-Unaware Components

CommUnity is a parallel program design language that is similar to Unity [6] and IP
[10] in its computational model but adopts a different coordination model. More
concretely, whereas, in Unity, the interaction between a program and its environment
relies on the sharing of memory, CommUnity relies on the sharing (synchronisation)
of actions and exchange of data through input and output channels.
To illustrate the way components can be designed in CommUnity, and provide a
fair idea of the range of situations that our approach can address, we use a variation of
a problem that we previously developed in [3,19] – a typical airport luggage delivery
system in which carts move along a track and stop at designated stations for handling
luggage.
In order to illustrate how incremental development is supported in CommUnity, we
start with a very high-level design of a cart:
design Cart is
prv busy:bool
do move[]: ¬busy, false → true
dock[busy]: ¬busy, false → busy’
undock[busy]: busy, false → ¬busy’

This design caters for the very basic description that we gave of a cart’s behaviour:
the fact that it can move and stop at stations to handle luggage. In CommUnity, com-
ponents are designed having in mind the interactions that they can establish with other
components in terms of exchanging data through communication channels and syn-
chronising to perform joint actions. The design above does not mention any public
channels because, at this stage, we have not identified any need for the cart to ex-
change data with its environment. However, the cart needs to keep some internal data
to know when it is parked at a station; this is modelled by the private channel busy.
We call it a channel because it can be used to exchange information between different
components inside the cart, but we make it private to hide it from the environment.
The actions that a component can perform are declared under “do” and their speci-
fication takes the general form:
g[D(g)]: L(g), U(g) → R(g)
where
• D(g) consists of the local channels into which executions of the action can place
values. This is normally called the write frame of g. We omit this set when it can
be inferred from the assignments in R(g). Given a private or output channel v, we
will also denote by D(v) the set of actions g such that v∈D(g). Hence, above,
move has an empty write frame and busy is in the write frames of dock and un-
dock.
• L(g) and U(g) are two conditions that establish the interval in which the enabling
condition of any guarded command that implements g must lie: the lower bound
L(g) is implied by the enabling condition, and the upper bound U(g) implies the
enabling condition. Hence, the enabling condition of g is fully determined only if
L(g) and U(g) are equivalent, in which case we write only one condition. From a
specification point of view, U(g) allows us to place requirements on the states in
which the action should be enabled (progress) and L(g) allows us to restrict the
occurrence of the action to given sets of states (safety). By setting U to false, as
in the examples above, we are not making any requirements as to when we want
the actions to be enabled; this is useful for being able to add requirements in an
incremental way, as illustrated below. For instance, restrictions on how a cart can
move will certainly arise when taking into consideration other aspects of the sys-
tem. On the other hand, each of the three actions was given a safety guard, basi-
cally ensuring that carts do not move while docked at a station for handling lug-
gage.
• R(g) is a condition that uses primed channels to account for references to the val-
ues that the channels take after the execution of the action. This is usually a con-
junction of implications of the form pre ⊃ pos where pre does not involve primed
channels. Each such implication corresponds to a pre/post-condition specification
in the sense of Hoare. When R(g) is such that the primed channels are fully de-
termined, we obtain a conditional multiple assignment, in which case we can use
the notation that is normally found in programming languages (||v∈D(g) v:=F(g,v)).
Hence, we could have used busy:=true for R(dock) and busy:=false for
R(undock). When the write frame D(g) is empty, R(g) is tautological. This is the
case of move.
A CommUnity design is called a program when, for every g∈Γ, L(g) and U(g) co-
incide, and the relation R(g) defines a conditional multiple assignment. The behav-
iour of a program is as follows. At each execution step, any of the actions whose
enabling condition holds can be executed if requested, in which case its assignments
are performed atomically.
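To relate the two notations, the following sketch (ours, not taken from the paper) turns the Cart design into a program by letting the two bounds coincide and writing the effects as conditional multiple assignments:
design Cart Program is
prv busy:bool
do move[]: ¬busy → true
dock[busy]: ¬busy → busy:=true
undock[busy]: busy → busy:=false
Here each action has a single, fully determined enabling condition, so there is no longer any room for strengthening the guards later; this is precisely why the original Cart design keeps U at false.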
Actions can also be declared to be private, a situation not illustrated above, mean-
ing that they cannot be shared with the environment of the component. Private ac-
tions that are infinitely often enabled are guaranteed to be selected for execution infi-
nitely often. A model-theoretic semantics of CommUnity can be found in [15].

2.2 Location-Aware Components

The design that we gave above does not take into account the fact that the cart can
only dock when it reaches the station to which it has been sent, nor does it model the
way a cart comes to know about its destination. The design below refines the previ-
ous one with this kind of information:
design Located Cart is
inloc pos:Loc
in next:Loc
prv busy@pos:bool, dest@pos:Loc
do move[]@pos: ¬busy∧pos≠dest, false → true
dock[busy]@pos: ¬busy∧pos=dest, false → busy:=true
undock[busy,dest]@pos:
busy∧pos=dest, false → busy:=false || dest:=next

This design uses new primitives, some of which relate to the way we handle the
notions of location, distribution and mobility. In CommUnity, the underlying “space
of mobility” is constituted by the set of possible values of a special data type with a
designated sort Loc and whatever operations are necessary to characterise locations,
for instance hierarchies or taxonomies. The only requirement that we make is for a
special position –⊥– to be distinguished; its role will be discussed further below.
By not adopting a fixed notion of location, CommUnity can remain independent of
any specific notion of space and, hence, be used for designing systems with different
kinds of mobility. For instance, for physical mobility, the space is, typically, the
surface of the earth, represented through a set of GPS coordinates, but it may also be a
portion of a train track represented through an interval of integers. In other kinds of
logical mobility, space is formed by IP addresses. Other notions of space can be
modelled, namely multidimensional spaces, allowing us to accommodate richer per-
spectives on mobility. For instance, in order to combine logical mobility with secu-
rity concerns, it is useful to consider locations that incorporate information about
administrative domains.
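For the luggage example, a minimal sketch of such a data type specification (one possible choice on our part, not fixed by the paper) declares the sort and the operations used later for the track:
sorts Loc
ops ⊥ : → Loc
    inc : Loc → Loc
    controlled : [slow,fast] × Loc → Loc
with the axioms in Φ fixing how inc and controlled traverse the chosen track graph (cf. section 2.3).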
CommUnity designs are made location-aware by associating their “constituents”
— code and data — with “containers” that can move to different positions. Designs
are not located: they can address components that are distributed across different
locations. Hence, the unit of mobility, i.e., the smallest constituent of a system that is
allowed to move, is fine-grained and different from the unit of execution.
More precisely, location-awareness comes about in CommUnity designs as fol-
lows:
• Location variables, or locations for short, can be declared as “containers” that can
be moved to different positions. Locations can be input or output. Input locations,
declared under inloc, are controlled by the environment and cannot be modified by
the component. Hence, the movement of any constituent located at an input loca-
tion is operated by the environment. Output locations, declared under outloc, can
only be modified locally through assignments performed within actions and,
hence, the movement of any constituent located at an output location is under
the control of the component. In the case above, we declared only one location
– pos – because the cart is not a distributed component. This location is declared
as input because we want other components to be able to control the movement of
the cart.
• Each local channel x is associated with a location variable l. We make this as-
signment explicit by simply writing x@l in the declaration of x. The intuition is
that the value of l indicates the current position of the space where the values of x
are made available. A modification in the value of l entails the movement of x as
well as of the other channels and actions located at l. Because the cart is not dis-
tributed, busy has no choice but to be located at pos.
• Every action g is associated with a set of location variables Λ(g) meaning that the
execution of action g is distributed over those locations. In other words, the exe-
cution of g consists of the synchronous execution of a guarded command in each
of these locations: given l∈Λ(g), g@l is the guarded command that g executes at l.
Again, because carts are not distributed, all the actions are located at pos.
Notice that guarded commands may now include assignments involving the read-
ing or writing of location variables. This is the case of the actions of the cart: they
were refined in order to make use of locations. More precisely, we have now re-
stricted the enabling of move to the situation in which the cart has not reached its
destination, and dock and undock to when the cart is at its destination.
The destination of the cart is kept in a private channel dest and updated before
leaving the station by reading it from an input channel next. Input channels are used
for reading data from the environment; the component has no control on the values
that are made available in such channels. Notice that reading a value at a channel
does not consume it: the value will remain available until it is changed by the envi-
ronment.
Input channels are assigned a distinguished output location – λ – usually omitted in
designs. This location has the special value ⊥ that is used whenever one wants to
make no commitment as to the location of a channel or action. For instance, input
channels are always located at λ because the values that they carry are provided by
the environment in a way that is location-transparent; their location is determined at
configuration time when they are connected to output channels of other components.
Actions uniquely located at λ model activities for which no commitments with re-
spect to location-awareness have been made. The reference to λ in these cases is usu-
ally omitted. This is what happened with our first design: all its constituents were
assumed to be located at λ. In later stages of the development process, the execution
of such actions can be distributed over several locations, i.e. the guarded command
associated with g@λ can be split in several guarded commands associated with lo-
cated actions of the form g@l, where l is a proper location. Whenever the command
associated with g@λ has been fully distributed over a given set of locations in the
sense that all its guards and effects have been accounted for, the reference to g@λ is
usually omitted. In the second design, we made location-awareness more explicit and
introduced references to specific location variables. However, distribution was not
illustrated. This will be done below.

2.3 Distributed Components

In order to illustrate how CommUnity can handle distribution, consider the situation
in which a cart can move in two different modes: slow and fast. More specifically, by
default, a cart will move in fast mode. However, control points may be placed dy-
namically along the track to slow down the cart: when the cart comes in the proximity
of a control point, it changes to slow mode and, before it leaves the restricted area, it
goes back to fast mode.
design Controlled Located Cart is
outloc pos:Loc
inloc cpoint:Loc
in next:Loc
prv busy@pos, in@cpoint:bool, dest@pos:Loc, mode@pos:[slow,fast]
do move[pos]@pos:
¬busy∧pos≠dest, false → pos:=controlled(mode,pos)
prv enter[mode,in]
@pos: true → mode:=slow
@cpoint: ¬in → in:=true
prv leave[mode,in]
@pos: true → mode:=fast
@cpoint: in → in:=false
dock[busy]@pos: ¬busy∧pos=dest, false → busy:=true
undock[busy,dest]@pos:
busy∧pos=dest, false → busy:=false || dest:=next

This design introduces a new location cpoint accounting for a control point; this
location is declared to be input because we leave it to the environment to
(re)distribute the control points along the track. However, the position pos of the cart
has now become output because it is under the control of the extended compo-
nent (subsystem).
A private channel in is located at the control point to indicate when a cart enters its
proximity, which is controlled by the action enter. This action is distributed between
the control point, where it updates in, and the cart, where it changes the mode to slow.
The action leave operates the other way around. Both actions are declared to be pri-
vate and their components are designed with a fully determined enabling condition
because their execution is completely under the control of the component.
The execution of a distributed action requires that the locations involved be “in-
touch” so that one can ensure that they are executed as a single transaction. For in-
stance, the physical links that support communication between the positions of the
space of mobility (e.g. wired networks, or wireless communications through infrared
or radio links) may be subject to failures or interruptions, making communication
temporarily impossible. Formally, we rely on a set bt(l) for every location l that, at
any given moment of time, consists of the locations that are “in touch” with l. Hence,
for any action g and any locations l1,l2 to which it is distributed, g can only be exe-
cuted if l1∈bt(l2) and l2∈bt(l1). In the case of the cart, this means that enter and
leave actions can only take place when the cart is in the proximity of the control point.
Notice that the action move is now making explicit that the next position is calcu-
lated from the current one taking the mode into account. The function controlled that
is being used will need to be defined, at specification time, on the representation cho-
sen for the tracks. However, because move implies calculating a new position, an
important condition applies: it can only be executed if the new position can be
reached from the current one.
Typically, the space of mobility has some structure, which can be given by walls
and doors, barriers erected in communication networks by system administrators, or
the simple fact that not every position of the space has a host where code can be exe-
cuted. This structure can change over time. Hence, it is not realistic to imagine that
entities can migrate from any point to any point at any time without restrictions.
Formally, we rely, for every location l, on a set reach(l) consisting, at any given in-
stant of time, of the locations that can be reached from l. Hence, for any located ac-
tion g@l, if a location l1 can be affected by the execution of g@l, then the new value
of l1 must be a position reachable from l. In the case of the cart, this means that move
can only be executed if controlled returns a position that is reachable from the current
one – e.g. no other cart is in between.
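For concreteness, one possible definition of controlled (ours; the paper deliberately leaves the choice to specification time) assumes that track positions are consecutive integers and lets the mode determine how far the cart advances in one step:
controlled(slow, p) = p+1
controlled(fast, p) = p+2
Under the reachability condition just discussed, a fast move would then be blocked whenever the position two segments ahead is not in reach(pos), e.g. because another cart is in between.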
Simpler modes of the movement of the cart could be envisioned, for instance
design Step Located Cart is
outloc pos:Loc
in next:Loc
prv busy@pos:bool, dest@pos:Loc
do move[pos]@pos: ¬busy∧pos≠dest, false → pos:=inc(pos)
dock[busy]@pos: ¬busy∧pos=dest, false → busy:=true
undock[busy,dest]@pos:
busy∧pos=dest, false → busy:=false || dest:=next

In this case, we are relying on a simpler increment function on locations that leads
the cart step by step through a path of a pre-established graph. The function itself can
define the graph. For instance, by defining Loc as nat5, two different alternative
graphs are:
0 1
1
0 2
3

4 3
4 2
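For instance (our reading, since the figure only sketches the graphs), a cyclic track that visits the positions in the order 0,1,2,3,4 corresponds to inc(n) = (n+1) mod 5, while a different definition of inc over the same five positions induces a different graph.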

2.4 Formal Semantics

In this section, we provide a summary of the mathematical semantics of CommUnity
designs. More details can be found in [9,13].
We start by mentioning that designs in CommUnity are defined over a collection of
data types that are used for structuring the data that the channels transmit and define
the operations that perform the computations that are required. Hence, the choice of
data types determines, essentially, the nature of the elementary computations that can
be performed locally by the components, which are abstracted as operations on data
elements. For simplicity, we assume a fixed collection of data types, i.e. we shall not
discuss the process of data refinement that needs to be involved when mapping de-
signs and their interconnections to the platforms that support computations and coor-
dination. In order to remain independent of any specific language for the definition of
these data types, we take them in the form of a first-order algebraic specification.
That is to say, we assume a data signature Σ=<S,Ω>, where S is a set (of sorts) and Ω
is a S ×S-indexed family of sets (of operations), to be given together with a collection
*

Φ of first-order sentences specifying the functionality of the operations. We refer to


this data type specification by Θ.
A CommUnity design is a pair <θ,Δ> where:
— θ, the signature of the design, is a tuple <L,X,Γ,tv,ta,D,Λ> where
• L is a finite pointed set; we use λ to designate its point;
• X is an S-indexed family of mutually disjoint finite sets;
• Γ is a finite set;
• tv: X∪L→{out,in,prv} is a total function s.t. tv(λ)=out; for A⊆X∪L, we
shall use out(A) to denote the set {a∈A : tv(a)=out} (and similarly for
in(A) and prv(A)) and local(A) to denote out(A)∪prv(A);
• ta: Γ→{sh,prv} is a total function;
• Λ: X∪Γ→2^L is a total function s.t. λ∈Λ(g), for every g∈X∪Γ, and
Λ(i)={λ}, for every i∈in(X);
• D: Γ→2^local(X∪L) is a total function.
— Δ, the body of the design, is a tuple <R,L,U> where:
• R assigns to every action g∈Γ and l∈Λ(g) a proposition over
X∪L∪D(g)' s.t. ⊢ R(g@λ) ⊃ ∧_{l∈Λ(g)} R(g@l);
• L and U assign a proposition over X∪L to every action g∈Γ and l∈Λ(g)
s.t. ⊢ L(g@λ) ⊃ ∧_{l∈Λ(g)} L(g@l) and ⊢ U(g@λ) ⊃ ∧_{l∈Λ(g)} U(g@l).
It is important to notice that the conditions on the guarded commands associated
with located actions of the form g@λ justify why, as mentioned before, the reference
to g@λ can be omitted in some situations. When the command associated with g has
been fully distributed over a given set of locations (i.e., R(g@λ) ⇔ ∧_{l∈Λ(g)} R(g@l),
L(g@λ) ⇔ ∧_{l∈Λ(g)} L(g@l) and U(g@λ) ⇔ ∧_{l∈Λ(g)} U(g@l)), the guard of g@λ and its ef-
fects have been accounted for and, hence, g@λ can be omitted because it does not
provide any additional information.
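As a worked instance (our reading of the design of section 2.2, with the implicit λ components spelled out), the signature of Located Cart is: L={λ,pos} with tv(pos)=in; X={next,busy,dest} with tv(next)=in and tv(busy)=tv(dest)=prv; Γ={move,dock,undock}, all shared; D(move)=∅, D(dock)={busy}, D(undock)={busy,dest}; Λ(next)={λ}, Λ(busy)=Λ(dest)={λ,pos}, and Λ(g)={λ,pos} for every action g.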
In order to define the behaviour of a program P, we have to fix, first of all, an al-
gebra U for the data types in Σ. The sets Us define the possible values of each data
sort s in Σ. In particular, the set ULoc defines the positions of the space of mobility for
the situation at hand. In addition, we also have to fix a function rs:Ω→N. This func-
tion establishes the level of resources required for the computation of each operation
in Ω. In order to define the behaviour of P, we also need a model of the “world”
where P is placed to run which should capture that this world may be continuously
changing. In fact, we only need to know the properties and behaviour of the part of
the environment that may affect the program execution – its context.
In CommUnity, the context that a component perceives is determined by its current
position. A context is defined by a set Cxt of pairs obs:type, where obs is simply an
identifier and type is a data sort. Cxt models the notion of context that is considered
to be suitable for the situation at hand. Each obs represents an observable that can be
used for designing the system, and type defines the type of its values. As a conse-
quence, obs can be used in CommUnity designs as any other term of sort type. The
only requirement that we make is for three special observables – rssv:nat×2^Σ, bt:2^Loc
and reach:2^Loc – to be distinguished.
The purpose of rssv:nat×2^Σ is to represent the resources and services that are avail-
able for computation. The first component of rssv quantifies the resources available.
It may be defined as a function of more specific observables in Cxt, for instance, the
remaining lifetime of a battery or the amount of memory available. In this way, it is
possible to model the fact that the same resources may affect different applications in
different ways. The second component of rssv represents the services available and it
is taken as a part of the data signature Σ. This is because, as we have seen in the pre-
vious sections, the services that perform the computations are abstracted as operations
on data elements. In this paper, we will not illustrate this particular aspect of Com-
mUnity; see [13] instead.
The intuition behind bt:2^Loc and reach:2^Loc is even simpler: both represent the set of
locations that are accessible. The former represents the locations that can be reached
through communication while the latter concerns reachability through movement. We
have already motivated the use of these relations in section 2.3.
We consider that such models are Cxt-indexed sets {Mobs:type}obs:type∈Cxt of infinite
sequences of functions over ULoc. Each Mobs:type is an infinite sequence of functions
that provide the value of the observable obs at every position of the space, at a par-
ticular instant of time. For the special observables rssv:nat×2^Σ, bt:2^Loc and reach:2^Loc,
these functions are constrained as follows.
Every function in Mbt and Mreach maps a position m into a set of positions that must
include m. Intuitively, this means that we require that “be in touch” and “reachabil-
ity” are reflexive relations. Furthermore, for the observable bt, only the sets of posi-
tions that include the special position ⊥U are admissible values. This condition estab-
lishes part of the special role played by ⊥U: at every position of the space, the position
⊥U is always “in touch”. In addition, we require that every function in Mbt maps ⊥U
to Loc. In this way, any entity located at ⊥U can communicate with any other entity in
a location-transparent manner and vice-versa.
The position ⊥U is also special because it supports context-transparent computa-
tion, i.e. a computation that takes place at ⊥U is not subject to any kind of restriction.
This is achieved by requiring that every function in Mrssv assigns the value (+∞,Σ) to
the position ⊥U. In other words, the computational resources available at ⊥U are
unlimited and all the services are available.
The behaviour of a program running in the context of {Mobs:type}obs:type∈Cxt is as fol-
lows. The conditions under which a distributed action g can be executed at time i are
the following:
1. For every l1,l2∈Λ(g), [l2]_i ∈ Mbt_i([l1]_i) and [l1]_i ∈ Mbt_i([l2]_i): the execution of g
involves the synchronisation of its local actions and, hence, their positions have
to be mutually in touch.
2. For every l∈Λ(g), g@l can be executed, i.e.,
i. If Mrssv_i([l]_i)=(n,Σ’) then, for every operation symbol f used in the guarded
command associated to g@l, f is an operation in Σ’ and rs(f)≤n: in order to
perform the computations that are required, the services and the level of re-
sources needed for these computations have to be available.
ii. For every local channel x used in the guarded command associated to g@l, if
x is located at l1 (x@l1), then [l1]_i ∈ Mbt_i([l]_i): the execution of the guarded
command associated with g@l requires that every channel in its frame can be
accessed from its current position and, hence, l has to be in touch with the lo-
cations of each of these channels.
iii. For every location l1∈D(g), [F(g@l,l1)]_i ∈ Mreach_i([l1]_i): if a location l1 can be
affected by the execution of g@l, then the new value of l1 must be a position
reachable from the current one.
iv. The local guard L(g@l)(=U(g@l)) evaluates to true.
By [e]_i we denote the value of the expression e at time i. It is important to notice
that, because observables can be used in programs as terms, the evaluation of an ex-
pression e at time i may also depend on the model of Cxt.
Given this, the execution of the action consists of the transactional execution of its
guarded commands at their locations, which requires the atomic execution of the
multiple assignments. Private actions are subject to a fairness requirement: if infi-
nitely often enabled, they are guaranteed to be selected infinitely often.
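As an illustration (our instantiation of these conditions, following the remarks in section 2.3), consider the distributed action enter of Controlled Located Cart at time i: condition 1 requires [pos]_i ∈ Mbt_i([cpoint]_i) and [cpoint]_i ∈ Mbt_i([pos]_i), i.e. the cart and the control point must be in touch; condition 2.iv requires the local guards – true at pos and ¬in at cpoint – to hold; and, since enter writes no location, condition 2.iii imposes no reachability constraint.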

3 Architectural Concerns in CommUnity

Section 2 illustrated several of the primitives provided in CommUnity for the design
of distributed and mobile components. It did so through an incremental process of
addition of detail to an initial, abstract account of the behaviour of a cart. CommU-
nity does not prescribe any specific method of incremental development. Instead, it
provides the ability for different concerns to be modelled independently and super-
posed dynamically over the configuration of a system to account for new design deci-
sions.
For instance, if we consider the addition of the distribution/mobility aspects in sec-
tion 2.3, it is clear that the behaviour of the cart at the stations is not affected.
Moreover, it should be possible to capture these aspects as an architectural element
(connector) that is plugged in to control the movement of the cart. Changing from the
fast/slow control to the step-by-step mode should be just a matter of unplugging a
connector and plugging a new one; it should not require the cart to be re-designed or
re-implemented.
Another example can be given as follows. Consider that we now need to monitor
the number of times a cart docks at a station. It seems clear that we should be able
to:
• Address this issue independently of the way the movement of the cart is being
controlled, i.e. regardless of whether we are monitoring a step or a controlled lo-
cated cart.
• Separate the “logic” of monitoring from the location in which the data that is
required is provided, i.e. address interaction separately from the location as-
pects.
In this section, we address these issues in two steps. First, we illustrate how con-
nectors can be externalised from components. Then, we address the separate model-
ling of coordination and distribution.
3.1 Externalising the Connector

Consider again the controlled located cart and the way it was obtained from the lo-
cated cart. The following design attempts to externalise the extension that was
performed.
design Mode controller is
inloc mine:Loc
outloc theirs:Loc
prv in@mine:bool, mode@theirs:[slow,fast]
do control[theirs]@theirs: true → theirs:=controlled(mode,theirs)
prv enter[mode,in]
@theirs: true → mode:=slow
@mine: ¬in → in:=true
prv leave[mode,in]
@theirs: true → mode:=fast
@mine: in → in:=false

This design contains more than just what was added to the located cart; it repeats
the elements that are necessary for this extension to be autonomous as a design. This
is why it includes both the location of the cart and the action of the cart that is being
controlled. Notice, however, that the action control is always enabled; the idea, as
detailed below, is that restrictions on its occurrence are left to the component being
controlled.
We deliberately changed the names of some of the design elements to highlight the
fact that we want this mode controller to exist, as a design, independently of the lo-
cated cart. However, this renaming is not necessary because it is automatically en-
forced in Category Theory [8], which is the mathematical framework in which we
give semantics to our interconnection mechanisms. As far as we are concerned, this
mode controller could even pre-exist the located cart as part of a library of connectors
that a software architect uses for designing a system. What we need to say is how it
can be applied to a component like a located cart.
CommUnity supports the design of interactions through configurations; these are
diagrams that exhibit interconnections between designs. For instance, a located cart
under the control of a mode controller is specified by the following configuration:

[Configuration diagram: Located cart connected to Mode controller – the location pos is linked to theirs and the action move is linked to control; dock, undock, next and mine remain unconnected.]

In configuration diagrams, components only depict their public elements. The
lines connect locations, channels and actions. In contrast with what happens with
most architectural description languages, “boxes and lines” in CommUnity have a
mathematical semantics. As explained in section 3.2, configuration diagrams are
diagrams in a category of CommUnity designs whose morphisms capture notions of
superposition that are typical in parallel program design [6,10,12]. The semantics of
such a diagram is given by its colimit [8], which can be informally explained as the
design of the system viewed as a single, distributed component:
• Connected locations are amalgamated; input locations can be connected with in-
put connections, in which case their amalgamation is an input location, or with
output locations, in which case their amalgamation is an output location; output
locations cannot be connected with output locations.
• The same rule applies to channels; only channels carrying the same type of data
can be connected.
• Connected actions give rise to joint distributed actions; every set {g1,…,gn} of ac-
tions that are synchronised is represented by a single action g1||…||gn whose occur-
rence captures the joint execution of the actions in the set. The transformations
performed by the joint action are distributed over the locations of the synchro-
nised actions. Each located action g1||…||gn@l is specified by the conjunction of
the specifications of the local effects of each of the synchronised actions gi that is
distributed over l, and the guards of joint actions are also obtained through the
conjunction of the guards specified by the components.
We must call attention to the fact that elements (locations, channels and actions)
that have the same name in different designs but are not connected need to be re-
named. This is because there can be no implicit interconnection between designs
resulting from accidental naming of locations, channels or actions. Any name binding
needs to be made explicit through a line as illustrated.
The design that we have just described is itself obtained only up to renaming; the
actual choice of names for locations, channels, and actions does not really matter as
long as all the interconnections are respected and no additional interconnections are
introduced. Hence, in the case of the cart, what we obtain from the configuration
above is a specification of the controlled located cart as given in section 2.3. Notice
in particular that the result of synchronising the action
move[pos]@pos: ¬busy∧pos≠dest, false → true
of located cart, and the action
control[theirs]@theirs: true → theirs:=controlled(mode,theirs)
of mode controller is, after the amalgamation of the locations,
move[pos]@pos: ¬busy∧pos≠dest, false → pos:=controlled(mode,pos)

This is because the semantics of the composition is given by the conjunction of the
guards and of the effects of the component actions.
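Spelling this out (our summary of the colimit rules above, not an additional definition), a joint action g1||g2 located at l is specified by
L(g1||g2@l) = L(g1@l) ∧ L(g2@l), U(g1||g2@l) = U(g1@l) ∧ U(g2@l), R(g1||g2@l) = R(g1@l) ∧ R(g2@l),
where an action that is not distributed over l contributes a trivially true condition.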
Summarising, we have expressed the behaviour of the controlled located cart as re-
sulting from a configuration in which the located cart is connected to a component
that controls its movement. The advantage of this representation is that, in order to
change to a step-by-step control, we just need to replace the mode controller by an-
other connector, namely
design Step controller is
outloc theirs:Loc
do control[theirs]@theirs: true → theirs:=inc(theirs)

In this case, the required configuration is

[Configuration diagram: Located cart connected to Step controller – pos is linked to theirs and move is linked to control; dock, undock and next remain unconnected.]

3.2 Semantics of Interconnection

As already mentioned, the semantics of interconnection and configuration in Com-
mUnity is based on Category Theory [8]. In this section, we will only provide the
basic ingredients of this semantics. More details are available in [9,13].
We call a morphism σ: P1→P2 of CommUnity designs a triple consisting of a to-
tal function σch: X1→X2, a partial mapping σac: Γ2→Γ1 and a total function
σlc: L1→L2 that maps the designated location of P1 to that of P2, satisfying:
1. for every x∈X1:
(a) sort2(σch(x))=sort1(x);
(b) if x∈out(X1) then σch(x)∈out(X2);
(c) if x∈prv(X1) then σch(x)∈prv(X2);
(d) if x∈in(X1) then σch(x)∈out(X2)∪in(X2).
2. for every g∈Γ2 s.t. σac(g) is defined:
(a) if g∈sh(Γ2) then σac(g)∈sh(Γ1);
(b) if g∈prv(Γ2) then σac(g)∈prv(Γ1).
3. for every x∈X1 and l∈L1:
(e) σlc^{-1}(λ2)={λ1};
(f) if l∈out(L1) then σlc(l)∈out(L2);
(g) σlc(Λ1(x))⊆Λ2(σch(x)).
4. for every g∈Γ2 s.t. σac(g) is defined:
(c) σlc(Λ1(σac(g)))⊆Λ2(g).
5. for every x∈local(X1∪L1), σac is total on D2(σch(x)) and
σac(D2(σch(x)))⊆D1(x).
6. for every g∈Γ2 s.t. σac(g) is defined and l∈Λ1(σac(g)):
(a) σ(D1(σac(g)))⊆D2(g);
(b) Φ ⊢ R2(g@σlc(l))⊃σ(R1(σac(g)@l));
(c) Φ ⊢ L2(g@σlc(l))⊃σ(L1(σac(g)@l));
(d) Φ ⊢ U2(g@σlc(l))⊃σ(U1(σac(g)@l)).
By ⊢ we mean validity in the first-order sense taken over the axiomatisation Φ
of the underlying data types. These morphisms define a category MDSG.
Configuration diagrams define diagrams in this category, i.e. graphs whose nodes
are labelled with CommUnity designs as defined in section 2.4, and arrows are la-
belled with morphisms as defined above. Colimits of such diagrams define the se-
mantics of configurations.
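For example (our reading of the relationship used in section 3.3 below), the designs Cart and Located Cart of section 2 are related by the morphism that acts as the identity on names: σch(busy)=busy, σac(move)=move, σac(dock)=dock, σac(undock)=undock, and σlc maps the designated location λ of Cart to that of Located Cart.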

3.3 Separating Coordination and Distribution

Consider now the situation described at the beginning of this section: assume that we
want to monitor how many times a cart has docked at a station. Intuitively, we should
be able to start by putting in place just the mechanisms that coordinate the interaction
between the cart and the monitor, which should not depend on the location and mobil-
ity aspects of the cart.
The interaction aspects of the monitor can be reduced to a counting function and
designed as follows:
design Counter is
out count:nat
do inc[count]: true → count:=count+1
reset[count]: true → count:=0

The counter can be connected to the original design of the cart because their inter-
action does not involve mobility explicitly:

[Configuration diagram: Counter connected to Cart – inc is linked to dock; count, reset, undock and move remain unconnected.]

The semantics of the configuration is given by the following design:
design Monitored cart is
prv busy:bool
out count:nat
do move[]: ¬busy, false → true
dock[busy,count]: ¬busy, false → busy:=true || count:=count+1
undock[busy]: busy, false → busy:=false
reset[count]: true → count:=0
move&reset[count]: ¬busy, false → count:=0
undock&reset[busy,count]: busy, false → busy:=false || count:=0

Notice that the synchronisations of reset with move and undock are automatically
added! This is because there is no reason to prevent the counter from being reset
while the cart is moving or undocking. Should such synchronisations be undesirable,
one would have to configure the system in a way that prevents them; the default se-
mantics is of maximum parallelism. It is important to stress that this complex design
is just providing the semantics of the configuration; it is the configuration that one
should use to develop and evolve the system, not its semantics! In order to simplify
the presentation, we shall omit these synchronisations in the examples below and
replace them by ‘…’.
This interconnection can be extended to the located cart because the designs Cart
and Located Cart are related by a morphism as defined in 3.2. Indeed, because mor-
phisms preserve locations, channels and actions, the interconnection propagates from
the source to the target of the morphism:

[Configuration diagram: Counter connected to Located cart – inc is linked to dock; count, reset, next, pos, undock and move remain unconnected.]

Once again, details on the semantics of this propagation of interconnections can be
found in [13]. This semantics is based on the fact that, as explained in section 2.4,
every design involves the implicit location λ.
design Monitored Located cart is
outloc pos:Loc
in next:Loc
out count@λ:nat
prv busy@pos:bool, dest@pos:Loc
do move[pos]@pos: ¬busy∧pos≠dest, false → true
dock[busy,count]
@pos: ¬busy∧pos=dest, false → busy:=true
@ λ: ¬busy∧pos=dest, false → busy:=true || count:=count+1
undock[busy,dest]@pos:
busy∧pos=dest, false → busy:=false || dest:=next
reset[count]@λ: true → count:=0

Decisions on the location of the counter can now be made independently of those
made for the cart. A “minimal” decision is to consider that the location and mobility
of the counter is left to the environment:
design Counter* is
inloc where:Loc
out count@where:nat
do inc[count]@where: true → count:=count+1
reset[count]@where: true → count:=0

If one wants to place the counter in the cart, then the following configuration
should be used:

inc dock undock

Counter* Located cart


count next
where pos

reset move
design Monitored co-Located cart is
inloc pos:Loc
in next:Loc
out count@pos:nat
prv busy@pos:bool, dest@pos:Loc
do move[pos]@pos: ¬busy∧pos≠dest, false → true
dock[busy,count]@pos:
¬busy∧pos=dest, false → busy:=true || count:=count+1
undock[busy,dest]@pos:
busy∧pos=dest, false → busy:=false || dest:=next
reset[count]@pos: true → count:=0

Notice that the dock action is no longer distributed and combines the actions of the
cart and the counter.
If one wants to place the counter at some fixed location, the following connector
can be used:
design Fixed is
outloc stay:Loc

Notice that, because no actions are provided for changing the location of Fixed, it
cannot be moved!
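By contrast, and purely as an illustrative sketch of ours (no such connector appears in the paper), a connector whose position can be changed through one of its own actions could read its destination from an input channel:
design Movable is
outloc stay:Loc
in target:Loc
do relocate[stay]@stay: true, false → stay:=target
Under the semantics of section 2.4, relocate can only be executed when the value of target is a position reachable from the current value of stay.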

[Configuration diagram: Fixed, Counter* and Located cart – stay is linked to where and inc is linked to dock; count, reset, next, pos, undock and move remain unconnected.]

design Fixed Monitored Located cart is
outloc stay:Loc
inloc pos:Loc
in next:Loc
out count@stay:nat
prv busy@pos:bool, dest@pos:Loc
do move[pos]@pos: ¬busy∧pos≠dest, false → true
dock[busy,count]
@pos: ¬busy∧pos=dest, false → busy:=true
@stay: true → count:=count+1
undock[busy,dest]@pos:
busy∧pos=dest, false → busy:=false || dest:=next
reset[count]@stay: true → count:=0

We can now put together our system by combining these different connectors, for
instance a cart monitored by a fixed counter and step controller:
[Configuration diagram: Fixed, Counter* and Located cart as above, with a Step controller additionally connected – pos is linked to theirs and move is linked to control.]

The ability to externalise connectors and address coordination and distribution in a
separate way is what supports a true compositional approach to development and
evolution: we can plug-in and plug-out the connectors that address the different con-
cerns without having to redesign the system at the level of its code. Hence, it is pos-
sible for changes to be performed immediately at the configuration level, without
having to interfere with the lower levels of design.

4 Conclusions and Further Work

This paper presented, around a typical example – a luggage handling system – how
CommUnity is being extended with primitives that support the modelling of distribu-
tion and mobility aspects at an architectural level. This extension is being pursued
within the IST-2001-32747 Project AGILE – Architectures for Mobility with two main
goals in mind:
• To provide support for the description of the mobility aspects of systems in a way
that is completely separated from the computational and coordination concerns.
• To be based on proper abstractions for modelling the part of the run-time envi-
ronment that may affect their behaviour, which is often referred to as the context.
This paper focused essentially on the first goal. We showed how a new class of ar-
chitectural connectors can be defined that externalise patterns and policies related to
the locations in which components perform computations and the network topology
that supports coordination. Such connectors can be superposed over location-
transparent models of components and connectors as a means of addressing the mo-
bility-based aspects that reflect the properties of the operational and communication
infrastructure without having to redesign the other dimensions.
In this respect, our work goes one step beyond what can be found in the literature
that addresses the formalisation of software architectures, e.g. [1]. In all the ap-
proaches that we know, including those around Mobile Unity [17,18], the mobility
dimension is not taken as a separate and first-class aspect.
Further work is in progress in several directions.
On the one hand, we are using these results on CommUnity to make available this
level of architectural support in modelling languages like the UML. The aim is to
extend the set of coordination-based semantic primitives that we developed in the past
[2] with similar ones for distribution and mobility [4]. At the same time, we are relat-
ing architectural design in CommUnity with extensions of process languages like
KLAIM [5] that can handle distribution and mobility at a lower level of abstraction.
On the other hand, and towards the second goal mentioned above, we are further
exploring the notion of context that we only briefly mentioned in section 2.4. Con-
texts usually model these different types of resources as well as other kinds of exter-
nal factors, from the screen size of a device to the power left on a battery. Given that
different kinds of applications typically require different notions of context, it is im-
portant that formalisms for designing mobile systems consider contexts as first-class
design entities and support their explicit modelling. If a specific notion of context is
assumed as, for instance, in Ambients [7], the encoding of a different notion of con-
text can be cumbersome and entangled with other aspects, if at all possible. By ex-
tending CommUnity with the explicit modelling of a notion of context, we hope to
make it possible for such aspects to be progressively refined through the addition of
detail, without interfering with the parts of the system already designed.

Acknowledgements

This work was partially supported through the IST-2001-32747 Project AGILE –
Architectures for Mobility. We wish to thank our partners for much useful feedback.

References

1. R.Allen and D.Garlan, “A Formal Basis for Architectural Connectors”, ACM TOSEM 6(3),
213-249, 1997.
2. L.F.Andrade and J.L.Fiadeiro, “Architecture Based Evolution of Software Systems”, in
M.Bernardo and P.Inverardi (eds), Formal Methods for Software Architectures, LNCS
2804, 148-181, Springer Verlag 2003.
3. L.F.Andrade, J.L.Fiadeiro, A.Lopes and M.Wermelinger, “Architectural Techniques for
Evolving Control Systems”, in G.Tarnai and E.Schnieder (eds), Formal Methods for Rail-
way Operation and Control Systems, 61-70, L’Harmattan Press 2003
4. N.Aoumeur, J.L.Fiadeiro and C.Oliveira, “Towards an Architectural Approach to Loca-
tion-Aware Business Processes”, in Proc. 13th IEEE International Workshops on Enabling
Technologies: Infrastructures for Collaborative Enterprises (WETICE-2004), IEEE Com-
puter Society Press 2004.
5. L. Bettini, M. Loreti, and R. Pugliese “An Infrastructure Language for Open Nets”, in
Proceedings of the 2002 ACM Symposium on Applied Computing, 373-377, ACM 2002
6. K.Chandy and J.Misra, Parallel Program Design - A Foundation, Addison-Wesley 1988.
196 J.L. Fiadeiro and A. Lopes

7. L.Cardelli and A.Gordon, “Mobile Ambients”, in Nivat (ed), FOSSACS’98, LNCS 1378,
140-155, Springer-Verlag 1998.
8. J.L.Fiadeiro, Categories for Software Engineering, Springer-Verlag 2004.
9. J.L.Fiadeiro, A.Lopes and M.Wermelinger, “A Mathematical Semantics for Architectural
Connectors”, in R.Backhouse and J.Gibbons (eds), Generic Programming, LNCS 2793,
190-234, Springer-Verlag 2003.
10. N.Francez and I.Forman, Interacting Processes, Addison-Wesley 1996.
11. D.Gelernter and N.Carriero, “Coordination Languages and their Significance”,
Communications ACM 35(2), 97-107, 1992.
12. S.Katz, “A Superimposition Control Construct for Distributed Systems”, ACM TOPLAS
15(2), 337-356, 1993.
13. A.Lopes and J.L.Fiadeiro, “Adding Mobility to Software Architectures”, in A.Brogi and
J.-M.Jacquet (eds), FOCLASA 2003 – Foundations of Coordination Languages and Soft-
ware Architecture, Electronic Notes in Theoretical Computer Science. Elsevier Science, in
print.
14. A.Lopes, J.L.Fiadeiro and M.Wermelinger, “Architectural Primitives for Distribution and
Mobility”, in Proc. SIGSOFT 2002/FSE-10, 41-50, ACM Press 2002.
15. A.Lopes and J. L. Fiadeiro, “Using Explicit State to Describe Architectures”, in
E.Astesiano (ed), FASE’99, LNCS 1577, 144–160, Springer-Verlag 1999.
16. D.Perry and A.Wolf, “Foundations for the Study of Software Architectures”, ACM SIG-
SOFT Software Engineering Notes 17(4), 40-52, 1992.
17. G.Picco, G.-C.Roman and P.McCann, “Expressing Code Mobility in Mobile Unity”, in
M.Jazayeri and H.Schauer (eds), Proc. 6th ESEC, LNCS 1301, 500-518, Springer-Verlag
1998.
18. G.-C.Roman, A.L.Murphy and G.P.Picco, “Coordination and Mobility”, in A.Omicini et al
(eds), Coordination of Internet Designs: Models, Techniques, and Applications, 253-273,
Springer-Verlag 2001.
19. M.Wermelinger and J.Fiadeiro, “Connectors for Mobile Programs”, IEEE Transactions on
Software Engineering 24(5), 331-341, 1998.
TulaFale: A Security Tool for Web Services

Karthikeyan Bhargavan, Cédric Fournet,
Andrew D. Gordon, and Riccardo Pucella

Microsoft Research

Abstract. Web services security specifications are typically expressed
as a mixture of XML schemas, example messages, and narrative expla-
nations. We propose a new specification language for writing comple-
mentary machine-checkable descriptions of SOAP-based security pro-
tocols and their properties. Our TulaFale language is based on the pi
calculus (for writing collections of SOAP processors running in paral-
lel), plus XML syntax (to express SOAP messaging), logical predicates
(to construct and filter SOAP messages), and correspondence assertions
(to specify authentication goals of protocols). Our implementation com-
piles TulaFale into the applied pi calculus, and then runs Blanchet’s
resolution-based protocol verifier. Hence, we can automatically verify
authentication properties of SOAP protocols.

1 Verifying Web Services Security


Web services are a wide-area distributed systems technology, based on asyn-
chronous exchanges of XML messages conforming to the SOAP message for-
mat [BEK+ 00, W3C03]. The WS-Security standard [NKHBM04] describes how
to sign and encrypt portions of SOAP messages, so as to achieve end-to-end
security. This paper introduces TulaFale, a new language for defining and au-
tomatically verifying models of SOAP-based cryptographic protocols, and illus-
trates its usage for a typical request/response protocol: we sketch the protocol,
describe potential attacks, and then give a detailed description of how to define
and check the request and response messages in TulaFale.

1.1 Web Services


A basic motivation for web services is to support programmatic access to web
data. The HTML returned by a typical website is a mixture of data and presen-
tational markup, well suited for human browsing, but the presence of markup
makes HTML a messy and brittle format for data processing. In contrast, the
XML returned by a web service is just the data, with some clearly distinguished
metadata, well suited for programmatic access. For example, search engines ex-
port web services for programmatic web search, and e-commerce sites export
web services to allow affiliated websites direct access to their databases.
Generally, a broad range of applications for web services is emerging, from
the well-established use of SOAP as a platform and vendor neutral middleware
within a single organisation, to the proposed use of SOAP for device-to-device
interaction [S+ 04].
In the beginning, “SOAP” stood for “Simple Object Access Protocol”, and
was intended to implement “RPC using XML over HTTP” [Win98, Win99,
Box01]. HTTP facilitates interoperability between geographically distant ma-
chines and between those in protection domains separated by corporate fire-
walls that block many other transports. XML facilitates interoperability between
different suppliers’ implementations, unlike various binary formats of previous
RPC technologies. Still, web services technology should not be misconstrued
as HTTP-specific RPC for distributed objects [Vog03]. HTTP is certainly at
present the most common transport protocol, but the SOAP format is inde-
pendent of HTTP, and some web services use other transports such as TCP
or SMTP [SMWC03]. The design goals of SOAP/1.1 [BEK+ 00] explicitly pre-
clude object-oriented features such as object activation and distributed garbage
collection; by version 1.2 [W3C03], “SOAP” is a pure name, not an acronym.
The primitive message pattern in SOAP is a single one-way message that may
be processed by zero or more intermediaries between two end-points; RPC is a
derived message pattern built from a request and a response. In brief, SOAP is
not tied to objects, and web services are not tied to the web. Still, our running
example is an RPC over HTTP, which still appears to be the common case.

1.2 Securing Web Services with Cryptographic Protocols


Web services specifications support SOAP-level security via a syntax for embed-
ding cryptographic materials in SOAP messages. To meet their security goals,
web services and their clients can construct and check security headers in mes-
sages, according to the WS-Security format [IM02, NKHBM04]. WS-Security can
provide message confidentiality and authentication independently of the under-
lying transport, using, for instance, secure hash functions, shared-key encryp-
tion, or public-key cryptography. WS-Security has several advantages compared
to using a secure transport such as SSL, including scalability, flexibility, trans-
parency to intermediaries such as firewalls, and support for non-repudiation.
Significantly, though, WS-Security does not itself prescribe a particular security
protocol: each application must determine its security goals, and process security
headers accordingly.
Web services may be vulnerable to many of the well-documented classes of
attack on ordinary websites [SS02, HL03]. Moreover, unlike typical websites, web
services relying on SOAP-based cryptographic protocols may additionally be
vulnerable to a new class of XML rewriting attacks: a range of attacks in which an
attacker may record, modify, replay, and redirect SOAP messages, but without
breaking the underlying cryptographic algorithms. Flexibility comes at a price
in terms of security, and it is surprisingly easy to misinterpret the guarantees
actually obtained from processing security headers. XML is hence a new setting
for an old problem going back to Needham and Schroeder’s pioneering work
on authentication protocols; SOAP security protocols should be judged safe,
or not, with respect to an attacker who is able to “interpose a computer on
all communication paths, and thus can alter or copy parts of messages, replay
messages, or emit false material” [NS78]. XML rewriting attacks are included
in the WS–I threat model [DHK+ 04]. We have found a variety of replay and
impersonation attacks in practice.

1.3 Formalisms and Tools for Cryptographic Protocols


The use of formal methods to analyze cryptographic protocols and their vulner-
abilities begins with work by Dolev and Yao [DY83]. In the past few years there
has been intense research on the Dolev-Yao model, leading to the development
of numerous formalisms and tools.
TulaFale builds on the line of research using the pi calculus. The pi cal-
culus [Mil99] is a general theory of interaction between concurrent processes.
Several variants of the pi calculus, including spi [AG99], and a generalization,
applied pi [AF01], have been used to formalize and prove properties of cryp-
tographic protocols. A range of compositional reasoning techniques is available
for proving protocol properties, but proofs typically require human skill and de-
termination. Recently, however, Blanchet [Bla01, Bla02] has proposed a range
of automatic techniques, embodied in his theorem prover ProVerif, for checking
certain secrecy and authentication properties of the applied pi calculus. ProVerif
works by compiling the pi calculus to Horn clauses and then running resolution-
based algorithms.
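To give a flavour of this compilation (our own illustrative sketch, not actual
ProVerif output), the attacker's ability to apply public-key encryption and
decryption, in the symbolic style introduced in Section 3, would be captured by
Horn clauses of roughly the following shape:

\begin{align*}
\mathit{attacker}(k) \wedge \mathit{attacker}(x) &\;\Rightarrow\; \mathit{attacker}(\mathit{rsa}(k,x))\\
\mathit{attacker}(s) \wedge \mathit{attacker}(\mathit{rsa}(\mathit{pk}(s),x)) &\;\Rightarrow\; \mathit{attacker}(x)
\end{align*}

Resolution then searches for derivations of facts such as attacker(secret); if no
such derivation exists, the corresponding secrecy property holds, and authentication
properties are handled by a similar encoding of events.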

1.4 TulaFale: A Security Tool for Web Services


TulaFale is a new scripting language for specifying SOAP security protocols, and
verifying the absence of XML rewriting attacks:
TulaFale = processes + XML + predicates + assertions
The pi calculus is the core of TulaFale, and allows us to describe SOAP
processors, such as clients and servers, as communicating processes. We extend
the pi calculus with a syntax for XML plus symbolic cryptographic operations;
hence, we can directly express SOAP messaging with WS-Security headers. We
declaratively specify the construction and checking of SOAP messages using
Prolog-style predicates; hence, we can describe the operational details of SOAP
processing. Independently, we specify security goals using various assertions,
such as correspondences for message authentication and correlation.
It is important that TulaFale can express the detailed structure of XML sig-
natures and encryption so as to catch low-level attacks on this structure, such
as copying part of an XML signature into another; more abstract representa-
tions of message formats, typical in the study of the Dolev-Yao model and used
for instance in previous work on SOAP authentication protocols [GP03], are
insensitive to such attacks.
Our methodology when developing TulaFale has been to study particular
web services implementations, and to develop TulaFale scripts modelling their
security aspects. Our experiments have been based on the WSE development

Fig. 1. Modelling WS-Security protocols with TulaFale

kit [Mic02], a particular implementation of WS-Security and related specifications.
We have implemented the running example protocol of this paper using
WSE, and checked that the SOAP messages specified in our script faithfully
reflect the SOAP messages observed in this implementation. For a discussion of
the implementation of related protocols, including logs of SOAP messages, see
the technical report version of an earlier paper [BFG04a].
Fig. 1 illustrates our methodology. On the left, we have the user-supplied
code for implementing a web services protocol, such as the one of this paper,
on top of the WSE library. On the right, we have the TulaFale script modelling
the user-supplied code, together with some library predicates modelling opera-
tions performed by WSE. Also on the right, we have the TulaFale tool, which
compiles its input scripts into the pure applied pi calculus to be analyzed via
ProVerif.
TulaFale is a direct implementation of the pi calculus described in a pre-
vious formal semantics of web services authentication [BFG04a]. The original
contribution of this paper is to present a concrete language design, to report an
implementation of automatic verification of assertions in TulaFale scripts using
Blanchet’s ProVerif, and to develop a substantial example.
Section 2 informally introduces a simple request/response protocol and its
security goals: authentication and correlation of the two messages. Section 3
presents TulaFale syntax for XML with symbolic cryptography and for pred-
icates, and as a source of examples, explains a library of TulaFale predicates
for constructing and checking SOAP messages. Section 4 describes predicates
specific to the messages of our request/response protocol. Section 5 introduces
processes and security assertions in TulaFale, and outlines their implementation
via ProVerif. Section 6 describes processes and predicates specific to our proto-
col, and shows how to verify its security goals. Finally, Section 7 concludes.

2 A Simple Request/Response Protocol


We consider a simple SOAP-based request/response protocol, of the kind easily
implemented using WSE to make an RPC to a web service. Our security goals
are simply message authentication and correlation. To achieve these goals, the
request includes a username token identifying a particular user and a signature
token signed by a key derived from the user's password; conversely, the response in-
cludes a signature token signed by the server’s public key. Moreover, to preserve
the confidentiality of the user’s password from dictionary attacks, the username
token in the request message is encrypted with the server’s public key. (For
simplicity, we are not concerned here with any secrecy properties, such as confi-
dentiality of the actual message bodies, and we do not model any authorization
policies.)
In the remainder of this section, we present a detailed but informal specifi-
cation of our intended protocol, and consider some variations subject to XML
rewriting attacks. Our protocol involves the following principals:
– A single certification authority (CA) issuing X.509 public-key certificates for
services, signed with the CA’s private key.
– Two servers, each equipped with a public key certified by the CA and ex-
porting an arbitrary number of web services.
– Multiple clients, acting on behalf of human users.
Trust between principals is modelled as a database associating passwords to
authorized user names, accessible from clients and servers. Our threat model
features an active attacker in control of the network, in possession of all public
keys and user names, but not in possession of any of the following:
(1) The private key of the CA.
(2) The private key of any public key certified by the CA.
(3) The password of any user in the database.
The second and third points essentially rule out “insider attacks”; we are
assuming that the clients, servers, and CA belong to a single close-knit institu-
tion. It is easy to extend our model to study the impact of insider attacks, and
indeed to allow more than two servers, but we omit the details in this expository
example.
Fig. 2 shows an intended run of the protocol between a client and server.

– The principal Client(kr,U) acts on behalf of a user identified by U (an XML
encoding of the username and password). The parameter kr is the public key
of the CA, needed by the client to check the validity of public key certificates.
– The principal Server(sx,cert,S) implements a service identified by S (an XML
encoding of a URL address, a SOAP action, and the subject name appearing
on the service’s certificate). The parameter sx is the server’s private signing
key, while cert is its public certificate.
– The client sends a request message satisfying isMsg1(−,U,S,id1,t1,b1), which
we define later to mean the message has body b1, timestamp t1, and message

Fig. 2. An intended run of a client and server

identifier id1, is addressed to a web service S, and has a <Security> header
containing a token identifying U and encrypted with S’s public key, and a
signature of S, id1, t1, and b1 by U.
– The server sends a response message satisfying isMsg2(−,S,id1,id2,t2,b2),
which we define later to mean the message has body b2, timestamp t2,
and message identifier id2, is sent from S, and has a <Security> header
containing S’s certificate cert and a signature of id1, id2, t2, and b2 by S.
– The client and server enact begin- and end-events labelled C1(U,S,id1,t1,b1)
to record the data agreed after receipt of the first message. Similarly, the
begin- and end-events labelled C2(U,S,id1,t1,b1,id2,t2,b2) record the data
agreed after both messages are received. Each begin-event marks an intention
to send data. Each end-event marks apparently successful agreement on data.

The begin- and end-events define our authentication and correlation goals: for
every end-event with a particular label, there is a preceding begin-event with the
same label in any run of the system, even in the presence of an active attacker.
Such goals are known as one-to-many correspondences [WL93] or non-injective
agreements [Low97]. The C1 events specify authentication of the request, while
the C2 events specify authentication of the response. By including data from the
request, C2 events also specify correlation of the request and response.
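Spelled out a little more formally (our restatement of the informal definition
above, not notation from this paper), non-injective agreement on a label L requires
that, in every run of the system composed with an arbitrary attacker,

\[
\mathbf{end}\,L(M) \text{ occurs at position } j \;\Longrightarrow\; \exists\, i<j.\ \mathbf{begin}\,L(M) \text{ occurs at position } i,
\]

for every tuple of data M; injective agreement additionally requires a distinct
begin-event for each end-event.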
Like most message sequence notations, Fig. 2 simply illustrates a typical
protocol run, and is not in itself an adequate specification. In Sections 4 and 6 we
present a formal specification in TulaFale: we define the principals Client(kr,U)
and Server(sx,cert,S) as parametric processes, and we define the checks isMsg1
and isMsg2 as predicates on our model of XML with symbolic cryptography. The
formal model clarifies the following points, which are left implicit in the figure:

Fig. 3. A replay attack

– The client can arbitrarily choose which service S to call, and which data
t1 and b1 to send. (In the formal model, we typically make such arbitrary
choices by inputting the data from the opponent.) Conversely, the client
must generate a fresh identifier id1 for each request, or else it is impossible to
correlate the responses from two simultaneous requests to the same service.
– Similarly, the server can arbitrarily choose the response data id2, t2, and b2.

On the other hand, our formal model does not directly address replay pro-
tection. To rule out direct replays of correctly signed messages, we would need
to specify that for each end-event there is a unique preceding begin-event with
the same label. This is known as a one-to-one correspondence or injective agree-
ment. In practice, we can protect against direct replays using a cache of recently
received message identifiers and timestamps to ensure that no two messages are
accepted with the same identifier and timestamp. Hence, if we can prove that
the protocol establishes non-injective agreement on data including the identifiers
and timestamps, then, given such replay protection, the protocol implementation
also establishes injective agreement.
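To make this replay-protection assumption concrete, here is a small Python sketch
(ours, not code taken from TulaFale or WSE) of such a cache: it rejects any message
whose identifier and timestamp have already been accepted, and also rejects messages
whose timestamp has expired, so the cache itself stays bounded. The 300-second window
is an arbitrary example value, and timestamps are taken as seconds since the epoch.

import time

class ReplayCache:
    """Illustrative cache rejecting duplicate (message id, timestamp) pairs.

    A sketch of the argument in the text, not code from TulaFale or WSE;
    timestamps are seconds since the epoch, window_seconds an example value.
    """

    def __init__(self, window_seconds=300):
        self.window = window_seconds
        self.seen = set()  # (msg_id, timestamp) pairs accepted so far

    def accept(self, msg_id, timestamp):
        now = time.time()
        # Forget pairs whose timestamp has expired: the staleness check below
        # rejects any replay of them anyway, so the cache stays bounded.
        self.seen = {(mid, ts) for (mid, ts) in self.seen
                     if now - ts <= self.window}
        if now - timestamp > self.window:
            return False  # timestamp too old: reject as stale
        if (msg_id, timestamp) in self.seen:
            return False  # identifier and timestamp seen before: direct replay
        self.seen.add((msg_id, timestamp))
        return True

Non-injective agreement on data that includes the identifier and timestamp is exactly
what makes this check sufficient: every accepted message corresponds to some begin-event,
and the cache prevents accepting the same begin-event's data twice within the window.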
We end this section by discussing some flawed variations of the protocol,
corresponding to actual flaws we have encountered in user code for web services.

– Suppose that the check isMsg1(−,U,S,id1,t1,b1) only requires that S, id1, and
b1 are signed by U, but not the timestamp t1. Replay protection based on
the timestamp is now ineffective: the opponent can record a message with
timestamp t1, wait until some time t2 when the timestamp has expired,
and the message identifier id1 is no longer being cached, rewrite the original

Fig. 4. A failure of message correlation

message with timestamp t2, and then replay the message. The resulting
message satisfies isMsg1(−,U,S,id1,t2,b1), since t2 does not need to be signed,
and hence is accepted by the server. Fig. 3 shows the attack, and the resulting
failure of correspondence C1.
– Suppose that a client re-uses the same message identifier in two different
calls to a web service; the opponent can manipulate messages so that the
client treats the response to the first call as if it were the response to the
second call. Fig. 4 shows the attack. The client sends a first request with
body b1 and identifier id1. The opponent intercepts the response with body
b2, and sends a SOAP fault back to the client. Subsequently, the client sends
a second request with the same identifier id1 as the first, and body b1’. The
opponent can delete this request to prevent it reaching the service, and then
replay the original response. The client now considers that b2 is the response
to b1’, when in fact it is the response to b1, perhaps completely different.
Formally, this is a failure of correspondence C2.
– Suppose that the server does not include the request identifier id1 in the
signature on the response message. Then the opponent can mount a similar
correlation attack, breaking C2—we omit the details.

We can easily adapt our TulaFale script to model these variations in the
protocol. Our tool automatically and swiftly detects the errors, and returns
descriptions of the messages sent during the attacks. These flaws in web services
code are typical of errors in cryptographic protocols historically. The practical
impact of these flaws is hard to assess, as they were found in preliminary code,
before deployment. Still, it is prudent to eliminate these vulnerabilities, and tools
such as TulaFale can systematically rule them out.

3 XML, Principals, and Cryptography in TulaFale


This section introduces the term and predicate language of TulaFale, via a series
of basic constructions needed for the example protocol of Section 2. Throughout
the paper, for the sake of exposition, we elide some details of SOAP envelopes,
such as certain headers and attributes, that are unconnected to security.

3.1 XML Elements, Attributes, and Strings


Here is a TulaFale term for a SOAP request, illustrating the format of the first
message in our example protocol:
<Envelope>
<Header>
<To>uri</>
<Action>ac</>
<MessageId>id</>
<Security>
<Timestamp><Created>"2004-03-19T09:46:32Z"</></>
utok
sig
</>
</>
<Body Id="1">request</>
</>
Every SOAP message consists of an XML <Envelope> element, with two
children: an optional <Header> and a mandatory <Body>. In this example, the
header has four children, and the body has an Id-attribute, the literal string "1".
We base TulaFale on a sorted (or typed) term algebra, built up from a set
of function symbols and variables. The basic sorts for XML data include string
(for string literals), att (for named attributes), and item (either an element or
a string). Every element or attribute tag (such as Envelope or Id, for example)
corresponds to a sorted function symbol in the underlying algebra.
Although TulaFale syntax is close to the XML wire format, it is not identi-
cal. We suppress all namespace information. As previously mentioned, we omit
closing element tags; for example, we write </> instead of </Envelope>. Literal
strings are always quoted, as in <Created>"2004-03-19T09:46:32Z"</>. In the
standard wire format, the double quotes would be omitted when a string is an
element body. We use quotation to distinguish strings from term variables, such
as the variables uri, ac, id, utok, sig, and request in the example above.

3.2 Symbolic Cryptography


In TulaFale, we represent cryptographic algorithms symbolically, as function
symbols that act on a sort bytes of byte arrays. Each function is either a data con-
structor, with no accompanying rewrite rule, or it is a destructor, equipped with
a rewrite rule for testing or extracting data from an application of a constructor.

For example, encryption functions are constructors, and decryption functions are
destructors. This approach, the basis of the Dolev-Yao model [DY83], assumes
that the underlying cryptography is perfect, and can be faithfully reflected by ab-
stract equational properties of the functions. It also abstracts some details, such
as the lengths of strings and byte arrays. The TulaFale syntax for introducing
constructors and destructors is based on the syntax used by ProVerif.
For instance, we declare function symbols for RSA key generation, public-key
encryption, and private-key decryption using the following TulaFale declarations:
constructor pk(bytes):bytes.
constructor rsa(bytes,bytes):bytes.
destructor decrsa(bytes,bytes):bytes with
decrsa(s,rsa(pk(s),b)) = b.
The constructor pk represents the relationship between private and public
keys (both byte arrays, of sort bytes); it takes a private key and returns the
corresponding public key. There is no inverse or destructor, as we intend to
represent a one-way function: given only pk(s) it is impossible to extract s.
The constructor rsa(k,x) encrypts the data x:bytes under the public key k,
producing an encrypted byte array. The destructor decrsa(s,e) uses the corre-
sponding private key s to decrypt a byte array generated by rsa(pk(s),x). The
destructor definition expresses the decryption operation as a rewrite rule: when
an application of decrsa in a term matches the left-hand side of the rule, it may
be replaced by the corresponding right-hand side.
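For example (our illustration of the rule in action), decryption with the matching
private key rewrites to the plaintext, whereas applying decrsa with any other key
leaves the term stuck, which is how the model captures the assumption of perfect
cryptography:

\[
\mathit{decrsa}(s,\ \mathit{rsa}(\mathit{pk}(s),b)) \;\longrightarrow\; b
\qquad\qquad
\mathit{decrsa}(s',\ \mathit{rsa}(\mathit{pk}(s),b)) \;\not\longrightarrow\; \quad (s' \neq s)
\]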
To declare RSA public-key signatures, we introduce another constructor
rsasha1(s,x) that produces an RSA signature of a cryptographic hash of data
x under the private key s:
constructor rsasha1(bytes,bytes):bytes.
destructor checkrsasha1(bytes,bytes,bytes):bytes with
checkrsasha1(pk(s),x,rsasha1(s,x))=pk(s).
To check the validity of a signature sig on x using a public key k, one can
form the term checkrsasha1(k,x,sig) and compare it to k. If k is a public key of
the form pk(s) and sig is the result of signing x under the corresponding private
key s, then this term rewrites to k.
For the purposes of this paper, an X.509 certificate binds a key to a subject
name by embedding a digital signature generated from the private key of some
certifying authority (CA). We declare X.509 certificates as follows:
constructor x509(bytes,string,string,bytes):bytes.
destructor x509key(bytes):bytes with
x509key(x509(s,u,a,k))=k.
destructor x509user(bytes):string with
x509user(x509(s,u,a,k))=u.
destructor x509alg(bytes):string with
x509alg(x509(s,u,a,k))=a.
destructor checkx509(bytes,bytes):bytes with
checkx509(x509(sr,u,a,k),pk(sr))=pk(sr).

The term x509(sr,u,a,k) represents a certificate that binds the subject name u
to the public key k, for use with the signature algorithm a (typically rsasha1).
This certificate is signed by the CA with private key sr. Given such a certifi-
cate, the destructors x509key, x509user, and x509alg extract the three public
fields of the certificate. Much like checkrsasha1 for ordinary digital signatures,
an additional destructor checkx509 can be used to check the authenticity of the
embedded signature.

3.3 XML Encryption and Decryption


Next, we write logical predicates to construct and parse XML encrypted under
some known RSA public key. A predicate is written using a Prolog-like syntax; it
takes a tuple of terms and checks logical properties, such as whether two terms
are equal or whether a term has a specific format. It is useful to think of some
of the terms given to the predicate as inputs and the others as outputs. Under
this interpretation, the predicate computes output terms that satisfy the logical
properties by pattern-matching.
The predicate mkEncryptedData takes a plaintext plain:item and an RSA
public encryption key ek:bytes, and it generates an XML element encrypted
containing the XML encoding of plain encrypted under ek.
predicate mkEncryptedData (encrypted:item,plain:item,ek:bytes) :−
cipher = rsa(ek,c14n(plain)),
encrypted = <EncryptedData>
<CipherData>
<CipherValue>base64(cipher)</></></>.
The first binding in the predicate definition computes the encrypted byte
array, cipher, using the rsa encryption function applied to the key ek and the
plaintext plain. Since rsa is only defined over byte arrays, plain:item is first
converted to bytes using the (reversible) c14n constructor. The second bind-
ing generates an XML element (<EncryptedData>) containing the encrypted
bytes. Since only strings or elements can be embedded into XML elements, the
encrypted byte array, cipher, is first converted to a string using the (reversible)
base64 constructor.
In this paper, we use three transformation functions between sorts: c14n
(with inverse ic14n) transforms an item to a bytes, base64 (with inverse ibase64)
transforms a bytes to a string, and utf8 (with inverse iutf8) transforms a string
to a bytes. All three functions have specific meanings in the context of XML
transformations, but we treat them simply as coercion functions between sorts.
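Written out (our summary of the inverses just listed), the coercions satisfy, for
x:item, y:bytes, and z:string:

\[
\mathit{ic14n}(\mathit{c14n}(x)) = x \qquad
\mathit{ibase64}(\mathit{base64}(y)) = y \qquad
\mathit{iutf8}(\mathit{utf8}(z)) = z
\]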
To process a given element, <Foo> say, we sometimes write distinct predicates
on the sending side and on the receiving side of the protocol, respectively. By
convention, to construct a <Foo> element, we write a predicate named mkFoo
whose first parameter is the element being constructed; to parse and check it,
we write a predicate named isFoo.
For <EncryptedData>, for instance, the logical predicate isEncryptedData
parses elements constructed by mkEncryptedData; it takes an element encrypted

and a decryption key dk:bytes and, if encrypted is an <EncryptedData> element
with some plaintext encrypted under the corresponding encryption key pk(dk),
it returns the plaintext as plain.
predicate isEncryptedData (encrypted:item,plain:item,dk:bytes) :−
encrypted = <EncryptedData>
<CipherData>
<CipherValue>base64(cipher)</></></>,
c14n(plain) = decrsa(dk,cipher).
Abstractly, this predicate reverses the computations performed by mkEncryp-
tedData. One difference, of course, is that while mkEncryptedData is passed the
public encryption key ek, the isEncryptedData predicate is instead passed the
private key, dk. The first line matches encrypted against a pattern representing
the <EncryptedData> element, extracting the encrypted byte array, cipher. Re-
lying on pattern-matching, the constructor base64 is implicitly inverted using its
destructor ibase64. The second line decrypts cipher using the decryption key dk,
and implicitly inverts c14n using its destructor ic14n to compute plain.

3.4 Services and X509 Security Tokens


We now implement processing for the service identifiers used in Section 2. We
identify each web service by a structure consisting of a <Service> element con-
taining <To>, <Action>, and <Subject> elements. For message routing, the web
service is identified by the HTTP URL uri where it is located and the name of
the action ac to be invoked at the URL. (In SOAP, there may be several differ-
ent actions available at the same uri.) The web service is then willing to accept
any SOAP message with a <To> header containing uri and an <Action> header
containing ac. Each service has a public key with which parts of requests may be
encrypted, and parts of responses signed. The <Subject> element contains the
subject name bound to the server’s public key by the X.509 certificate issued by
the CA. For generality, we do not assume any relationship between the URL and
subject name of a service, although in practice the subject name might contain
the domain part of the URL.
The logical predicate isService parses a service element to extract the <To>
field as uri, the <Action> field as ac, and the <Subject> field as subj:
predicate isService(S:item,uri:item,ac:item,subj:string) :−
S = <Service><To>uri</> <Action>ac</> <Subject>subj</></>.
We also define predicates to parse X.509 certificates and to embed them in
SOAP headers:
predicate isX509Cert (xcert:bytes,kr:bytes,u:string,a:string,k:bytes) :−
checkx509(xcert,kr) = kr,
u = x509user(xcert),
k = x509key(xcert),
a = x509alg(xcert).

predicate isX509Token (tok:item,kr:bytes,u:string,a:string,k:bytes) :−
tok = <BinarySecurityToken ValueType="X509v3">base64(xcert)</>,
isX509Cert (xcert,kr,u,a,k).
The predicate isX509Cert takes a byte array xcert containing an X.509 certifi-
cate, checks that it has been issued by a certifying authority with public key kr,
and extracts the user name u, the certified public key k, and the signing algorithm a.
In SOAP messages, certificates are carried in XML elements called security to-
kens. The predicate isX509Token checks that an XML token element contains a
valid X.509 certificate and extracts the relevant fields.

3.5 Users and Username Security Tokens


In our system descriptions, we identify each user by a <User> element that con-
tains their username and password. The predicate isUser takes such an element,
U, and extracts its embedded username u and password pwd.
predicate isUser (U:item,u:string,pwd:string) :−
U = <User><Username>u</><Password>pwd</></>.
In SOAP messages, the username is represented by a UsernameToken that
contains the <Username> u, a freshly generated nonce n, and a timestamp t. The
predicate isUserTokenKey takes such a token tok and extracts u, n, t, and then
uses a user U for u to compute a key from pwd, n, and t.
predicate isUserTokenKey (tok:item,U:item,n:bytes,t:string,k:bytes) :−
isUser(U,u,pwd),
tok = <UsernameToken @ _>
<Username>u</>
<Nonce>base64(n)</>
<Created>t</></>,
k = psha1(pwd,concat(n,utf8(t))).

The first line parses U to extract the username u and password pwd. The
second line parses tok to extract n and t, implicitly checking that the user-
name u is the same. In TulaFale, lists of terms are written as tm1 ... tmm @ tm
with m ≥ 0, where the terms tm1, ..., tmm are the first m members of the list,
and the optional term tm is the rest of the list. Here, the wildcard @ _ of the
<UsernameToken> element matches the entire list of attributes. The last line
computes the key k by applying the cryptographic hash function psha1 to pwd,
n, and t (converted to bytes). (This formula for k is a slight simplification of the
actual key derivation algorithm used by WSE.) The concat function returns the
concatenation of two byte arrays.
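As a concrete, purely illustrative analogue of this symbolic key computation, the
following Python sketch treats psha1 as HMAC-SHA1 keyed with the password; it is not
TulaFale code and, as just noted, not the actual WSE derivation either:

import hashlib
import hmac

def derive_key(pwd, nonce, created):
    """Illustrative analogue of k = psha1(pwd, concat(n, utf8(t)))."""
    # pwd is the shared password, nonce a byte string, created a timestamp string.
    return hmac.new(pwd.encode("utf-8"),
                    nonce + created.encode("utf-8"),
                    hashlib.sha1).digest()

# Example: a 20-byte signing key derived from a password, nonce, and timestamp.
k = derive_key("secret", b"\x00\x01\x02\x03", "2004-03-19T09:46:32Z")
assert len(k) == 20

Because the token carries a fresh nonce and timestamp, the derived key differs for
every request even though the underlying password is fixed.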

3.6 Polyadic Signatures


An XML digital signature consists of a list of references to the elements being
signed, together with a signature value that binds hashes of these elements using
some signing secret. The signature value can be computed using several different

cryptographic algorithms; in our example protocol, we rely on hmacsha1 for user
signatures and on rsasha1 for service signatures.
The following predicates describe how to construct (mkSigVal) and check
(isSigVal) the signature value sv of an XML element si, signed using the algo-
rithm a with a key k. Each of these predicates is defined by a couple of clauses,
representing symmetric and asymmetric signature algorithms. When a predicate
is defined by multiple clauses, they are interpreted as a disjunction; that is, the
predicate is satisfied if any one of its clauses is satisfied.

predicate mkSigVal (sv:bytes,si:item,k:bytes,a:string) :−
a = "hmacsha1", sv = hmacsha1(k,c14n(si)).

predicate isSigVal (sv:bytes,si:item,k:bytes,a:string) :−
a = "hmacsha1", sv = hmacsha1(k,c14n(si)).

predicate mkSigVal (sv:bytes,si:item,k:bytes,a:string) :−
a = "rsasha1", sv = rsasha1(k, c14n(si)).

predicate isSigVal (sv:bytes,si:item,p:bytes,a:string) :−
a = "rsasha1", p = checkrsasha1(p,c14n(si),sv).

The first clause of mkSigVal takes an item si to be signed and a key k
for the symmetric signing algorithm hmacsha1, and generates the signature
value sv. The first clause of isSigVal reverses this computation, taking sv, si,
k, and a = "hmacsha1" as input and checking that sv is a valid signature value
of si under k. Since the algorithm is symmetric, the two clauses are identical. The
second clause of mkSigVal computes the signature value using the asymmetric
rsasha1 algorithm, and the corresponding clause of isSigVal checks this signa-
ture value. In contrast to the symmetric case, the two clauses rely on different
computations.
A complete XML signature for a SOAP message contains both the signature
value sv, as detailed above, and an explicit description of the message parts are
used to generate si. Each signed item is represented by a <Reference> element.
The predicate mkRef takes an item t and generates a <Reference> element r
by embedding a sha1 hash of t, with appropriate sort conversions. Conversely,
the predicate isRef checks that r is a <Reference> for t.

predicate mkRef(t:item,r:item) :−
r = <Reference>
<Other></> <Other></>
<DigestValue> base64(sha1(c14n(t))) </> </>.

predicate isRef(t:item,r:item) :−
r = <Reference>

<DigestValue> base64(sha1(c14n(t))) </> </>.



(The XML constructed by mkRef abstracts some of the detail that is included in
actual signatures, but that tends not to vary in practice; in particular, we include
<Other> elements instead of the standard <Transforms> and <DigestMethod>
elements. On the other hand, the <DigestValue> element is the part that de-
pends on the subject of the signature, and that is crucial for security, and we
model this element in detail.)
More generally, the predicate mkRefs(ts,rs) constructs a list rs of references from a
list ts of items, such that their members are pairwise related by mkRef. Similarly, the
predicate isRefs(ts,rs) checks that two given lists are pairwise related by mkRef.
We omit their definitions.
A <SignedInfo> element is constructed from <Reference> elements for every
signed element. A <Signature> element consists of a <SignedInfo> element si
and a <SignatureValue> element containing sv. Finally, the following predicates
define how signatures are constructed and checked.
predicate mkSigInfo (si:item,a:string,ts:item) :−
mkRefs(ts,rs),
rs = <list>@ refs</>,
si = <SignedInfo>
<Other></> <SignatureMethod Algorithm=a> </>
@ refs </>.

predicate isSigInfo (si:item,a:string,ts:item) :−
si = <SignedInfo>
<SignatureMethod Algorithm=a> </>
@ refs</>,
rs = <list>@ refs</>,
isRefs(ts,rs).

predicate mkSignature (sig:item,a:string,k:bytes,ts:item) :−
mkSigInfo(si,a,ts),
mkSigVal(sv,si,k,a),
sig = <Signature> si <SignatureValue> base64(sv) </> </>.

predicate isSignature (sig:item,a:string,k:bytes,ts:item) :−
sig = <Signature> si <SignatureValue> base64(sv) </>@ </>,
isSigInfo(si,a,ts),
isSigVal(sv,si,k,a).
The predicate mkSigInfo takes a list of items to be signed, embedded in a
<list> element ts, and generates a list of references refs for them, embedded in
a <list> element rs, which are then embedded into si. The predicate isSigInfo
checks si has been correctly constructed from ts.
The predicate mkSignature constructs si using mkSigInfo, generates the signa-
ture value sv using mkSigVal, and puts them together in a <Signature> element
called sig; isSignature checks that a signature sig has been correctly generated
from a, k, and ts.

4 Modelling SOAP Envelopes for our Protocol


Relying on the predicate definitions of Section 3, which reflect (parts of) the
SOAP and WS-Security specifications but do not depend on the protocol, we
now define custom “top-level” predicates to build and check Messages 1 and 2
of our example protocol.

4.1 Building and Checking Message 1


Our goal C1 is to reach agreement on the data
(U,S,id1,t1,b1)
where
U=<User><Username>u</><Password>pwd</></>
S=<Service><To>uri</><Action>ac</><Subject>subj</></>
after receiving and successfully checking Message 1. To achieve this, the message
includes a username token for U, encrypted with the public key of S (that is, one
whose certificate has the subject name subj), and also includes a signature token,
signing (elements containing) uri, ac, id1, t1, b1, and the encrypted username
token, signed with the key derived from the username token.
We begin with a predicate setting the structure of the first envelope:
predicate env1(msg1:item,uri:item,ac:item,id1:string,t1:string,
eutok:item,sig1:item,b1:item) :−
msg1 =
<Envelope>
<Header>
<To>uri</>
<Action>ac</>
<MessageId>id1</>
<Security>
<Timestamp><Created>t1</></>
eutok
sig1</></>
<Body>b1</></>.
On the client side, we use a predicate mkMsg1 to construct msg1 as an output,
given its other parameters as inputs:
predicate mkMsg1(msg1:item,U:item,S:item,kr:bytes,cert:bytes,
n:bytes,id1:string,t1:string,b1:item) :−
isService(S,uri,ac,subj),
isX509Cert(cert,kr,subj,"rsasha1",ek),
isUserTokenKey(utok,U,n,t1,sk),
mkEncryptedData(eutok,utok,ek),
mkSignature(sig1,"hmacsha1",sk,
<list>
<Body>b1</>
<To>uri</>
<Action>ac</>
<MessageId>id1</>
<Created>t1</>
eutok</>),
env1(msg1,uri,ac,id1,t1,eutok,sig1,b1).
On the server side, with server certificate cert, associated private key sx, and
expected user U, we use a predicate isMsg1 to check the input msg1 and produce
S, id1, t1, and b1 as outputs:
predicate isMsg1(msg1:item,U:item,sx:bytes,cert:bytes,S:item,
id1:string,t1:string,b1:item) :−
env1(msg1,uri,ac,id1,t1,eutok,sig1,b1),
isService(S,uri,ac,subj),
isEncryptedData(eutok,utok,sx),
isUserTokenKey(utok,U,n,t1,sk),
isSignature(sig1,"hmacsha1",sk,
<list>
<Body>b1</>
<To>uri</>
<Action>ac</>
<MessageId>id1</>
<Created>t1</>
eutok</>).

4.2 Building and Checking Message 2


Our goal C2 is to reach agreement on the data
(U,S,id1,t1,b1,id2,t2,b2)
where
U=<User><Username>u</><Password>pwd</></>
S=<Service><To>uri</><Action>ac</><Subject>subj</></>
after successful receipt of Message 2, having already agreed on
(U,S,id1,t1,b1)
after receipt of Message 1.
A simple implementation is to make sure that the client’s choice of id1 in
Message 1 is fresh and unpredictable, to include <RelatesTo>id1</> in Mes-
sage 2, and to embed this element in the signature to achieve correlation with
the data sent in Message 1. In more detail, Message 2 includes a certificate for S
(that is, one with subject name subj) and a signature token, signing (elements
containing) id1, id2, t2, and b2 and signed using the private key associated with
S’s certificate. The structure of the second envelope is defined as follows:

predicate env2(msg2:item,uri:item,id1:string,id2:string,
t2:string,cert:bytes,sig2:item,b2:item) :−
msg2 =
<Envelope>
<Header>
<From>uri</>
<RelatesTo>id1</>
<MessageId>id2</>
<Security>
<Timestamp><Created>t2</></>
<BinarySecurityToken>base64(cert)</>
sig2</></>
<Body>b2</></>.

A server uses the predicate mkMsg2 to construct msg2 as an output, given
its other parameters as inputs (including the signing key):

predicate mkMsg2(msg2:item,sx:bytes,cert:bytes,S:item,
id1:string,id2:string,t2:string,b2:item):−
isService(S,uri,ac,subj),
mkSignature(sig2,"rsasha1",sx,
<list>
<Body>b2</>
<RelatesTo>id1</>
<MessageId>id2</>
<Created>t2</></>),
env2(msg2,uri,id1,id2,t2,cert,sig2,b2).

A client, given the CA’s public key kr, and awaiting a response from S to a
message with unique identifier id1, uses the predicate isMsg2 to check its input
msg2, and produce data id2, t2, and b2 as outputs.

predicate isMsg2(msg2:item,S:item,kr:bytes,
id1:string,id2:string,t2:string,b2:item) :−
env2(msg2,uri,id1,id2,t2,cert,sig2,b2),
isService(S,uri,ac,subj),
isX509Cert(cert,kr,subj,"rsasha1",k),
isSignature(sig2,"rsasha1",k,
<list>
<Body>b2</>
<RelatesTo>id1</>
<MessageId>id2</>
<Created>t2</></>).

5 Processes and Assertions in TulaFale


A TulaFale script defines a system to be a collection of concurrent processes
that may compute internally, using terms and predicates, and may also commu-
nicate by exchanging terms on named channels. The top-level process defined
by a TulaFale script represents the behaviour of the principals making up the
system—some clients and servers in our example. The attacker is modelled as an
arbitrary process running alongside the system defined by the script, interacting
with it via the public channels. The style of modelling cryptographic protocols,
with an explicit given process representing the system and an implicit arbitrary
process representing the attacker, originates with the spi calculus [AG99]. We
refer to the principals coded explicitly as processes in the script as being compli-
ant, in the sense they are constrained to follow the protocol being modelled, as
opposed to the non-compliant principals represented implicitly by the attacker
process, who are not so constrained.
A TulaFale script consists of a sequence of declarations. We have seen already
in Sections 3 and 4 many examples of Prolog-style declarations of clauses defining
named predicates. This section describes three further kinds of declaration—
for channels, correspondence assertions, and processes. Section 6 illustrates their
usage in a script that models the system of Section 2.
We describe TulaFale syntax in terms of several metavariables or nonter-
minals: sort, term, and form range over the sorts, algebraic terms, and logical
formulas, respectively, as introduced in Section 3; and ide ranges over alphanu-
meric identifiers, used to name variables, predicates, channels, processes, and
correspondence assertions.
A declaration channel ide(sort1 , . . ., sortn ) introduces a channel, named ide,
for exchanging n-tuples of terms with sorts sort1 , . . ., sortn . As in the asyn-
chronous pi calculus, channels are named, unordered queues of messages. By
default, each channel is public, that is, the attacker may input or output mes-
sages on the channel. The declaration may be preceded by the private keyword
to prevent the attacker accessing the channel. Typically, SOAP channels are
public, but channels used to represent shared secrets, such as passwords, are
private.
In TulaFale, as in some forms of the spi calculus, we embed correspondence
assertions in our process language in order to state certain security properties
enjoyed by compliant principals.
A declaration correspondence ide(sort1 , . . ., sortn ) introduces a label, ide,
for events represented by n-tuples of terms with sorts sort1 , . . ., sortn . Each
event is either a begin-event or an end-event; typically, a begin-event records an
initiation of a session, and an end-event records the satisfactory completion of a
session, from the compliant principals’ viewpoint. The process language includes
constructs for logging begin- and end-events. The attacker cannot observe or
generate events. We use correspondences to formalize the properties (C1) and
(C2) of Section 2. The declaration of a correspondence on ide specifies a security
assertion: that in any run of the explicit system composed with an arbitrary
implicit attacker, every end-event labelled ide logged by the system corresponds
to a previous begin-event logged by the system, also labelled ide, and with the
same tuple of data. We name this property robust safety, following Gordon and
Jeffrey [GJ03]. The implication of robust safety is that two compliant processes
have reached agreement on the data, which typically include the contents of a
sequence of one or more messages.
A declaration process ide(ide1 :sort1 , . . ., iden :sortn ) = proc defines a para-
metric process, with body the process proc, named ide, whose parameters ide1 ,
. . ., iden have sorts sort1 , . . ., sortn , respectively.
Next, we describe the various kinds of TulaFale process.
– A process out ide(tm1 , . . ., tmn ); proc sends the tuple (tm1 , . . ., tmn ) on the
ide channel, then runs proc.
– A process in ide(ide1 , . . ., iden ); proc blocks awaiting a tuple (tm1 , . . ., tmn )
on the ide channel; if one arrives, the process behaves as proc, with its pa-
rameters ide1 , . . ., iden bound to tm1 , . . ., tmn , respectively.
– A process new ide:bytes; proc binds the variable ide to a fresh byte array, to
model cryptographic key or nonce generation, for instance, then runs proc.
Similarly, a process new ide:string; proc binds the variable ide to a fresh
string, to model password generation, for instance, then runs as proc.
– A process proc1 |proc2 is a parallel composition of subprocesses proc1 and
proc2 ; they run in parallel, and may communicate on shared channels.
– A process !proc is a parallel composition of an unbounded array of replicas
of the process proc.
– The process 0 does nothing.
– A process let ide = tm; proc binds the term tm to the variable ide, then runs
proc.
– A process filter form → ide1 , . . ., iden ; proc binds terms tm1 , . . ., tmn to the
variables ide1 , . . ., iden such that the formula form holds, then runs proc.
(The terms tm1 , . . ., tmn are computed by pattern-matching, as described
in a previous paper [BFG04a].)
– A process ide(tm1 , . . ., tmn ), where ide corresponds to a declaration process
ide(ide1 :sort1 , . . ., iden :sortn ) = proc binds the terms tm1 , . . ., tmn to the
variables ide1 , . . ., iden , then runs proc.
– A process begin ide(tm1 , . . ., tmn ); proc logs a begin-event labelled with ide
and the tuple (tm1 , . . ., tmn ), then runs proc.
– A process end ide(tm1 , . . ., tmn ); proc logs an end-event labelled with ide
and the tuple (tm1 , . . ., tmn ), then runs proc.
– Finally, the process done logs a done-event. (We typically mark the successful
completion of the whole protocol with done. Checking for the reachability
of the done-event is then a basic check of the functionality of the protocol,
that it may run to completion.)
The main goal of the TulaFale tool is to prove or refute robust safety for all the
correspondences declared in a script. Robust safety may be proved by a range of
techniques; the first paper on TulaFale [BFG04a] uses manually developed proofs
of behavioural equivalence. Instead, our TulaFale tool translates scripts into the
applied pi calculus, and then runs Blanchet’s resolution-based protocol verifier;
the translation is essentially the same as originally described [BFG04a]. ProVerif
(hence TulaFale) can also check secrecy assertions, but we omit the details here.
In addition, TulaFale includes a sort-checker and a simulator, both of which help
catch basic errors during the development of scripts. For example, partly relying
on the translation, TulaFale can show the reachability of done processes, which
is useful for verifying that protocols may actually run to completion.

6 Modelling and Verifying Our Protocol

Relying on the predicates given in Section 4, we now define the processes mod-
elling our sample system.

6.1 Modelling the System with TulaFale Processes


In our example, a public channel publish gives the attacker access to the certifi-
cates and public keys of the CA and two servers, named "BobsPetShop" and
"ChasMarket". Another channel soap is for exchanging all SOAP messages; it is
public to allow the attacker to read and write any SOAP message.
channel publish(item).
channel soap(item).
The following is the top-level process, representing the behaviour of all com-
pliant principals.
new sr:bytes; let kr = pk(sr);
new sx1:bytes; let cert1 = x509(sr,"BobsPetShop","rsasha1",pk(sx1));
new sx2:bytes; let cert2 = x509(sr,"ChasMarket","rsasha1",pk(sx2));
out publish(base64(kr));
out publish(base64(cert1));
out publish(base64(cert2));
( !MkUser(kr) |!MkService(sx1,cert1) |!MkService(sx2,cert2) |
(!in anyUser(U); Client(kr,U)) |
(!in anyService(sx,cert,S); Server(sx,cert,S)) )
The process begins by generating the private and public keys, sr and kr, of
the CA. It generates the private keys, sx1 and sx2, of the two servers, plus their
certificates, cert1 and cert2. Then it outputs the public data kr, cert1, cert2 to the
attacker. After this initialization, the system behaves as the following parallel
composition of five processes.
!MkUser(kr) |!MkService(sx1,cert1) |!MkService(sx2,cert2) |
(!in anyUser(U); Client(kr,U)) |
(!in anyService(sx,cert,S); Server(sx,cert,S))
As explained earlier, the ! symbol represents unbounded replication; each of
these processes may execute arbitrarily many times. The first process allows the
opponent to generate fresh username/password combinations that are shared
between all clients and servers. The second and third allow the opponent to
generate fresh services with subject names "BobsPetShop" and "ChasMarket",
respectively. The fourth acts as an arbitrary client U, and the fifth acts as an
arbitrary service S.
The process MkUser inputs the name of a user from the environment on
channel genUser, then creates a new password and records the username/pass-
word combination U as a message on private channel anyUser, representing the
database of authorized users.
channel genUser(string).
private channel anyUser(item).
process MkUser(kr:bytes) =
in genUser(u);
new pwd:string;
let U = <User><Username>u</><Password>pwd</></>;
!out anyUser (U).
The process MkService allows the attacker to create services with the subject
name of the certificate cert.
predicate isServiceData(S:item,sx:bytes,cert:bytes) :−
isService(S,uri,ac,x509user(cert)),
pk(sx) = x509key(cert).

channel genService(item).
private channel anyService(bytes,bytes,item).
process MkService(sx:bytes,cert:bytes) =
in genService(S);
filter isServiceData(S,sx,cert) → ;
!out anyService(sx,cert,S).

Finally, we present the processes representing clients and servers. Our desired
authentication and correlation properties are correspondence assertions embed-
ded within these processes. We declare the sorts of data to be agreed by the
clients and servers as follows.
correspondence C1(item,item,string,string,item).
correspondence C2(item,item,string,string,item,string,string,item).
The process Client(kr:bytes,U:item) acts as a client for the user U, assuming
that kr is the CA’s public key.
channel init(item,bytes,bytes,string,item).
process Client(kr:bytes,U:item) =
in init (S,certA,n,t1,b1);
new id1:string;
begin C1 (U,S,id1,t1,b1);
filter mkMsg1(msg1,U,S,kr,certA,n,id1,t1,b1) → msg1;
out soap(msg1);
in soap(msg2);
filter isMsg2(msg2,S,kr,id1,id2,t2,b2) → id2,t2,b2;
end C2 (U,S,id1,t1,b1,id2,t2,b2);
done.
The process receives the data to be sent, other than the message identifier, off the
public init channel, thereby allowing the attacker to control it, and generates a
globally fresh, unpredictable identifier id1 for Message 1, to allow correlation of
Message 2 as discussed above. Next, the TulaFale filter operator evaluates the predicate mkMsg1
to bind the variable msg1 to Message 1. At this point, the client marks its intention
to communicate data to the server by logging a begin-event, labelled C1, and
then outputs the message msg1. The process then awaits a reply, msg2, checks
the reply with the predicate isMsg2, and if this check succeeds, ends the C2 cor-
respondence. Finally, to mark the end of a run of the protocol, it becomes the
done process—an inactive process, that does nothing, but whose reachability
can be checked, as a basic test of the protocol description.
The process Server(sx:bytes,cert:bytes,S:item) represents a service S, with
private key sx, and certificate cert.
channel accept(string,string,item).
process Server(sx:bytes,cert:bytes,S:item) =
in soap(msg1);
in anyUser(U);
filter isMsg1(msg1,U,sx,cert,S,id1,t1,b1) → id1,t1,b1;
end C1 (U,S,id1,t1,b1);

in accept (id2,t2,b2);
filter mkMsg2(msg2,sx,cert,S,id1,id2,t2,b2) → msg2;
begin C2 (U,S,id1,t1,b1,id2,t2,b2);
out soap(msg2).
The process begins by selecting a SOAP message msg1 and a user U off the
public soap and the private anyUser channels, respectively. It filters this data
with the isMsg1 predicate, which checks that msg1 is from U, and binds the
variables id1, t1, and b1. At this point, it asserts an end-event, labelled C1, to
signify apparent agreement on this data with a client. Next, the process inputs
data from the opponent on channel accept, to determine the response message.
The server process completes by using the predicate mkMsg2 to construct the
response msg2, asserting a begin-event for the C2 correspondence, and finally
sending the message.

6.2 Analysis
The TulaFale script for this example protocol consists of 200 lines specific to
the protocol, and 200 lines of library predicates. (We have embedded essentially
the whole script in this paper.) Given the applied pi calculus translation of this
script, ProVerif shows that our two correspondences C1 and C2 are robustly
safe. Failure of robust safety for C1 or C2 would reveal that the server or the
client has failed to authenticate Message 1, or to authenticate Message 2 and
correlate it with Message 1, respectively. Processing is swift enough—around 25s
on a 2.4GHz 1GB P4—to support easy experimentation with variations in the
protocol, specification, and threat model.

7 Conclusions
TulaFale is a high-level language based on XML with symbolic cryptography,
clausally-defined predicates, pi calculus processes, and correspondence asser-
tions. Previous work [BFG04a] introduces a preliminary version of TulaFale,
defines its semantics via translation into the applied pi calculus [AF01], illus-
trates TulaFale via several single-message protocols, and describes hand-crafted
correctness proofs.
The original reasons for choosing to model WS-Security protocols using the
pi calculus, rather than some other formal method, include the generality
of the threat model (the attacker is an unbounded, arbitrary process), the ease
of simulating executable specifications written in the pi calculus, and the exis-
tence of a sophisticated range of techniques for reasoning about cryptographic
protocols.
Blanchet’s ProVerif [Bla01, Bla02] turns out to be a further reason for using
the pi calculus to study SOAP security. Our TulaFale tool directly implements
the translation into the applied pi calculus and then invokes Blanchet’s verifier,
to obtain fully automatic checking of SOAP security protocols. This checking
shows that no attacker expressible as a formal process can violate particular SOAP-
level authentication or secrecy properties. Hence, we expect TulaFale will be
useful for describing and checking security aspects of web services specifica-
tions. We have several times been surprised by vulnerabilities discovered by the
TulaFale tool and the underlying verifier. Of course, every validation method,
formal or informal, abstracts some details of the underlying implementation, so
checking with TulaFale only partially rules out attacks on actual implementa-
tions. Still, ongoing work is exploring how to detect vulnerabilities in web ser-
vices deployments, by extracting TulaFale scripts from XML-based configuration
data [BFG04b].
The request/response protocol presented here is comparable to the abstract
RPC protocols proposed in earlier work on securing web services [GP03], but
here we accurately model the SOAP and WS-Security syntax used on the wire.
Compared with the SOAP-based protocols in the article introducing the TulaFale
semantics [BFG04a], the novelties are the need for the client to correlate the
request with the reply, and the use of encryption to protect weak user passwords
from dictionary attacks. In future work, we intend to analyse more complex
SOAP protocols, such as WS-SecureConversation [KN+ 04], for securing client-
server sessions.
Although some other process calculi manipulate XML [BS03, GM03], they are
not intended for security applications. We are beginning to see formal methods
applied to web services specifications, such as the TLA+ specification [JLLV04]
of the Web Services Atomic Transaction protocol, checked with the TLC model
checker. Still, we are aware of no other security tool for web services able to
analyze protocol descriptions for vulnerabilities to XML rewriting attacks.

Acknowledgement. We thank Bruno Blanchet for making ProVerif available,


and for implementing extensions to support some features of TulaFale. Vittorio
Bertocci, Ricardo Corin, Amit Midha, and the anonymous reviewers made useful
comments on a draft of this paper.

References
[AF01] M. Abadi and C. Fournet. Mobile values, new names, and secure com-
munication. In 28th ACM Symposium on Principles of Programming
Languages (POPL’01), pages 104–115, 2001.
[AG99] M. Abadi and A. D. Gordon. A calculus for cryptographic protocols:
The spi calculus. Information and Computation, 148:1–70, 1999.
[BEK+ 00] D. Box, D. Ehnebuske, G. Kakivaya, A. Layman, N. Mendelsohn,
H. Nielsen, S. Thatte, and D. Winer. Simple Object Access Proto-
col (SOAP) 1.1, 2000. W3C Note, at http://www.w3.org/TR/2000/
NOTE-SOAP-20000508/.
[BFG04a] K. Bhargavan, C. Fournet, and A. D. Gordon. A semantics for web
services authentication. In 31st ACM Symposium on Principles of Pro-
gramming Languages (POPL’04), pages 198–209, 2004.
[BFG04b] K. Bhargavan, C. Fournet, and A. D. Gordon. Verifying policy-based
security for web services. Submitted for publication, 2004.
[Bla01] B. Blanchet. An Efficient Cryptographic Protocol Verifier Based on
Prolog Rules. In 14th IEEE Computer Security Foundations Workshop
(CSFW-14), pages 82–96. IEEE Computer Society, 2001.
[Bla02] B. Blanchet. From Secrecy to Authenticity in Security Protocols. In
9th International Static Analysis Symposium (SAS’02), volume 2477 of
Lecture Notes in Computer Science, pages 342–359. Springer, 2002.
[Box01] D. Box. A brief history of SOAP. At http://webservices.xml.com/
pub/a/ws/2001/04/04/soap.html, 2001.
[BS03] G. Bierman and P. Sewell. Iota: a concurrent XML scripting language
with application to Home Area Networks. Technical Report 557, Uni-
versity of Cambridge Computer Laboratory, 2003.
[DHK+ 04] M. Davis, B. Hartman, C. Kaler, A. Nadalin, and J. Schwarz.
WS–I Security Scenarios, February 2004. Working Group Draft
Version 0.15, at http://www.ws-i.org/Profiles/BasicSecurity/
2004-02/SecurityScenarios-0.15-WGD.pdf.
[DY83] D. Dolev and A.C. Yao. On the security of public key protocols. IEEE
Transactions on Information Theory, IT–29(2):198–208, 1983.
[GJ03] A. D. Gordon and A. Jeffrey. Authenticity by typing for security pro-
tocols. Journal of Computer Security, 11(4):451–521, 2003.
[GM03] P. Gardner and S. Maffeis. Modeling dynamic web data. In DBPL’03,
LNCS. Springer, 2003.
[GP03] A. D. Gordon and R. Pucella. Validating a web service security abstraction
by typing. In ACM Workshop on XML Security 2002, pages 18–29,
2003.
[HL03] M. Howard and D. LeBlanc. Writing secure code. Microsoft Press,
second edition, 2003.
[IM02] IBM Corporation and Microsoft Corporation. Security in a
web services world: A proposed architecture and roadmap. At
http://msdn.microsoft.com/library/en-us/dnwssecur/html/
securitywhitepaper.asp, April 2002.
[JLLV04] J. E. Johnson, D. E. Langworthy, L. Lamport, and F. H. Vogt. Formal
specification of a web services protocol. In 1st International Workshop
on Web Services and Formal Methods (WS-FM 2004), 2004. University
of Pisa.
[KN+ 04] C. Kaler, A. Nadalin, et al. Web Services Secure Conversation Lan-
guage (WS-SecureConversation), May 2004. Version 1.1. At http:
//msdn.microsoft.com/ws/2004/04/ws-secure-conversation/.
[Low97] G. Lowe. A hierarchy of authentication specifications. In Proceedings
of 10th IEEE Computer Security Foundations Workshop, 1997, pages
31–44. IEEE Computer Society Press, 1997.
[Mic02] Microsoft Corporation. Web Services Enhancements for Microsoft
.NET, December 2002. At http://msdn.microsoft.com/webservices/
building/wse/default.aspx.
[Mil99] R. Milner. Communicating and Mobile Systems: the π-Calculus. Cam-
bridge University Press, 1999.
[NKHBM04] A. Nadalin, C. Kaler, P. Hallam-Baker, and R. Monzillo. OASIS Web
Services Security: SOAP Message Security 1.0 (WS-Security 2004),
March 2004. At http://www.oasis-open.org/committees/download.
php/5941/oasis-200401-wss-soap-message-security-1.0.pdf.
[NS78] R.M. Needham and M.D. Schroeder. Using encryption for authenti-
cation in large networks of computers. Communications of the ACM,
21(12):993–999, 1978.
[S+ 04] J. Schlimmer et al. A Proposal for UPnP 2.0 Device Architecture, May
2004. At http://msdn.microsoft.com/library/en-us/dnglobspec/
html/devprof.asp.
[SMWC03] J. Shewchuk, S. Millet, H. Wilson, and D. Cole. Expand-
ing the communications capabilities of web services with WS-
Addressing. At http://msdn.microsoft.com/library/default.asp?
url=/library/en-us/dnwse/html/soapmail.asp, 2003.
[SS02] J. Scambray and M. Shema. Hacking Web Applications Exposed.
McGraw-Hill/Osborne, 2002.
[Vog03] W. Vogels. Web services are not distributed objects. IEEE Internet
Computing, 7(6):59–66, 2003.
[W3C03] W3C. SOAP Version 1.2, 2003. W3C Recommendation, at http://
www.w3.org/TR/soap12.
[Win98] D. Winer. RPC over HTTP via XML. At http://davenet.scripting.
com/1998/02/27/rpcOverHttpViaXml, 1998.
[Win99] D. Winer. Dave’s history of SOAP. At http://www.xmlrpc.com/
discuss/msgReader$555, 1999.
[WL93] T.Y.C. Woo and S.S. Lam. A semantic model for authentication pro-
tocols. In IEEE Computer Society Symposium on Research in Security
and Privacy, pages 178–194, 1993.
A Checker for Modal Formulae
for Processes with Data

Jan Friso Groote and Tim A.C. Willemse


Department of Mathematics and Computer Science,
Eindhoven University of Technology, P.O. Box 513,
5600 MB Eindhoven, The Netherlands
{J.F.Groote, T.A.C.Willemse}@tue.nl

Abstract. We present a new technique for the automatic verification of


first order modal μ-calculus formulae on infinite state, data-dependent
processes. The use of boolean equation systems for solving the model-
checking problem in the finite case is well-studied. We extend this
technique to infinite state and data-dependent processes. We describe a
transformation of the model checking problem to the problem of solving
equation systems, and present a semi-decision procedure to solve these
equation systems and discuss the capabilities of a prototype implement-
ing our procedure. This prototype has been successfully applied to many
systems. We report on its functioning for the Bakery Protocol.
Keywords: Model Checking, μCRL, First Order Modal μ-Calculus, First Order Boolean Equation Systems, Data-Dependent Systems, Infinite State Systems

1 Introduction
Model checking has come about as one of the major advances in automated ver-
ification of systems in the last decade. It has earned its medals in many applica-
tion areas (e.g. communications protocols, timed systems and hybrid systems),
originating from both academic and industrial environments.
However, the class of systems to which model checking techniques are ap-
plicable, is restricted to systems in which dependencies on infinite datatypes
are absent, or can be abstracted from. The models for such systems therefore
do not always represent these systems best. In particular, for some systems the
most vital properties are sensitive to data. There, the model checking technique
breaks down. This clearly calls for an extension of model checking techniques for
systems that are data-dependent.
In this paper, we explore a possibility for extending model checking techniques
to deal with processes which can depend on data. We describe a procedure, for
which we have also implemented a prototype, that verifies a given property
on a given data-dependent process. The problem in general is easily shown to
be undecidable, so, while we can guarantee soundness of our procedure, we
cannot guarantee its termination. However, as several examples suggest, many
interesting systems with infinite state spaces can be verified using our procedure.


Naturally, our technique also applies to systems with finite (but extremely large)
state-spaces.
The framework we use for describing the behaviour of a system is process
algebraic. We use the process algebraic language μCRL [10, 11], which is an ex-
tension of ACP [2]; this language includes a formal treatment of data, as well as
an operational and axiomatic semantics of process terms. Compared to CCS or
ACP, the language μCRL is more expressive [17]. For our model checking pro-
cedure, we assume that the processes are written in a special format, the Linear
Process Equation (LPE) format, which is discussed in e.g. [23]. Note that this
does not pose a restriction on the set of processes that can be modelled using
μCRL, as all sensible process descriptions can be transformed to this format [23].
When dealing with datatypes, an explicit representation of the entire state space
is often not possible, since it can very well be infinite. Using the LPE format has
the advantage of working with a finite representation of the (possibly infinite)
state space.
The language we use to denote our properties in is an extension of the modal
μ-calculus [16]. In particular, we allow first order logic predicates and param-
eterised fixpoint variables in our properties. These extensions, which are also
described in e.g. [9], are needed to express properties about data.
The approach we follow is very much inspired by the work of e.g. Mader [18],
and uses (in our case, first order) boolean equation systems as an intermediate
formalism. We present a translation of first order modal μ-calculus expressions
to first order boolean equation systems in the presence of a fixed Linear Process
Equation. The procedure for solving the first order boolean equation systems is
based on the Gauß elimination algorithm described in, e.g. [18].
This paper is structured as follows: Section 2 briefly introduces the language
μCRL and the Linear Process Equations format that is used in all subsequent
sections. In Section 3, we describe the first order modal μ-calculus. Section 4
discusses first order boolean equation systems and describes the translation of
first order modal μ-calculus formulae, given a Linear Process Equation, to a
sequence of first order boolean equations. A procedure for solving the first order
boolean equations is then described in Section 5; its implementation is discussed
in Section 6, and a sample verification is described in Section 7. Section 8 is
reserved for concluding remarks.

2 Preliminaries
Our main focus in this paper is on processes with data. As a framework, we use
the process algebra μCRL [10]. Its basic constructs are along the lines of ACP [2]
and CCS [20], though its syntax is influenced mainly by ACP. In the process
algebra μCRL, data is an integral part of the language. For the exhibition of
the remainder of the theory, we assume we work in the context of a data alge-
bra without explicitly mentioning its constituent components. As a convention,
we assume the data algebra contains all the required data types; in particular,

we always have the domain B of booleans with functions ⊤:→B and ⊥:→B,
representing true and false, at our disposal.
The language μCRL has only a small number of carefully chosen operators
and primitives. Processes are the main objects in the language. A set of param-
eterised actions Act is assumed. Actions represent atomic events; for an action
a∈Act, taking a number of data arguments d, a(d) is a process. The process
representing no behaviour, i.e. the process that cannot perform any actions is
denoted δ. This constant is often referred to as deadlock or inaction. All actions
a terminate successfully immediately after execution; δ cannot be executed.
Processes are constructed using several operators. The main operators are
alternative composition (p + q for some processes p and q) and sequential com-
position (p · q for some processes p and q). Conditional behaviour is denoted
using a ternary operator (we write p  b  q when we mean process p if b holds
and else process q). The process [b]:→p serves as a shorthand for the process
p  b  δ, which represents the process p under the premise that b holds. Recursive
behaviour is specified using equations. Process variables are typed; they should
be considered as functions from a data domain to processes.
Example 1. The behaviour, denoted by process X(n) is the increasing and de-
creasing of an internal counter n or showing its current value.

X(n:N) = up · X(n+1) + show(n) · X(n) + [n > 0]:→down · X(n−1)

For the formal exposition, it can be more convenient to assume that actions
and processes have a single parameter. This assumption is easily justified, as we
can assume the existence of an appropriate data domain, together with adequate
pairing and projection functions.
A more complex notion of process composition consists of the parallel com-
position of processes (we write p ‖ q to denote the process p parallel to the process
q). Synchronisation is achieved using a separate partial communication function
γ, prescribing the result of a communication of two actions (e.g. γ(a, b) = c de-
notes the communication between actions a and b, resulting in action c). Two
parameterised actions a(n) and b(n′) can communicate to action c(n″) only if
the communication between actions a and b results in action c (i.e. γ(a, b) = c)
and n = n′ = n″.
The communication function is used to specify when communication is pos-
sible; this, however, does not mean communication is enforced. To this end, we
must encapsulate the individual occurrences of the actions that participate in
the communication. This is achieved using the encapsulation operator (we write
∂H (p) to specify that all actions in the set of actions H are to be encapsulated
in process p).
The last operator considered here is data-dependent alternative quantification
(we write Σd:D p to denote the alternatives of process p, dependent on some
arbitrary datum d selected from the (possibly infinite) data domain D). The
Σ-operator is best compared to e.g. input prefixing, but is more expressive (see
e.g. [17]).

Example 2. The behaviour, denoted by process V (n) is the setting of an internal


variable to an arbitrary value, which can be read at will.

V (n:N) = read(n) · V (n) + Σn′:N set(n′) · V (n′)

For the purpose of verification or analysis, it is often most convenient to elim-


inate parallelism in favour of sequential composition and (quantified) alternative
composition. A behaviour of a process can then be denoted as a state-vector of
typed variables, accompanied by a set of condition-action-effect rules. Processes
denoted in this fashion are called Linear Process Equations.

Definition 1. A Linear Process Equation (LPE) is a parameterised equation


taking the form
 
X(d:D) = Σi∈I Σei:Di [ci(d, ei)] :→ ai(fi(d, ei)) · X(gi(d, ei))

where I is a finite index set; D and Di are data domains; d and ei are data
variables; ai ∈Act are actions with parameters of sort Dai ; fi :D × Di → Dai ,
gi :D × Di → D and ci :D × Di → B are functions. The function fi yields, on the
basis of the current state d and the bound variable ei , the parameter for action
ai ; the “next-state” is encoded in the function gi . The function ci describes when
action ai can be executed.

In this paper, we restrict ourselves to the use of non-terminating processes,


i.e. we do not consider processes that, apart from executing an infinite number
of actions, also have the possibility to perform a finite number of actions and
then terminate successfully. Including termination into our theory does not pose
any theoretical challenges, but is omitted in our exposition for brevity.
Several techniques and tools exist to translate a guarded μCRL process to
linear form (see e.g. [3, 23]) and to further analyse these processes using symbolic
manipulation. In the remainder of this paper, we use the LPE-notation as a
vehicle for our exposition of the theory and practice.
The operational semantics for μCRL can be found in e.g. [10, 11]. Since we
restrict our discussions to process expressions in LPE-form, we here only provide
a definition of the labelled transition system that is induced by a process in LPE-
form.
Definition 2. The labelled transition system of a Linear Process Equation X
as defined in Def. 1 is a quadruple M = (S, Σ, →, s0 ), where
– S = {X(d) | d∈D} is the (possibly infinite) set of states;
– Σ = {ai (dai ) | i∈I ∧ ai ∈Act ∧ dai ∈Dai } is the (possibly infinite) set of labels;
– → = {(X(d), ai(dai), X(d′)) | i∈I ∧ ai∈Act ∧ ∃ei∈Di. ci(d, ei) ∧ dai = fi(d, ei) ∧
d′ = gi(d, ei)} is the transition relation. We write X(d) −a(e)→ X(d′) rather
than (X(d), a(e), X(d′)) ∈ →;
– s0 = X(d0 )∈S, for a given d0 ∈D, is the initial state.
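To illustrate Definitions 1 and 2 on a concrete process, the following Python fragment is a small sketch of our own (it is not part of the μCRL tool-suite) that encodes the counter of Example 1 as an LPE and enumerates the outgoing transitions of a state; representing the parameterless actions up and down by None is an assumption made only for this illustration.

# Summand i of the LPE is encoded by its condition c_i, action name a_i,
# action-parameter function f_i and next-state function g_i; the bound
# variable e_i plays no role for this process and is omitted.
summands = [
    # (condition,      action,  f_i (parameter),  g_i (next state))
    (lambda n: True,   "up",    lambda n: None,   lambda n: n + 1),
    (lambda n: True,   "show",  lambda n: n,      lambda n: n),
    (lambda n: n > 0,  "down",  lambda n: None,   lambda n: n - 1),
]

def transitions(n):
    # outgoing transitions of state X(n): (label, action parameter, next state)
    return [(a, f(n), g(n)) for (c, a, f, g) in summands if c(n)]

print(transitions(0))  # [('up', None, 1), ('show', 0, 0)]
print(transitions(2))  # [('up', None, 3), ('show', 2, 2), ('down', None, 1)]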

3 First Order Modal μ-Calculus


The logic we consider is based on the modal μ-calculus [16], extended with
data variables, quantifiers and parameterisation (see [9]). This logic allows us to
express data dependent properties. We refer to this logic as the first order modal
μ-calculus. Its syntax and semantics are defined below.
Definition 3. An action formula is defined by the following grammar

α ::= a(e) | ⊤ | ¬α1 | α1 ∧ α2 | ∀d:D.α1

Here, a is a parameterised action of the set Act and e, of datatype D, is some data
expression, possibly containing data variables d of a set D. We use the standard
abbreviations ⊥ ≝ ¬⊤, (α1 ∨ α2) ≝ ¬(¬α1 ∧ ¬α2) and (∃d:D.α1) ≝ ¬∀d:D.¬α1.
The action formulae are interpreted over a labelled transition system M ,
which is induced by an LPE (see Def. 2).
Remark 1. We use environments for registering the (current) values of variables
in formulae. Hence, an environment is a (partial) mapping of a set of variables
to elements of a given type. We use the following notational convention: for an
arbitrary environment θ, a variable x and a value v, we write θ[v/x] for the
environment θ , defined as θ (x ) = θ(x ) for all variables x different from x and
θ (x) = v. In effect, θ[v/x] stands for the environment θ where the variable x
has value v. The interpretation of a variable x in an environment θ is written as
θ(x).

Definition 4. The interpretation of an action formula α in the context of a


data environment ε:D→D, denoted by [[α]]ε, is defined inductively as:

[[⊤]]ε = Σ
[[a(e)]]ε = {a(ε(e))}
[[¬α]]ε = Σ \ [[α]]ε
[[α1 ∧ α2]]ε = [[α1]]ε ∩ [[α2]]ε
[[∀d:D.α]]ε = ⋂v∈D [[α]]ε[v/d]

Hence, we can use ⊤ to denote an arbitrary (parameterised) action. This is


useful for expressing e.g. progress conditions. We subsequently define the set of
state formulae.
Definition 5. A State Formula is given by the following grammar.

ϕ ::= b | Z(e) | ¬ϕ | ϕ1 ∧ ϕ2 | [α]ϕ1 | ∀d:D.ϕ | (νZ(d:D).ϕ)(e)

where b is an expression of the domain B, possibly containing data variables d


of the set D; e, of datatype D, is some data expression possibly containing data
variables d of the set D, α is an action formula; Z is a propositional variable from
a set P, and (νZ(d:D).ϕ)(e) is subject to the restriction that any free occurrence
of Z in ϕ must be within the scope of an even number of negation symbols.

We use the abbreviations as usual: (ϕ1 ∨ ϕ2) ≝ ¬(¬ϕ1 ∧ ¬ϕ2), ⟨α⟩ϕ ≝ ¬[α]¬ϕ,
(∃d:D.ϕ) ≝ ¬∀d:D.¬ϕ and (μZ(d:D).ϕ)(e) ≝ (¬νZ(d:D).¬ϕ[¬Z/Z])(e). We
write σ for an arbitrary fixpoint, i.e. σ ∈ {ν, μ}.
We only consider state formulae in which all variables are bound exactly
once by a fixpoint operator or quantifier. State formulae are interpreted over a
labelled transition system M , induced by an LPE, according to Def. 2.
Definition 6. The interpretation of a state formula ϕ in the context of data
environment ε:D→D and propositional environment ρ:P→(D→2S ), denoted by
[[ϕ]]ρε, is defined inductively as:

[[b]]ρε = S if [[b]]ε, and ∅ otherwise
[[Z(e)]]ρε = ρ(Z)(ε(e))
[[¬ϕ]]ρε = S \ [[ϕ]]ρε
[[ϕ1 ∧ ϕ2]]ρε = [[ϕ1]]ρε ∩ [[ϕ2]]ρε
[[[α]ϕ]]ρε = {X(v)∈S | ∀v′∈D ∀a∈Act ∀va∈Da.
(X(v) −a(va)→ X(v′) ∧ a(va)∈[[α]]ε) ⇒ X(v′)∈[[ϕ]]ρε}
[[∀d:D.ϕ]]ρε = ⋂v′∈D [[ϕ]]ρ(ε[v′/d])
[[(νZ(d:D).ϕ)(e)]]ρε = (νΦρε)(ε(e))

where we have Φρε ≝ λF:D → 2S.λv:D.[[ϕ]](ρ[F/Z])(ε[v/d]). For states X(d) ∈ S
of the transition system induced by an LPE X, and a formula ϕ, we write d |=X ϕ
for X(d) ∈ [[ϕ]]ρε.

Remark 2. In the remainder of this paper, we restrict ourselves to state formulae


given in Positive Normal Form (PNF). This means that negation only occurs on
the level of atomic propositions and, in addition, all bound variables are distinct.

We denote the set of functions D→2S by [D→2S]. Define the ordering ⊆̇ on
the set [D→2S] as X ⊆̇ Y iff for all d:D we have X(d) ⊆ Y(d). Then, the fixpoint
operators are monotonic over the complete lattice ([D→2S], ⊆̇) (see [14, 24]).
From this, the existence and uniqueness of fixpoints in state formulae immedi-
ately follows.
Example 3. Standard (first order) modal μ-calculus formulae often consist of the
constructs νZ.([⊤]Z ∧ ϕ) and μZ.(ϕ ∨ ([⊤]Z ∧ ⟨⊤⟩⊤)). These formulae represent
“always ϕ” and “eventually ϕ”, respectively. The greatest fixpoint can be interpreted as
“infinitely often”, whereas the least fixpoint can be interpreted as “finitely many
times”.

Example 4. Consider a process with at least the states s0, s1 and s2, the labels
a(⊤) and a(⊥) and the state formula ϕ. We write s |= ϕ to denote that ϕ is
satisfied in state s, and, likewise, we write s ⊭ ϕ to denote that ϕ is not satisfied
in state s. The process has the transitions s0 −a(⊤)→ s1 and s0 −a(⊥)→ s2, with
s1 |= ϕ and s2 ⊭ ϕ.
1. The state formula ∃b:B. [a(b)]ϕ holds in state s0, since there is a b (viz. b = ⊤),
such that whenever we execute a(b), we end up in a state satisfying ϕ.
2. The state formula [∃b:B.a(b)]ϕ does not hold in state s0, since by executing a(⊥)
we end up in a state not satisfying ϕ. An alternative phrasing of the same
property is ∀b:B.[a(b)]ϕ.

Note that data-quantification in action formulae can be used for abstracting


from the actual values for parameterised actions. One may be led to believe that
the quantifiers inside modalities can all be replaced by quantifiers in state for-
mulae. This, however, is not true, as we can only derive the following properties.

Property 1. Let ϕ be a state formula, such that d does not occur in ϕ, and let
α be an action formula. Then, we have the following identities:
1. ⟨∃d:D.α⟩ϕ ⇔ ∃d:D.⟨α⟩ϕ, and [∃d:D.α]ϕ ⇔ ∀d:D.[α]ϕ,
2. ∃d:D.[α]ϕ ⇒ [∀d:D.α]ϕ, and ⟨∀d:D.α⟩ϕ ⇒ ∀d:D.⟨α⟩ϕ
Note: here we use implication as an abbreviation for ⊆ and bi-implication as an
abbreviation for = on the interpretations of the state formulae.
These properties state a weak relation between quantifiers in state formulae
and quantifiers in action formulae. Note that the converse of the second item
does not hold, see e.g. [14, 24] for a counterexample. Thus, compared to the frag-
ment of the first order modal μ-calculus that disallows quantifiers inside action
formulae, the quantifiers inside action formulae indeed increase the expressive
power of the whole first order modal μ-calculus.
Remark 3. Note that negation in combination with universal quantifiers in ac-
tion formulae can yield more exciting sets than the empty set.
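As a small illustration of this remark, derived directly from Definition 4: if D contains at least two elements (so that a(v) and a(v′) are distinct labels for distinct v and v′), then [[∀d:D.a(d)]]ε = ⋂v∈D {a(v)} = ∅, whereas [[∃d:D.a(d)]]ε = Σ \ ⋂v∈D (Σ \ {a(v)}) = Σ ∩ {a(v) | v∈D}, i.e. the set of all a-labels occurring in Σ.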

4 Equation Systems
Following [9], we use an extension of the formalism of boolean equation systems
(see e.g. [18]) as an intermediate formalism that allows us to combine an LPE
with a first order modal μ-calculus expression.
Definition 7. A first order boolean expression is a formula ϕ in positive form,
defined by the following grammar:

ϕ ::= b | ϕ1 ∧ ϕ2 | ϕ1 ∨ ϕ2 | Z(e) | ∀d:D.ϕ | ∃d:D.ϕ

where b is an expression of datatype B, possibly containing data variables d of


a set D, Z is a variable of a set P of propositional variables and e is a term of
datatype D, possibly containing data variables d of a set D.

We denote the set of functions of type D→B by [D→B]. The set of first order
boolean expressions ([D→B], ⇒̇), where ϕ ⇒̇ ψ iff for all d:D ϕ(d) ⇒ ψ(d), forms a
complete lattice and is isomorphic to (2D , ⊆). The propositional variables Z∈P,
occurring as free variables in first order boolean expressions are bound in first
order boolean equation systems (also known as parameterised boolean equation
systems [15] or equation systems for short), used in the sequel.
Definition 8. The interpretation of a first order boolean expression ϕ in the
context of propositional environment θ:P→(D→B) and data environment
η:D→D, written as [[ϕ]]θη is either true or false, determined by the following
induction:
[[b]]θη = [[b]]η
[[ϕ1 ∧ ϕ2 ]]θη = [[ϕ1 ]]θη ∧ [[ϕ2 ]]θη
[[ϕ1 ∨ ϕ2 ]]θη = [[ϕ1 ]]θη ∨ [[ϕ2 ]]θη
[[Z(e)]]θη = θ(Z)([[e]]η)
[[∀d:D.ϕ]]θη = true, if for all v:D it holds that [[ϕ]]θ(η[v/d]) else false
[[∃d:D.ϕ]]θη = true, if there exists a v:D such that [[ϕ]]θ(η[v/d]) else false

Lemma 1. First order boolean expressions are monotone over ([D→B], ⇒̇),
see [14, 24].

Definition 9. An equation system E is a finite sequence of equations of the


form σZ(d:D) = ϕ, where σ is ν or μ and ϕ:D→B is a first order boolean
expression. We require that all data variables are bound exactly once and all
bound propositional variables are unique; ε represents the empty equation system.
The equation system Eθ that is obtained by applying an environment θ to an
equation system E is the equation system in which every propositional variable
Z is assigned the value θ(Z).
Definition 10. The solution [E]θ to the equation system E (containing only
bound data variables), in the context of propositional environment θ:P→(D→B),
is an environment that is defined as follows (see also e.g. [18], Definition 3.3).
1) [ε]θ = θ, and 2) [(σZ(d:D) = ϕ)E]θ = [E](θ[σZ.ϕ([E]θ)/Z]), where
μZ.ϕ([E]θ) = ⋀{ψ:D → B | λd:D.[[ϕ]]([E]θ[ψ/Z]) ⇒̇ ψ}
νZ.ϕ([E]θ) = ⋁{ψ:D → B | ψ ⇒̇ λd:D.[[ϕ]]([E]θ[ψ/Z])}
where ⋀ and ⋁ denote the infimum and the supremum from ([D→B], ⇒̇).
We denote the set of all environments by [P→(D→B)]. The set ([P→(D→B)], ≤)
is a complete lattice, where the ordering ≤ is defined as θ1 ≤ θ2 iff for all Z ∈ P,
we have θ1(Z) ⇒̇ θ2(Z).
Lemma 2. Equation systems are monotone over ([P→(D→B)], ≤), see [14, 24].
We subsequently transform the model-checking problem for processes with
data to the problem of finding the solution to an equation system. We define a
translation, transforming an LPE and a first order modal μ-calculus formula to
an equation system.

Definition 11. Let ϕ = (σZ(df :D).Φ)(e) be a first order modal μ-calculus for-
mula ϕ. Let X(dp :D) be an LPE (see Def. 1). The equation system E that corre-
sponds to the expression ϕ for LPE X is given by E(ϕ). The translation function
E is defined by structural induction in Table 1.

Theorem 1. Let X be an LPE with initial state X(d0 ) and parameter space
D. For each environment ρ:P→(D′→2S) the environment θρ:P→(D×D′→B)
is defined as θρ(Z̃) ≝ λ(d, d′):D×D′.(X(d)∈ρ(Z)(d′)). Let ϕ = (σZ(d:D′).Φ)(e).
Then, we have
d0 |=X ϕ iff ([E(ϕ)]θρ)(Z̃(e, d0))
Proof. See [9].
The translation function E splits nested fixpoint expressions into equations
for which the right-hand side is determined by the function RHS. This latter
function takes care of the logical connectives, such as ∧ and ∨, and the modalities
[α]ϕ and ⟨α⟩ϕ.
Example 5. Consider a coffee-vending machine that produces either cappuccino
or espresso on the insertion of a special coin. It accepts coins as long as there
is at least one type of coffee that can still be dispensed. If the machine has run
out of a type of coffee, it can be filled again with C fillings of cappuccino or E
fillings of espresso.
proc X(b:B, c, e:N) = [b ∧ c > 0]:→ cappuccino · X(¬b, c − 1, e)
+[b ∧ e > 0]:→ espresso · X(¬b, c, e − 1)
+[¬b ∧ c + e > 0]:→ coin · X(¬b, c, e)
+[¬b ∧ c = 0]:→ refillcappuccino · X(b, C, e)
+[¬b ∧ e = 0]:→ refillespresso · X(b, c, E)
Boolean b indicates whether a coin has been inserted or not and parameters
c and e register the number of servings of cappuccino, resp. espresso that are
left in the coffee-vending machine. Consider the (first order) modal μ-calculus
expression μZ.(⟨coin ∨ cappuccino⟩Z ∨ ⟨refillcappuccino⟩⊤), expressing that there
exists a path where cappuccino can be refilled when it is the only thing that has
been ordered. We briefly illustrate how we can obtain an equation system by
combining process X and the above modal formula using Def. 11:
E(μZ.(⟨coin ∨ cappuccino⟩Z ∨ ⟨refillcappuccino⟩⊤))
= (μZ̃(b:B, c, e:N) = RHS(⟨coin ∨ cappuccino⟩Z ∨ ⟨refillcappuccino⟩⊤))
= (μZ̃(b:B, c, e:N) = RHS(⟨coin ∨ cappuccino⟩Z) ∨ RHS(⟨refillcappuccino⟩⊤))
= (μZ̃(b:B, c, e:N) = (¬b ∧ c + e > 0 ∧ RHS(Z)[(¬b, c, e)/(b, c, e)])
∨ (b ∧ c > 0 ∧ RHS(Z)[(¬b, c − 1, e)/(b, c, e)])
∨ (¬b ∧ c = 0 ∧ RHS(⊤)[(b, C, e)/(b, c, e)]))
= (μZ̃(b:B, c, e:N) = (¬b ∧ (c = 0 ∨ (c + e > 0 ∧ Z̃(¬b, c, e))))
∨ (b ∧ c > 0 ∧ Z̃(¬b, c − 1, e)))
Notice that this equation carries the parameters of the LPE, even though the
first order modal μ-calculus expression was not parameterised.

Table 1. Translation of first order μ-calculus formula ϕ and LPE X(dp :D) to an
equation system E(ϕ). Note that Z̃ is a fresh propositional variable, associated to the
propositional variable Z. Function E determines the number and order of equations for
E(ϕ), whereas function RHS breaks down ϕ to obtain first order boolean expressions
that form the right-hand side of each equation in E(ϕ). The satisfaction relation |= and
the function Par are listed in Table 2. The function ParX (ϕ) yields a list of parameters
with types that must be bound by the parameterised propositional variable X. Here,
we have abused the notation ParX (ϕ) to also denote the list of parameters without
typing information. Note that ParX (ϕ) is always calculated for the entire formula ϕ,
and not for subformulae

E(b) ≝ ε
E(Z(df)) ≝ ε
E(Φ1 ∧ Φ2) ≝ E(Φ1) E(Φ2)
E(Φ1 ∨ Φ2) ≝ E(Φ1) E(Φ2)
E([α]Φ) ≝ E(Φ)
E(⟨α⟩Φ) ≝ E(Φ)
E(∀d:D.Φ) ≝ E(Φ)
E(∃d:D.Φ) ≝ E(Φ)
E((σZ(df:Df).Φ)(e)) ≝ (σZ̃(df:Df, dp:Dp, ParZ(ϕ)) = RHS(Φ)) E(Φ)

RHS(b) ≝ b
RHS(Z(e)) ≝ Z̃(e, dp, ParZ(ϕ))
RHS(Φ1 ∧ Φ2) ≝ RHS(Φ1) ∧ RHS(Φ2)
RHS(Φ1 ∨ Φ2) ≝ RHS(Φ1) ∨ RHS(Φ2)
RHS([α]Φ) ≝ ⋀i∈I ∀ei:Di. (ai(fi(dp, ei)) |= α ∧ ci(dp, ei)) ⇒ RHS(Φ)[gi(dp, ei)/dp]
RHS(⟨α⟩Φ) ≝ ⋁i∈I ∃ei:Di. (ai(fi(dp, ei)) |= α ∧ ci(dp, ei) ∧ RHS(Φ)[gi(dp, ei)/dp])
RHS(∀d:D.Φ) ≝ ∀d:D.RHS(Φ)
RHS(∃d:D.Φ) ≝ ∃d:D.RHS(Φ)
RHS((σZ(df:Df).Φ)(e)) ≝ Z̃(e, dp, ParZ(ϕ))

5 Model-Checking
Mader [18] describes an algorithm for solving boolean equation systems. The
method she uses resembles the well-known Gauß elimination algorithm for solv-
ing linear equation systems, and is therefore also referred to as Gauß elimination.
The semi-decision procedure we use (see Fig. 1) is an extension of the Gauß elim-
ination algorithm of [18]. Basically, all lines (except for line 3) are part of the
algorithm of [18]. The essential difference is in the addition of line 3, where an

Table 2. Auxiliary functions used in the translation of Table 1. Here, + + denotes list
concatenation. The satisfaction relation |= checks whether a symbolic action a(d) is
part of an action formula α. The function ParX (ϕ) yields a list of parameters together
with their types that have to be bound by the equation for X

a(d) |= a′(d′) ≝ a = a′ ∧ d = d′
a(d) |= ⊤ ≝ true
a(d) |= ¬α ≝ ¬(a(d) |= α)
a(d) |= α1 ∧ α2 ≝ (a(d) |= α1) ∧ (a(d) |= α2)
a(d) |= α1 ∨ α2 ≝ (a(d) |= α1) ∨ (a(d) |= α2)
a(d) |= ∃d′:D.α ≝ ∃d′:D.(a(d) |= α)
a(d) |= ∀d′:D.α ≝ ∀d′:D.(a(d) |= α)

ParX(b) ≝ []
ParX(Z(df)) ≝ [] for all Z ∈ P
ParX(Φ1 ∧ Φ2) ≝ ParX(Φ1) ++ ParX(Φ2)
ParX(Φ1 ∨ Φ2) ≝ ParX(Φ1) ++ ParX(Φ2)
ParX([α]Φ) ≝ ParX(Φ)
ParX(⟨α⟩Φ) ≝ ParX(Φ)
ParX(∀d:D.Φ) ≝ [d:D] ++ ParX(Φ)
ParX(∃d:D.Φ) ≝ [d:D] ++ ParX(Φ)
ParX((σZ(df:Df).Φ)(e)) ≝ ParX(Φ) for all Z ∈ P such that Z ≠ X
ParX((σX(df:Df).Φ)(e)) ≝ []

extra loop for calculating a stable point in the (transfinite) approximation for
each equation is given.
The reduction of an equation system proceeds in two separate steps. First,
a stabilisation step is issued, in which an equation σi Xi (d:D) = ϕi is reduced
to a stable equation σi Xi(d:D) = ϕ′i, where ϕ′i is an expression containing no
occurrences of Xi. Second, we substitute each occurrence of Xi by ϕ′i in the
rest of the equations of the first order boolean equation system. This does not
change the solution of the equation system (see [14, 24, 15]). Since there are no
more occurrences of Xi in the right-hand side of the equations, it suffices to
reduce a smaller equation system. The semi-decision procedure terminates iff
the stabilisation step terminates for each equation.
Theorem 2. On termination of the semi-decision procedure in Fig. 1, the so-
lution of the given equation system has been computed [14, 24].

Remark 4. Given the undecidability of the model-checking problem in the set-


ting of (arbitrary) infinite state systems, the stabilisation step in our procedure
(which is based on transfinite approximations of fixpoints) cannot be made to

Input: (σ1 X1 (d1 :D1 ) = ϕ1 ) . . . (σn Xn (dn :Dn ) = ϕn ).

1. for i = n downto 1 do
2. j := 0; ψ0 := (if σi = ν then ⊤ else ⊥);
3. repeat ψj+1 := ϕi [Xi := ψj ]; j := j + 1 until (ψj ≡ ψj−1 )
4. for k = 1 to i do ϕk := ϕk [Xi := ψj ] od ;
5. od

Fig. 1. Semi-decision procedure for computing the solution of an equation system

terminate. However, there are a number of cases for which termination is guar-
anteed, e.g. if all considered datatypes are finite.
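As an illustration of such a finite case, the following Python fragment is a small sketch of our own (it is not the MuCheck prototype) that computes the least fixpoint of the equation derived in Example 5 by the approximation of Fig. 1; the bound placed on the counters c and e is an assumption made here only to guarantee stabilisation.

BOUND = 5   # assumed bound on the counters c and e for this illustration

def rhs(Z, b, c, e):
    # right-hand side of the equation for Ztilde(b:B, c, e:N) from Example 5
    return ((not b and (c == 0 or (c + e > 0 and Z[(not b, c, e)])))
            or (b and c > 0 and Z[(not b, c - 1, e)]))

states = [(b, c, e) for b in (False, True)
                    for c in range(BOUND + 1) for e in range(BOUND + 1)]

psi = {s: False for s in states}              # psi_0 = false, since sigma = mu
while True:                                   # stabilisation loop, cf. line 3 of Fig. 1
    nxt = {s: rhs(psi, *s) for s in states}   # psi_{j+1} := phi[Z := psi_j]
    if nxt == psi:
        break
    psi = nxt

print(psi[(False, 0, 3)])   # True: refillcappuccino is reachable from this state
print(psi[(True, 0, 3)])    # False: a coin is inserted, but no cappuccino is left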
Example 6. Consider the infinite-state system C(0) that counts from zero to
infinity, and reports its current state via an action current. Even though
this process is not very complex, it cannot be represented by a finite labelled
transition system, which is why most current tools cannot handle such
processes.
proc C(n:N) = current(n) · C(n+1)

Verifying absence of deadlock, i.e. νZ.(⟨⊤⟩⊤ ∧ [⊤]Z) on process C requires
solving the associated equation system νZ̃(n:N) = (Z̃(n+1) ∧ ⊤). Substituting
⊤ for Z̃(n+1) immediately leads to the stable solution ⊤.
Example 7. Consider a process C representing a counter that counts down from
a randomly chosen natural number to zero and then randomly selects a new
natural number.

proc C(n:N) = [n = 0]:→ reset · C(m) + [n > 0]:→ dec · C(n−1)
m:N

Verifying whether it is possible to eventually execute a reset action, expressed


by the formula μZ.(([⊤]Z ∧ ⟨⊤⟩⊤) ∨ ⟨reset⟩⊤), requires solving the equation system
μZ̃(n:N) = (n = 0 ∨ (Z̃(n−1) ∧ (n = 0 ∨ n > 0))). Here, the approximation of Z̃
does not stabilise in a finite time, as we end up with approximations ψk , where
ψk = n ≤ k. Thus, we cannot find a ψj , such that ψj = ψj+1 . However, it
is straightforward to see that the minimal solution for this equation should be
μZ(n:N) = ,. In [15], we set out to investigate techniques and theorems that
allow one to solve (by hand) a substantially larger class of equation systems than
the class that can be solved using our semi-decision procedure of Figure 1. This
class includes equation systems of the type of this example.

6 Implementation
Based on our semi-decision procedure, described in the previous section, we
have implemented a prototype of a tool1 . The prototype implementation of our
algorithm employs Equational Binary Decision Diagrams (EQ-BDDs) [13] for
representing first order boolean expressions. These EQ-BDDs extend on standard
BDDs [6] by explicitly allowing equality on nodes. We first define the grammar
for EQ-BDDs.

Definition 12. Assume a set P of propositions and a set V of variables, then


EQ-BDDs are formulae, determined by the following grammar.

Φ ::= 0 | 1 | ITE(V = V, Φ, Φ) | ITE(P, Φ, Φ)

The constants 0 and 1 represent false and true. An expression of the form
ITE(ϕ, ψ, ξ) must be read as an if-then-else construct, i.e. (ϕ ∧ ψ) ∨ (¬ϕ ∧ ξ), or,
alternatively, (ϕ ⇒ ψ) ∧ (¬ϕ ⇒ ξ). For data variables d and e, and ϕ of the form
d = e, the extension to EQ-BDDs is used, i.e. we explicitly use ITE(d = e, ψ, ξ)
in such cases. Using the standard BDD and EQ-BDD encodings [6, 13], we can
then represent all quantifier-free first order boolean expressions.
Representing expressions that contain quantifiers over finite domains is done
in a straightforward manner, i.e. we construct explicit encodings for each distinct
element in the domain. Expressions containing quantifiers over infinite domains
are in general problematic, when it comes to representation and calculation. The
following theorem, however, identifies a number of cases in which we can deal
with these.

Theorem 3. Quantification over datatypes can be eliminated in the following


cases:
∃d:D.ITE(d = e1, ϕ1, ITE(d = e2, ϕ2, . . . , ITE(d = en, ϕn, ψ) . . .))
= ⋁1≤i≤n ((⋀1≤j<i ej ≠ ei) ∧ ϕi[ei/d]) ∨ ψ

∀d:D.ITE(d = e1, ϕ1, ITE(d = e2, ϕ2, . . . , ITE(d = en, ϕn, ψ) . . .))
= ⋀1≤i≤n ((⋁1≤j<i ej = ei) ∨ ϕi[ei/d]) ∧ ψ

provided D contains at least one element not in {ei |1 ≤ i ≤ n}. By abuse


of notation we write ei instead of its value.

Proof. See [24].
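For a small worked instance of the first identity, take n = 2: ∃d:D.ITE(d = e1, ϕ1, ITE(d = e2, ϕ2, ψ)) = ϕ1[e1/d] ∨ ((e1 ≠ e2) ∧ ϕ2[e2/d]) ∨ ψ, provided D contains an element different from both e1 and e2.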

Even though Theorem 3 applies to a restricted class of first order boolean


expressions, we find that in practice, it adds considerably to the verification
power of the prototype implementation.
1
The tool, called MuCheck, is distributed as part of the μCRL tool-suite [3].

7 Example Verification
We have successfully used the prototype on several applications, including many
communications protocols, such as the IEEE-1394 firewire, sliding window proto-
cols, the bounded retransmission protocol, etc. As an example of its capabilities,
we here report on our findings for Lamport’s Bakery Protocol [21]. A μCRL
specification of the Bakery Protocol is given in Fig. 2. The bakery protocol we

comm get, send = c

init ∂{get, send} (P(⊤) ‖ P(⊥))

proc P(b:B) = request(b) · P0(b, 0) + send(b, 0) · P(b)
proc P0(b:B, n:N) = Σm:N get(¬b, m) · P1(b, m + 1) + send(b, n) · P0(b, n)
proc P1(b:B, n:N) =
send(b, n) · P1(b, n) + Σm:N get(¬b, m) · (C1(b, n) ◁ n < m ∨ m = 0 ▷ P1(b, n))
proc C1(b:B, n:N) = enter(b) · C2(b, n) + send(b, n) · C1(b, n)
proc C2(b:B, n:N) = leave(b) · P(b) + send(b, n) · C2(b, n)

Fig. 2. Lamport’s Bakery Protocol

consider is restricted to two processes. Each process, waiting to enter its critical
section, can choose a number, larger than any other number already chosen.
Then, the process with the lower number is allowed to enter the critical section
before the process with the larger number. Due to the unbounded growth of
the numbers that can be chosen, the protocol has an infinite state-space. How-
ever, our techniques are immediately applicable. Below, we list a number of key
properties we verify for the bakery protocol.
1. No deadlocks, i.e.
νZ.([⊤]Z ∧ ⟨⊤⟩⊤),
2. Two processes can never be in the critical section at the same time, i.e.
νZ.([⊤]Z ∧ ∀b:B.([enter(b)]νZ′.([enter(¬b)]⊥ ∧ [¬leave(b)]Z′))),
3. All processes requesting a number eventually enter the critical section, i.e.
νZ.([⊤]Z ∧ ∀b:B.([request(b)]μZ′.(([⊤]Z′ ∧ ⟨⊤⟩⊤) ∨ ⟨enter(b)⟩⊤)))
The first two properties are satisfied. Using our prototype we were able to
produce these results in 1 and 2 seconds2 resp. The third property was proved
not to hold in 1 second.

8 Closing Remarks
Related Work. In a setting without data, the use of boolean equation systems for
the verification of modal μ-calculus formulae on finite and infinite state systems

2
These results were obtained using a 1.47Ghz AMD Athlon XP1700+ machine with
256Mb main memory, running Linux kernel 2.2.

was studied by Mader [18]. As observed by Mader, the use of boolean equation
systems is closely related to the tableau methods of Bradfield and Stirling [5],
but avoids certain redundancy of tableaux. It is therefore likely that in the case
with data our approach would perform better than tableau methods, should these
be extended to deal with data.
Closely related to our work is the tool Evaluator 3.0 [19], which is an on-
the-fly model checker for the regular alternation-free μ-calculus. The machinery
of this tool is based on boolean equation systems. Although the alternation-
free μ-calculus allows for the specification of temporal logic properties involving
data, the current version of the tool does not support the data-based version
of this language. It is well imaginable that this tool can be extended with our
techniques.
A different approach altogether is undertaken by e.g. Bryant et al. [7]. Their
Counter arithmetic with Lambda expressions and Uninterpreted functions (CLU)
can be used to model both data and control, and is shown to be decidable.
For this, CLU sacrifices expressiveness, as it is restricted to the quantifier-free
fragment of first order logic. Moreover, their tool (UCLID) is restricted to dealing
with safety properties only. We allow for safety, liveness and fairness properties to
be verified automatically. Nevertheless, CLU is interesting as it provides evidence
that there may be a fragment in our logic or in our specification language that
is decidable, even for infinite state systems.
Much work on symbolic reachability analysis of infinite state systems has
been undertaken, but most of it concentrates on safety properties only. Bouaj-
jani et al. (see e.g. [4]) describe how first-order arithmetical formulae, expressing
safety and liveness conditions, can be verified over Parametric Extended Au-
tomaton models, by specifying extra fairness conditions on the transitions of the
models. The main difference with our approach is that we do not require fairness
conditions on transitions of our models and that the first order modal μ-calculus
is in fact capable of specifying fairness properties.
The technique by Bultan et al. [8] seems to be able to produce results that
are comparable to ours. Their techniques, however, are entirely different from
ours. In fact, their approach is similar to the approach used by Alur et al. [1] for
hybrid systems. It uses affine constraints on integer variables, logical connectives
and quantifiers to symbolically encode transition relations and sets of states. The
logic, used to specify the properties is a CTL-style logic. In order to guarantee
termination, they introduce conservative approximation techniques that may
yield “false negatives”, which always converges. It is interesting to investigate
whether the same conservative approximation techniques can be adapted to our
techniques.

Summary. We discussed a pragmatic approach to verifying data-dependent sys-


tems. The techniques and procedure we used, are based upon the techniques
and algorithms, described by e.g. Mader [18]. A prototype tool implementation
is described and a sample verification is discussed. Apart from the example from
Section 7, the prototype was successfully applied to other systems, among others
the Alternating Bit Protocol, see the discussion in [24].

Summarising, we find that the verifications conducted with our prototype


take in many cases an acceptable run-time, even though for systems with fi-
nite state spaces, our prototype is often outperformed by most well-established
tool-suites. However, we expect some improvements can still be made on the pro-
totype. More importantly, we have been able to successfully use our prototype on
systems with a finite (but extremely large) state-space, for which the standard
μCRL tool-suite (which is competitive with tool-suites that use explicit state-
space representations) failed to calculate the exact state-space (see [24]). Since
this is where current state-of-the-art technologies break down, our technique is
clearly a welcome addition.
Still, several other issues remain to be investigated. For instance, we think our
technique may eventually be used to generalise specialised techniques, such as
developed by Bryant et al. [7, 22]. Furthermore, in [15], we identify several results
and techniques for solving equations and equation systems. In some cases, these
would allow one to skip or speed up the approximation step in our procedure. A
promising step is to implement those techniques and results and integrate them
with the approach that is outlined in this paper.
The prototype implementation also revealed a number of issues to be resolved.
For instance, using our prototype, we were not able to prove absence of deadlock
in the Bounded Retransmission Protocol [12], when the bound on the number
of retransmissions is unknown. Finding effective work-arounds for such problems is
necessary to improve the overall efficacy of our technique. Here, the techniques
and results of [15] may turn out to be of particular importance.

References
1. R. Alur, C. Courcoubetis, N. Halbwachs, T.A. Henzinger, P.-H. Ho, X. Nicollin,
A. Olivero, J. Sifakis, and S. Yovine. The algorithmic analysis of hybrid systems.
Theoretical Computer Science, 138:3–34, 1995.
2. J.C.M. Baeten and W.P. Weijland. Process Algebra. Cambridge Tracts in Theo-
retical Computer Science. Cambridge University Press, 1990.
3. S.C.C. Blom, W.J. Fokkink, J.F. Groote, I. van Langevelde, B. Lisser, and J.C.
van de Pol. μCRL: A toolset for analysing algebraic specifications. In CAV’01,
volume 2102 of LNCS, pages 250–254. Springer-Verlag, 2001.
4. A. Bouajjani, A. Collomb-Annichini, Y. Lakhnech, and M. Sighireanu. Analysis of
fair extended automata. In P. Cousot, editor, Proceedings of SAS’01, volume 2126
of LNCS, pages 335–355. Springer-Verlag, 2001.
5. J.C. Bradfield and C. Stirling. Local model checking for infinite state spaces.
Theoretical Computer Science, 96(1):157–174, 1992.
6. R.E. Bryant. Graph-based algorithms for Boolean function manipulation. IEEE
Transactions on Computers, C-35(8):677–691, 1986.
7. R.E. Bryant, S.K. Lahiri, and S.A. Seshia. Modeling and verifying systems using a
logic of counter arithmetic with lambda expressions and uninterpreted functions.
In CAV 2002, volume 2404 of Lecture Notes in Computer Science, pages 78–92.
Springer-Verlag, 2002.

8. T. Bultan, R. Gerber, and W. Pugh. Symbolic model checking of infinite state


systems using Presburger arithmetic. In O. Grumberg, editor, CAV’97, volume
1254 of LNCS, pages 400–411. Springer-Verlag, 1997.
9. J.F. Groote and R. Mateescu. Verification of temporal properties of processes in
a setting with data. In A.M. Haeberer, editor, AMAST’98, volume 1548 of LNCS,
pages 74–90. Springer-Verlag, 1999.
10. J.F. Groote and A. Ponse. The syntax and semantics of μCRL. In A. Ponse,
C. Verhoef, and S.F.M. van Vlijmen, editors, Algebra of Communicating Processes
’94, Workshops in Computing Series, pages 26–62. Springer-Verlag, 1995.
11. J.F. Groote and M.A. Reniers. Algebraic process verification. In J.A. Bergstra,
A. Ponse, and S.A. Smolka, editors, Handbook of Process Algebra, chapter 17, pages
1151–1208. Elsevier (North-Holland), 2001.
12. J.F. Groote and J.C. van de Pol. A bounded retransmission protocol for large data
packets. A case study in computer checked verification. In M. Wirsing and M. Nivat,
editors, AMAST’96, volume 1101 of LNCS, pages 536–550. Springer-Verlag, 1996.
13. J.F. Groote and J.C. van de Pol. Equational binary decision diagrams. In
M. Parigot and A. Voronkov, editors, LPAR2000, volume 1955 of LNAI, pages
161–178. Springer-Verlag, 2000.
14. J.F. Groote and T.A.C. Willemse. A checker for modal formulas for processes
with data. Technical Report CSR 02-16, Eindhoven University of Technology,
Department of Mathematics and Computer Science, 2002.
15. J.F. Groote and T.A.C. Willemse. Parameterised Boolean Equation Systems.
Technical Report CSR 04-09, Eindhoven University of Technology, Department
of Mathematics and Computer Science, 2004. An extended abstract is to appear
in CONCUR’04, LNCS, Springer-Verlag, 2004.
16. D. Kozen. Results on the propositional mu-calculus. Theoretical Computer Science,
27:333–354, 1983.
17. S.P. Luttik. Choice quantification in process algebra. PhD thesis, University of
Amsterdam, April 2002.
18. A. Mader. Verification of Modal Properties Using Boolean Equation Systems. PhD
thesis, Technical University of Munich, 1997.
19. R. Mateescu and M. Sighireanu. Efficient on-the-fly model-checking for regular
alternation-free mu-calculus. In S. Gnesi, I. Schieferdecker, and A. Rennoch, edi-
tors, FMICS’2000, pages 65–86, 2000.
20. R. Milner. Communication and Concurrency. Prentice Hall International, 1989.
21. M. Raynal. Algorithms for Mutual Exclusion. North Oxford Academic, 1986.
22. O. Strichman, S.A. Seshia, and R.E. Bryant. Deciding separation formulas with
SAT. In CAV 2002, volume 2404 of LNCS, pages 209–222. Springer-Verlag, 2002.
23. Y.S. Usenko. Linearization in μCRL. PhD thesis, Eindhoven University of Tech-
nology, December 2002.
24. T.A.C. Willemse. Semantics and Verification in Process Algebras with Data and
Timing. PhD thesis, Eindhoven University of Technology, February 2003.
Semantic Essence of AsmL:
Extended Abstract

Yuri Gurevich, Benjamin Rossman, and Wolfram Schulte


Microsoft Research
One Microsoft Way, Redmond, WA 98052, USA

Abstract. The Abstract State Machine Language, AsmL, is a novel exe-


cutable specification language based on the theory of Abstract State Ma-
chines. AsmL is object-oriented, provides high-level mathematical data-
structures, and is built around the notion of synchronous updates and
finite choice. AsmL is fully integrated into the .NET framework and Mi-
crosoft development tools. In this paper, we explain the design rationale
of AsmL and sketch semantics for a kernel of the language. The details
will appear in the full version of the paper.

1 Introduction
For years, formal method advocates have criticized specification and documenta-
tion practices of the software industry. They point out that neither more rigorous
English nor semi-formal notation like UML protect us from unintended ambigu-
ity or missing important information. The more practical among them require
specifications to be linked to executable code. Without such a linkage one can-
not debug the specification or impose it. Non-linked specifications tend quickly
to become obsolete.
We agree with the critique. We need specifications that are precise, readable
and executable. The group of Foundations of Software Engineering at Microsoft
Research [4] was not satisfied with existing solutions of the specification problem
(we address related work in the full paper [8]) and has worked out a new solution
based on the theory of abstract state machines [5, 6, 3, 9]. We think of specifica-
tions as executable models that exhibit the desired behavior on the appropriate
level of abstraction. Abstract State Machine Language, AsmL, is a language for
writing such models [1].
The FSE group has designed AsmL, implemented it and integrated it with
the Microsoft runtime and tool environment. Furthermore, the group has built
various tools on top of AsmL.

1.1 Language Features


The language features of AsmL were chosen to give the user a familiar program-
ming paradigm. For instance, AsmL supports classes and interfaces in the same
way as C# or Java do. In fact all .NET structuring mechanisms are supported:
enumerations, delegates, methods, events, properties and exceptions. Neverthe-


less, AsmL is primarily a specification language. Users familiar with the speci-
fication language literature will find familiar data structures and features, like
sets, sequences, maps, pattern matching, bounded quantification, and set com-
prehension.
But the crucial features of AsmL, intrinsic to ASMs, are massive synchronous
parallelism and finite choice. These features give rise to a cleaner programming
style than is possible with standard imperative programming languages. Syn-
chronous parallelism means that AsmL has transactional semantics. (A single
step of AsmL can be viewed as a transaction. This transaction may involve mas-
sive parallelism.) This provides for a clean separation between the generation
of new values and the committal of those values into the persistent state. For
instance, when an exception is thrown, the state is automatically rolled back
rather than being left in an unknown and possibly inconsistent state. Finite
choice allows the specification of a range of behaviors permissible for an (even-
tual) implementation.

1.2 AsmL-S, a Core of AsmL


AsmL is rich. It incorporates features needed for .NET integration and features
needed to support the tools built on top of AsmL. AsmL-S represents the stable
core of AsmL; the S alludes to “simple”. In this semantical study we allow
ourselves to compactify the syntax and ignore some features that do not add
semantical complexity. In particular, maps, sequences and sets are first-class
citizens of the full AsmL. In AsmL-S only maps are first-class citizens. Sets of
type t can be represented as maps from t to a unit type.

Acknowledgments. Without the support from the FSE group this work would
not be possible; particular thanks go to Wolfgang Grieskamp and Nikolai Till-
mann for developing the runtime mechanism of AsmL. Thanks to Robert Stärk
for his comments on the draft of this extended abstract.

2 AsmL-S Through Examples


One can see AsmL as a fusion of the ASM paradigm and the .NET type system,
influenced to an extent by other specification languages like VDM [2] or Z [15].
This makes it a powerful modeling tool. On the other hand, we also aimed for
simplicity. That is why AsmL is designed in such a way that its core, AsmL-
S, is small. AsmL-S is expression and object oriented. It supports synchronous
parallelism, finite choice, sequential composition and exception handling. The
rest of this section presents examples of AsmL-S programs and expressions. For
the abstract syntax of AsmL-S, see Fig. 1 in Section 3.
Remark 1. The definitions in this section are provisional, having been simplified
for the purpose of explaining examples. The notions introduced here (stores,
effects, evaluation, etc.) are, of course, defined formally in the full paper.

2.1 Expressions
In AsmL-S, expressions are the only syntactic means for writing executable spec-
ifications. Binding and function application are call-by-value. (The necessity of
.NET integration is a good reason all by itself not to use lazy evaluation.)
Literal is the set of literals, like 1, true, null or void . We write the value
denoted by a literal as the literal itself. Literals are typed; for instance, 1 is of
type Int and true is of type Bool . AsmL-S has various operations on Literal , like
addition over integers or conjunction, i.e. and , over Bool .
Exception is an infinite set of exceptions, that is disjoint from Literal . For
now, think of exceptions as values representing different kinds of errors. We will
discuss exceptions further in Section 2.8.
If e is a closed expression, i.e. an expression without free variables, and v is a
literal or an exception, then e −v→ v means that e evaluates to v. The “v” above
the arrow alludes to “value”.
Examples 1–5 show how to evaluate simple AsmL-S expressions.

Evaluation of Simple Expressions


1 + 2 −v→ 3 (1)
1/0 −v→ argX (2)
let x = 1 do x + x −v→ 2 (3)
let x = 1/0 do 2 −v→ argX (4)
if true then 0 else 3 −v→ 0 (5)

For instance, Example 4 shows that let-expressions expose call-by-value se-


mantics: if the evaluation of the binding fails (in this case, resulting in the argu-
ment exception), then the complete let-expression fails, irrespective of whether
the body is used the binding.

2.2 Object Orientation


AsmL-S encapsulates state and behavior in classes. As in C# or Java, classes
form a hierarchy according to single inheritance. We use only the single dispatch
of methods. Objects are dynamically allocated. Each object has a unique identity.
Objects can be created, compared and passed around.
ObjectId is an infinite set of potential object identifiers, that is disjoint from
Literal and Exception. Normal values are either object identifiers in ObjectId or
literals. Values are either normal values or exceptions.

Nvalue = ObjectId ∪ Literal


Value = Nvalue ∪ Exception

A type map is a partial function from ObjectId to Type. It sends allocated


objects to their runtime types. A location is an object identifier together with a

field name drawn from a set FieldId . A content map is a partial function from
Location to Nvalue. It records the initial bindings for all locations.

TypeMap = ObjectId → Type


Location = ObjectId × FieldId
ContentMap = Location → Nvalue
If e is a closed expression, then e −θ,ω,v→ θ, ω, v means that the evaluation of
e produces the type map θ, the content map ω and the value v. Examples 6–14
demonstrate the object oriented features of AsmL-S.
class A {} : new A() −θ,ω,v→ {o → A}, ∅, o (6)
The execution of a nullary constructor returns a fresh object identifier o and
extends the type map. The fresh object identifier o is mapped to the dynamic
type of the object.

class A {i as Int}, class B extends A {b as Bool } : (7)


new B(1, true) −θ,ω,v→ {o → B}, {(o, i) → 1, (o, b) → true}, o

The default constructor in AsmL-S takes one parameter for each field in
the order of their declaration. The constructor extends the type map, extends
the field map using the corresponding arguments, and returns a fresh object
identifier.
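The effect of the default constructor can be pictured with the following Python fragment, a small sketch of our own (it is not the AsmL runtime); the helper names are purely illustrative, and class and field names are represented by strings.

import itertools

fresh = ("o%d" % i for i in itertools.count())   # supply of fresh object identifiers

def construct(type_map, content_map, cls, fields, args):
    # evaluate 'new cls(args)'; fields lists the declared field names in order
    o = next(fresh)
    new_type_map = {**type_map, o: cls}
    new_content_map = {**content_map, **{(o, f): v for f, v in zip(fields, args)}}
    return new_type_map, new_content_map, o      # the produced (theta, omega, v)

theta, omega, o = construct({}, {}, "B", ["i", "b"], [1, True])
print(theta)    # {'o0': 'B'}
print(omega)    # {('o0', 'i'): 1, ('o0', 'b'): True}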
class A {i as Int} : new A(1).i −v→ 1 (8)

Instance fields can immediately be accessed.


  
class A {Fact(i as Int) as Int do if i = 0 then 1 else i ∗ me.Fact(i − 1)} :
new A().Fact(3) −θ,ω,v→ {o → A}, ∅, 6 (9)

Method calls have call-by-value semantics. Methods can be recursive. Within


methods the receiver object is denoted by me.

class A {One() as Int do 1,


Two() as Int do me.One() + me.One()}, (10)
class B extends A {One() as Int do − 1} : new B().Two() −v→ −2

As in C# or Java, method dispatch is dynamic. Accordingly, in this example,


it is the redefined method that is used for evaluation.

class A {i as Int} : (11)


let x = (if 3 < 4 then null else new A(1)) do x.i −v→ nullX

If the receiver of a field or method selection is null , evaluation fails and throws
a null pointer exception.
class A {}, class B extends A {} : new B() is A −v→ true (12)

The operator is tests the dynamic type of the expression.


class A {}, class B extends A {} : new B() as A −θ,ω,v→ {o → B}, ∅, o (13)

Casting checks that an instance is a subtype of the given type, and if so then
yields the instance without changing the dynamic type of the instance.
class A {}, class B extends A {} : new A() as B −v→ castX (14)

If casting fails, evaluation throws a cast exception.

2.3 Maps
Maps are finite partial functions. A map display is essentially the graph of the
partial function. For example, a map display m = {1 → 2, 3 → 4} represents the
partial function that maps 1 to 2 and 3 to 4. The map m consists of two maplets
1 → 2 and 3 → 4 mapping keys (or indices) 1, 3 to values 2, 4 respectively.
Remark 2. In AsmL, maps can be also described by means of comprehension
expressions. For example, {x → 2 ∗ x | x ∈ {1, 2, 3}} denotes {1 → 2, 2 →
4, 3 → 6}. In AsmL-S map comprehension should be programmed.
The maps of AsmL-S are similar to associative arrays of AWK or Perl. Maps
have identities and each key gives rise to a location. Arbitrary normal values can
serve as keys. We extend the notion of a location accordingly.

Location = ObjectId × (FieldId ∪ Nvalue)

Maps may be modified (see Section 2.4). Maps are often used in forall and
choose expressions (see Sections 2.5 and 2.7). Examples 15–19 exhibit the use of
maps in AsmL-S.

new Int → Bool {1 → true, 5 → false} (15)


−θ,ω,v→ {o → (Int → Bool)}, {(o, 1) → true, (o, 5) → false}, o

A map constructor takes the map type and the initial map as arguments.
new Int → Bool {1 → true, 1 → false} −v→ argconsistencyX (16)

If a map constructor is inconsistent (i.e. includes at least two maplets with


identical keys but different values), then the evaluation throws an inconsistency
exception.
(new Int → Bool {1 → true})[1] −v→ true (17)

The value of a key can be extracted by means of an index expression.


(if true then null else new Int → Int {1 → 7})[1] −v→ nullX (18)

(new Int → Int {1 → 7})[2] −v→ mapkeyX (19)

However, if the receiver of the index expression is null or if the index is not in
the domain of the map, then the evaluation throws a null exception or a map-key
exception, respectively.

Remark 3. AsmL-S treats maps differently than the full AsmL. The full AsmL
is more sophisticated; it treats maps as values which requires partial updates [7].
In AsmL-S, maps are objects. An example illustrating this difference is given in
Section 2.10.

2.4 Assignments
One of AsmL’s unique features is its handling of state. In sequential languages,
like C# or Java, assignments trigger immediate state changes. In ASMs, and
therefore also in AsmL, an assignment creates an update. An update is a pair:
the first component describes the location to update, the second the value to
which it should be updated. An update set is a set of updates. A triple that
consists of a type map, a content map and an update set will be called a store.

Update = Location × (Value ∪ {DEL})


UpdateSet = SetOf (Update)
Store = TypeMap × ContentMap × UpdateSet

Note that we extended Value with a special symbol DEL which is used only
with locations given by map keys and which marks keys to be removed from the
map.
If e is a closed expression, then e −s,v→ s, v means that evaluation of e pro-
duces the store s and the value v. Examples 20–23 show the three ways to create
updates. Note that in AsmL-S, but not in AsmL, all fields and keys can be up-
dated. AsmL distinguishes between constants and variables and allows updates
only to the latter.

class A {i as Int} : (20)


new A(1).i := 2 −s,v→ ({o → A}, {(o, i) → 1}, {((o, i), 2)}), void

A field assignment is expressed as usual. However, it does not change the


state. Instead, it returns the proposed update.
 
(new Int → Bool {1 → true})[2] := false (21)
−s,v→ ({o → Int → Bool}, {(o, 1) → true}, {((o, 2), false)}), void

A map-value assignment behaves similarly. Note that the update set is created
irrespective of whether the location exists or not.
 
remove (new Int → Bool {1 → true})[1] (22)
−s,v→ ({o → Int → Bool}, {(o, 1) → true}, {((o, 1), DEL)}), void
The remove instruction deletes an entry from the map by generating an
update that pairs the location to be deleted with the placeholder DEL.
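
The three ways to create updates can be sketched in a few lines of Python (names and representation are ours, following the definition Update = Location × (Value ∪ {DEL}) above): an assignment only returns a proposed update, it does not touch the content map.

```python
DEL = object()   # placeholder marking a key for removal, as in the Update definition

def assign_field(obj, field, value):
    """Field assignment: propose an update for location (obj, field)."""
    return {((obj, field), value)}

def assign_key(map_obj, key, value):
    """Map-value assignment: same shape, the key plays the role of the field."""
    return {((map_obj, key), value)}

def remove_key(map_obj, key):
    """remove m[key]: an update carrying the DEL placeholder."""
    return {((map_obj, key), DEL)}

# Example (20): new A(1).i := 2 yields the single update ((o, 'i'), 2)
o = "o"
assert assign_field(o, "i", 2) == {((o, "i"), 2)}
```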
class A {F (map as Int → A, val as A) as Void do map[0] := val },
class B extends A {} : (23)
v
let a = new A() do a.F (new Int → B {}, a) −→ mapvalueX
Since Int → B is a subtype of Int → A, it is reasonable that this piece of
code type-checks successfully at compile time. However, the assignment fails at
runtime and throws a map-assignment exception. Thus, map assignments must
be type-checked at runtime. (The same reason forces runtime type-checks of
array assignments in C# or Java.)

2.5 Parallel Composition


Hand in hand with the deferred update of the state goes the notion of syn-
chronous parallelism. It allows the simultaneous generation of finitely many up-
dates. Examples 24–27 show two ways to construct synchronous parallel updates
in AsmL-S.
 
let x = new Int → Int {} do (x[2] := 4 ∥ x[3] := 9) (24)
s,v  
−−→ {o → Int → Int}, ∅, {((o, 2), 4), ((o, 3), 9)} , void
Parallel expressions may create multiple updates. Update sets can be in-
consistent. A consistency check is performed when a sequential composition of
expressions is evaluated and at the end of the program.
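
A minimal Python sketch of this behaviour (illustrative only, with our own helper names): a parallel composition produces the union of the update sets of its branches, and a set is inconsistent when some location is mapped to two different values.

```python
def merge_parallel(*update_sets):
    """Union of the update sets produced by the parallel branches."""
    return set().union(*update_sets)

def is_consistent(updates):
    """An update set is consistent if no location gets two different values."""
    seen = {}
    for loc, val in updates:
        if loc in seen and seen[loc] != val:
            return False
        seen[loc] = val
    return True

# Example (24): x[2] := 4 in parallel with x[3] := 9 is consistent.
u = merge_parallel({(("o", 2), 4)}, {(("o", 3), 9)})
assert is_consistent(u)
# Example (29), first half: x[2] := 8 in parallel with x[2] := 6 is inconsistent.
assert not is_consistent(merge_parallel({(("o", 2), 8)}, {(("o", 2), 6)}))
```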
let x = new Int → Int {} do
let y = new Int → Void {2 → void , 3 → void } do
forall i in y do x[i] := 2 ∗ i (25)
s,v 
−−→ {o1 → Int → Int, o2 → Int → Void },

{(o2 , 2) → void , (o2 , 3) → void }, {((o1 , 2), 4), ((o1 , 3), 9)} , void
Parallel assignments can also be performed using forall expressions. In a
forall expression forall x in e1 do e2 , the subexpression e1 must evaluate to a
map. The subexpression e2 is then executed with all possible bindings of the
introduced variable to the elements in the domain of the map.
 
let x = new Int → Int {} do forall i in x do x[i] := 1/i (26)
s,v
−−→ (∅, ∅, ∅), void

If the range of a forall expression is empty, it simply returns the literal void .
 
let x = new Int → Int {2 → 4} do let y = x[2] do (x[2] := 8) ∥ y (27)
s,v  
−−→ {o → Int → Int}, {(o, 2) → 4}, {((o, 2), 8)} , 4
Parallel expressions can return values. In full AsmL, the return value is distin-
guished syntactically by writing return. In AsmL-S, the value of the second
expression is returned, whereas forall-expressions return void .

2.6 Sequential Composition


AsmL-S also supports sequential composition. Not only does AsmL-S commit
updates on the state, as in conventional imperative languages, but it also accu-
mulates updates, so that the result of a sequential composition can be used in the
context of a parallel update as well. Examples 28–31 demonstrate this important
feature of AsmL-S.
 
let x = new Int → Int {2 → 4} do (x[2] := 8) ; (x[2] := x[2] ∗ x[2]) (28)
s,v  
−−→ {o → Int → Int}, {(o, 2) → 4}, {((o, 2), 64)} , void
The evaluation of a sequential composition e1 ; e2 at a state S proceeds as
follows. First e1 is evaluated at S. If no exception is thrown and the resulting
update set is consistent, then the update set is fired (or executed) at S. This cre-
ates an auxiliary state S′. Then e2 is evaluated at S′, after which S′ is forgotten.
The current state is still S. The accumulated update set consists of the updates
generated by e2 at S′ and the updates of e1 that have not been overridden by
updates of e2 .
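
This accumulation can be sketched in Python (the helper names are invented for the illustration, and removals via DEL are omitted): fire the updates of e1 on a copy of the content map to obtain S′, evaluate e2 there, and keep the updates of e1 that e2 did not override.

```python
def fire(content_map, updates):
    """Execute a consistent update set on a copy of the content map."""
    new_state = dict(content_map)
    for loc, val in updates:
        new_state[loc] = val          # removals via DEL omitted for brevity
    return new_state

def seq_compose(content_map, eval_e1, eval_e2):
    """Evaluate e1 at S, fire its updates to get S', evaluate e2 at S', accumulate."""
    u1 = eval_e1(content_map)
    aux_state = fire(content_map, u1)              # auxiliary state S'
    u2 = eval_e2(aux_state)
    overridden = {loc for loc, _ in u2}
    return {(loc, v) for loc, v in u1 if loc not in overridden} | u2

# Example (28): x[2] := 8 ; x[2] := x[2] * x[2]  over initial content {(o,2): 4}
state = {("o", 2): 4}
acc = seq_compose(state,
                  lambda s: {(("o", 2), 8)},
                  lambda s: {(("o", 2), s[("o", 2)] ** 2)})
assert acc == {(("o", 2), 64)}
```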
let x = new Int → Int {2 → 4} do (29)
  v
(x[2] := 8 ∥ x[2] := 6) ; x[2] := x[2] ∗ x[2] −→ updateX
If the update set of the first expression is inconsistent, then execution fails
and throws an inconsistency exception.
let x = new Int → Int {1 → 2} do
 
(x[2] := 4 ∥ x[3] := 6) ; x[3] := x[3] + 1 (30)
s,v  
−−→ {o → Int → Int}, {(o, 1) → 2}, {((o, 2), 4), ((o, 3), 7)} , void
In this example, the update ((o, 3), 6) from the first expression of the sequen-
tial pair is overridden by the update ((o, 3), 7) from the second expression, which
is evaluated in the state with content map {(o, 1) → 2, (o, 2) → 4, (o, 3) → 6}.
 
let x = new Int → Int {1 → 3} do while x[1] > 0 do x[1] := x[1] − 1 (31)
s,v  
−−→ {o → Int → Int}, {(o, 1) → 3}, {((o, 1), 0)} , void
While loops behave as in usual sequential languages, except that a while loop
may be executed in parallel with other expressions and the final update set is
reported rather than executed.

2.7 Finite Choice


AsmL-S supports choice between a pair of alternatives or among values in the
domain of a map. The actual job of choosing a value from a given set X of
alternatives is delegated to the environment. On the abstraction level of AsmL-
S, an external function oneof(X) does the job. This is similar to delegating to
the environment the duty of producing fresh object identifiers, by means of an
external function freshid().
Evaluation of a program, when convergent, returns one effect and one value.
Depending on the environment, different evaluations of the same program may
return different effects and values. Examples 32–36 demonstrate finite choice in
AsmL-S.
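
The delegation of choice to the environment can be pictured with a Python sketch in which oneof and freshid are supplied from outside the evaluator; the Environment class and its wiring are our own illustration, only the two function names come from the text above.

```python
import random

class Environment:
    """Supplies the external functions used by the evaluator."""
    def __init__(self, seed=None):
        self._rng = random.Random(seed)
        self._counter = 0

    def oneof(self, alternatives):
        """Pick one value from a finite set of alternatives (e1 [] e2, choose)."""
        return self._rng.choice(list(alternatives))

    def freshid(self):
        """Produce a fresh object identifier for new."""
        self._counter += 1
        return f"o{self._counter}"

# Example (32): 1 [] 2 evaluates to oneof{1, 2}; different environments may differ.
env = Environment(seed=0)
assert env.oneof({1, 2}) in {1, 2}
```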
v
1 [] 2 −→ oneof{1, 2} (32)

An expression e1 [] e2 chooses between the given pair of alternatives.


 
choose i in new Int → Void {1 → void , 2 → void } do i
s,v  
−−→ oneof{ (({o → Int → Void }, {(o, 1) → void , (o, 2) → void }, ∅), 1),    (33)
           (({o → Int → Void }, {(o, 1) → void , (o, 2) → void }, ∅), 2) }

Choice-expressions choose from among values in the domain of a map.


  v
choose i in new Int → Int {} do i −→ choiceX (34)

If the choice domain is empty, a choice exception is thrown. (The full AsmL
distinguishes between choose-expressions and choose-statements. The choose-
expression throws an exception if the choice domain is empty, but the choose-
statement with the empty choice domain is equivalent to void .)

class Math{Double(x as Int) as Int do 2 ∗ x} : (35)


v
new Math().Double(1 [] 2) −→ oneof{2, 4}

class Math{Double(x as Int) as Int do 2 ∗ x} : (36)


v
new Math().Double(1) [] new Math().Double(2) −→ oneof{2, 4}

Finite choice distributes over function calls.

2.8 Exception Handling


Exception handling is mandatory for a modern specification language. In any
case, it is necessary for AsmL because of the integration with .NET. The parallel
execution of AsmL-S means that several exceptions can be thrown at once.
Exception handling behaves as a finite choice for the specified caught exceptions.
If an exception is caught, the store (including updates) computed by the try-
expression is rolled back.
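
A small Python sketch of the rollback behaviour (the names are ours, and for brevity the sketch applies updates to the store eagerly and restores a snapshot, instead of discarding an update set): if the try-expression produces a caught exception, everything it did to the store is undone and the handler is evaluated against the original store.

```python
class Thrown(Exception):
    """Wraps an AsmL-S exception value such as argX or fooX."""
    def __init__(self, name):
        super().__init__(name)
        self.name = name

def try_catch(store, eval_try, caught, eval_handler):
    """Evaluate the try-expression; roll the store back if a caught exception occurs."""
    snapshot = dict(store)                # the store before the try-expression
    try:
        return eval_try(store)
    except Thrown as exc:
        if exc.name not in caught:
            raise                         # uncaught exceptions propagate up
        store.clear()
        store.update(snapshot)            # roll back changes made by the try part
        return eval_handler(store)

# Example (38): the division by zero is caught, the update to x[1] is abandoned.
store = {}
def try_part(s):
    s[("o", 1)] = 2                       # x[1] := 2
    raise Thrown("argX")                  # 4/0
result = try_catch(store, try_part, {"argX"}, lambda s: 5)
assert result == 5 and store == {}
```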

In AsmL-S, exceptions are special values similar to literals. For technical


reasons, it is convenient to distinguish between literals and exceptions. Even
though exceptions are values, an exception cannot serve as the content of a field,
for example. (In the full AsmL, exceptions are instances of special exceptional
classes.) There are several built-in exceptions: argX , updateX , choiceX , etc. In
addition, one may introduce further exception names, e.g. fooX .
 
class A {Fact(n as Int) as Int do if n ≥ 0 then
(if n = 0 then 1 else Fact(n − 1)) else throw factorialX } : (37)
v
new A().Fact(−5) −→ factorialX

Custom exceptions may be generated by means of a throw-expression. Built-


in exceptions may also be thrown. Here, for instance, throw argX could appro-
priately replace throw factorialX .
Examples 38–40 explain exception handling.
   
let x = new Int → Int {} do try x[1] := 2 ; x[3] := 4/0 catch argX : 5
s,v  
−−→ {o → Int → Int}, ∅, ∅ , 5 (38)

The argument exception triggered by 4/0 in the try-expression is caught, at


which point the update ((x, 1), 2) is abandoned and evaluation proceeds with
the contingency expression 5. In general, the catch clause can involve a sequence
of exceptions: a “catch” occurs if the try expression evaluates to any one of the
enumerated exceptions. Since there are only finitely many built-in exceptions and
finitely many custom exceptions used in a program, a catch clause can enumerate
all exceptions. (This is common enough in practice to warrant its own syntactic
shortcut, though we do not provide one in the present paper.)
  v
try throw fooX catch barX , bazX : 1 −→ fooX (39)

Uncaught exceptions propagate up.


v
throw fooX ∥ throw barX −→ oneof{fooX , barX } (40)

If multiple exceptions are thrown in parallel, one of them is returned nonde-


terministically.
v
throw fooX [] 1 −→ oneof{fooX , 1} (41)

Finite choice is “demonic”. This means that if one of the alternatives of a


choice expression throws an exception and the other one converges normally, the
result might be either that the exception is propagated or that the value of the
normally terminating alternative is returned.

2.9 Expressions with Free Variables


Examples 1-41 illustrate operational semantics for closed expressions (containing
no free variables). In general, an expression e contains free variables. In this case,

operational semantics of e is defined with respect to an evaluation context (b, r)


consisting of a binding b for the free variables of e and a store r = (θ, ω, u) where
for each free variable x, b(x) is either a literal or an object identifier in dom(θ).
v
We write e −→ b,r v if computation of e in evaluation context (b, r) produces
value v.
v
x + y −→{x → 7, y → 11}, (∅,∅,∅) 18 (42)

v
ℓ[2] −→{ℓ → o}, ({o → Int→Bool},{(o,2) → false},∅) false (43)
s,v
A more general notation e −−→ b,r s, v means that a computation of e in
evaluation context (b, r) produces new store s and value v.

2.10 Maps as Objects


This subsection expands Remark 3. It was prompted by a question of Robert
Stärk who raised the following example.

class A {f as Int → Bool , g as Int → Bool } :


let a = new A(new Int → Bool {1 → true, 2 → true},
new Int → Bool {}) do (44)
a.g := a.f ; a.f [2] := false
s,v 
−−→ {o1 → A, o2 → Int → Bool , o3 → Int → Bool },

{(o1 , f ) → o2 , (o1 , g) → o3 }, {((o1 , g), o2 ), ((o2 , 2), false)} , void

In this example, the first assignment a.g := a.f is responsible for the update
((o1 , g), o2 ); the second assignment gives rise to the update ((o2 , 2), false). Thus,
a.g[2] has value false after all updates are executed.
This same program has a different semantics in the full AsmL, where maps
are treated as values rather than objects. In AsmL, the assignment a.g := a.f has
the effect of updating a.g to the value of a.f , i.e., the map {1 → true, 2 → true}.
The second assignment, a.f [2] := false, has no bearing on a.g. Thus, a.g[2] has
value true after all updates are executed.
In treating maps as objects in AsmL-S, we avoid having to introduce the
machinery of partial updates [7], which is necessary for the treatment of maps
as values in AsmL. This causes a discrepancy between the semantics of AsmL-S
and of AsmL. Fortunately, there is an easy AsmL-S expression that updates the
value of a map m1 to the value of another map m2 (without assigning m2 to
m1 ):

forall i in m1 do remove m1 [i] ; forall i in m2 do m1 [i] := m2 [i]

The first forall expression erases m1 ; the second forall expression copies m2
to m1 at all keys i in the domain of m2 .
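
The difference can be replayed in Python (an illustration only): treating maps as objects corresponds to aliasing a mutable dict, treating them as values corresponds to copying it, and the erase-and-copy idiom above is the explicit way to recover value semantics.

```python
# Maps as objects (AsmL-S): a.g := a.f aliases the same map object,
# so the later update of key 2 is visible through both fields.
f = {1: True, 2: True}
g_obj = f                 # a.g := a.f  (the object identity is copied)
f[2] = False              # a.f[2] := false
assert g_obj[2] is False  # a.g[2] = false, as in example (44)

# Maps as values (full AsmL): the assignment copies the current value of a.f,
# so the later update of a.f has no bearing on a.g.
f = {1: True, 2: True}
g_val = dict(f)           # value semantics: a snapshot of a.f
f[2] = False
assert g_val[2] is True   # a.g[2] = true

def copy_map(m1, m2):
    """Erase-and-copy: forall i in m1 do remove m1[i] ; forall i in m2 do m1[i] := m2[i]."""
    m1.clear()
    m1.update(m2)
```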

3 Syntax and Semantics


The syntax of AsmL-S is similar to but different from that of the full AsmL.
In this semantics paper, an attractive and user-friendly syntax is not a priority
but brevity is. In particular, AsmL-S does not support the offside rule of the full
AsmL that expresses scoping via indentation. Instead, AsmL-S uses parentheses
and scope separators like ‘:’.

3.1 Abstract Syntax


We take some easy-to-understand liberties with vector notation. A vector x̄ is
typically a list x1 . . . xn of items possibly separated by commas. A sequence
x1 α y1 , . . . , xn α yn can be abbreviated to x̄ α ȳ, where α represents a binary
operator. This allows us, for instance, to describe an argument sequence ℓ1 as t1 ,
. . ., ℓn as tn more succinctly as ℓ̄ as t̄. The empty vector is denoted by ().
Figure 1 describes the abstract syntax of AsmL-S. The meta-variables c, f , m,
ℓ, prim, op, lit, and exc, in Fig. 1 range over disjoint infinite sets of class names
(including Object), field names, method names, local variable names (including
me), primitive type symbols, operation symbols, literals, and exception names
(including several built-in exceptions: argX , updateX , . . .). Sequences of class
names, field names, method names and parameter declarations are assumed to
have no duplicates.
An AsmL-S program is a list of class declarations, with distinct class names
different from Object, followed by an expression, the body of the program. Each
class declaration gives a super-class, a sequence of field declarations with distinct
field names, and a sequence of method declarations with distinct method names.
AsmL-S has three categories of types — primitive types, classes and map
types — plus two auxiliary types, Null and Thrown. (Thrown is used in the static
semantics, although it is absent from the syntax.) Among the primitive types,
there are Bool , Int and Void . (Ironically, the type Void isn’t void but contains
a single literal void . That is how Void is in the C programming language and
its relatives. We decided to stick to this tradition.) There could be additional
primitive types; this makes no difference in the sequel.
Objects come in two varieties: class instances and maps. Objects are created
with the new operator only; more sophisticated object constructors have to be
programmed in AsmL-S. A new-class-instance expression takes one argument
for each field of the class, thereby initializing all fields with the given arguments.
A new-map expression takes a (possibly empty) sequence of key-value pairs,
called maplets, defining the initial map. Maps are always finite. A map can be
overridden, extended or reduced (by removing some of its maplets). AsmL-S
supports the usual object-oriented expressions for type testing and type casting.
The common sequential programming languages have only one way to com-
pose expressions, namely the sequential composition e1 ; e2 . To evaluate e1 ; e2 ,
first evaluate e1 and then evaluate e2 . AsmL-S provides two additional compo-
sitions: the parallel composition e1 ∥ e2 and the nondeterministic composition
e1 [] e2 . To evaluate e1 ∥ e2 , evaluate e1 and e2 in parallel. To evaluate e1 [] e2 eval-

pgm = cls : e programs


cls = class c extends c {fld mth} classes
fld = f as t fields
mth = m(ℓ̄ as t̄) as t do e methods
lit = null | void | true | 0 | . . . literals
op = + | − | / | = | < | and | . . . primitive operations
prim = Bool | Int | Void | . . . primitive types
t = prim | Null | c | t → t normal types
exc = argX | updateX | choiceX | . . . exceptions
e = expressions
lit | ℓ literals/local variables
| op(e) built-in operations
| let ℓ = e do e local binding
| if e then e else e case distinction
| new c (e) creation of class instances
| new t → t {e → e} creation of maps
| e.f | e [e] | e.m(e) field/index/method access
| e.f := e field update
| e[e] := e | remove e[e] index update
| e is t type test
| e as t type cast
| e ∥ e | forall ℓ in e do e parallel composition
| e [] e | choose ℓ in e do e nondeterministic composition
| e;e | while e do e sequential composition
| try e catch exc : e exception handling
| throw exc explicit exception generation

Fig. 1. Abstract Syntax of AsmL-S

uate either e1 or e2 . The related semantical issues will be addressed later. The
while, forall and choose expressions generalize the two-component sequential,
parallel and nondeterministic compositions, respectively.
AsmL-S supports exception handling. In full AsmL, exceptions are instances
of special exception classes. In AsmL-S, exceptions are atomic values of type
Thrown. (Alternatively, we could have introduced a whole hierarchy of exception
types.) There are a handful of built-in exceptions, like argX ; all of them end with
“X”. A user may use additional exception names. There is no need to declare
new exception names; just use them. Instead of prescribing a particular syntactic
form to new exception names, we just presume that they are taken from a special
infinite pool of potential exception names that is disjoint from other semantical
domains of relevance.

3.2 Class Table and Subtypes


It is convenient to view a program as a class table together with the expression
to be evaluated [10]. The class table maps class names different from Object to
the corresponding class definitions. The class table has the structure of a tree
in which there is an edge from a class c to a class c′ when extends c′ occurs in the
declaration of c; we then say that c′ is the parent of c and write c′ = parent(c).
Object is of course the root of the class tree.
Remark 4. Whenever the “extends c” clause is omitted in examples 1-43, there
is an implicit extends Object.
The subtype relation ≤ corresponding to a given class table is generated
recursively by the rules in Fig. 2, for arbitrary types t, t′, t′′, τ, τ′ and classes
c, c′:

• t ≤ t, and t ≤ t′′ whenever t ≤ t′ and t′ ≤ t′′ (≤ is a reflexive partial order)

• c ≤ c′ whenever c′ = parent(c) (≤ extends the parent relation over classes)

• t → t′ ≤ Object (maps are objects)

• (t → t′) ≤ (τ → τ′) whenever t ≤ τ and t′ ≤ τ′ (map types are covariant in argument and result types)

• Null ≤ t whenever t ≤ Object (Null lies beneath all object types)

• Thrown ≤ t (Thrown lies beneath all other types)

Fig. 2. Inductive Definition of the Subtype Relation

Call two types comparable if one of them is a subtype of the other; otherwise
call them incomparable. Primitive types compare the least. If t is a primitive
type, then t ≤ t and Thrown ≤ t are the only subtype relations involving t.
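
A Python sketch of this subtype check (the class-table representation, a dictionary mapping each class to its parent, and the encoding of a map type t → t′ as a pair are assumptions made for the example):

```python
OBJECT, NULL, THROWN = "Object", "Null", "Thrown"

def is_map(t):
    return isinstance(t, tuple)            # a map type t -> t' is encoded as (t, t')

def is_object_type(t, parent):
    return is_map(t) or t in parent or t == OBJECT or t == NULL

def subtype(t1, t2, parent):
    """t1 <= t2 according to the rules of Fig. 2; parent maps each class to its parent."""
    if t1 == t2 or t1 == THROWN:
        return True                                     # reflexivity; Thrown below all
    if t1 == NULL:
        return is_object_type(t2, parent)               # Null below all object types
    if is_map(t1):
        if t2 == OBJECT:
            return True                                 # maps are objects
        return (is_map(t2)
                and subtype(t1[0], t2[0], parent)
                and subtype(t1[1], t2[1], parent))      # covariant argument and result
    if t1 in parent:                                    # class: walk up the parent chain
        return t2 == OBJECT or subtype(parent[t1], t2, parent)
    return False                                        # distinct primitives are incomparable

parent = {"A": OBJECT, "B": "A"}
assert subtype(("Int", "B"), ("Int", "A"), parent)      # Int -> B <= Int -> A, cf. example (23)
assert not subtype("Int", "Bool", parent)
```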
The following proposition is easy to check.
Proposition 1. Every two types t1 , t2 have a greatest lower bound t1 ⊓ t2 . Every
two subtypes of Object have a least upper bound t1 ⊔ t2 .

Remark 5. One may argue that map types should be contravariant in the argu-
ment, like function types [13]. In the full paper, we discuss pros and cons of such
a decision.
If c is a class different from Object, then addf (c) is the sequence of distinct
field names given by the declaration of c. These are the new fields of c, acquired

in addition to those of parent(c). The sequence of all fields of a class is defined


by induction using the concatenation operation.

fldseq(Object) = ()
fldseq(c) = addf (c) · fldseq(parent(c))

We assume that addf (c) is disjoint from fldseq(parent(c)) for all classes c. If
f is a field of c of type t, then fldtype(f, c) = t. If fldseq(c) = (f1 , . . . , fn ) and
fldtype(fi , c) = ti , then

fldinfo(c) = f¯ as t̄ = (f1 as t1 , . . . , fn as tn ).

The situation is slightly more complicated with methods because, unlike


fields, methods can be overridden. Let addm(c) be the set of method names
included in the declaration of c. We define inductively the set of all method
names of a class.

mthset(Object) = ∅
mthset(c) = addm(c) ∪ mthset(parent(c))

For each m ∈ mthset(c), dclr (m, c) is the declaration

m(ℓ1 as τ1 , . . . , ℓn as τn ) as t do e

of m employed by c. We assume, as a syntactic constraint, that the variables ℓi
are all distinct and different from me. The declaration dclr (m, c) is the declara-
tion of m in the class home(m, c) defined as follows:

home(m, c) = c if m ∈ addm(c)
home(m, c) = home(m, parent(c)) if m ∈ mthset(c) − addm(c)
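
These inductive definitions translate directly into Python; the class-table representation below (a dictionary giving, for each class, its parent, added fields and added method names) is our own assumption for the illustration.

```python
def fldseq(c, table):
    """All fields of c: the newly added ones followed by those of the parent."""
    if c == "Object":
        return []
    decl = table[c]
    return decl["addf"] + fldseq(decl["parent"], table)

def mthset(c, table):
    """All method names of c: the declared ones plus those inherited from the parent."""
    if c == "Object":
        return set()
    decl = table[c]
    return set(decl["addm"]) | mthset(decl["parent"], table)

def home(m, c, table):
    """The class whose declaration of m is the one employed by c."""
    if m in table[c]["addm"]:
        return c
    return home(m, table[c]["parent"], table)

table = {
    "A": {"parent": "Object", "addf": ["i"], "addm": {"F"}},
    "B": {"parent": "A", "addf": [], "addm": set()},
}
assert fldseq("B", table) == ["i"]
assert mthset("B", table) == {"F"}
assert home("F", "B", table) == "A"
```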

In the sequel, we restrict attention to an arbitrary but fixed class table.

3.3 Static Semantics


We assume that every literal lit has a built-in type littype(lit). For instance,
littype(2) = Int, littype(true) = Bool and littype(null ) = Null . We also assume
that a type function optype(op) defines the argument and result types for every
built-in operation op. For example, optype(and ) = (Bool , Bool ) → Bool .
Suppose e is an expression, possibly involving free variables. A type context
for e is a total function T from the free variables of e to types.
TT (e) is a partial function from expressions and type contexts to types. If
TT (e) is defined, then e is said to be well-typed with respect to T , and TT (e) is
called its static type.
The definition of TT (e) is inductive, given by rules in Fig. 3. See the full
paper for a more thorough exposition.

Fig. 3. Static Types of Expressions in AsmL-S

3.4 Well-Formedness
We now make an additional assumption about the underlying class table: for
each class c and each method m ∈ mthset(c), m is well-formed relative to c
(symbolically: m ok in c).
The definition of m ok in c is inductive. Suppose dclr (m, c) = m(ℓ1 as τ1 ,
. . . , ℓn as τn ) as t do e and c′ = parent(c). Let T denote the type context {me →
c} ∪ {ℓ1 → τ1 , . . . , ℓn → τn }.

• If m ∈ addm(c) − mthset(c′) and TT (e) ≤ t, then m ok in c.

• If m ∈ mthset(c) − addm(c) and m ok in c′, then m ok in c.

• If m ∈ addm(c) ∩ mthset(c′), TT (e) ≤ t, m ok in c′,
  dclr (m, c′) = m(ℓ′1 as τ′1 , . . . , ℓ′n as τ′n ) as t′ do e′, and τ̄ → t ≤ τ̄′ → t′,
  then m ok in c.

The statement τ̄ → t ≤ τ̄′ → t′, in the final premise, abbreviates the inequal-
ities τ1 ≤ τ′1 , . . . , τn ≤ τ′n and t ≤ t′.

3.5 Operational Semantics


Suppose (b, r) is an evaluation context (Section 2.9) for an expression e, where
r = (θ, ω, u). Then (b, r) gives rise to a type context [b, r] defined by

[b, r](ℓ) = θr (b(ℓ)) if b(ℓ) ∈ dom(θr ), and [b, r](ℓ) = littype(b(ℓ)) if b(ℓ) ∈ Literal .
We say e is (b, r)-typed if it is well-typed with respect to the type context
[b, r], that is, if T[b,r] (e) is defined.
In the full paper we define an operator Eb,r over (b, r)-typed expressions. The
computation of Eb,r is in general nondeterministic (as it relies on external func-
tions freshid and oneof) and it may diverge (as it is recursive but not necessarily
well-founded). If it converges, it produces an effect Eb,r (e) = (s, v) where s is a
s,v
store and v is a value, that is, e −−→ b,r s, v in the notation of Section 2.9.
After seeing the examples in Section 2, the reader should have a fairly good
idea how the effect operator works. Figure 4 gives a few of the rules that comprise
the definition of Eb,r (e); each rule covers a different kind of expression e. See the
full paper for a complete set of rules defining the effect operator.

4 Analysis
The effect operator is monotone with respect to stores: if Eb,r (e) = (s, v) then
r is a substore of s. Furthermore, if v is an exception then r = s, meaning that
the store is rolled back whenever an exception occurs.
In addition to these properties, the static-type and effect operators satisfy
the usual notions of type soundness and semantic refinement. See the full paper
for precise statements and proofs of the theorems mentioned in this section.
The type of an effect (s, v), where s = (θ, ω, u), is defined as follows:


type(s, v) = θ(v) if v ∈ dom(θ), littype(v) if v ∈ Literal , and Thrown if v ∈ Exception.

Theorem 2 (Type Soundness). For every evaluation context (b, r) and every
(b, r)-typed expression e, we have
type(Eb,r (e)) ≤ T[b,r] (e)
for any converging computation of Eb,r (e).

Fig. 4. Examples of rules from the definition of Eb,r (e)

In the full paper we define a relation ⊑ of semantic refinement among ex-
pressions. (More accurately, a relation ⊑T is defined for each type context T .)
The essential meaning of e1 ⊑ e2 is that, for all evaluation contexts (b, r),

– computation of Eb,r (e1 ) potentially diverges only if computation of Eb,r (e2 )


potentially diverges, and
– the set of “core” effects of convergent computations of Eb,r (e1 ) is included
in the set of “core” effects of convergent computations of Eb,r (e2 ).

Roughly speaking, the “core” of an effect (s, v) is the subeffect (s′ , v′ ) that
remains after a process of garbage-collection relative to the binding b: we remove
all but the objects reachable from values in rng(b).
The refinement relation  has the following property.

Theorem 3 (Refinement). Suppose e0 , e′0 , e1 are expressions where e0 is a subex-
pression of e1 and e′0 refines e0 . Let e′1 be the expression obtained from e1
by substituting e′0 in place of a particular occurrence of e0 . Then e′1 refines
e1 .

Here is a general example for refining binary choice expressions:

e0 ⊑ (true [] false) =⇒ (if e0 then e1 else e2 ) ⊑ (e1 [] e2 )

To give a similar general example involving the choose construct, we need a
relation ⊑c.d. of choice-domain refinement defined in the full paper.

Proposition 4.

e1 ⊑c.d. e′1 =⇒ (choose ℓ in e1 do e2 ) ⊑ (choose ℓ in e′1 do e2 )

References
1. The AsmL webpage, http://research.microsoft.com/foundations/AsmL/.
2. Dines Bjoerner and Cliff B. Jones (Editors), “Formal Specification and Software
Development”, Prentice-Hall International, 1982.
3. Egon Boerger and Robert Staerk, “Abstract State Machines: A Method for High-
Level System Design and Analysis”, Springer, Berlin Heidelberg 2003.
4. Foundations of Software Engineering group, Microsoft Research,
http://research.microsoft.com/fse/
5. Yuri Gurevich, “Evolving Algebra 1993: Lipari Guide”, in “Specification and Val-
idation Methods”, Ed. E. Boerger, Oxford University Press, 1995, 9–36.
6. Yuri Gurevich, “For every Sequential Algorithm there is an Equivalent Sequential
Abstract State Machine”, ACM Transactions on Computational Logic 1:1 (2000),
pages 77–111.
7. Yuri Gurevich and Nikolai Tillmann, “Partial Updates: Exploration”, Springer J.
of Universal Computer Science, vol. 7, no. 11 (2001), pages 918-952.
8. Yuri Gurevich, Benjamin Rossman and Wolfram Schulte, “Semantic Essence of
AsmL”, submitted for publication. A preliminary version appeared as Microsoft
Research Technical Report MSR-TR-2004-27, March 2004.
9. James K. Huggins, ASM Michigan web page, http://www.eecs.umich.edu/gasm.
10. Atsushi Igarashi, Benjamin C. Pierce and Philip Wadler, “Featherweight Java:
a minimal core calculus for Java and GJ”, ACM Transactions on Programming
Languages and Systems (TOPLAS) 23:3 (May 2001), 396–450.
11. Gilles Kahn, “Natural semantics”, In Proc. of the Symposium on Theoretical As-
pects of Computer Science, Lecture Notes in Computer Science 247 (1987), 22–
39.
12. Robin Milner, Mads Tofte, Robert Harper, and David MacQueen, “The Definition
of Standard ML (Revised)”, MIT Press, 1997.
13. Benjamin C. Pierce, “Types and Programming Languages”, MIT Press, Cam-
bridge, Massachusetts, 2002

14. Gordon D. Plotkin, “Structural approach to operational semantics”, Technical re-


port DAIMI FN-19, Computer Science Department, Aarhus University, Denmark,
September 1981
15. J. M. Spivey, “The Z Notation: A Reference Manual”, Prentice-Hall, New York,
Second Edition, 1992.
An MDA Approach to Tame Component Based
Software Development

Jean-Marc Jézéquel, Olivier Defour, and Noël Plouzeau

IRISA - Université de Rennes 1


Campus universitaire de Beaulieu, Avenue du général Leclerc
35042 Rennes Cedex, France
{jean-marc.jezequel, olivier.defour, noel.plouzeau}@irisa.fr
http://www.irisa.fr/triskell

Abstract. The aim of this paper is to show how the Model Driven Architecture
(MDA) can be used in relation with component based software engineering. A
software component only exhibits its provided or required interfaces, hence de-
fining basic contracts between components allowing one to properly wire them.
These contractually specified interfaces should go well beyond mere syntactic
aspects: they should also involve functional, synchronization and Quality of
Service (QoS) aspects. In large, mission-critical component based systems, it is
also particularly important to be able to explicitly relate the QoS contracts at-
tached to provided interfaces with the QoS contracts obtained from required in-
terfaces. We thus introduce a QoS contract model (called QoSCL for QoS Con-
straint Language), allowing QoS contracts and their dependencies to be mod-
eled in a UML2.0 modeling environment. Building on Model Driven Engineer-
ing techniques, we then show how the very same QoSCL contracts can be ex-
ploited for (1) validation of individual components, by automatically weaving
contract monitoring code into the components; and (2) validation of a compo-
nent assembly, including getting end-to-end QoS information inferred from in-
dividual component contracts, by automatic translation to a Constraint Logic
Programming language. We illustrate our approach with the example of a GPS
(Global Positioning System) software component, from its functional and con-
tractual specifications to its implementation in a .Net framework.

1 Introduction

Szyperski [22] remarked that while objects were good units for modular composition
at development time, they were not so good for deployment time composition, and he
formulated the now widely accepted definition of a software component: “a software
component is a unit of composition with contractually specified interfaces and explicit
context dependencies only. A software component can be deployed independently and
is subject to composition by third-party”. In this vision, any composite application is
viewed as a particular configuration of components, selected at build-time and con-
figured or re-configured at run-time, as in CORBA [15], or .NET [20].
A software component only exhibits its provided or required interfaces, hence de-
fining basic contracts between components allowing one to properly wire them.


These contractually specified interfaces should go well beyond mere syntactic as-
pects: they should also involve functional, synchronization and Quality of Service
(QoS) aspects. In large, mission-critical component based systems, it is also particu-
larly important to be able to explicitly relate the QoS contracts attached to provided
interfaces with the QoS contracts obtained from required interfaces.
It is then natural that people resorted to modelling to try to master this complexity.
According to Jeff Rothenberg, “Modeling, in the broadest sense, is the cost-effective
use of something in place of something else for some cognitive purpose. It allows us
to use something that is simpler, safer or cheaper than reality instead of reality for
some purpose. A model represents reality for the given purpose; the model is an ab-
straction of reality in the sense that it cannot represent all aspects of reality. This
allows us to deal with the world in a simplified manner, avoiding the complexity,
danger and irreversibility of reality.” Usually in science, a model has a different na-
ture from the thing it models. Only in software and in linguistics does a model have
the same nature as the thing it models. In software at least, this opens the possibility
of automatically deriving software from its model. This property is well known to any
compiler writer (and others), but it was recently made quite popular by an OMG
initiative called the Model Driven Architecture (MDA).
The aim of this paper is to show how MDA can be used in relation with compo-
nent based software engineering. We introduce a QoS contract model (called QoSCL
for QoS Constraint Language), allowing QoS contracts and their dependencies to be
modeled in a UML2.0 [13] modeling environment. Building on Model Driven Engi-
neering techniques, we then show how the very same QoSCL contracts can be ex-
ploited for (1) validation of individual components, by automatically weaving con-
tract monitoring code into the components; and (2) validation of a component assem-
bly, including getting end-to-end QoS information inferred from individual compo-
nent contracts, by automatic translation to a Constraint Logic Programming language.
The rest of the paper is organized as follows. Using the example of a GPS (Global
Positioning System) software component, Section 2 motivates the modelling of com-
ponents, their contracts and their dependencies, and describes the QoS Con-
straint Language (QoSCL). Section 3 discusses the problem of validating individual
components against their contracts, and proposes a solution based on automatically
weaving reusable contract monitoring code into the components. Section 4 discusses
the problem of validating a component assembly, including getting end-to-end QoS
information inferred from individual component contracts by automatic translation to
a Constraint Logic Programming language. This is applied to the GPS system example, and
experimental results are presented. Finally, Section 5 presents related works.

2 The QoS Contracts Language

2.1 Modeling Component-Based Systems

In modelling techniques such as UML2.0 for example, a component is a behavioural


abstraction of a concrete physical piece of code, called an artifact. A component has
required and provided ports, which are typed by interfaces. These interfaces represent

the required and provided services implemented by the modelled artifact. The rela-
tionship between the required and provided services within one component must be
explicitly stated. The knowledge of this relationship is of utmost importance to the
component-based application designer. In the rest of this section, we address this
relationship using the example of a GPS device.
A GPS device computes its current location from satellite signals. Each signal con-
tains data which specifies the identity of the emiting satellite, the time of its emission,
the orbital position of the satellite and so on. In the illustrating example, each satellite
emits a new data stream every fifteen seconds.
In order to compute its current location, the GPS device needs at least three signals
from three different satellites. The number of received signals is unknown a priori,
because obstacles might block the signal propagation.
Our GPS device is modeled as a component which provides a getLocation() ser-
vice, and requires a getSignal() service from Satellites components. The GPS compo-
nent is made up of four components:
− the decoder which contains twelve satellite receivers (only three are shown on
Fig. 1). This element receives the satellite streams and demultiplexes them in order to
extract the data for each satellite. The number of effective data obtained via the
getData() service depends not only on the number of powered receivers, but also
on the number of received signals. Indeed, this number may change at any time.
− The computer which computes the current location (getLocation()) from the data
(getData()) and the current time (getTime()).
− The battery which provides the power (getPower()) to the computer and the de-
coder.
− The clock component which provides the current time (getTime()).

(Figure: the GPS component contains a decoder made of twelve receiver subcomponents, a computer, a clock and a battery; each receiver requires the getSignal() service of a satellite, and the computer uses getData(), getTime() and getPower() to provide getLocation().)

Fig. 1. The GPS component-based model

2.2 Contract Aware Components

In component-based models, the services are usually specified at a syntactic level.


This level of specification is not precise enough. Indeed, a service can be unavailable
according to the state of the environment and, reciprocally, the environment can be
modified by the execution of a service.
Following [2], component contracts can be classified into four levels. The first level is
type compatibility. The second level adds pre/post-conditions: the operation’s be-

havior is specified by using Boolean assertions for each service offered, called pre and
post-conditions, as well as class invariants [14]. The third level adds synchronization
constraints and the fourth level provides extra-functional constraints. To be more pre-
cise, we can build on the well-known idea of design-by-contract [12] to define negotiable
contracts for components. These contracts ensure that a service will perform correctly.
In the previous section 2.1, we have said that a dependency relationship always ex-
ists inside one component between its provided and required services. A component
provides its services inasmuch as its environment provides the services that it re-
quires. All components always support this implicit contract. The extra-functional
properties, which are intrinsic features of services, inherit this dependency relation-
ship. The quality of a provided service depends on the quality of required services
that it depends on. This fact is illustrated in our example.
The GPS application contains several time-out constraints. For instance, the pro-
vided getLocation() service must ensure that it is completed in less than 30 s,
whereas the getData() service must be completed in less than 25 s, for example.
However, it is obvious that the time spent to acquire data from the decoder, de-
noted ThetaD, has a direct impact on the global time cost of the getLocation()
service, denoted ThetaC. ThetaC depends not only on ThetaD, but also on the num-
ber of active receivers, denoted Nbr, because of the interpolation algorithm imple-
mented by the Computer component. ThetaD and Nbr are two extra-functional prop-
erties associated with the getData() service provided by the Decoder component. The
relation that binds these three quantities is:
ThetaC = ThetaD + Nbr * log ( Nbr ) . (1)
Each receiver demultiplexes a signal, in order to extract the data. This operation
has a fixed time cost: nearly 2 seconds. In addition, the demultiplexed signals must be
transformed into a single data vector. This operation takes 3 s. If ThetaR (resp.
ThetaS) denotes the time spent by the receiver to complete the getData() service
(resp. the satellite to complete its getSignal() service), then we have the two follow-
ing formulae:
ThetaR = ThetaS + 2 , (2)
ThetaD = max ( ThetaR ) + 3 . (3)
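
For illustration, formulae (1)-(3) can be chained in a few lines of Python; the helper names are ours, the satellite delays are hypothetical values, and the base-10 logarithm is an assumption inferred from the admissible values reported in Section 4.2.

```python
from math import log10

def theta_r(theta_s):
    """(2) Each receiver adds a fixed 2 s demultiplexing cost to the satellite delay."""
    return theta_s + 2

def theta_d(satellite_delays):
    """(3) The decoder waits for the slowest receiver and spends 3 s building the data vector."""
    return max(theta_r(ts) for ts in satellite_delays) + 3

def theta_c(theta_d_value, nbr):
    """(1) Cost of getLocation(): data acquisition plus interpolation over Nbr channels."""
    return theta_d_value + nbr * log10(nbr)

# Three satellites answering in 15 s each (hypothetical scenario):
delays = [15.0, 15.0, 15.0]
print(round(theta_c(theta_d(delays), nbr=len(delays)), 2))   # about 21.43 s
```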
There exist many QoS contract languages which allow the designer to specify
extra-functional properties and their constraints on the provided interfaces only (see
Section 5). However, none of them allows specifying dependency relationships be-
tween the provided and required services of a component. To overcome this limita-
tion we introduce the QoS Constraint Language (QoSCL). This language includes the
fundamental QoS concepts defined in the well-known precursor QML [5]. It is also the
cornerstone of the QoS prediction tool presented later in this paper.

2.3 Extra-Functional Dependencies with QoSCL

Our own contract model for extra-functional contracts extends the UML2.0 compo-
nents metamodel. We designed the QoSCL notation with the following objectives in
mind.

1. Since the extra-functional contracts are constraints on continuous values within


multidimensional spaces, we wanted to keep the QML definitions of dimensions
and contract spaces.
2. Since our extra-functional contracts would be used on software components with
explicit dependency specification, we needed means to express a provided con-
tract in terms of required contracts.
3. Since we targeted platform independent designs, we wanted to use the UML no-
tation and its extension facilities.
We thus designed our extra-functional contract notation as an extension of the
component part of the UML2.0 metamodel:
− Dimension: is a QoS property. This metaclass inherits the operation metaclass.
According to our point of view, a QoS property is a valuable quantity and has to
be concretely measured. Therefore we have chosen to specify a means of
measurement rather than an abstract concept. Its parameters are used to specify
the (optional) other dimensions on which it depends. The type of a Dimension is
a totally ordered set, and it denotes its unit. The pre and post-conditions are used
to specify constraints on the dimension itself, or its parameters.
− ContractType: specializes Interface. It is a set of dimensions defining the contract
supported by an operation. Like an interface, a ContractType is just a specifica-
tion without implementation of its dimensions.
− Contract: is a concrete implementation of a ContractType. The dimensions speci-
fied in the ContractType are implemented inside the component using aspect
weaving techniques (see Section 3). An isValid() operation checks if the contract
is realized or not.
− QoSComponent extends Component, and it has the same meaning. However, its
ports provide not only required and provided interfaces which exhibit its func-
tional behaviour, but also ContractTypes dedicated to its contractual behaviour.

(Figure: the QoSCL metamodel extends the UML2.0 component metamodel: QoSComponent extends Component; ContractType specializes Interface and owns Dimensions, which specialize Operation; a Contract implements a ContractType; Dimensions have Parameters, a Type and pre/post Constraints.)

Fig. 2. The QoSCL metamodel



With the QoSCL metamodel, it is possible to specify contracts, such as the Time-
Out contract useful for our GPS, as an Interface in any UML case tool:

(Figure: the TimeOutContract interface declares timeOut():bool, start():bool, the handler onTimeEvent(source:object, e:ElapsedEventArgs), and a delay dimension returning a double; the TimeOutC class implements it with a private delay:double attribute and the operations delay():double, timeOut():bool, start():bool, isValid():bool and onTimeEvent(...).)

Fig. 3. The TimeOut contract with QoSCL

The QoSCL metamodel handles three specific aspects of contracts: dependency,
composition, and adaptive behaviour. The dependency is the core of this work, and
our main contribution to enhancing existing extra-functional contract specification
languages, such as QML. QoSCL also makes it possible to model a composite
contract via a generalization association. Finally, as with any abstract functional model, it is
possible to implement different behaviors for the same Operation, such as a Dimen-
sion. Thus, the renegotiation of a contract can be implemented according to its
environment. This behavior can be specified with UML2.0 sequence diagrams,
activity diagrams or state machines, for instance.
activity diagrams or state machine for instance.

3 Implementing Contract-Aware Components


QoSCL allows the expression of functional and extra-functional properties in a soft-
ware component. The declared properties are useful to the software designer because
this gives predictability to a component's behaviour. However, this predictability is
valid only if the component implementation really has the behaviour declared by the
component. This implementation validity is a classical software validation problem,
whatever the kind of contracts used [11].
These problems are usually addressed by two families of techniques. A first family
is based on testing: the system under test is run in an environment that behaves as
described in a test case. An oracle observes the behaviour of the system under test and
then decides whether the behaviour is allowed by the specification. A second family
of techniques relies on formal proof and reasoning on the composition of elementary
operations.
Standard software validation techniques deal with pre/post-condition contract
types [12]. Protocol validation extends this to the synchronization contract types [8].
The rest of this section discusses issues of testing extra-functional property confor-
mance.

3.1 Testing Extra Functional Behaviour

Level 3 contracts (i.e. contracts that include protocols) are more difficult to test be-
cause of non-deterministic behaviours of parallel and distributed implementations.
One of the most difficult problems is the consistent capture of data on the behaviour
of the system's elements. Level 4 contracts (i.e. extra-functional properties) are also
difficult to test for quite similar reasons. Our approach for testing level 4 contracts
relies on the following features:
− existence of probes and extra-functional data collection mechanisms (monitors);
− test cases;
− oracles on extra-functional properties.
In order to be testable, a component must provide probe points where basic extra-
functional data must be available. There are several techniques to implement such
probe points and make performance data available to the test environment.
1. The component runtime may include facilities to record performance data on
various kinds of resources or events (e.g. disk operations, RPC calls, etc). Mod-
ern operating systems and component frameworks now provide performance
counters that can be “tuned” to monitor runtime activity and therefore deduce
performance data on the component’s service.
2. The implementation of the component may perform extra computation to monitor
its own performance. This kind of “self monitoring” is often found in compo-
nents that are designed as level 4 components from scratch (e.g. components pro-
viding multimedia services).
3. A component can be augmented with monitoring facilities by weaving a specific
monitor piece of model or of code. Aspect-oriented design (AOD) or aspect-
oriented programming can help in automating this extension.
We have chosen this latter approach as our main technique for designing monitors.
This choice was motivated mainly by the existence of “legacy” components from
industrial partners [17]. From a software design process point of view, we consider
that designing monitors is a specialist’s task. Monitors rely on low level mechanisms
and/or on mechanisms that are highly platform dependent. By using aspect-oriented
design (AOD), we separate the component implementation model into two main
models: the service part that provides the component’s functional services under
extra-functional contracts, and the monitor part that supervises performance issues. A
designer in charge of the “service design model” does not need to master monitor
design. A specific tool1 (a model transformer) [24] is used to merge the monitor part
of the component with its service part.
More precisely, a contract monitor designer provides component designers with a
reusable implementation of a monitor. This implementation contains two items: a
monitor design model and a script for the model transformer tool (a weaver). The
goal of this aspect weaver is to modify a platform specific component model by inte-
grating new QoSCL classes and modifying existing classes and their relationships.

1 The Kase tool is developed by TU-Berlin with the support of the European Project “Quality
Control of Component-based Software” (QCCS) [17].

3.2 A Practical Example of Weaving

As we have said at the end of Section 2, QoSCL allows us to model the
structural, behavioral and contractual features of components. These three aspects can be
specified using the dedicated UML2.0 diagrams. The QoS aspect weaver is a mecha-
nism integrated into Kase, which:
− modifies the UML diagram (adds new classes and associations);
− modifies the behavior of the targeted service.
Thanks to QoSCL, it is possible to specify in Kase the contract types and their
implementations, such as TimeOut and TimeOutC (Fig. 4). According to our vision,
detailed in the QoSCL section (§2.3), the TimeOut contract is an interface, which has
a special operation denoting the “delay” dimension. TimeOutC is a .Net class that
implements the TimeOut interface. The value of the “delay” dimension is imple-
mented as a private attribute (-delay:double) with its related access/evaluation
method (delay():double).
A QoS aspect not only specifies how the structural diagram will be modified, but
also how the monitored part and the monitor cooperate: when does the timer start,
when does it stop, who handles timeout, etc… This part of the aspect is specified
using the Hierarchical Message Sequence Charts (HMSC) notation in the UML 2.0.
Fig. 5 shows the behavior of a contractual service, called op(), as an HMSC diagram.
The op() operation is the service which must verify a TimeOut contract. The op_c()
operation is a new operation, which realizes the op() service and evaluates the Time-
Out contract below (Fig. 5). This service has two possible behaviors, depending on
whether the op() service finishes before or after the timer.
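
The effect of such a woven monitor can be approximated with the following Python sketch; it is our own illustration, not the Kase weaver's C# output, and the class and function names are invented. op_c() runs the monitored service, measures the elapsed time and reports a violation when the declared delay is exceeded.

```python
import time

class TimeOutContract:
    """Rough analogue of TimeOutC: one 'delay' dimension and an isValid() check."""
    def __init__(self, delay_seconds):
        self.delay = delay_seconds
        self.elapsed = None

    def is_valid(self):
        return self.elapsed is not None and self.elapsed <= self.delay

def weave_timeout(op, contract):
    """Return op_c(), a monitored version of op(), in the spirit of the woven aspect."""
    def op_c(*args, **kwargs):
        start = time.monotonic()
        result = op(*args, **kwargs)
        contract.elapsed = time.monotonic() - start
        if not contract.is_valid():
            print("TimeOut contract violated:", contract.elapsed, ">", contract.delay)
        return result
    return op_c

# Usage: monitor a getLocation() stub against a 30 s time-out contract.
contract = TimeOutContract(delay_seconds=30.0)
get_location_c = weave_timeout(lambda: (48.1, -1.6), contract)
print(get_location_c(), contract.is_valid())
```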

(Figure: the .Net realization of the TimeOut contract: TimeOutC implements the TimeOut interface, itself a ContractType with the delay dimension, and relies on the System.ComponentModel Component and the Timer/ElapsedEventArgs classes of the .Net framework.)

Fig. 4. The TimeOut contract model for .Net



In addition to its structural (Fig. 4) and behavioral (Fig. 5) parts, a contractual QoS
aspect has pre-conditions that must be met at weaving time. For example, a :Cl class
abides by a TimeOut contract only under the condition that it implements the op()
service. In our tool, the aspect is concretely woven into the UML diagram by a Python
script, which:
− checks the aspect pre-conditions;
− weaves the aspect if these preconditions are satisfied (this weaving adds new
classes, modifies constructors and operations, etc.).
The QoS aspect weaver implemented in the Käse tool allows us to:
− specify a QoS aspect;
− implement an evaluation of this aspect for a targeted service.
According to the QoSCL point of view, contracts can be specified at design time as
specialized interfaces. Therefore, connecting two components at binding time is easy,
using their respectively compliant required and provided interfaces. The QoS aspect
weaver implemented in Käse makes it possible to implement any contract type in C#.
In case of failure, an extra-functional contract can be renegotiated. For instance, a
time out contract that fails too often obviously needs to be adjusted (alternatively the
service bound to that contract has to be shut down).

Fig. 5. Behavior of the op() service with evaluation of a TimeOut contract



3.3 Limitations of Extra-Functional Property Testing

The QoSCL notation and the monitor integration technique help the component
designer to define and check extra-functional properties. However, application
designers rely on component assemblies to build applications. These designers
need to estimate at design time the overall extra-functional properties of a given
assembly. Using the techniques presented above, they can perform a kind of inte-
gration testing. The tests aim at validating the extra-functional behavior of the
assembly with respect to the global specification of the application. However, the
application designers often have trouble selecting and configuring the components,
making the assembly and matching the global application behavior. Conversely, some
applications are built with preconfigured components and the application designer
needs to build a reasonable specification of the overall extra-functional behavior of
the application.

4 Predicting Extra-Functional Properties of an Assembly


4.1 Modeling a QoS-Aware Component with QoSCL

QoSCL is a metamodel extension dedicated to specifying contracts whose extra-
functional properties have explicit dependencies. Models can be used by aspect weav-
ers in order to integrate the contractual evaluation and renegotiation into the compo-
nents. However, at design time, it is also possible to predict the global quality of the
composite software.
Predicting a behaviour is difficult. In the best cases, the behaviour can be proved;
otherwise, it is predicted with some uncertainty. Since we want to
predict the quality of a composite, i.e. the value of a set of extra-functional properties,
this uncertainty will be translated into numerical intervals or enumerated sets of
values, called validity domains.
The dependencies defined in QoSCL, which bind the properties, are generally ex-
pressed either as formulae or as rules. The quality of a service is defined as the extra-
functional property’s membership of a specific validity domain. Predicting the global
quality of a composite is equivalent to the propagation of the extra-functional validity
domains through the dependencies.
For instance, we have defined in Section 2.2 a set of extra-functional properties
that qualify different services in our GPS component-based model. In addition, we
have specified the dependencies between the extra-functional properties as formulae.
This knowledge can be specified in QoSCL. Fig. 6 below represents the computer
component (Fig. 1) refined with contractual properties and their dependencies:
The rules that govern the connection between two (functional) ports are also valid
for ports with required or provided ContractTypes. Thus, a port that requires a service
with its specific QoS properties can only be connected to another Port that provides
this service with the same quality attributes.

(Figure: the Computer QoSComponent provides the getLocation() service with quality dimensions thetaC(thetaD, nbr, P) and eps(thetaD, nbr, P), and requires the getData() service with dimensions thetaD() and nbr() as well as the getPower() service with dimension P(); ports carry both Interfaces and ContractTypes.)

Fig. 6. Quality attributes and dependencies specification of a component

Specifying the QoS properties of required and provided services of a component is


not enough to predict the quality of an assembly at design time. Additional informa-
tion must be supplied:
− constraints on the value of the QoS properties are needed to get the parties to nego-
tiate and to agree; they explain the level of quality required or provided for a service
by a component;
− the dependency between these values is an important kind of relationship; it can
be described either with a function (for instance: ThetaC = ThetaD + Nbr *
log( Nbr ) (1)) or with a rule (if Nbr = 3 and Eps = medium then ThetaC ≤ 25).
In other words, these constraints can be stated as OCL [14] pre and post-conditions
on the Dimensions. For instance:
context Computer::thetaC( thetaD : real, nbr : int,
P : real) : real
pre: thetaD >= 0 and P >= 0
post: result = thetaD + nbr * log( nbr ) and P =
3*nbr
At design time, the global set of pre and post-conditions of all specified Dimen-
sions of a component builds a system of non-linear constraints that must be satisfied.
Constraint Logic Programming is the general framework for solving such systems.
Dedicated solvers determine whether a system is satisfiable and, if so, compute the
admissible interval of values for each dimension involved.

4.2 Prediction of the GPS Quality of Service

In this section we present the set of constraints for the GPS component-based model
(Fig. 1). A first subset of constraints defines possible or impossible values for a QoS
property. These admissible value sets come on the one hand from implementation or
technological constraints and on the other hand from designers’ and users’ require-
ments about a service. The facts that the Nbr value is 3, 5 or 12 (2) and that the ThetaC
and ThetaD values must be non-negative real values (3-4) belong to the first category of con-
straints. Conversely, the facts that Eps is at least medium (5) and that P is less than or
equal to 15 mW (6) are designer or user requirements.
Nbr ∈ {3, 5, 12}, (2)
ThetaC ≥ 0, (3)
ThetaD ≥ 0, (4)
Eps ∈ {medium, high}, (5)
P ≤ 15. (6)

Secondly, constraints can also explain the dependency relationships that bind the
QoS properties of a component. For instance, the active power level P is linearly
dependent on the Nbr number of receivers according to the formula:
P = 3 * Nbr. (7)
Moreover, the time spent by the getLocation() service (ThetaC) depends on the
time spent by the getData() service (ThetaD) and the number of data received (Nbr),
according to equation (1). Lastly, a rule binds the precision Eps, the time spent to
compute the current position ThetaC and the number of received data (Nbr). The
following diagram (Fig. 7) presents this rule:

Fig. 7. The rule that binds the Eps, Nbr and ThetaC dimensions

All these constraints, expressed in OCL syntax, can be translated into a specific
CLP-compliant language, using a Model Transformation [24]. For instance, we pre-
sent below the result of such a transformation applied to the computer QoSCompo-
nent (Fig. 6) and its OCL conditions (using the Eclipse™ syntax):
01- computer( [ ThetaC, Eps, P, ThetaD, Nbr ] ) :-
02- ThetaC $>= 0, Eps = high, P $>= 0,
03- ThetaD $>= 0, member( Nbr, [3,5,12]),
04- ThetaC $>= 0, ThetaD $>= 0,
05- ThetaC $= ThetaD + Nbr * log(Nbr),
06- P $= Nbr * 3,
07- rule( Eps, ThetaC, Nbr).
08-
09- rule( medium, ThetaC, 3) :- ThetaC $=< 25.
10- rule( low, ThetaC, 3) :- ThetaC $> 25.
11- rule( high, ThetaC, 5) :- ThetaC $=< 24.
12- rule( medium, ThetaC, 5) :- ThetaC $>24,

13- ThetaC $=< 30.


14- rule( low, ThetaC, 5) :- ThetaC $> 30.
15- rule( high, ThetaC, 12) :- ThetaC $=< 32.
16- rule( medium, ThetaC, 12) :- ThetaC $> 32,
17- ThetaC $=<45.
18- rule( low, ThetaC, 12) :- ThetaC $> 45.
The first line (01) indicates the QoS properties bound by the component. The two
following lines (02, 03) are the constraints on the admissible values for these QoS
properties, and lines 05 to 07 are the dependency relationships (1-7 and Fig. 7) that
bind them.
For each component, it is necessary to check its system of constraints, in order to
determine whether it can be satisfied. The result of such a query is the set of admissible
values for the QoS properties of the component. Thus, for the computer component,
the admissible QoS property values are enumerated below:

ThetaC (s)          ThetaD (s)         Eps    P (mW)   Nbr
[3.49 .. 24.0]      [0.0 .. 20.51]     high   15       5
[12.95 .. 32.0]     [0.0 .. 19.05]     high   36       12

The requirement about the estimated position (Eps = high) implies that:
− the number of data channels must be either 5 or 12,
− consequently, the active power is either 15 or 36mW,
− and the response times of the getLocation() and getData() services are respec-
tively in the [3.49; 32.0] and [0.0; 20.51] real intervals (a brute-force check of these
figures is sketched below).
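For readers who wish to reproduce these figures without a CLP system, the following
Python sketch brute-forces the same enumeration for the computer component. This is
our own illustration, not the authors' ECLiPSe-based tool chain; the function names are
ours, ThetaD is sampled on a finite grid, and a base-10 logarithm is assumed, since it
matches the bounds reported in the table above.

import math

def eps_rule_high(theta_c, nbr):
    # rule(high, ThetaC, Nbr), i.e. lines 11 and 15 of the CLP program
    return (nbr == 5 and theta_c <= 24) or (nbr == 12 and theta_c <= 32)

def admissible_computer():
    solutions = {}
    for nbr in (3, 5, 12):                         # admissible values of Nbr (2)
        p = 3 * nbr                                # dependency (7)
        for i in range(0, 4001):                   # ThetaD >= 0, sampled up to 40 s
            theta_d = i / 100.0
            theta_c = theta_d + nbr * math.log10(nbr)   # dependency (1)
            if eps_rule_high(theta_c, nbr):        # requirement Eps = high
                clo, chi, dlo, dhi, _ = solutions.get(
                    nbr, (theta_c, theta_c, theta_d, theta_d, p))
                solutions[nbr] = (min(clo, theta_c), max(chi, theta_c),
                                  min(dlo, theta_d), max(dhi, theta_d), p)
    return solutions

for nbr, (clo, chi, dlo, dhi, p) in sorted(admissible_computer().items()):
    print(f"Nbr={nbr:2d}  P={p:2d} mW  ThetaC in [{clo:.2f}, {chi:.2f}]"
          f"  ThetaD in [{dlo:.2f}, {dhi:.2f}]")

The printed intervals approximate the two table rows; a genuine interval solver returns
the exact bounds instead of grid samples.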
At this point, the designer knows the qualitative behavior of all the components.
It is also possible to know the qualitative behavior of an assembly, by conjunction of
the constraint systems and unification of their QoS properties.
The following constraint program shows the example of the GPS component:
19- satellite( [ ThetaS ] ) :-
20- ThetaS $>= 15, ThetaS $=< 30.
21-
22- battery( [ P ] ) :-
23- P $>= 0,
24- P $=< 15.
25-
26- receiver( [ ThetaR, ThetaS ] ) :-
27- ThetaR $>= 0, ThetaS $>= 0,
28- ThetaR $= ThetaS + 2.
29-
30- decoder( [ ThetaD, ThetaS, Nbr ] ) :-
31- ThetaD $>= 0, ThetaS $>= 0,
32- member( Nbr, [3,5,12]),
33- receiver( [ ThetaR, ThetaS ] ),
34- ThetaD $= ThetaR + 3.
35-
36- gps( [ ThetaC, Eps, ThetaS ] ) :-
37- ThetaC $>= 0, Eps = high, ThetaS $>= 0,
38- computer( [ ThetaC, Eps, P, ThetaD, Nbr ] ),
39- decoder( [ ThetaD, ThetaS, Nbr ] ),
40- battery( [ P ] ).
Similarly, the propagation of numerical constraints over the admissible sets of val-
ues yields the following predicted qualitative behavior of the GPS assembly:

ThetaC (s)          ThetaS (s)         Eps
[23.49 .. 24.0]     [15.0 .. 15.50]    high

The strong requirement on the precision of the computed location implies that the
satellite signals have to be received by the GPS component with a delay less than
15.5 s. In this case, the location will be computed in less than 24 s.
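The same brute-force style can illustrate the conjunction of component constraints
performed by the CLP program of lines 19-40. The sketch below is again ours, with
hypothetical helper names, a sampled ThetaS, and an assumed base-10 logarithm; it
conjoins one predicate per component and recovers the assembly-level intervals of the
table above.

import math

def satellite(theta_s):                   # lines 19-20
    return 15 <= theta_s <= 30

def battery(p):                           # lines 22-24
    return 0 <= p <= 15

def receiver(theta_r, theta_s):           # lines 26-28: ThetaR = ThetaS + 2
    return theta_r >= 0 and abs(theta_r - (theta_s + 2)) < 1e-9

def decoder(theta_d, theta_r):            # lines 30-34: ThetaD = ThetaR + 3
    return theta_d >= 0 and abs(theta_d - (theta_r + 3)) < 1e-9

def computer(theta_c, theta_d, nbr, p):   # lines 01-07 plus the Eps = high rule
    dep = abs(theta_c - (theta_d + nbr * math.log10(nbr))) < 1e-9 and p == 3 * nbr
    rule = (nbr == 5 and theta_c <= 24) or (nbr == 12 and theta_c <= 32)
    return dep and rule

theta_c_vals, theta_s_vals = [], []
for nbr in (3, 5, 12):
    p = 3 * nbr
    for i in range(1500, 3001):           # ThetaS sampled on [15, 30]
        theta_s = i / 100.0
        theta_r = theta_s + 2
        theta_d = theta_r + 3
        theta_c = theta_d + nbr * math.log10(nbr)
        if (satellite(theta_s) and battery(p) and receiver(theta_r, theta_s)
                and decoder(theta_d, theta_r) and computer(theta_c, theta_d, nbr, p)):
            theta_c_vals.append(theta_c)
            theta_s_vals.append(theta_s)

print("ThetaC in [%.2f, %.2f]" % (min(theta_c_vals), max(theta_c_vals)))
print("ThetaS in [%.2f, %.2f]" % (min(theta_s_vals), max(theta_s_vals)))

Only Nbr = 5 survives the conjunction, because the battery constraint P ≤ 15 rules
out the P = 36 solution of the computer component alone.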

5 Related Work

In the Component-Based Software Engineering community, the concept of predict-
ability is getting more and more attention, and is now underlined as a real need [4].
Thus, the Software Engineering Institute (SEI) promotes its Predictable Assembly
from Certifiable Components (PACC) initiative, which studies how component technology
can be extended to achieve predictable assembly, enabling runtime behavior to be predicted
from the properties of components. The ongoing work concentrates on a Prediction-
Enabled Component Technology (PECT) as a method to integrate state-of-the-art
techniques for reasoning about software quality attributes [23].
In the introduction of the SEI’s second workshop on predictable assembly [21], the
authors note that component interfaces are not sufficiently descriptive. A syntax for
defining and specifying quality of service attributes, called QML, is defined by Frol-
und and Koistinen in [5], directly followed by Aagedal [1]. The Object Management
Group (OMG) has developed its own UML profile for schedulability, performance
and time specification [16]. These works emphasize the contractual use of QoS prop-
erties, and constitute the fundamental core of QoS specifications.
In the previous approaches, a QoS property is specified as a constant: they do not
allow the specification of dependency relationships between QoS properties. In contrast,
Reussner proposes his parameterized contracts [18]: the set of available services pro-
vided by a component depends on which of its required services the context can provide.
This concept is a generalization of design-by-contract [11]. The same author has
published in 2003 an extension of this work dedicated to QoS [19]. He mod-
els the QoS dependency with Markov chains where:
− the states are services, with their QoS values;
− the transitions represent the connections (calls) between components, i.e. the ar-
chitecture of an assembly;
− the usage profile of the assembly is modeled by probabilities for calls to a pro-
vided service. Usage profiles are commonly modeled by Markov chains since
Cheung [3] and Whittaker and Thomason [25].
From an assembly model and its usage profile, it seems possible to generate the as-
sociated Markov chain and to predict the QoS level of provided services. Conversely,
it is not possible to invert the prediction process in order to propagate a particular
QoS requirement applied on a provided service onto the QoS properties of the required
services on which it depends. Moreover, via the Chapman-Kolmogorov equation,
Markov processes handle only probabilities, and they are not able to reason about
formal un-valued variables. For instance, it is impossible to compare the n^2 and
n*log(n) complexity of two sort algorithms.
Constraint solvers over real intervals and finite domains have already been used
in the context of software engineering. For instance, logic programming tech-
niques can generate test cases for functional properties; more precisely, this technique
allows a more realistic treatment of bound values [10]. Concerning the functional
aspect of software, many authors have successfully used constraint logic programming,
based on translations of the source code under test, or of its formal specification, into con-
straints: the GATEL system [9] translates LUSTRE [7] expressions, and Gotlieb et al.
define a direct transformation from C [6]. The works mentioned above focus on
the functional aspects of software only, while our approach encompasses extra-
functional properties.

6 Conclusion and Future Work

In mission-critical component-based systems, it is particularly important to be able to
explicitly relate the QoS contracts attached to the provided interfaces of components with
the QoS contracts obtained from their required interfaces. In this paper we have in-
troduced a notation called QoSCL (defined as an add-on to the UML 2.0 component
model) to let the designer explicitly describe and manipulate these higher-level con-
tracts and their dependencies. We have shown how the very same QoSCL contracts
can then be exploited for:
1 validation of individual components, by automatically weaving contract monitor-
ing code into the components;
2 validation of a component assembly, including getting end-to-end QoS informa-
tion inferred from individual component contracts, by automatic translation to a
Constraint Logic Programming language.
Both validation activities build on the model transformation framework developed
at INRIA (cf. http://modelware.inria.fr). Preliminary implementations of these ideas
have been prototyped in the context of the QCCS project (cf. http://www.qccs.org)
for the weaving of contract-monitoring code into components, and in the Artist
project (http://www.systemes-critiques.org/ARTIST) for the validation of a compo-
nent assembly. Both parts still need to be better integrated with UML 2.0 model-
ling environments, which is work in progress.

References
1. Aagedal J.O.: “Quality of service support in development of distributed systems”. Ph.D
thesis report, University of Oslo, Dept. Informatics, March 2001.
2. Beugnard A., Jézéquel J.M., Plouzeau N. and Watkins D.: “Making components contract
aware” in Computer, pp. 38–45, IEEE Computer Society, July 1999.
3. Cheung R.C.: “A user-oriented software reliability model” in IEEE Transactions on Soft-
ware Engineering vol. 6 (2), pp. 118–125, 1980.
4. de Roever W.P.: “The need for compositional proof systems: a survey” in Proceedings of
the Int. Symp. COMPOS’97, Bad Malente, Germany, Sept. 8–12, 1997.
5. Frolund S. and Koistinen J.: “QoS specification in distributed object systems” in Distrib-
uted Systems Engineering, vol. 5, July 1998, The British Computer Society.
6. Gotlieb A., Botella B. and Rueher M.: “Automatic test data generation using constraint
solving techniques” in ACM Int. Symp. on Software Testing and Analysis (ISSTA'98),
also in Software Engineering Notes, 23(2):53–62, 1998.
7. Halbwachs N., Caspi P., Raymond P. and Pillaud D.: “The synchronous data flow pro-
gramming language LUSTRE” in Proc. of IEEE, vol. 79, pp.1305–1320, Sept. 1991.
8. McHale C.: “Synchronization in concurrent object-oriented languages: expressive power,
genericity and inheritance”. Doctoral dissertation, Trinity College, Dept. of computer sci-
ence, Dublin, 1994.
9. Marre B. and Arnould A.: “Test sequences generation from LUSTRE descriptions: Gatel” in
15th IEEE Int. Conf. on Automated Software Engineering (ASE), pp. 229–237, Sept.
2000, Grenoble, France.
10. Meudec C.: “Automatic generation of software test cases from formal specifications”. PhD
thesis, Queen’s University of Belfast, 1998.
11. Meyer B.: “Object oriented software construction”, 2nd ed., Prentice Hall, 1997.
12. Meyer B.: “Applying design by contract” in IEEE Computer vol. 25 (10), pp. 40–51, 1992.
13. Object Management Group: “UML Superstructure 2.0”, OMG, August 2003.
14. Object Management Group: “UML2.0 Object Constraint Language RfP”, OMG, July 2003.
15. Object Management Group: “CORBA Components, v3.0”, adopted specification of the
OMG, June 2002.
16. Object Management Group: “UML profile for schedulability, performance and time speci-
fication”. OMG adopted specification no ptc/02-03-02, March 2002.
17. http://www.qccs.org, Quality Control of Component-based Software (QCCS) European
project home page.
18. Reussner R.H.: “The use of parameterized contracts for architecting systems with software
components” in Proc. of the 6th Int. Workshop on Component-Oriented Programming
(WCOP’01), June 2001.
19. Reussner R.H., Schmidt H.W. and Poernomo I.H.: “Reliability prediction for component-
based software architecture” in the Journal of Systems and Software, vol. 66, pp.
241–252, 2003.
20. Richter, J.: “Applied Microsoft .Net framework programming”. Microsoft Press, January
23, 2002.
21. Stafford J. and Scott H.: “The Software Engineering Institute’s Second Workshop on
Predictable Assembly: Landscape of compositional predictability”. SEI report no
CMU/SEI-2003-TN-029, 2003.
22. Szyperski, C.: “Component software, beyond object-oriented programming”, 2nd ed.,
Addison-Wesley, 2002.
23. Wallnau K.: “Volume III: A technology for predictable assembly from certifiable compo-
nents”. SEI report no CMU/SEI-2003-TR-009.
24. Weis T. et al.: “Model metamorphosis” in IEEE Software, September 2003, pp. 46–51.
25. Whittaker J.A. and Thomason M.G.: “A Markov chain model for statistical software test-
ing” in IEEE Transactions on Software Engineering vol.20 (10), pp.812–824, 1994.
An Application of Stream Calculus to
Signal Flow Graphs

J.J.M.M. Rutten
CWI and VUA, P.O. Box 94079, 1090 GB Amsterdam, The Netherlands

1 Summary
The present paper can be seen as an exercise in the author’s stream calculus
[Rut01] and gives a new proof for an existing result about stream circuits. Such
circuits are also known under the name of signal flow graphs, and are built from
(scalar) multipliers, copiers (fan-out), adders (element-wise sum), and registers
(here: one-element memory cells, aka delays). Because of the presence of mem-
ory, the input-output behaviour of these circuits is best described in terms of
functions from streams to streams (of real numbers). The main statement of
this paper (Theorem 6), gives a characterization of the input-output behaviour
of finite stream circuits in terms of so-called rational streams. It is well-known
in the world of signal processing, where it is formulated and proved in terms
of the Z-transform (a discrete version of the Laplace transform) and transfer
functions (see for instance [Lah98, p.694]). These transforms are used as repre-
sentations of streams of (real or complex) numbers. As a consequence, one has
to deal with two different worlds, and some care is required when moving from
the one to the other. In contrast, we use stream calculus to formulate and obtain
essentially the same result. What is somewhat different and new here is that we
use only streams and nothing else. In particular, expressions for streams such as
1/(1−X)^2 = (1, 2, 3, . . .) are not mere representations but should be read as formal
identities. Technically, the formalism of stream calculus is simple, because it uses
the constant stream X = (0, 1, 0, 0, 0, . . .) as if it were a formal variable (cf. work
on formal power series such as [BR88]).
We find it worthwhile to present this elementary treatment of signal flow
graphs for a number of reasons:

– It explains in very basic terms two fundamental phenomena in the theory of


computation: memory (in the form of register or delay elements) and infinite
behaviour (in the form of feedback).
– Although Theorem 6 is well-known to electrical engineers, computer scien-
tists do not seem to know it. Also, the result as such is not so easy to isolate
in the relevant literature on (discrete-time, linear) system theory.
– Although not worked out here, there is a very close connection between The-
orem 6, and a well-known result from theoretical computer science: Kleene’s
theorem on rational (regular) languages and deterministic finite state au-
tomata.


– The present methodology is relevant for the area of component-based soft-


ware engineering: it has recently been generalised to model software compo-
sition by means of so-called component connectors, in terms of relations on
the streams of ingoing and outgoing messages (or data elements) at the var-
ious communication ports [AR03]. A similar remark applies to our ongoing
work on stream-based models of sequential digital circuits.

Stream calculus has been mainly developed as a playground for the use of
coinductive definition and proof principles (see [Rut01]). In particular, streams
and stream functions can be elegantly defined by so-called stream differential
equations. For the elementary operations on streams that are used in this paper
(sum, convolution product and its inverse), the more traditional definitions of
the necessary operations on streams suffice and therefore, coinduction is not
discussed here.
Acknowledgements. Thanks are due to the referees for their critical yet con-
structive remarks.

2 Basic Stream Calculus


In this section, we study the set IRω = {σ | σ : {0, 1, 2, . . .} → IR } of streams
of real numbers. We shall introduce a number of constants and shall define
the operations of sum, product, and inverse of streams. These constants and
operations make of IRω a calculus with many pleasant properties. In particular,
it will be possible to compute solutions of linear systems of equations.
We denote streams σ ∈ IRω by σ = (σ0 , σ1 , σ2 , . . .). We define the sum σ + τ
of σ, τ ∈ IRω by
σ + τ = (σ0 + τ0 , σ1 + τ1 , σ2 + τ2 , . . .)
(Note that we use the same symbol + for both the sum of two streams and
the sum of two real numbers.) We define the convolution product σ × τ by

σ × τ = (σ0 · τ0 , (σ0 · τ1 ) + (σ1 · τ0 ), (σ0 · τ2 ) + (σ1 · τ1 ) + (σ2 · τ0 ), . . .)

That is, for any n ≥ 0,

(σ × τ)n = Σ_{k=0}^{n} σk · τ_{n−k}

In general, we shall simply say ‘product’ rather than ‘convolution product’.


Note that we use the symbol × for the multiplication of streams and the symbol
· for the multiplication of real numbers. Similar to the notation for the multipli-
cation of real numbers (and functions), we shall write σ^0 ≡ 1 and σ^(n+1) ≡ σ × σ^n.
It will be convenient to define the operations of sum and product also for the
combination of a real number r and a stream σ. This will allow us, for instance,
to write 3 × σ for σ + σ + σ. In order to define this formally, it will be convenient
to view real numbers as streams in the following manner. We define for every
r ∈ IR a stream [r] ∈ IRω by

[r] = (r, 0, 0, 0, . . .)

Note that this defines in fact a function

[ ] : IR → IRω , r → [r]

which embeds the set of real numbers into the set of streams. This definition
allows us to add and multiply real numbers r with streams σ, yielding:

[r] + σ = (r, 0, 0, 0, . . .) + σ
= (r + σ0 , σ1 , σ2 , σ3 . . .)
[r] × σ = (r, 0, 0, 0, . . .) × σ
= (r · σ0 , r · σ1 , r · σ2 , . . .)

For notational convenience, we shall usually write simply r + σ for [r] + σ,


and similarly r × σ for [r] × σ. The context will always make clear whether
the notation r has to be interpreted as the real number r or as the stream [r].
For multiplication, this difference is moreover made explicit by the use of two
different symbols: r × σ always denotes the multiplication of streams (and hence
r should be read as the stream [r]) and r · s always denotes the multiplication
of real numbers. We shall also use the following convention:

−σ ≡ [−1] × σ
= (−σ0 , −σ1 , −σ2 , . . .)

Here are a few basic properties of our operators.

Proposition 1. For all r, s ∈ IR and σ, τ, ρ ∈ IRω ,

[r] + [s] = [r + s]
σ+0=σ
σ+τ =τ +σ
σ + (τ + ρ) = (σ + τ ) + ρ
[r] × [s] = [r · s]
0×σ =0
1×σ =σ
σ×τ =τ ×σ
σ × (τ + ρ) = (σ × τ ) + (σ × ρ)
σ × (τ × ρ) = (σ × τ ) × ρ

Particularly simple are those streams that from a certain point onwards are
constant zero:
σ = (r0 , r1 , r2 , . . . , rn , 0, 0, 0, . . .)
for n ≥ 0 and r0 , . . . , rn ∈ IR. Using the following constant, we shall see that
there is a very convenient way of denoting such streams: we define
X = (0, 1, 0, 0, 0, . . .)
It satisfies, for all r ∈ IR, σ ∈ IRω , and n ≥ 0:
r × X = (0, r, 0, 0, 0, . . .)
X × σ = (0, σ0 , σ1 , σ2 , . . .)
X^n = (0, . . . , 0, 1, 0, 0, 0, . . .)    (n zeros followed by a 1)
For instance, 2 + 3X − 8X^3 = (2, 3, 0, −8, 0, 0, 0, . . .). More generally, we have,
for all n ≥ 0 and all r0, . . . , rn ∈ IR:

r0 + r1 X + r2 X^2 + · · · + rn X^n = (r0, r1, r2, . . . , rn, 0, 0, 0, . . .)

Such streams are called polynomial streams. Note that although a polynomial
stream such as 2 + 3X − 8X^3 looks like a (polynomial) function f(x) = 2 + 3x − 8x^3,
for which x is a variable, it really is a stream, built from constant streams
(2, 3, 8, and X), and the operations of sum and product. At the same time, it
is true that we can calculate with polynomial streams in precisely the same way
as we are used to compute with (polynomial) functions, as is illustrated by the
following example (here we use the basic properties of sum and product listed in
Proposition 1): (2 − X) + (1 + 3X) = 3 + 2X and (2 − X) × (1 + 3X) = 2 + 5X − 3X^2.
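As a quick illustration (ours, not part of [Rut01]), the polynomial calculation above
can be replayed in Python on finite stream prefixes; the helper names and the prefix
length N are arbitrary choices.

N = 8                                   # length of the finite prefixes we keep

def pad(s):
    # extend a finite list of coefficients to length N with zeros
    return (list(s) + [0.0] * N)[:N]

def s_add(s, t):
    # element-wise sum of two streams
    s, t = pad(s), pad(t)
    return [s[n] + t[n] for n in range(N)]

def s_mul(s, t):
    # convolution product: (s x t)_n = sum_{k=0}^{n} s_k * t_{n-k}
    s, t = pad(s), pad(t)
    return [sum(s[k] * t[n - k] for k in range(n + 1)) for n in range(N)]

X = pad([0, 1])                          # the constant stream X = (0, 1, 0, 0, ...)

print(s_add(pad([2, -1]), pad([1, 3])))  # (2 - X) + (1 + 3X) = 3 + 2X
print(s_mul(pad([2, -1]), pad([1, 3])))  # (2 - X) x (1 + 3X) = 2 + 5X - 3X^2
print(s_mul(X, pad([1, 2, 3])))          # X x (1, 2, 3, 0, ...) = (0, 1, 2, 3, 0, ...)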
We shall need to solve linear equations in one unknown τ , such as
τ = 1 + (X × τ ) (1)
(where, recall, 1 = (1, 0, 0, 0, . . .)). Ideally, we would like to solve (1) by reasoning
as follows:
τ = 1 + (X × τ )
⇒ τ − (X × τ ) = 1
⇒ (1 − X) × τ = 1
⇒ τ = 1/(1 − X)
Recall that we are not dealing with functions but with streams. Therefore it
is not immediately obvious what we mean by the ‘inverse’ of a stream 1 − X =
(1, −1, 0, 0, 0, . . .). There is however the following fact: for any stream σ such
that σ0 ≠ 0, there exists a unique stream τ such that σ × τ = 1. A proof of this
fact can be given in various ways:
(a) Using the definition of convolution product, one can easily derive the following
recurrence relation

τ0 = 1/σ0,    τn = −(1/σ0) · Σ_{k=0}^{n−1} σ_{n−k} · τk    (n ≥ 1),

by which the elements of τ can be constructed one by one.


(b) Alternatively and equivalently, one can use the algorithm of long division to
obtain τ out of σ.
(c) Our personal favourite is a method described in [Rut01], where we have
introduced the operation of inverse by means of so-called stream differ-
ential equations, formulated in terms of the notions of stream derivative
σ′ = (σ1, σ2, σ3, . . .) and initial value σ0 for σ ∈ IRω. (In fact, all the opera-
tions on IRω are there introduced by means of such equations.)
Now we can simply define the inverse 1/σ of a stream σ with σ0 ≠ 0 as the
unique stream τ such that σ × τ = 1. Here are a few examples that can be easily
computed using any of the methods (a)-(c) above:
1/(1 − X) = (1, 1, 1, . . .)
1/(1 − X^2) = (1, 0, 1, 0, 1, 0, . . .)
1/(1 − X)^2 = (1, 2, 3, . . .)
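The recurrence of method (a) is easy to execute; the following sketch (ours) computes
inverses on finite prefixes and reproduces the three examples just given.

N = 8

def pad(s):
    return (list(s) + [0.0] * N)[:N]

def s_mul(s, t):
    # convolution product, as defined earlier in this section
    s, t = pad(s), pad(t)
    return [sum(s[k] * t[n - k] for k in range(n + 1)) for n in range(N)]

def s_inv(s):
    # inverse of a stream with s_0 != 0, computed element by element via recurrence (a)
    s = pad(s)
    assert s[0] != 0, "only streams with a non-zero first element are invertible"
    t = [1.0 / s[0]]
    for n in range(1, N):
        t.append(-(1.0 / s[0]) * sum(s[n - k] * t[k] for k in range(n)))
    return t

one_minus_X = pad([1, -1])
print(s_inv(one_minus_X))                          # 1/(1 - X)   = (1, 1, 1, ...)
print(s_inv(pad([1, 0, -1])))                      # 1/(1 - X^2) = (1, 0, 1, 0, ...)
print(s_inv(s_mul(one_minus_X, one_minus_X)))      # 1/(1 - X)^2 = (1, 2, 3, ...)

The same code can also be used to check the solution of the system of linear equations
solved at the end of this section.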
As with sum and product, we can calculate with the operation of inverse in
the same way as we compute with functions: for all σ, τ ∈ IRω with σ0 ≠ 0 and τ0 ≠ 0,

σ × (1/σ) = 1
(1/σ) × (1/τ) = 1/(σ × τ)
1/(1/σ) = σ
Using the various properties of our operators, it is straightforward to see that
in the calculus of streams, we can solve linear equations as usual. Consider for
instance the following system of equations:
σ = 1 + (X × τ )
τ =X ×σ
In order to find σ and τ, we compute as follows: σ = 1 + (X × τ) = 1 + (X × X × σ) =
1 + (X^2 × σ). This implies σ − (X^2 × σ) = 1 and (1 − X^2) × σ = 1.
Thus σ = 1/(1 − X^2) and τ = X/(1 − X^2).

We conclude this section with the following definition, which is of central


importance for the formulation of the main result of this paper.
Definition 2. The product of a polynomial stream and the inverse of a polyno-
mial stream is called a rational stream. Equivalently, a stream σ is rational if
there exist n, m ≥ 0 and coefficients r0, . . . , rn, s0, . . . , sm ∈ IR with s0 ≠ 0, such
that

σ = (r0 + r1 X + r2 X^2 + · · · + rn X^n) / (s0 + s1 X + s2 X^2 + · · · + sm X^m)

3 Stream Circuits
Certain functions from IRω to IRω can be represented by means of graphical
networks that are built from a small number of basic ingredients. Such networks
can be viewed as implementations of stream functions. We call them stream
circuits; in the literature, they are also referred to as (signal) flow graphs. Using
the basic stream calculus from Section 2, we shall give a formal but simple answer
to the question precisely which stream functions can be implemented by such
stream circuits.

3.1 Basic Circuits


The circuits that we are about to describe will generally have a number of
input ends and a number of output ends. Here is an example of a simple circuit,
consisting of one input and one output end:

    ---------->

The input end is denoted by the arrow shaft and the output end is
denoted by the arrow head. For streams σ, τ ∈ IRω, we shall write

    σ ----------> τ

and say that the circuit inputs the stream σ and outputs the stream τ. Writing
the elements of these streams explicitly, this notation is equivalent to

    (σ0, σ1, σ2, . . .) ----------> (τ0, τ1, τ2, . . .)

which better expresses the intended operational behaviour of the circuit: It con-
sists of an infinite sequence of actions, at time moments 0, 1, 2, . . .. At each
moment n ≥ 0, the circuit simultaneously inputs the value σn ∈ IR at its input
end and outputs the value τn ∈ IR at its output end. In general, this value τn
depends both on the value σn and on the values σi that have been taken as
inputs at earlier time moments i < n. Note that this implies that circuits have
memory.
Next we present the four basic types of circuits, out of which all other circuits
in this section will be constructed.

(a) For every a ∈ IR, we define a circuit with one input and one output end,
called an a-multiplier, for all σ, τ ∈ IRω, by

    σ --a--> τ   ⇐⇒   τn = a · σn, all n ≥ 0   ⇐⇒   τ = a × σ

This circuit takes, at any moment n ≥ 0, a value σn at its input end, multi-
plies it with the constant a, and outputs the result τn = a · σn at its output
end. It defines, in other words, a function that assigns to an input stream σ
the output stream τ = a × σ.
Occasionally, it will be more convenient to write the multiplying factor a as


a super- or subscript of the arrow:

    [the same arrow, with the factor a written as a label, as a superscript, or as a subscript]

(b) The adder circuit has two input and one output ends, and is defined, for all
σ, τ, ρ ∈ IRω by
    (σ, τ) --+--> ρ   ⇐⇒   ρn = σn + τn, all n ≥ 0   ⇐⇒   ρ = σ + τ
At moment n ≥ 0, the adder simultaneously inputs the values σn and τn at
its input ends, and outputs their sum ρn = σn + τn at its output end.
(c) The copier circuit has one input and two output ends and is defined, for all
σ, τ, ρ ∈ IRω , by

    σ --c--> (τ, ρ)   ⇐⇒   τn = σn = ρn, all n ≥ 0   ⇐⇒   τ = σ = ρ
At any moment n ≥ 0, the copier inputs the value σn at its input end, and
outputs two identical copies τn and ρn at its output ends.
(d) A register circuit has one input and one output end and is defined, for all
σ, τ ∈ IRω , by
    σ --R--> τ   ⇐⇒   τ0 = 0 and τn = σn−1, all n ≥ 1   ⇐⇒   τ = (0, σ0, σ1, σ2, . . .)
The register circuit can be viewed as consisting of a one-place memory cell
that initially contains the value 0. The register starts its activity, at time
moment 0, by outputting its value τ0 = 0 at its output end, while it simulta-
neously inputs the value σ0 at its input end, which is stored in the memory
cell. At any future time moment n ≥ 1, the value τn = σn−1 is output and
the value σn is input and stored. (For obvious reasons, the register circuit is
sometimes also called a unit delay.) Recalling that for the constant stream
X = (0, 1, 0, 0, 0, . . .), we have X × σ = (0, σ0 , σ1 , σ2 , . . .), it follows that for
all σ, τ ∈ IRω ,
    σ --R--> τ   ⇐⇒   τ = X × σ
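Operationally, each basic circuit is a stream function; the following sketch (ours, on
finite prefixes rather than infinite streams) implements the four of them, with the
register realised as a unit delay.

def multiplier(a, s):
    # a-multiplier: tau = a x sigma
    return [a * x for x in s]

def adder(s, t):
    # adder: rho = sigma + tau (element-wise)
    return [x + y for x, y in zip(s, t)]

def copier(s):
    # copier: two identical copies of the input stream
    return list(s), list(s)

def register(s):
    # register (unit delay): tau = (0, sigma_0, sigma_1, ...) = X x sigma
    return [0.0] + list(s[:-1])

sigma = [1.0, 2.0, 3.0, 4.0]
tau, rho = copier(sigma)
print(multiplier(3, sigma))   # (3, 6, 9, 12)  = 3 x sigma
print(adder(tau, rho))        # (2, 4, 6, 8)   = 2 x sigma
print(register(sigma))        # (0, 1, 2, 3)   = X x sigma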

3.2 Circuit Composition


We can construct a larger circuit out of two smaller ones by connecting output
ends of the first to input ends of the second. For instance, for the composition
of a 2-multiplier and a 3-multiplier, we shall write

    --2--> ◦ --3-->

We call the connection point ◦ an (internal) node of the composed circuit. A


computation step of this circuit, at any moment in time, consists of the simulta-
neous occurrence of the following actions: a value is input at the input end of the
2-multiplier; it is multiplied by 2 and output at the output end of the 2-multiplier; the
result is input at the input end of the 3-multiplier, is multiplied by 3 and is output
at the output end of the 3-multiplier. More formally, and fortunately also more
succinctly, we define the behaviour of the composed circuit, for all σ, τ ∈ IRω ,
by
    σ --2--> ◦ --3--> τ
⇐⇒  σ --2--> ∃ρ --3--> τ
⇐⇒  ∃ρ ∈ IRω : σ --2--> ρ  and  ρ --3--> τ

We shall consider all three of the above notations as equivalent. Combining


the definitions of a 2- and 3-multiplier, we can in the above example easily
compute how the output stream τ depends on the input stream σ:
    σ --2--> ◦ --3--> τ
⇐⇒  ∃ρ ∈ IRω : σ --2--> ρ  and  ρ --3--> τ
⇐⇒  ∃ρ ∈ IRω : ρ = 2 × σ  and  τ = 3 × ρ
⇐⇒  τ = 6 × σ

Note that the stream ρ is uniquely determined by the stream σ. The motiva-
tion for our notation “∃ρ” is not so much to suggest that there might be more
possible candidate streams for ρ, but rather to emphasise the fact that in order
to express the output stream τ in terms of σ, we have to compute the value of
the stream ρ in the middle.
We can compose circuits, more generally, with several output ends with cir-
cuits having a corresponding number of input ends, as in the following example:
    [diagram: a copier whose two output ends are connected to the two input ends of an adder]

In this example, the behaviour of the resulting circuit is defined, for all σ, τ ∈
IRω, by

    [σ fed into the copier, τ produced by the adder]
⇐⇒  ∃γ, δ ∈ IRω : σ --c--> (γ, δ)  and  (γ, δ) --+--> τ
⇐⇒  ∃γ, δ ∈ IRω : σ = γ = δ  and  τ = γ + δ
⇐⇒  τ = 2 × σ

It will be convenient to have adders with more than two inputs and, sim-
ilarly, copiers with more than two outputs. We define a ternary adder as the
composition of two binary adders as follows:
    [diagram: a ternary adder, obtained by feeding the output of one binary adder into an input of a second binary adder]

For input streams σ, τ, ρ ∈ IRω , it produces the output stream σ + τ + ρ. We
define a ternary copier by the following composition:
    [diagram: a ternary copier, obtained by feeding one output of a binary copier into a second binary copier]

It takes one input stream and produces three identical copies as output
streams. Adders and copiers with four or more inputs and outputs can be con-
structed in a similar fashion.
The following circuit combines (various instances of) all four basic circuit
types:

    [diagram: a copier splits the input into three branches (a 2-multiplier; a 3-multiplier followed by one register; a (−7)-multiplier followed by two registers), and an adder combines the three branches into the output]

In order to express the output stream τ for a given input stream σ, we have
to compute one intermediate stream for each of the (nine) internal nodes ◦ in
the circuit above. Using the definitions of the basic circuits, and computing from
left to right, we find:

    [the same diagram annotated with intermediate streams: the three branches carry 2σ; 3σ and then 3Xσ; −7σ, −7Xσ and then −7X²σ; the adder outputs τ]

(To save space, we have omitted the symbol × for multiplication.) We can
now express the output stream τ in terms of the input stream σ as follows:
τ = (2 × σ) + (3X × σ) + (−7X^2 × σ) = (2 + 3X − 7X^2) × σ

The circuit above computes — we shall also say implements — the following
function on streams:

f : IRω → IRω ,   f(σ) = (2 + 3X − 7X^2) × σ

If we supply the circuit with the input stream σ = 1 (= (1, 0, 0, 0, . . .)) then
the output stream is τ = f(1) = 2 + 3X − 7X^2. We call this the stream generated
by the circuit.
Convention 3. In order to reduce the size of the diagrams with which we depict
stream circuits, it will often be convenient to leave the operations of copying and
addition implicit. In this manner, we can, for instance, draw the circuit above
as follows:

    [diagram: the previous circuit redrawn with copying and addition left implicit; σ has three outgoing arrows (into the 2-multiplier; into the 3-multiplier and a register; into the (−7)-multiplier and two registers), and the three results all point into τ]

The (respective elements of the) stream σ gets copied along each of the three
outgoing arrows. Similarly, the stream τ will be equal to the sum of the output
streams of the three incoming arrows. This convention saves a lot of writing.
Moreover, if we want to express τ in terms of σ, we now have only three internal
streams to compute. If a node has both incoming and outgoing arrows, such as

    [diagram: a node with two incoming and two outgoing arrows]

then first the values of the output streams of the incoming arrows have to be
added; then the resulting sum is copied and given as an input stream to each of
the outgoing arrows. Consider for instance the circuit below. It has input streams
σ and τ, an intermediate stream γ, and output streams δ and ε in IRω:

    [diagram: γ is the sum of 2 × σ and a registered copy of τ; δ is γ passed through a register, and ε is γ passed through a 5-multiplier]

satisfying

γ = 2σ + (X × τ)
δ = X × γ = (2X × σ) + (X^2 × τ)
ε = 5γ = 10σ + (5X × τ)
As an example, we compute the stream function implemented by the following
circuit, with input stream σ, output stream τ , and intermediate streams γ and δ:

    [circuit diagram with internal streams γ and δ; its defining equations are listed below]

We have:
γ = X × σ
δ = (3 × σ) + (X^2 × σ)
τ = (5 × γ) + (X × δ) = (8X + X^3) × σ

Thus the stream function implemented by this circuit is f : IRω → IRω with
f(σ) = (8X + X^3) × σ, for all σ ∈ IRω. An equivalent circuit, implementing the
same stream function, is given by:

    [diagram: an equivalent circuit built from multipliers, registers, adders, and copiers]

The following proposition, of which we have omitted the easy proof, charac-
terizes which stream functions can be implemented by the type of circuits that
we have been considering so far.
Proposition 4. For all n ≥ 0 and r0 , . . . , rn ∈ IR, each of the following two
circuits:

    [diagram: a first circuit built from a copier, multipliers r0, . . . , rn, a chain of n registers, and adders]

and

    [diagram: a second, equivalent circuit built from the same components]

implements the stream function f : IRω → IRω given, for all σ ∈ IRω , by

f (σ) = ρ × σ

where the stream ρ (generated by these circuits) is the polynomial

ρ = r0 + r1 X + r2 X^2 + · · · + rn−1 X^(n−1) + rn X^n

'

3.3 Circuits with Feedback Loops


The use of feedback loops in stream circuits increases their expressive power
substantially. We shall start with an elementary example and then give a simple
and precise characterization of all stream functions that can be implemented by
circuits with feedback loops. Consider the following circuit:

    [diagram: a feedback circuit in which an adder combines the input with the output of a register; the sum is copied, one copy giving the output and the other being fed back into the register]

In spite of its simplicity, this circuit is already quite interesting. Before we


give a formal computation of the stream function that this circuit implements,
we give an informal description of its behaviour first. Assuming that we have
an input stream σ = (σ0 , σ1 , σ2 , . . .), we compute the respective elements of the
output stream τ = (τ0 , τ1 , τ2 , . . .). Recall that a register can be viewed as a
one-place memory cell with initial value 0. At moment 0, our circuit begins its
activity by inputting the first value σ0 at its input end. The present value of the
register, 0, is added to this and the result τ0 = σ0 + 0 = σ0 is the first value to be
output. At the same time, this value σ0 is copied and stored as the new value of
the register. The next step consists of inputting the value σ1 , adding the present
value of the register, σ0 , to it, and outputting the resulting value τ1 = σ0 + σ1 .
At the same time, this value σ0 + σ1 is copied and stored as the new value of
the register. The next step will input σ2 and output the value τ2 = σ0 + σ1 + σ2 .
And so on. We find:

τ = (σ0 , σ0 + σ1 , σ0 + σ1 + σ2 , . . .)

Next we show how the same answer can be obtained, more formally and
more systematically, by applying a bit of basic stream calculus. As before, we
try to express the output stream τ in terms of the input stream σ by computing
the values of intermediate streams ρ1, ρ2, ρ3 ∈ IRω, corresponding to the three
internal nodes of the circuit, such that

    [the same feedback circuit with named internal streams: ρ1 is the output of the register, ρ2 the copy fed back into the register, and ρ3 the output of the adder]

Note that the values of ρ1 , ρ2 , ρ3 are mutually dependent because of the


presence of the feedback loop: ρ3 depends on ρ1 which depends on ρ2 which
depends on ρ3 . The stream calculus developed in Section 2 is precisely fit to
deal with this type of circularity. Unfolding the definitions of the basic circuits
of which the above circuit is composed (one adder, one register, and one copier),
we find the following system of equations:

ρ 1 = X × ρ2
ρ3 = σ + ρ1
ρ2 = ρ3
τ = ρ3

We have seen in the previous section how to solve such systems of equations.
The right way to start, which will work in general, is to compute first the output
stream of the register:

ρ1 = X × ρ 2
= X × ρ3
= X × (σ + ρ1 )
= (X × σ) + (X × ρ1 )

This implies ρ1 − (X × ρ1) = X × σ, which is equivalent to ρ1 = (X/(1 − X)) × σ.
As a consequence, τ = ρ3 = ρ2 = σ + ρ1 = σ + ((X/(1 − X)) × σ) = (1/(1 − X)) × σ. Thus

the stream function f : IRω → IRω that is implemented by the feedback circuit
is given, for all σ ∈ IRω , by
f(σ) = (1/(1 − X)) × σ
We see that this function consists again of the convolution product of the
argument σ and a constant stream, 1/(1 − X). The main difference with the exam-
ples in the previous subsections is that now this constant stream is no longer
polynomial but rational.
We still have to check that the first informal and the second formal compu-
tation of the function implemented by the feedback circuit coincide. But this
follows from the fact that, for all σ ∈ IRω ,
(1/(1 − X)) × σ = (σ0, σ0 + σ1, σ0 + σ1 + σ2, . . .)
which is an immediate consequence of the definition of the convolution product
and the fact that 1/(1 − X) = (1, 1, 1, . . .).
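The step-by-step behaviour of this feedback circuit is easily simulated; the sketch below
(ours) runs it with a one-place register and confirms that it produces the stream of
partial sums, i.e. (1/(1 − X)) × σ.

def feedback_accumulator(sigma):
    out, reg = [], 0.0          # the register initially contains 0
    for x in sigma:
        y = x + reg             # adder: current input plus register content
        out.append(y)           # copier: one copy becomes the output ...
        reg = y                 # ... the other copy is stored back in the register
    return out

print(feedback_accumulator([1.0, 0.0, 0.0, 0.0]))   # (1, 1, 1, 1) = 1/(1 - X)
print(feedback_accumulator([1.0, 2.0, 3.0, 4.0]))   # (1, 3, 6, 10): partial sums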
Not every feedback loop gives rise to a circuit with a well-defined behaviour.
Consider for instance the following circuit, with input stream σ, output stream
τ , and internal streams ρ1 , ρ2 , ρ3 :

    [diagram: the same circuit, but with the register in the feedback loop replaced by a 1-multiplier]

In this circuit, we have replaced the register feedback loop of the example
above by a 1-multiplier. If we try to compute the stream function of this circuit
as before, we find the following system of equations:

ρ1 = 1 × ρ2
ρ 3 = σ + ρ1
ρ2 = ρ3
τ = ρ3

This leads to ρ3 = σ + ρ3 , which implies σ = 0. But σ is supposed to be an


arbitrary input stream, so this does not make sense.
Problems like this can be avoided by assuming that circuits have the following
property.

Assumption 5. From now on, we shall consider only circuits in which every
feedback loop passes through at least one register. '

Note that this condition is equivalent to requiring that the circuit has no
infinite paths passing through only multipliers, adders, and copiers.
Next we present the main result of the present paper. It is a characterization
of which stream functions can be implemented by finite stream circuits. We
formulate it for finite circuits that have one input and one output end, but it
can be easily generalised to circuits with many inputs and outputs.

Theorem 6. (a) Let C be any finite stream circuit, possibly containing feedback
loops (that always pass through at least one register). The stream function
f : IRω → IRω implemented by C is always of the form:

f (σ) = ρ × σ

for all σ ∈ IRω and for some fixed rational stream

r0 + r 1 X + r 2 X 2 + · · · + r n X n
ρ=
s0 + s1 X + s2 X 2 + · · · + sm X m

with n, m ≥ 0, r0 , . . . , rn , s0 , . . . , sm ∈ IR, and s0 = 0.


(b) Let f : IRω → IRω be a stream function of the form, for all σ ∈ IRω :
f (σ) = ρ × σ
for some fixed rational stream ρ. Then there exists a finite stream circuit C
that implements f .
Proof. (a) Consider a finite circuit C containing k ≥ 1 registers. We associate
with the input end of C a stream σ and with the output end of C a stream τ .
With the output end of each register Ri , we associate a stream αi . For the input
end of each register Ri , we look at all incoming paths that: (i) start in either
an output end of any of the registers or the input end of C, (ii) lead via adders,
copiers, and multipliers, (iii) to the input end of Ri . Because of Assumption 5,
there are only finitely many of such paths. This leads to an equation of the form
αi = (a1i × X × α1 ) + · · · + (aki × X × αk ) + (ai × X × σ)
for some ai , aji ∈ IR. We have one such equation for each 1 ≤ i ≤ k. Solving this
system of k equations in stream calculus as before, yields for each register an
expression αi = ρi × σ, for some rational stream ρi . Finally, we play the same
game for τ , at the output end of C, as we did for each of the registers. This will
yield the following type of expression for τ :
τ = (b1 × α1 ) + · · · + (bk × αk ) + (b × σ)
= ((b1 × ρ1 ) + · · · + (bk × ρk ) + b) × σ
for some b, bi ∈ IR, which proves (a). For (b), we treat only the special case that
ρ = (r0 + r1 X + r2 X^2 + r3 X^3) / (1 + s1 X + s2 X^2 + s3 X^3)
where we have taken n = m = 3 and s0 = 1. The general case is not more
difficult, just more writing. We claim that the following circuit implements the
function f (σ) = ρ × σ (all σ ∈ IRω ):

    [diagram: ρ0 is formed by adding σ and the feedback taps −s1·ρ1, −s2·ρ2, −s3·ρ3; a chain of three registers produces ρ1, ρ2, ρ3 from ρ0; the output τ is the sum of the forward taps r0·ρ0, r1·ρ1, r2·ρ2, r3·ρ3]

where we have denoted input and output streams by σ and τ , and intermediate
streams by ρ0 , ρ1 , ρ2 , ρ3 . They satisfy the following equations:
ρ0 = σ − (s1 × ρ1 ) − (s2 × ρ2 ) − (s3 × ρ3 )
ρ1 = X × ρ0
ρ2 = X × ρ1
ρ3 = X × ρ2
τ = (r0 × ρ0 ) + (r1 × ρ1 ) + (r2 × ρ2 ) + (r3 × ρ3 )
It follows that

ρ0 = σ − (s1 X × ρ0) − (s2 X^2 × ρ0) − (s3 X^3 × ρ0)

As a consequence, we have, for i = 0, 1, 2, 3, that

ρi = (X^i / (1 + s1 X + s2 X^2 + s3 X^3)) × σ

This implies

τ = ((r0 + r1 X + r2 X^2 + r3 X^3) / (1 + s1 X + s2 X^2 + s3 X^3)) × σ

whereby the claim above is proved. '
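The circuit used in this proof can also be run operationally. The following sketch
(ours, for the special case n = m = 3 and s0 = 1 treated above) keeps the three
register contents ρ1, ρ2, ρ3 in program variables and, fed with the input σ = 1,
generates the corresponding rational stream.

def rational_circuit(r, s, sigma):
    # r = (r0, r1, r2, r3) and s = (s1, s2, s3), with s0 = 1; sigma is a finite prefix
    r0, r1, r2, r3 = r
    s1, s2, s3 = s
    rho1 = rho2 = rho3 = 0.0                     # register contents, initially 0
    out = []
    for x in sigma:
        rho0 = x - s1 * rho1 - s2 * rho2 - s3 * rho3   # input plus feedback taps
        out.append(r0 * rho0 + r1 * rho1 + r2 * rho2 + r3 * rho3)  # forward taps
        rho1, rho2, rho3 = rho0, rho1, rho2      # each register delays one step
    return out

one = [1.0] + [0.0] * 7                          # the stream 1 = (1, 0, 0, ...)
print(rational_circuit((1, 0, 0, 0), (-1, 0, 0), one))   # 1/(1 - X)   = (1, 1, 1, ...)
print(rational_circuit((1, 0, 0, 0), (-2, 1, 0), one))   # 1/(1 - X)^2 = (1, 2, 3, ...)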

Taking σ = 1 in Theorem 6 gives the following corollary.

Corollary 7. A stream ρ ∈ IRω is rational if and only if it is generated by a


(finite) stream circuit. '

References
[AR03] F. Arbab and J.J.M.M. Rutten. A coinductive calculus of component con-
nectors. In M. Wirsing, D. Pattinson, and R. Hennicker, editors, Proceedings
of WADT 2002, volume 2755 of LNCS, pages 35–56. Springer, 2003.
[BR88] J. Berstel and C. Reutenauer. Rational series and their languages, volume 12
of EATCS Monographs on Theoretical Computer Science. Springer-Verlag,
1988.
[Lah98] B.P. Lahti. Signal Processing & Linear Systems. Oxford University Press,
1998.
[Rut01] J.J.M.M. Rutten. Elements of stream calculus (an extensive exercise in coin-
duction). In S. Brooks and M. Mislove, editors, Proceedings of MFPS 2001,
volume 45 of ENTCS, pages 1–66. Elsevier Science Publishers, 2001. To ap-
pear in MSCS.
Synchronous Closing and Flow Analysis for
Model Checking Timed Systems

Natalia Ioustinova1 , Natalia Sidorova2 , and Martin Steffen3


1 Department of Software Engineering, CWI
  P.O. Box 94079, 1090 GB Amsterdam, The Netherlands
  Natalia.Ioustinova@cwi.nl
2 Department of Mathematics and Computer Science
  Eindhoven University of Technology
  Den Dolech 2, P.O. Box 513, 5612 MB Eindhoven, The Netherlands
  n.sidorova@tue.nl
3 Institute of Computer Science and Applied Mathematics
  Christian-Albrechts-Universität
  Hermann-Rodewaldstr. 3, 24118 Kiel, Germany
  ms@informatik.uni-kiel.de

Abstract. Formal methods, in particular model checking, are increas-
ingly accepted as integral part of system development. With large soft-
ware systems beyond the range of fully automatic verification, however,
a combination of decomposition and abstraction techniques is needed. To
model check components of a system, a standard approach is to close the
component with an abstraction of its environment, as standard model
checkers often do not handle open reactive systems directly. To make
it useful in practice, the closing of the component should be automatic,
both for data and for control abstraction. Specifically for model checking
asynchronous open systems, external input queues should be removed,
as they are a potential source of a combinatorial state explosion.
In this paper we investigate a class of environmental processes for
which the asynchronous communication scheme can safely be replaced by
a synchronous one. Such a replacement is possible only if the environment
is constructed under a rather severe restriction on the behavior, which
can be partially softened via the use of a discrete-time semantics. We
employ data-flow analysis to detect instances of variables and timers
influenced by the data passing between the system and the environment.

Keywords: formal methods, software model checking, abstraction, flow
analysis, asynchronous communication, open components, program trans-
formation

1 Introduction
Model checking [8] is well-accepted for the verification of reactive systems. To al-
leviate the notorious state-space explosion problem, a host of techniques has been
invented, including partial-order reduction [12, 32] and abstraction [23, 8, 10].


As standard model checkers, e.g., Spin [16], cannot handle open systems, one
has to construct a closed model, and a problem of practical importance is how
to close open systems. This is commonly done by adding an environment pro-
cess that must exhibit at least all the behavior of the real environment. In the
framework of the assume-guarantee paradigm, the environment should model the
behavior corresponding to the verified properties of the components forming the
environment. However, the way of closing should be well-considered to counter
the state-space explosion problem. This is especially true in the context of model
checking programs with an asynchronous message-passing communication model
—- sending arbitrary message streams to the unbounded input queues would im-
mediately lead to an infinite state space, unless some assumptions restricting the
environment behavior are incorporated in the closing process. Even so, adding
an environment process may result in a combinatorial explosion caused by all
combinations of messages in the input queues.
A desirable solution would be to construct an environment that communicates
with the system synchronously. In [29] such an approach is considered for the sim-
plest safe abstraction of the environment, the chaotically behaving environment:
the outside chaos is embedded into the system’s processes, which corresponds
to the synchronous communication scheme. Though useful at a first verification
phase, the chaotic environment may be too general. Here, we investigate for what
kind of processes, apart from the chaotic one, the asynchronous communication
can be safely replaced with the synchronous one. To make such a replacement
possible, the system should not be reactive: it should either only send or only
receive messages. However, when we restrict our attention to systems with the
discrete-time semantics like the ones of [15, 3], this requirement can be softened
in that the restrictions are imposed on time slices instead of whole runs: In every
time slice, the environmental process can either only receive messages, or it can
both send and receive messages under condition that inputs do not change the
state of the environment process.
Another problem the closing must address is that the data carried with the
messages are usually drawn from some infinite data domains. For data abstrac-
tion, we combine the approaches from [29] and [17]. The main idea is to condense
data exchanged with the environment into a single abstract value, to deal with
the infinity of environmental data. We employ data-flow analysis to detect in-
stances of chaotically influenced variables and timers and remove them. Based
on the result of the data flow analysis, the system S is transformed into a closed
system that shows more behavior in terms of traces than the original one.
For formulas of next-free LTL [26, 22], we thus get the desired property preser-
vation: if the closed system satisfies ϕ, then S |= ϕ.
The main target applications are protocols specified in SDL (Specification and
Description Language) [28]. As verification tool, we use the well-known Spin
model checker. Our method is implemented as transformations of Promela-
programs, Spin’s input language. With this tool we show experiments on a
real-life protocol to estimate the effect of removing queues on the state space.

The rest of the paper is organized as follows. In Section 2 we fix the syntax
and semantics of the language. In Section 3 we describe under which condi-
tion the asynchronous communication with the environment can be replaced by
a synchronous one. In Section 4 we abstract from the data exchanged with the
environment and give a data-flow algorithm to optimize the system for model
checking. In Section 5 we show some experimental results and in Section 6 we
discuss related and future work.

2 Semantics
In this section we fix syntax and operational semantics we work with. Our model
is based on asynchronously communicating state machines with top-level con-
currency. The communication is done via channels and we assume a fixed set
Chan of channel names for each program, with c, c , . . . as typical elements. The
set of channel names is partitioned into input channels Chan i and output chan-
nels Chan o , and we write ci , co , . . . to denote membership of a channel to one of
n
these classes. A program Prog is given as the parallel composition Πi=1 Pi of a
finite number of processes.
A process P is described by a tuple (in, out, Var , Loc, Edg, σinit ), where in
and out are finite sets of input resp. output channel names of the process, Var
denotes a finite set of variables, Loc denotes a finite set of locations or control
states, and σinit is the initial state. We assume the sets of variables Var i of
processes Pi in a program Prog = Π_{i=1}^{n} Pi to be disjoint. An edge of the state
machine describes a change of state by performing an action from a set Act; the
set Edg ⊆ Loc × Act × Loc denotes the set of edges. For an edge (l, α, ˆl) ∈ Edg
of P , we write more suggestively l −→α ˆl.
A mapping from variables to values is called a valuation; we denote the set
of valuations by Val = Var → D. We assume standard data domains such as
N, Bool , etc., where we write D when leaving the data domain unspecified, and
we silently assume all expressions to be well-typed. A location together with
a valuation of process variables define a state of a process. The set of process
states is defined as Σ = Loc × Val , and each process has one designated initial
state σinit = (linit , ηinit ).
Processes communicate by exchanging signals (carrying values) over chan-
nels. Signals coming from the environment form the set of external signals Sig ext .
Signals that participate in the communication within the system belong to the
set Sig int of internal signals. Note that both signal sets are not necessarily dis-
joint.
As untimed actions, we distinguish (1) input over a channel c of a signal s
containing a value to be assigned to a local variable, (2) sending over a channel c
a signal s together with a value described by an expression, and (3) assignments.
We assume the inputs to be unguarded, while output and assignment are guarded
by a boolean expression g, its guard. The three classes of actions are written as
c?s(x), g  c!s(e), and g  x := e, respectively, and we use α, α . . . when leaving
an action unspecified.

Table 1. Step semantics for one process

Input:
    l −→c?s(x) l̂ ∈ Edg
    -----------------------------------------
    (l, η) →ci?(s,v) (l̂, η[x → v])

Discard:
    l −→c?s(x) l̂ ∈ Edg  ⇒  s ≠ s′
    -----------------------------------------
    (l, η) →ci?(s′,v) (l, η)

Output:
    l −→g  c!(s,e) l̂ ∈ Edg    [[g]]η = true    [[e]]η = v
    -----------------------------------------
    (l, η) →co!(s,v) (l̂, η)

Assign:
    l −→g  x:=e l̂ ∈ Edg    [[g]]η = true    [[e]]η = v
    -----------------------------------------
    (l, η) →τ (l̂, η[x → v])

Set:
    l −→g  set t:=e l̂ ∈ Edg    [[g]]η = true    [[e]]η = v
    -----------------------------------------
    (l, η) →τ (l̂, η[t → on(v)])

Reset:
    l −→g  reset t l̂ ∈ Edg    [[g]]η = true
    -----------------------------------------
    (l, η) →τ (l̂, η[t → off])

TickP:
    blocked(σ)
    -----------------------------------------
    σ →tick σ[t → (t−1)]

Timeout:
    l −→gt  reset t l̂ ∈ Edg    [[t]]η = on(0)
    -----------------------------------------
    (l, η) →τ (l̂, η[t → off])

TDiscard:
    (l −→α l̂ ∈ Edg  ⇒  α ≠ gt  reset t)    [[t]]η = on(0)
    -----------------------------------------
    (l, η) →τ (l, η[t → off])

Time aspects of a system behavior are specified by actions dealing with
timers. Each process has a finite set of timer variables (with typical elements
t, t1 , . . .). A timer can be either set to a value, i.e., activated to run for the
designated period, or reset, i.e., deactivated, which corresponds to timer values
on(n), off , respectively. Setting and resetting are expressed by guarded actions
of the form g  set t := e and g  reset t. If a timer expires, i.e., the value of a
timer becomes zero, it can cause a timeout, upon which the timer is reset. The
timeout action is denoted by gt  reset t, where the timer guard gt expresses the
fact that the action can only be taken upon expiration.
The behavior of a single process is then given by sequences of states σinit =
σ0 →λ σ1 →λ . . . starting from the initial one. The step semantics is given as a
labelled transition relation →λ ⊆ Σ × Lab × Σ between states. The set of labels
Lab is formed by τ -labels for internal steps, tick -labels for time progression and
communication labels. Communication labels, either input or output, are of the
form c?(s, v) resp. c!(s, v). Depending on the location, the valuation, and the
potential next actions, the possible successor states are given by the rules of
Table 1.
Inputting a signal with a value via a channel means reading a value belonging
to a matching signal from the channel and updating the local valuation accord-
ingly (cf. rule Input), where η [x → v] stands for the valuation equaling η for all
y ∈ Var except for x ∈ Var , where η [x → v](x) = v holds instead. A specific
feature commonly used for communicating finite state machines (e.g. in SDL-
92 [27]) is captured by rule Discard: If the input value cannot be reacted upon
at the current control state, i.e., if there is no input action originating from the
location treating this signal, then the message is just discarded, leaving control
state and valuation unchanged. The automaton is therefore input-enabled: it
cannot refuse to accept a message; it may throw it away, though.
Unlike inputs, outputs are guarded, so sending a message involves evaluat-
ing the guard and the expression according to the current valuation (cf. rule
Output). Assignment in Assign works analogously, except that the step is in-
ternal. We assume for the non-timer guards, that at least one of them evaluates
to true in each state. At the SDL source language, this assumption corresponds
to the natural requirement that each conditional construct must cover all cases,
for instance by having at least a default branch. The system should not block
because of a non-covered alternative in a decision-construct [25].
Concerning the temporal behavior, timers are treated in valuations as vari-
ables, distinguishing active and deactivated timer. We use off to represent in-
active timers. The value of an active timer shows the delay left until timer
expiration. The set-command activates a timer, setting its value to the specified
period until timer expiration, and reset deactivates the timer. Both actions are
guarded (cf. rules Set and Reset). A timeout may occur, if an active timer has
expired, i.e., reached zero (cf. rule Timeout).
Time elapses by counting down active timers till zero, which happens in case
no untimed actions are possible. In rule TickP , this is expressed by the predicate
blocked on states: blocked (σ) holds if no move is possible except either a tick -
step or a reception of a message, i.e., if σ →λ for some label λ, then λ = tick or
λ = c?(s, v). In other words, the time-elapsing steps are those with least priority.
The counting down of the timers is written η [t →(t−1)], by which we mean, all
currently active timers are decreased by one, i.e., on(n + 1) − 1 = on(n), non-
active timers are not affected. Note that the operation is undefined for on(0),
which is justified later by Lemma 1.
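A minimal executable reading of the timer rules (ours, not the authors' DTSpin
implementation) is sketched below: timers are either off or on(n), a tick-step is only
taken in blocked states, and then every active timer is decreased by one, so that a
blocked state never contains on(0).

OFF = ("off",)

def on(n):
    return ("on", n)

def timeout_enabled(timers):
    # a timer at on(0) enables a Timeout (or TDiscard) step
    return any(v == on(0) for v in timers.values())

def blocked(timers, other_steps_enabled=False):
    # blocked(sigma): no tau- or output-step is possible, in particular no timeout
    return not other_steps_enabled and not timeout_enabled(timers)

def tick(timers):
    # eta[t -> (t - 1)]: decrease every active timer; undefined on on(0), cf. Lemma 1
    assert not timeout_enabled(timers), "on(0) - 1 is undefined"
    return {t: on(v[1] - 1) if v[0] == "on" else OFF for t, v in timers.items()}

eta = {"t1": on(2), "t2": OFF}
while blocked(eta):
    eta = tick(eta)             # time may pass only in blocked states (rule TickP)
    print(eta)
# after two ticks t1 has reached on(0): the process is no longer blocked and has to
# take the Timeout (or TDiscard) step, which resets t1 to off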
In SDL, timeouts are often considered as specific timeout messages kept in
a queue like any other message, and timer-expiration consequently is seen as
adding a timeout-message to the queue. We use an equivalent presentation of
this semantics, where timeouts are not put into the input queue, but are modelled
more directly by guards. The equivalence of timeouts-by-guards and timeouts-as-
messages in the presence of SDL’s asynchronous communication model is argued
for in [3]. The time semantics for SDL chosen here is not the only one conceivable
(see e.g. [6] for a broader discussion of the use of timers in SDL). The semantics
we use is the one described in [15, 3], and is also implemented in DTSpin [2, 11],
a discrete time extension of the Spin model checker.
In the asynchronous communication model, a process receives messages via
channels modelled as queues. We write ε for the empty queue; (s, v) :: q denotes
a queue with message (s, v) at the head of the queue, i.e., (s, v) is the message
to be input next; likewise the queue q ::(s, v) contains (s, v) most recently en-
tered; Q denotes the set of possible queues. We model the queues implementing
asynchronous channels explicitly as separate entities of the form (c, q), consist-
ing of the channel name together with its queue content. We sometimes refer to
the channel process (c, q) just by its name c. We require the input and the
output channel names of a channel c to be in(c) = co and out(c) = ci, respectively. The
operational rules for queues are shown in Table 2.
In analogy to the tick -steps for processes, a queue can perform a tick -step iff
the only steps possible are input or tick-steps, as captured again by the blocked -
predicate (cf. rule Tick). Note that a queue is blocked and can therefore tick
only if it is empty, and that a queue does not contain any timers. Hence, the
counting down operation [t →(t−1)] has no effect and is therefore omitted in the
rule TickQ of Table 2.

Table 2. Step semantics for a channel c

In:
    -----------------------------------------
    (c, q) →co?(s,v) (c, q ::(s, v))

TickQ:
    blocked(c, q)
    -----------------------------------------
    (c, q) →tick (c, q)

Out:
    -----------------------------------------
    (c, (s, v) :: q) →ci!(s,v) (c, q)
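The three channel rules admit an equally small executable sketch (ours): a channel can
always input, can output only when non-empty, and can tick exactly when it is empty.

from collections import deque

class Channel:
    def __init__(self, name):
        self.name, self.q = name, deque()

    def input(self, signal, value):
        # rule In: (c, q) --c_o?(s,v)--> (c, q :: (s, v)); always enabled
        self.q.append((signal, value))

    def output(self):
        # rule Out: (c, (s, v) :: q) --c_i!(s,v)--> (c, q); enabled only if non-empty
        return self.q.popleft()

    def blocked(self):
        # only input- or tick-steps are possible, i.e. the queue is empty
        return len(self.q) == 0

    def tick(self):
        # rule TickQ: time may pass only in blocked (empty) channels
        assert self.blocked(), "a non-empty channel cannot tick"

c = Channel("c")
c.input("s", 42)
print(c.output())     # ('s', 42)
c.tick()              # allowed again, since the queue is now empty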

A global semantics of a system S is given by a parallel composition of labelled


transition systems modelling processes and channels of the specification. The
semantics of the parallel composition S = S1  . . .  Sn is given by the rules of
Table 3, where ext(S) is used to denote the set of external channel names. Since
we assumed the variable sets of the components to be disjoint, the combined
state is defined as the product. We write [[x]]σ for the value [[x]]σi of x, for one

Table 3. Parallel composition

Comm:
    σi →c!(s,v) σ̂i    σj →c?(s,v) σ̂j    i ≠ j
    -----------------------------------------
    (. . . , σi , . . . , σj , . . .) →τ (. . . , σ̂i , . . . , σ̂j , . . .)

Tick:
    σ1 →tick σ̂1    . . .    σn →tick σ̂n
    -----------------------------------------
    (σ1 , . . . , σn ) →tick (σ̂1 , . . . , σ̂n )

Interleavein:
    σi →c?(s,v) σ̂i    c ∈ ext(S)
    -----------------------------------------
    (. . . , σi , . . .) →c?(s,v) (. . . , σ̂i , . . .)

Interleaveout:
    σi →c!(s,v) σ̂i    c ∈ ext(S)
    -----------------------------------------
    (. . . , σi , . . .) →c!(s,v) (. . . , σ̂i , . . .)

Interleaveτ:
    σi →τ σ̂i
    -----------------------------------------
    (. . . , σi , . . .) →τ (. . . , σ̂i , . . .)
state σi being part of σ; analogously, we use the notation [[e]]σ for the value of
e in σ. The initial state of a parallel composition is given by the array of initial
process states together with (c, ε) for channels in Chan. We call a sequence
σinit = σ0 →_{λ0} σ1 →_{λ1} . . . starting from an initial state a run.
Communication between two processes is done by exchanging a common sig-
nal and a value over a channel. According to the syntactic restriction on the use
of communication labels, only synchronisation between a process and a chan-
nel may happen. Sending of a signal over the channel means synchronising an
output step of the process with an input step of the queue, i.e. a co !(s, v) step
of the process is synchronised with a co ?(s, v) step of the channel c. Receiving
is accomplished by synchronising an output step of the channel, which removes the first
element from the channel queue, with an input step of the process. As defined by the
rule Comm of Table 3, systems perform common steps synchronously. The result
of communication is relabelled to τ .
Communication steps of two partners may synchronise if they use the same
channel name. Communication steps may be interleaved as in rules Interleave_in
and Interleave_out, provided the channel name belongs to the set of external
channel names ext(S) of the system. As far as τ-steps are concerned, each component
can act on its own according to rule Interleave_τ.

Lemma 1. Let S be a system and σ ∈ Σ one of its states.

1. If σ →_tick σ′, then [[t]]σ ≠ on(0), for all timers t.
2. If σ →_tick σ′, then for all channel states (c, q), q = ε.

Proof. For part (1), if [[t]]η = on(0) for a timer t in a process P, then a τ-step is
allowed due to either Timeout or TDiscard of Table 1. Hence, the system is
not blocked and therefore cannot do a tick-step.
Part (2) follows from the fact that a channel can perform a tick-step
exactly when it is empty. ⊓⊔

The following lemma expresses that the blocked predicate is compositional


in the sense that the parallel composition of processes is blocked iff each process
is blocked (cf. rule Tick of Table 3).

Lemma 2. For a state σ = (σ1 , . . . , σn ) of a system S, blocked (σ) iff blocked (σi )


for all σi .

Proof. If σ is not blocked, it can perform a τ-step or an output step. The output
step must originate from a process, which thus is not blocked. The τ-step is
either caused by a single process or by a synchronising action of a sender and
a receiver; in both cases at least one process is not blocked. For the reverse
direction, a τ-step of a single process, which is thus not blocked, entails that σ is
not blocked. An output step of a single process causes σ either to do the same
output step or, in case of internal communication, to do a τ-step. In both cases,
σ is not blocked. ⊓⊔

3 Replacing Asynchronous with Synchronous Communication
In a system with asynchronous communication, introducing an environment pro-
cess can lead to a combinatorial explosion caused by all combinations of messages
in the queues modelling channels. An ideal solution from the perspective of the
state space would be to construct an environment process that communicates
with the system synchronously. In this section, we specify under which condi-
tions we can safely replace the asynchronous communication with an outside
environment process, say E, by synchronous communication.
A general condition an asynchronously communicating process satisfies is that
the process is always willing to accept messages, since the queues are unbounded.
Hence, the environment process must be at least input enabled: it must always
be able to receive messages, lest the synchronous composition lead to more
blockings. Thanks to the Discard-rule of Table 1, SDL-processes are input
enabled, i.e., at least input-discard steps are possible, which throw away the
message and do not change the state of the process. Another effect of an input
queue is that the queue introduces an arbitrary delay between the reception of
a message and the future reaction of the receiving process to this message. For
an output, the effect is the converse. This means that, in order not to miss any
potential behavior, the asynchronous process can be replaced by the analogous
synchronous process as long as there are either only input actions or else only
output actions, so that the process is not reactive.1 This is related to the so-called
Brock-Ackerman anomaly, characterizing the difference between buffered and
unbuffered communication [7].
Disallowing reactive behavior is clearly a severe restriction and only moderately
generalizes completely chaotic behavior. One feature of the timed semantics,
though, allows us to loosen this restriction. Time progresses by tick-steps
when the system is blocked. This especially means that when a tick happens, all
queues of a system are empty (cf. Lemma 1). This implies that the restrictions
need to apply only per time slice, i.e., to the steps between two ticks,2 and not
to the overall process behavior. Additionally we require that there are no infinite
sequences of steps without a tick, i.e., there are no runs with zero-time cycles.
This leads to the following definition.

Definition 3. A sequence of steps is tick-separated iff it contains no zero-time


cycle, and for every time slice of the sequence one of the following two conditions
holds:

1 A more general definition would require that the process actions satisfy a confluence
condition as far as the input and output actions are concerned, i.e., doing an input
action does not invalidate the possibility of an output action, and vice versa. Also
in this case, the process is not reactive, since there is no feed-back from input to
output actions.
2 A time slice of a run is a maximal subsequence of the run without tick-steps.

Table 4. Synchronous communication over rendezvous channel c

  Comm_sync:  γ1 →_{c_i?(s,v)} γ̂1,  γ2 →_{c_o!(s,v)} γ̂2
              implies  (γ1, γ2) →_τ (γ̂1, γ̂2)

1. the time slice contains no output action;


2. the time slice contains no output over two different channels, and all locations
in the time slice are input-discarding wrt. all inputs of that time slice.
We call a process tick-separated if all its runs are tick-separated.
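As an illustration of Definition 3, the following hypothetical Python sketch splits a finite label sequence into time slices and checks a simplified version of the two conditions (only the no-output and single-output-channel parts; the input-discarding side condition on locations is not modelled):

def time_slices(labels):
    """Split a finite label sequence at the tick labels."""
    slice_, slices = [], []
    for lab in labels:
        if lab == "tick":
            slices.append(slice_)
            slice_ = []
        else:
            slice_.append(lab)
    slices.append(slice_)
    return slices

def tick_separated(labels):
    # Labels are represented as ('in', c), ('out', c) or 'tick'.
    for sl in time_slices(labels):
        outputs = {c for (kind, c) in sl if kind == "out"}
        # condition 1: no output at all, or
        # condition 2: all outputs over a single channel
        if len(outputs) > 1:
            return False
    return True

run = [("in", "a"), ("out", "c"), "tick", ("in", "b"), ("in", "a")]
print(tick_separated(run))   # True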
Further we consider a synchronous version Ps and an asynchronous version
Pa of a process P , where Ps is the process P together with a set of rendezvous
channels, and Pa is formed by the process P together with a set of channels with
the same names as for Ps but which are queues. Synchronous communication
over a rendezvous channel c is defined by rule Commsync of Table 4.
In the following, we call a configuration the combined state of a process
together with the state of its channels. So given Ps and Pa and two corresponding
states γs = σs and γa = (σa, (c_i, q_i), (c_o^1, q_1), . . . , (c_o^k, q_k)), we define the relation ≼
by γa ≼ γs if σs = σa. Comparing the observable behavior of an asynchronous and
a synchronous process, we must take into account that the asynchronous one
performs more internal steps when exchanging messages with its queues. Hence
the comparison is based on a weak notion of equivalence, ignoring the τ-steps:
we define a weak step ⇒λ as →τ* →λ →τ* when λ ≠ τ, and as →τ* otherwise. Correspondingly,
⇒λ⃗ denotes a sequence of weak steps with labels from a label sequence λ⃗.

Lemma 4. Assume a synchronous and an asynchronous version Ps and Pa of
a process P and corresponding states γs and γa with γa ≼ γs, where the queues
of γa are all empty. If γa ⇒λ⃗ γa′ by a tick-separated sequence, where λ⃗ does not
contain a tick-label, and where the queues of γa′ are empty, then there exists a
sequence γs ⇒λ⃗ γs′ with γa′ ≼ γs′.

Proof. We are given a sequence γa = γ0^a →_{λ0} γ1^a . . . →_{λ_{n−1}} γn^a = γa′, with the
queues of γ0^a and γn^a empty. According to the definition of tick-separation, we
distinguish the following two cases:
Case 1: λi ∉ {tick, c!(s, v)}, for all 1 ≤ i ≤ n − 1
To get a matching reduction sequence of the synchronous system starting at
γ0^s, we apply the following renaming scheme. Input actions γa →_{c?(s,v)} γa′ into the
queue are just omitted (which means they are postponed for the synchronous
process). τ-steps γa →_τ γa′, inputting a value from the queue into the process,
i.e., τ-steps justified by rule Comm where the process does a step σ →_{c?(s,v)} σ′
by rule Input and the queue the corresponding output step by rule Out, are
replaced by a direct input step γs →_{c?(s,v)} γs′. Process-internal τ-steps of the
asynchronous system are identically taken by the synchronous system as well.
τ-steps caused by output actions from the process into a queue need not be dealt
with, since the sequence from γ0^a to γn^a does not contain external output from the
queues, and the queues are empty at the beginning and the end of the sequence.
It is straightforward to see that the sequence of steps obtained by this
transformation is indeed a legal sequence of the synchronous system. Moreover,
the last configurations have the same state component and, due to the
non-lossiness and the FIFO-behavior of the input queue, both sequences coincide
modulo τ-steps.
Case 2: no output over two different channels, input-discarding locations (and
no tick-steps)
Similar to the previous case, the synchronous system can mimic the behavior
of the asynchronous one adhering to the following scheme: τ-steps γa →_τ γa′,
feeding a value from the process into the queue, i.e., τ-steps justified by rule
Output where the process does a step σ →_{c!(s,v)} σ′ and the queue the corresponding
input step by rule In, are replaced by a direct output step γs →_{c!(s,v)} γs′.
Input actions γa →_{c?(s,v)} γa′ into the queue are mimicked by a discard-step. Output
steps from the queue of the asynchronous system are omitted, and so are
τ-steps caused by internal communication from the input queue to the process.
All other internal steps are identically taken in both systems. The rest of the
argument is analogous to the previous case. ⊓⊔

Note that γa ≼ γs means that γs is blocked whenever γa is blocked.
We write [[P]]wtrace to denote the set of all weak traces of process P. To prove
that for tick-separated processes [[Ps]]wtrace = [[Pa]]wtrace, we introduce a notion
of tick-simulation that captures the ability to simulate any sequence of steps up
to a tick step, i.e., the chosen level of granularity is time slices, and only the states
immediately before and after a tick are of importance there. (Remember that we
assume the absence of zero-time cycles.)

Definition 5. A binary relation R ⊆ Γ1 × Γ2 on two sets of states is called a
tick-simulation when the following conditions hold:
1. If γ1 R γ2 and γ1 →_tick γ1′, then γ2 →_tick γ2′ and γ1′ R γ2′.
2. If γ1 R γ2 and γ1 ⇒λ⃗ γ1′ for some γ1′ with blocked(γ1′), where λ⃗ does not
contain tick, then γ2 ⇒λ⃗ γ2′ for some γ2′ with blocked(γ2′) and γ1′ R γ2′.
We write γ1 ⊑_tick γ2 if there exists a tick-simulation R with γ1 R γ2, and
similarly for processes, P1 ⊑_tick P2 if their initial states are in that relation.

Theorem 6. If a process P is tick-separated, then [[Ps ]]wtrace = [[Pa ]]wtrace .

Proof. There are two directions to show. [[Ps]]wtrace ⊆ [[Pa]]wtrace is immediate:
each communication step of the synchronous process Ps can be mimicked by
the buffered Pa by adding an internal τ-step for the communication with the
buffer.

For the reverse direction [[Pa]]wtrace ⊆ [[Ps]]wtrace we show that Pa is simulated
by Ps according to the definition of tick-simulation, which considers as basic steps
only tick-steps or else the sequence of steps within one time slice.
We define the relation R ⊆ Γa × Γs by (σa, ((c0, q0), . . . , (cm, qm))) R σs iff
σa = σs and qi = ε for all queues modelling the channels. To show that R is
indeed a tick-simulation, assume γa = (σa, ((c0, ε), . . . , (cm, ε))) and γs = σs
with γa R γs. There are two cases to consider.
Case: γa →_tick γa′
where γa′ = γa[t ↦ (t−1)]. By the definition of the tick-step, blocked(γa) must hold,
i.e., there are no steps enabled except input from the outside or tick-steps. Since
immediately blocked(γs), also γs →_tick γs[t ↦ (t−1)], which concludes the case.
Case: γa ⇒λ⃗ γa′
where blocked(γa′) and λ⃗ does not contain a tick-label. The case follows directly
from Lemma 4 and the fact that γa′ ≼ γs′ where γa′ is blocked implies that also
γs′ is blocked.
Since clearly the initial states are in relation R as defined above, this gives
Pa ⊑_tick Ps. Since Pa ⊑_tick Ps, each tick-step of Pa can be mimicked by a
tick-step of Ps and each weak step ⇒λ⃗ of Pa can also be mimicked by Ps. This
implies [[Pa]]wtrace ⊆ [[Ps]]wtrace, as required. ⊓⊔

4 Abstracting Data
Originating from an unknown or underspecified environment, signals from out-
side can carry any value, which renders the system infinite state. Assuming
nothing about the data means one can conceptually abstract values from out-
side into one abstract “chaotic” value, which basically means to ignore these
data and focus on the control structure. Data not coming from outside is left
untouched, though chaotic data from the environment influence internal data
of the system. In this section, we present a straightforward dataflow analysis
marking variable and timer instances that may be influenced by the environ-
ment, namely we establish for each process- and timer-variable in each location
whether
1. the variable is guaranteed to be non-chaotic, or
2. the variable is guaranteed to be influenced by the outside, or
3. its status depends on the actual run.
The analysis is a combination of the ones from [29] and [17].

4.1 Dataflow Analysis


The analysis works on a simple flow graph representation of the system, where
each process is represented by a single flow graph, whose nodes n ∈ nodes are
associated with the process’ actions and the flow relation captures the intra-
process data dependencies. Since the structure of the language we consider is
rather simple, the flow-graph can be easily obtained by standard techniques.

We use an abstract representation of the data values, where ⊤ is interpreted
as a value chaotically influenced by the environment and ⊥ stands for a non-chaotic
value. We write η^α, η1^α, . . . for abstract valuations, i.e., for typical elements from
Val^α = Var → {⊤, ⊥}. The abstract values are ordered ⊥ ≤ ⊤, and the order is
lifted pointwise to valuations. With this ordering, the set of valuations forms a
complete lattice, where we write η⊥ for the least element, given as η⊥(x) = ⊥ for
all x ∈ Var, and we denote the least upper bound of η1^α, . . . , ηn^α by ⋁_{i=1}^{n} ηi^α (or
by η1^α ∨ η2^α in the binary case). By slight abuse of notation, we will use the same
symbol η^α for the valuation per node, i.e., for functions of type node → Val^α.
The abstract valuation [[e]]η^α of an expression e equals ⊥ iff all variables in e
evaluate to ⊥; [[e]]η^α is ⊤ iff the abstract valuation of at least one of the
variables in e is ⊤.
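A minimal Python sketch of this two-point lattice and of the strict abstract evaluation of expressions (the encoding of ⊥/⊤ as 0/1 and the function names are ours, purely for illustration):

BOT, TOP = 0, 1            # ⊥ ≤ ⊤, encoded as 0 ≤ 1

def join(*values):         # least upper bound
    return max(values, default=BOT)

def abs_eval(expr_vars, eta):
    """[[e]] is ⊤ iff at least one variable of e is ⊤, and ⊥ otherwise.
    expr_vars is the set of variables occurring in the expression e."""
    return join(*(eta[x] for x in expr_vars))

eta = {"x": BOT, "y": TOP}
print(abs_eval({"x"}, eta))        # 0, i.e. ⊥
print(abs_eval({"x", "y"}, eta))   # 1, i.e. ⊤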
Depending on whether we are interested in an answer to point (1) or point (2)
from above, ⊤ is interpreted as a variable potentially influenced from outside,
and, dually for the second case, ⊤ stands for variables guaranteed to be influenced
from outside. Here we present may and must analyses for the first and the second
case, respectively.

May Analysis. First we consider the may analysis that marks variables potentially
influenced by data from outside. Each node n of the flow graph has an associated
abstract transfer function fn : Val^α → Val^α, describing the change of the
abstract valuations depending on the kind of action at the node. The functions
are given in Table 5. The equations are mostly straightforward; the only case de-
serving mention is the one for c?s(x), whose equation captures the inter-process
data-flow from a sending to a receiving action. It is easy to see that the transfer
functions are monotone.
Upon start of the analysis, at each node the variables' values are assumed to
be defined, i.e., the initial valuation is the least one: η^α_init(n) = η⊥. This choice
rests on the assumption that all local variables of each process are properly
initialized. We are interested in the least solution to the data-flow problem given
by the following constraint set:

  η^α_post(n) ≥ fn(η^α_pre(n))                                        (1)
  η^α_pre(n)  ≥ ⋁ {η^α_post(n′) | (n′, n) in flow relation}

Table 5. May analysis: transfer functions/abstract effect for process P

  f(c?s(x)) η^α         = η^α[x ↦ ⊤]                                     if s ∈ Sig_ext
  f(c?s(x)) η^α         = η^α[x ↦ ⋁ {[[e]]η^α | n′ = g ▹ c!s(e)}]        if s ∉ Sig_ext
  f(g ▹ c!s(e)) η^α     = η^α
  f(g ▹ x := e) η^α     = η^α[x ↦ [[e]]η^α]
  f(g ▹ set t := e) η^α = η^α[t ↦ on([[e]]η^α)]
  f(g ▹ reset t) η^α    = η^α[t ↦ off]
  f(g_t ▹ reset t) η^α  = η^α[t ↦ off]

For each node n of the flow graph, the data-flow problem is specified by two
inequations or constraints. The first one relates the abstract valuation η^α_pre before
entering the node with the valuation η^α_post afterwards via the abstract effects of
Table 5. The least fixpoint of the constraint set can be found iteratively in a fairly
standard way by a worklist algorithm (see e.g. [19, 14, 24]), where the worklist
steers the iterative loop until the least fixpoint is reached (cf. Figure 1).

input:  the flow-graph of the program
output: η^α_pre, η^α_post;

η^α(n) = η^α_init(n);
WL = {n | α_n = ?s(x), s ∈ Sig_ext};

repeat
  pick n ∈ WL;
  if n = g ▹ c!s(e) then
    let S′ = {n′ | n′ = c?s(x) and [[e]]η^α(n) ≰ [[x]]η^α(n′)}
    in forall n′ ∈ S′ : η^α(n′) := fn(η^α(n′));
  let S = {n′ ∈ succ(n) | fn(η^α(n)) ≰ η^α(n′)}
  in forall n′ ∈ S : η^α(n′) := fn(η^α(n));
  WL := WL \ {n} ∪ S ∪ S′;
until WL = ∅;

η^α_pre(n) = η^α(n);
η^α_post(n) = fn(η^α(n))

Fig. 1. May analysis: worklist algorithm

The worklist data structure WL used in the algorithm is a set of elements,
more specifically a set of nodes from the flow-graph, where we denote by succ(n)
the set of successor nodes of n in the flow graph in forward direction. It supports
an operation to randomly pick one element from the set (without removing it),
and we write WL \ {n} for the worklist without the node n and ∪ for set union on
the elements of the worklist. The algorithm starts with the least valuation on all
nodes and an initial worklist containing the nodes with input from the environment.
It enlarges the valuation within the given lattice step by step until it stabilizes,
i.e., until the worklist is empty. If adding the abstract effect of one node to the
current state enlarges the valuation, i.e., the set S is non-empty, those successor
nodes from S are (re-)entered into the list of unfinished ones. Since the set of
variables in the system is finite, and thus the lattice of abstract valuations, the
termination of the algorithm is immediate.
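The following simplified Python sketch shows the shape of such a worklist iteration for the may analysis; it only covers the intra-process flow (constraint propagation along succ) and omits the inter-process matching of send and receive nodes handled by the set S′ in Figure 1. All names are ours:

def may_analysis(nodes, flow, transfer, init_worklist):
    # nodes: iterable of flow-graph nodes
    # flow: dict node -> list of successor nodes (succ)
    # transfer: dict node -> abstract effect f_n on valuations
    # init_worklist: nodes with input from the environment
    # A valuation maps variable names to 0 (non-chaotic) or 1 (chaotic).
    eta = {n: {} for n in nodes}          # least valuation: every variable at 0
    wl = set(init_worklist)
    while wl:
        n = wl.pop()
        post = transfer[n](eta[n])        # abstract effect of node n
        for m in flow.get(n, []):
            merged = dict(eta[m])
            for x, v in post.items():
                merged[x] = max(merged.get(x, 0), v)
            if merged != eta[m]:          # valuation grew: revisit successor m
                eta[m] = merged
                wl.add(m)
    return eta                            # pre-valuation per node; post is transfer[n](eta[n])

# Example: n0 receives x from the environment (x becomes chaotic),
# n1 assigns y := x, so y becomes chaotic as well.
transfer = {"n0": lambda eta: {**eta, "x": 1},
            "n1": lambda eta: {**eta, "y": eta.get("x", 0)}}
print(may_analysis(["n0", "n1"], {"n0": ["n1"]}, transfer, ["n0"]))
# -> {'n0': {}, 'n1': {'x': 1}}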
With the worklist as a set-like data structure, the algorithm is free to work off
the list in any order. In practice, more deterministic data structures and traversal
strategies are appropriate, for instance traversing the graph in a breadth-first
manner (see [24] for a broader discussion of various traversal strategies).
After termination the algorithm yields two mappings η^α_pre, η^α_post : Node →
Val^α. On a location l, the result of the analysis is given by η^α(l) = ⋁ {η^α_post(ñ) |
ñ = l̃ →_α l}, also written as η^α_l.

Lemma 7 (Correctness (may)). Upon termination, the algorithm gives back


the least solution to the constraint set as given by the equations (1), resp. Table 5.

Must Analysis. The must analysis is almost dual to the may analysis. A transfer
function that describes the change of the abstract valuation depending on the
action at the node is defined in Table 6. For inputs, c?s(x) in process P assigns
⊥ to x only if the signal is sent to P with reliable data. This means the values
after reception correspond to the greatest lower bound over all expressions which
can occur in a matching send-action.

Table 6. Must analysis: transfer functions/abstract effect for process P

  f(c?s(x)) η^α         = η^α[x ↦ ⊤]                                     if s ∉ Sig_int
  f(c?s(x)) η^α         = η^α[x ↦ ⋀ {[[e]]η^α | n′ = g ▹ c!s(e)}]        if s ∈ Sig_int
  f(g ▹ c!s(e)) η^α     = η^α
  f(g ▹ x := e) η^α     = η^α[x ↦ [[e]]η^α]
  f(g ▹ set t := e) η^α = η^α[t ↦ on([[e]]η^α)]
  f(g ▹ reset t) η^α    = η^α[t ↦ off]
  f(g_t ▹ reset t) η^α  = η^α[t ↦ off]

As for the may analysis, the data-flow problem is specified for each
node n of the flow graph by two inequations (2) (see Table 6). Analogously,
the greatest fixpoint of the constraint set can be found iteratively by a worklist
algorithm (cf. Figure 2). Upon start of the analysis, at each node the variables'
values are assumed to be defined, i.e., the initial valuation is the greatest one:
η^α_init(n) = η⊤.

  η^α_post(n) ≤ fn(η^α_pre(n))                                        (2)
  η^α_pre(n)  ≤ ⋀ {η^α_post(n′) | (n′, n) in flow relation}

Like in the may-analysis case, the termination of the algorithm follows from the
finiteness of the set of variables.

Lemma 8 (Correctness (must)). Upon termination, the algorithm from Fig-


ure 2 gives back the greatest solution to the constraint set as given by equa-
tions (2) resp. Table 6.

input:  the flow-graph of the program
output: η^α_pre, η^α_post;

η^α(n) = η^α_init(n);
WL = {n | α_n = g ▹ x := e};

repeat
  pick n ∈ WL;
  if n = g ▹ c!s(e) then
    let S′ = {n′ | n′ = c?s(x) and [[e]]η^α(n) ≱ [[x]]η^α(n′)}
    in forall n′ ∈ S′ : η^α(n′) := fn(η^α(n′));
  let S = {n′ ∈ succ(n) | fn(η^α(n)) ≱ η^α(n′)}
  in forall n′ ∈ S : η^α(n′) := fn(η^α(n));
  WL := WL \ {n} ∪ S ∪ S′;
until WL = ∅;

η^α_pre(n) = η^α(n);
η^α_post(n) = fn(η^α(n))

Fig. 2. Must analysis: worklist algorithm

4.2 Program Transformation


Based on the result of the analysis, we transform the given system S = P ∥ P̄
into an optimized one, denoted by S′, in which the communication of P with its
environment P̄ is done synchronously, all the data exchanged is abstracted, and
which is in a simulation relation with the original system.
The intention is to use the information collected in the analyses about the
influence of the environment to reduce the state space. Depending on whether one
relies on the may-analysis alone (which variable occurrences may be influenced
from the outside) or takes into account the results of both analyses (additional
information which variable occurrences are definitely chaotic) the precision of
the abstraction varies. Using only the may-information overapproximates the
system (further) but in general leads to a smaller state space.
The second option, on the other hand, gives a more precise abstraction and
thus fewer false negatives. Indeed, apart from the abstraction caused by
introducing chaotic values, it does not abstract the system further as far as the behavior
is concerned. It is nevertheless profitable as it allows us to remove unnecessary
instances of variables or expressions which are detected to be constantly ⊤. It
furthermore can make further optimizations of the system more effective. For
instance, live analysis and the optimization described in [4] can be effective
for more variable instances and thus yield better further reduction when applied
after replacing variable instances which are constantly ⊤.

In either case we must ensure that the abstraction of timer values is treated
adequately (see below). Here we describe the transformation only for the combination
of the may and must analyses, since the alternative is simpler.
Overloading the symbols ⊤ and ⊥, we mean for the rest of the paper: the value
⊤ for a variable at a location refers to the result of the must analysis, i.e., the
definite knowledge that the data is chaotic for all runs. Dually, ⊥ stands for the
definite knowledge of the may analysis, i.e., for data which is never influenced
from outside. Additionally, we write ⊥⊤ in case neither analysis gave a definite
answer.
We extend each data domain by an additional value ⊤, representing
unknown, chaotic data, i.e., we now assume domains such as N⊤ = N ∪̇ {⊤},
Bool⊤ = Bool ∪̇ {⊤}, . . . , where we do not distinguish notationally the various
types of chaotic values. These values ⊤ are considered as the largest values, i.e.,
we introduce ≤ as the smallest reflexive relation with v ≤ ⊤ for all elements v
(separately for each domain). The strict lifting of a valuation η⊤ to expressions
is denoted by [[.]]η⊤.
The transformation is straightforward: guards influenced by the environment
are taken non-deterministically, i.e., a guard g at a location l is replaced by true
if [[g]]η^α_l = ⊤. A guard g whose value at a location l is ⊥⊤ is treated dynamically on
the extended data domain. For assignments, we distinguish between the variables
that carry the value ⊥⊤ in at least one location and the rest. Assignments of ⊤ to
variables that take ⊥⊤ at no location are omitted. Assignments of concrete values
are left untouched, and the assignments to variables that are marked by ⊥⊤
in at least one location are performed on the extended data domain.
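A loose Python sketch of how this case distinction might be applied to a single assignment or guard, under the assumption that the analysis results are summarized as one of three statuses per occurrence (the status names and function names are ours, not the paper's):

TOP, BOT, MIXED = "TOP", "BOT", "MIXED"   # MIXED plays the role of ⊥⊤

def transform_assignment(x, x_ever_mixed, e_status_here):
    # Sketch of the case distinction for an assignment  x := e  at one location.
    # x_ever_mixed: does x carry the value ⊥⊤ at some location?
    # e_status_here: abstract status of the right-hand side e at this location.
    if e_status_here == TOP and not x_ever_mixed:
        return "skip"            # assignment of chaos to a variable never ⊥⊤: omitted
    if e_status_here == TOP:
        return f"{x} := TOP"     # kept, but performed on the extended data domain
    return f"{x} := e"           # concrete assignment left untouched

def transform_guard(g_status_here):
    # Guards that are definitely chaotic are taken non-deterministically.
    return "true" if g_status_here == TOP else "g"

print(transform_assignment("x", False, TOP))   # skip
print(transform_assignment("x", True, TOP))    # x := TOP
print(transform_guard(TOP))                    # true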
The interpretation of timer variables on the extended domain requires spe-
cial attention. Chaos can influence timers only via the set-operation by setting

Table 7. Transformation

  T-Input_ext:  l →_{c?s(x)} l̂ ∈ Edg,  x ∉ Var_⊥⊤,  [[x]]η^α_l = ⊤
                implies  l →_{c?s(_)} l̂ ∈ Edg′

  T-Output:     l →_{g ▹ c!(s,e)} l̂ ∈ Edg,  [[e]]η^α_l = ⊤
                implies  l →_{g ▹ c!(s,⊤)} l̂ ∈ Edg′

  T-Assign_1:   l →_{g ▹ x:=e} l̂ ∈ Edg,  x ∉ Var_⊥⊤,  [[x]]η^α_l = ⊤
                implies  l →_{g ▹ skip} l̂ ∈ Edg′

  T-Assign_2:   l →_{g ▹ x:=e} l̂ ∈ Edg,  x ∈ Var_⊥⊤,  [[e]]η^α_l = ⊤
                implies  l →_{g ▹ x:=⊤} l̂ ∈ Edg′

  T-Set:        l →_{g ▹ set t:=e} l̂ ∈ Edg,  [[e]]η^α_l = ⊤
                implies  l →_{g ▹ set t:=⊤} l̂ ∈ Edg′


it to a chaotic value. Therefore, the domain of timer values contains the additional
chaotic value on(⊤). Since we need the transformed system to show at
least the behavior of the original one, we must provide a proper treatment of the
rules involving on(⊤), i.e., the Timeout-, the TDiscard-, and the Tick-rule.
As on(⊤) stands for any value of active timers, it must cover the cases where
timeouts and timer-discards are enabled (because of the concrete value on(0))
as well as disabled (because of on(n) with n ≥ 1). The second one is necessary,
since the enabledness of the tick steps depends on the disabledness of timeouts
and timer discards via the blocked-condition.
To distinguish the two cases, we introduce a refined abstract value on(⊤⁺)
for chaotic timers, representing all on-settings larger than or equal to 1. The
order on the domain of timer values is given as the smallest reflexive order relation
such that on(0) ≤ on(⊤) and on(n) ≤ on(⊤⁺) ≤ on(⊤), for all n ≥ 1. To treat
the case where the abstract timer value on(⊤) denotes absence of an immediate
timeout, we add edges of the form

  T-NoTimeout:  l →_{t=on(⊤) ▹ set t:=⊤⁺} l ∈ Edg′

which set the timer value back to ⊤⁺, representing a non-zero delay.
The decreasing operation needed in the Tick-rule is defined, in extension to
the definition on values from on(N), on ⊤⁺ by on(⊤⁺) − 1 = on(⊤). Note that
the operation is left undefined on on(⊤), which is justified by a property analogous
to Lemma 1:

Lemma 9. Let (l, η, q) be a state of S′. If (l, η, q) →_tick, then [[t]]η ∉
{on(⊤), on(0)}, for all timers t.

Proof. By definition of the blocked-predicate and inspection of the Timeout-
and TDiscard-rules (for on(0), as for Lemma 1) and of the behavior of the abstract
value on(⊤) (T-NoTimeout-rule). ⊓⊔
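The refined abstract timer domain, its order and the partial count-down can be sketched as follows (a Python illustration with our own encoding of on(⊤) and on(⊤⁺) as the strings "TOP" and "TOP+"):

OFF = ("off",)
def on(v):                       # v is a natural number, "TOP" (⊤) or "TOP+" (⊤⁺)
    return ("on", v)

def leq(a, b):
    # Smallest reflexive order with on(0) <= on(TOP) and
    # on(n) <= on(TOP+) <= on(TOP) for all n >= 1; OFF is only related to itself.
    if a == b:
        return True
    if a == OFF or b == OFF:
        return False
    if b[1] == "TOP":
        return True              # every active value lies below on(TOP)
    if b[1] == "TOP+":
        return isinstance(a[1], int) and a[1] >= 1
    return False

def dec(v):
    # Count-down used in the Tick rule; undefined for on(0) and on(TOP).
    if v == OFF:
        return v
    n = v[1]
    if n == "TOP+":
        return on("TOP")         # on(TOP+) - 1 = on(TOP)
    assert isinstance(n, int) and n > 0, "tick is not enabled here"
    return on(n - 1)

print(dec(on("TOP+")))           # ('on', 'TOP')
print(leq(on(3), on("TOP+")))    # True
print(leq(on(0), on("TOP+")))    # False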

As the transformation only adds non-determinism, the transformed system S′
simulates S (cf. [29]). Together with Theorem 6, this guarantees preservation of
LTL-properties as long as variables influenced by P̄ are not mentioned. Since we
abstracted external data into a single value, not being able to specify properties
depending on externally influenced data is not much of an additional loss of
precision.

Theorem 10. Let Sa and Ss be the variants of a system communicating with the
environment asynchronously resp. synchronously, and let S be given as the parallel
composition Sa ∥ S̄, where S̄ is the environment of the system. Furthermore, let
S′ = S′s ∥ S̄ be defined as before, and let ϕ be a next-free LTL-formula mentioning
only variables from {x | ∀l ∈ Loc. [[x]]η^α_l = ⊥}. Then S′ |= ϕ implies S |= ϕ.

5 Example: A Wireless ATM Medium-Access Protocol


The goal of our experiments is to estimate the state space reduction due to re-
placing asynchronous communication with the environment by the synchronous
one. Primarily interested in the effect of removing queues, we use here the most
trivial environment: the chaotic one.
We applied the methods in a series of experiments to the industrial protocol
Mascara [33]. Located between the ATM-layer and the physical medium, Mas-
cara is a medium-access layer or, in the context of the ISDN reference model,
a transmission convergence sub-layer for wireless ATM communication in local
area networks. A crucial feature of Mascara is the support of mobility. A mobile
terminal (MT) located inside the area cell of an access point (AP) is capable
of communicating with it. When a mobile terminal moves outside the current
cell, it has to perform a so-called handover to another access point covering the
cell the terminal has moved into. The handover must be managed transparently
with respect to the ATM layer, maintaining the agreed quality of service for the
current connections. So the protocol has to detect the need for a handover, select
a candidate AP to switch to, and redirect the traffic with minimal interruption.
This protocol was the main case study in the Vires project; the results of its
verification can be found e.g. in [3, 13, 30]. The SDL-specification of the proto-
col was automatically translated into the input language of DTSpin [2, 11], a
discrete-time extension of the well-known Spin model-checker [16]. For the trans-
lation, we used sdl2if [5] and if2pml-translators [3]. Our prototype implemen-
tation, the pml2pml-translator, post-processes the output from the automatic
translation of the SDL-specification into DTPromela.
Here, we are not interested in Mascara itself and the verification of its properties,
but use it as a real-life example for comparing the state space of parts
of the protocol when closed with the environment as an asynchronous chaotic
process against the state space of the same entity closed with embedded chaos. For
the comparison we chose a model of the Mascara control entity (MCL) at the
mobile terminal side. In our experiments we used DTSpin version 0.1.1, an
extension of Spin 3.3.10, with the partial-order reduction and compression options
on. All the experiments were run on a Silicon Graphics Origin 2000 server on a
single R10000/250MHz CPU with 8GB of main memory.
The implementation currently covers the may analysis and the corresponding
transformation. We do not model the chaotic environment as a separate process
communicating with the system via rendezvous channels but transform an open
DTPromela model into a closed one by embedding the timed chaotic environ-
ment into the system as described in [29], which allows us to use the process
fairness mechanism provided by Spin, which works only for systems with asyn-
chronous communication. The translator does not require any user interaction,
except that the user is requested to give the list of external signals. The exten-
sion is implemented in Java and requires JDK-1.2 or later. The package can be
downloaded from http://www.cwi.nl/~ustin/EH.html.

Table 8. Model checking MCL with chaos as a process and embedded chaos

        chaos as external process                     embedded chaos
  bs    states     transitions  mem.     time        states   transitions  mem.    time
  2     9.73e+05   3.64e+06     40.842   15:57       300062   1.06e+06     9.071   1:13
  3     5.24e+06   2.02e+07     398.933  22:28       396333   1.85e+06     11.939  1:37
  4     2.69e+07   1.05e+08     944.440  1:59:40     467555   2.30e+06     14.499  2:13

Table 8 gives the results for the model checking of MCL with chaos as external
process on the left and embedded on the right. The first column gives the buffer
size for process queues. The other columns give the number of states, transitions,
memory and time consumption, respectively. As one can see, the state space
as well as the time and the memory consumption are significantly larger for
the model with the environment as a process, and they grow with the buffer
size much faster than for the model with embedded chaos. The model with the
embedded environment has a relatively stable state-space size.

6 Conclusion
In this paper, we integrated earlier work from [29, 18, 31, 17] into a general
framework describing how to close an open, asynchronous system by a timed
environment while avoiding the combinatorial state explosion in the external
buffers. The generalization presented here goes a step beyond completely arbitrary
environmental behavior, using the timed semantics of the language. We facilitate
the model checking of the system by using the information obtained with the may
and must analyses: we substitute the chaotic value ⊤ for expressions influenced
by chaotic data from outside and then optimize the system by removing variables
and actions that become redundant.
In the context of software testing, [9] describes a dataflow algorithm to
close program fragments given in the C language with the most general environment.
The algorithm is incorporated into the VeriSoft tool. As in our paper,
they assume an asynchronous communication model and abstract away external
data, but they do not consider timed systems and their abstraction. As for model
checking and analyzing SDL-programs, much work has been done, for instance
in the context of the Vires project, leading to the IF-toolset [5].
A fundamental approach to model checking open systems is known as module
checking [21, 20]. Instead of transforming the system into a closed one, the un-
derlying computational model is generalized to distinguish between transitions
under control of the module and those driven by the environment. Mocha [1]
is a model checker for reactive modules, which uses alternating-time temporal
logic as specification language.
For practical applications, we are currently extending the larger case study
[30], using the chaotic closure, to this more general setting. We proceed in the following
way: after splitting an SDL system into subsystems following the system
structure, properties of the subsystems are verified after closing them with an embedded
chaotic environment. Afterwards, the verified properties are encoded into

an SDL process, for which a tick-separated closure is constructed. This closure


is used as environment for other parts of the system. As the closure gives a safe
abstraction of the desired environment behavior, the verification results can be
transferred to the original system.

References
1. R. Alur, T. A. Henzinger, F. Mang, S. Qadeer, S. K. Rajamani, and S. Tasiran.
Mocha: Modularity in model checking. In A. J. Hu and M. Y. Vardi, editors,
Proceedings of CAV ’98, volume 1427 of Lecture Notes in Computer Science, pages
521–525. Springer-Verlag, 1998.
2. D. Bošnački and D. Dams. Integrating real time into Spin: A prototype imple-
mentation. In S. Budkowski, A. Cavalli, and E. Najm, editors, Proceedings of
Formal Description Techniques and Protocol Specification, Testing, and Verifica-
tion (FORTE/PSTV’98). Kluwer Academic Publishers, 1998.
3. D. Bošnački, D. Dams, L. Holenderski, and N. Sidorova. Verifying SDL in Spin.
In S. Graf and M. Schwartzbach, editors, TACAS 2000, volume 1785 of Lecture
Notes in Computer Science. Springer-Verlag, 2000.
4. M. Bozga, J. C. Fernandez, and L. Ghirvu. State space reduction based on Live.
In A. Cortesi and G. Filé, editors, Proceedings of SAS ’99, volume 1694 of Lecture
Notes in Computer Science. Springer-Verlag, 1999.
5. M. Bozga, J.-C. Fernandez, L. Ghirvu, S. Graf, J.-P. Krimm, and L. Mounier. IF:
An intermediate representation and validation environment for timed asynchronous
systems. In J. Wing, J. Woodcock, and J. Davies, editors, Proceedings of Sympo-
sium on Formal Methods (FM 99), volume 1708 of Lecture Notes in Computer
Science. Springer-Verlag, Sept. 1999.
6. M. Bozga, S. Graf, A. Kerbrat, L. Mounier, I. Ober, and D. Vincent. SDL for
real-time: What is missing? In Y. Lahav, S. Graf, and C. Jard, editors, Electronic
Proceedings of SAM’00, 2000.
7. J. Brock and W. Ackerman. An anomaly in the specifications of nondeterministic
packet systems. Technical Report Computation Structures Group Note CSG-33,
MIT Lab. for Computer Science, Nov. 1977.
8. E. Clarke, O. Grumberg, and D. Long. Model checking and abstraction. ACM
Transactions on Programming Languages and Systems, 16(5):1512–1542, 1994. A
preliminary version appeared in the Proceedings of POPL 92.
9. C. Colby, P. Godefroid, and L. J. Jagadeesan. Automatically closing open reac-
tive systems. In Proceedings of 1998 ACM SIGPLAN Conference on Programming
Language Design and Implementation. ACM Press, 1998.
10. D. Dams, R. Gerth, and O. Grumberg. Abstract interpretation of reactive sys-
tems: Abstraction preserving ∀CTL∗ ,∃CTL∗ , and CTL∗ . In E.-R. Olderog, editor,
Proceedings of PROCOMET ’94. IFIP, North-Holland, June 1994.
11. Discrete-time Spin. http://www.win.tue.nl/~dragan/DTSpin.html, 2000.
12. P. Godefroid. Using partial orders to improve automatic verification methods.
In E. M. Clarke and R. P. Kurshan, editors, Computer Aided Verification 1990,
volume 531 of Lecture Notes in Computer Science, pages 176–449. Springer-Verlag,
1991. an extended Version appeared in ACM/AMS DIMACS Series, volume 3,
pages 321–340, 1991.

13. J. Guoping and S. Graf. Verification experiments on the Mascara protocol. In M. B.


Dwyer, editor, Model Checking Software, Proceedings of the 8th International SPIN
Workshop (SPIN 2001), Toronto, Canada, Lecture Notes in Computer Science,
pages 123–142. Springer-Verlag, 2001.
14. M. S. Hecht. Flow Analysis of Programs. North-Holland, 1977.
15. G. Holzmann and J. Patti. Validating SDL specifications: an experiment. In
E. Brinksma, editor, International Workshop on Protocol Specification, Testing
and Verification IX (Twente, The Netherlands), pages 317–326. North-Holland,
1989. IFIP TC-6 International Workshop.
16. G. J. Holzmann. The Spin Model Checker. Addison-Wesley, 2003.
17. N. Ioustinova, N. Sidorova, and M. Steffen. Abstraction and flow analysis for
model checking open asynchronous systems. In Proceedings of the 9th Asia-Pacific
Software Engineering Conference (APSEC 2002, 4.–6. December 2002, Gold Coast,
Queensland, Australia, pages 227–235. IEEE Computer Society, Dec. 2002.
18. N. Ioustinova, N. Sidorova, and M. Steffen. Closing open SDL-systems for model
checking with DT Spin. In L.-H. Eriksson and P. A. Lindsay, editors, Proceedings
of Formal Methods Europe (FME’02), volume 2391 of Lecture Notes in Computer
Science, pages 531–548. Springer-Verlag, 2002.
19. G. Kildall. A unified approach to global program optimization. In Proceedings of
POPL ’73, pages 194–206. ACM, January 1973.
20. O. Kupferman and M. Y. Vardi. Module checking revisited. In O. Grumberg, edi-
tor, CAV ’97, Proceedings of the 9th International Conference on Computer-Aided
Verification, Haifa, Israel, volume 1254 of Lecture Notes in Computer Science.
Springer, June 1997.
21. O. Kupferman, M. Y. Vardi, and P. Wolper. Module checking. In R. Alur, editor,
Proceedings of CAV ’96, volume 1102 of Lecture Notes in Computer Science, pages
75–86, 1996.
22. O. Lichtenstein and A. Pnueli. Checking that finite state concurrent programs
satisfy their linear specification. In Twelfth Annual Symposium on Principles of
Programming Languages (POPL) (New Orleans, LA), pages 97–107. ACM, Jan-
uary 1985.
23. D. Long. Model Checking, Abstraction and Compositional Verification. PhD thesis,
Carnegie Mellon University, 1993.
24. F. Nielson, H.-R. Nielson, and C. Hankin. Principles of Program Analysis.
Springer-Verlag, 1999.
25. A. Olsen, O. Færgemand, B. Møller-Pedersen, R. Reed, and J. R. W. Smith. System
Engineering Using SDL-92. Elsevier Science, 1997.
26. A. Pnueli. The temporal logic of programs. In Proceeding of the 18th Annual
Symposium on Foundations of Computer Science, pages 45–57, 1977.
27. Specification and Description Language SDL. CCITT, 1993.
28. Specification and Description Language SDL, blue book. CCITT Recommendation
Z.100, 1992.
29. N. Sidorova and M. Steffen. Embedding chaos. In P. Cousot, editor, Proceedings
of SAS’01, volume 2126 of Lecture Notes in Computer Science, pages 319–334.
Springer-Verlag, 2001.
30. N. Sidorova and M. Steffen. Verifying large SDL-specifications using model check-
ing. In R. Reed and J. Reed, editors, Proceedings of the 10th International SDL
Forum SDL 2001: Meeting UML, volume 2078 of Lecture Notes in Computer Sci-
ence, pages 403–416. Springer-Verlag, Feb. 2001.

31. N. Sidorova and M. Steffen. Synchronous closing of timed SDL systems for model
checking. In A. Cortesi, editor, Proceedings of the Third International Workshop on
Verification, Model Checking, and Abstract Interpretation (VMCAI) 2002, volume
2294 of Lecture Notes in Computer Science, pages 79–93. Springer-Verlag, 2002.
32. A. Valmari. A stubborn attack on state explosion. Formal Methods in System De-
sign, 1992. Earlier version in the proceeding of CAV ’90 Lecture Notes in Computer
Science 531, Springer-Verlag 1991, pp. 156–165 and in Computer-Aided Verification
’90, DIMACS Series in Discrete Mathematics and Theoretical Computer Science
Vol. 3, AMS & ACM 1991, pp. 25–41.
33. A wireless ATM network demonstrator (WAND), ACTS project AC085.
http://www.tik.ee.ethz.ch/~wand/, 1998.
Priority Systems

Gregor Gössler¹ and Joseph Sifakis²

¹ INRIA Rhône-Alpes, goessler@inrialpes.fr
² VERIMAG, sifakis@imag.fr

Abstract. We present a framework for the incremental construction


of deadlock-free systems meeting given safety properties. The frame-
work borrows concepts and basic results from the controller synthesis
paradigm by considering a step in the construction process as a con-
troller synthesis problem.
We show that priorities are expressive enough to represent restric-
tions induced by deadlock-free controllers preserving safety properties.
We define a correspondence between such restrictions and priorities and
provide compositionality results about the preservation of this corre-
spondence by operations on safety properties and priorities. Finally, we
provide an example illustrating an application of the results.

1 Introduction
A common idea for avoiding a posteriori verification and testing is to use system
design techniques that guarantee correctness by construction. Such techniques
should allow one to construct progressively, from a given system S and a set of
requirements R1 , . . . , Rn , a sequence of systems S1 , . . . , Sn , such that system Si
meets all the requirements Rj for j ≤ i. That is, to allow incremental construc-
tion, requirements should be composable [2,6] along the design process. In spite
of their increasing importance, there is currently a tremendous lack of theory
and methods, especially for requirements including progress properties which
are essential for reactive systems. Most of the existing methodologies deal with
construction of systems such that a set of state properties always hold. They are
founded on the combined use of invariants and refinement relations. Compos-
ability is ensured by the fact that refinement relations preserve trace inclusion.
We present a framework allowing us to consider jointly state-property invariance
and deadlock-freedom.
Practice for building correct systems is often based on the idea of adding
enforcement mechanisms to a given system S in order to obtain a system S′
meeting a given requirement. These mechanisms can be implemented by instru-
menting the code of S or by composing S with systems such as controllers or
monitors that modify adequately the overall behavior.
An application of this principle is the enforcement of security policies which
are safety properties described by automata [14]. A main requirement for the
enforced system is that it safely terminates when it detects a deviation from


some nominal secure behavior. A more difficult problem is also to ensure system
availability and preserve continuity of service [3,10].
Another application of this principle is aspect oriented programming [8] used
to build programs meeting (usually simple) requirements. Aspects can be con-
sidered as requirements from which code is generated and woven into a program
intended to meet the requirements. In aspect oriented programming, aspect com-
position is identified as a central problem as it may cause unintentional interfer-
ence and inconsistency [15].
Practice for building correct systems by using enforcement mechanisms raises
some hard theoretical problems. For a sufficiently fine granularity of observation,
it is relatively easy to enforce safety requirements (as non violations of given state
properties) by stopping system progress. It is much harder to devise mechanisms
that also guarantee system availability and avoid service interruption. Further-
more, composability of requirements, e.g. security policies or aspects, is identified
as a main obstacle to rigorous incremental system construction.
We propose a design framework for both safety and deadlock-freedom re-
quirements. The framework consists of a model, priority systems and results
concerning its basic properties including composability. A priority system is a
transition system with a (dynamic) priority relation on its actions. A priority re-
lation ≺ is a set of predicates of the form ai ≺ Cij .aj meaning that action ai has
lower priority than action aj at all states satisfying Cij . At a given state of the
transition system, only enabled actions with maximal priority can be executed.
That is, in a priority system, a priority relation restricts the behavior of its tran-
sition system exactly as a scheduler restricts the behavior of a set of tasks. The
remarkably nice property of priority systems is that they are deadlock-free if
they are built from deadlock-free transition systems and from priority relations
satisfying some easy-to-check consistency condition.
The proposed framework considers design as a controller synthesis [12] prob-
lem: from a given system S and requirement R, find a system S′ meeting R.
S′ is the composition of S with a controller which monitors the state of S and
restricts its behavior by adequately switching off a subset of controllable actions
of S. The controller is usually specified as a solution of a fixpoint equation.
The simple case where R means that S′ is deadlock-free and does not violate
a state predicate U has been studied in various contexts, e.g., in [11,1]. The
corresponding controller is specified as a deadlock-free control invariant, which is
a state predicate U′, U′ ⇒ U, such that
– it is preserved by the non-controllable actions of S, that is, if U′ holds at
some state then it remains true forever if only non-controllable actions are
executed;
– U′ is false for all deadlock states of S.
Given U′, the controlled (designed) system S′ is obtained from S by conjuncting
the guard of any controllable action a with the weakest precondition of
U′ under a.
In Section 2, we formalize the relationship between S and S′ by introducing
restriction operators. These are specified as tuples of state predicates in bijection

with the set of actions of S. The application of a restriction operator to S is S′,
obtained from S by conjuncting the guards of its actions with the corresponding
state predicates of the restriction. We study properties of deadlock-free control
restrictions, that is, restrictions corresponding to deadlock-free control invariants.
In Section 3, we show that under some consistency conditions, priorities
can be used to represent deadlock-free restrictions. Thus, controlled systems
S  can be represented as deadlock-free priority systems. Consistency checking
boils down to computing a kind of transitive closure of the priority relation. We
show that for static priorities consistency is equivalent to deadlock-freedom.
Composability in our framework means commutativity of application of pri-
orities on a given system. As a rule, the result of the successive restriction of
a system S by two priorities ≺1 and ≺2 depends on the order of application
and we provide sufficient conditions for commutativity. This difficulty can be
overcome by using a symmetric composition operator ⊕ for priorities which pre-
serves safety and deadlock-freedom. The restriction of a system S by ≺1 ⊕ ≺2
is a refinement of any other restriction of S obtained by application of ≺1 and
≺2 .
An interesting question is whether priorities are expressive enough to repre-
sent restrictions induced by deadlock-free control invariants. We bring a positive
answer by using a construction associating with a state predicate U a priority
relation ≺U . We show that if U is a deadlock-free control invariant then the
controlled system S′ is equivalent to the system S restricted by ≺U . Further-
more, we provide results relating the controlled systems corresponding to U1 ,
U2 , U1 ∧ U2 to restrictions by ≺U1 , ≺U2 , ≺U1 ⊕ ≺U2 .
Section 4 illustrates application of the results on an example.
Section 5 presents concluding remarks about the presented framework.

2 Deadlock-Free Control Invariants


2.1 Definitions and Basic Properties
Definition 1 (Transition System). A transition system B is a tuple (X, A,
{Ga }a∈A , {F a }a∈A ), where

– X is a finite set of variables;


– A is a finite set of actions, the union of two disjoint sets Au and Ac , the sets of
the uncontrollable and controllable interactions respectively;
– Ga is a guard, a predicate on X;
– F a : 𝒳 → 𝒳 is a transition function, where 𝒳 is the set of valuations of X.

Definition 2 (Semantics of a Transition System). A transition system (X,
A, {Ga }a∈A , {F a }a∈A ) defines a transition relation → : 𝒳 × A × 𝒳 such that:
∀x, x′ ∈ 𝒳 ∀a ∈ A . x →^a x′ ⟺ Ga(x) ∧ x′ = F a(x).

We introduce the following notations:

– Given two transition systems B1 , B2 with disjoint action vocabularies such


that Bi = (Xi , Ai , {Ga }a∈Ai , {F a }a∈Ai ), for i = 1, 2, their union is the
transition system B1 ∪B2 = (X1 ∪X2 , A1 ∪A2 , {Ga }a∈A1 ∪A2 , {F a }a∈A1 ∪A2 ).
– Given a transition system B, we represent by B u (respectively B c ) the tran-
sition system consisting of the uncontrollable (respectively controllable) ac-
tions of B. Clearly B = B u ∪ B c .
– Given a transition system B, we represent by G(B) the disjunction of its
guards, that is, G(B) = ⋁_{a∈A} Ga , where A is the set of actions of B.

Definition 3 (Predecessors). Given B = (X, A, {Ga }a∈A , {F a }a∈A ) and a
predicate U on X, the set of predecessors of U by action a is the predicate prea(U) =
Ga ∧ U[F a(X)/X], where U[F a(X)/X] is obtained from U by uniform substitution
of X by F a(X).
Clearly, prea(U) characterizes all the states from which execution of a leads
to some state satisfying U.
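Over predicates represented as Boolean functions on states, the predecessor operator can be written down directly (a hypothetical Python sketch, not taken from the paper):

def pre(guard, update, U):
    # pre_a(U) = G^a ∧ U[F^a(X)/X]: holds in states from which a leads into U.
    return lambda x: guard(x) and U(update(x))

# Hypothetical example over a single integer variable:
# action a with guard x > 0 and update x := x - 1, target predicate U = (x >= 1).
guard, update, U = (lambda x: x > 0), (lambda x: x - 1), (lambda x: x >= 1)
pre_a_U = pre(guard, update, U)
print([pre_a_U(x) for x in (0, 1, 2, 3)])   # [False, False, True, True]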
Definition 4 (Invariants and Control Invariants). Given a transition system B and a predicate U,
– U is an invariant of B if U ⟹ ⋀_{a∈A} ¬prea(¬U) = ⋀_{a∈A} (¬Ga ∨ U[F a(X)/X]).
An invariant U, U ≠ false, is called deadlock-free if U ⇒ G(B).
– U is a control invariant of B if U ⇒ ⋀_{a∈Au} ¬prea(¬U). A control invariant
U, U ≠ false, is called deadlock-free if U ⇒ ⋁_{a∈A} prea(U).
We write inv[B](U ) to express the fact that U is an invariant of B. Notice
that invariants are control invariants of systems that have only uncontrollable
actions.
Proposition 1. If U is a control invariant of B = (X, A, {Ga }a∈A , {F a }a∈A )
then U is an invariant of B′ = (X, A, {(Ga)′}a∈A , {F a }a∈A ) where (Ga)′ =
Ga ∧ U[F a(X)/X] for a ∈ Ac and (Ga)′ = Ga otherwise. Furthermore, if U is a
deadlock-free control invariant of B then it is a deadlock-free invariant of B′.
This result allows us to build from a given system B and a safety requirement
of the form “always U0” a deadlock-free system B′ meeting this requirement,
provided there exists a deadlock-free control invariant U of B such that U ⇒ U0.
The following simple example illustrates this fact.
Example 1. In a Readers/Writers system, we use two counters, non negative
integers, r and w to represent respectively, the number of readers and writers
using a common resource. The counters are modified by actions of a transition
system B specified as a set of guarded commands:
a1 : true → r := r + 1 a2 : r > 0 → r := r − 1
b1 : true → w := w + 1 b2 : w > 0 → w := w − 1

where a1 and b1 are respectively, the actions of granting the resource to a reader
and a writer and a2 and b2 are respectively, the actions of releasing the resource
by a reader and a writer.
We assume that the actions a1 and b1 are controllable and we want to enforce
the requirement ”always U” for U = (w ≤ 1) ∧ (w = 0 ∨ r = 0). This prevents
concurrent access among writers, as well as between readers and writers. It is
easy to check that U is a deadlock-free control invariant. In fact, it is easy to
check that U is preserved by the uncontrollable actions a2 and b2 :
(r > 0) ∧ U ⇒ U [r − 1/r] and (w > 0) ∧ U ⇒ U [w − 1/w].
Furthermore, it is easy to check that U ⇒ prea1(U) ∨ prea2(U) ∨ preb1(U) ∨ preb2(U).
As prea1(U) ≡ (w = 0) and preb1(U) ≡ (w = 0) ∧ (r = 0), we have inv[B′](U),
where B′ is the controlled transition system:
a1 : w = 0 → r := r + 1                     a2 : r > 0 → r := r − 1
b1 : (r = 0) ∧ (w = 0) → w := w + 1         b2 : w > 0 → w := w − 1
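The two checks of the example, preservation of U by the uncontrollable actions and deadlock-freedom, can be validated mechanically on a bounded fragment of the state space (a small Python sketch; the bound of 5 and the encoding are ours):

# Readers/Writers from Example 1: states are pairs (r, w).
U = lambda r, w: w <= 1 and (w == 0 or r == 0)

actions = {
    "a1": (lambda r, w: True,  lambda r, w: (r + 1, w)),   # controllable
    "a2": (lambda r, w: r > 0, lambda r, w: (r - 1, w)),   # uncontrollable
    "b1": (lambda r, w: True,  lambda r, w: (r, w + 1)),   # controllable
    "b2": (lambda r, w: w > 0, lambda r, w: (r, w - 1)),   # uncontrollable
}

def pre(a, P, r, w):
    g, f = actions[a]
    return g(r, w) and P(*f(r, w))

ok = True
for r in range(5):
    for w in range(5):
        if not U(r, w):
            continue
        # U is preserved by the uncontrollable actions a2 and b2 ...
        for a in ("a2", "b2"):
            g, f = actions[a]
            if g(r, w) and not U(*f(r, w)):
                ok = False
        # ... and from every U-state some action leads back into U (deadlock-freedom).
        if not any(pre(a, U, r, w) for a in actions):
            ok = False
print("deadlock-free control invariant on the explored states:", ok)   # True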
The notion of restriction defined below allows a formalization of the relation-
ship between the initial and the controlled system.
Definition 5 (Restriction). Given a transition system B = (X, A, {Ga }a∈A ,
{F a }a∈A ), a restriction is a tuple of predicates V = (U a)a∈A . B/V denotes the
transition system B restricted by V, B/V = (X, A, {Ga ∧ U a }a∈A , {F a }a∈A ).
V = (U a)a∈A is a control restriction for B if ⋀_{a∈Au} (¬Ga ∨ U a) = true.
V = (U a)a∈A is a deadlock-free restriction for B if ⋁_{a∈A} (Ga ∧ U a) = ⋁_{a∈A} Ga .
We simply say that V is a control restriction or a deadlock-free restriction if
the corresponding equation holds for any transition system B with vocabulary of
actions A = Ac ∪ Au (independently of the interpretation of the guards).

Definition 6 (U^A, V(U)). Given a predicate U, we denote by U^A the restriction
U^A = (U)a∈A , and by V(U) the restriction V(U) = (U[F a(X)/X])a∈A .
If V1 , V2 are two restrictions, Vj = (Uj^{ai})ai∈A for j = 1, 2, we write V1 ∧ V2
for the restriction (U1^{ai} ∧ U2^{ai})ai∈A .

Proposition 2 (Control Invariants and Restrictions). Given a transition


system B and a predicate U ,
a) If U is a control invariant of B then V (U ) is a control restriction of B;
b) If U is a deadlock-free invariant of B then V (U ) is a deadlock-free restriction
of B;
c) If U is a deadlock-free control invariant of B then V (U ) is a deadlock-free
control restriction of B.
We need the following definitions for the comparison of transition systems.
Definition 7 (Refinement and Equivalence). Given two transition systems
Bi = (Xi , A, {Gi^a }a∈A , {Fi^a }a∈A ) for i = 1, 2, and a predicate U, we say that
– B1 refines B2 under U, denoted by B1 ⊑U B2 , if ∀a ∈ A . F1^a = F2^a and
U ∧ G1^a ⇒ U ∧ G2^a ;
– B1 is equivalent to B2 under U, denoted by B1 ≃U B2 , if B1 ⊑U B2 and
B2 ⊑U B1 .
We write B1 ⊑ B2 and B1 ≃ B2 for B1 ⊑true B2 and B1 ≃true B2 , respectively.

Property 1. Given transition systems B, B1 , B2 and restrictions V, V1 , V2 ,

1a  B/V ⊑ B;
1b  (B1 ∪ B2)/V ≃ B1/V ∪ B2/V;
1c  (B/V1)/V2 ≃ B/(V1 ∧ V2);
1d  B1 ⊑ B2 ⇒ (inv[B2](U) ⇒ inv[B1](U)) for any predicate U.
Notice that while the conjunction of control invariants is a control invariant, the
conjunction of deadlock-free control invariants is not, in general, a deadlock-free
control invariant. We investigate conditions for composability.

3 Priority Systems
We define priority systems which are transition systems restricted with priorities.
Priorities provide a general mechanism for generating deadlock-free restrictions.

3.1 Deadlock-Free Restrictions and Priorities


Priorities
Definition 8 (Priority). A priority on a transition system B with set of ac-
tions A is a set of predicates ≺= {Cij }ai ,aj ∈A . The restriction defined by ≺,
V (B, ≺) = (U a )a∈A is U ai = aj ∈A ¬(Cij ∧ Gaj ).
The predicates Cij specify priority between actions ai and aj . If Cij is true
at some state, then in the system restricted by V (B, ≺) the action ai cannot
be executed if aj is enabled. We write ai ≺ Cij .aj to express the fact that ai is
dominated by aj when Cij holds. A priority is irreflexive if Cij ⇒ ¬Cji for all
ai , aj ∈ A.
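As an illustration of Definition 8, the following hypothetical Python encoding computes which actions survive the restriction V (B, ≺) at a given state: an action ai is allowed iff there is no aj with Cij true at that state and Gaj enabled. The dictionaries guards and prio are our own ad-hoc representation, not notation from the paper.

```python
# Hypothetical encoding of a priority restriction (Definition 8).
# 'guards' maps each action to a predicate over states; 'prio' maps
# pairs (ai, aj) to the predicate Cij, missing pairs meaning Cij = false.

def allowed_actions(guards, prio, state):
    """Return the actions enabled in the system restricted by V(B, <)."""
    enabled = {a for a, g in guards.items() if g(state)}
    allowed = set()
    for ai in enabled:
        # U^{ai} = conjunction over aj of not(Cij and G^{aj})
        dominated = any(
            prio.get((ai, aj), lambda s: False)(state) and aj in enabled
            for aj in guards
        )
        if not dominated:
            allowed.add(ai)
    return allowed

# Tiny example: a1 is dominated by a2 whenever x > 0.
guards = {"a1": lambda s: True, "a2": lambda s: s["x"] > 0}
prio = {("a1", "a2"): lambda s: s["x"] > 0}

print(allowed_actions(guards, prio, {"x": 1}))  # {'a2'}
print(allowed_actions(guards, prio, {"x": 0}))  # {'a1'}
```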
Definition 9 (Transitive Closure). Given a priority ≺ we denote by ≺+ the least priority such that ≺ ⊆ ≺+ , obtained by the rule:
ai ≺+ Cij .aj and aj ≺+ Cjk .ak implies ai ≺+ (Cij ∧ Cjk ).ak .

Proposition 3 (Activity Preservation for Priorities). A priority ≺ defines a deadlock-free restriction if ≺+ is irreflexive.
Proof. Suppose that ≺+ is irreflexive. Consider some transition system B = (X, A, {Ga }a∈A , {F a }a∈A ), and let G = ∨a∈A Ga , and V (B, ≺) = (U a )a∈A . Let x be a state of B such that G(x), let A′ = {a ∈ A | Ga (x)}, and define a relation ≺′ on A′ such that ∀ai , aj ∈ A′ . ai ≺′ aj ⇐⇒ Cij (x). As ≺+ is irreflexive, ≺′ is a partial order on A′ , and thus acyclic. If A′ ≠ ∅ then max≺′ A′ exists and is non-empty. Thus, ( ∨a∈A Ga ∧ U a )(x) = ( ∨a∈max≺′ A′ Ga )(x) = ( ∨a∈A Ga )(x).
The above proposition motivates the definition of priority systems which are
transition systems restricted by priorities.
Definition 10 (Priority System). A priority system is a pair (B, ≺) where
B is a transition system and ≺= {Cij }ai ,aj ∈A is a priority on B such that
Cij = false for all (ai , aj ) ∈ Au × A.
The priority system (B, ≺) represents the transition system B/V (B, ≺).
The following propositions give properties of priority systems.
Proposition 4. If (B, ≺) is a priority system, then V (B, ≺) is a control re-
striction for B.
Proof. If V (B, ≺) = (U ai )ai ∈A then for all uncontrollable actions ai , U ai = true
because Cij = false.

Corollary 1. If U is a control invariant of B then U is a control invariant of


(B, ≺).

Proposition 5. If U is a deadlock-free control invariant of a transition sys-


tem B then for any priority ≺ such that ≺+ is irreflexive, U is a deadlock-free
invariant of (B/V (U ), ≺).
Proof. If U is a deadlock-free control invariant of B then U ⇒ G(B/V (U )) and
inv[B u ](U ). As ≺ defines deadlock-free restrictions, (B/V (U ), ≺)u = B u and
G(B/V (U )) = G(B/V (U ), ≺) .

Static Priorities
Definition 11 (Static Priority). A static priority is a priority ≺ = {Cij }ai ,aj ∈A
such that the predicates Cij are positive boolean expressions on guards. We call
static restrictions the corresponding restrictions V (B, ≺) = (U a )a∈A , that is
restrictions which are tuples of negative boolean expressions on guards.
It is easy to check that any static restriction defines a static priority. Notice
that in a priority system with static priorities, the choice of the action to be
executed at some state depends only on the set of the actions enabled at this
state. For example, a restriction with U a1 = ¬Ga2 ∧(¬Ga3 ∨¬Ga4 ) means that in
the restricted system the action a1 can be executed only if a2 is disabled and a3
or a4 is disabled. The corresponding priority relation is: a1 ≺ true.a2 , a1 ≺ Ga3 .a4 , a1 ≺ Ga4 .a3 .
Notation: For static priorities the notation can be drastically simplified. If (U ai )ai ∈A is a static restriction then it is of the form U ai = ∧k∈Ki ¬Mik , where Mik is a monomial on guards, Mik = ∧kw ∈W Gakw . Each monomial Mik corresponds to the set of priorities {ai ≺ ( ∧kw ∈W \{j} Gakw ).aj }j∈W . This set can be canonically represented by simply writing ai ≺ Πkw ∈W akw .
For example if Mik = Ga1 ∧ Ga2 ∧ Ga3 instead of writing ai ≺ (Ga1 ∧ Ga2 ).a3 ,
ai ≺ (Ga1 ∧ Ga3 ).a2 , ai ≺ (Ga2 ∧ Ga3 ).a1 , we write ai ≺ a1 a2 a3 . We propose the
following definition for static priorities.
Definition 12 (Static Priority – Simplified Definition). A monomial m
on a vocabulary of actions A is any term m = aj1 . . . ajn obtained by using
an associative, commutative and idempotent product operation. Let 1 denote its
neutral element, and m(A) the set of the monomials on A.
A static priority ≺ on A is a relation ≺⊆ A × m(A).

Example 2. The static priority ≺ corresponding to the static restriction U a1 =


true, U a2 = true, U a3 = ¬Ga1 ∨¬Ga2 , U a4 = ¬Ga1 ∧¬Ga2 , U a5 = ¬Ga1 ∧¬Ga3 ∨
¬Ga2 ∧ ¬Ga4 ≡ ¬(Ga1 ∧ Ga2 ) ∧ ¬(Ga1 ∧ Ga4 ) ∧ ¬(Ga3 ∧ Ga2 ) ∧ ¬(Ga3 ∧ Ga4 ) is:
a3 ≺ a1 a2 , a4 ≺ a1 , a4 ≺ a2 , a5 ≺ a1 a2 , a5 ≺ a1 a4 , a5 ≺ a3 a2 , a5 ≺ a3 a4 .

Definition 13 (Closure). Let ≺ be a static priority. The closure of ≺ is the


least static priority ≺∓ containing ≺ such that
– if a1 ≺∓ a2 m2 and a2 ≺∓ m3 then a1 ≺∓ m2 m3 ;
– if a ≺∓ am, then a ≺∓ m for m = 1.

Example 3. For ≺= {a ≺ bc, b ≺ ad}, ≺∓ = {a ≺∓ bc, b ≺∓ ad, a ≺∓ acd, a ≺∓


cd, b ≺∓ bcd, b ≺∓ cd}.

Lemma 1. If for any ai ∈ A, ai ≺ mi with mi a monomial on A, then ai ≺∓ ai


for some ai ∈ A.
Proof. Omitted.

Proposition 6 (Activity Preservation for Static Priorities). A static priority ≺ defines a deadlock-free restriction if and only if ≺∓ is irreflexive.
Proof. “if”: suppose that ≺∓ is irreflexive. By definition, only top elements in ≺ can be non-trivial monomials. Thus, ≺ is acyclic, and all ascending chains in ≺ are finite. Consider some deadlock-free transition system B = (X, A, {Ga }a∈A , {F a }a∈A ), and let G = ∨a∈A Ga . Let x be a state of B such that G(x), and let A′ = {a ∈ A | Ga (x)}. As ≺ is acyclic, max≺ A′ exists and is non-empty. It remains to show that some element of max≺ A′ is not dominated by any monomial in 2A′ . Suppose that for any ai ∈ A′ there is some mi ∈ 2A′ , ai ≺ mi . In that case, ≺∓ has a circuit by Lemma 1, which contradicts the hypothesis. Thus, at least one action in max≺ A′ is maximal in ≺.
“only if”: suppose that a ≺∓ a for some a ∈ A. By construction of ≺∓ , this means that (A ∪ m(A), ≺) contains a tree (A′ ∪ m(A′ ), ≺′ ) with root a such that all leaves are monomials consisting only of the action a. Take B = (∅, A, {Ga }a∈A , {∅}a∈A ) with Ga′ = true if a′ ∈ A′ , and Ga′ = false otherwise. By definition of /, all guards in B/V (B, ≺) are false, whereas B is clearly deadlock-free.

Example 4. Consider the static priority ≺ on the actions a1 , a2 , a3 , a4 such that a2 ≺ a3 a4 , a3 ≺ a2 a4 , a4 ≺ a2 a3 . It is easy to see that ≺∓ is not irreflexive, thus ≺ does not define a deadlock-free restriction. By elimination of a4 , as in the proof of Lemma 1, we get a2 ≺∓ a2 a3 and a3 ≺∓ a2 a3 , which gives by application of the second closure rule a2 ≺∓ a3 and a3 ≺∓ a2 , and then by the first rule a2 ≺∓ a2 , a3 ≺∓ a3 . Thus ≺∓ is not irreflexive.
Consider the slightly different static priority ≺1 on the actions a1 , a2 , a3 , a4 such that a2 ≺1 a1 a3 a4 , a3 ≺1 a2 a4 , a4 ≺1 a2 a3 . It can be checked that ≺∓1 is irreflexive, and thus ≺1 defines a deadlock-free restriction; ≺∓1 contains the chain a4 ≺∓1 a3 ≺∓1 a2 ≺∓1 a1 . Clearly, ≺+1 is not irreflexive as a3 ≺1 Ga2 .a4 , a4 ≺1 Ga2 .a3 . This example shows that for static priorities the use of the specific closure gives finer results than by using Proposition 3.
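The closure of Definition 13 can be computed by saturating its two rules. The following naive Python sketch is our own reading of those rules (it may also list pairs that are subsumed by the ones shown in the examples); it reproduces the irreflexivity verdicts of Example 4 and can thus serve as a deadlock-freedom check in the sense of Proposition 6.

```python
from itertools import product

def closure(prio):
    """Closure of a static priority (Definition 13).
    prio: set of pairs (action, frozenset_of_actions)."""
    cl = set(prio)
    changed = True
    while changed:
        changed = False
        new = set()
        # rule 1: a1 < a2.m2 and a2 < m3  implies  a1 < m2.m3
        for (a1, m1), (a2, m3) in product(cl, cl):
            if a2 in m1:
                new.add((a1, frozenset((m1 - {a2}) | m3)))
        # rule 2: a < a.m with m nonempty  implies  a < m
        for (a, m) in cl:
            if a in m and len(m) > 1:
                new.add((a, frozenset(m - {a})))
        if not new <= cl:
            cl |= new
            changed = True
    return cl

def irreflexive(prio):
    return all(m != frozenset({a}) for a, m in closure(prio))

# Example 4:  a2 < a3a4, a3 < a2a4, a4 < a2a3  -- closure is not irreflexive
ex4 = {("a2", frozenset({"a3", "a4"})),
       ("a3", frozenset({"a2", "a4"})),
       ("a4", frozenset({"a2", "a3"}))}
# The modified priority <_1 of Example 4
ex4_1 = {("a2", frozenset({"a1", "a3", "a4"})),
         ("a3", frozenset({"a2", "a4"})),
         ("a4", frozenset({"a2", "a3"}))}

print(irreflexive(ex4))    # False: < does not define a deadlock-free restriction
print(irreflexive(ex4_1))  # True:  <_1 does
```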

3.2 Composition of Priorities


Notice that given B and ≺, the predicate V (B, ≺) depends on B. The property ((B, ≺1 ), ≺2 ) = ((B, ≺2 ), ≺1 ) does not hold in general. For instance, consider a system B and priorities ≺ and ≺′ such that a1 ≺ a2 and a2 ≺′ a3 where a1 , a2 , a3 are actions of B. If from some state of B the three actions are enabled then in ((B, ≺), ≺′ ) only a3 is enabled while in ((B, ≺′ ), ≺) both a1 and a3 are enabled.
We define two composition operations on priorities and study composability
results.
Definition 14 (Composition of Priorities). Given two priorities ≺1 and ≺2 their composition is the operation ⊕ such that ≺1 ⊕ ≺2 = (≺1 ∪ ≺2 )+ . Furthermore, if ≺1 and ≺2 are static priorities we define another composition operation ⊕̄ such that ≺1 ⊕̄ ≺2 = (≺1 ∪ ≺2 )∓ .

Proposition 7. The operations ⊕ and ⊕̄ are associative and commutative.

Lemma 2. Let ≺ = ≺∓ be an irreflexive closed static priority. Then, any non-maximal action a is dominated by some monomial m on maximal actions.
Proof. Omitted.

Proposition 8 (Composability for Static Priorities). Given a transition system B and two static priorities ≺1 and ≺2 , if ≺1 ∪ ≺2 = ≺1 ⊕̄ ≺2 then ((B, ≺1 ), ≺2 ) ≃ (B, ≺1 ⊕̄ ≺2 ).

Proof. Let Ga , (Ga )′ , (Ga )′′ , and (Ga )′′′ be the guards of action a in B, B/≺1 , (B/≺1 )/≺2 , and B/(≺1 ⊕̄ ≺2 ), respectively. For some state x, let A0 = {a ∈ A | Ga (x)}, A1 = {a ∈ A | (Ga )′ (x)}, A2 = {a ∈ A | (Ga )′′ (x)}, and A3 = {a ∈ A | (Ga )′′′ (x)}, respectively. Notice that A2 ∪ A3 ⊆ A1 ⊆ A0 . We show that A2 = A3 .
If a ∈ A2 then there is no monomial on A0 dominating a in ≺1 , and there is no monomial on A1 dominating a in ≺2 . Thus, either there is no monomial on A0 dominating a in ≺1 ∪ ≺2 = ≺1 ⊕̄ ≺2 , and a ∈ A3 , or there is a monomial m on A0 such that a ≺2 m. In the latter case, m = m′ m′′ with m′ a non-empty monomial on A0 \ A1 , and m′′ a monomial on A1 (i.e., a product of actions that are maximal in ≺1 ). Thus, for any factor ai of m′ there is a monomial mi on A0 (and by Lemma 2, on A1 ) such that ai (≺1 )∓ mi . Since (≺1 )∓ ⊆ ≺1 ∪ ≺2 = ≺1 ⊕̄ ≺2 , we have a (≺1 ∪ ≺2 ) m1 · · · mk m′′ , and a ∉ A2 , which is in contradiction to the assumption. Thus, a ∈ A3 .
Conversely, if a ∈ A3 , then a is not dominated by any monomial on A0 in ≺1 ∪ ≺2 . Thus, a is maximal among A0 and A1 in both priorities, and a ∈ A2 .

Proposition 9 (Composability for Priorities). Given a transition system B and two priorities ≺1 , ≺2 , if ≺1 ∪ ≺2 = ≺1 ⊕ ≺2 then ((B, ≺1 ), ≺2 ) ≃ (B, ≺1 ⊕ ≺2 ).
Proof. Consider some state x, and let ≺ix be the static priority defined by ≺i at state x: ai ≺ix aj ⇐⇒ Ciji (x), i = 1, 2. Notice that ≺1x ⊕̄ ≺2x is irreflexive whenever ≺1 ⊕ ≺2 is irreflexive. The proof follows that of Proposition 8 for the static priorities ≺1x and ≺2x at state x.


Propositions 8 and 9 provide composability conditions, that is conditions
guaranteeing commutativity of two restrictions defined by priorities. The fol-
lowing proposition is easy to prove by using monotonicity properties and the
definitions of composition operations. It shows that the successive application of
priority restrictions can be safely replaced by their composition.
Proposition 10. For any transition system B and priorities ≺1 , ≺2 we have
– if ≺1 ⇒ ≺2 then (B, ≺2 ) ⊑ (B, ≺1 );
– (B, ≺1 ⊕ ≺2 ) ⊑ (B, ≺1 ∪ ≺2 ) ⊑ ((B, ≺1 ), ≺2 ). Furthermore, for static priorities, (B, ≺1 ⊕̄ ≺2 ) ≃ (B, ≺1 ⊕ ≺2 ).

3.3 Safety and Deadlock-Freedom


We present results relating deadlock-free control invariants to priorities. We show
that priorities can be used to define any restriction corresponding to a deadlock-
free control invariant.
Given a transition system B and a predicate U , the restriction V (U ) guar-
antees the invariance (safety) for U in B/V (U ), that is inv[B/V (U )](U ). Fur-
thermore, if U is a control invariant then V (U ) is a control restriction, that is a
restriction that does not affect the guards of uncontrollable actions. As a rule,
V (U ) is not deadlock-free. We define for a predicate U , a priority ≺U and study
relationships between its restrictions and V (U ).
Definition 15. Given a state predicate U on a transition system, the associated priority ≺U is defined by ≺U = {prea (¬U ) ∧ prea′ (U )}(a,a′ )∈Ac ×A .

Property 2. The priority ≺U is transitively closed and irreflexive and thus it


defines a deadlock-free restriction.

Proposition 11. For any transition system B and predicate U , B/V (U ) ⊑U (B, ≺U ). Furthermore, if U is a deadlock-free invariant of B, B/V (U ) ≃U (B, ≺U ).

Proof. As we consider B with initial set of states satisfying U we assume that all the guards Ga of its actions are such that Ga ⇒ U . Let us verify that if (Gai )′ is the restricted guard of action ai in (B, ≺U ), then Gai ∧ preai (U ) ⇒ (Gai )′ .
We find (Gai )′ = Gai ∧ ∧aj ∈A ¬(preai (¬U ) ∧ preaj (U ) ∧ Gaj ) = Gai ∧ ∧aj ∈A (¬preai (¬U ) ∨ ¬preaj (U )) = Gai ∧ ¬preai (¬U ) ∨ Gai ∧ ∧aj ∈A ¬preaj (U ).
Given that Gai ∧ ¬preai (¬U ) = Gai ∧ preai (U ), we have
(Gai )′ = Gai ∧ preai (U ) ∨ Gai ∧ ∧aj ∈A ¬preaj (U ).
From this follows that B/V (U ) ⊑U (B, ≺U ).
If U is a deadlock-free invariant then for any guard Gai , Gai ⇒ U ⇒ ∨aj ∈A preaj (U ). Thus, we have Gai ∧ ∧aj ∈A ¬preaj (U ) = false. Consequently, (Gai )′ = Gai ∧ preai (U ), which completes the proof.

A direct consequence of this proposition is that for any deadlock-free control invariant U , B/V (U ) ≃U (B, ≺U ). That is, the effect of deadlock-free controllers can be modeled by restrictions induced by priorities.
From this proof it also follows that the guards of B/V (U ) and (B, ≺U ) agree at deadlock-free states of B/V (U ) in U . They may differ at deadlock states of B/V (U ) where B is deadlock-free. In other words, (B, ≺U ) is a kind of “best deadlock-free abstraction” of B/V (U ) under U .
Example 5. We apply the previous proposition for B and U of Example 1. We show that (B, ≺U ) behaves exactly as B′ = B/V (U ) from any state satisfying the deadlock-free control invariant U .
We have prea1 (¬U ) ∧ preb2 (U ) ≡ w > 0, preb1 (¬U ) ∧ preb2 (U ) ≡ w ≥ 1, preb1 (¬U ) ∧ prea2 (U ) ≡ r > 0 and prea1 (¬U ) ∧ preb1 (U ) ≡ preb1 (¬U ) ∧ prea1 (U ) ≡ false. This gives the priority ≺U = {a1 ≺U (w > 0).b2 , b1 ≺U (w ≥ 1).b2 , b1 ≺U (r > 0).a2 }. It can be checked that (B, ≺U ) is indeed equivalent to B/V (U ). The computation of the restricted guards (Ga1 )′ and (Gb1 )′ gives
(Ga1 )′ = Ga1 ∧ (¬(w > 0) ∨ ¬Gb2 ) ≡ w = 0 and
(Gb1 )′ = Gb1 ∧ (¬(w ≥ 1) ∨ ¬Gb2 ) ∧ (¬(r > 0) ∨ ¬Ga2 ) ≡ (w = 0) ∧ (r = 0).
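The computation of Example 5 can be replayed mechanically. The sketch below is ours, with an arbitrary bound N on the sampled values of r and w; it evaluates the restricted guards of (B, ≺U ) state by state and checks that they coincide with the guards of B′ from Example 1.

```python
# Check, over a bounded grid of states, that the restricted guards of
# (B, <_U) from Example 5 coincide with the guards of B' from Example 1.
# The bound N is only there to make the check finite.

N = 20

def guards(r, w):
    # guards of the uncontrolled system B, as used in the computation above
    return {"a1": True, "a2": r > 0, "b1": True, "b2": w > 0}

def prio(r, w):
    # C_ij predicates of <_U (missing entries are false)
    return {("a1", "b2"): w > 0, ("b1", "b2"): w >= 1, ("b1", "a2"): r > 0}

def restricted_guard(a, r, w):
    g, c = guards(r, w), prio(r, w)
    return g[a] and not any(c.get((a, b), False) and g[b] for b in g)

for r in range(N):
    for w in range(N):
        assert restricted_guard("a1", r, w) == (w == 0)
        assert restricted_guard("b1", r, w) == (w == 0 and r == 0)
print("restricted guards of (B, <_U) match B' on all sampled states")
```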
The following propositions study relationships between safety and deadlock-
free restrictions.
Proposition 12. If U1 , U2 are two state predicates and ≺U1 , ≺U2 the corresponding priorities, then B/V (U1 ∧ U2 ) ⊑U1 ∧U2 (B, ≺U1 ⊕ ≺U2 ) ⊑U1 ∧U2 (B, ≺U1 ∧U2 ).
Furthermore, if U1 ∧ U2 is a deadlock-free invariant then B/V (U1 ∧ U2 ) ≃U1 ∧U2 (B, ≺U1 ⊕ ≺U2 ) ≃U1 ∧U2 (B, ≺U1 ∧U2 ).
Proof. Omitted.
This proposition says that (B, ≺U1 ⊕ ≺U2 ) is an upper approximation of
B/V (U1 ∧ U2 ). The following proposition shows an even stronger relationship
between the two priority systems.
Proposition 13. If U1 , U2 are two deadlock-free invariants of B and ≺U1 ⊕ ≺U2 is irreflexive then B/V (U1 ∧ U2 ) ≃U1 ∧U2 (B, ≺U1 ⊕ ≺U2 ), which is deadlock-free.

Proof. We have from B/V (U1 ) ≃U1 (B, ≺U1 ) and B/V (U2 ) ≃U2 (B, ≺U2 ), (B, ≺U1 ⊕ ≺U2 ) ⊑U1 (B, ≺U1 ∪ ≺U2 ) ⊑U1 B/V (U1 ) and (B, ≺U1 ⊕ ≺U2 ) ⊑U2 (B, ≺U1 ∪ ≺U2 ) ⊑U2 B/V (U2 ). This gives (B, ≺U1 ⊕ ≺U2 ) ⊑U1 ∧U2 B/(V (U1 ) ∧ V (U2 )) ≃ B/V (U1 ∧ U2 ). From the previous proposition we get the result.

The following proposition provides, for static priorities, a result similar to Proposition 11. It is very useful for establishing safety by using static priorities.

Proposition 14. Given a state predicate U on a transition system B = (X, A, {Ga }a∈A , {F a }a∈A ), let ≺U be a static priority such that ∀a ∈ A . prea (¬U ) =⇒ ∨m: a≺U m ∧ai ∈m Gai . Then, inv[(B, ≺U )](U ).
Proof. By Definition 10 of the semantics of (B, ≺U ).

As shown in the following example, this proposition provides a means to ensure invariance of an arbitrary predicate U by static priorities. The choice of ≺U is a trade-off between completeness and efficiency. Extreme choices are given by
– a ≺U a′ ⇐⇒ prea (¬U ) ∧ prea′ (U ) ≠ false, which is a priority with singleton monomials only; the closure of this priority may easily be not irreflexive.
– a ≺U m ⇐⇒ ∃ x . prea (¬U )(x) ∧ m = {a′ | Ga′ (x)}, which is the most fine-grained static priority ensuring invariance of U .

4 Example
We consider a robotic system controlled by the following processes:

– 3 trajectory control processes T C1 , T C2 , T C man. T C2 is more precise


and needs more resources than T C1 ; T C man is the process for manual
operation.
– 2 motion planners, MP1 , MP2 ; MP2 is more precise and needs more resources
than MP1 .

We consider for each process P predicates P.halted and P.running such that
P.halted ≡ ¬P.running. Each process P can leave states of P.halted (resp.
P.running) by action P.start (resp. P.stop), as in figure 1. The robotic system
must satisfy forever the following constraints:

1. In order to ensure permanent control of the position and movements of


the robot, at least one trajectory control process and at least one motion
planner must be running at any time: (TC1 .running ∨ TC2 .running ∨
TC man.running) ∧ (MP1 .running ∨ MP2 .running).
2. In order to meet the process deadlines, the CPU load must be kept below a
threshold, which excludes that both high-precision processes can be simul-
taneously active: TC2 .halted ∨ MP2 .halted.

Fig. 1. Transition system of a process: states halted and running, with action start leading from halted to running and action stop back from running to halted

The problem is to find a deadlock-free controller which restricts the behavior


of the system so that the above requirement is met. A similar problem has been
solved in [13] by using controller synthesis [12]. We propose a solution by finding
an appropriate static priority.
We put the global constraint to be satisfied as a predicate U in conjunctive
form: U = (TC1 .running ∨ TC2 .running ∨ TC man.running) ∧ (MP1 .running ∨
MP2 .running) ∧ (TC2 .halted ∨ MP2 .halted).
Invariance of U requires the invariance of each one of the three conjuncts,
disjunctions of predicates. We define the static priority ≺U in the following
manner.
For each conjunct D consider the critical configurations where only one literal
of the disjunction is true. The priority ≺U prevents critical actions, that is actions
that can turn this term to false, by keeping them dominated by safe actions
enabled in the considered configuration. More formally, for each disjunction D,
each critical action a (for which D ∧ prea (¬D) ≠ false) is dominated by the
monomial consisting of the safe actions enabled in D.
For example, take D = TC1 .running ∨ TC2 .running ∨ TC man.running.
Consider the critical configuration where TC1 .running = true, TC2 .running =
false, and TC man.running = false. Clearly, TC1 .stop is a critical action for this
configuration. Its occurrence can be prevented by the static priority TC1 .stop ≺
TC2 .start · TC man.start. The monomial TC2 .start · TC man.start is the product
of the safe actions enabled at this configuration. In this way, we compute the
static priority ≺U which guarantees invariance of U :

TC1 .stop ≺U TC2 .start · TC man.start


TC2 .stop ≺U TC1 .start · TC man.start
TC man.stop ≺U TC1 .start · TC2 .start
MP1 .stop ≺U MP2 .start
MP2 .stop ≺U MP1 .start
TC2 .start ≺U MP2 .stop
MP2 .start ≺U TC2 .stop

It is easy to see that ≺∓U is irreflexive. By Proposition 6, ≺U is a deadlock-free restriction. By Proposition 14, U is an invariant of (TC1 ∪ TC2 ∪ TC man ∪ MP1 ∪ MP2 , ≺U ).
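The construction of ≺U from the conjuncts of U can be automated along the lines described above. The following Python sketch is a hypothetical encoding (the representation of U as pairs of running/halted sets and the process names as strings are ours); it regenerates the priority rules listed above for the robotic system.

```python
# Generate the static priority <_U from a constraint U given as a
# conjunction of disjunctions over atoms P.running / P.halted,
# following the recipe described in the text.

def priority_rules(conjuncts):
    """conjuncts: list of (running_set, halted_set) pairs, one per disjunct."""
    rules = []
    for running, halted in conjuncts:
        for p in running:            # critical action p.stop
            monomial = [f"{q}.start" for q in running if q != p] + \
                       [f"{q}.stop" for q in halted]
            rules.append((f"{p}.stop", monomial))
        for p in halted:             # critical action p.start
            monomial = [f"{q}.start" for q in running] + \
                       [f"{q}.stop" for q in halted if q != p]
            rules.append((f"{p}.start", monomial))
    return rules

U = [({"TC1", "TC2", "TC_man"}, set()),   # some trajectory controller running
     ({"MP1", "MP2"}, set()),             # some motion planner running
     (set(), {"TC2", "MP2"})]             # TC2 halted or MP2 halted

for action, monomial in priority_rules(U):
    print(action, "<_U", " . ".join(sorted(monomial)))
```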

This approach can be applied to find deadlock-free control restrictions of arbi-


trary systems of processes {B1 , . . . , Bn } abstractly modeled as the deadlock-free
transition system of figure 1, preserving a predicate U , boolean expression on
atomic predicates Bi .running and Bi .halted. For example, U can express require-
ments on the global system state such as:

– a safety-critical process must not run unless a failure-handling process is


running;
– mutual exclusion between concurrently running processes, e.g., between a
safety-critical and an untrusted process.

We suppose U to be written as a conjunction of disjunctions
U = ∧i∈I ( ∨j∈Ji Bj .running ∨ ∨j∈Ji′ Bj .halted )
where I, Ji and Ji′ are index sets such that any conjunct has at least two atoms that are predicates on two different processes (this is always possible for any predicate U if we have at least two processes).
Invariance of U is equivalent to invariance of all of its conjuncts Di . Consider the conjunct ∨l∈Ji Bl .running ∨ ∨l∈Ji′ Bl .halted. As in the previous example, consider a critical configuration, that is, a configuration where only one literal is true. We distinguish two cases:

– if that literal is Bj .running (thus j ∈ Ji ), then Bj .stop violates U from this configuration, characterized by ∧l∈Ji \{j} Bl .halted ∧ ∧l∈Ji′ ∪{j} Bl .running. This action can be prevented by the static priority
Bj .stop ≺U Πl∈Ji \{j} Bl .start · Πl∈Ji′ Bl .stop
In this relation, Bj .stop is dominated by the monomial consisting of the actions of the other processes involved in this configuration.
– if the literal is Bj .halted (thus j ∈ Ji′ ), then Bj .start violates U , and we apply a similar reasoning and get Bj .start ≺U Πl∈Ji Bl .start · Πl∈Ji′ \{j} Bl .stop.

Let ≺U be the union of the so defined priorities for all i ∈ I.
By definition of ≺U , for any disjunct Di of U , any critical action a is dominated by at least one monomial m(a, Di ) = Π Bl .start · Π Bl .stop as constructed above, consisting of safe actions enabled in Di . Thus, prea (¬Di ) =⇒ ∧ai ∈m(a,Di ) Gai , and prea (¬U ) = prea (¬ ∧i∈I Di ) = prea ( ∨i∈I ¬Di ) = ∨i∈I prea (¬Di ) =⇒ ∨i∈I ∧ai ∈m(a,Di ) Gai . By Proposition 14, U is an invariant of ( ∪i Bi , ≺U ).
Notice that ≺U is minimally restrictive, that is, only transitions violating the invariance of U are inhibited.
 
Deadlock-freedom of ( ∪i Bi , ≺U ) is established by Proposition 6 if ≺∓U is irreflexive, which depends on the actual predicate U .

5 Discussion
We present a framework for the incremental construction of deadlock-free sys-
tems meeting given safety properties. The framework borrows concepts and
basic results from the controller synthesis paradigm by considering a step in
the construction process as a controller synthesis problem. Nevertheless, it does
not directly address controller synthesis and other related computationally hard
problems. Instead, it is based on the abstraction that the effect of the controller
corresponding to a deadlock-free control invariant can be modeled by deadlock-
free control restrictions.
Priorities play a central role in our framework. They can represent any
deadlock-free control restriction. They can be naturally used to model mutual
exclusion constraints and scheduling policies [4,2]. They are equipped with very
simple and natural composition operations and criteria for composability. We
provide an equational characterization of priorities and a sufficient condition for
representing deadlock-free restrictions. Static priorities are solutions expressed
as boolean expressions on guards for which a necessary and sufficient condition
for deadlock-freedom is provided.
The use of priority systems instead of simple transition systems is a key idea
in our approach. Of course, any priority system is, by its semantics, equivalent
to a transition system. Nevertheless, using such layered models offers numerous
advantages of composability and compositionality:

– The separation between transition system (behavior) and priorities allows


reducing global deadlock-freedom to deadlock-freedom of the transition sys-
tem and a condition on the composed priorities.
– The use of priorities to model mutual exclusion and scheduling policies in-
stead of using transition systems leads to more readable and compositional
descriptions [2].
– In [6,5] priority systems are used to define a general framework for com-
ponent-based modeling. This framework uses a single associative parallel
composition operator for layered components, encompassing heterogeneous
interaction. Priorities are used to express interaction constraints. For systems
of interacting components, we have proposed sufficient conditions for global
and individual deadlock-freedom, based on the separation between behavior
and priorities.
Our work on priorities found application in generating schedulers for real-
time Java applications [9]. This paper uses a scheduler synthesis algorithm that
directly generates (dynamic) priorities. Another interesting application is the
use of priorities in the IF toolset to implement efficiently run-to-completion
semantics of the RT-UML profile [7].
Priority systems combine behavior with priorities, a very simple enforce-
ment mechanism for safety and deadlock-freedom. This mechanism is powerful
enough to model the effect of controllers ensuring such properties. They offer
both abstraction and analysis for incremental system construction. Our theo-

retical framework can be a basis for the various approaches and practices using
enforcement mechanisms in a more or less ad-hoc manner.

References
1. K. Altisen, G. Gössler, A. Pnueli, J. Sifakis, S. Tripakis, and S. Yovine. A frame-
work for scheduler synthesis. In Proc. RTSS’99, pages 154–163. IEEE Computer
Society Press, 1999.
2. K. Altisen, G. Gössler, and J. Sifakis. Scheduler modeling based on the controller
synthesis paradigm. Journal of Real-Time Systems, special issue on ”control-
theoretical approaches to real-time computing”, 23(1/2):55–84, 2002.
3. L. Bauer, J. Ligatti, and D. Walker. A calculus for composing security policies.
Technical Report TR-655-02, Princeton University, 2002.
4. S. Bornot, G. Gössler, and J. Sifakis. On the construction of live timed systems.
In S. Graf and M. Schwartzbach, editors, Proc. TACAS’00, volume 1785 of LNCS,
pages 109–126. Springer-Verlag, 2000.
5. G. Gössler and J. Sifakis. Component-based construction of deadlock-free systems
(extended abstract). In proc. FSTTCS’03, volume 2914 of LNCS. Springer-Verlag,
2003.
6. G. Gössler and J. Sifakis. Composition for component-based modeling. In proc.
FMCO’02, volume 2852 of LNCS. Springer-Verlag, 2003.
7. S. Graf, I. Ober, and I. Ober. Model checking of uml models via a mapping to
communicating extended timed automata. In S. Graf and L. Mounier, editors,
Proc. SPIN’04, volume 2989 of LNCS. Springer-Verlag, 2004.
8. G. Kiczales, J. Lamping, A. Mendhekar, C. Maeda, C. Videira Lopes, J.-M. Lo-
ingtier, and J. Irwin. Aspect-oriented programming. In Proc. ECOOP ’97, volume
1241 of LNCS, page 220ff. Springer-Verlag, 1997.
9. C. Kloukinas, C. Nakhli, and S. Yovine. A methodology and tool support for
generating scheduled native code for real-time java applications. In R. Alur and
I. Lee, editors, Proc. EMSOFT’03, volume 2855 of LNCS, pages 274–289, 2003.
10. J. Ligatti, L. Bauer, and D. Walker. Edit automata: Enforcement mechanisms
for run-time security policies. Technical Report TR-681-03, Princeton University,
2003.
11. O. Maler, A. Pnueli, and J. Sifakis. On the synthesis of discrete controllers for
timed systems. In E.W. Mayr and C. Puech, editors, STACS’95, volume 900 of
LNCS, pages 229–242. Springer-Verlag, 1995.
12. P.J. Ramadge and W.M. Wonham. Supervisory control of a class of discrete event
processes. SIAM J. Control and Optimization, 25(1), 1987.
13. E. Rutten and H. Marchand. Task-level programming for control systems using
discrete control synthesis. Technical Report 4389, INRIA, 2002.
14. F. Schneider. Enforceable security policies. ACM Transactions on Information
and System Security, 3(1):30–50, 2000.
15. P. Tarr, M. D’Hondt, L. Bergmans, and C. V. Lopes. Workshop on aspects and
dimensions of concern: Requirements on, challenge problems for, advanced separa-
tion of concerns. In ECOOP 2000 Workshop Proceedings, Springer Verlag, 2000.
Preserving Properties Under Change

Heike Wehrheim
Universität Oldenburg, Fachbereich Informatik,
26111 Oldenburg, Germany
wehrheim@informatik.uni-oldenburg.de

Abstract. In this paper we discuss the question which properties of


a formally verified component are preserved when the component is
changed due to an adaption to a new use. More specifically, we will
investigate when a temporal logic property of an Object-Z class is pre-
served under a modification or extension of the class with new features.
To this end, we use the slicing technique from program analysis which
provides us with a representation of the dependencies within the class
in the form of a program dependence graph. This graph can be used to
determine the effect of a change to the class’ behaviour and thus to the
holding of a temporal logic formula.

1 Introduction
With the advent of component-based software engineering, systems are more
and more built from pre-fabricated components which are taken from libraries,
adapted to new needs and assembled into a system. Furthermore, for the design
of dependable systems formal methods are employed during the construction
process to improve the degree of correctness and reliability. The combination
of these two techniques — component-based design and formal methods — in
system construction poses a large number of new research challenges which are
currently very actively taken up.
This paper studies one aspect arising in this area, based on the following
scenario of a component-based construction. We assume that we have a library
of components which are formally specified and proven correct with respect to
certain requirements. During system construction components are taken from the
library and (since they might not fully comply to their new use) are modified
or even extended with new features. The question is then whether the proven
properties are preserved under this specialisation and thus whether we can also
get a re-use of verification results and not just of components. More specifically,
given a component A (which will be a single class here) and its modification or
extension C we are interested in knowing whether a property P holding for A
still holds for C (see the following figure).


[Figure: a class A satisfying property P , and a specialisation C of A for which it is asked whether property P still holds]

Although the picture might suggest that the relationship between A and C
is that of inheritance (since we use the specialisation arrow of UML) we are
actually interested in a more general relationship: C may be any class which
is constructed out of A, may it be by inheritance or by a simple change of the
existing specification.
As a first observation, it can be remarked that even a restriction to inheri-
tance cannot ensure that properties are preserved: a subclass may differ from its
superclass in any aspect and thus none of the properties holding for A might be
preserved in C . Still, preservation of properties to subclasses is an important and
intensively studied topic. Within the area of program verification, especially of
Java programs, this question has already been tackled by a number of researchers
[6,10,5]. In these approaches correctness properties are mainly formulated in
Hoare logic, and the aim is to find proof rules which help to deduce subclass
properties from superclass properties. In order to get correctness of these rules
it is required that the subclass is a behavioural subtype [7] of the superclass. This
assumption is also the basis of [14] which studies preservation of properties in an
event-based setting with correctness requirements formulated as CSP processes.
In this paper we lift this assumption (although also looking at subtypes as a
special case) and consider arbitrary classes constructed out of existing classes.
For convenience we will, however, often call the class C the subclass and A
its superclass. Instead of employing restrictions on the subclass (in order to
preserve properties) we will compute whether a property is preserved or not. This
computation does not involve re-verification of the property but can be carried
out on a special representation of the classes called program dependence graphs.
Program dependence graphs carry all information about the dependencies within
programs (or in our case, specifications) and thus can be used to determine
the influence of a change or extension on proven properties. This technique
is originally coming from program analysis where slicing techniques operating
on program dependence graphs are used to reduce a program with respect to
certain variables of interest. Slicing techniques (and a similar technique called
cone-of-influence reduction) are also being applied in software and hardware
model checking for reducing programs [4,9,1].
In our framework classes are not written in a programming language but are
defined in a state-based object-oriented formal method (Object-Z [11,2]). Cor-
rectness requirements on classes are formalised in a temporal logic (LTL [8]).
As changes (specialisation) we allow the addition of attributes, the modification
of existing methods and the extension with new methods. A comparable study

about inheritance of CTL properties is described in [15], however, not employ-


ing the program dependence graphs of slicing which allow for a more succinct
representation of the dependencies within specifications.
The paper is structured as follows. In the next section we define the necessary
background for our study. Section 3 studies property preservation for subtypes
and section 4 introduces slicing as a more general technique for computing pre-
served properties for arbitrary changes. Section 5 concludes.

2 Background
This section describes the background necessary for understanding the results:
the definition of classes in Object-Z, the temporal logic LTL and a result showing
that LTL-X properties are preserved under stuttering equivalence. Stuttering
equivalence will be used to compare super- and subclasses.

2.1 Class Definitions


Classes are described in a formalism very close to Object-Z [11]1 . Object-Z is an
object-oriented extension of Z and thus a state-based specification technique.
The following specification of a simple account is the running example for our
technique. It specifies the state of an account (with a certain balance), its initial
value and two methods for depositing and withdrawing money from the account.
Methods are specified with enable and effect schemas describing the guard (to
the execution of) and the effect of executing the method. For instance, since the
account may not be overdrawn, the guard of Withdraw specifies that the amount
of money to be withdrawn may not exceed the balance. The Δ-list of an effect
schema fixes the set of variables which may be changed by an execution of the
method.

Account0
balance : Z
Init
  balance = 0
enable Deposit
  amount? : N
  true
effect Deposit
  Δ(balance)
  amount? : N
  balance′ = balance + amount?

1
In fact, it is the Object-Z part of CSP-OZ specifications [3] (a formalism which inte-
grates CSP with Object-Z). We use this formalism since we are ultimately interested
in answering the question of property preservation for CSP-OZ.

enable Withdraw effect Withdraw


amount? : N Δ(balance)
amount? : N
balance ≥ amount?
balance  = balance − amount?

In our definitions we use the following non-graphical formulation of classes.


Classes consist of attributes (or variables) and methods to operate on attributes.
Methods may have input parameters and may return values, referred to as output
parameters. We assume variables and input/output parameters to have values
from a global set D. A valuation of a set of variables V is a mapping from V to
D, we let RV = {ρ : V → D} stand for the set of all valuations of V ; the set of
valuations of input parameters Inp and output parameters Out can be similarly
defined. We assume that the initialisation schema precisely fixes the values of
variables (i.e. is deterministic) in order to have just one initial state2 .
A class is thus characterised by
– A set of attributes (or variables) V ,
– an initial valuation of V to be used upon construction of objects: I : V → D,
and
– a set of methods (names) M with input and output parameters from a set
of inputs Inp and a set of outputs Out. For simplicity we assume Inp and
Out to be global. Each m ∈ M has a guard enablem : RV × RInp → B
(B are booleans) and an effect effectm : RV × RInp → RV × ROut . The
guard specifies the states in which the method is executable and the effect
determines the outcome of the method execution.
A class will thus be denoted by (V , I , (enablem )m∈M , (effectm )m∈M ) or (V , I ,
M ) for short. We furthermore need to know the set of variables which are set and
referenced by a schema: Set(enable m) = ∅, Set(effect m) are the variable
appearing in the Δ-list of the effect schema and Ref (enable m), Ref (effect m)
are those that syntactically appear in the schemas enable m, effect m respec-
tively.
The semantics of a class is defined in terms of Kripke structures.
Definition 1. Let AP be a nonempty set of atomic propositions. A Kripke structure K = (S , s0 , →, L) over AP consists of a finite set of states S , an initial state s0 ∈ S , a transition relation → ⊆ S × S and a labelling function L : S → 2AP .
The set of atomic propositions determines what we may observe about a
state. Essentially there are two kinds of properties we like to look at: the values
of variables and the availability of methods. Thus the atomic propositions APA
that we consider for a class A = (V , I , (enablem )m∈M , (effectm )m∈M ) are

2
This assumption is not essential but more convenient.

– v = d , v ∈ V , d ∈ D and
– enabled (m), m ∈ M .

The Kripke structure semantics of a class definition is then defined as follows.

Definition 2. The semantics of (an object of) a class A = (V , I , (enablem )m∈M , (effectm )m∈M ) is the Kripke structure K = (S , s0 , →, L) over APA with
– S = RV ,
– s0 = I ,
– → = {(s, s′ ) | ∃ m ∈ M , ρin ∈ RInp , ρout ∈ ROut : enablem (s, ρin ) ∧ effectm (s, ρin ) = (s′ , ρout )},
– L(s) = {v = d | s(v ) = d } ∪ {enabled (m) | ∃ ρin ∈ RInp : enablem (s, ρin )}.

Since the atomic propositions do not refer to inputs and outputs of methods,
they are not reflected in the semantics. However, inputs and outputs can be
embedded in the state and thus can be made part of the atomic propositions
(see e.g. [12]).
Figure 1 shows the Kripke structure (without L) of class Account0 . The num-
bers indicate the values of attribute balance. All states satisfying balance < 0 are
unreachable. The upper arrows correspond to executions of Deposit, the lower
to those of Withdraw .

Fig. 1. Kripke structure of class Account0 (states . . . , −1, 0, 1, . . . labelled by the value of balance)
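To illustrate Definition 2, the following Python sketch (ours; the bounds MAX_AMOUNT and MAX_STEPS are arbitrary and only keep the otherwise infinite Kripke structure finite) enumerates a fragment of the reachable states of Account0 together with their labels.

```python
# A bounded, hypothetical rendering of the Kripke structure of Account0:
# states are valuations of 'balance', transitions are method executions
# for some input amount.  Inputs are restricted to 1..MAX_AMOUNT and the
# exploration to MAX_STEPS breadth-first levels, purely for illustration.

MAX_AMOUNT, MAX_STEPS = 2, 3

def enabled_deposit(balance, amount):
    return True                      # enable_Deposit

def enabled_withdraw(balance, amount):
    return balance >= amount         # enable_Withdraw

def successors(balance):
    for amount in range(1, MAX_AMOUNT + 1):
        if enabled_deposit(balance, amount):
            yield balance + amount   # effect_Deposit
        if enabled_withdraw(balance, amount):
            yield balance - amount   # effect_Withdraw

def labels(balance):
    lab = {f"balance={balance}"}
    if any(enabled_deposit(balance, a) for a in range(1, MAX_AMOUNT + 1)):
        lab.add("enabled(Deposit)")
    if any(enabled_withdraw(balance, a) for a in range(1, MAX_AMOUNT + 1)):
        lab.add("enabled(Withdraw)")
    return lab

frontier, reachable = {0}, {0}
for _ in range(MAX_STEPS):
    frontier = {s2 for s in frontier for s2 in successors(s)} - reachable
    reachable |= frontier

for s in sorted(reachable):
    print(s, sorted(labels(s)))
```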

Furthermore, we have to fix the kind of changes allowed in subclasses. We do not allow methods to be removed, but methods can be arbitrarily modified, and new methods and variables may be introduced.

Definition 3. Let A and C be classes. C is a specialisation of A if VA ⊆ VC ,


MA ⊆ MC and IC |VA = IA .

2.2 LTL Formulae


The temporal logic which we use for describing our properties on classes is linear-
time temporal logic (LTL) [8].

Definition 4. The set of LTL formulae over AP is defined as the smallest set
of formulae satisfying the following conditions:

– p ∈ AP is a formula,
– if ϕ1 , ϕ2 are formulae, so are ¬ϕ1 and ϕ1 ∨ ϕ2 ,
– if ϕ is a formula, so are X ϕ (Next), 2ϕ (Always), 3ϕ (Eventually),
– if ϕ1 , ϕ2 are formulae, so is ϕ1 U ϕ2 (Until).

As usual, other boolean connectives can be derived from ¬ and ∨. The


next-less part of LTL is referred to as LTL-X. LTL formulae are interpreted on
paths of the Kripke structure, and a formula holds for the Kripke structure if it
holds for all of its paths.

Definition 5. Let K = (S , s, →, L) be a Kripke structure. A finite or infinite sequence of states π = s0 s1 s2 . . . is a path of K iff s = s0 and (si , si+1 ) ∈ → for all 0 ≤ i . For a path π = s0 s1 s2 . . . we write π[i ] to stand for si and π i to stand for si si+1 si+2 . . . . The length of a path π, #π, is defined to be the number of states (in case of a finite path) or ∞ (in case of an infinite path).

Usually, paths are assumed to be always infinite (and LTL formulae inter-
preted on infinite paths). We deviate from that because objects may also exhibit
finite behaviour: if no methods are called from the outside anymore, the object
just stops. This has, however, consequences on the holding of liveness properties:
since for instance s0 alone is also a path, a liveness property can only hold if
it already holds in the initial state. Thus we essentially treat safety here. Live-
ness can be treated if we additionally make some fairness assumptions on the
environment of an object (see conclusion for a discussion).

Definition 6. Let K = (S , s0 , →, L) be a Kripke structure and ϕ an LTL formula, both over AP . K satisfies ϕ (K |= ϕ) iff π |= ϕ holds for all paths π of K , where π |= ϕ is defined as follows:
– π |= p iff p ∈ L(π[0]),
– π |= ¬ϕ iff not π |= ϕ,
– π |= ϕ1 ∨ ϕ2 iff π |= ϕ1 or π |= ϕ2 ,
– π |= X ϕ iff #π > 1 ∧ π 1 |= ϕ,
– π |= 2ϕ iff ∀ i , 0 ≤ i < #π : π i |= ϕ,
– π |= 3ϕ iff ∃ i , 0 ≤ i < #π : π i |= ϕ,
– π |= ϕ1 U ϕ2 iff ∃ k , 0 ≤ k < #π : π k |= ϕ2 and ∀ j , 0 ≤ j < k : π j |= ϕ1 .
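The finite-path semantics above can be turned directly into a small evaluator. The sketch below is ours: formulas are encoded as nested tuples and, for brevity, facts such as balance ≥ 0 are treated as single propositions in the state labels.

```python
# A small, hypothetical evaluator for the finite-path LTL semantics above.
# A path is a list of label sets (the L(s) of its states); formulas are
# nested tuples, e.g. ("G", ("ap", "p")).

def holds(path, phi):
    if not path:
        return False
    op = phi[0]
    if op == "ap":                         # atomic proposition
        return phi[1] in path[0]
    if op == "not":
        return not holds(path, phi[1])
    if op == "or":
        return holds(path, phi[1]) or holds(path, phi[2])
    if op == "X":
        return len(path) > 1 and holds(path[1:], phi[1])
    if op == "G":                          # always
        return all(holds(path[i:], phi[1]) for i in range(len(path)))
    if op == "F":                          # eventually
        return any(holds(path[i:], phi[1]) for i in range(len(path)))
    if op == "U":
        return any(holds(path[k:], phi[2]) and
                   all(holds(path[j:], phi[1]) for j in range(k))
                   for k in range(len(path)))
    raise ValueError(f"unknown operator {op!r}")

# A finite path of Account0: balance 0 -> 2 -> 1 (Deposit 2, Withdraw 1).
path = [{"balance>=0", "enabled(Deposit)"},
        {"balance>=0", "enabled(Deposit)", "enabled(Withdraw)"},
        {"balance>=0", "enabled(Deposit)", "enabled(Withdraw)"}]

print(holds(path, ("G", ("ap", "enabled(Deposit)"))))   # True
print(holds(path, ("F", ("ap", "enabled(Withdraw)"))))  # True
```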

For our bank example we for instance have the following properties. The
Kripke structure KAccount0 of Account0 fulfills

KAccount0 |= 2(balance ≥ 0) ,
KAccount0 |= 2(enabled (Deposit)) .

2.3 Stuttering Equivalence


For showing that properties are preserved under change, or more particular,
that a certain property still holds for a subclass, we will later compare super-
and subclasses according to a notion of equivalence called stuttering equivalence.
Stuttering equivalence is defined with respect to some set of atomic propositions and roughly says that on these propositions of interest two Kripke structures have
an equivalent behaviour. All transitions changing propositions outside those of
interest are regarded as stuttering steps.

Definition 7. Two infinite paths π = s0 s1 s2 . . . and ρ = r0 r1 r2 . . . are stuttering equivalent wrt. a set of atomic propositions AP (π ≈AP ρ) if there are two sequences of indices 0 = i0 < i1 < i2 < . . . and 0 = j0 < j1 < j2 < . . . such that for every k ≥ 0
L(s_{i_k}) ∩ AP = L(s_{i_k+1}) ∩ AP = · · · = L(s_{i_{k+1}−1}) ∩ AP = L(r_{j_k}) ∩ AP = L(r_{j_k+1}) ∩ AP = · · · = L(r_{j_{k+1}−1}) ∩ AP .
A finite path π = s0 . . . sn is stuttering equivalent to an infinite path ρ if its extension with an infinite number of repetitions of the last state, i.e. s0 . . . sn sn sn . . ., is stuttering equivalent to ρ. (And similarly for two finite paths.)

Intuitively, the sequences are equivalent if they can be divided into blocks in
which atomic propositions stay stable and the i -th block in π has the same set
of propositions from AP as the i -th block in ρ (illustrated in Figure 2).

π: (p, q) (p, q) (p, ¬q) (p, ¬q) (¬p, ¬q) (p, ¬q) . . .
ρ: (p, q) (p, ¬q) (p, ¬q) (p, ¬q) (¬p, ¬q) (p, ¬q) . . .
Fig. 2. Stuttering equivalent sequences
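For finite paths, stuttering equivalence can be checked by projecting each state's labels onto AP and collapsing consecutive repetitions, following the block structure of Definition 7. The following sketch (ours) does exactly that and reproduces the situation of Figure 2.

```python
# A hypothetical check of stuttering equivalence (Definition 7) for finite
# paths: project each state's labels onto AP and collapse consecutive
# repetitions; two finite paths are stuttering equivalent wrt AP iff the
# collapsed block sequences coincide.

def compress(path, AP):
    blocks = []
    for labels in path:
        view = frozenset(labels) & AP
        if not blocks or blocks[-1] != view:
            blocks.append(view)
    return blocks

def stuttering_equivalent(path1, path2, AP):
    return compress(path1, AP) == compress(path2, AP)

AP = frozenset({"p", "q"})
pi  = [{"p", "q"}, {"p", "q"}, {"p"}, {"p"}, set(), {"p"}]
rho = [{"p", "q"}, {"p"}, {"p"}, {"p"}, set(), {"p"}]
print(stuttering_equivalent(pi, rho, AP))        # True, as in Figure 2
print(stuttering_equivalent(pi, [{"q"}], AP))    # False
```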

Definition 8. Let Ki = (Si , s0,i , →i , Li ), i = 1, 2, be Kripke structures over AP1 , AP2 , respectively. K1 and K2 are stuttering equivalent with respect to a set of atomic propositions AP ⊆ AP1 ∩ AP2 (K1 ≈AP K2 ) iff
– initial states agree on AP : L1 (s0,1 ) ∩ AP = L2 (s0,2 ) ∩ AP ,
– for each path π in K1 starting from s0,1 there exists a path π′ starting from s0,2 such that π ≈AP π′ ,
– and vice versa, for each path π in K2 starting from s0,2 there exists a path π′ starting from s0,1 such that π ≈AP π′ .

Stuttering equivalent Kripke structures satisfy the same set of LTL-X prop-
erties [1]. The Next operator has to be omitted since stuttering may introduce
additional steps in one structure which have no counterpart in the other.

Theorem 1. Let ϕ be an LTL-X formula over AP and K1 , K2 Kripke struc-


tures. If K1 ≈AP K2 then

K1 |= ϕ iff K2 |= ϕ .

3 Property Preservation
Now that we have set the ground, we have another look at our example and
make two changes to the class. The first is an extension of the class: we add one
new method for balance checking. Here, we use inheritance to avoid having to
write the whole specification again.

Account1
inherit Account0
enable CheckBalance
true

effect CheckBalance
bal ! : Z
bal ! = balance

The second change is a modification: we modify the account such that it allows overdrawing up to a certain amount. Here, we inherit all parts but the definition of Withdraw , which is overwritten by the new definition.

Account2
inherit Account0
modifies Withdraw
overdraft : N
Init
  overdraft = 1000
enable Withdraw
  amount? : N
  balance − amount? ≥ −overdraft
effect Withdraw
  Δ(balance)
  amount? : N
  balance′ = balance − amount?

The question is then which of our properties are preserved, i.e. which of the
following questions can be answered with yes.

KAccount1 |= 2(balance ≥ 0)?


KAccount1 |= 2(enabled (Deposit))?
KAccount2 |= 2(balance ≥ 0)?
KAccount2 |= 2(enabled (Deposit))?

For this simple example, the answers are easy. What we aim at is, however,
a general technique which answers such questions. In general, these two changes

are of two different types. The changed class can be a subtype of the original class
(and then all properties are preserved) or it is not (and then a more sophisticated
technique has to be applied to find out whether a property is preserved).
In this section, we deal with the first, simpler case. The second case is
dealt with in the next section. A subtype can be seen as a conservative extension
of a class: new methods may read but may not modify old variables.

Definition 9. Let A = (VA , IA , MA ) and C = (VC , IC , MC ) be two classes, C a specialisation of A. C is a subtype of A iff the following conditions hold:
– ∀ m ∈ MC \ MA : SetC (effect m) ⊆ VC \ VA (m only modifies new variables),
– ∀ m ∈ MA : enablemC = enablemA ∧ effectmC = effectmA (old methods not modified).
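The subtype conditions of Definition 9 are purely syntactic and can be checked mechanically. The sketch below is a hypothetical encoding of classes as dictionaries recording variables, Set-information and schema texts (comparing schema texts for equality is a simplification of our own); it illustrates the definition on the account classes.

```python
# A hypothetical syntactic check of the subtype conditions of Definition 9.
# Each method records the variables written by its effect schema ("set")
# and an opaque schema text used to detect modification of old methods.

def is_subtype(A, C):
    new_methods = set(C["methods"]) - set(A["methods"])
    new_vars = set(C["vars"]) - set(A["vars"])
    # new methods may only modify new variables
    cond1 = all(C["methods"][m]["set"] <= new_vars for m in new_methods)
    # inherited methods must be unchanged
    cond2 = all(C["methods"][m] == A["methods"][m] for m in A["methods"])
    return cond1 and cond2

Account0 = {"vars": {"balance"},
            "methods": {"Deposit":  {"set": {"balance"},
                                     "schema": "true / balance' = balance + amount?"},
                        "Withdraw": {"set": {"balance"},
                                     "schema": "balance >= amount? / balance' = balance - amount?"}}}

Account1 = {"vars": {"balance"},
            "methods": dict(Account0["methods"],
                            CheckBalance={"set": set(), "schema": "bal! = balance"})}

Account2 = {"vars": {"balance", "overdraft"},
            "methods": dict(Account0["methods"],
                            Withdraw={"set": {"balance"},
                                      "schema": "balance - amount? >= -overdraft / balance' = balance - amount?"})}

print(is_subtype(Account0, Account1))  # True
print(is_subtype(Account0, Account2))  # False: Withdraw is modified
```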

Subtypes inherit all properties as long as they are only talking about propo-
sitions over the old attributes and methods.

Theorem 2. Let C , A be classes, C a subtype of A. Let furthermore AP be the


set of atomic propositions over VA and MA . For all LTL-X formulae ϕ over AP
we then have

A |= ϕ =⇒ C |= ϕ .

In fact, the implication also holds in the reverse direction. The proof proceeds
by showing that C and A are stuttering equivalent. The stuttering steps in C
are those belonging to executions of the new methods: they do not change old
attributes and thus do not affect AP . The proof can be found in an extended
version of this paper.
Coming back to our example, Account1 is a subtype of Account0 : CheckBalance
only references balance but does not modify it. Hence both properties are pre-
served:

KAccount1 |= 2(balance ≥ 0)
KAccount1 |= 2(enabled (Deposit))

4 Slicing
In this section we look at the more general case, where the modifications do not
lead to subtypes. For this case, we cannot get one general result but have to
specifically look at the changes made and the properties of interest.
The technique we use for computing whether a property is preserved under a
specific change is the slicing technique of program analysis [13]. In program ana-
lysis slicing is originally used for debugging and testing, and answers questions
like the following: ”given a variable v and a program point p, which part of
the program may influence the value of v at p?”. Here, we like to extract a
similar kind of information about our changes: ”given some propositions and

some change, does it influence the value of these propositions?”. Technically,


slicing operates on graphs which contain information about the dependencies
within a program, so called program dependence graphs (PDG). A similar graph
is now built for Object-Z classes. It starts from the control flow graph (CFG) of
a class (depicted in Figure 3), which contains
– one node n0 labelled Init,
– one node nDO labelled DO (nondeterministic choice),
– for every method m two nodes nen m and neff m labelled enable m and
effect m.

Fig. 3. Control flow graph of a class (nodes Init, DO, enable m1 , . . . , enable mn , and effect m1 , . . . , effect mn )

The program dependence graph is obtained from this graph by erasing all
arrows and adding new ones corresponding to the control and data dependencies
of the class. Formally,

Definition 10. A program dependence graph (PDG) of a class specification (V , I , M ) is a graph G = (K , l , ⇝, ↪) with
– K = {n0 , nDO } ∪ {nen m | m ∈ M } ∪ {neff m | m ∈ M } a set of nodes,
– l a labelling function with l : n0 → Init, nDO → DO, nen m → enable m, neff m → effect m,
– ⇝ ⊆ K × K the data dependence edges defined by
n ⇝ n′ iff ∃ x ∈ V : x ∈ Set(l (n)) and x ∈ Ref (l (n′ )) and n →∗CFG n′ ,
– ↪ ⊆ K × K the control dependence edges defined by
n ↪ n′ iff ∃ m ∈ M : l (n) = enable m and l (n′ ) = effect m .
Here, we take Set(DO) = Ref (DO) = ∅. For class Account0 this gives rise
to the graph shown in Figure 4.

Fig. 4. PDG of class Account0 (nodes Init, DO, enable/effect Deposit, enable/effect Withdraw , connected by data and control dependency edges)

For computing whether a property of A is preserved in C , we build a PDG


including methods and dependencies of both A and C . In this PDGA,C we next
determine the forward slice of all modified or new methods (where we say that
Init is modified if it assigns new values to variables from A). The forward slice of
a set of nodes N is the part of the graph which is forward reachable from nodes
in N via data or control dependencies.
Definition 11. Let C , A be classes, C a specialisation of A. Let furthermore N be the nodes belonging to methods which are changed or new in C , i.e.
N = {n | (l (n) ∈ {enable m, effect m | m ∈ MC \ MA }) ∨ ( ∃ m ∈ MA : l (n) = enable m ∧ enablemC ≠ enablemA ) ∨ ( ∃ m ∈ MA : l (n) = effect m ∧ effectmC ≠ effectmA )}
The forward slice of N is the set of nodes in PDGA,C which are forward reachable from N , i.e.
fs(N ) = {n′ ∈ K | ∃ n ∈ N : n (⇝ ∪ ↪)∗PDG A,C n′ }

The forward slice of N is the part of the class which is directly or indirectly
influenced by the changes. The atomic propositions appearing in this part might
be changed. We let APN denote the atomic propositions over variables or meth-
ods in the forward slice of N .

APN = {v = d | v ∈ VC \ VA ∨ ∃ n ∈ fs(N ) : v ∈ SetC (l (n)) ∪ SetA (l (n))}


∪ {enabled (m) | ∃ n ∈ fs(N ) : l (n) = enable m}

Since these atomic propositions are potentially affected by the change, a


formula talking about them might not hold in the subclass anymore. However,
if a formula does not use propositions in APN then it is preserved.

Theorem 3. Let A, C be classes, N the set of methods changed or new in C . If ϕ is an LTL-X formula over AP \ APN , then the following holds:
A |= ϕ =⇒ C |= ϕ .

Fig. 5. PDG of Account0 , Account2

The proof again proceeds by showing that KA and KC are stuttering equiv-
alent wrt. AP \ APN and is included in an extended version. Since they are
stuttering equivalent the implication in the theorem holds in the reverse direc-
tion as well.
For our example, the PDG for Account0 , Account2 is depicted in Figure 5.
The set of changed methods N is {Withdraw }. Nodes not in the forward slice
of Withdraw are {Init, DO, enable Deposit}. The variable balance is set by a
method in the forward slice, but enable Deposit is not in the forward slice.
Hence, concerning our properties, we know that one of them is preserved:

KAccount2 |= 2(enabled (Deposit))

but for the question KAccount2 |= 2(balance ≥ 0)? we get no answer (and in fact
this property does not hold anymore).
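The slice computation just described can be reproduced with a few lines of code. The following sketch is ours: the Set/Ref sets are transcribed by hand from the schemas, and we assume the usual control flow Init → DO → enable m → effect m → DO of Figure 3, so that every node can reach every node except Init. It builds the dependence edges of the PDG for Account0 , Account2 and computes the forward slice of the modified guard of Withdraw .

```python
# A hypothetical sketch of the PDG / forward-slice computation of Section 4
# for the pair (Account0, Account2).

from itertools import product

NODES = ["Init", "DO",
         "enable Deposit", "effect Deposit",
         "enable Withdraw", "effect Withdraw"]

SET = {"Init": {"balance", "overdraft"}, "DO": set(),
       "enable Deposit": set(), "effect Deposit": {"balance"},
       "enable Withdraw": set(), "effect Withdraw": {"balance"}}

REF = {"Init": set(), "DO": set(),
       "enable Deposit": set(), "effect Deposit": {"balance"},
       "enable Withdraw": {"balance", "overdraft"}, "effect Withdraw": {"balance"}}

def cfg_reaches(n, m):
    # In the assumed CFG (Init -> DO -> enable_m -> effect_m -> DO) every
    # node reaches every node except Init; Init only matters as a source,
    # never as a target of a dependence (Ref(Init) is empty anyway).
    return m != "Init"

EDGES = {(n, m) for n, m in product(NODES, NODES)
         if SET[n] & REF[m] and cfg_reaches(n, m)}              # data dependence
EDGES |= {("enable Deposit", "effect Deposit"),
          ("enable Withdraw", "effect Withdraw")}               # control dependence

def forward_slice(start):
    slice_, frontier = set(start), list(start)
    while frontier:
        n = frontier.pop()
        for (a, b) in EDGES:
            if a == n and b not in slice_:
                slice_.add(b)
                frontier.append(b)
    return slice_

# Account2 modifies only the guard of Withdraw, so N = {enable Withdraw}.
fs = forward_slice({"enable Withdraw"})
print(sorted(fs))                      # enable/effect Withdraw and effect Deposit
print("enabled(Deposit) preserved:", "enable Deposit" not in fs)
```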
The case of changes leading to subtypes can be seen as one particular instance
of this more general result: for subtypes we know by definition that the forward
slice (of the new methods) will only contain new methods and thus affects only
new variables. Hence, the proof of Theorem 3 can be seen as an alternative way
of proving Theorem 2.
The PDG of Account0 , Account1 is depicted in Figure 6. As can be seen, in
the forward slice of CheckBalance there is only CheckBalance.

5 Conclusion
This work is concerned with the re-use of verification results of classes. Given
a verified class the technique presented in this paper can be used to determine

Fig. 6. PDG of Account0 , Account1

whether some specific property is preserved under a change made to the class.
The technique relies on the representation of the dependencies of a class specifi-
cation in a program dependence graph. On this graph it is possible to determine
the effect of changes on the behaviour of a class. As a special case we looked
at changes inducing subtypes in which all properties (talking about the original
class) are preserved.
So far, this technique considers a single class only. It could be extended to
larger systems either by combining it with compositional verification techniques
(e.g. for Object-Z [16]), or by constructing a program dependence graph of the
whole system. The latter could be achieved by combining program dependence
graphs of the individual objects through a special new dependency arc reflect-
ing the call structure between objects (possibly following approaches for slicing
programs with procedures).
Another limitation of the work presented here concerns the treatment of
liveness properties. The inclusion of finite paths of a Kripke structure into the
interpretation of LTL formulae leads to a restriction to safety properties. There
are a number of ways of avoiding this limitation. One way would be to make
certain assumptions about the environment of an object in that it infinitely often
calls certain methods. This set of methods can be used as a fairness constraint
on the behaviour of an object. The interpretation of LTL formulae can then be
restricted to fair paths. If the fairness constraint for the subclass is the same as
that for the superclass preservation of properties can be achieved.
As future work, we like to extend the technique presented here to integrated
specification formalisms which allow for the modelling of different viewpoints in
different formal methods, as for instance CSP-OZ [3].

References
1. E. Clarke, O. Grumberg, and D. Peled. Model checking. MIT Press, 1999.
2. R. Duke, G. Rose, and G. Smith. Object-Z: A specification language advocated for
the description of standards. Computer Standards and Interfaces, 17:511–533, 1995.

3. C. Fischer. CSP-OZ: A combination of Object-Z and CSP. In H. Bowman and


J. Derrick, editors, Formal Methods for Open Object-Based Distributed Systems
(FMOODS ’97), volume 2, pages 423–438. Chapman & Hall, 1997.
4. J. Hatcliff, M. Dwyer, and H. Zheng. Slicing software for model construction.
Higher-order and Symbolic Computation. To appear.
5. K. Huizing and R. Kuiper. Reinforcing fragile base classes. In A. Poetzsch-Heffter,
editor, Workshop on Formal Techniques for Java Programs, ECOOP 2001, 2001.
6. G.T. Leavens and W.E. Weihl. Specification and verification of object-oriented
programs using supertype abstraction. Acta Informatica, 32:705–778, 1995.
7. B. Liskov and J. Wing. A behavioural notion of subtyping. ACM Transactions
on Programming Languages and Systems, 16(6):1811 – 1841, 1994.
8. Z. Manna and A. Pnueli. The temporal logic of reactive and concurrent systems
(Specification). 1991.
9. L. Millett and T. Teitelbaum. Issues in slicing promela and its applications
to model checking, protocol understanding, and simulation. Software Tools for
Technology Transfer, 2(4):343–349, 2000.
10. A. Poetzsch-Heffter and J. Meyer. Interactive verification environments for object-
oriented languages. Journal of Universal Computer Science, 5(3):208–225, 1999.
11. G. Smith. The Object-Z Specification Language. Kluwer Academic Publisher, 2000.
12. G. Smith and K. Winter. Proving Temporal Properties of Z Specifications Using
Abstraction. In D. Bert, J.P. Bowen, S. King, and M. Walden, editors, ZB 2003:
Formal Specification and Development in Z and B, number 2651 in LNCS, pages
260–279. Springer, 2003.
13. F. Tip. A survey of program slicing techniques. Journal of programming languages,
3(3), 1995.
14. H. Wehrheim. Behavioural subtyping and property preservation. In S. Smith
and C. Talcott, editors, FMOODS’00: Formal Methods for Open Object-Based
Distributed Systems. Kluwer, 2000.
15. H. Wehrheim. Inheritance of temporal logic properties. In P. Stevens and
U. Nestmann, editors, FMOODS 2003: Formal Methods for Open Object-based
Distributed Systems, volume 2884 of LNCS, pages 79–93. Springer, 2003.
16. K. Winter and G. Smith. Compositional Verification for Object-Z. In D. Bert,
J.P. Bowen, S. King, and M. Walden, editors, ZB 2003: Formal Specification and
Development in Z and B, number 2651 in LNCS, pages 280–299. Springer, 2003.
Tools for Generating and Analyzing Attack Graphs

Oleg Sheyner1 and Jeannette Wing2


1
Carnegie Mellon University, Computer Science Department
5000 Forbes Avenue, Pittsburgh, PA 15213
oleg@cs.cmu.edu
2
Carnegie Mellon University, Computer Science Department
5000 Forbes Avenue, Pittsburgh, PA 15213
wing@cs.cmu.edu

Abstract. Attack graphs depict ways in which an adversary exploits system vul-
nerabilities to achieve a desired state. System administrators use attack graphs
to determine how vulnerable their systems are and to determine what security
measures to deploy to defend their systems. In this paper, we present details of
an example to illustrate how we specify and analyze network attack models. We
take these models as input to our attack graph tools to generate attack graphs au-
tomatically and to analyze system vulnerabilities. While we have published our
generation and analysis algorithms in earlier work, the presentation of our example
and toolkit is novel to this paper.

1 Introduction
As networks of hosts continue to grow, it becomes increasingly important to auto-
mate the process of evaluating their vulnerability to attack. When evaluating the security
of a network, it is rarely enough to consider the presence or absence of isolated vulnera-
bilities. Large networks typically contain multiple platforms and software packages and
employ several modes of connectivity. Inevitably, such networks have security holes that
escape the notice of even the most diligent system administrator.

1.1 Vulnerability Analysis and Attack Graphs


To evaluate the security of a network of hosts, a security analyst must take into account the
effects of interactions of local vulnerabilities and find global security holes introduced
by interconnection. A typical process for vulnerability analysis of a network is shown in
Figure 1. First, scanning tools determine vulnerabilities of individual hosts. Using this
local vulnerability information along with other information about the network, such as
connectivity between hosts, the analyst produces an attack graph. Each path in an attack
graph is a series of exploits, which we call actions, that leads to an undesirable state. An
example of an undesirable state is a state where the intruder has obtained administrative
access to a critical host.
A typical result of such efforts is a floor-to-ceiling, wall-to-wall “white board” attack
graph, such as the one produced by a Red Team at Sandia National Labs for DARPA’s
CC20008 Information battle space preparation experiment and shown in Figure 2.

Fig. 1. Vulnerability Analysis of a Network (host scanning tools produce per-host vulnerability information; combined with other network information, this is used to produce the attack graph)

Each box in the graph designates a single intruder action. A path from one of the boxes at the
top of the graph to one of the boxes at the bottom is a sequence of actions corresponding
to an attack scenario. At the end of any such scenario, the intruder has broken the network
security in some way. The graph is included here for illustrative purposes only, so we
omit the description of specific details.
Attack graphs can serve as a useful tool in several areas of network security, including
intrusion detection, defense, and forensic analysis. System administrators use attack
graphs for the following reasons:

– To gather information: Attack graphs can answer questions like “What attacks is
my system vulnerable to?” and “From an initial configuration, how many different
ways can an attacker reach a final state to achieve his goal?”
– To make decisions: Attack graphs can answer questions like “Which set of actions
should I prevent to ensure the attacker cannot achieve his goal?” or “Which set of
security measures should I deploy to ensure the attacker cannot achieve his goal?”

1.2 Prior Work and Contributions of this Paper


In practice, attack graphs, such as the one shown in Figure 2, are drawn by hand. In
earlier work, we show how we can use model checking techniques to generate attack
graphs automatically [11, 17]. Our techniques guarantee that attack graphs are sound
(each scenario depicted is a true attack), exhaustive (no attack is missed), and succinct
(only states and state transitions that participate in an attack are depicted) [16].
In earlier work, we also have presented algorithms for analyzing attack graphs that
answer questions such as those posed above [17, 9, 10]. For example, to help system
administrators determine how best to defend their system, we cast the decision-making
questions in terms of finding a minimum set of actions to remove (or minimum set of
measures to deploy) to ensure the attacker cannot achieve his goal. We reduce this NP-
complete problem to the Minimum Hitting Set problem [16], which can be reduced to
the Minimum Set Cover problem [2], and we then use standard textbook algorithms to
yield approximate solutions [3].
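For illustration only, the following Python sketch shows the textbook greedy Set Cover heuristic applied to this setting: each candidate action "covers" the attack scenarios in which it occurs, and we repeatedly pick the action that breaks the most remaining scenarios. The names scenarios and action_usage are ours and do not correspond to the toolkit's internal data structures.

def greedy_critical_actions(scenarios, action_usage):
    """Greedy Set Cover heuristic (sketch).

    scenarios    -- identifiers of the attack scenarios in the graph
    action_usage -- dict mapping each action to the set of scenarios it occurs in
    Returns an approximately minimal set of actions whose removal breaks
    every scenario (a critical action set).
    """
    uncovered = set(scenarios)
    critical = set()
    while uncovered:
        # pick the action that breaks the largest number of remaining scenarios
        best = max(action_usage, key=lambda a: len(action_usage[a] & uncovered))
        gained = action_usage[best] & uncovered
        if not gained:        # remaining scenarios use none of the known actions
            break
        critical.add(best)
        uncovered -= gained
    return critical

# toy usage: three scenarios, the Squid port scan appears in all of them
usage = {"squid-port-scan": {1, 2, 3},
         "licq-remote-to-user": {1, 2},
         "client-scripting": {2, 3}}
print(greedy_critical_actions({1, 2, 3}, usage))   # {'squid-port-scan'}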
Fig. 2. Sandia Red Team Attack Graph



In this paper, we present the complete details of an example. We use this example to
illustrate:
– How we specify network attack models;
– The results of our automatic attack graph generation algorithms;
– The results of our minimization analyses;
– How to use our attack graph toolkit, including how we integrated tools from external
sources with ours.
The presentation of this example and our toolkit is novel to this paper.
In Sect. 2 we give general definitions for attack models and attack graphs. Section 3
narrows the definitions specifically to the domain of network security. Section 4 illus-
trates the definitions with a small example network. Section 5 focuses on the practical
aspects of building a usable attack graph tool. We discuss several approaches to collect-
ing the data necessary to build the network model. Finally, we review related work in
Sect. 6.

2 Attack Models and Graphs


Although our primary interest is in multi-stage cyber-attacks against computer networks,
we define attack graphs abstractly with respect to models where agents attack and defend
a complex system.
Definition 1. An attack model is a finite automaton M = (S, τ, s0 ), where S is a set of
states, τ ⊆ S × S is a transition relation, and s0 ∈ S is an initial state. The state space
S represents a set of three agents I = {E, D, N}. Agent E is the attacker, agent D is
the defender, and agent N is the system under attack. Each agent i ∈ I has its own set
of possible states Si , so that S = ×i∈I Si .

Definition 2. A finite execution of an attack model M = (S, τ, s0 ) is a finite sequence of


states α = s0 s1 . . . sn, such that for all 0 ≤ i < n, (si, si+1) ∈ τ. An infinite execution
of an attack model M = (S, τ, s0 ) is an infinite sequence of states β = s0 s1 . . . sn . . .,
such that for all i ≥ 0, (si , si+1 ) ∈ τ .
With each agent i ∈ I we associate a set of actions Ai, so that the total set of actions
in the model is A = ∪i∈I Ai. The single root state s0 represents the initial state of
each agent before any action has taken place. In general, the attacker’s actions move
the system “toward” some undesirable (from the system’s point of view) state, and the
defender’s actions attempt to counteract that effect. For instance, in a computer network
the attacker’s actions would be the steps taken by the intruder to compromise the network,
and the defender’s actions would be the steps taken by the system administrator to disrupt
the attack.
The specifics of how each agent is represented in an attack model depend on the type
of the system that is under attack. In Sect. 3 we specify the agents more precisely for
network attack models. Sheyner presents a more formal definition of attack models in
his Ph.D. thesis [16].

An attack model is a general formalism suitable for modeling a variety of situations.


The system under attack can be virtually anything: a computer network under attack by
hackers, a city under siege during war, an electrical grid targeted by terrorists, etc. The
attacker is an abstract representation of a group of agents who seek to move the system to
a state that is inimical to the system’s interests. The defender is an agent whose explicit
purpose, in opposition to the attacker, is to prevent this occurrence. The system itself is
oblivious to the fact that it is under attack. It goes through its normal routine according
to its purpose and goals regardless of the actions of the active agents.
Abstractly, an attack graph is a collection of scenarios showing how a malicious agent
can compromise the integrity of a target system. With a suitable model of the system, we
can use model checking techniques to generate attack graphs automatically [11, 17, 16].
In this context, correctness properties specify the negation of the attacker’s goal: an
execution is correct with respect to the property if the attacker does not achieve his goal
for that execution. We call such properties security properties. An example of a security
property in a computer network would be a statement like “the intruder cannot get root
access on the web server.”
Definition 3. Given an attack model M , a security property P is a subset of the set
L(M ) of executions of M .

Definition 4. An execution α ∈ L(M ) is correct with respect to a security property P


iff α ∈ P. An execution α is failing with respect to P (violates P) iff α ∉ P.
We say that an attack model M satisfies a security property P if it does not have
any failing executions (that is, if L(M) ⊆ P). If, however, M does have some failing
executions, we say that the set of such executions makes up an attack graph.
Definition 5. Given an attack model M and a security property P , an attack graph of
M with respect to P is the set L(M )\P of failing executions of M with respect to P .
For the remainder of this paper, we restrict the discussion to attack graphs comprised
of finite executions only. For a more comprehensive treatment of finite and infinite failing
executions we refer the reader to Sheyner [16].
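To make Definitions 1–5 concrete, the following Python sketch (ours, purely illustrative) represents a finite attack model by an explicit transition relation and enumerates its finite failing executions with respect to a safety property given as a predicate on states; the resulting list of paths plays the role of the attack graph.

def failing_executions(states, transitions, s0, safe, max_len=10):
    """Enumerate finite executions s0 s1 ... sn that violate a safety
    property, i.e. that reach a state where safe(state) is False (sketch)."""
    succ = {s: [] for s in states}
    for (a, b) in transitions:          # transitions is the relation tau
        succ[a].append(b)

    failing = []

    def extend(path):
        s = path[-1]
        if not safe(s):                 # property violated: record the scenario
            failing.append(list(path))
            return
        if len(path) >= max_len:        # bound the search for this sketch
            return
        for t in succ[s]:
            extend(path + [t])

    extend([s0])
    return failing                      # the attack graph, as a set of paths

# toy model: the intruder escalates from 'none' to 'user' to 'root'
S = {"none", "user", "root"}
tau = {("none", "user"), ("user", "root")}
print(failing_executions(S, tau, "none", safe=lambda s: s != "root"))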

3 Network Attack Graphs


Network attack graphs represent a collection of possible penetration scenarios in a com-
puter network. Each penetration scenario is a sequence of actions taken by the intruder,
typically culminating in a particular goal—administrative access on a particular host, ac-
cess to a database, service disruption, etc. For appropriately constructed network models,
attack graphs give a bird’s-eye view of every scenario that can lead to a serious security
breach.

3.1 Network Attack Model


A network attack model is an attack model where the system N is a computer network,
the attacker E is a malicious agent trying to circumvent the network’s security, and the
defender D represents both the system administrator(s) and security software installed on
the network. A state transition in a network attack model corresponds to a single action
by the intruder, a defensive action by the system administrator, or a routine network
action.
Real networks consist of a large variety of hardware and software pieces, most of
which are not involved in cyber attacks. We have chosen six network components relevant
to constructing network attack models. The components were chosen to include enough
information to represent a wide variety of networks and attack scenarios, yet keep the
model reasonably simple and small. The following is a list of the components:

1. H, a set of hosts connected to the network


2. C, a connectivity relation expressing the network topology and inter-host reachabil-
ity
3. T, a relation expressing trust between hosts
4. I, a model of the intruder
5. A, a set of individual actions (exploits) that the intruder can use to construct attack
scenarios
6. Ids, a model of the intrusion detection system

We construct an attack model M based on these components. Table 1 defines each


agent i’s state Si and action set Ai in terms of the network components. This construction
gives the security administrator an entirely passive “detection” role, embodied in the
alarm action of the intrusion detection system. For simplicity, regular network activity
is omitted entirely.

Table 1. Network attack model

Agent i ∈ I   Si           Ai
E             I            A
D             Ids          {alarm}
N             H × C × T    ∅

It remains to make explicit the transition relation of the attack model M . Each
transition (s1 , s2 ) ∈ τ is either an action by the intruder, or an alarm action by the
system administrator. An alarm action happens whenever the intrusion detection system
is able to flag an intruder action. An action a ∈ A requires that the preconditions of
a hold in state s1 and the effects of a hold in s2 . Action preconditions and effects are
explained in Sect. 3.2.

3.2 Network Components


We now give details about each network component.
Hosts. Hosts are the main hubs of activity on a network. They run services, process
network requests, and maintain data. With rare exceptions, every action in an attack
scenario will target a host in some way. Typically, an action takes advantage of vulnerable
or misconfigured software to gain information or access privileges for the attacker.
The main goal in modeling hosts is to capture as much information as possible about
components that may contribute to creating an exploitable vulnerability.
A host h ∈ H is a tuple (id, svcs, sw, vuls), where

– id is a unique host identifier (typically, name and network address)


– svcs is a list of service name/port number pairs describing each service that is active
on the host and the port on which the service is listening
– sw is a list of other software operating on the host, including the operating system
type and version
– vuls is a list of host-specific vulnerable components. This list may include installed
software with exploitable security flaws (example: a setuid program with a buffer
overflow problem), or mis-configured environment settings (example: existing user
shell for system-only users, such as ftp)
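
For illustration, such a host tuple can be encoded directly as a small record; the concrete field values below are invented and serve only as an example.

from dataclasses import dataclass, field
from typing import Dict, List

@dataclass
class Host:
    """A host (id, svcs, sw, vuls) as described above (sketch)."""
    id: str                                              # unique identifier
    svcs: Dict[str, int] = field(default_factory=dict)   # service name -> port
    sw: List[str] = field(default_factory=list)          # other installed software
    vuls: List[str] = field(default_factory=list)        # vulnerable components

linux = Host(id="Linux 192.168.0.3",
             svcs={"squid": 80, "licq": 5190},
             sw=["Linux 2.4"],
             vuls=["vul-at"])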

Network Connectivity. Following Ritchey and Ammann [15], connectivity is expres-


sed as a ternary relation C ⊆ H × H × P , where P is a set of integer port numbers.
C(h1 , h2 , p) means that host h2 is reachable from host h1 on port p. Note that the
connectivity relation incorporates firewalls and other elements that restrict the ability of
one host to connect to another. Slightly abusing notation, we say R(h1 , h2 ) when there
is a network route from h1 to h2 .
Trust. We model trust as a binary relation T ⊆ H × H, where T (h1 , h2 ) indicates that
a user may log in from host h2 to host h1 without authentication (i.e., host h1 “trusts”
host h2 ).
Services. The set of services S is a list of unique service names, one for each service
that is present on any host on the network. We distinguish services from other software
because network services so often serve as a conduit for exploits. Furthermore, services
are tied to the connectivity relation via port numbers, and this information must be
included in the model of each host. Every service name in each host’s list of services
comes from the set S.
Intrusion Detection System. We associate a boolean variable with each action, ab-
stractly representing whether or not the IDS can detect that particular action. Actions are
classified as being either detectable or stealthy with respect to the IDS. If an action is de-
tectable, it will trigger an alarm when executed on a host or network segment monitored
by the IDS; if an action is stealthy, the IDS does not see it.
We specify the IDS as a function ids: H × H × A → {d, s, b}, where ids(h1 ,
h2 , a) = d if action a is detectable when executed with source host h1 and target host
h2 ; ids(h1 , h2 , a) = s if action a is stealthy when executed with source host h1 and target
host h2 ; and ids(h1 , h2 , a) = b if action a has both detectable and stealthy strains, and
success in detecting the action depends on which strain is used. When h1 and h2 refer to
the same host, ids(h1 , h2 , a) specifies the intrusion detection system component (if any)
located on that host. When h1 and h2 refer to different hosts, ids(h1 , h2 , a) specifies the
intrusion detection system component (if any) monitoring the network path between h1
and h2 .
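
The relations C and T and the function ids admit an equally direct encoding. The following sketch is again purely illustrative; the concrete tuples are invented and are not a description of the example network of Sect. 4.

# Connectivity C ⊆ H x H x P: (h1, h2, p) in C iff h2 is reachable from h1 on port p
C = {("Intruder", "Web", 80), ("Web", "Linux", 80), ("Web", "Linux", 5190)}

def R(h1, h2):
    """R(h1, h2): there is some network route from h1 to h2."""
    return any(a == h1 and b == h2 for (a, b, _) in C)

# Trust T ⊆ H x H: (h1, h2) in T iff h1 trusts h2 (login without authentication)
T = {("Linux", "Windows")}

# ids: H x H x A -> {'d', 's', 'b'}; unspecified entries default to stealthy here
ids_table = {("Intruder", "Linux", "LICQ-remote-to-user"): "d"}

def ids(h1, h2, action):
    return ids_table.get((h1, h2, action), "s")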

Actions. Each action is a triple (r, hs , ht ), where hs ∈ H is the host from which the
action is launched, ht ∈ H is the host targeted by the action, and r is the rule that
describes how the intruder can change the network or add to his knowledge about it. A
specification of an action rule has four components: intruder preconditions, network pre-
conditions, intruder effects, and network effects. The intruder preconditions component
places conditions on the intruder’s store of knowledge and the privilege level required
to launch the action. The network preconditions specifies conditions on target host state,
network connectivity, trust, services, and vulnerabilities that must hold before launching
the action. Finally, the intruder and network effects components list the action’s effects
on the intruder and on the network, respectively.
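
One possible encoding of such a rule, shown purely for illustration, is a pair of functions over the model state; the representation below is our own sketch and is not the toolkit's input language.

from typing import Callable, NamedTuple

class ActionRule(NamedTuple):
    """An action rule r, given by its preconditions and effects (sketch)."""
    name: str
    precondition: Callable[[dict, str, str], bool]   # (state, source, target) -> enabled?
    effect: Callable[[dict, str, str], dict]         # (state, source, target) -> next state

def applicable_actions(state, rules, hosts):
    """All action triples (r, hs, ht) whose preconditions hold in the given state."""
    return [(r, hs, ht)
            for r in rules for hs in hosts for ht in hosts
            if r.precondition(state, hs, ht)]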

Intruder. The intruder has a store of knowledge about the target network and its users.
The intruder’s store of knowledge includes host addresses, known vulnerabilities, user
passwords, information gathered with port scans, etc. Also associated with the intruder
is the function plvl: Hosts → {none, user, root}, which gives the level of privilege that
the intruder has on each host. For simplicity, we model only three privilege levels. There
is a strict total order on the privilege levels: none ≤ user ≤ root.

Omitted Complications. Although we do not model actions taken by user services for
the sake of simplicity, doing so in the future would let us ask questions about effects of
intrusions on service quality. A more complex model could include services provided
by the network to its regular users and other routine network traffic. These details would
reflect more realistically the interaction between intruder actions and regular network
activity at the expense of additional complexity.
Another activity worth modeling explicitly is administrative steps taken either to
hinder an attack in progress or to repair the damage after an attack has occurred. The
former corresponds to transitioning to states of the model that offer less opportunity for
further penetration; the latter means “undoing” some of the damage caused by successful
attacks.

4 Example Network
Figure 3 shows an example network. There are two target hosts, Windows and Linux,
on an internal company network, and a Web server on an isolated “demilitarized zone”
(DMZ) network. One firewall separates the internal network from the DMZ and another

Fig. 3. Example Network


352 O. Sheyner and J. Wing

firewall separates the DMZ from the rest of the Internet. An intrusion detection system
(IDS) watches the network traffic between the internal network and the outside world.
The Linux host on the internal network is running several services—Linux “I Seek
You” (LICQ) chat software, Squid web proxy, and a Database. The LICQ client lets
Linux users exchange text messages over the Internet. The Squid web proxy is a caching
server. It stores requested Internet objects on a system closer to the requesting site than
to the source. Web browsers can then use the local Squid cache as a proxy, reducing
access time as well as bandwidth consumption. The host inside the DMZ is running
Microsoft’s Internet Information Services (IIS) on a Windows platform.
The intruder launches his attack starting from a single computer, which lies on the
outside network. To be concrete, let us assume that his eventual goal is to disrupt the
functioning of the database. To achieve this goal, the intruder needs root access on the
database host Linux. The five actions at his disposal are summarized in Table 2.
Each of the five actions corresponds to a real-world vulnerability and has an entry in
the Common Vulnerabilities and Exposures (CVE) database. CVE [22] is a standard list
of names for vulnerabilities and other information security exposures. A CVE identifier
is an eight-digit string prefixed with the letters “CVE” (for accepted vulnerabilities) or
“CAN” (for candidate vulnerabilities).

Table 2. Intruder actions

Action                 Effect                          Example CVE ID
IIS buffer overflow    remotely get root               CAN-2002-0364
Squid port scan        port scan                       CVE-2001-1030
LICQ gain user         gain user privileges remotely   CVE-2001-0439
scripting exploit      gain user privileges remotely   CAN-2002-0193
local buffer overflow  locally get root                CVE-2002-0004

The IIS buffer overflow action exploits a buffer overflow vulnerability in the Mi-
crosoft IIS Web Server to gain administrative privileges remotely.
The Squid action lets the attacker scan network ports on machines that would other-
wise be inaccessible to him, taking advantage of a misconfigured access control list in
the Squid web proxy.
The LICQ action exploits a problem in the URL parsing function of the LICQ software
for Unix-flavor systems. An attacker can send a specially-crafted URL to the LICQ client
to execute arbitrary commands on the client’s computer, with the same access privileges
as the user of the LICQ client.
The scripting action lets the intruder gain user privileges on Windows machines.
Microsoft Internet Explorer 5.01 and 6.0 allow remote attackers to execute arbitrary
code via malformed Content-Disposition and Content-Type header fields that cause the
application for the spoofed file type to pass the file back to the operating system for
handling rather than raise an error message. This vulnerability may also be exploited
through HTML formatted email. The action requires some social engineering to entice
a user to visit a specially-formatted Web page. However, the action can work against
firewalled networks, since it requires only that internal users be able to browse the Web
through the firewall.
Finally, the local buffer overflow action can exploit a multitude of existing vulner-
abilities to let a user without administrative privileges gain them illegitimately. For the
CVE number referenced in the table, the action exploits a buffer overflow flaw in the
at program. The at program is a Linux utility for queueing shell commands for later
execution.
Some of the actions that we model have multiple instantiations in the CVE database.
For example, the local buffer overflow action exploits a common coding error that occurs
in many Linux programs. Each program vulnerable to local buffer overflow has a separate
CVE entry, and all such entries correspond to the same action rule. The table lists only
one example CVE identifier for each rule.

4.1 Example Network Components


Services, Vulnerabilities, and Connectivity. We specify the state of the network to
include services running on each host, existing vulnerabilities, and connectivity between
hosts. There are five boolean variables for each host, specifying whether any of the three
services are running and whether either of two other vulnerabilities are present on that
host:

Table 3. Variables specifying a host

variable     meaning
w3svch       IIS web service running on host h
squidh       Squid proxy running on host h
licqh        LICQ running on host h
scriptingh   HTML scripting is enabled on host h
vul-ath      at executable vulnerable to overflow on host h

The model of the target network includes connectivity information among the four
hosts. The initial value of the connectivity relation R is shown in Table 4. An
entry in the table corresponds to a pair of hosts (h1 , h2 ). IIS and Squid listen on port 80
and the LICQ client listens on port 5190, and the connectivity relation specifies which
of these services can be reached remotely from other hosts. Each entry consists of three
boolean values. The first value is ‘y’ if h1 and h2 are connected by a physical link, the
second value is ‘y’ if h1 can connect to h2 on port 80, and the third value is ‘y’ if h1 can
connect to h2 on port 5190.
We use the connectivity relation to reflect the settings of the firewall as well as the
existence of physical links. In the example, the intruder machine initially can reach only
the Web server on port 80 due to a strict security policy on the external firewall. The
internal firewall is initially used to restrict internal user activity by disallowing most
outgoing connections. An important exception is that internal users are permitted to
contact the Web server on port 80.
In this example the connectivity relation stays unchanged throughout an attack. In
general, the connectivity relation can change as a result of intruder actions. For example,
an action may enable the intruder to compromise a firewall host and relax the firewall
rules.

Table 4. Connectivity relation

Host            Intruder  IIS Web Server  Windows  Linux
Intruder        y,y,y     y,y,n           n,n,n    n,n,n
IIS Web Server  y,n,n     y,y,y           y,y,y    y,y,y
Windows         n,n,n     y,y,n           y,y,y    y,y,y
Linux           n,n,n     y,y,n           y,y,y    y,y,y

Intrusion Detection System. A single network-based intrusion detection system pro-


tects the internal network. The paths between hosts Intruder and Web and between
Windows and Linux are not monitored; the IDS can see the traffic between any other
pair of hosts. There are no host-based intrusion detection components. The IDS always
detects the LICQ action, but cannot see any of the other actions. The IDS is represented
with a two-dimensional array of bits, shown in the following table. An entry in the table
corresponds to a pair of hosts (h1 , h2 ). The value is ‘y’ if the path between h1 and h2 is
monitored by the IDS, and 'n' otherwise.

Table 5. IDS locations

Host            Intruder  IIS Web Server  Windows  Linux
Intruder        n         n               y        y
IIS Web Server  n         n               y        y
Windows         y         y               n        n
Linux           y         y               n        n

Intruder. The intruder’s store of knowledge consists of a single boolean variable ’scan’.
The variable indicates whether the intruder has successfully performed a port scan on
the target network. For simplicity, we do not keep track of specific information gathered
by the scan. It would not be difficult to do so, at the cost of increasing the size of the
state space.
Initially, the intruder has root access on his own machine Intruder, but no access
to the other hosts. The ’scan’ variable is set to false.

Actions. There are five action rules corresponding to the five actions in the intruder’s
arsenal. Throughout the description, S is used to designate the source host and T the target
host. R(S, T, p) says that host T is reachable from host S on port p. The abbreviation
plvl(X) refers to the intruder’s current privilege level on host X.
Recall that a specification of an action rule has four components: intruder precondi-
tions, network preconditions, intruder effects, and network effects. The intruder precon-
ditions component places conditions on the intruder’s store of knowledge and the privi-
lege level required to launch the action. The network preconditions component specifies
conditions on target host state, network connectivity, trust, services, and vulnerabilities
that must hold before launching the action. Finally, the intruder and network effects
components list the effects of the action on the intruder’s state and on the network,
respectively.
Sometimes the intruder has no logical reason to execute a specific action, even if all
technical preconditions for the action have been met. For instance, if the intruder’s current
privileges include root access on the Web Server, the intruder would not need to execute
the IIS buffer overflow action against the Web Server host. We have chosen to augment
each action’s preconditions with a clause that disables the action in instances when
the primary purpose of the action has been achieved by other means. This change is not
strictly conservative, as it prevents the intruder from using an action for its secondary side
effects. However, we feel that this is a reasonable price to pay for removing unnecessary
transitions from the attack graphs.

IIS Buffer Overflow. This remote-to-root action immediately gives a remote user a root
shell on the target machine.

action IIS-buffer-overflow is
intruder preconditions
plvl(S) ≥ user User-level privileges on host S
plvl(T ) < root No root-level privileges on host T
network preconditions
w3svcT Host T is running vulnerable IIS server
R(S, T, 80) Host T is reachable from S on port 80
intruder effects
plvl(T ) := root Root-level privileges on host T
network effects
¬w3svcT Host T is not running IIS
end

Squid Port Scan. The Squid port scan action uses a misconfigured Squid web proxy to
conduct a port scan of neighboring machines and report the results to the intruder.

action squid-port-scan is
intruder preconditions
plvl(S) = user User-level privileges on host S
¬scan We have not yet performed a port scan
network preconditions
squidT Host T is running vulnerable Squid proxy
R(S, T, 80) Host T is reachable from S on port 80
intruder effects
scan We have performed a port scan on the network
network effects
∅ No changes to the network component
end

LICQ Remote to User. This remote-to-user action immediately gives a remote user a
user shell on the target machine. The action rule assumes that a port scan has been
performed previously, modeling the fact that such actions typically become apparent to
the intruder only after a scan reveals the possibility of exploiting software listening on
lesser-known ports.

action LICQ-remote-to-user is
intruder preconditions
plvl(S) ≥ user User-level privileges on host S
plvl(T ) = none No user-level privileges on host T
scan We have performed a port scan on the network
network preconditions
licqT Host T is running vulnerable LICQ software
R(S, T, 5190) Host T is reachable from S on port 5190
intruder effects
plvl(T ) := user User-level privileges on host T
network effects
∅ No changes to the network component
end

Scripting Action. This remote-to-user action immediately gives a remote user a user shell
on the target machine. The action rule does not model the social engineering required to
get a user to download a specially-created Web page.

action client-scripting is
intruder preconditions
plvl(S) ≥ user User-level privileges on host S
plvl(T ) = none No user-level privileges on host T
network preconditions
scriptingT HTML scripting is enabled on host T
R(T, S, 80) Host S is reachable from T on port 80
intruder effects
plvl(T ) := user User-level privileges on host T
network effects
∅ No changes to the network component
end

Local Buffer Overflow. If the intruder has acquired a user shell on the target machine,
this action exploits a buffer overflow vulnerability on a setuid root file (in this case, the
at executable) to gain root access.

action local-setuid-buffer-overflow is
intruder preconditions
plvl(T ) = user User-level privileges on host T
network preconditions
vul-atT There is a vulnerable at executable
intruder effects
plvl(T ) := root Root-level privileges on host T
network effects
∅ No changes to the network component
end
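
For illustration, the IIS buffer overflow rule above could be rendered as a precondition and an effect function over a simple state dictionary; the state layout (keys plvl, net, w3svc, C) is an assumption made for this sketch and is not the toolkit's modelling language.

LEVELS = {"none": 0, "user": 1, "root": 2}        # order none < user < root

def iis_pre(state, s, t):
    """Intruder and network preconditions of IIS-buffer-overflow (sketch)."""
    plvl, net = state["plvl"], state["net"]
    return (LEVELS[plvl[s]] >= LEVELS["user"]     # user-level privileges on S
            and LEVELS[plvl[t]] < LEVELS["root"]  # no root-level privileges on T yet
            and net["w3svc"][t]                   # T runs the vulnerable IIS server
            and (s, t, 80) in net["C"])           # T reachable from S on port 80

def iis_effect(state, s, t):
    """Intruder and network effects: root shell on T, IIS no longer running."""
    new = {"plvl": dict(state["plvl"]),
           "net": {"w3svc": dict(state["net"]["w3svc"]), "C": state["net"]["C"]}}
    new["plvl"][t] = "root"
    new["net"]["w3svc"][t] = False
    return new

The remaining four rules can be written in the same style; together, such precondition/effect pairs would induce the intruder's part of the transition relation described in Sect. 3.1.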

4.2 Sample Attack Graphs


Figure 4 shows a screenshot of the attack graph generated with our attack graph toolkit
for the security property

G (intruder.privilege[lin] < root)

which states that the intruder will never attain root privileges on the Linux host. In
Figure 4, a sample attack scenario is highlighted with solid square nodes, with each
attack step identified by name and CVE number. Since the external firewall restricts
most network connections from the outside, the intruder has no choice with respect to
the initial step—it must be a buffer overflow action on the IIS Web server. Once the
intruder has access to the Web server machine, his options expand. The highlighted
scenario is the shortest route to success. The intruder uses the Web server machine to
launch a port scan via the vulnerable Squid proxy running on the Linux host. The scan
discovers that it is possible to obtain user privileges on the Linux host with the LICQ
exploit. After that, a simple local buffer overflow gives the intruder administrative control
over the Linux machine. The last transition in the action path is a bookkeeping step,
signifying the intruder’s success.

Fig. 4. Example Attack Graph (one scenario highlighted with square nodes: IIS buffer overflow, CAN-2002-0364; Squid portscan, CVE-2001-1030; LICQ remote-to-user, CVE-2001-0439; local buffer overflow, CVE-2002-0004)

Any information explicitly represented in the model is available for inspection and
analysis in the attack graph. For instance, with a few clicks we are able to highlight
portions of the graph “covered” by the intrusion detection system. Figure 5 shades the
nodes where the IDS alarm has been sounded. These nodes lie on paths that use the
LICQ action along a network path monitored by the IDS. It is clear that while a sub-
stantial portion of the graph is covered by the IDS, the intruder can escape detection and
still succeed by taking one of the paths on the right side of the graph. One such attack
scenario is highlighted with square nodes in Figure 5. It is very similar to the attack
scenario discussed in the previous paragraph, except that the LICQ action is launched
from the internal Windows machine, where the intrusion detection system does not
see it. To prepare for launching the LICQ action from the Windows machine, an addi-
tional step is needed to obtain user privileges in the machine. For that, the intruder uses
the client scripting exploit on the Windows host immediately after taking over the Web
machine.

Fig. 5. Alternative Attack Scenario Avoiding the IDS (shaded nodes indicate that the IDS alarm has sounded; the highlighted scenario: IIS buffer overflow, CAN-2002-0364; scripting remote-to-user, CAN-2002-0193; Squid portscan, CVE-2001-1030; LICQ remote-to-user, CVE-2001-0439; local buffer overflow, CVE-2002-0004)

4.3 Sample Attack Graph Analysis


After generating an attack graph, we can use it to analyze potential effectiveness of
various security improvements [16]. To demonstrate the analysis techniques, we expand
the example from Sect. 4.1 with an extra host User on the external network and several
new actions. An authorized user W of the internal network owns the new host and uses
it as a terminal to work remotely on the internal Windows host. The new actions permit
the intruder to take over the host User, sniff user W ’s login credentials, and log in
to the internal Windows host using the stolen credentials. We omit the details of the
new actions, as they are not essential to understanding the examples. Figure 6(a) shows
the full graph for the modified example. The graph is significantly larger, reflecting the
expanded number of choices available to the intruder.

Single Action Removal. A simple kind of analysis determines the impact of removing
one action from the intruder’s arsenal. Recall from Sect. 3 that each action is a triple
(r, hs , ht ), where hs ∈ H is the host from which the attack is launched, ht ∈ H is
the host targeted by the attack, and r is an action rule. The user specifies a set Arem of
action triples to be removed from the attack graph. The toolkit deletes the transitions
corresponding to each triple in the set Arem from the graph and then removes the nodes
that have become unreachable from the initial state.
As demonstrated in Figure 6, this procedure can be repeated several times, reducing
the size of the attack graph at each step. The full graph in Figure 6(a) has 362 states.
Removing one of two ways the intruder can sniff user W ’s login credentials produces
the graph in Figure 6(b), with 213 states. Removing one of the local buffer overflow
actions produces the graph in Figure 6(c), with 66 states. At each step, the user is able
to judge visually the impact of removing a single action from the intruder’s arsenal.
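The pruning step amounts to an ordinary reachability computation. The sketch below illustrates the idea; the graph representation and names are ours, not the toolkit's internals.

from collections import deque

def remove_actions(edges, initial, removed):
    """Single action removal (sketch).

    edges   -- set of (src, dst, action) transitions of the attack graph
    removed -- set of action triples deleted from the intruder's arsenal
    Deletes the corresponding transitions, then drops every node that is no
    longer reachable from the initial state.
    """
    kept = {(u, v, a) for (u, v, a) in edges if a not in removed}
    reachable, queue = {initial}, deque([initial])
    while queue:                                   # breadth-first search
        u = queue.popleft()
        for (x, v, _) in kept:
            if x == u and v not in reachable:
                reachable.add(v)
                queue.append(v)
    return reachable, {(u, v, a) for (u, v, a) in kept
                       if u in reachable and v in reachable}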
Critical Action Sets. Once an attack graph is generated, an approximation algorithm
can find an approximately-optimal critical set of actions that will completely disconnect
the initial state from states where the intruder has achieved his goals [16]. A related
algorithm can find an approximately-optimal set of security measures that accomplish the
same goal. With a single click, the user can invoke both of these exposure minimization
algorithms.
The effect of the critical action set algorithm on the modified example attack graph is
shown in Figure 7(a). The algorithm finds a critical action set of size 1, containing the port
scan action exploiting the Squid web proxy. The graph nodes and edges corresponding
to actions in the critical set computed by the algorithm are highlighted in the toolkit
by shading the relevant nodes. The shaded nodes are seen clearly when we zoom in to
inspect a part of the graph on a larger scale (Figure 7(b)).
Since the computed action set is always critical, removing every action triple in the
set from the intruder’s arsenal is guaranteed to result in an empty attack graph. In the
example, we might patch the Linux machine with a new version of the Squid proxy,
thereby removing every action triple that uses the Squid port scan rule on the Linux
machine from the intruder’s arsenal.

5 Attack Graph Toolkit


We have implemented a toolkit for generating and exploring attack graphs, using network
attack models defined in Sect. 3. In this section we describe the toolkit and show several
ways to integrate it with external data sources that supply information necessary to build
a network attack model. Specifically, it is necessary to know the topology of the target
network, configuration of the network hosts, and vulnerabilities present on the network.
In addition, we require access to a database of attack rules to build the transition relation
of the attack model. We could expect the user to specify all of the necessary information
manually, but such a task is tedious, error-prone, and unrealistic for networks of more
than a few nodes.
We recommend deploying the attack graph toolkit in conjunction with information-
gathering systems that supply some of the data automatically. We integrated the at-
tack graph generator with two such systems, MITRE Corp’s Outpost and Lockheed
Martin’s ANGI. We report on our experience with Outpost and ANGI in Sections 5.4
and 5.5.

Fig. 6. Reducing Action Arsenal



Fig. 7. Finding Critical Action Sets


5.1 Toolkit Architecture

Figure 8 shows the architecture of the attack graph toolkit. There are three main pieces:
a network model builder, a scenario graph generator, and a graphical user interface
(GUI). The network model builder takes as input information about network topology,
configuration data for each networked host, and a library of attack rules. It constructs
a finite model of the network suitable for automated analysis. The model is augmented
with a security specification, which spells out the security requirements against which
the attack graph is to be built. The model and the security specification then go to the
second piece, the scenario graph generator.

Fig. 8. Toolkit Architecture

The scenario graph generator takes any finite model and correctness specification
and produces a graph composed of possible executions of the model that violate the
correctness specification. The model builder constructs the input to the graph generator
so that the output will be the desired attack graph. The graphical user interface lets the
user display and examine the graph.
The model builder's running time is linear in the size of the input specification,
typically written in the XML format specified in Sect. 5.2. The algorithm in the scenario
graph generator is linear in the size of the output scenario graph [16]. The slowest part
of the toolkit is the algorithm that lays out the attack graph on screen. The algorithm
uses the network simplex method to find optimal x-coordinates. The simplex method
has exponential worst-case performance. The rest of the layout algorithm has cubic
complexity. Thus, for large graphs it is sometimes necessary to run analysis algorithms
without displaying the full graph on screen.

5.2 The Model Builder

Recall from Sect. 3 that a network attack model consists of six primary components:

1. H, a set of hosts connected to the network
2. C, a connectivity relation expressing the network topology and inter-host reachability
3. T, a relation expressing trust between hosts
4. I, a model of the intruder
5. A, a set of individual attack actions
6. Ids, a model of the intrusion detection system
To construct each of the six components, the model builder needs to collect the
following pieces of information. For the entire network, we need:
1. The set of hosts H
2. The network topology and firewall rules, which together induce the connectivity
relation C
3. The initial state of the trust relation T : which hosts are trusted by other hosts prior
to any intruder action
Several pieces of data are required for each host h in the set H:
4. A unique host identifier (usually name and network address)
5. Operating system vendor and version
6. Active network services with port numbers
7. Common Vulnerabilities and Exposures IDs of all vulnerabilities present on h
8. User-specific configuration parameters
Finally, for each CVE vulnerability present on at least one host in the set H, we need:
9. An attack rule with preconditions and effects
We designed an XML-based format covering all of the information that the model builder
requires. The XML format lets the user specify each piece of information manually or
indicate that the data can be gathered automatically from an external source. A typical
description of a host in XML is as follows:
1 <host id="typical-machine" ip="192.168.0.1">
2
3 <services>
4 <ftp port="21"/>
5 <W3SVC port="80"/>
6 </services>
7
8 <connectivity>
9 <remote id="machine1"> <ftp/> <W3SVC/> </remote>
10 <remote id="machine2"> <sshd/> <W3SVC/> </remote>
11 <remote id="machine3"> <sshd/> </remote>
12 </connectivity>
13
14 <cve>
15 <CAN-2002-0364/>
16 <CAN-2002-0147/>
17 </cve>
18
19 </host>

The example description provides the host name and network identification (line
1), a list of active services with port numbers (lines 3-6), the part of the connectivity
relation that involves the host (lines 8-12), and names of CVE and CVE-candidate (CAN)
vulnerabilities known to be present on the host (lines 14-17). Connectivity is specified
as a list of services that the host can reach on each remote machine. Lines 9-11 each
specify one remote machine; e.g., typical-machine can reach machine1 on ports
assigned to the ftp and W3SVC (IIS Web Server) services.
It is unrealistic to expect the user to collect and specify all of the data by hand. In
Sections 5.3-5.5 we discuss three external data sources that supply some of the infor-
mation automatically: the Nessus vulnerability scanner, MITRE Corp.’s Outpost, and
Lockheed Martin’s ANGI. Whenever the model builder can get a specific piece of in-
formation from one of these sources, a special tag is placed in the XML file. If Nessus,
Outpost and ANGI are all available at the same time as sources of information, the above
host description may look as follows:
<host id="typical-machine" ip="|Outpost|">

<services source="|Outpost|"/>
<connectivity source="|ANGI|"/>
<cve source="|Nessus|"/>

</host>
The model builder gets the host network address and the list of running services from
Outpost, connectivity information from ANGI, and a list of existing vulnerabilities from
Nessus. Once all of the relevant information is gathered, the model builder creates a
finite model and encodes it in the input language of the scenario graph generator. The
scenario graph generator then builds the attack graph.
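For illustration, a model builder front end could read such a host description with standard XML parsing. The sketch below uses Python's ElementTree and mirrors the element names from the first host description above; it is not the toolkit's actual parser.

import xml.etree.ElementTree as ET

def parse_host(xml_text):
    """Extract id, services, connectivity and CVE entries from a <host> element."""
    host = ET.fromstring(xml_text)
    services = {svc.tag: int(svc.get("port")) for svc in host.find("services")}
    connectivity = {remote.get("id"): [svc.tag for svc in remote]
                    for remote in host.find("connectivity")}
    cves = [entry.tag for entry in host.find("cve")]
    return {"id": host.get("id"), "ip": host.get("ip"),
            "services": services, "connectivity": connectivity, "cve": cves}

example = """<host id="typical-machine" ip="192.168.0.1">
  <services> <ftp port="21"/> <W3SVC port="80"/> </services>
  <connectivity> <remote id="machine3"> <sshd/> </remote> </connectivity>
  <cve> <CAN-2002-0364/> </cve>
</host>"""
print(parse_host(example))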

5.3 Attack Graphs with Nessus


A savvy attacker might use one of the many widely available vulnerability scanners [4] to
discover facts about the network and construct an attack scenario manually. Similarly, an
attack graph generator can use a scanner to construct such scenarios automatically. Our
attack graph toolkit works with the freeware vulnerability scanner Nessus [8] to gather
information about reachable hosts, services running on those hosts, and any known
exploitable vulnerabilities that can be detected remotely.
The scanner has no internal knowledge of the target hosts, and will usually discover
only part of the information necessary to construct a graph that includes every possible
attack scenario. Using only an external vulnerability scanner to gather information can
lead the system administrator to miss important attack scenarios.
Nevertheless, the administrator can run vulnerability scanners against his own net-
work to find out what a real attacker would discover. In the future, sophisticated intruders
are likely to use attack graph generators to help them devise attack scenarios. As a part of
network security strategy, we recommend running a vulnerability scanner in conjunction
with an attack graph generator periodically to discover avenues of attack that are most
likely to be exploited in practice.

Fig. 9. Outpost Architecture (Outpost clients report host configuration data to the Outpost server, which stores it in an SQL database read by the attack graph toolkit)

5.4 Attack Graphs with MITRE Outpost


MITRE Corporation’s Outpost is a system for collecting, organizing, and maintaining
security-related information on computer networks. It is a suite of inter-related security
applications that share a common data model and a common data collection infrastruc-
ture. The goal of Outpost is to provide a flexible and open environment for network and
system administrators to monitor, control, and protect computer systems.
At the center of the Outpost System is a data collection/probe execution engine that
gathers specific configuration information from all of the systems within a network. The
collected data is stored in a central database for analysis by the Outpost applications.
Outpost collects data about individual hosts only, so it cannot provide information about
network topology or attack rules. Since Outpost stores all of the data in a network-
accessible SQL database, we retrieve the data directly from the database, without talking
to the Outpost server, as shown in Figure 9.
Currently Outpost works with SQL databases supported by Microsoft and Oracle.
Both of these packages use a proprietary Application Programming Interface. The model
builder includes an interface to each database, as well as a generic module that uses
the Open DataBase Connectivity interface (ODBC) and works with any database that
supports ODBC. Furthermore, it is easy to add a capability to interface with other types
of databases.
An Outpost-populated database contains a list of client hosts monitored by the Out-
post server. For the model builder, the Outpost server can provide most of the required
information about each individual host h, including:
1. A unique host identifier (usually name and network address)
2. Operating system vendor and version
3. Active network services with port numbers
4. Common Vulnerabilities and Exposures IDs of all vulnerabilities present on h
5. User-specific configuration parameters (e.g., is Javascript enabled for the user’s
email client?)
Outpost’s lists of CVE vulnerabilities are usually incomplete, and it does not keep
track of some of the user-specific configuration parameters required by the attack graph
toolkit. Until these deficiencies are fixed, the user must provide the missing information
manually.
In the future, the Outpost server will inform the attack graph toolkit whenever changes
are made to the database. The tighter integration with Outpost will enable attack graph
toolkit to re-generate attack graphs automatically every time something changes in the
network configuration.

5.5 Attack Graphs with Lockheed’s ANGI


Lockheed Martin Advanced Technology Laboratory’s (ATL) Next Generation Infras-
tructure (ANGI) IR&D project is building systems that can be deployed in dynamic,
distributed, and open network environments. ANGI collects local sensor information
continuously on each network host. The sensor data is shared among the hosts, pro-
viding dynamic awareness of the network status to each host. ANGI sensors gather
information about host addresses, host programs and services, and network topology. In
addition, ANGI supports vulnerability assessment sensors for threat analysis.

Fig. 10. ANGI Network

Two distinguishing features of ANGI are the ability to discover network topology
changes dynamically and focus on technologies for pro-active, automated repair of net-
work problems. ANGI is capable of providing the attack graph model builder with
network topology information, which is not available in Outpost and is not gathered by
Nessus.
We tested our attack graph toolkit integrated with ANGI on a testbed of five hosts
with combinations of the five CVE vulnerabilities specified for the example model in
Sect. 4, and one adversary host. Figure 10 is a screenshot of the testbed
network schematic. The intruder resides on the host lindenwold. Hosts trenton
and mtholly run firewalls, which are initially disabled. We assume that the target of
the intruder is the host shamong, which contains some critical resource.
ANGI combines information about each host with data from firewall configuration
files into a single XML document. To convert firewall rules into a reachability relation C
accepted by the attack graph toolkit, ANGI uses a package developed at MITRE Corp.
that computes network reachability from packet filter data [14]. The XML file specifies
explicitly five attack rules corresponding to the CVE vulnerabilities present on the hosts.
ANGI then calls the model builder with the XML document and a security property as
inputs. The security property specifies a guarantee of protection for the critical resource
host shamong:
G(intruder.privilege[shamong] < root)

The attack graph generator finds several potential attack scenarios. Figure 11 shows
the attack graph as it is displayed by the graphical user interface. The graph consists of
19 nodes with 28 edges.

Fig. 11. ANGI Attack Graph - No Firewalls

Exploring the attack graph reveals that several successful attack scenarios exploit the
LICQ vulnerability on the host shamong. One such attack scenario is highlighted in
Figure 11. As indicated in the “Path Info” pane on the left of Figure 11, the second step
of the highlighted scenario exploits the LICQ vulnerability on shamong. This suggests
a possible strategy for reducing the size of the graph. Using the ANGI interface, we
enable the firewall on the host trenton, and add a rule that blocks all external traffic at
trenton from reaching shamong on the LICQ port. ANGI then generates a new XML
model file reflecting this change. The new graph demonstrates a significant reduction
in network exposure from this relatively small change in network configuration. The
modification reduces graph size to 7 nodes and 6 edges with only two possible paths.
(Contrast this new graph with the attack graph shown in Figure 11, which has 19 nodes,
28 edges, and 20 paths.)

Looking at the scenarios in this new graph, we discover that the attacker can still
reach shamong by first compromising the web server on cherryhill. Since we
do not want to disable the web server, we enable the firewall on mtholly and add
a rule specifically blocking cherryhill’s access to the LICQ client on shamong.
Yet another invocation of the attack graph generator on the modified model produces
an empty attack graph and confirms that we have successfully safeguarded shamong
while retaining the full functionality of the network.

6 Related Work
Many of the ideas that we propose to investigate have been suggested or considered in
existing work in the intrusion detection field. This section surveys recent related work.
Phillips and Swiler [13] propose the concept of attack graphs that is similar to the
one described here. However, they take an “attack-centric” view of the system. Since we
work with a general modeling language, we can express in our model both seemingly
benign system events (such as failure of a link) and malicious events (such as attacks).
Therefore, our attack graphs are more general than the one proposed by Phillips and
Swiler. Swiler et al. describe a tool [19] for generating attack graphs based on their
previous work. Their tool constructs the attack graph by forward exploration starting
from the initial state.
The advantage of using model checking instead of forward search is that the technique
can be expanded to include liveness properties, which can model service guarantees in
the face of malicious activity. For example, a model of a banking network could have a
liveness security property such as
G (CheckDeposited → (F CheckCleared))
which specifies that every check deposited at a bank branch must eventually clear.
Templeton and Levitt [20] propose a requires/provides model for attacks. The model
links atomic attacks into scenarios, with earlier atomic attacks supplying the prerequisites
for the later ones. Templeton and Levitt point out that relating seemingly innocuous sys-
tem behavior to known attack scenarios can help discover new atomic attacks. However,
they do not consider combining their attack scenarios into attack graphs.
Cuppens and Ortalo [6] propose a declarative language (LAMBDA) for specifying at-
tacks in terms of pre- and post-conditions. LAMBDA is a superset of the simple language
we used to model attacks in our work. The language is modular and hierarchical; higher-
level attacks can be described using lower-level attacks as components. LAMBDA also
includes intrusion detection elements. Attack specifications includes information about
the steps needed to detect the attack and the steps needed to verify that the attack has
already been carried out. Using a database of attacks specified in LAMBDA, Cuppens
and Miege [5] propose a method for alert correlation based on matching post-conditions
of some attacks with pre-conditions of other attacks that may follow. In effect, they
exploit the fact that alerts about attacks are more likely to be related if the corresponding
attacks can be a part of the same attack scenario.
Dacier [7] proposes the concept of privilege graphs. Each node in the privilege graph
represents a set of privileges owned by the user; edges represent vulnerabilities. Privi-
lege graphs are then explored to construct attack state graphs, which represent different
ways in which an intruder can reach a certain goal, such as root access on a host. He
also defines a metric, called the mean effort to failure or METF, based on the attack
state graphs. Ortalo et al. describe an experimental evaluation of a framework based on
these ideas [12]. At the surface, our notion of attack graphs seems similar to the one
proposed by Dacier. However, as is the case with Phillips and Swiler, Dacier takes an
“attack-centric” view of the world. As pointed out above, our attack graphs are more
general. From the experiments conducted by Ortalo et al. it appears that even for small
examples the space required to construct attack state graphs becomes prohibitive. By
basing our algorithm on model checking we take advantage of advances in representing
large state spaces and can thus hope to represent large attack graphs.
Ritchey and Ammann [15] also use model checking for vulnerability analysis of
networks. They use the (unmodified) model checker SMV [18]. They can obtain only one
counter-example, i.e., only one attack corresponding to an unsafe state. In contrast, we
modified the model checker NuSMV to produce attack graphs, representing all possible
attacks. We also described post-facto analyses that can be performed on these attack
graphs. These analysis techniques cannot be meaningfully performed on single attacks.
Graph-based data structures have also been used in network intrusion detection sys-
tems, such as NetSTAT [21]. There are two major components in NetSTAT, a set of
probes placed at different points in the network and an analyzer. The analyzer processes
events generated by the probes and generates alarms by consulting a network fact base
and a scenario database. The network fact base contains information (such as connec-
tivity) about the network being monitored. The scenario database has a directed graph
representation of various atomic attacks. For example, the graph corresponding to an IP
spoofing attack shows various steps that an intruder takes to mount that specific attack.
The authors state that “in the analysis process the most critical operation is the generation
of all possible instances of an attack scenario with respect to a given target network.”
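A minimal sketch of this instantiation step, with a hypothetical fact base and scenario, is given below.

```python
# Minimal sketch of instantiating a parametric attack scenario against a
# network fact base, in the spirit of the NetSTAT analyzer. The fact base
# contents and the scenario step are hypothetical.

FACT_BASE = {
    "hosts": ["ws1", "ws2", "dbserver"],
    "trusts": [("ws1", "dbserver"), ("ws2", "dbserver")],  # (client, server)
}

def instantiate_spoofing_scenarios(facts):
    """Return every concrete instance of an IP-spoofing style scenario for
    the given network: one per (trusted client, target server) pair."""
    return [
        {"impersonate": client, "target": server}
        for client, server in facts["trusts"]
    ]

if __name__ == "__main__":
    for instance in instantiate_spoofing_scenarios(FACT_BASE):
        print(instance)
    # {'impersonate': 'ws1', 'target': 'dbserver'}
    # {'impersonate': 'ws2', 'target': 'dbserver'}
```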
Ammann et al. present a scalable attack graph representation [1]. They encode attack
graphs as dependencies among exploits and security conditions, under the assumption
of monotonicity. Informally, monotonicity means that no action an intruder can take
interferes with the intruder’s ability to take any other actions. The authors treat vulnera-
bilities, intruder access privileges, and network connectivity as atomic boolean attributes.
Actions are treated as atomic transformations that, given a set of preconditions on the
attributes, establish a set of postconditions. In this model, monotonicity means that (1)
once a postcondition is satisfied, it can never become ’unsatisfied’, and (2) the negation
operator cannot be used in expressing action preconditions.
The authors show that under the monotonicity assumption it is possible to construct
an efficient (low-order polynomial) attack graph representation that scales well. They
present an efficient algorithm for extracting minimal attack scenarios from the represen-
tation, and suggest that a standard graph algorithm can produce a critical set of actions
that disconnects the goal state of the intruder from the initial state.
This approach is less general than our treatment of attack graphs. In addition to
the monotonicity requirement, it can handle only simple safety properties. Further, the
compact attack graph representation is less explicit, and therefore harder for a human to
read. The advantage of the approach is that it has a worst-case bound on the size of the
graph that is polynomial in the number of atomic attributes in the model, and therefore
can scale better than full-fledged model checking to large networks.
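The sketch below (hypothetical exploit and condition names, not the encoding of Ammann et al.) shows why monotonicity keeps the analysis small: conditions only accumulate, so each exploit needs to be applied at most once.

```python
# Minimal sketch of the monotonic, exploit-dependency view of attack graphs.
# Conditions are atomic boolean attributes that, once established, are never
# retracted; an exploit fires when all of its preconditions hold. The exploit
# definitions and condition names are hypothetical.

EXPLOITS = {
    "sshd_bof(A,B)":  {"pre":  {"conn(A,B)", "vuln_sshd(B)"},
                       "post": {"shell(A,B)"}},
    "rcp_trust(B,C)": {"pre":  {"shell(A,B)", "trust(B,C)"},
                       "post": {"shell(A,C)"}},
}

def forward_chain(initial_conditions, exploits):
    """Repeatedly apply every exploit whose preconditions are satisfied.
    Under monotonicity each exploit is applied at most once, so the analysis
    is low-order polynomial in the number of attributes and exploits; the
    collected edges are the dependency representation of the attack graph."""
    conditions = set(initial_conditions)
    applied, edges = set(), []
    changed = True
    while changed:
        changed = False
        for name, exploit in exploits.items():
            if name not in applied and exploit["pre"] <= conditions:
                applied.add(name)
                conditions |= exploit["post"]
                edges += [(c, name) for c in exploit["pre"]]
                edges += [(name, c) for c in exploit["post"]]
                changed = True
    return conditions, edges

if __name__ == "__main__":
    conds, deps = forward_chain({"conn(A,B)", "vuln_sshd(B)", "trust(B,C)"},
                                EXPLOITS)
    print("shell(A,C)" in conds)   # True: the goal condition is reachable
```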

7 Summary and Current Status


We have designed, implemented, and tested algorithms for automatically generating
attack graphs and for performing different kinds of vulnerability analyses on them. We
have built an attack graph toolkit to support our generation and analysis algorithms.
The toolkit has an easy-to-use graphical user interface. We integrated our tools with
external sources to populate our network attack model with host and vulnerability data
automatically.
We are in the process of specifying a library of actions based on a vulnerability
database provided to us by SEI/CERT. This database has over 150 actions representing
many published CVEs. We have preliminary results in using a subset of 30 of these
actions as input to our model builder, allowing us to produce attack graphs with over
300 nodes and 3000 edges in just a few minutes. Most telling is that once graphs are
that large, automated analysis, such as the kind we provide, is essential.
With our current toolkit and our growing library of actions, we are now perform-
ing systematic experiments: on different network configurations, with different subsets
of actions, and for different attacker goals. The ultimate goal is to help the system
administrator by giving him a fast and completely automatic way to test out different
system configurations (e.g., network connectivity, firewall rules, services running on
hosts) and by finding new attacks to which his system is vulnerable.

References
1. Paul Ammann, Duminda Wijesekera, and Saket Kaushik. Scalable, graph-based network
vulnerability analysis. In 9th ACM Conference on Computer and Communications Security,
pages 217–224, 2002.
2. G. Ausiello, A. D’Atri, and M. Protasi. Structure preserving reductions among convex opti-
mization problems. Journal of Computational System Sciences, 21:136–153, 1980.
3. T.H. Cormen, C.E. Leiserson, and R.L. Rivest. Introduction to Algorithms. MIT Press, 1990.
4. Cotse.net. Vulnerability Scanners. http://www.cotse.com/tools/vuln.htm.
5. Frederic Cuppens and Alexandre Miege. Alert correlation in a cooperative intrusion detection
framework. In 23rd IEEE Symposium on Security and Privacy, May 2002.
6. Frederic Cuppens and Rodolphe Ortalo. LAMBDA: A language to model a database for detection
of attacks. In Proceedings of the Third International Workshop on the Recent Advances in
Intrusion Detection (RAID), number 1907 in LNCS, pages 197–216. Springer-Verlag, 2000.
7. M. Dacier. Towards Quantitative Evaluation of Computer Security. PhD thesis, Institut
National Polytechnique de Toulouse, December 1994.
8. Renaud Deraison. Nessus Scanner. http://www.nessus.org.
9. Somesh Jha, Oleg Sheyner, and Jeannette M. Wing. Minimization and reliability analyses
of attack graphs. Technical Report CMU-CS-02-109, Carnegie Mellon University, February
2002.
10. Somesh Jha, Oleg Sheyner, and Jeannette M. Wing. Two formal analyses of attack graphs. In
Proceedings of the 15th IEEE Computer Security Foundations Workshop, pages 49–63, Nova
Scotia, Canada, June 2002.
11. Somesh Jha and Jeannette Wing. Survivability analysis of networked systems. In Proceedings
of the International Conference on Software Engineering, Toronto, Canada, May 2001.
12. R. Ortalo, Y. Deswarte, and M. Kaaniche. Experimenting with quantitative evaluation tools for
monitoring operational security. IEEE Transactions on Software Engineering, 25(5):633–650,
September/October 1999.
13. C.A. Phillips and L.P. Swiler. A graph-based system for network vulnerability analysis. In
New Security Paradigms Workshop, pages 71–79, 1998.
14. John Ramsdell. Frame propagation. MITRE Corp., 2001.
15. R.W. Ritchey and P. Ammann. Using model checking to analyze network vulnerabilities. In
Proceedings of the IEEE Symposium on Security and Privacy, pages 156–165, May 2001.
16. Oleg Sheyner. Scenario Graphs and Attack Graphs. PhD thesis, Carnegie Mellon University,
2004.
17. Oleg Sheyner, Joshua Haines, Somesh Jha, Richard Lippmann, and Jeannette Wing. Auto-
mated generation and analysis of attack graphs. In Proceedings of the IEEE Symposium on
Security and Privacy, 2002.
18. SMV. SMV: A Symbolic Model Checker. http://www.cs.cmu.edu/˜modelcheck/.
19. L.P. Swiler, C. Phillips, D. Ellis, and S. Chakerian. Computer-attack graph generation tool. In
Proceedings of the DARPA Information Survivability Conference and Exposition, June 2000.
20. Steven Templeton and Karl Levitt. A requires/provides model for computer attacks. In
Proceedings of the New Security Paradigms Workshop, Cork, Ireland, 2000.
21. G. Vigna and R.A. Kemmerer. NetSTAT: A network-based intrusion detection system. Journal
of Computer Security, 7(1), 1999.
22. Common Vulnerabilities and Exposures. http://www.cve.mitre.org.
Author Index

Benveniste, Albert 1
Bergstra, Jan A. 17
Bhargavan, Karthikeyan 197
Börger, Egon 42
Caillaud, Benoît 1
Carloni, Luca P. 1
Caspi, Paul 1
Damm, Werner 77
de Boer, Frank S. 111
Defour, Olivier 260
Diaconescu, Răzvan 134
Engels, Gregor 157
Fiadeiro, José Luiz 177
Fournet, Cédric 197
Gordon, Andrew D. 197
Gössler, Gregor 314
Groote, Jan Friso 223
Gurevich, Yuri 240
Hungar, Hardi 77
Ioustinova, Natalia 292
Jézéquel, Jean-Marc 260
Küster, Jochen M. 157
Lopes, Antónia 177
Olderog, Ernst-Rüdiger 77
Pierik, Cees 111
Plouzeau, Noël 260
Pucella, Riccardo 197
Rossman, Benjamin 240
Rutten, J.J.M.M. 276
Sangiovanni-Vincentelli, Alberto L. 1
Schulte, Wolfram 240
Sheyner, Oleg 344
Sidorova, Natalia 292
Sifakis, Joseph 314
Stärk, Robert F. 42
Steffen, Martin 292
Wehrheim, Heike 330
Willemse, Tim A.C. 223
Wing, Jeannette 344
