1 Introduction

Despite the plethora of notations available to model business processes, process modelers struggle to capture real-life processes using mainstream notations such as Business Process Model and Notation (BPMN), Event-driven Process Chains (EPC), and UML activity diagrams. All such notations require the simplifying assumption that each process model focuses on a single, explicitly defined case notion (also referred to as process instance). The discrepancy between the single case view and reality becomes evident when using process mining techniques to reconstruct processes based on the available data [2]. Process mining starts from the available data and, unless one is using a Business Process Management (BPM) or Workflow Management (WFM) system for process execution, explicit case information is typically missing. Process-centric diagrams using BPMN, EPCs, or UML describe the life-cycle of individual cases. When formal languages like Petri nets, automata, and process algebras are used to describe business processes, they tend to model cases in isolation, and the data perspective is secondary or missing completely. Languages like BPMN allow modelers to attach data to processes, but offer no way to express complex constraints over such data (e.g., cardinality constraints, is-a links, disjointness, and covering, as in ER/UML/ORM data models). Mainstream business process modeling notations describe the lifecycle of one type of process instance at a time, missing the opportunity to capture the co-evolution of multiple, interacting instances. In particular, complex constraints over data attached to processes must influence the behavior of the process itself: consider, e.g., the management of different orders, where the evolution of one order affects the possible evolutions of the related orders.

Object-Centric Behavioral Constraint (OCBC) [3, 21, 22] models have been proposed as a modeling language that combines ideas from declarative, constraint-based languages like DECLARE [1], and from data modeling languages. OCBC makes it possible to: (i) describe the temporal interaction between activities in a given process and attach (structured) data to processes in a unified framework; (ii) model the interactions between multiple process instances, specifically when there is a one-to-many or many-to-many relationship between them. Figure 1 illustrates the way in which OCBC models tackle the above two issues. \(\mathtt{Register~Email}\) and \(\mathtt{Send~Invite}\) are two activities related to object classes \(\mathtt{Person}\) and \(\mathtt{Meeting}\), respectively. A meeting is organized by many persons, each of whom can in turn organize many meetings. The double-headed arrow connecting \(\mathtt{Register~Email}\) and \(\mathtt{Send~Invite}\) expresses the constraint that an invitation for a meeting can be sent only if at least one organizer of that meeting has previously registered her e-mail. Assuming that the object targeted by each activity is indeed a case for that activity, this simple example already contains two distinct case notions (\(\mathtt{Person}\) and \(\mathtt{Meeting}\)) that are intertwined. In conventional notations, this can only be modeled from the viewpoint of one of the two instances: the registration process of a person or the invitation process for a meeting. Taking the latter viewpoint using conventional notations such as BPMN would require explicitly introducing a loop to handle the registration of one or more persons organizing a meeting. However, this is incorrect because one registration may be followed by many meetings. One-to-many and many-to-many relationships lead to convergence and divergence problems that cannot be handled in notations describing isolated cases.

Fig. 1. An OCBC constraint

OCBC models are related to artifact- and data-centric approaches [12, 16, 19] aiming to integrate data and processes. However, in those approaches the integration is not done in a single diagram representing different types of process instances and their interactions. In addition, these approaches usually assume complete knowledge over the data, and require data updates to be fully spelled out when specifying the activities [14, 26]. The few proposals dealing with artifact-centric models with incomplete knowledge [10] do not come with a fully integrated, declarative semantics as done here, but instead follow the Levesque functional approach [20] to separate the evolution of the system from the inspection of (incomplete) knowledge in each state.

Fig. 2. Example of an OCBC model

This paper provides a complete characterization of the formal semantics of the OCBC approach, unambiguously defining the logical meaning of OCBC constraints. We provide a visual and textual syntax for OCBC, and then define the semantics of the different modeling constructs in terms of temporal description logics, i.e., a temporal extension of (fragments of) the well-known OWL language. The obtained formalization, in turn, allows us to lift all reasoning services defined for constraint-based process modeling notations without data to the much more sophisticated setting of OCBC. In particular, we show how reasoning over OCBC models can be reformulated into decidable, standard reasoning tasks over the corresponding temporal description logic knowledge base, giving solid foundations to the boundaries of decidability and complexity of reasoning over processes and their manipulated data.

The paper is organized as follows. We present a running example in Sect. 2. Section 3 briefly illustrates the temporal DL that will be used to encode and reason over OCBC models. Section 4 shows the syntax for OCBC models and their semantics via the temporal DL encoding. Reasoning and verification tasks for OCBC models are tackled in Sect. 5. We present our remarks and future work in Sect. 6.

Fig. 3. Trace fragment for the OCBC model in Fig. 2

2 Running Example

The driving assumption underlying our proposal is that processes are modeled as a mirror of their manipulated data. Such data is structured according to complex data modeling constraints (see the lower part of Fig. 2). Data can be attached to activities (see the dotted lines of Fig. 2) and ad-hoc co-reference constraints can be expressed on those manipulated data (see the dash-dotted lines of Fig. 2) describing how activities can share/reuse the same data objects.

Example 1

Figure 2 shows an OCBC model for a process composed of five activities (\(\mathtt{Create Order}\), \(\mathtt{Pick Item}\), \(\mathtt{Wrap Item}\), \(\mathtt{Pay Order}\) and \(\mathtt{Deliver Items}\)) and five object classes in the data model (\(\mathtt{Order}\), \(\mathtt{Order Line}\), \(\mathtt{Delivery}\), \(\mathtt{Product}\) and \(\mathtt{Customer}\)). The top part describes the temporal ordering of activities and the bottom part how objects relevant for the process execution are structured (read the lower part as a standard UML class diagram). The middle layer (dotted lines) relates activities and data. We now informally describe the constructs highlighted in Fig. 2. There is a one-to-one correspondence between a \(\mathtt{Create Order}\) activity and an \(\mathtt{Order}\), i.e., the execution of a \(\mathtt{Create Order}\) activity creates a unique \(\mathtt{Order}\) and, vice-versa, due to the 1 on the \(\mathtt{Create Order}\) side, each \(\mathtt{Order}\) has been generated by a single execution of a \(\mathtt{Create Order}\) activity. Every execution of the \(\mathtt{Pick Item}\) activity refers to a unique \(\mathtt{Order Line}\) and each \(\mathtt{Order Line}\) has been generated by an execution of a \(\mathtt{Pick Item}\) activity (and not by a \(\mathtt{Wrap Item}\) activity). Each \(\mathtt{Create Order}\) activity is followed by exactly one (single arrow) \(\mathtt{Pay Order}\) activity related to the same order. Each \(\mathtt{Pay Order}\) activity is preceded by possibly many (double arrow) \(\mathtt{Pick Item}\) activities. Whenever we execute \(\mathtt{Pay Order}\), we will never subsequently execute \(\mathtt{Pick Item}\) on the same paid order. The dash-dotted line denotes a co-reference constraint over an object class, imposing that when \(\mathtt{Create Order}\) creates an order instance, that order instance will eventually be paid by executing a \(\mathtt{Pay Order}\) activity. A dash-dotted line can also denote a co-reference constraint over a relationship, in this case imposing that when we fill an order line it must be contained in exactly one order created by executing a \(\mathtt{Create Order}\) activity. Since an order line instance cannot exist at the time we create an order instance, and relationships are instantiated by co-existing objects, the UML model correctly specifies that, at each point in time, each order participates zero or more times in the \(\mathtt{contains}\) relation. On the other hand, the co-reference constraint together with the mandatory cardinality constraints and the temporal constraints between \(\mathtt{Create Order}\), \(\mathtt{Pay Order}\) and \(\mathtt{Pick Item}\) imply the eventual existence of at least one order line contained in any given order. The dash-dotted line starting with a \(\times \) denotes a negative co-reference constraint that forbids adding further order lines to an order that has been closed by a \(\mathtt{Pay Order}\) activity.

A possible execution of an OCBC process, called a trace fragment in the following, records at once events, with their execution time, and the objects they operate on. It also captures facts that are known to hold over such objects at a given timestamp, in particular the classes to which objects belong at that time, as well as how objects are related to each other. Moreover, the trace fragment captures, as customary in a standard first-order logic setting, incomplete knowledge about a process execution, and OCBC constraints are hence interpreted under the open-world semantics. This means that a trace fragment conforms to an OCBC model if it can be extended towards a full trace that satisfies all the constraints contained therein. A trace fragment conforming to the OCBC model of Fig. 2 is depicted in Fig. 3 and shown in the following first-order logic notation (it can also be read as a DL ABox after a small transformation). We abbreviate activity names with their initials. Instances of activities, classes and relationships are timestamped, denoting the execution time of the activity and the time point when the described fact holds (timestamps respect the time ordering starting from \(t_0\)).

$$\begin{aligned}&\mathtt{CO}(co_1,t_0), \mathtt{PI}(pi_1,t_1), \mathtt{PI}(pi_2,t_2), \mathtt{WI}(wi_1,t_3), \mathtt{WI}(wi_2,t_4), \mathtt{PI}(pi_3,t_5), \mathtt{WI}(wi_3,t_6), \mathtt{PO}(po_1,t_7), \\&\mathtt{DI}(di_1,t_8), \mathtt{DI}(di_2,t_9), \mathtt{creates}(co_1,o_1,t_0), \mathtt{fills}(pi_1,ol_1,t_1), \mathtt{contains}(o_1,ol_1,t_1), \mathtt{fills}(pi_2,ol_2,t_2), \\&\mathtt{contains}(o_1,ol_2,t_2), \mathtt{prepares}(wi_1,ol_1,t_3), \mathtt{prepares}(wi_2,ol_2,t_4), \mathtt{fills}(pi_3,ol_3,t_5), \\&\mathtt{contains}(o_1,ol_3,t_5), \mathtt{prepares}(wi_3,ol_3,t_6), \mathtt{closes}(po_1,o_1,t_7), \mathtt{refers~to}(di_1,d_1,t_8),\\&\mathtt{results~in}(ol_1,d_1,t_8), \mathtt{results~in}(ol_2,d_1,t_8), \mathtt{refers~to}(di_2,d_2,t_9), \mathtt{results~in}(ol_3,d_2,t_9), \end{aligned}$$

The process described in the example cannot be modeled using conventional process modeling languages, because (a) three different types of instances (of activities, classes and also relationships instances) are intertwined in a uniform framework so that no further coding or annotations are needed, and (b) cardinality and structural constraints in the object class model influence the allowed behavior of activities, and vice-versa. Take, e.g., the fact that in the example we have three different \(\mathtt{Order Line}\) instances (\(ol_1, ol_2, ol_3\)), then, together with the co-reference constraints on \(\mathtt{Order Line}\), we implicitly enforce the occurrence of three different \(\mathtt{Pick Item}\) and \(\mathtt{Wrap Item}\) activities.
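To make the trace fragment of Example 1 concrete, the following minimal sketch stores its facts as plain Python tuples and groups events by the object they reach. The tuple layout and the helper name `events_per_object` are assumptions made for illustration only; integer `i` stands for timestamp \(t_i\).

```python
from collections import defaultdict

# Activity occurrences from the trace fragment: (activity, event, time).
events = [
    ("CO", "co1", 0), ("PI", "pi1", 1), ("PI", "pi2", 2), ("WI", "wi1", 3),
    ("WI", "wi2", 4), ("PI", "pi3", 5), ("WI", "wi3", 6), ("PO", "po1", 7),
    ("DI", "di1", 8), ("DI", "di2", 9),
]

# Relationship instances: (relationship, source, target, time).
links = [
    ("creates", "co1", "o1", 0), ("fills", "pi1", "ol1", 1),
    ("contains", "o1", "ol1", 1), ("fills", "pi2", "ol2", 2),
    ("contains", "o1", "ol2", 2), ("prepares", "wi1", "ol1", 3),
    ("prepares", "wi2", "ol2", 4), ("fills", "pi3", "ol3", 5),
    ("contains", "o1", "ol3", 5), ("prepares", "wi3", "ol3", 6),
    ("closes", "po1", "o1", 7), ("refers to", "di1", "d1", 8),
    ("results in", "ol1", "d1", 8), ("results in", "ol2", "d1", 8),
    ("refers to", "di2", "d2", 9), ("results in", "ol3", "d2", 9),
]

def events_per_object(relationship):
    """Group the sources that reach each target through the given relationship."""
    grouped = defaultdict(set)
    for name, source, target, _time in links:
        if name == relationship:
            grouped[target].add(source)
    return dict(grouped)

pick_events = events_per_object("fills")      # Pick Item events per order line
wrap_events = events_per_object("prepares")   # Wrap Item events per order line
known_events = {event for _activity, event, _time in events}
```

In this fragment each of the three order lines is reached by its own \(\mathtt{Pick Item}\) event and its own \(\mathtt{Wrap Item}\) event, which matches the observation that three order lines enforce three occurrences of each activity.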

3 A Gentle Introduction to Temporal DLs

Since description logics (DLs) are able to capture data models [4, 11, 17] and are the logical formalism underpinning ontologies expressed in the standard Web Ontology Language OWL (www.w3.org/2007/OWL), while linear temporal logic (LTL) is able to formalize the temporal interweaving of the activities in a process [1], we propose here to use temporal description logics based on \(T_{\mathcal{US}}\mathcal{ALCQI}\) and its fragments [8, 18, 27] to formally describe the semantics of OCBC models and to capture in a uniform formalism both the processes and their attached data.

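\(T_{\mathcal{US}}\mathcal{ALCQI}\) is one of the most expressive and still decidable temporal description logics. The language alphabet contains object names \(a_0, a_1, \ldots \), concept names \(A_0, A_1, \dots \) and role names \(P_0, P_1, \dots \). Then, roles R and concepts C are given by the following grammar:

$$\begin{aligned} R \;&{:}{:}{=}\; P_i \mid P_i^-, \\ C \;&{:}{:}{=}\; \top \mid A_i \mid \lnot C \mid C_1 \sqcap C_2 \mid (\mathop {\ge q} R~C) \mid C_1 \mathbin {\mathcal {S}} C_2 \mid C_1 \mathbin {\mathcal {U}} C_2, \end{aligned}$$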

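where \(R^-\) denotes the inverse of the role R (obtained by reversing the relation R) and q is a positive integer. We use the standard abbreviations: \( C_1 \sqcup C_2 = \lnot (\lnot C_1 \sqcap \lnot C_2)\), \(\bot = \lnot \top \), \(\exists R = (\mathop {\ge 1} R~\top )\), \(\exists R\mathpunct {\text{. }}C = (\mathop {\ge 1} R~C)\), \((\mathop {\le q} R~C) = \lnot (\mathop {\ge (q + 1)} R~C)\). Furthermore, all the temporal operators used in LTL can be expressed via \(\mathbin {\mathcal {S}}\) ‘since’ and \(\mathbin {\mathcal {U}}\) ‘until’ [18]. Operators \(\Diamond _{\!\scriptscriptstyle F}\) and \(\Diamond _{\!\scriptscriptstyle P}\) (‘sometime in the future/past’) can be expressed as \(\Diamond _{\!\scriptscriptstyle F}C = \top \mathbin {\mathcal {U}} C\) and \(\Diamond _{\!\scriptscriptstyle P}C = \top \mathbin {\mathcal {S}} C\); operators \(\Box _{\!\scriptscriptstyle F}\) (‘always in the future’) and \(\Box _{\!\scriptscriptstyle P}\) (‘always in the past’) are defined as dual to \(\Diamond _{\!\scriptscriptstyle F}\) and \(\Diamond _{\!\scriptscriptstyle P}\), i.e., \(\Box _{\!\scriptscriptstyle F}C = \lnot \Diamond _{\!\scriptscriptstyle F}\lnot C\) and \(\Box _{\!\scriptscriptstyle P}C = \lnot \Diamond _{\!\scriptscriptstyle P}\lnot C\). The non-strict operators (including the current evaluation time), denoted as \(\Diamond _{\!\scriptscriptstyle P}^+\) and \(\Diamond _{\!\scriptscriptstyle F}^+\), can be captured as \(\Diamond _{\!\scriptscriptstyle P}^+ C = C \sqcup \Diamond _{\!\scriptscriptstyle P}C\) and \(\Diamond _{\!\scriptscriptstyle F}^+ C = C \sqcup \Diamond _{\!\scriptscriptstyle F}C\) (similarly, \(\Box _{\!\scriptscriptstyle P}^+\) and \(\Box _{\!\scriptscriptstyle F}^+\) are defined as the dual operators of \(\Diamond _{\!\scriptscriptstyle P}^+\) and \(\Diamond _{\!\scriptscriptstyle F}^+\), respectively). The ‘always’ operator can be expressed as \(\Box ^{*} C = \Box _{\!\scriptscriptstyle F}\Box _{\!\scriptscriptstyle P}C\), while the dual ‘sometime’ is defined as \(\Diamond ^{*} C = \lnot \Box ^{*}\lnot C\). Finally, the temporal operators \(\bigcirc _{\!\scriptscriptstyle F}\) (‘next time’) and \(\bigcirc _{\!\scriptscriptstyle P}\) (‘previous time’) can be defined as \(\bigcirc _{\!\scriptscriptstyle F}C = \bot \mathbin {\mathcal {U}} C\) and \(\bigcirc _{\!\scriptscriptstyle P}C = \bot \mathbin {\mathcal {S}} C\).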

A TBox \(\mathcal {T}\) is a finite set of concept and role inclusion axioms of the form \(C_1 \sqsubseteq C_2\) and \(R_1 \sqsubseteq R_2\), respectively. An ABox, \(\mathcal {A}\), consists of assertions of the form where \(A_k\) is a concept name, \(P_k\) a role name, \(a_i\), \(a_j\) object names and, for \(n \in \mathbb {Z}\),

Taken together, the TBox \(\mathcal {T}\) and ABox \(\mathcal {A}\) form the knowledge base (KB) \(\mathcal {K}=(\mathcal {T},\mathcal {A})\). In this paper, OCBC models will be encoded using TBoxes (see Sect. 4.4), while single process executions (i.e., trace fragments as shown in Example 1) are encoded as ABoxes (e.g., \(\mathtt{CO}(co_1,t_0)\) is encoded as ).

A temporal interpretation is a structure of the form \(\mathcal {I}= ((\mathbb {Z},<),\varDelta ^\mathcal {I},\{\cdot ^{\mathcal {I}}\mid n\in \mathbb {Z}\})\), where \((\mathbb {Z},<) \) is the linear model of time, \(\varDelta ^\mathcal {I}\) is a non-empty interpretation domain and \(\mathcal {I}(n)\) gives a standard DL interpretation for each time instant \(n \in \mathbb {Z}\): \( \mathcal {I}(n) =\bigl (\varDelta ^\mathcal {I}, a_0^{\mathcal {I}(n)}, A_0^{\mathcal {I}(n)}, \dots ,P_0^{\mathcal {I}(n)},\dots \bigr ), \) assigning to each concept name \(A_i\) a unary predicate \(A_i^{\mathcal {I}(n)}\subseteq \varDelta ^\mathcal {I}\) and to each role name \(P_i\) a binary relation \(P_i^{\mathcal {I}(n)}\subseteq \varDelta ^\mathcal {I}\times \varDelta ^\mathcal {I}\). We assume that the domain \(\varDelta ^\mathcal {I}\) and the interpretations \(a_i^\mathcal {I}\in \varDelta ^\mathcal {I}\) of object names are the same for all \(n\in \mathbb {Z}\), i.e., we adopt the constant domain assumption and rigid designators (consult [18] for more details on these assumptions). At each time instant \(n \in \mathbb {Z}\), role and concept constructs are interpreted as follows

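where \(\sharp X\) denotes the cardinality of X. Thus, for example, \(x \in (C_1 \mathbin {\mathcal {U}} C_2)^{\mathcal {I}(n)}\) iff there is a moment \(k>n\) such that \(x \in C_2^{\mathcal {I}(k)}\) and \(x \in C_1^{\mathcal {I}(m)}\), for all moments m between n and k. Note that the operators \(\mathbin {\mathcal {S}}\) and \(\mathbin {\mathcal {U}}\) are ‘strict’ in the sense that their semantics does not include the current moment of time.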
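The strict reading of ‘until’ described above can be illustrated with a small evaluator. This is only a sketch under a strong assumption: time is cut off at a finite `horizon`, whereas the semantics in the paper ranges over all of \(\mathbb{Z}\). The function names and the toy concepts are hypothetical.

```python
def strict_until(c1, c2, n, horizon):
    """Objects satisfying 'c1 until c2' at time n under the strict reading.

    c1 and c2 map a time point to the set of objects in the concept at that
    time. Only witnesses k with n < k <= horizon are inspected (sketch only).
    """
    result = set()
    for k in range(n + 1, horizon + 1):
        candidates = set(c2(k))              # c2 must hold at the witness k
        for m in range(n + 1, k):            # c1 must hold strictly between n and k
            candidates &= set(c1(m))
        result |= candidates
    return result

# Toy interpretation over the domain {x, y}.
DOMAIN = {"x", "y"}
C1 = {1: {"x", "y"}, 2: {"x"}}
C2 = {3: {"x", "y"}}

def c1(t):
    return C1.get(t, set())

def c2(t):
    return C2.get(t, set())

def top(t):
    return DOMAIN

until_at_0 = strict_until(c1, c2, 0, horizon=3)      # only x keeps c1 at times 1 and 2
sometime_at_0 = strict_until(top, c2, 0, horizon=3)  # 'sometime in the future' as top-until
until_at_3 = strict_until(c1, c2, 3, horizon=3)      # c2 at time 3 itself does not count
```

Evaluating at time 3 returns the empty set even though both objects are in the second concept at time 3, which is exactly the strictness remarked on above.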

Concept and role inclusion axioms (TBox) are interpreted in \(\mathcal {I}\) globally:

ABox assertions are interpreted relatively to the initial moment, 0:

We call \(\mathcal {I}\) a model of a KB \(\mathcal {K}= (\mathcal {T},\mathcal {A})\) and write \(\mathcal {I}\models \mathcal {K}\) if \(\mathcal {I}\) satisfies all inclusions in \(\mathcal {T}\) and all assertions in \(\mathcal {A}\). A KB \(\mathcal {K}\) is satisfiable if it has a model. A concept C (role R) is satisfiable with respect to \(\mathcal {K}\) if there are a model \(\mathcal {I}\) of \(\mathcal {K}\) and \(n\in \mathbb {Z}\) such that \(C^{\mathcal {I}(n)}\ne \emptyset \) (respectively, \(R^{\mathcal {I}(n)}\ne \emptyset )\). It is readily seen that the concept and role satisfiability problems are equivalent to KB satisfiability.

Reasoning in \(T_{\mathcal{US}}\mathcal{ALCQI}\) w.r.t. a KB has been proven to be ExpTime-complete [18, 27]. To achieve better complexity results, fragments of \(\mathcal {ALCQI}\) must be considered. Good results have been obtained when temporalizing DL-Lite logics [6, 13]; see, e.g., the temporal DL-Lite called \(T_{\mathcal{US}}\textit{DL-Lite}^{\mathcal{N}}_{\textit{bool}}\), where reasoning has the same complexity as LTL reasoning, i.e., PSpace-complete [8].

4 The OCBC Model

We now present the syntax and graphical appearance of OCBC models, together with their formal semantics. The original contribution of the OCBC model lies in the way activities and data are related. In particular, an OCBC model captures, at once: (i) Data dependencies, represented using standard data modeling constructs, i.e., classes, relationships and constraints between them; (ii) Activities, accounting for units of work within a process; (iii) Mutual relationships between activities and classes, linking the execution of activities in a given process with the data objects they manipulate; (iv) Temporal constraints between activities; (v) Co-reference constraints that enforce the application of temporal constraints, and in particular limit their application to those activities that indirectly co-refer thanks to the objects and relationships they point to.

4.1 The Data Model – ClaM

Data used by the activities of an OCBC model is structured according to a standard modeling language, i.e., ER/UML/ORM. While \(\mathcal {ALCQI}\) is able to fully capture the semantics of such data models (see [4, 11, 17] and references therein) in the following, just for the sake of simplicity and lack of space, we present only a subset of the complete set of modeling constructs allowed in those standard data modeling languages and denote such set of modeling constructs as the ClaM data model (which stands for CLAss data Model). In particular, the following syntax limits ClaM to capture object classes that can be organized along \(\mathsf {ISA}\) hierarchies (with possibly disjoint sub-classes and covering constraints), binary relationships between object classes and cardinalities expressing participation constraints of object classes in relationships.

Definition 1 (ClaM Syntax)

A conceptual schema \(\varSigma \) in the Class Model, ClaM, is a tuple with the following components:

  • a universe of object classes, denoted as \(O_1,O_2,\ldots \);

  • a universe of binary relationships among object classes, denoted as \(R_1,R_2,\ldots \);

  • \(\tau \), a total function associating a signature to each binary relationship. If \(\tau (R)=(O_1,O_2)\) then \(O_1\) is the domain and \(O_2\) the range of the relationship;

  • \(\#_{\textit{dom}}\), a partial function defining cardinality constraints on the domain of a relationship. \(\#_{\textit{dom}}(R,O)\) is defined only if \(\tau (R) = (O,O_1)\);

  • \(\#_{\textit{ran}}\), a partial function defining cardinality constraints on the range of a relationship. \(\#_{\textit{ran}}(R,O)\) is defined only if \(\tau (R) = (O_1,O)\);

  • \(\mathsf {ISA}\), a binary relation defining the super-class and sub-class hierarchy on object classes. If \(\mathsf {ISA}(C_1,C_2)\) then \(C_1\) is said to be a sub-class of \(C_2\) while \(C_2\) is said to be a super-class of \(C_1\);

  • \(\mathbin {\text{ disj }}\), a binary relation defining the set of disjoint sub-classes in an \(\mathsf {ISA}\) hierarchy;

  • \(\mathbin {\text{ cov }}\), a binary relation defining the set of sub-classes covering the super-class in an \(\mathsf {ISA}\) hierarchy.

As for the full-fledged syntax of ER/UML/ORM, their formal set-theoretic semantics, and their translation as \(\mathcal {ALCQI}\) KBs we refer to [4, 11, 17]. Concerning the semantics of the ClaM constructs, cardinality constraints are interpreted as the number of times each instance of the involved class participates in the given relationship, \(\mathsf {ISA}\) is interpreted as sub-setting, \(\mathbin {\text{ disj }}\) and \(\mathbin {\text{ cov }}\) are interpreted in the obvious way using disjointness/union between classes, relationships are interpreted as binary predicates, while the relationship signature acts as a typing for its arguments.
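The reading of signatures, cardinalities and \(\mathsf{ISA}\) given above can be sketched as a check over a single snapshot of the data. The class `ClaMSchema`, the function `check_snapshot` and the example cardinalities are assumptions made for illustration; disjointness and covering are omitted, and the check is closed-world, unlike the open-world semantics used for OCBC.

```python
from dataclasses import dataclass, field

@dataclass
class ClaMSchema:
    """Hypothetical container for a fragment of a ClaM schema (no disj/cov)."""
    signature: dict                                # relationship -> (domain class, range class)
    card_dom: dict = field(default_factory=dict)   # (relationship, class) -> (min, max); max None = unbounded
    card_ran: dict = field(default_factory=dict)   # same shape, for the range side
    isa: set = field(default_factory=set)          # pairs (sub-class, super-class)

def _out_of_bounds(count, bounds):
    low, high = bounds
    return count < low or (high is not None and count > high)

def check_snapshot(schema, members, tuples):
    """List violations in one snapshot.

    members: class -> set of objects; tuples: relationship -> set of (x, y).
    """
    violations = []
    # The signature acts as a typing for the two arguments of each relationship.
    for rel, (dom, ran) in schema.signature.items():
        for x, y in sorted(tuples.get(rel, set())):
            if x not in members.get(dom, set()) or y not in members.get(ran, set()):
                violations.append(("typing", rel, (x, y)))
    # Cardinalities count how often each instance participates in the relationship.
    for (rel, cls), bounds in schema.card_dom.items():
        for obj in sorted(members.get(cls, set())):
            count = sum(1 for x, _y in tuples.get(rel, set()) if x == obj)
            if _out_of_bounds(count, bounds):
                violations.append(("card_dom", rel, obj))
    for (rel, cls), bounds in schema.card_ran.items():
        for obj in sorted(members.get(cls, set())):
            count = sum(1 for _x, y in tuples.get(rel, set()) if y == obj)
            if _out_of_bounds(count, bounds):
                violations.append(("card_ran", rel, obj))
    # ISA is interpreted as sub-setting.
    for sub, sup in sorted(schema.isa):
        for obj in sorted(members.get(sub, set()) - members.get(sup, set())):
            violations.append(("isa", sub, obj))
    return violations

# Assumed cardinalities: an order contains zero or more order lines,
# and every order line is contained in exactly one order.
schema = ClaMSchema(
    signature={"contains": ("Order", "Order Line")},
    card_dom={("contains", "Order"): (0, None)},
    card_ran={("contains", "Order Line"): (1, 1)},
)
members = {"Order": {"o1"}, "Order Line": {"ol1", "ol2"}}
complete = check_snapshot(schema, members, {"contains": {("o1", "ol1"), ("o1", "ol2")}})
partial = check_snapshot(schema, members, {"contains": {("o1", "ol1")}})
```

Here `partial` reports that `ol2` is not contained in any order, while `complete` reports nothing.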

Example 2

The lower part of the OCBC model shown in Fig. 2 captures the data model as a ClaM diagram with:

Cardinalities are shown in the diagram following the UML reading.

Fig. 4. Types of temporal constraints between activities and their intuitive semantics

4.2 Temporal Constraints over Activities

Taking inspiration from the DECLARE patterns [1], we present here the temporal constraints between (pairs of) activities that can be expressed in OCBC. Figure 4 graphically renders such constraints together with their intuitive meaning. In the following we present their syntax.

Definition 2 (Temporal constraints)

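Let

  • the universe of activities be given, with activities denoted by capital letters \(A_1, A_2,\ldots \);

  • the universe of temporal constraints be the set {\(\mathtt {response}\), \(\mathtt {unary\text {-}response}\), \(\mathtt {precedence}\), \(\mathtt {unary\text {-}precedence}\), \(\mathtt {responded\text {-}existence}\), \(\mathtt {non\text {-}response}\), \(\mathtt {non\text {-}precedence}\), \(\mathtt {non\text {-}coexistence}\)}, where each temporal constraint is a binary relation over activities.

The set of temporal constraints in a given OCBC model is denoted as \(\varSigma _{\textit{TC}}\) and is conceived as a set of elements of the form \(\textit{tc}(A_1,A_2)\), where \(\textit{tc}\) is a temporal constraint and \(A_1\), \(A_2\) are activities.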

Remark 1

We observe that the non-precedence constraint is syntactic sugar, as it can be emulated using non-response: \( \mathtt{non\text {-}precedence}(A,B) \equiv \mathtt{non\text {-}response}(B,A).\) Thus, in the following we will not consider it anymore. When defining later on the OCBC model we will consider the set \(\varSigma ^+_{\textit{TC}}\) of positive constraints containing \(\mathtt {response}\), \(\mathtt {unary\text {-}response}\), \(\mathtt {precedence}\), \(\mathtt {unary\text {-}precedence}\), and \(\mathtt {responded\text {-}existence}\), and the set \(\varSigma ^-_{\textit{TC}}\) of negative constraints containing \(\mathtt {non\text {-}response}\) and \(\mathtt {non\text {-}coexistence}\).
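The equivalence in Remark 1 can be checked on small examples. The sketch below evaluates some of the constraints over a plain finite sequence of activity names, ignoring objects and co-reference entirely; the argument order (the first activity is the one each occurrence of which is constrained) is an assumption inferred from how the paper writes \(\mathtt{unary\text{-}precedence}(\mathtt{Wrap Item}, \mathtt{Pick Item})\), and all function names are hypothetical.

```python
from itertools import product

def positions(trace, activity):
    """Indexes at which the given activity occurs in the trace."""
    return [i for i, name in enumerate(trace) if name == activity]

def count_after(trace, i, b):
    """Number of occurrences of b strictly after position i."""
    return sum(1 for j in positions(trace, b) if j > i)

def count_before(trace, i, b):
    """Number of occurrences of b strictly before position i."""
    return sum(1 for j in positions(trace, b) if j < i)

def response(trace, a, b):
    return all(count_after(trace, i, b) >= 1 for i in positions(trace, a))

def unary_response(trace, a, b):
    return all(count_after(trace, i, b) == 1 for i in positions(trace, a))

def precedence(trace, a, b):
    return all(count_before(trace, i, b) >= 1 for i in positions(trace, a))

def non_response(trace, a, b):
    return all(count_after(trace, i, b) == 0 for i in positions(trace, a))

def non_precedence(trace, a, b):
    return all(count_before(trace, i, b) == 0 for i in positions(trace, a))

def remark_1_holds(max_length=4, alphabet="ABC"):
    """Compare non-precedence(A,B) with non-response(B,A) on all short traces."""
    return all(
        non_precedence(t, "A", "B") == non_response(t, "B", "A")
        for n in range(max_length + 1)
        for t in product(alphabet, repeat=n)
    )

# Activity sequence of the trace fragment in Example 1, with objects dropped.
example = ["CO", "PI", "PI", "WI", "WI", "PI", "WI", "PO", "DI", "DI"]
```

On `example`, `unary_response(example, "PI", "WI")` is false because the first \(\mathtt{Pick Item}\) is followed by three \(\mathtt{Wrap Item}\) occurrences; this is the kind of over-counting that co-reference constraints avoid by restricting attention to activities sharing the same object.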

4.3 Syntax of OCBC Models

We are now ready to define the OCBC model starting from data models and temporal constraints as respectively defined in Sects. 4.1 and 4.2.

Definition 3 (OCBC syntax)

An OCBC model, \(\mathcal {M}\), is a tuple:

  • ClaM is a data model as in Definition 1, and \(\varSigma _{\textit{TC}}\) a set of temporal constraints as in Definition 2;

  • the universe of activities (as in Definition 2);

  • the universe of activity-object relationships, being a set of binary relationships;

  • \(\tau _{R_{AC}}\), a total function associating a signature to each activity-object relationship. If \(\tau _{R_{AC}}(R)=(A,O)\) then \(A\) is an activity and \(O\) is an object class;

  • \(\#_{\textit{act}}\), a partial function defining cardinality constraints on the participation of activities in activity-object relationships. \(\#_{\textit{act}}(R,A)\) is defined only if \(\tau _{R_{AC}}(R) = (A,O)\);

  • \(\#_{\textit{obj}}\), a partial function denoting the activity that generated a given object in O. \(\#_{\textit{obj}}(R,O)\) is defined only if \(\tau _{R_{AC}}(R) = (A,O)\);

  • cref is the partial function of co-reference constraints s.t.

  • neg-cref is the partial function of negative co-reference constraints s.t.

Inverses of activity-object relationships are assumed to be functional, capturing the intuition that, at any given point in time, an object can be manipulated through a given relationship by at most one occurrence of an activity. To clarify the syntax of the OCBC modeling language we illustrate the scenario provided in Example 1.

Example 3

We consider the OCBC model in Fig. 2 where the activities are depicted in the upper part of the figure while the lower part shows the ClaM data model for the data manipulated by the activities of the process. The set of activity-object relationships is \(\{\mathtt{creates}, \mathtt{fills}, \mathtt{prepares}, \mathtt{closes}, \mathtt{refers~to}\}\), each connecting an activity with the objects manipulated as an effect of executing the activity itself. For example, the activity \(\mathtt{Create Order}\) \(\mathtt{creates}\) an instance of the object class \(\mathtt{Order}\) when it is executed. Cardinality constraints can be added to activity-object relationships to specify participation constraints either on the activity side or on the object class side. For example, each execution of \(\mathtt{Pick Item}\) \(\mathtt{fills}\) one and only one \(\mathtt{Order Line}\), i.e., \(\#_{\textit{act}}(\mathtt{fills}, \mathtt{Pick Item}) = (1,1)\). On the other hand, any \(\mathtt{Order Line}\) must necessarily be filled by executing a \(\mathtt{Pick Item}\) activity, i.e., \(\#_{\textit{obj}}(\mathtt{fills}, \mathtt{Order Line}) = 1\). The co-reference constraints involving object classes specify constraints on how objects connected to different activities can be shared. For example, the \(\mathtt{Order Line}\) instance filled by a \(\mathtt{Pick Item}\) is the same as the one prepared by the corresponding \(\mathtt{Wrap Item}\). These co-reference constraints can be expressed using the following OCBC syntax:

$$ \begin{array}{rcl} \textit{cref}\bigl (\mathtt {unary\text {-}response}(\mathtt{Pick Item, Wrap Item}),\mathtt{fills, prepares}\bigr ) &{}=&{} \mathtt{Order Line},\\ \textit{cref}\bigl (\mathtt {unary\text {-}precedence}(\mathtt{Wrap Item, Pick Item}), \mathtt{prepares, fills}\bigr ) &{}=&{} \mathtt{Order Line}. \end{array} $$

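The co-reference constraint over the \(\mathtt{contains}\) relationship and the negative co-reference constraint (the dash-dotted line starting with a \(\times \)), both described in Example 1, are expressed as, respectively: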

$$ \begin{array}{rcl} \textit{cref}(\mathtt {unary\text {-}precedence}(\mathtt{Pick Item, Create Order}),\mathtt{fills, creates}) &{}\!=\!&{} \mathtt{contains};\\ \textit{neg-cref}(\mathtt {non\text {-}response}(\mathtt{Pay Order, Pick Item}),\mathtt{closes, fills}) &{}\!=\!&{} \mathtt{contains}. \end{array} $$

4.4 Semantics of OCBC Models

We now focus on the semantics of OCBC models. As pointed out in Sect. 2, OCBC models are interpreted using traces that capture the occurrence of events, the relationships between events and objects, and the evolution of objects and relationships over time. Here, we base the OCBC semantics on infinite traces (cf. Sect. 6 for a remark on finite traces). The information recorded in an actual execution trace is interpreted under incomplete knowledge, i.e., as a trace fragment containing explicit factual knowledge that is known to certainly hold but, in general, only partially capturing what actually occurred. Thus, the notion of trace as used in event log formats such as the XES IEEE standard has to be interpreted, in our setting, as a trace fragment.

Our effort is to reconcile the process flow semantics with the data model semantics. We thus resort to a knowledge base expressed in the temporal DL \(T_{\mathcal{US}}\mathcal{ALCQI}\). In particular, we map both activities and object classes to concepts, while activity-object relationships and relationships of the data model are mapped to roles. Such an encoding of OCBC models using KBs in the temporal DL \(T_{\mathcal{US}}\mathcal{ALCQI}\) interprets constraints of an OCBC model over infinite traces, while the ABox, which encodes the explicit factual knowledge, i.e., the trace fragment at hand, is interpreted as a finite portion of such infinite traces. Here we detail the encoding.

Concerning the semantics of the ClaM data model, we interpret it via a mapping to \(\mathcal {ALCQI}\) as already discussed in Sect. 4.1. Furthermore, we can add to the data model temporal constraints captured in \(T_{\mathcal{US}}\mathcal{ALCQI}\) as shown in [5, 7].

As for activity-object relationships, let \(R\) be an activity-object relationship such that \(\tau _{R_{AC}}(R)=(A,O)\). The following axioms capture inverse functionality, and domain and range restrictions for R:

$$\begin{aligned} (\ge ~2~R^-~\top ) \sqsubseteq \bot ,\qquad \exists R\sqsubseteq A,\quad \exists R^-\sqsubseteq O. \end{aligned}$$
(1)

A cardinality constraint of the form \(\#_{\textit{obj}}(R,O) = 1\), denoting the activity that generated an object of class O, is captured as:

$$\begin{aligned} O\sqsubseteq \Diamond _{\!\scriptscriptstyle P}^+(O\sqcap \exists R^-). \end{aligned}$$

Cardinality constraints for the participation of activities in activity-object relationships (\(\#_{\textit{act}}\)) are instead captured as classical cardinalities in data models (see [5, 7, 11]).
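As an illustration of axiom (1) and of the \(\#_{\textit{obj}}\) axiom, the sketch below checks their intuitive reading on explicitly recorded links. It is a closed-world check over a finite fragment, so it only approximates the open-world semantics used in the paper; the tuple layout and function names are assumptions. Membership in \(O\) is not checked separately because the range restriction in axiom (1) already places every target of \(R\) in \(O\).

```python
def inverse_functional(links, rel):
    """At each time point, every object is the target of at most one rel-link."""
    seen = {}
    for name, event, obj, time in links:
        if name != rel:
            continue
        # Remember the first event seen for (obj, time); any other event is a clash.
        owner = seen.setdefault((obj, time), event)
        if owner != event:
            return False
    return True

def generated_by(links, rel, obj, time):
    """True if obj is recorded as a target of rel at `time` or at an earlier time."""
    return any(
        name == rel and target == obj and t <= time
        for name, _event, target, t in links
    )

# (relationship, event, object, time), taken from the trace fragment of Example 1.
fills = [("fills", "pi1", "ol1", 1), ("fills", "pi2", "ol2", 2)]
clash = fills + [("fills", "pi9", "ol1", 1)]   # hypothetical second filler of ol1 at time 1
```

With these facts, `ol1` counts as generated at time 3, while `ol2` is not yet generated at time 1 in the recorded fragment.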

Fig. 5. Co-reference (response) constraints over (a) object classes and (b) relationships, with their negated versions (c-d)

Semantics of Co-reference Constraints. Having fixed the semantics for the ClaM data model and the one for the activity-object relationships, we are left with the trickiest aspect of OCBC, namely the semantics of co-reference constraints. In the following, we consider the different kinds of co-reference constraints which, according to Definition 3, can be either positive or negative, and can range either over object classes (as illustrated in Fig. 5a and c) or over relationships (as illustrated in Fig. 5b and d). Let \(A_1, A_2\) be activities, \(O\) an object class, and \(R_1, R_2\) activity-object relationships s.t. \(\textit{tc}(A_1, A_2)\in \varSigma ^+_{TC}\), \(\tau _{R_{AC}}(R_{1})=(A_1,O)\), \(\tau _{R_{AC}}(R_{2})=(A_2,O)\) and cref be a co-reference constraint over object classes of the form: \(\textit{cref}(\textit{tc}(A_1, A_2), R_{1}, R_{2}) = O\) (as in Fig. 5a). Then, co-reference over object classes when tc is the \(\mathtt{response}\) temporal constraint is captured by the axiom:

(2)

This expresses that “whenever an object is in the range of \(R_1\) then sometime in the future it must be also in the range of \(R_2\)”. This semantics enforces a temporal constraint over the activities via the co-referenced object, i.e., when the activity \(A_1\) is linked via \(R_1\) to an object in O then it must be followed by an execution of \(A_2\) referencing the same object via \(R_2\). Formally, the following logical implication holds:

(3)

Fig. 6. (a) Trace fragment for (2) but not (4); (b) trace fragment for (8) but not (10)

When tc is the \(\mathtt{unary\text {-}response}\) temporal constraint we need to add to formula (2) another formula that guarantees a unique occurrence of \(A_2\) over the co-referenced object:

(4)

Figure 6a shows a possible instantiation of the OCBC model in Fig. 5a which, in turn, is not a valid fragment if the temporal constraint is changed to \(\mathtt{unary\text {-}response}\).
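The difference between the \(\mathtt{response}\) and \(\mathtt{unary\text{-}response}\) readings of a co-reference over an object class can be sketched on recorded links. The function names are hypothetical, ‘in the future’ is assumed to mean strictly later, and the check is closed-world over a finite fragment: under the open-world semantics of the paper, a missing later link does not by itself make a fragment non-conforming, since the fragment may still be extended.

```python
def range_times(links, rel):
    """Map each object to the sorted time points at which it is a target of rel."""
    times = {}
    for name, _event, obj, t in links:
        if name == rel:
            times.setdefault(obj, []).append(t)
    return {obj: sorted(ts) for obj, ts in times.items()}

def cref_response(links, r1, r2, unary=False):
    """Each r1-target is an r2-target at a later time (at exactly one if unary)."""
    r2_times = range_times(links, r2)
    for obj, ts in range_times(links, r1).items():
        for t in ts:
            later = [u for u in r2_times.get(obj, []) if u > t]
            if not later or (unary and len(later) != 1):
                return False
    return True

# (relationship, event, object, time), taken from the trace fragment of Example 1.
links = [
    ("fills", "pi1", "ol1", 1), ("fills", "pi2", "ol2", 2),
    ("prepares", "wi1", "ol1", 3), ("prepares", "wi2", "ol2", 4),
    ("fills", "pi3", "ol3", 5), ("prepares", "wi3", "ol3", 6),
]
# Hypothetical extra wrapping of ol1, not part of the original fragment.
wrapped_twice = links + [("prepares", "wi4", "ol1", 8)]
```

The original links pass both checks, whereas `wrapped_twice` still passes the plain response check but fails the unary one, mirroring the situation described for Fig. 6a.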

Similar formulas hold when tc is a temporal constraint over the past, i.e., either \(\mathtt {precedence}\) (formula (5)), \(\mathtt {unary\text {-}precedence}\) (formulas (5) and (6)) or \(\mathtt {responded\text {-}existence}\) (formula (7)).

(5)
(6)
(7)

We now consider co-reference constraints over relationships. As in Fig. 5b, let \(A_1, A_2\) be activities, \(R_1, R_2\) activity-object relationships, and \(R\) a relationship of the data model, with \(\tau (R)=(O_1,O_2)\), \(\tau _{R_{AC}}(R_{1})=(A_1,O_1)\), \(\tau _{R_{AC}}(R_{2})=(A_2,O_2)\) and cref be a co-reference of the form: \(\textit{cref}(\textit{tc}(A_1, A_2),R_{1},R_{2}) = R.\) Then, the semantics of co-reference over relationships when tc is the \(\mathtt{response}\) constraint is captured by:

(8)

This expresses that “every object in the range of \(R_1\) should sometime in the future be connected via R to an object in the range of \(R_2\).” A logical implication similar to (3) holds:

(9)
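
The quoted reading of axiom (8) can likewise be sketched in DL syntax. As before, this is an assumed rendering rather than the exact axiom: \(\exists R_i^{-}\) abbreviates \(\exists R_i^{-}\mathpunct{.}\top\), and \(\Diamond_F\) is a "sometime in the future" operator.

\[
\exists R_1^{-} \;\sqsubseteq\; \Diamond_F\, \exists R \mathpunct{.} \exists R_2^{-}
\]

In this form the filler of \(\exists R\) is itself a concept (\(\exists R_2^{-}\)), which is one way to see why co-reference over relationships calls for a qualified existential restriction.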

When tc is \(\mathtt {unary\text {-}response}\) we should add to formula (8) another formula that guarantees that activity \(A_1\) is followed by a single occurrence of \(A_2\) via R. The following axiom expresses that “whenever an object is in the range of \(R_2\) (thus under the occurrence of \(A_2\)) and is connected via \(R^-\) to an object that before was in the range of \(R_1\) (due to the occurrence of the activity \(A_1\)) then, it will never be in the range of \(R_2\).”

(10)

Figure 6b shows an instantiation of the OCBC model in Fig. 5b that is no longer a valid fragment if the temporal constraint is changed to \(\mathtt{unary\text {-}response}\) (because \(o_2\) is pointed to by two different instances of the activity \(A_2\), namely \(b_1\) and \(b_2\)).
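
The situation described for Fig. 6b can be mimicked with a small check over a finite fragment. This is a sketch under simplifying assumptions, not a reproduction of the figure: the event tuples, the instance and object names, and the function `followers_via_rel` are hypothetical; the relationship R is treated as a fixed set of pairs, and the future is taken to be strict.

```python
def followers_via_rel(events, rel, a1, a2):
    """For each a1 instance, list the later a2 instances whose object is
    connected via rel to the object referenced by the a1 instance.

    events: iterable of (instance, time, activity, obj) tuples.
    rel:    set of (o1, o2) pairs, assumed not to change over time.
    """
    result = {}
    for inst1, t1, act1, o1 in events:
        if act1 != a1:
            continue
        result[inst1] = [
            inst2
            for inst2, t2, act2, o2 in events
            if act2 == a2 and t2 > t1 and (o1, o2) in rel
        ]
    return result


# Hypothetical fragment: a1 references o1; b1 and b2 both reference o2.
events = [
    ("a1", 0, "A1", "o1"),
    ("b1", 1, "A2", "o2"),
    ("b2", 2, "A2", "o2"),
]
rel = {("o1", "o2")}

followers = followers_via_rel(events, rel, "A1", "A2")
response_ok = all(followers.values())                    # every a1 has a follower
unary_ok = all(len(f) <= 1 for f in followers.values())  # at most one follower

print(followers)    # {'a1': ['b1', 'b2']}
print(response_ok)  # True
print(unary_ok)     # False
```

The fragment is acceptable for \(\mathtt{response}\) but not for \(\mathtt{unary\text {-}response}\), since two \(A_2\) instances follow the same \(A_1\) instance through R.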

Similar formulas hold when tc is \(\mathtt {precedence}\) (axiom (11)), \(\mathtt {unary\text {-}precedence}\) (axioms (11) and (12)) and \(\mathtt {responded\text {-}existence}\) (axiom (13)):

(11)
(12)
(13)

Note that axiom (13) allows \(\mathtt {responded\text {-}existence}\) to be symmetric, as is the case for axiom (7).

We now consider co-references in the presence of negative behavioral constraints (see Fig. 5c-d). We start with co-reference over object classes. In case tc is \(\mathtt {non\text {-}response}\) (as in Fig. 5c) then the following axiom expresses that “whenever an object is in the range of \(R_1\) then never in the future it could be in the range of \(R_2\)”:

(14)

As a consequence of this axiom, and of the fact that the domains of \(R_1\) and \(R_2\) are activities \(A_1\) and \(A_2\), while they both range over the same class O, we can also read this negative co-reference as “every instance of activity \(A_1\) can never be followed by instances of \(A_2\) sharing the same object in O”. The right-hand side of the axiom is the negation of the right-hand side of axiom (2). When tc is \(\mathtt {non\text {-}coexistence}\), we have

(15)

Again, the right-hand side is the negation of the right-hand side of axiom (7).
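
Unlike positive constraints, negative co-references over an object class can be violated explicitly within a finite fragment. The sketch below illustrates the two readings; the event tuples and function names are hypothetical, and we assume that \(\mathtt {non\text {-}response}\) looks only at strictly later time points while \(\mathtt {non\text {-}coexistence}\) ignores the order of events.

```python
def violates_non_response(events, a1, a2):
    """True if some object is referenced by a1 and, strictly later, by a2.

    events: iterable of (time, activity, obj) tuples.
    """
    return any(
        act1 == a1 and act2 == a2 and o1 == o2 and t2 > t1
        for t1, act1, o1 in events
        for t2, act2, o2 in events
    )


def violates_non_coexistence(events, a1, a2):
    """True if some object is referenced by both a1 and a2, in any order."""
    objs_a1 = {o for _, act, o in events if act == a1}
    objs_a2 = {o for _, act, o in events if act == a2}
    return bool(objs_a1 & objs_a2)


# Hypothetical fragment: A2 touches o1 before A1 does.
events = [(0, "A2", "o1"), (1, "A1", "o1")]

print(violates_non_response(events, "A1", "A2"))     # False
print(violates_non_coexistence(events, "A1", "A2"))  # True
```

The sample fragment separates the two constraints: since \(A_2\) occurs before \(A_1\) on the shared object, only \(\mathtt {non\text {-}coexistence}\) is violated.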

When negative co-references involve a relationship and tc is \(\mathtt {non\text {-}response}\) (as in Fig. 5d) the following axiom expresses that “whenever an object is in the range of \(R_1\) then never in the future it could be connected via R to an object in the range of \(R_2\) (thus under the occurrence of \(A_2\))”:

(16)

implying that “every instance of activity \(A_1\) can never be followed by instances of \(A_2\) sharing the same pair of objects in R”. Notice again that the right-hand side of the above axiom is the negation of the right-hand side of axiom (8). Finally, by negating the right-hand side of axiom (13) we capture the case when tc is \(\mathtt {non\text {-}coexistence}\):

(17)

Similar to \(\mathtt {responded\text {-}existence}\), \(\mathtt {non\text {-}coexistence}\) over both object classes (15) and relationships (17) is symmetric.

Altogether, an OCBC model can be captured via a TBox in the underlying temporal DL, and its trace fragments via corresponding ABoxes. Overall, a KB thus provides a uniform representation for OCBC, on which we can apply ad hoc reasoning services as described in the following section.

5 Verification and Reasoning over OCBC Models

The main motivation to provide a mapping from OCBC models to a DL Knowledge Base is the possibility of carrying out automated reasoning over them. We discuss how the typical services for verifying declarative, constraint-based process models can be lifted to the more sophisticated setting of OCBC. To do so, we build on the services defined for the well-established DECLARE language [24, 25]. In the following, we show how such services can be reformulated as standard reasoning tasks over knowledge bases, in turn inheriting their decidability and worst-case complexity.

Let \(\mathcal {M}\) be an OCBC model of interest, and \(\rho \) a trace fragment over \(\mathcal {M}\). We denote by \(\mathcal {T}_\mathcal {M}\) and \(\mathcal {A}_\rho \) the TBox and ABox obtained by encoding \(\mathcal {M}\) and \(\rho \) in the underlying temporal DL, and by \(\mathcal {K}_{\mathcal {M},\rho }\) the resulting KB, i.e., \(\mathcal {K}_{\mathcal {M},\rho }= (\mathcal {T}_\mathcal {M},\mathcal {A}_\rho )\).

Model Consistency. The most fundamental service is to check whether \(\mathcal {M}\) is consistent, that is, whether it supports the empty trace fragment (in turn witnessing that it supports at least one full trace). This directly reduces to checking whether \(\mathcal {T}_\mathcal {M}\) is satisfiable.

Activity Executability. An OCBC model may be consistent and yet include so-called dead activities [25], i.e., activities that cannot be executed at all. We can check whether an activity A in \(\mathcal {M}\) can be executed by verifying that it is not logically implied to be empty in the corresponding TBox, i.e., \(\mathcal {T}_\mathcal {M}\not \models \, A\sqsubseteq \bot \).

Fig. 7. Implied (a) and non-implied (b) constraints by the OCBC model of Fig. 2

Implied Properties. Let \(\alpha \) be a model property expressible in the underlying temporal DL. We can check whether \(\mathcal {M}\models \, \alpha \) by checking whether \(\mathcal {K}_{\mathcal {M},\rho }\models \, \alpha \). E.g., (3) is a property implied by \(\mathcal {M}\). The presented encoding of OCBC into this logic allows us to use its reasoning capabilities to detect so-called hidden constraints [24], i.e., constraints that are implicitly present in \(\mathcal {M}\) even though they are not shown graphically.

Example 4

Consider again the OCBC model of Fig. 2 and the two constraints in Fig. 7, where Fig. 7a captures that an order can be paid only if it has been created before, and Fig. 7b that no order line of an order can be wrapped after that order is paid. It is easy to verify that the former constraint is indeed implied, while the latter is not. While it is true that once an order is paid no further items can be picked for it, already picked order lines may still need to be wrapped.

Execution Trace Compliance. This amounts to checking whether a trace fragment \(\rho \) satisfies the constraints in \(\mathcal {M}\). Since \(\rho \) is a trace fragment, we require that no explicit violation is contained in \(\rho \) and that \(\rho \) can be 'completed' into a fully specified, infinite trace that satisfies \(\mathcal {M}\). This corresponds to the notion of conditional compliance recently introduced in [15]. In our setting, this amounts to checking whether the ABox \(\mathcal {A}_\rho \) encoding \(\rho \) is satisfiable w.r.t. the TBox \(\mathcal {T}_\mathcal {M}\), i.e., whether the KB \(\mathcal {K}_{\mathcal {M},\rho }\) is satisfiable.

Complexity Considerations. Notice that KB satisfiability and logical implication are mutually reducible in \(\mathcal {ALCQI}\) [6] (and thus in its temporal extension used here), and that these reasoning problems are ExpTime-complete for the temporal logic [18, 27], which establishes an ExpTime upper bound for verifying properties of OCBC models. The need to use \(\mathcal {ALCQI}\) as the base DL is due to co-reference constraints over relationships, which require the power of the qualified existential (\(\exists R\mathpunct {\text{. }}C\)) and its dual. If we renounce such constraints (i.e., only consider OCBC constraints co-referring on classes), we could use a temporalized version of a DL-Lite dialect. In particular, a temporal DL-Lite fragment shown to be PSpace-complete in [8] is able to capture OCBC models with the exception of co-reference constraints over relationships. At the level of the data model, it captures the main constructs of UML, with the exception of ISA between relationships and n-ary relationships (cf. [4, 7] for details).

6 Conclusions

We presented the first complete formalization of object-centric behavioral constraints (OCBC): a new approach to business process modeling where data models and declarative constraints over activities are seamlessly integrated. Our approach comes with a logic-based semantics for OCBC in terms of an encoding into a temporal DL. This unambiguously defines the meaning of OCBC models, and lays the foundations for reasoning over them, allowing us to understand the (decidability and) complexity boundaries of reasoning tasks over OCBC models. The temporal DL interprets time as a linear, infinite structure, which contrasts with the finite-trace semantics adopted in other declarative process modeling languages such as Declare. The study of temporal description logics with finite-time semantics is rather novel [9], and may constitute the basis for reasoning over OCBC models on finite traces.

We have considered here standard data models to capture the structural aspects of OCBC. Variants of OCBC with non-conventional temporalized cardinality constraints over relationships have been used [21, 22]. We intend to study whether such constraints affect the decidability and complexity of reasoning over OCBC models.

In our research agenda, we are interested not only in design-time reasoning of OCBC models, but also in enactment, monitoring, and runtime verification. This poses two major challenges. On the one hand, a monitored trace has to be considered under a “partially closed” semantics, that is, by interpreting it as a complete record of what happened so far, while missing information about the future. On the other hand, a more fine-grained analysis, in the style of [23], regarding if and how a monitored trace conforms to an OCBC model is needed. We intend to attack this problem by combining finite and infinite reasoning over a partially closed knowledge base.