
1 Introduction

Verification problems of complex embedded software can be reduced to solving logic formulas that contain continuous, typically nonlinear, real functions. The framework of \(\delta \)-decision procedures [19, 21] establishes that, under reasonable relaxations, nonlinear SMT formulas over the reals are in principle as solvable as SAT problems. Indeed, using solvers for nonlinear theories as the algorithmic engines, straightforward bounded model checking has already shown promise on nonlinear hybrid systems [9, 28]. Naturally, for enhancing performance, more advanced reasoning techniques need to be introduced, extending SMT towards general quantifier elimination. However, it is well-known that quantifier elimination is not feasible for nonlinear theories over the reals. The complexity of quantifier elimination for real arithmetic (i.e., polynomials only) has a double-exponential lower bound, which is too high for most applications; when transcendental functions are further involved, the problem becomes highly undecidable.

Craig interpolation provides a weak form of quantifier elimination. Given two formulas A and B, such that \(A~{\wedge }~B\) is unsatisfiable, an interpolant I is a formula satisfying: (1) \(A~{\Rightarrow }~I\), (2) \(B~{\wedge }~I~{\Rightarrow }~{\bot }\), and (3) I contains only variables common to A and B. It has found many applications in verification: as a heuristic to compute inductive invariants [30, 33, 35], for predicate discovery in abstraction refinement loops [32], interprocedural analysis [2, 3], shape analysis [1], fault localisation [10, 17, 39], and so on.

In this paper, we present methods for computing Craig interpolants in expressive nonlinear theories over the reals. To do so, we extract interpolants from proofs of unsatisfiability generated by \(\delta \)-decision procedures [22] that are based on Interval Constraint Propagation (ICP) [6]. The proposed algorithms are guaranteed to find the interpolants between two formulas A and B, whenever \(A~{\wedge }~B\) is not \(\delta \)-satisfiable.

The framework of \(\delta \)-decision procedures formulates a relaxed notion of logical decisions, allowing one-sided \(\delta \)-bounded errors [18, 19]. Instead of asking whether a formula has a satisfying assignment or not, we ask if it is “\(\delta \)-satisfiable” or “unsatisfiable”. Here, a formula is \(\delta \)-satisfiable if it would be satisfiable under some \(\delta \)-perturbation of the original formula [18]. On the other hand, when the algorithm determines that the formula is “unsatisfiable”, the answer is definite and no numerical error is involved. Indeed, we can extract proofs of unsatisfiability from such answers, even though the search algorithms themselves involve numerical errors [22]. This is accomplished by analyzing the execution trace of the search tree built by the ICP algorithm.

The core ICP algorithm uses a branch-and-prune loop that aims to either find a small enough box that witnesses \(\delta \)-satisfiability, or detect that no solution exists. The loop consists of two main steps:

  • (Prune) Use interval arithmetic to maintain overapproximations of the solution sets, so that one can “prune” out the part of the state space that does not contain solutions.

  • (Branch) When the pruning operation does not make progress, one performs a depth-first search by “branching” on a variable and restarts the pruning operations on the resulting subsets of the domain.

The loop continues until either a small enough box that may contain a solution is found, or a conflict among the constraints is observed.
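The pruning step can be made concrete with plain interval arithmetic. The toy Python sketch below (our own illustration, not dReal's implementation) contracts the domains of x and y under the single constraint y = x²; the function names and the interval encoding are invented for the example.

```python
import math

def square_range(lo, hi):
    """Interval extension of x**2 over [lo, hi]."""
    cands = [lo * lo, hi * hi]
    contains_zero = lo <= 0.0 <= hi
    return (0.0 if contains_zero else min(cands), max(cands))

def prune_y(x_dom, y_dom):
    """Contract y's interval for y = x**2: y must lie in the image of x_dom."""
    img_lo, img_hi = square_range(*x_dom)
    lo, hi = max(y_dom[0], img_lo), min(y_dom[1], img_hi)
    return None if lo > hi else (lo, hi)   # None signals a conflict (empty box)

def prune_x(x_dom, y_dom):
    """Contract x's interval for y = x**2: x**2 <= y_hi, so |x| <= sqrt(y_hi)."""
    if y_dom[1] < 0:
        return None                        # x**2 can never be negative
    bound = math.sqrt(y_dom[1])
    lo, hi = max(x_dom[0], -bound), min(x_dom[1], bound)
    return None if lo > hi else (lo, hi)

print(prune_y((-1.0, 1.0), (-1.0, 1.0)))   # (0.0, 1.0)
print(prune_x((-1.0, 1.0), (-1.0, 0.25)))  # (-0.5, 0.5)
```

One forward pass shrinks y to the image of x², and one backward pass shrinks x using the bound on y; real ICP iterates such contractions over all constraints until a fixed point.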

Fig. 1.

Interval constraint propagation and interpolant construction, where A is \(y~{\ge }~x{^2}\) and B is \(y~{\le }~-\cos (x) + 0.8\) over the domain \(x~{\in }~[-1,1]\), \(y{\in }[-1,1]\). A is shown in green and B in red. The final interpolant is the green part (Color figure online).

When a formula is unsatisfiable, the execution trace of the algorithm generates a (potentially large) proof tree that divides the space into small hypercubes and associates a constraint with each hypercube [22]. The interpolation algorithm essentially traverses this proof tree to construct the interpolant. To each leaf in the proof, we associate \(\top \) or \(\bot \) depending on the source of the contradiction. The inner nodes of the proof tree correspond to case splits and are handled in a manner reminiscent of Pudlák’s algorithm [37]. Common variables are kept as branching points, and variables local to A or B are eliminated. A simple example of the method follows:

Example 1

Let \(A: y~{\ge }~x{^2}\) and \(B: y~{\le }~-\cos (x) + 0.8\) be two constraints over the domain \(x~{\in }~[-1,1]\), \(y{\in }[-1,1]\). A \(\delta \)-decision procedure uses A and B to contract the domains of x and y by removing the parts that can be shown empty using interval arithmetic. Figure 1 shows a sequence of contractions proving the unsatisfiability of the formula. As each contraction occurs, we color the removed region of the space with the color of the opposite formula. When the interval constraint propagation has finished, every part of the initial domain is associated with either A or B. The interpolant I is composed of the parts corresponding to A. We will compute that I is \(y~{\ge }~0~{\wedge }~(0.26~{\le }~y~{\vee }~(y~{\le }~0.26~{\wedge }~-0.51~{\le }~x~{\le }~0.51))\).
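The claimed interpolant can be sanity-checked numerically. The sketch below (our own, independent of the paper's implementation) samples the domain on a grid and checks the two interpolant conditions, \(A \Rightarrow I\) and the unsatisfiability of \(B \wedge I\); a grid check is only evidence, of course, not a proof.

```python
import math

def A(x, y):  # A: y >= x**2
    return y >= x * x

def B(x, y):  # B: y <= -cos(x) + 0.8
    return y <= -math.cos(x) + 0.8

def I(x, y):  # candidate interpolant from Example 1
    return y >= 0 and (0.26 <= y or (y <= 0.26 and -0.51 <= x <= 0.51))

# Grid over [-1,1] x [-1,1]: check A => I, and that B ∧ I has no solution.
pts = [-1 + i / 100 for i in range(201)]
ok1 = all(I(x, y) for x in pts for y in pts if A(x, y))
ok2 = not any(B(x, y) and I(x, y) for x in pts for y in pts)
print(ok1, ok2)  # True True
```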

We have implemented the algorithms in the SMT solver dReal [20]. We show examples of applications from various domains, such as control, robotic design, and hybrid system verification.

Related Work. Our algorithm is very similar to the algorithm for propositional interpolation studied by Pudlák [37]. Craig interpolation for real or integer arithmetic has focused on the linear fragments LA(\(\mathbb {R}\)) [31, 38] and LA(\(\mathbb {Z}\)) [8, 24]. Dai et al. [15] present a method to generate interpolants for polynomial formulas. Their method uses semi-definite programming to search for a polynomial interpolant, and it is complete under the Archimedean condition. In fact, the Archimedean condition imposes restrictions similar to \(\delta \)-decidability, e.g., variables over bounded domains and limited support for strict inequalities. Our method is more general in that it handles nonlinear fragments over \(\mathbb {R}\) that include transcendental functions and solution functions of ordinary differential equations. Existing tools that compute interpolants, such as MathSat5 [12], Princess [8], SmtInterpol [11], and Z3 [34], focus on linear arithmetic. We are the first to provide interpolation in nonlinear theories.

Outline. In Sect. 2, we review notions related to interpolation, nonlinear arithmetic over the reals, and \(\delta \)-decision procedures. In Sect. 3, we introduce our interpolation algorithm. In Sect. 4, we present and evaluate our implementation. We conclude and sketch future research directions in Sect. 5.

2 Preliminaries

Craig Interpolation [14]. Craig interpolants were originally defined in propositional logic, but can be easily extended to first-order logic. Given two quantifier-free first-order formulas A and B, such that \(A~{\wedge }~B\) is unsatisfiable, a Craig interpolant I is a formula satisfying:

  • \(A~{\Rightarrow }~I\);

  • \(B~{\wedge }~I~{\Rightarrow }~{\bot }\);

  • \(fv(I)~{\subseteq }~fv(A)~{\cap }~fv(B)\) where \(fv(\cdot )\) returns the free variables in a formula.

Intuitively, I provides an overapproximation of A that is still precise enough to exhibit its conflict with B. In particular, I involves only variables (geometrically, dimensions) that are shared by A and B.

Notation 1

We use the meta-level symbol \(\Rightarrow \) as a shorthand for logical implication in the text. In the proof rules that we introduce shortly, \(\vdash \) is the formal symbol with the standard interpretation as logical derivation.

\(\delta \)-Complete Decision Procedures. We consider first-order formulas interpreted over the real numbers. Our special focus is on formulas that can contain arbitrary nonlinear functions that are Type 2 computable [7, 40]. Intuitively, Type 2 computability corresponds to numerical computability. For our purposes, it is enough to note that this set of functions consists of all common elementary functions, as well as solutions of Lipschitz-continuous ordinary differential equations.

Interval Constraint Propagation (ICP) [6] finds solutions of real constraints using the branch-and-prune method, combining interval arithmetic and constraint propagation. The idea is to use interval extensions of functions to prune out sets of points that are not in the solution set, and to branch on intervals when such pruning cannot be done, recursively, until a small enough box that may contain a solution is found or an inconsistency is observed. A high-level description of the decision version of ICP is given in Algorithm 1 [6, 18]. The boxes, or interval domains, are written as \(\varvec{D}\), and \(c_i\) denotes the ith constraint.

Algorithm 1. The branch-and-prune ICP loop (pseudocode not reproduced).
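A minimal rendition of this branch-and-prune loop might look as follows; the data representation (boxes as dicts, contractors as functions returning a smaller box or None on conflict) is our own simplification of Algorithm 1, not dReal's code.

```python
def icp_solve(box, contractors, eps=1e-3):
    """Simplified decision version of the branch-and-prune loop.
    box: dict var -> (lo, hi); contractors: functions box -> box or None."""
    stack = [box]                           # worklist of boxes (depth-first)
    while stack:
        b = stack.pop()
        progress = True
        while progress and b is not None:   # prune to a fixed point
            progress = False
            for contract in contractors:
                nb = contract(b)
                if nb is None:              # conflict: box proved empty
                    b = None
                    break
                if nb != b:
                    progress = True
                b = nb
        if b is None:
            continue                        # backtrack to the next box
        # pick the widest variable; report the box if it is small enough
        v = max(b, key=lambda k: b[k][1] - b[k][0])
        lo, hi = b[v]
        if hi - lo < eps:
            return ("delta-sat", b)
        mid = (lo + hi) / 2                 # branch: split v at its midpoint
        stack.append({**b, v: (lo, mid)})
        stack.append({**b, v: (mid, hi)})
    return ("unsat", None)

# Two contradictory pruning operators, for x >= 0.5 and x <= 0.3:
def c_ge(b):
    lo, hi = b["x"]
    lo = max(lo, 0.5)
    return None if lo > hi else {**b, "x": (lo, hi)}

def c_le(b):
    lo, hi = b["x"]
    hi = min(hi, 0.3)
    return None if lo > hi else {**b, "x": (lo, hi)}

print(icp_solve({"x": (0.0, 1.0)}, [c_ge, c_le]))  # ('unsat', None)
```

With only `c_ge`, the same loop instead keeps splitting until it reaches a box narrower than `eps` and answers delta-sat.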

Proofs from Constraint Propagation. A detailed description of proof extraction from \(\delta \)-decision procedures is available in [22]; here, we use a simplified version. Intuitively, the proof of unsatisfiability recursively divides the solution space into small pieces, until it can prove (mostly using interval arithmetic) that each piece of the domain contains no solution of the original system. Note that in such a proof, the difference between pruning and branching operations becomes blurred, for the following reason.

Pruning operations show that one part of the domain can be discarded because no solution can exist there. Branching operations split the domain along one variable and generate two sub-problems. From a proof perspective, the difference between the two kinds of operations is simply whether the emptiness of one part of the domain follows from simple properties of the functions (theory lemmas) or requires further derivations. Indeed, as shown in [22], the simple proof system in Fig. 2 is enough to establish all theorems that can be obtained by \(\delta \)-decision procedures. The rules can be explained as follows.

  • The Split rule divides the solution space into two disjoint subspaces.

  • The theory lemmas (ThLem) are the leaves of the proof. They are used when the solver manages to prove the absence of solutions in a given subspace.

  • The Weakening rule extracts a single conjunct out of the main formula.

Each step of the proof has a set of variables \(\varvec{x}\) with a domain \(\varvec{D}\), and F is a formula. We use vector notation in the formulas, writing \(\varvec{x}~{\in }~\varvec{D}\) to denote \(\bigwedge _i x_i~{\in }~D_i\). The domains are intervals, i.e., each \(D_i\) has the form \([l_i,u_i]\), where \(l_i\), \(u_i\) are the lower and upper bounds for \(x_i\). Since we are looking at unsatisfiability proofs, each node implies \({\bot }\). The root of the proof is the formula \(A~{\wedge }~B\), and \(\varvec{D}\) covers the entire domain. The inner nodes are Split steps, and the proof’s leaves are theory lemmas directly followed by weakening. To avoid duplication, we do not give a separate example here, since the full example in Fig. 5 shows the structure of proof trees obtained from these rules.

Fig. 2.

Proof rules for the ICP algorithm. We use standard sequent calculus notation. Also, when we write an interval \([a,b]\), we always assume that it is a well-defined real interval satisfying \(a\le b\).

A proof of unsatisfiability can be extracted from an execution trace of Algorithm 1 when it returns unsat. The algorithm starts at the root of the proof tree and explores the proof tree depth-first. Branching (line 9) directly corresponds to the \(\textsf {Split} \) rule. Pruning (line 5), on the other hand, is a combination of the three rules. Let us look at \(\varvec{D}' = \text {Prune}(\varvec{D}, c_i)\). The constraint \(c_i\) is selected with the Weakening rule. For each \(D'_i=[l',u']\) which is strictly smaller than \(D_i=[l,u]\), the Split and ThLem rules are applied. If \(u'<u\), then we split at \(u'\) and a lemma shows that the interval \([u',u]\) has no solution. The same is done for the lower bounds l, \(l'\). Figure 3 shows a pruning step and the corresponding proof.
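The conversion of one pruning step into Split/ThLem/Weakening steps can be sketched as a small bookkeeping function; the record encoding below is invented purely for illustration.

```python
def prune_to_proof(var, old, new, constraint):
    """Record the proof steps that justify shrinking `var`'s interval
    from `old` to `new` using `constraint`: select the constraint with
    Weakening, then, for each trimmed end, Split at the new bound and
    discharge the trimmed sub-interval with a theory lemma."""
    (l, u), (l2, u2) = old, new
    steps = [("Weakening", constraint)]
    if u2 < u:
        steps.append(("Split", var, u2))                   # split [l,u] at u2
        steps.append(("ThLem", constraint, var, (u2, u)))  # [u2,u] is empty
    if l2 > l:
        steps.append(("Split", var, l2))                   # split at l2
        steps.append(("ThLem", constraint, var, (l, l2)))  # [l,l2] is empty
    return steps

print(prune_to_proof("x", (0.0, 1.0), (0.0, 0.6), "y >= x**2"))
# [('Weakening', 'y >= x**2'), ('Split', 'x', 0.6),
#  ('ThLem', 'y >= x**2', 'x', (0.6, 1.0))]
```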

Fig. 3.

Pruning operation and the corresponding proof. The pruning shrinks the domain of x from \([l,u]\) to \([l,u']\). The corresponding proof starts with a Split at \(u'\). The interval \([u',u]\) is proved empty using a ThLem and a Weakening step. The remaining \([l,u']\) interval is shown empty by further derivations.

3 Interpolants in Nonlinear Theories

Intuitively, a proof of unsatisfiability is a partition of the solution space where each sub-domain is associated with a conjunct c from \(A~{\wedge }~B\); c is a witness showing the absence of solutions in that sub-domain. The interpolation rules traverse the proof and select which parts belong to the interpolant I. We now describe the algorithm for obtaining such interpolants for formulas A and B from the proof of unsatisfiability of \(A\wedge B\).

Fig. 4.

Interpolant producing proof rules

3.1 Core Algorithms

Our method for constructing disjunctive linear interpolants takes two inputs: a proof tree and a labeling function. The labeling function maps formulas and variables to either a, b, or ab. For each proof rule introduced in Fig. 2, we associate a partial interpolant, written in square brackets to the right of the conclusion of the rule. Figure 4 shows these modified versions of the rules.

  • At the leaf level (rule ThLem-I), the tile is in I if c is not part of A, i.e., the contradiction originates from B. If c is in both A and B, then it can be considered as part of either A or B; both cases lead to a correct interpolant.

  • The Weakening-I rule does not influence the interpolant; it is only required to pick c from \(A~{\wedge }~B\).

  • The Split-I rule is the most interesting. Splitting the domain essentially defines the bounds of the subsequent sub-domains. Let x be the variable whose domain is split at value p, and let \(I{_1}\), \(I{_2}\) be the two interpolants for the cases \(x < p\) and \(x~{\ge }~p\). If x occurs in A but not in B, then x cannot occur in I. Since x is in A, we know that A implies \(x < p~{\Rightarrow }~I{_1}\) and \(x~{\ge }~p~{\Rightarrow }~I{_2}\). Eliminating x gives \(I = I{_1}~{\vee }~I{_2}\). A similar reasoning applies when x occurs in B but not in A, and gives \(I = I{_1}~{\wedge }~I{_2}\). When x occurs in both A and B, x is kept in I and acts as a selector: for values of x smaller than p, \(I{_1}\) is selected; otherwise, \(I{_2}\) applies.
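The three cases of Split-I can be captured in a few lines. Here partial interpolants are modeled as Python predicates over an assignment dict, an illustrative encoding of our own rather than the paper's representation.

```python
def combine_split(label, x, p, I1, I2):
    """Combine partial interpolants at a Split-I node.
    label: 'a' if the split variable x occurs only in A, 'b' if only
    in B, 'ab' if shared.  I1 and I2 are the partial interpolants for
    the branches x < p and x >= p, as predicates over a dict."""
    if label == "a":
        # x may not appear in I: eliminating an A-local variable
        # yields the disjunction of the two branch interpolants
        return lambda m: I1(m) or I2(m)
    if label == "b":
        # dually, eliminating a B-local variable yields the conjunction
        return lambda m: I1(m) and I2(m)
    # shared variable: keep x in I as a selector between the branches
    return lambda m: I1(m) if m[x] < p else I2(m)
```

For example, with \(I_1\) testing \(y \ge 0\) and \(I_2\) testing \(y \ge 1\), a shared split on x at p = 0 yields an interpolant that checks \(y \ge 0\) when \(x < 0\) and \(y \ge 1\) otherwise.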

The correctness of our method is shown by the following theorem:

Theorem 1

The rules Split-I, ThLem-I, and Weakening-I generate a Craig interpolant I from the proof of unsatisfiability of \(A~{\wedge }~B\).

Proof

We prove the correctness of the rules by induction. To express the inductive invariant, we split the domain \(\varvec{D}\) into the domains \(\varvec{D}_A\) and \(\varvec{D}_B\), which contain only the intervals of the variables occurring in A and B, respectively.

At any given point in the proof, the partial interpolant I is an interpolant for the formula A over \(\varvec{D}_A\) and B over \(\varvec{D}_B\). At the root of the proof tree we get an interpolant for the whole domain \(\varvec{D} = \varvec{D}_A~{\wedge }~\varvec{D}_B\).

At the leaves of the proof, i.e., the ThLem-I rule, one of the constraints has no solution over the domain. Assume first that this constraint comes from A. Then the partial interpolant I is \({\bot }\). We have that \(A~{\wedge }~\varvec{D}_A~{\Rightarrow }~I\) by the semantics of the ThLem rule (\({\bot }{\Rightarrow }{\bot }\)). Trivially, \(B~{\wedge }~\varvec{D}_B~{\wedge }~I~{\Rightarrow }~{\bot }\) and \(fv(I) =~{\emptyset }~{\subseteq }~fv(A)~{\cap }~fv(B)\). When the contradiction comes from B, a similar reasoning applies with \(I={\top }\).

The Weakening-I rule only serves to select the constraint that causes the contradiction and does not change the invariant.

The Split-I rule is the most complex case. We have to consider whether the variable x being split comes from A, from B, or is shared. For instance, if \(x~{\in }~fv(A)\), then the induction step has \(\varvec{D}_{A1} = \varvec{D}_A~{\wedge }~x < p\) and \(\varvec{D}_{A2} = \varvec{D}_A~{\wedge }~x~{\ge }~p\), and \(\varvec{D}_B\) is unchanged. If \(x~{\in }~fv(B)\), then \(\varvec{D}_B\) is affected and \(\varvec{D}_A\) is unchanged. If x is shared, then both \(\varvec{D}_A\) and \(\varvec{D}_B\) are affected.

Let us consider the case \(x~{\in }~fv(A)\) and \(x~{\not \in }~fv(B)\). We omit the case where x is in B but not in A, as it is similar. The induction hypothesis is

$$\begin{aligned} A \wedge \varvec{D}_A \wedge x < p~{\Rightarrow }~I_1, \qquad&B \wedge \varvec{D}_B \wedge I_1~{\Rightarrow }~{\bot }, \\ A \wedge \varvec{D}_A \wedge x~{\ge }~p~{\Rightarrow }~I_2, \qquad&B \wedge \varvec{D}_B \wedge I_2~{\Rightarrow }~{\bot }. \end{aligned}$$

Taking \(I = I_1~{\vee }~I_2\), we get \(A~{\wedge }~\varvec{D}_A~{\Rightarrow }~I\), since one of \(x < p\) and \(x~{\ge }~p\) always holds, and \(B~{\wedge }~\varvec{D}_B~{\wedge }~I~{\Rightarrow }~{\bot }\), since each disjunct separately leads to \({\bot }\); note that x does not occur in I.

Finally, we need to consider the case \(x~{\in }~fv(A)\) and \(x~{\in }~fv(B)\). The induction hypothesis is

$$\begin{aligned} A \wedge \varvec{D}_A \wedge x < p~{\Rightarrow }~I_1, \qquad&B \wedge \varvec{D}_B \wedge x < p \wedge I_1~{\Rightarrow }~{\bot }, \\ A \wedge \varvec{D}_A \wedge x~{\ge }~p~{\Rightarrow }~I_2, \qquad&B \wedge \varvec{D}_B \wedge x~{\ge }~p \wedge I_2~{\Rightarrow }~{\bot }. \end{aligned}$$

Taking \(I = (x < p~{\wedge }~I_1)~{\vee }~(x~{\ge }~p~{\wedge }~I_2)\), both conditions follow by a case analysis on x; here x may occur in I because it is shared between A and B.

   \(\square \)

Example 2

If we look at the proof for the example in Fig. 1, we get the proof annotated with the partial interpolants shown in Fig. 5. The final interpolant \(I{_5}\) is \(0{\le }y~{\wedge }~(0.26{\le }y~{\vee }~(y{\le }0.26~{\wedge }~-0.51~{\le }~x~{\le }~0.51))\).

Fig. 5.

Proof of unsatisfiability where A is \(y~{\ge }~x{^2}\), B is \(y~{\le }~-\cos (x) + 0.8\) along with the corresponding interpolant

Boolean Structure. The method presented above explains how to compute an interpolant for the conjunctive fragment of quantifier-free nonlinear theories over the reals. However, in many cases formulas also contain disjunctions. To handle disjunctions, our method can be combined with the method presented by Yorsh and Musuvathi [41] for building an interpolant from a resolution proof where some of the proof’s leaves carry theory interpolants.

Handling ODE Constraints. A special focus of \(\delta \)-complete decision procedures is on constraints that are defined by ordinary differential equations, which is important for hybrid system verification. In the logic formulas, the ODEs are treated simply as a class of constraints over variables that represent both the state space and time. Here we elaborate on the proofs and interpolants for the ODE constraints.

Let \(\varvec{g}:\mathbb {R}^n\rightarrow \mathbb {R}^n\) be a Lipschitz-continuous Type 2 computable function, let \(t_0, T\in \mathbb {R}\) satisfy \(t_0\le T\), and let \(\varvec{x}_0\in \mathbb {R}^n\). Consider the initial value problem

$$ \frac{\mathrm {d}\varvec{x}}{\mathrm {d}t} = \varvec{g}(\varvec{x}(t)) \text{ and } \ \varvec{x}(t_0) = \varvec{x}_0, \text{ where } t\in [t_0, T]. $$

It has a solution function \(\varvec{x}: [t_0, T]\rightarrow \mathbb {R}^n\), which is itself a Type 2 computable function [40]. Thus, in the first-order language \(\mathcal {L}_{\mathbb {R}_{\mathcal {F}}}\) we can write formulas like

$$ \Big (||\varvec{x}_0||=0\Big ) \wedge \Big ( \varvec{x}_t = \varvec{x}_0+\int _0^t \varvec{g}(\varvec{x}(s))\mathrm {d}s\Big ) \wedge \Big (||\varvec{x}_t|| > 1\Big ) $$

which is satisfiable when the system defined by the vector field \(\varvec{g}\) has a trajectory from some point with \(||\varvec{x}(0)||=0\) to a point with \(||\varvec{x}(t)||>1\) at some time t. Note that we use first-order variable vectors \(\varvec{x}_0\) and \(\varvec{x}_t\) to represent the value of the solution function \(\varvec{x}\) at time 0 and t. Also, the combination of equality and integration in the second conjunct simply denotes a single constraint over the variables \((\varvec{x}_0, \varvec{x}_t, t)\).

In the \(\delta \)-decision framework, we perform interval-based integration for ODE constraints that satisfies the following. Suppose the time domain for the ODE constraint in question is \([t_0,T]\). Let \(t_0\le t_1\le \cdots \le t_m\le T\) be a sequence of time points. An interval-based integration algorithm computes boxes \(D_{t_1},...,D_{t_m}\) such that

$$ \forall i\in \{1,...,m\},\; \{\varvec{x}(t): t_{i-1}\le t\le t_{i}, \varvec{x}_0\in D_{\varvec{x}_0}\}\subseteq D_{t_i}. $$

Namely, it computes a sequence of boxes such that all possible trajectories are contained in them over time. Thus, ODE constraints can be handled in the same way as non-ODE constraints: their solution sets are covered by a set of small boxes. Consequently, the proof rules from Fig. 4 apply directly to ODE constraints.
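The idea of enclosing all trajectories in boxes can be illustrated with a naive interval Euler scheme. A verified integrator must also bound the local truncation error; this sketch (entirely our own, not dReal's integrator) omits that remainder term and only conveys the shape of the computation.

```python
import math

def interval_euler(f_interval, x0, t_end, h):
    """Propagate an interval X with X_{k+1} = X_k + h * f_interval(X_k)
    for dx/dt = f(x), collecting one box per time step.  Caveat: without
    a rigorous remainder term this is not a verified enclosure in general."""
    boxes = [x0]
    (lo, hi), t = x0, 0.0
    while t < t_end - 1e-12:
        dlo, dhi = f_interval((lo, hi))   # interval extension of f
        lo, hi = lo + h * dlo, hi + h * dhi
        boxes.append((lo, hi))
        t += h
    return boxes

# dx/dt = -x: the interval extension of f(x) = -x is [-hi, -lo].
boxes = interval_euler(lambda iv: (-iv[1], -iv[0]), (0.9, 1.1), 1.0, 0.01)
lo, hi = boxes[-1]
# for this example the final box contains the exact flow endpoints
print(lo <= 0.9 * math.exp(-1) and 1.1 * math.exp(-1) <= hi)  # True
```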

3.2 Extensions

For any two formulas A, B whose conjunction is unsatisfiable, the interpolant I is not unique. In practice, it is difficult to know a priori what a good interpolant is. Therefore, it is desirable to be able to generate and test multiple interpolants. We now explain how to get interpolants of different logical strength. An interpolant \(I{_1}\) is stronger than an interpolant \(I{_2}\) iff \(I{_1}~{\Rightarrow }~I{_2}\). Intuitively, a stronger interpolant is closer to A and a weaker one is closer to B.

Parameterizing Interpolation Strength. The interpolation method that we propose uses a \(\delta \)-decision procedure to build a Craig interpolant. That I is an interpolant means that \(A~{\wedge }~{\lnot }I\) and \(B~{\wedge }~I\) are both unsatisfiable. However, these formulas might still be \(\delta \)-satisfiable.

To obtain an interpolant such that both \(A~{\wedge }~{\lnot }I\) and \(B~{\wedge }~I\) are \(\delta \)-unsatisfiable, we can weaken both A and B by a factor \(\delta \). However, \(A~{\wedge }~B\) must be at least \(3{\delta }\)-unsatisfiable to guarantee that the solver finds a proof of unsatisfiability. Furthermore, we can also introduce perturbations on only one side in order to make the interpolant stronger or weaker. To introduce a perturbation \(\delta \), we apply the following rewriting to every inequality in A and/or B:

$$\begin{aligned} \begin{array}{lcl} L = R &{} ~~ \mapsto ~~ &{} L~{\ge }~R -~{\delta }~{\wedge }~L~{\le }~R +~{\delta }~\\ L~{\ge }~R &{} ~~ \mapsto ~~ &{} L~{\ge }~R -~{\delta }~\\ L> R &{} ~~ \mapsto ~~ &{} L > R -~{\delta }~ \end{array} \end{aligned}$$
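Applied mechanically, this rewriting looks as follows; the triple encoding of constraints is a stand-in for whatever term representation the solver actually uses.

```python
def weaken(constraint, delta):
    """Apply the delta-perturbation rewriting to one (in)equality,
    encoded as a triple (lhs, op, rhs) of term strings: an equality
    becomes a two-sided delta-band, inequalities are shifted by delta."""
    lhs, op, rhs = constraint
    if op == "=":
        return [(lhs, ">=", f"{rhs} - {delta}"),
                (lhs, "<=", f"{rhs} + {delta}")]
    if op in (">=", ">"):
        return [(lhs, op, f"{rhs} - {delta}")]
    raise ValueError(f"unsupported operator: {op}")

print(weaken(("x + y", "=", "1"), 0.01))
# [('x + y', '>=', '1 - 0.01'), ('x + y', '<=', '1 + 0.01')]
```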

Changing the Labelling. Due to the similarity of our method to the interpolation of propositional formulas we can adapt the labelled interpolation system from D’Silva et al. [16] to our framework.

In the labelled interpolation system, it is possible to modify the a, b, ab labelling as long as it preserves locality; see [16] for details. An additional restriction in our case is that we cannot use projections of constraints at the proof’s leaves: projection is not computable in nonlinear theories. Therefore, the labelling must ensure that the leaves map to the interpolants \(\top \) or \(\bot \).

4 Applications and Evaluation

We have implemented the interpolation algorithm in a modified version of the dReal SMT solver. The proofs produced by dReal can be very large (gigabytes). Therefore, the interpolants are built and simplified on the fly; the full proof is not kept in memory. We modified the ICP loop and the contractors, which are responsible for the pruning steps. The overhead of interpolant generation over the solving time is less than 10 %.

The ICP loop (Algorithm 1) builds a proof starting from the root of the proof tree, exploring it depth-first. The interpolation rules, on the other hand, build the interpolant starting from the proof’s leaves. Our implementation modifies the ICP loop to keep a stack P of partial interpolants alongside the stack of branching points S. When branching (line 9), the value used to split \(\varvec{D}_1\) and \(\varvec{D}_2\) is pushed on P. The pruning steps (line 5) are converted to a proof as shown in Fig. 3. When a conflict is found (line 7, else branch), P is popped down to the branching point where the search resumes, and the corresponding partial interpolant is pushed back on P. When the ICP loop ends, P contains the final interpolant.
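The stack discipline can be illustrated on a tiny, explicit proof tree. This toy fold (our own encoding, not dReal's) keeps only a stack P of partial interpolants while traversing the tree depth-first, combining the two children's interpolants when a split node is closed.

```python
def interpolant_from_trace(node):
    """Fold a small proof tree into an interpolant using only a stack
    of partial interpolants.  Leaves are ('leaf', 'A') or ('leaf', 'B'),
    naming the side the conflict comes from; inner nodes are
    ('split', combine, left, right), where `combine` merges the two
    children's partial interpolants (cf. the Split-I cases)."""
    P = []                                  # stack of partial interpolants
    todo = [("visit", node)]
    while todo:
        tag, n = todo.pop()
        if tag == "combine":
            I2, I1 = P.pop(), P.pop()       # right child, then left child
            P.append(n(I1, I2))             # n is the combine function
        elif n[0] == "leaf":
            # a conflict from A contributes ⊥ (False); from B, ⊤ (True)
            P.append(n[1] == "B")
        else:
            _, combine, left, right = n
            todo.append(("combine", combine))
            todo.append(("visit", right))
            todo.append(("visit", left))
    return P[0]
```

For example, a split over an A-local variable combines its children with disjunction: `interpolant_from_trace(("split", lambda a, b: a or b, ("leaf", "A"), ("leaf", "B")))` evaluates ⊥ ∨ ⊤.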

Fig. 6.

Interpolants’ size (number of inequalities) compared to the proofs’ size.

Interpolant Sizes. The ICP algorithm implemented in dReal eagerly prunes the domain by repeatedly applying all the constraints. Therefore, it usually generates large proofs, often involving all the constraints and all the variables. Interpolation can extract more precise information from the proof. Intuitively, an interpolant that is much smaller than the proof is more likely to be useful in practice. In this test, we compare the size of the proofs against the size of the interpolants, using benchmarks from the Flyspeck project [25], certificates for Lyapunov functions in powertrain control systems [27], and the other examples presented in the rest of this section.

We run dReal with a 20-minute timeout and generate 1063 interpolants. Out of these, 501 are nontrivial. In Fig. 6 we plot the number of inequalities in the nontrivial interpolants against the size of the proof without the Weakening steps. For similar proofs, we see that the interpolants can be orders of magnitude simpler than the proofs and than other interpolants obtained by different partitions of the formula. The trivial interpolants still carry information, as they mean that only one side is part of the unsatisfiable core.

Hybrid System Verification. Our method can compute interpolants for systems of ODEs. For instance, we can check that two trajectories do not intersect. Figure 7a shows an interpolant obtained for the following equations:

$$\begin{aligned} A:&~~&x_t = x_0 + \int _0^t \! -x(s) + \cos (x(s)) \, \mathrm {d}s~{\wedge }~x_0 = 3~{\wedge }~0~{\le }~t~{\le }~2 \\ B:&~~&y_t = y_0 + \int _0^t \! -y(s) + \sin (y(s)-1) \, \mathrm {d}s~{\wedge }~y_0 = 2~{\wedge }~x_t = y_t \end{aligned}$$

A large portion, 479 out of 1063, of our examples involves differential equations. These examples include: airplane control [5], bouncing balls, networked water tanks, models of cardiac cells [29], verification of the trajectory planning and tracking stacks of an autonomous vehicle (in particular, for a lane change maneuver [4]), and examples from the dReal regression tests. Table 1 shows statistics about the interpolants for each family of examples.

Table 1. Results for the interpolation of ODEs. The [_,_] notation stands for intervals that cover the values for a whole family of examples. The first column indicates the family. The next three columns contain the number of tests in the family and the number of flows and variables in the tests. The last three columns show the size of the proofs, the size of the interpolants, and the solving time.

Robotic Design. Often, hybrid system verification is used in model-based design: an expert produces a model of the system, which is then analysed. However, it is also possible to extract models directly from manufacturing designs. As part of an ongoing project on the co-design of both the software and hardware components of robots [42], we extract equations from robotic designs. In the extracted models, each structural element is represented by a 3D vector for its position and a unit quaternion for its orientation. The dimensions of the elements and the joints connecting them correspond to equations that relate the position and orientation variables. Active elements, such as motors, also have specific equations associated with them.

This approach provides models faithful to the actual robots, but it has the downside of producing large systems of equations. To verify such systems, we need to simplify them. Due to the presence of trigonometric functions, we cannot use quantifier elimination for polynomial systems of equations [13]. Instead, we use interpolation as an approximation of quantifier elimination.

Let us consider a kinematic model \({\mathcal {K}}(\varvec{x},\varvec{y},\varvec{z})\), where \(\varvec{x}\) is a set of design and input parameters, \(\varvec{y}\) contains the variables that represent the state of each component of the robot, and \(\varvec{z}\) contains the variables that represent the parts of the state needed to prove the property of interest. For instance, in the case of a robotic manipulator, \(\varvec{x}\) contains the sizes of the elements and the angles of the servo motors, and \(\varvec{z}\) is the position of the effector. \(\varvec{y}\) is determined by the design of the manipulator.

Fully controlled systems have the property that once the design and input parameters are fixed, there is a unique solution for the remaining variables in the model. Therefore, we can create an interpolation query:

$$\begin{aligned} \begin{array}{lll} A: &{}~~&{} ~{\mathcal {K}}(\varvec{x},\varvec{y},\varvec{z})~{\wedge }~\\ B: &{}~~&{} ~{\mathcal {K}}(\varvec{x},\varvec{v},\varvec{w})~{\wedge }~(\varvec{z}-\varvec{w}){^2}~{\ge }~{\epsilon }{^2}~\quad \text {where} ~{\epsilon }~>~{\delta } \end{array} \end{aligned}$$

\(\varvec{y}, \varvec{v}\) are two copies of the variables we want to eliminate. Since the kinematics is a function of \(\varvec{x}\), which is the same for the two copies, \(\varvec{z}\) and \(\varvec{w}\) should be equal. Therefore, the formula we build has no solution, and we get an interpolant \(I(\varvec{x},\varvec{z})\) which is an \(\epsilon \)-approximation of \({\exists }~\varvec{y}.\,{\mathcal {K}}(\varvec{x},\varvec{y},\varvec{z})\).

Fig. 7.

Application of interpolation to nonlinear systems

Table 2. Comparison of the original models of one- and two-degree-of-freedom manipulators against approximations obtained using interpolation. For the size of the formulas, we report the number of theory atoms in the formula. The last column shows the time dReal takes to compute the interpolants.

Example 3

Consider the simple robotic manipulator shown in Fig. 7b. The manipulator has one degree of freedom. It is composed of two beams connected by a revolute joint controlled by a servo motor. The first beam is fixed.

The original system of equations describing this manipulator has 22 variables: 7 for each beam, 7 for the effector, and 1 for the revolute joint. Using interpolation, we obtain a simpler formula with only 4 variables: 3 for the effector’s position and 1 for the joint. Table 2 shows some statistics about the interpolants we obtained using different \(\epsilon \) for one- and two-degree-of-freedom manipulators.

5 Conclusion and Future Work

We present a method for computing Craig interpolants for first-order formulas over the real numbers with a wide range of nonlinear functions. Our method transforms proof traces from \(\delta \)-decision procedures into interpolants consisting of disjunctive linear constraints. The algorithms are guaranteed to find an interpolant between two formulas A and B whenever \(A~{\wedge }~B\) is not \(\delta \)-satisfiable. Furthermore, we show how the framework applies to systems of ordinary differential equations. We implemented our interpolation algorithm in the dReal SMT solver and applied the method to domains such as robotic design and hybrid system verification.

In the future, we plan to expand our work to richer proof systems. The ICP loop produces proofs based on interval pruning, which results in large, “squarish” interpolants. Using more general proof systems, e.g., cutting planes and semi-definite programming [15], we should be able to get smaller, smoother interpolants. CDCL-style reasoning for richer theories, e.g., LA(\(\mathbb {R}\)) [36] and polynomial arithmetic [26], is a likely basis for such extensions. Furthermore, we are interested in investigating the link between classical interpolation and Craig interpolation over the reals: using methods like spline interpolation and radial basis functions, it may be possible to build smoother interpolants. We also plan to extend our rules to compute interpolants from mixed proofs with both integer and real variables.