The Formalization of Syntax-Based Mathematical Algorithms Using Quotation and Evaluation
William M. Farmer
arXiv:1305.6052v2 [cs.LO] 5 Aug 2013
1 Introduction
The study and application of these kinds of algorithms is called symbolic compu-
tation. For centuries symbolic computation was performed almost entirely using
pencil and paper (and similar devices). However, today symbolic computation
can be performed by computer, and algorithms that manipulate mathematical
expressions are the main fare of computer algebra systems.
In this paper we are interested in the problem of how to formalize syntax-
based mathematical algorithms. These algorithms manipulate members of a for-
mal language in a computer algebra system, but their behavior and meaning are
usually not formally expressed in a computer algebra system. However, we want
to use these algorithms in formal theories and formally understand what they
do. We are interested in employing existing external implementations of these
algorithms in formal theories as well as implementing these algorithms directly
in formal theories.
As an illustration, consider an algorithm, say named RatPlus, that adds ra-
tional number numerals, which are represented in memory in some suitable way.
(An important issue, that we will not address, is how the numerals are represented
to optimize the efficiency of RatPlus.) For example, if the numerals 2/5
and 3/8 are given to RatPlus as input, the numeral 31/40 is returned by RatPlus as
output. What would we need to do to use RatPlus to add rational numbers in a
formal theory T and be confident that the results are correct? First, we would
have to introduce values in T to represent rational number numerals as syntactic
structures, and then define a binary operator O over these values that has the
same input-output relation as RatPlus. Second, we would have to prove in T
that, if O(a, b) = c, then the sum of the rational numbers represented by a and
b is the rational number represented by c. And third, we would have to devise a
mechanism for using the definition of O to add rational numbers in T .
The second task is the most challenging. The operator O, like RatPlus, ma-
nipulates numerals as syntactic structures. To state and then prove that these
manipulations are mathematically meaningful requires the ability to express the
interplay of how the numerals are manipulated and what the manipulations mean
with respect to rational numbers. This is a formidable task in a traditional logic
in which there is no mechanism for directly referring to the syntax of the expressions
in the logic. We need to reason about a rational number numeral 2/5 both
as a syntactic structure that can be deconstructed into the integer numerals 2
and 5 and as an expression that denotes the rational number 2/5.
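
The dual view of a numeral described above can be illustrated with a minimal Python sketch. The names RatNumeral, rat_plus, and denote are hypothetical and are not taken from any existing system; the sketch assumes positive denominators and makes no attempt at efficiency. It treats a numeral as a pair of integer numerals (syntax), gives it a denotation (semantics), and checks for one input that the syntactic operation agrees with rational addition.

```python
from dataclasses import dataclass
from fractions import Fraction
from math import gcd


@dataclass(frozen=True)
class RatNumeral:
    """A rational number numeral viewed as syntax: two integer numerals."""
    num: str  # integer numeral for the numerator, e.g. "2"
    den: str  # integer numeral for the denominator, e.g. "5" (assumed positive)


def denote(n: RatNumeral) -> Fraction:
    """Semantics: the rational number that the numeral denotes."""
    return Fraction(int(n.num), int(n.den))


def rat_plus(a: RatNumeral, b: RatNumeral) -> RatNumeral:
    """Hypothetical stand-in for RatPlus: builds a new numeral from two numerals."""
    an, ad, bn, bd = int(a.num), int(a.den), int(b.num), int(b.den)
    num, den = an * bd + bn * ad, ad * bd
    g = gcd(num, den)  # reduce the result to lowest terms
    return RatNumeral(str(num // g), str(den // g))


a, b = RatNumeral("2", "5"), RatNumeral("3", "8")
c = rat_plus(a, b)
# Correctness statement for this one instance: the numeral computed by
# rat_plus denotes the sum of the numbers denoted by its inputs.
assert denote(c) == denote(a) + denote(b)
```

The final assertion is only a test on a single input. The second task in the text asks for a proof, inside the formal theory, that the corresponding statement holds for all numerals.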
Let us try to make the problem of how to formalize syntax-based mathe-
matical algorithms like RatPlus more precise. Let T be a theory in a traditional
logic like first-order logic or simple type theory, and let A be an algorithm that
manipulates certain expressions of T . To formalize A in T we need to do three
things:
\[ \lim_{h \to 0} \frac{f(a + h) - f(a)}{h} \]
if this limit exists. The derivative of f , written deriv(f ), is the function
λ x : R . deriv(f, x).
Line (1) is by the Product Rule; (2) is by the Variable and Sum and Difference
Rules; (3) is by the Power and Constant Rules; (4) is by the Variable Rule; and
(5) is by the simplification rules. Thus, given the function
f = λ x : R . x · (x^2 + y),
λ x : R . 3 · x^2 + y
care in determining the precise domain of the derivative. For example, differenti-
ating the rational expression x/x using the well-known Quotient Rule yields the
expression 0, but the derivative of λ x : R . x/x is not λ x : R . 0. The derivative
is actually the partial function
λ x : R . if x ≠ 0 then 0 else ⊥.
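
The domain issue can be made concrete with a small Python sketch. It is an illustration only: deriv_at is a hypothetical helper that evaluates the difference quotient for one small fixed h rather than taking a limit, and None stands in for the undefined value ⊥.

```python
from typing import Callable, Optional


def deriv_at(f: Callable[[float], float], a: float, h: float = 1e-6) -> Optional[float]:
    """Difference quotient (f(a + h) - f(a)) / h; None if f is undefined at a or a + h."""
    try:
        return (f(a + h) - f(a)) / h
    except ZeroDivisionError:
        return None  # plays the role of the undefined value


def f(x: float) -> float:
    return x / x  # undefined at x = 0


def naive_derivative(x: float) -> Optional[float]:
    """What the Quotient Rule yields when the domain is ignored: the constant 0."""
    return 0.0


def actual_derivative(x: float) -> Optional[float]:
    """The partial function from the text: 0 where x != 0, undefined at 0."""
    return 0.0 if x != 0 else None


# Away from 0 the two agree; at 0 only the partial function matches f.
assert deriv_at(f, 2.0) == naive_derivative(2.0) == actual_derivative(2.0)
assert deriv_at(f, 0.0) is None and actual_derivative(0.0) is None
assert naive_derivative(0.0) == 0.0
```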
3 Syntax Frameworks
[Figure: diagram of a syntax framework, with components Lobj, L, Lsyn, Dsem, Dsyn, Q, E, Vsem, V′sem, and Vsyn.]
1. Lsyn ⊆ L.
2. Q : Lobj → Lsyn is an injective, total function.
3. E : Lsyn → Lobj is a (possibly partial) function.
4. For all M = (D^M_sem, V^M_sem) ∈ ℳ, F^M = (D^M_syn, V^M_syn, Lsyn, Q, E) is a
syntax framework for (Lobj, (L, D^M_sem, V^M_sem)) where D^M_syn is the range of
V^M_sem restricted to Lsyn and V^M_syn = V^M_sem ◦ Q.
4 Local Approach
In order to formalize PolyDiff in TR we need the ability to reason about the
polynomials in Lpoly as syntactic structures (i.e., as syntax trees). This can
be achieved by constructing a syntax framework for (Lpoly, IR′) where IR′ =
(TR′, M′) is an interpreted theory such that TR′ is a conservative extension of
TR . Since we seek to reason about just the syntax of Lpoly instead of a larger
language, we call this the local approach.
The construction of the syntax framework requires the following steps:
1. Define in TR an inductive type whose members are the syntax trees of the
polynomials in Lpoly . The inductive type should include a new type symbol
S and appropriate constants for constructing and deconstructing expressions
of type S. Let Lsyn be the set of expressions of type S. For example, if x + 3
is a polynomial in Lpoly , then an expression like plus(var(sx ), con(s3 )) could
be the expression in Lsyn that denotes the syntax tree of x + 3. Next add
an unspecified “binary” constant Opd of type S → (S → S) to LR (that is
intended to represent PolyDiff). Let TR′ = (L′R , ΓR′ ) be the resulting extension
of TR . TR′ is clearly a conservative extension of TR .
2. In the metatheory of TR′ define an injective, total function Q : Lpoly → Lsyn
such that, for each polynomial u ∈ Lpoly , Q(u) is an expression e that denotes
the syntax tree of u. For example, Q(x + 3) could be plus(var(sx ), con(s3 )).
3. In the metatheory of TR′ define a total mapping E : Lsyn → Lpoly such that,
for each expression e ∈ Lsyn , E(e) is the polynomial whose syntax tree is
denoted by e. For example, E(plus(var(sx ), con(s3 ))) would be x + 3.
Let IR′ = (TR′, M′) where M′ is the set of standard models of
TR′ in simple type theory (see [8]). It is easy to check that F = (Lsyn, Q, E) is
a syntax framework for (Lpoly, IR′). Notice that E is the left inverse of Q and
hence the law of disquotation holds: For all u ∈ Lpoly , E(Q(u)) = u.
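
The three construction steps can be sketched in Python under some simplifying assumptions. The encoding of Lpoly as nested tuples, the class names Con, Var, Plus, and Times, and the restriction to integer constants, addition, and multiplication are all hypothetical choices made for this illustration. In the paper, Q and E are mappings defined in the metatheory of TR′, not programs.

```python
from dataclasses import dataclass
from typing import Union

# Object language Lpoly (hypothetical encoding): an int constant, a variable
# name, or a tuple (op, left, right) with op in {"+", "*"}.
Poly = Union[int, str, tuple]


# Step 1: an inductive type S whose members are syntax trees.
@dataclass(frozen=True)
class Con:
    value: int


@dataclass(frozen=True)
class Var:
    name: str


@dataclass(frozen=True)
class Plus:
    left: "Syn"
    right: "Syn"


@dataclass(frozen=True)
class Times:
    left: "Syn"
    right: "Syn"


Syn = Union[Con, Var, Plus, Times]


# Step 2: quotation, an injective total function from Lpoly to Lsyn.
def Q(u: Poly) -> Syn:
    if isinstance(u, int):
        return Con(u)
    if isinstance(u, str):
        return Var(u)
    op, left, right = u
    if op == "+":
        return Plus(Q(left), Q(right))
    if op == "*":
        return Times(Q(left), Q(right))
    raise ValueError(f"not a polynomial operator: {op!r}")


# Step 3: evaluation, mapping a syntax tree back to the polynomial it represents.
def E(e: Syn) -> Poly:
    if isinstance(e, Con):
        return e.value
    if isinstance(e, Var):
        return e.name
    if isinstance(e, Plus):
        return ("+", E(e.left), E(e.right))
    if isinstance(e, Times):
        return ("*", E(e.left), E(e.right))
    raise TypeError(f"not a syntax tree: {e!r}")


u = ("+", "x", 3)       # the polynomial x + 3
assert E(Q(u)) == u     # law of disquotation for this instance
```

Here Q(u) is Plus(Var("x"), Con(3)), which mirrors the expression plus(var(sx), con(s3)) used in the text.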
We are now ready to formalize PolyDiff in TR′ . First, we need to define an
operator in TR′ to represent PolyDiff. We will use Opd for this purpose. We write
a sentence CompBehavior
∀ a, b : S . is-var(b) ⇒ B(a, b, Opd(a)(b))
in TR′ where, for all u ∈ Lpoly and x ∈ Lvar ,
B(Q(u), Q(x), Opd (Q(u))(Q(x)))
holds iff
PolyDiff(u, x) = E(Opd (Q(u))(Q(x))).
That is, we specify the computational behavior of Opd to be the same as that of
PolyDiff.
Second, we need to prove that Opd is mathematically correct. We write the
sentence MathMeaning
for all u ∈ Lpoly , deriv(λ x : R . u) = λ x : R . E(Opd (Q(u))(Q(x)))
in the metatheory of TR′ that says Opd computes a syntactic value that represents
an expression that denotes the derivative of λ x : R . u at x. And then we prove
in the metatheory of TR′ that MathMeaning follows from CompBehavior. The proof requires showing
that E(Opd (Q(u))(Q(x))) equals deriv((λ x : R . u), x), which is
\[ \lim_{h \to 0} \frac{(\lambda x : \mathbb{R} \,.\, u)(x + h) - (\lambda x : \mathbb{R} \,.\, u)(x)}{h}. \]
The details of the proof are found in any good calculus textbook such as [17].
Third, we need to show how PolyDiff can be used to compute the derivative of
a function λ x : R . u at x in TR′ . There are two ways. The first way is to simplify
E(Opd (Q(u))(Q(x))) in MathMeaning (e.g., by beta-reduction). The second way
is to replace E(Opd (Q(u))(Q(x))) in MathMeaning with the result of applying
PolyDiff to u and x. The first way requires that PolyDiff is implemented in TR′ as
Opd . The second way does not require that PolyDiff is implemented in TR′ , but
only that its meaning is specified in TR′ .
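
As a rough illustration of the first way, the following Python sketch implements a differentiation operator directly on syntax trees and then evaluates the result. The names poly_diff and value are hypothetical, only the Constant, Variable, Sum, and Product Rules are covered, and no simplification is performed, so this is a sketch of the idea and not the PolyDiff algorithm as specified in the paper.

```python
from dataclasses import dataclass
from typing import Dict, Union


@dataclass(frozen=True)
class Con:
    value: int


@dataclass(frozen=True)
class Var:
    name: str


@dataclass(frozen=True)
class Plus:
    left: "Syn"
    right: "Syn"


@dataclass(frozen=True)
class Times:
    left: "Syn"
    right: "Syn"


Syn = Union[Con, Var, Plus, Times]


def poly_diff(e: Syn, x: Var) -> Syn:
    """Differentiate the syntax tree e with respect to the variable x."""
    if isinstance(e, Con):
        return Con(0)                                   # Constant Rule
    if isinstance(e, Var):
        return Con(1) if e == x else Con(0)             # Variable Rule
    if isinstance(e, Plus):
        return Plus(poly_diff(e.left, x), poly_diff(e.right, x))   # Sum Rule
    if isinstance(e, Times):                            # Product Rule
        return Plus(Times(poly_diff(e.left, x), e.right),
                    Times(e.left, poly_diff(e.right, x)))
    raise TypeError(f"not a syntax tree: {e!r}")


def value(e: Syn, env: Dict[str, float]) -> float:
    """Evaluate a syntax tree under an assignment of numbers to variables."""
    if isinstance(e, Con):
        return e.value
    if isinstance(e, Var):
        return env[e.name]
    if isinstance(e, Plus):
        return value(e.left, env) + value(e.right, env)
    if isinstance(e, Times):
        return value(e.left, env) * value(e.right, env)
    raise TypeError(f"not a syntax tree: {e!r}")


x, y = Var("x"), Var("y")
f = Times(x, Plus(Times(x, x), y))     # x * (x^2 + y)
d = poly_diff(f, x)                    # unsimplified tree for 3 * x^2 + y
assert value(d, {"x": 2, "y": 5}) == 3 * 2**2 + 5
```

The assertion spot-checks one point. It does not establish MathMeaning, which quantifies over all polynomials and requires a proof.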
The local approach is commonly used to reason about the syntax of expres-
sions in a formal theory. It embodies a deep embedding [1] of the object language
(e.g., Lpoly ) into the underlying formal language (e.g., LR ). The local approach
to reason about syntax can be employed in almost any proof assistant in which
it is possible to define an inductive type (e.g., see [1,4,20]).
The local approach has both strengths and weaknesses. These are the
strengths of the local approach:
1. Indirect Reasoning about the syntax of Lpoly in the Theory. In TR′ using Lsyn ,
we can indirectly reason about the syntax of the polynomials in Lpoly . This
thus enables us to specify the computational behavior of PolyDiff via Opd .
2. Direct Reasoning about the syntax of Lpoly in the Metatheory. In the metathe-
ory of TR′ using Lsyn , Q, and E, we can directly reason about the syntax of
the polynomials in Lpoly . In particular, using MathMeaning and the formula
for all u ∈ Lpoly , x ∈ Lvar , PolyDiff(u, x) = E(Opd (Q(u))(Q(x))),
we can specify the mathematical meaning of PolyDiff.
2. Coverage Problem. The syntax framework F can only be used for reasoning
about the syntax of polynomials. It cannot be used for reasoning, for exam-
ple, about rational expressions. To do that a new syntax framework must be
constructed.
3. Extension Problem. Lpoly , Lsyn , Q, and E must be extended each time a new
constant of type R is defined in TR′ .
In summary, the local approach only gives us indirect access to the syntax of
polynomials and must be modified to cover new or enlarged contexts.
If Lobj (which is Lpoly in our example) does not contain variables, then we can
define E to be a total operator in the theory. (If the theory is over a traditional
logic, we will still not be able to define Q in the theory.) This variant of the local
approach is used, for example, in the Agda reflection mechanism [19].
5 Global Approach
The global approach described in this section utilizes a replete syntax framework.
Assume that we have modified TR and simple type theory so that there is a
replete syntax framework F = (Lsyn , Q, E) for (LR , IR ) where IR = (TR , M)
and M is the set of standard models of TR in the modified simple type theory.
Let us also assume that Lsyn is the set of expressions of type S and LR includes
a constant Opd of type S → (S → S). By virtue of F being replete, F embodies
a deep embedding of LR into itself.
As far as we know, no one has ever worked out the details of how to modify
simple type theory so that it admits built-in quotation and evaluation for the
full language of a theory. However, we have shown how NBG set theory can be
modified to admit built-in quotation and evaluation for its entire language [7].
Simple type theory can be modified in a similar way. We plan to present a version
of simple type theory with a replete syntax framework in a future paper.
We can formalize PolyDiff in TR as follows. We will write quote(e) and eval(e)
as ⌜e⌝ and ⟦e⟧, respectively. First, we define the operator Opd in TR to represent
PolyDiff. We write a sentence CompBehavior
∀ a, b : S . is-poly(a) ∧ is-var(b) ⇒ B(a, b, Opd(a)(b))
in TR where, for all u ∈ Lpoly and x ∈ Lvar ,
B(⌜u⌝, ⌜x⌝, Opd(⌜u⌝)(⌜x⌝))
holds iff
PolyDiff(u, x) = ⟦Opd(⌜u⌝)(⌜x⌝)⟧.
That is, we specify the computational behavior of Opd to be the same as that of
PolyDiff.
Second, we prove in TR that Opd is mathematically correct. We write the
sentence MathMeaning
∀ a : S . is-poly(a) ⇒ deriv(λ x : R . ⟦a⟧) = λ x : R . ⟦Opd(a)(⌜x⌝)⟧
In short, not only does the global approach enable us to formalize PolyDiff in TR ,
it provides us with the facility to move syntax-based reasoning from the metathe-
ory of TR to TR itself. This seems to be a wonderful result that solves the problem
of formalizing syntax-based mathematical algorithms. Unfortunately, the global
approach has the following serious weaknesses that temper the enthusiasm one
might have for its strengths:
1. Evaluation Problem.
Claim: eval cannot be defined on all expressions in LR .
Proof: Suppose eval is indeed total. TR is sufficiently expressive, in the sense
of Gödel's incompleteness theorem, so we can apply the diagonalization lemma [3] to
obtain a formula LIAR such that
LIAR = ⌜¬⟦LIAR⟧⌝.
Then
⟦LIAR⟧ = ⟦⌜¬⟦LIAR⟧⌝⟧ = ¬⟦LIAR⟧,
which is a contradiction. □
This means that the liar paradox limits the use of eval and, in particular,
the law of disquotation does not hold universally, i.e., there are expressions
e in LR such that ⟦⌜e⌝⟧ ≠ e.
2. Variable Problem. The variable x is not free in the expression ⌜x + 3⌝ (or in
any quotation). However, x is free in ⟦⌜x + 3⌝⟧ because ⟦⌜x + 3⌝⟧ = x + 3.
If the value of the variable e is ⌜x + 3⌝, then both e and x are free in ⟦e⟧
because ⟦e⟧ = ⟦⌜x + 3⌝⟧ = x + 3.
This example shows that the notions of a free variable, substitution for a
variable, etc. are significantly more complex when expressions contain eval.
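
The Variable Problem can be illustrated with a toy Python model. Everything here is an assumption made for the sketch: the classes Quote and Eval, the function free_vars, and the convention that env maps a variable of type S to the expression it quotes. It is not the definition of free variable used in any particular logic; it only shows that, once eval is present, the free variables of an expression can depend on the values assigned to its variables.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, Union


@dataclass(frozen=True)
class Con:
    value: int


@dataclass(frozen=True)
class Var:
    name: str


@dataclass(frozen=True)
class Plus:
    left: "Expr"
    right: "Expr"


@dataclass(frozen=True)
class Quote:
    body: "Expr"


@dataclass(frozen=True)
class Eval:
    body: "Expr"


Expr = Union[Con, Var, Plus, Quote, Eval]


def quoted_expr(e: Expr, env: Dict[str, Expr]) -> Expr:
    """The expression that e denotes as a syntactic value (toy cases only)."""
    if isinstance(e, Quote):
        return e.body
    if isinstance(e, Var):
        return env[e.name]  # env gives the expression quoted by the variable's value
    raise ValueError(f"cannot determine a syntactic value for {e!r}")


def free_vars(e: Expr, env: Dict[str, Expr]) -> FrozenSet[str]:
    if isinstance(e, Con):
        return frozenset()
    if isinstance(e, Var):
        return frozenset({e.name})
    if isinstance(e, Plus):
        return free_vars(e.left, env) | free_vars(e.right, env)
    if isinstance(e, Quote):
        return frozenset()  # nothing is free inside a quotation
    if isinstance(e, Eval):
        # Variables free in the argument, plus those free in what it evaluates to.
        return free_vars(e.body, env) | free_vars(quoted_expr(e.body, env), env)
    raise TypeError(f"not an expression: {e!r}")


x_plus_3 = Plus(Var("x"), Con(3))
assert free_vars(Quote(x_plus_3), {}) == frozenset()
assert free_vars(Eval(Quote(x_plus_3)), {}) == frozenset({"x"})
assert free_vars(Eval(Var("e")), {"e": x_plus_3}) == frozenset({"e", "x"})
```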
6 Conclusion
The local approach and close variants are commonly used for formalizing
syntax-based mathematical algorithms. Its major strength is that it provides
the means to formally reason about the syntactic structure of expressions, while
its major weakness is that the mathematical meaning of a syntax-based mathe-
matical algorithm cannot be expressed in the formal theory. Another weakness
is that an application of the local approach cannot be easily extended to cover
new or enlarged contexts.
The global approach enables one to reason in a formal theory T directly
about the syntactic structure of the expressions in T as well as about the inter-
play of syntax and semantics in T . As a result, it is possible to fully formalize
syntax-based algorithms like PolyDiff and move syntax-based reasoning, like the
use of syntactic side conditions, from the metatheory of T to T itself. Unfortunately,
these highly desirable results come with a high cost: significant changes
must be made to the underlying logic, as illustrated by the Evaluation, Variable,
Extension, and Interpretation Problems given in the previous section.
One of the main goals of the MathScheme project [2], led by J. Carette and
the author, is to see if the global approach can be used as a basis to integrate
axiomatic and algorithmic mathematics. The logic Chiron [7] demonstrates that
it is possible to modify a traditional logic to support the global approach. Al-
though we have begun an implementation of Chiron, it remains an open question
whether a logic modified in this way can be effectively implemented. As part of
the MathScheme project, we are now pursuing this problem as well as developing
the techniques needed to employ the global approach.
Acknowledgments.
The author would like to thank Jacques Carette and Pouya Larjani for many
fruitful discussions on ideas related to this paper. The author is also grateful to
the referees for their comments and careful review of the paper.
References
1. R. Boulton, A. Gordon, M. Gordon, J. Harrison, J. Herbert, and J. Van Tassel.
Experience with embedding hardware description languages in HOL. In V. Stavridou,
T. F. Melham, and R. T. Boute, editors, Proceedings of the IFIP TC10/WG 10.2
International Conference on Theorem Provers in Circuit Design: Theory, Practice
and Experience, volume A-10 of IFIP Transactions A: Computer Science and
Technology, pages 129–156. North-Holland, 1993.
2. J. Carette and W. M. Farmer. MathScheme: Project description. In J. H. Davenport,
W. M. Farmer, F. Rabe, and J. Urban, editors, Intelligent Computer
Mathematics, volume 6824 of Lecture Notes in Computer Science, pages 287–288.
Springer-Verlag, 2011.
3. R. Carnap. Die Logische Syntax der Sprache. Springer-Verlag, 1934.
4. E. Contejean, P. Courtieu, J. Forest, O. Pons, and X. Urbain. Certification of
automated termination proofs. In Frontiers of Combining Systems, volume 4720
of Lecture Notes in Computer Science, pages 148–162. Springer, 2007.