Kleene's Second Recursion Theorem: Overview, Implementation and Applications To Computer Virology
Abstract
This self-contained document presents the theory leading up to, involving, and derived from Kleene's Second Recursion Theorem (SRT), which states that for any partial recursive function f(x, y) there exists a "fixed point" k such that f(k, y) = ϕk(y), where ϕk is the k-th partial recursive function as indexed via Gödel numberings of Turing Machines. Following a classic introduction to the theory of computability via partial recursive functions on the natural numbers and Turing Machines, we present an array of recursion and computability results (including the SRT) for acceptable programming languages. Following Jones ([10]), we show implementations of the SRT in the particular computation models TINY ([3]) and 1# ([14]). Finally, we survey the work of Bonfante et al. ([4], [6], [5]) on abstract computer virology, and show how the SRT and other recursion theorems are associated with the study and construction of computer viruses.
Chapter 1
• There exists a unique y ∈ B such that (x, y) ∈ f. In this case, we say that f is defined at x, written f(x) ↓, and that f evaluated at x is y (notation: f(x) = y).
Z(x) = 0, ∀x ∈ N,
S(x) = x + 1, ∀x ∈ N,
f(x⃗, 0) = g(x⃗) ∀x⃗ ∈ Nk
f(x⃗, y + 1) = h(x⃗, y, f(x⃗, y))
f is primitive recursive.
The set of all primitive recursive functions will be denoted by PRIM hereafter.
1. Addition: x + y.
2. Multiplication: x × y.
3. Exponentiation: x^y.
4. Recursive Difference: m −̇ n = m − n if m ≥ n, and 0 otherwise.
5. Absolute Difference: ∣x − y∣.
6. Order functions:
µz[g(x⃗, z) = 0] = z0 ⇔def g(x⃗, z0) = 0 and ∀z < z0 ∶ g(x⃗, z) ↓ ≠ 0
Example 1.1.1 The set P of all prime numbers is primitive recursive. This follows from the definition of a prime number (n > 1 is prime if it has exactly 2 divisors) and from the fact that the following function is primitive recursive: D(n) := "the number of divisors of n". This is easily proven by noting that D(n) = ∑_{i=0}^{n} (i ∣ n). With this we have that χP(n) = s̄g(D(n) −̇ 2) × sg(n −̇ 1).
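As a quick numerical check of this arithmetic, here is a small Python sketch (our own illustration, not part of the formal development), writing the recursive difference −̇ as max(·, 0), with sg the sign function and s̄g its complement:

def D(n):
    """Number of divisors of n (the sum of the 0/1 predicate i | n)."""
    return sum(1 for i in range(1, n + 1) if n % i == 0)

def sg(k): return 1 if k > 0 else 0
def sgbar(k): return 1 if k == 0 else 0
def monus(a, b): return max(a - b, 0)   # recursive difference

def chi_P(n):
    return sgbar(monus(D(n), 2)) * sg(monus(n, 1))

assert [n for n in range(20) if chi_P(n)] == [2, 3, 5, 7, 11, 13, 17, 19]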
Definition 1.6 (A Tuple Encoding Scheme) A tuple encoding scheme is a set of recursive functions ⟨_⟩ ∶ Nn → N that to every tuple x⃗ ∈ Nn assigns a natural number ⟨x⃗⟩ such that:
2. Given a natural number y, there exist at most one n ∈ N and one tuple x⃗ ∈ Nn such that ⟨x⃗⟩ = y.
3. The set E = {y ∈ N ∶ ∃n ∈ N ∃x⃗ ∈ Nn ∶ ⟨x⃗⟩ = y} is recursive.
is partial recursive.
1. Let p ∶ N → N be the prime numbering function, that is, p(i) := "the i-th prime number". This function is recursive, as it can be defined by
p(0) = 2
p(n + 1) = µz[(s̄g(z −̇ p(n)) + s̄g(χP(z))) = 0],
where P is the set of prime numbers. Let's examine the second part of the definition:
p(0) = 0
p(x + 1) = p(x) + χP(x).
Then mp is recursive and mp(y) is the minimum prime that does not divide y.
If y is a correctly encoded tuple, this means that all primes less than mp(y)
should divide y. This is readily checked by the functions
pd(y) = ∑_{i=0}^{p(mp(y))} (p(i) ∣ y)   ("the number of prime divisors of y")
We can see that ξ(y) = 1 if and only if y ≥ 2 and ALL of the primes less than
mp(y) divide y, and is 0 otherwise. This is a necessary and sufficient condition
for y to be the encoding of some tuple, so χE = ξ and therefore E is a recursive
set. The function L is easily defined now as well, by setting
L(y) = p(mp(y)) if y ∈ E
L(y) = 0 otherwise.
Finally, defining logb(x) := µz[s̄g(b^z −̇ x) = 0] as the minimum z such that b^z > x, define for 1 ≤ i ≤ n the projection πi as
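To make the encoding concrete, the following Python sketch implements the prime-power scheme suggested by the checks above. The exact formula ⟨x1, ..., xn⟩ = ∏ p(i)^(xi + 1) is our assumption; it matches the divisibility tests performed by mp, pd and ξ:

def primes():
    """Naive generator of 2, 3, 5, 7, ..."""
    n = 2
    while True:
        if all(n % d for d in range(2, int(n ** 0.5) + 1)):
            yield n
        n += 1

def encode(xs):
    # <x1,...,xn> = prod_i p_i^(x_i + 1); the +1 guarantees every prime
    # up to the tuple's length divides the code.
    code, gen = 1, primes()
    for x in xs:
        code *= next(gen) ** (x + 1)
    return code

def decode(code):
    xs, gen = [], primes()
    p = next(gen)
    while code % p == 0:
        e = 0
        while code % p == 0:
            code //= p
            e += 1
        xs.append(e - 1)   # undo the +1 in the exponent
        p = next(gen)
    return xs

assert decode(encode([3, 0, 5])) == [3, 0, 5]
assert decode(encode([])) == []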
Observation 1.2.1 One way of thinking about the automated computation of functions is through the use of machines, as presented in the models below. There are other intuitive ways of approaching this concept, such as programming languages, as discussed in Section 2.1.4.
4. Jump [J(x, y, q)]: Suppose Ik = J(x, y, q). If rx = ry, then the next instruction to be executed is Ik+q. Otherwise, execute the next instruction (Ik+1).
URM Program Execution and Halting Let P = (I1 , ..., In ) be a URM program
where Ii is a basic instruction for i = 1, ..., n.
• Executing a program requires an initial state of the URM, which will be called
input and can be expressed by a tuple (r1 , ..., rk ) ∈ Nk and means that in the
URM, the content of Ri is ri for i = 1, ..., k. As a convention, we will assume
that Rm will initially store 0 for m > k. The execution of P is done then by
sequentially following its instructions, starting from I1 , modifying the contents
of the registers and making jumps appropriately. The execution stops when the
program tries to go to a nonexistent instruction (whether after the last instruction or before the first one).
• For P to be considered to stop correctly, we will impose the further condition that the program stops by trying to execute the instruction right after In. This will
be called halting. More formally, an execution of program P halts if one of the
following happens at some point:
Example:
2. Programs (J(1, 1, −1)) and (J(1, 1, 2)) do nothing as well, but stop improperly.
3. Program (J(1, 1, 1), J(1, 1, −1)) does nothing but jump back and forth between its two instructions, causing an infinite loop, so execution never stops.
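Before moving on, the semantics just described can be animated with a small interpreter. The following Python sketch is our own illustration; the tuple encoding of instructions and the convention that the output is read from R1 are assumptions:

from collections import defaultdict

def run_urm(program, inputs):
    regs = defaultdict(int)                       # registers not set hold 0
    for i, v in enumerate(inputs, start=1):
        regs[i] = v
    k = 1                                         # 1-based: about to execute I_k
    while 1 <= k <= len(program):
        ins = program[k - 1]
        if ins[0] == 'Z':                         # Z(n): R_n := 0
            regs[ins[1]] = 0
        elif ins[0] == 'S':                       # S(n): R_n := R_n + 1
            regs[ins[1]] += 1
        elif ins[0] == 'T':                       # T(m, n): R_n := R_m
            regs[ins[2]] = regs[ins[1]]
        elif ins[0] == 'J' and regs[ins[1]] == regs[ins[2]]:
            k += ins[3]                           # J(x, y, q): go to I_{k+q}
            continue
        k += 1
    if k != len(program) + 1:
        raise RuntimeError("stopped improperly")  # left the program the wrong way
    return regs[1]                                # output convention: R1 (assumed)

assert run_urm([('S', 1)], [41]) == 42            # successor program halts correctly

On the examples above, [('J', 1, 1, 2)] raises "stopped improperly", while [('J', 1, 1, 1), ('J', 1, 1, -1)] loops forever, exactly as described.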
Now that we have the basic definitions about URMs, we explore some further prop-
erties to show that the functions that are URM-computable are exactly the PREC
functions.
Observation 1.2.2 (Basic PRIM functions are in URMCOMP) The programs that exe-
cute the basic PRIM functions are straightforward to write:
3. Projection functions Pik(x1, ..., xk) = xi: For i ∈ N∗, the program (T(i, 1)) computes Pik for all k ∈ N∗.
Definition 1.11 (URM Program Standard Form) Let P = (I1 , ..., In ) be a URM pro-
gram. If for every jump instruction Im = J(x, y, q) we have 1 ≤ m + q ≤ n + 1 then we
say that P is in standard form.
From now on we will only deal with URM programs in standard form. This loses no generality, as for any URM program P we can easily find a program P′ in standard form such that JP Kk = JP′Kk for all k ∈ N.
Definition 1.12 (Some auxiliary definitions) The following definitions will come in
handy when proving theorems about URM programs.
1. URM Program Join: Given P = (I1 , ..., In ) and Q = (J1 , ..., Jm ) URM programs,
define the join of P and Q by P ∣Q = (I1 , ..., In , J1 , ..., Jm ). Since we are
assuming the programs are in standard form, we know that during the execution
of a program, there will be no jumps to the middle of another one. (The only
“cross-program” jump allowed in P ∣Q is jumping during the execution of P to
the first instruction of Q).
2. ρ(P): For P a URM program, define ρ(P) to be the largest index k of a register Rk used by P.
Lemma 1.2.1
Let g ∶ N2 → N and h1 , h2 ∶ N → N be URM-computable functions. Then f (x) =
g(h1 (x), h2 (x)) is URM-computable.
Proof. Let Pg , Ph1 , Ph2 be URM programs that compute g, h1 and h2 respectively.
Let L = max{ρ(Pg), ρ(Ph1), ρ(Ph2), 2}. Notice that Ph1 only uses registers in [1; L] and Ph2 >> L only uses registers in [L + 1; 2L]. Now, consider the program
Notice that if executed with some input x ∈ N, Q halts if and only if h1(x) ↓ and h2(x) ↓. Moreover, if it does, then at the end of the execution R1 stores h1(x) and RL+1 stores h2(x). With this in mind, we can see that the program
Proof idea. The previous lemma proves a particular case. For the general scenario,
where m need not be 2 and k need not be 1, the proof is similar. The idea is to
simulate m URMs using only one URM, so that we can execute Ph1, ..., Phm "independently", making them use disjoint sets of registers in their executions. (So the L we defined, for example, would be L = max{ρ(Pg), ρ(Ph1), ..., ρ(Phm), k, m}.) ◻
Lemma 1.2.3
If g ∶ N → N and h ∶ N2 → N are URM-computable then if f ∶ N2 → N is defined by
f (x, 0) = g(x)
f (x, y + 1) = h(x, y, f (x, y))
f is URM-computable.
Proof. The idea is to store the original inputs in some registers, a counter that lets us know whether we have finished computing the function in another register, and finally the last computed value of the function in yet another. We want to compute f(x, y). We will store x in R1
and y in R2 . In R3 there will be a number r3 that indicates that in R6 , the number
f (x, r3 ) is stored (this condition should hold every time before we execute instruction
6 in the program described below). Let Pg and Ph URM programs that compute g and
h respectively. Let L = max{ρ(Pg ), ρ(Ph )}. Then the following program Pf computes
f:
3. J(2, 3, ∣Z[5; L + 3]∣ + 3 + ∣Ph∣ + 3) (This condition tells us whether we have finished the computation, in which case the result is in R4. If so, go to the final instruction; if not, continue execution).
5. (T (4, 6), T (1, 4), T (2, 5)) (Set up the registers to compute h. Now r4 =
x, r5 = y, r6 = f (x, r3 ))
7. (S(3), J(1, 1, −(∣Ph ∣ + 3 + ∣Z[5; L + 3]∣ + 1)) (Increase counter and jump to
line (3) to check if computation is done.)
◻
f(x⃗, 0) = g(x⃗) ∀x⃗ ∈ Nk
f(x⃗, y + 1) = h(x⃗, y, f(x⃗, y))
f is URM-computable.
Proof idea. The previous lemma proves the case k = 1. The proof for the general case is analogous to that of the Lemma. The main ideas are the same: copy the input into other registers and compute h there (via shifted programs), and keep a "counter" to know when the computation has completed. ◻
With the previous two propositions, and taking into account that the basic prim-
itive recursive functions are trivially also URM-computable, we get the following
result.
Theorem 1.2.5 (PRIM ⊆ URMCOMP)
Every primitive recursive function is URM-computable.
Lemma 1.2.6
Let g ∶ N2 → N be URM-computable. Then the function f ∶ N → N given by
f (x) = µm[g(x, m) = 0] is URM-computable.
Proof. We will store in R1 and R2 the inputs for g. The program Pf next described
computes f :
The idea is to search one by one until we find the first y0 (stored in R2) such that g(x, y0) = 0. If no such y0 exists, then Pf never halts, which is precisely what we need. ◻
Proposition 1.2.7 (URMCOMP is closed under Minimalisation)
Let g ∶ Nk+1 → N be URM-computable. Then the function f ∶ Nk → N given by
f (⃗
x) = µm[g(⃗
x, m) = 0] is URM-computable.
Proof idea. Again, the proof of the previous Lemma, which is the case k = 1 of this proposition, gives us the ideas for the general case: copy the input into some other registers, compute g on those registers, and then check whether the result of computing g is 0. If it is, execution is done; if not, try executing g with the next value of y. ◻
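As a minimal illustration (a sketch, not the URM program itself), unbounded minimalisation is exactly a while-loop that may diverge:

def mu(g, *x):
    """Return the least m with g(*x, m) == 0; loops forever if none exists,
    mirroring the URM program, which never halts in that case."""
    m = 0
    while g(*x, m) != 0:
        m += 1
    return m

# Least m with x - 2*m == 0, i.e. x // 2 for even x:
assert mu(lambda x, m: x - 2 * m, 10) == 5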
Now we have some even stronger information about URMCOMP.
1. An infinite, (linear, two-sided) tape, divided in cells. Each cell stores either a
1 or a 0.
2. Tape symbols 0 and 1, the symbols that can be stored in the cells.
3. A reading head which is able to move around the tape and read the symbols
contained in the cells, and write symbols in the cells.
4. A set of states q0 , q1 , .... At the start of any step of a computation, the reading
head is specified to be in one of these states.
5. Action symbols: used by the program to tell the reading head what to do in relation to its current cell.
Observation 1.2.3 (More general Turing Machines) We can allow for more sym-
bols to be stored in the cells of a TM. Having defined a finite set of tape symbols
{S0 , ..., Sn }, there should be, for i = 0, ..., n, an action symbol Si to print Si on the
current cell.
Q = qi SAqj
where qi and qj are states, S is a tape symbol and A is an action symbol. This
quadruple expresses the instruction: If T is in state qi reading S, then perform
action A and change the internal state to qj .
A set X of quadruples is said to be consistent if
placing the tape head on the leftmost 1 and setting the internal state to q0 . We can
assume that the rest of the tape is filled with 0s. At any point during the execution of
the program, the tape head will be reading some tape symbol S with some internal
state qi. If there is some Q ∈ P of the form qiSAqj, where A is an action symbol and qj an internal state, then Q is applied, performing action A and setting
the internal state to qj . Execution of the program continues as long as there are
applicable quadruples. If at some point of the computation there are no applicable
quadruples, we say the execution halts.
Analogous to URMs, we now define what it means for a function to be Turing-
computable.
• If executing P with input x⃗ eventually halts, and when it does there are z 1's left on the tape, then JP Kk(x⃗) = z.
2. The successor function is easy to compute, just read 1 until a 0 is read. There,
place an extra 1:
q0 1Rq0
q0 01q1
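The quadruple semantics is easy to animate. Here is a hedged Python sketch of an interpreter for consistent quadruple programs, run on the successor program just given (the representation choices are ours):

def run_tm(program, tape, pos=0, state='q0'):
    """program: dict (state, symbol) -> (action, new_state);
    actions are 'L', 'R', or a symbol to print. Halts when no quadruple applies."""
    tape = dict(enumerate(tape))                 # sparse two-sided tape, blanks are 0
    while (state, tape.get(pos, 0)) in program:
        action, state = program[(state, tape.get(pos, 0))]
        if action == 'L':
            pos -= 1
        elif action == 'R':
            pos += 1
        else:
            tape[pos] = action                   # print a tape symbol
    return tape

# Successor program: q0 1Rq0, q0 01q1; input x = 2 as three 1s gets a fourth 1.
succ = {('q0', 1): ('R', 'q0'), ('q0', 0): (1, 'q1')}
tape = run_tm(succ, [1, 1, 1])
assert sum(v == 1 for v in tape.values()) == 4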
q2j 10q2j+1
q2j+1 0Rq2j
q2j 0Rq2(j+1)
We use i − 1 since the states are 0-based. The above subroutine "cleans up" the
entries x1 , ..., xi−1 . In our input convention, after running the above portion of
the program, the state of the tape will be (xi , xi+1 , ..., xk ) and the tape head will
be reading the beginning of xi . Next, we need to read xi leaving it untouched:
q2(i−1) 1Rq2(i−1)
q2(i−1) 0Rq2i
q2i 10q2i+1
q2i+1 0Rq2i
q2i 0Rq2i+2
q2i+2 10q2i+1
◻
We would like to prove now that TMCOMP is closed under Composition, Primi-
tive Recursion and Minimalisation. For these, we need a way of composing Turing
Machines. We face the problem that although the input is given in some standard
form, the output is not. To solve this, we state the following Proposition, proved as
Example 2.4.14 of [8]. Though important, the proof for this is purely technical. For
the following definitions, we will introduce the use of extra symbols, as discussed in
Observation 1.2.3. These symbols will not affect our definition of J_K.
1. P >> k: Let k ∈ N. For a quadruple Ql of the form qiSAqj, define Ql >> k = qi+kSAqj+k. Define the program P >> k = {Q1 >> k, ..., Qn >> k}.
2. ρ(P): Define ρ(P) to be the largest index k of an internal state qk used by P.
3. The Shift Right program. This program will be noted by Shift. Given an input
1 … 1 0 …   (n times)
it leaves the tape as
# 1 … 1 0 …   (n times)
with the tape head over the # symbol. It is easy to write this program.
Lemma 1.2.11
If g ∶ N2 → N and h1, h2 ∶ N → N are TM computable, then f ∶ x ↦ g(h1(x), h2(x)) is
TM computable.
Proof sketch. Let Pg , Ph1 , Ph2 compute g, h1 , h2 respectively. By Proposition 1.2.10
we can assume that these programs compute g,h1 and h2 with certain uniformity in
the output (as stated in the Proposition). Consider the following subroutines that
make up our desired program:
• Copy the input. For this, we use extra tape symbols a, #. For each 1 read,
place an a, go to the end of the input, place a 1, and go back to the rightmost
a, where the process starts again. At the end, the internal state should be q5 ,
the tape will contain
1…1 (x + 1 times) # 1…1 (x + 1 times)
q0 1aq1 q1 aRq1
q1 1Rq1 q1 0Rq2
q2 1Rq2 q2 01q3
q3 1Lq3 q3 0Lq3
q3 a1q4 q4 1Rq0
q0 0#q0
q0 #Rq5
• Compute h2 (x). Add the quadruples of the program Ph2 >> 5, and append an
extra 1 at the end of the tape. At the end of this execution, the tape will look
like
1…1 (x + 1 times) # 1…1 (h2(x) + 1 times)
• Compute h1 (x). Now we would like to compute h1 (x) using the input that we
copied. There are several things that should be taken into account here:
– This needs to happen right after the execution of Ph2 >> 5. Set M =
ρ(Ph2 >> 5). By Proposition 1.2.10, we can assume that execution always
halts on some fixed state qω2 ∈ [5, M ]. Add the quadruple
qω2 11qM +1
1…1 (h1(x) + 1 times) # 1…1 (h2(x) + 1 times)
and the tape head will be placed on the leftmost 1 with some internal
state qω1 .
1…1 (h1(x) + 1 times) 0 1…1 (h2(x) + 1 times),
the tape head will be over the leftmost 1 and the internal state will be qK+2 .
Adding the quadruples of Pg >> K + 2 completes the program.
◻
Proposition 1.2.12 (TMCOMP is closed under Composition)
If g ∶ Nm → N and hi ∶ Nk → N for i = 1, ..., m are Turing-computable, and
f(x⃗) = g(h1(x⃗), ..., hm(x⃗)), then f is Turing-computable.
Lemma 1.2.13
If g ∶ N → N and h ∶ N3 → N are TM computable, then f ∶ N2 → N defined by
f(x, 0) = g(x)
f(x, y + 1) = h(x, y, f(x, y))
is TM computable.
Proof sketch. The idea is to separate the tape into sections: the first two will
contain the input x, y. The third section will contain some y0 ≤ y indicating that
the fourth section contains the value of f (x, y0 ). As we will see, we will need more
sections, but this is the general idea. Let Pg and Ph be TM programs that compute
g and h respectively, outputting as described in Proposition 1.2.10.
1…1 (x + 1 times) 0 1…1 (y + 1 times)
2. Copy and modify the contents of the tape so the end result looks like this
3. Place the tape head at the beginning of the last section of the tape, run Pg ,
and add a final 1, to obtain
4. Check whether the second and third sections of the tape are equal. If so, return the contents of the fourth section minus one 1; to do this, clean everything that is before the leftmost 1 of the fourth section. Otherwise, continue to the next step.
6. Copy, shift and modify the contents of the tape so that they now are
Place the head at the leftmost 1 of the fourth section (the beginning of the
second 1 × (x + 1)).
7. We can now run (a shifted version of) Ph and add a final 1, so that the result
is
8. Go to Step 4.
◻
Proposition 1.2.14 (TMCOMP is closed under Primitive Recursion)
If g ∶ Nk → N and h ∶ Nk+2 → N are Turing-computable then if f ∶ Nk+1 → N is defined
by
f(x⃗, 0) = g(x⃗) ∀x⃗ ∈ Nk
f(x⃗, y + 1) = h(x⃗, y, f(x⃗, y))
f is TM computable.
Proof idea. Once more, we have shown how to prove the particular case k = 1. For the general case we can use the same idea. We don't need to add more sections to the tape; we just keep a copy of the full original input at the beginning of the tape, just as in the particular case. ◻
The previous propositions prove the following theorem.
Theorem 1.2.15 (PRIM ⊆ TMCOMP)
Every primitive recursive function is Turing-computable.
Lemma 1.2.16
Let g ∶ N2 → N be Turing-computable. Then the function f ∶ N → N given by
f (x) = µm[g(x, m) = 0] is Turing-computable.
Proof. Again we use the trick of partitioning the tape and copying the input. Let
Pg be a TM program that computes g like in Proposition 1.2.10.
1…1 (x + 1 times) 0 …
2. Copy the input and modify the tape so it ends up looking like this
1…1 (x + 1 times) # 1 # 1…1 (x + 1 times) 0 1
for some m ∈ N.
4. Place the tape head on the leftmost 1 of the third section (right after the second
# symbol), run (a shifted version of) Pg and add an extra 1 to the end so the
result is
If g(x, m) = 0 (or in other words, if there is only one 1 after the second #
symbol), then clean up the tape by zeroing everything from the leftmost 1 to
the first #, and from the second # symbol to the last (and only) 1 to its right.
This sets m as the output. Otherwise, go to the next step.
5. Shift the third section of the tape and add a 1 to the second one so the tape
now looks like
6. Copy the first two sections of the tape so that the tape contents now are
7. Go to step 3.
By following these steps it is clear that if there exists an m such that g(x, m) = 0,
then the TM described above will eventually halt and output such m. ◻
Definition 1.17 (Gödel numbers for Turing Machines) Define the following codes for tape symbols, action symbols and internal states:
gn(L) = 2
gn(R) = 3
gn(qi) = 2i + 4   (even numbers greater than 2)
gn(Si) = 2i + 5   (odd numbers greater than 3)
With this coding we ensure that, given two different quadruples, their numberings will be different. Moreover, given gn(Q), we can use the recursive projection functions πi to unequivocally recover the original Q.
Now, given a TM program P = {Q0, Q1, ..., Qn}, we can code it as
c = ∏_{i=0}^{n} p_i^{gn(Qi)}
Notice that the coding for each program is not unique, as it depends on the order we choose for the quadruples. Define then a relation ≃g between TM programs and natural numbers, where P ≃g c if and only if P's quadruples can be ordered as (Q0, Q1, ..., Qn) such that the equation above holds. If P ≃g c, we say that c is a Gödel Numbering or Index for P. Notice that if P and P′ are different TM programs and if for some c ∈ N, P ≃g c, then P′ ≃/g c. So different TM programs will have different sets of indexes.
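A hedged Python sketch of this coding, reusing the encode function from the earlier tuple-encoding sketch (packing a quadruple as a 4-tuple code is our assumption, following Definition 1.17):

def gn_state(i): return 2 * i + 4     # even numbers greater than 2
def gn_symbol(i): return 2 * i + 5    # odd numbers greater than 3

def gn_action(A):
    if A == 'L': return 2
    if A == 'R': return 3
    return gn_symbol(A)               # printing S_A codes like a tape symbol

def gn_quadruple(qi, Si, A, qj):
    """Code q_i S A q_j as the tuple code <gn(q_i), gn(S), gn(A), gn(q_j)>."""
    return encode([gn_state(qi), gn_symbol(Si), gn_action(A), gn_state(qj)])

# A program code is then prod_i p_i^{gn(Q_i)} over some ordering of its quadruples.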
Gödel numbers provide us with a recursive way of indexing all Turing Machines.
Definition 1.18 (Turing Machine indexes) For e ∈ N, the eth Turing Machine Pe is
defined by
JeKk = JPe Kk
For the proofs of the following theorems, we use the primitive recursive function eq
as defined in 1.1.1.
Lemma 1.3.1
The set of Turing Machine indexes Ti = {e ∈ N ∶ Pe ≠ ∅} is recursive.
Proof sketch. First we define a function q such that, for x ∈ N,
q(x) = 1 if gn(Q) = x for some quadruple Q, and 0 otherwise.
The value q(x) can be computed by:
1. Checking that L(x) = 4. If it is not, then q(x) = 0.
2. Checking that the first and fourth projections (π1 (x) and π4 (x)) are codes for
internal states, that is, that they are even numbers greater than 2. If not, then
q(x) = 0.
3. Checking that the second projection is a code for a tape symbol, that is, it is
an odd number greater than 3. If not, then q(x) = 0.
4. Checking that the third projection is a code for an action symbol, that is, it is
either 2 or 3 or an odd number greater than 3. If not, then q(x) = 0.
If all of the conditions above held, then q(x) = 1. We can check now for any x ∈ N if
it is the encoding of a set of quadruples by checking that each of the projections is
a valid quadruple. We can do this by taking the sum
∑_{i=1}^{L(x)} s̄g(q(π(i, x))),
where Qk = π(k, x), and checking that this sum is 0. With enough care, one could
now define recursively the function χT i using the ideas presented above. ◻
Theorem 1.3.2 (The Enumeration Theorem)
f ∶ N1+n → N defined by f(e, x⃗) = JeKn(x⃗) for all x⃗ ∈ Nn is partial recursive.
Proof idea. Fix n ∈ N∗ . The idea is to code the state of a TM program execution
with numbers recursively, similar to how we code the programs themselves. Sup-
pose we have an index e corresponding to some Turing Machine with tape alphabet
A = {S0 , S1 , ...}. Recall that in Definition 1.3.1 we have already defined codings
gn(qi ) and gn(Si ) for states and tape symbols.
In a similar way to the proof of Lemma 1.3.1, we could show that the set of
codes of tape states is recursive. We can also define a recursive initialization
function init ∶ Nn → N such that for all x1 , ..., xn ∈ N, init(x1 , ..., xn ) is the
code of the initial tape state with input (x1 , ..., xn ), that is, the code of a tape
state with
Calculating the next step in the execution (as a partial recursive function):
We will define then a partial recursive function next ∶ N2 → N which on input (e, s),
where s = gn(q, wL , σ, wR ), will simulate executing one step of Turing Machine Pe
from state (q, wL , σ, wR ). More explicitly, given s ∈ N, the function does the following:
3. Finally, if e and s were valid codes and a matching quadruple qσAqj was found, calculate the next state φ′ = (qj, w′L, σ′, w′R) of the execution.
In summary,
and this function clearly counts the 1’s in w ∈ A∗ given its code gn(w). With this
function, define
f (e, x⃗) = oc(π2 (sF )) + eq(π3 (sF ), gn(1)) + oc(π4 (sF ))
where sF = final(e, x⃗). ◻
Proof. By the Enumeration Theorem 1.3.2, we know that the function f (e, x) = JeK(x)
is partial recursive. In section 1.2.2 we proved that PREC ⊆ TMCOMP and so there is a
TM U that computes f . ◻
Observation 1.3.1 (Importance of the Universal Machine) In [15], Rogers gives some very interesting insight into the implications of Theorem 1.3.3.
When talking about “mechanical complexity”, Rogers refers to that of Turing Ma-
chine U , which provides a bound on the number of internal states and of instructions
that the tape head needs to be able to perform in order to compute ALL computable
functions of one variable.
Observation 1.3.2 (Turing Machines with nice outputs revisited) For the last the-
orems, notably the Enumeration Theorem 1.3.2, we did not care about the output of
Turing machines being nicely formatted, as in Proposition 1.2.10. Indeed, there are
indexes e ∈ N for which Pe is a Turing Machine that does not output in a nice format,
yet the previous theorems hold for these indexes. However, this nice output property
is desirable for our later work. Luckily, the following proposition solves this issue.
We state it without proof, which could be written following the proof of 1.2.10, found
in [8].
Proof. Without loss of generality, we can assume that Py outputs with nice format
(1.2.10), as we could write nice(y) instead of y, and the proof would yield the
same result. The idea is simply to take Turing machines Px and Py and output
Py ∪ (Px >> ω), where ω ∈ N is the final state of Py .
• First, given y, we need to find the final state of Py . By definition, such state
should have the greatest index of all states in quadruples of Py . Following
• It can be easily proved that there is a partial recursive function shft such that, for a quadruple Q = qiSAqj and a number k, shft(gn(Q), k) = gn(Q >> k), where Q >> k results from adding k to the state indexes of Q. So define
Comp(x, y) = gn(y) × ∏_{i=1}^{L(x)} p(i + L(y) − 1)^{shft(π(i−1, x), om(y))}
◻
Theorem 1.3.6 (The s-1-1 Theorem)
There exists a recursive function s11 ∶ N2 → N such that for all e ∈ N, y ∈ N, z ∈ N
Js11 (e, y)K(z) = JeK2 (y, z)
Proof. For a given x ∈ N, there is a Turing Machine Wx that, when executed with
initial state:
1 … 1 (k times),
and with the tape head on the leftmost 1. Let's see that we can actually define a recursive function w such that for every x ∈ N, Jw(x)K = JWx K. There is a Turing machine which shifts the tape contents to the right by one place, leaving the tape head in its original position; let s be the index of such a machine. There is also a Turing machine, with index o and nice output, which simply writes a 1 wherever the tape head is, without moving it. With these two machines, define w as
w(0) = Comp(o, Comp(s, s))
w(n + 1) = Comp(o, Comp(s, w(n))),
and so Jw(x)K = JWx K. Now define
s11 (e, y) = Comp(e, w(y))
to obtain the desired result. ◻
Theorem 1.3.6 can in fact be generalized. We can take a function of m + n variables
and fix m of them, leaving as a result a function of n variables. This is called the
S-m-n Theorem, which we state below. The proof follows the same ideas as the proof for s11, with added technical difficulty, so we do not present it here.
Theorem 1.3.8 (The s-m-n Theorem)
There exists a computable function sᵐₙ such that for every e ∈ N, x⃗ ∈ Nm and y⃗ ∈ Nn:
Jsᵐₙ(e, x⃗)Kn(y⃗) = JeKm+n(x⃗, y⃗)
We now state one of the main interests of this work, Kleene's Second Recursion Theorem, or SRT for short: for every partial recursive function f(k, x) there exists an index k such that for all x,
f(k, x) = JkK(x)
This is the formulation and proof of the SRT used by Jones in [11].
Proof. Consider the function g(y, x) = f (s11 (y, y), x). By Theorem 1.3.2, there
exists an index p such that
So it does not matter which computation model we use; the functions computed are the same: PREC = TMCOMP = URMCOMP = COMP.
Chapter 2
Introduction
Looking towards general applications, we would like to work with functions over sets other than N, usually the words of a language over some alphabet A. When thinking about computability as the ability to describe algorithms effectively in "some" language, we immediately think of the popular programming languages in use today (Java, C++, etc.). Programs in those languages are not limited to functions over the natural numbers; instead, they work with inputs and outputs from some set D of data. In this work, we need to deal with these general kinds of algorithms. This chapter discusses the notion of computability for such functions, and introduces Rogers' axioms, which characterize the "acceptable" programming languages in which many results related to Kleene's Second Recursion Theorem hold.
2.1.1 Preliminaries
Definition 2.1 (Pairing Function) A function ⟨_, _⟩ ∶ N2 → N is a pairing function if
it is total recursive and there exist partial recursive projections π1 , π2 such that
πi(⟨x1, x2⟩) = xi,  i = 1, 2
Definition 2.2 (Pairing Scheme) A pairing scheme is a set of total recursive func-
tions ⟨_⟩ ∶ Nn → N with partial recursive projections πin such that for every i, n ∈ N
πin (⟨x1 , ..., xi , ..., xn ⟩) = xi
Note that every Tuple Encoding Scheme as in Proposition 1.1.1 is a Pairing Scheme.
Observation. Every pairing function gives rise to a pairing scheme by defining:
⟨x1 ⟩ = x1
⟨x1 , x2 , ..., xn ⟩ = ⟨x1 , ⟨x2 , ..., xn ⟩⟩.
There exist many pairing functions. From now on, assume ⟨_, _⟩ to be an arbitrary
but fixed pairing function. Its explicit definition will be irrelevant for our work.
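Although the explicit definition is irrelevant here, one standard concrete choice is Cantor's pairing function. The following Python sketch (our own illustration, not the document's) shows it together with its recursive projections:

import math

def pair(x, y):
    """Cantor pairing: a total recursive bijection N^2 -> N."""
    return (x + y) * (x + y + 1) // 2 + y

def unpair(z):
    """Projections: recover (x, y) from pair(x, y)."""
    w = (math.isqrt(8 * z + 1) - 1) // 2      # the diagonal index w = x + y
    y = z - w * (w + 1) // 2
    return w - y, y

assert all(unpair(pair(x, y)) == (x, y) for x in range(40) for y in range(40))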
The "standard" indexing (ϕ) The Gödel numberings from the last Chapter define an indexing of PREC(1). We will fix this indexing as a standard and call it ϕ, so that ϕi = JPiK, where Pi is the ith Turing Machine as defined in 1.3.1.
Definition 2.4 (Some definitions for indexings) Let π be an indexing. We say that
π is:
1. Effective if there exists a recursive function f ∶ N → N such that π = ϕ ○ f .
Observation 2.1.1 (ϕ has all the properties) We have already proved that ϕ is uni-
versal, has the s11 property and has recursive composition (Theorems 2.1.1, 2.1.2,
2.1.3). It is also trivially effective, programmable and thus acceptable.
π =ψ○f
ψ = π ○ g,
so not only do π and ψ compute the same functions, but there is an effective way to go back and forth between them. Effectiveness and programmability are usually referred to as Rogers' axioms for acceptable indexings.
(⇐) Suppose now that π is effective, and so there exists a recursive f such that
ϕ ○ f = π. By Theorem 2.1.1, ϕ is universal and so it has a recursive universal
function uϕ . Defining the function u(e, x) = uϕ (f (e), x) we have that for all e, x ∈ N,
u(e, x) = uϕ (f (e), x)
= ϕf (e) (x)
= πe (x),
Theorem 2.1.5 (Programmable, s11 and composition are equivalent for effective indexings)
Let π be an effective (and thus universal) indexing. The following are equivalent:
1. π is programmable.
4. π is acceptable.
Proof. We will prove (4) ⇔ (1) ⇒ (2) ⇒ (3) ⇒ (1). The equivalence (1) ⇔ (4)
comes from the definition of an acceptable indexing. Since π is effective, then there
exists a recursive f such that ϕ ○ f = π.
Programmable ⇒ recursive composition: Suppose π is programmable, and so there
exists a recursive function g such that π ○ g = ϕ. Let cϕ be a recursive composition
function for ϕ. Define c(i, j) ∶= g(cϕ (f (i), f (j))). Clearly c is recursive and we have
for all i, j, x ∈ N,
h(0) = a
h(x + 1) = c(b, h(x)).
Since c is recursive, then so is h. First, we prove by induction that πh(x) (z) = ⟨x, z⟩
for every x, z ∈ N. By definition, πh(0) (z) = πa (z) = ⟨0, z⟩. Now, supposing πh(x) (z) =
⟨x, z⟩ we have
Defining sπ (e, x) ∶= c(e, h(x)), we have that sπ is clearly recursive, and for all e, x, y ∈
N,
so π is programmable. ◻
⟨i, j⟩ = k ∧ i, j, k ∈ N ⇔ ⟨di , dj ⟩ = dk
For all i, x, y ∈ N:
Definition 2.6 (Turing Completeness) We say J_K is Turing complete if the number-
ing π defined above is an indexing of the unary recursive functions.
Observation 2.1.3 It is easy to see from our last definition, and the Church-Turing
thesis, that a programming language is Turing complete if and only if every effective
computable (in the informal, algorithmic sense) function on D, and no other, is J_K
computable.
JU K(p, d) = JpK(d)
3. It has the s11 property, namely, there exists a program s11 ∈ D such that for every
p, x, y ∈ D,
Proof. The proof follows from Theorem 2.1.5 and the facts
Proof. Recall that the numberings are defined as follows, for all i ∈ N:
π σ (i) = σ ○ Jσ −1 (i)K ○ σ −1
π ρ (i) = ρ ○ Jρ−1 (i)K ○ ρ−1 .
Since (D, J_K) is acceptable (with respect to σ), then π σ is an acceptable indexing
(of PREC(1)). We show that there exist recursive functions f, g ∶ N → N such that
πσ = πρ ○ f
π ρ = π σ ○ g.
1. Since J_K is Turing Complete, there exists a program κ ∈ D such that for all
p, x ∈ D:
Since (D, J_K) is acceptable, there exists a specializer program s11 , and so the
function f ∶ N → N defined by
Repeating the process in the previous part of the proof, we find that π ρ = π σ ○ g,
where g ∶ N → N is recursive.
◻
A subset of programs In the real world, not every element of D represents a "valid" program. For example, not all ASCII sequences represent C++ programs; however, every computable function over ASCII sequences can be computed with a C++ program. For this reason we may distinguish a subset Pgms ⊆ D of valid programs, such that trying to execute some d in D − Pgms makes no sense, that is, JdK(x) ↑ for all x ∈ D, and all J_K computable functions are computed by some p ∈ Pgms. It is clear then that if (D, J_K) is acceptable, then all "witnesses" of acceptability (the universal and s11 programs, and the programs for Turing Completeness) are in Pgms.
We now prove the SRT for any acceptable programming language. The proof is
identical to that of Theorem 1.3.2.
Theorem 2.2.1 (SRT for acceptable programming languages)
Let n ∈ N∗. Let p ∈ Pgms. There exists p′ ∈ Pgms such that for every x⃗ ∈ D
Jp′K(x⃗) = JpK(p′, x⃗)
Proof. By Turing Completeness and the s11 property there is a program p̄ such that Jp̄K(y, x⃗) = JpK(Js11K(y, y), x⃗) for all y, x⃗ ∈ D. Take p′ = Js11K(p̄, p̄). We have that
Jp′K(x⃗) = JJs11K(p̄, p̄)K(x⃗)
= Jp̄K(p̄, x⃗)
= JpK(Js11K(p̄, p̄), x⃗)
= JpK(p′, x⃗)
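To make the construction tangible, here is a hedged Python sketch of the whole s11-based argument, treating program texts (strings that define a function named f) as both programs and data. The run, s11 and srt helpers and their conventions are our own illustration, not the document's formal objects:

S11_SRC = '''\
def s11(prog, s):
    # Specializer: returns a program q with run(q, y) == run(prog, s, y).
    return (
        "inner = " + repr(prog) + "\\n"
        + "def f(y):\\n"
        + "    env = {}\\n"
        + "    exec(inner, env)\\n"
        + "    return env['f'](" + repr(s) + ", y)\\n"
    )
'''
exec(S11_SRC)                        # defines s11 here

def run(prog, *args):
    env = {}
    exec(prog, env)
    return env['f'](*args)

def srt(p):
    """Fixed point finder: returns k with run(k, y) == run(p, k, y) for all y."""
    # p_bar(e, y) = p(s11(e, e), y); then k = s11(p_bar, p_bar), as in the proof.
    p_bar = (
        S11_SRC
        + "p_src = " + repr(p) + "\n"
        + "def f(e, y):\n"
        + "    env = {}\n"
        + "    exec(p_src, env)\n"
        + "    return env['f'](s11(e, e), y)\n"
    )
    return s11(p_bar, p_bar)

# A program that receives its own text k and a parameter y:
p = "def f(k, y):\n    return (len(k), y)\n"
k = srt(p)
assert run(k, 7) == run(p, k, 7)     # JkK(y) = JpK(k, y)

Note that srt is itself computable, anticipating the fixed point finder of Theorem 2.2.2 below.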
Observation 2.2.1 (Use of Rogers' axioms in the above proof) Though Rogers' axioms are indeed enough to prove the SRT, notice that:
So, as we will see in the next chapter, the SRT may hold even in languages that are
far from being Turing Complete, and for which we need not prove the existence of a
universal program.
f (0, y) = g(y)
f (x + 1, y) = h(f (x, y), x, y)
Our previous work proved that if h and g are computable then so is f . The
SRT provides another way to show closure under recursion. Suppose p is a
program such that:
JpK(e, 0, y) = g(y)
JpK(e, x + 1, y) = h(JeK(x, y), x, y)
We apply the SRT when we need to find programs that somehow use their own code. So the restrictions are less about what the resulting code is, and more about how this own code is used. The SRT allows us to define how a program behaves on a certain parameter (Jp′K(x)) in terms of a computable function that uses both the parameter x and the code of the program p′.
Computing Fixed Points Not only can we prove the existence of fixed points for any computable function in an acceptable programming language, we can also compute them, as seen in the next theorem. This will prove useful later, in Chapter 4.
Proof. The key point of the proof of the SRT (Theorem 2.2.1) is, from a program p, to obtain a program p̄ such that
where U is the universal program. Defining now bar(p) = Js11K(γ, p), we have a computable function that from p finds p̄. So now, defining
JJeK(y)K(x⃗) = JqK(e, y, x⃗)
Proof. Let q ′ be a program such that Jq ′ K(⟨z, y⟩, x⃗) = JqK(z, y, x⃗). By Turing
Completeness there exists a program p such that JpK(z, y) = Js11 K(q ′ , z, y) for all
z, y ∈ D. Let e be a fixed point for p given by the SRT. We have JeK(y) = Js11 K(q ′ , e, y)
and so
◻
We will call the resulting program e of the last theorem an explicit fixed point.
As with 2.2.2, we can find such explicit fixed points computably from a program q.
Theorem 2.2.4 (Computable Explicit Recursion)
There exists a program xrt such that for every q ∈ P gms, x, y ∈ D:
JJJxrtK(q)K(y)K(x⃗) = JqK(JxrtK(q), y, x⃗)
Proof. Let ○ ∈ P gms be the composition program JJ○K(p, q)K(x) = JpK(JqK(x)). Since
we have computable pairings ⟨⟩ and projections, there exists a program r such that
for all z, y, x⃗ ∈ D
and so
◻
About the explicit recursion theorem The explicit recursion theorem is a stronger version of the SRT (hence some authors know it as the strong recursion theorem [17]). We can think of the explicit fixed point e as a program provider: in reality we have a family of programs {JeK(x) ∶ x ∈ D}, such that each of them, when executed with some parameter y, takes into account not only that y, but also possibly its own code JeK(x), the code of the provider e, and even the code of some other programs of the family.
JJJeK(i)K(x)K(y) = JqK(e, i, x, y)
We call e from the above Corollary a nested explicit fixed point. We prove a slightly
stronger version, which tells us how to computably find nested explicit fixed points.
JJJJnestK(q)K(i)K(x)K(y) = JqK(JnestK(q), i, x, y)
Defining
JnestK(q) = JxrtK(JrK(q))
JJJeK(i)K(x)K(y) = JJJJxrtK(JrK(q))K(i)K(x)K(y)
= JJJrK(q)K(e, i, x)K(y)
= JJJs11 K(T, JtK(q))K(e, i, x)K(y)
= JJT K(JtK(q), e, i, x)K(y)
= JJs11 K(JtK(q), e, i, x)K(y)
= JJtK(q)K(⟨e, i, x⟩, y)
= JqK(e, i, x, y),
Proof. Let f(z, y, x) = JqK(y, JzK(g1(y)), ..., JzK(gn(y)), x). There must be a program q̄ that computes f. Applying the Explicit Recursion Theorem (2.2.3) to q̄, there exists a program e such that
JJeK(y)K(x) = f(e, y, x)
= JqK(y, JeK(g1(y)), ..., JeK(gn(y)), x)
We will call the resulting programs p′, q′ from the last Theorem double fixed points for p and q. We prove a stronger form of the Theorem which, as with 2.2.2, allows us to compute double fixed points.
Proof. By Turing Completeness and the s11 program, there exist programs t, r1 , r2
such that for all x, y, z ∈ D,
p′ = JtK(J○K(p, r1 ), J○K(q, r2 ))
q ′ = JtK(J○K(q, r2 ), J○K(p, r1 )),
In an identical fashion, we see that Jq ′ K(z) = JqK(p′ , q ′ , z), and so we have that smu
is a double fixed point finder program. ◻
About double fixed points The SRT allows us to define programs that make refer-
ence to their own code when executing. The double recursion theorem allows us to
define a pair of programs that make reference possibly to their own code, and to the
other program's code. For example, using the SRT we can find a program p such that for all x ∈ D, JpK(x) = J○K(p, x). Using the double recursion theorem we can find a pair of programs p, q ∈ D such that for all x ∈ D, JpK(x) = J○K(q, x) and JqK(x) = J○K(x, p).
Chapter 3
Introduction
Following the discussion introduced in Chapter 2, we now turn our attention to the implementation of Kleene's Theorem in specific models of computation (with specific programming languages). We follow Jones's Swiss Pocket Knife for Computability [11], which discusses the complexity of the SRT. We study two concrete computation models:
• The TINY language, by Bonfante and Greenbaum [3], which works on tree-
structured data.
• If t1, t2 ∈ TA then (t1 ⋅ t2) is the tree whose root node has t1 as its left child and t2 as its right child.
The projection functions work as usual, πi (t1 ⋅ t2 ) = ti and πi (a) = a for i = 1, 2 and
a ∈ A. The size of a tree is defined intuitively as ∣a∣ = 1 and ∣(t1 ⋅ t2 )∣ = ∣t1 ∣ + ∣t2 ∣ + 1.
The word a1 a2 ...an ∈ A is encoded in TA∪{nil} as (a1 ⋅ (a2 ⋅ (... ⋅ (an ⋅ nil))))
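The data domain TA is easy to model directly. A small Python sketch (our own representation: atoms as strings, inner nodes as pairs):

def cons(t1, t2):            # (t1 . t2)
    return (t1, t2)

def size(t):                 # |a| = 1, |(t1 . t2)| = |t1| + |t2| + 1
    return 1 if not isinstance(t, tuple) else size(t[0]) + size(t[1]) + 1

def encode_word(word):       # a1 a2 ... an -> (a1 . (a2 . (... (an . nil))))
    out = 'nil'
    for a in reversed(word):
        out = cons(a, out)
    return out

assert encode_word('ab') == ('a', ('b', 'nil'))
assert size(encode_word('ab')) == 5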
List notation As deeply parenthesized structures are hard to read, we will some-
times use list notation for elements in TA .
• () stands for nil.
x ∈ Var
t ∈ TA
// Expressions
Exp ::= x | '\'' t | 'cons' Exp Exp | 'hd' Exp | 'tl' Exp
// Commands
Program ::= 'read' x (',' x)* ';' Cmd '; write' x
Note: In the definition of Exp we write \' to "escape" the quote, as would be done in a high level language. The grammar rule says that 't, for t ∈ TA, is an Expression.
Definition 3.1 (List syntactic sugar) For ease of notation, define, for expressions
E1 , ..., En , the command:
Definition 3.2 (Stores) The "state" of an execution of a TINY program is determined by the value of each variable. A store is a function σ ∶ Var ↦ TA representing one such state. The set of all stores will be denoted by G.
Given σ ∈ G, t ∈ TA and X ∈ Var, define the store σ[X ↦ t] by:
σ[X ↦ t](Y) = t if Y = X, and σ(Y) otherwise
Informally, hd gets the head (or left child) of an expression, tl gets the tail (right
child), and cons builds a tree with the specified children. As a quick example:
Semantics of Commands Each command updates the store, so for C ∈ Cmd we have
JCK ∶ G → G. It is defined as follows:
Semantics of Programs Finally we define how J_K works for programs. Consider a program p = read X1, ..., Xn; C; write Y. Given t1, ..., tn ∈ TA, take the initial configuration (store) σ0(t1, ..., tn) where σ0(Xi) = ti for i = 1, ..., n, and σ0(V) = nil for every other variable V.
Finally, we define
p = ((X1 ...Xn ) C Y )
s-1-1 in TINY
First, we will need a specializer, or s11 program, such that for every program p and all s, d ∈ D: JpK2(s, d) = JJs11K(p, s)K(d). Consider a program p = read q, d; Cp; write out. Suppose we want to fix q = s for some s ∈ D. The resulting specialized program could be p∗ = read d; q := 's; Cp; write out. We need a general formulation for s11. In concrete syntax, we have a program
p = ((q d) Cp out)
such that
As there are no loops or similar constructs, the complexity of a program in TINY is determined purely by its length, not by the size of the input. So TINY is very limited in terms of the functions it computes. The fact that the SRT holds in TINY is then remarkable.
Js11K(p, s) →
At the beginning: [pgm = p]
After inputvar: [pgm = p, inputvar = q]
After C: [pgm = p, inputvar = q, C = Cp]
After outputvar: [pgm = p, inputvar = q, C = Cp, outputvar = out]
After initialise: [pgm = p, inputvar = q, C = Cp, outputvar = out, initialise = (:= q (quote s))]
After body: [pgm = p, outputvar = out, initialise = (:= q (quote s)), body = (; (:= q (quote s)) Cp)]
After body: [pgm = p, outputvar = out, body = (; (:= q (quote s)) Cp)]
After outpgm: [outpgm = ((d) (; (:= q (quote s)) Cp) out)]
Implementing SRT
Now that we have an s11 program, we are ready to implement the SRT. We follow the steps of the proof of Theorem 2.2.1.
Let p = read q, d; Cp; write out. We want to find some p′ such that Jp′K(d) = JpK(p′, d) for all d ∈ D.
Define p̄ =
read q, d;
/* First, we need to run s11 with arguments (q, q). So we
   set its inputs to (q, q) and run its body (as in 3.1.5) */
pgm := q; s := q; Cspec;
2. Write program p′ such that p′ = Js11K(p̄, p̄). By taking p̄ in its concrete syntax and executing s11 we get our p′, which we already proved to be a fixed point for p. But we can also write p′ directly, using the idea that p′ = Js11K(p̄, p̄) and that Jp′K(d) = JpK(p′, d). p′ =
read d;
/* Set the arguments of s11 to p̄ and execute its body. */
pgm := 'p̄; s := 'p̄; Cspec;
/* In outpgm is the result of Js11K(p̄, p̄). We need to run p
   with arguments (outpgm, d) then. */
q := outpgm; Cp; write out
One can check that computing Js11K(p̄, p̄) actually yields exactly the program written above.
Observation 3.1.1 (Self-reproduction) Note that in p′, with the statement q := outpgm, we assign to q the value of Js11K(p̄, p̄), that is, the entire text of p′ itself. So there is a code segment in p′ that assigns to q the entire text of p′.
n ∈ N∗
z ∈ Z∗
// Commands and Programs
Cmd := 'A(' n ',' a ')' | 'J(' z ')' | 'C(' n ')'
Program := (Cmd)*
In general, a 1# program looks like p = I1 I2 ...In where each Ii is a Cmd.
3.2.2 Semantics of 1#
Executing 1# programs Execution of 1# programs is analogous to the execution of the URM programs defined in 1.2.1. The semantics of 1# is most easily explained informally. Consider a 1# program p = I1 ... IL. Execution of a program p with inputs a1, ..., an ∈ A∗ is done by sequentially executing its instructions, starting with I1, with Ri storing the word ai for i = 1, ..., n and the rest of the registers empty. The execution of an instruction Ik is given by:
Correct and incorrect stopping of program execution Programs may or may not
stop. As in 1.2.1, we require that programs stop by trying to execute an instruction
just after the last instruction of the program. Stopping an execution this way is called
halting. Programs may stop in other ways, in which case we will say they stopped
incorrectly. For example, consider the following program, which for all inputs will
stop incorrectly by trying to execute an instruction before the first one.
J(-1)
Definition 3.4 (Halting) Execution of some program p = I1 ... IL halts if one of the
following happens at some point:
2. The program goes to In and In is a Jump instruction of the form J(z) where
n + z = L + 1.
3. The program goes to In and In is a Cases instruction of the form C(n) and:
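These semantics can be animated with a small interpreter. The following Python sketch is our own; since the Cases clause above is cut off, we assume the usual 1# convention that C(n) consumes the first symbol of Rn and advances by 1, 2 or 3 instructions according to whether Rn was empty, started with 1, or started with # (this matches the move and copy programs below). The completion of the truncated move listing is likewise our guess:

from collections import defaultdict

def run_1hash(program, inputs):
    """('A', n, s): append s to Rn; ('J', z): relative jump; ('C', n): cases."""
    regs = defaultdict(str)
    for i, w in enumerate(inputs, start=1):
        regs[i] = w
    k = 1                                   # about to execute I_k (1-based)
    while 1 <= k <= len(program):
        ins = program[k - 1]
        if ins[0] == 'A':
            regs[ins[1]] += ins[2]; k += 1
        elif ins[0] == 'J':
            k += ins[1]
        else:                               # ('C', n), convention assumed above
            w = regs[ins[1]]
            if not w:
                k += 1
            else:
                regs[ins[1]] = w[1:]
                k += 2 if w[0] == '1' else 3
    if k != len(program) + 1:
        raise RuntimeError("stopped incorrectly")
    return regs[1]

# move_{2,1}: drain R2 onto the end of R1 (jump-back instructions are our guess).
move21 = [('C', 2), ('J', 6), ('J', 3),
          ('A', 1, '#'), ('J', -4),
          ('A', 1, '1'), ('J', -6)]
assert run_1hash(move21, ['', '1##1']) == '1##1'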
SRT in 1#
In [14], Moss provides a proof of the SRT for 1# without the use of s11. Here, we show both ways of implementing the SRT in 1#: one following the proof in Section 2.2, which uses s-1-1, and the other provided by Moss.
We start by defining some auxiliary programs that will be useful later on.
Definition 3.5 (The Move program) Given n, m ∈ N with n ≠ m, the moven,m pro-
gram writes the contents of Rn onto the end of Rm , emptying Rn in the process.
move_{n,m} =
C(n)     // Cases on Rn
J(6)     // Case empty (forward 6, to the end)
J(3)     // Case 1 (forward 3, to the Case 1 implementation)
A(m,#)   // Case # (write # to Rm)
Of course, the program above does not strictly follow 1# syntax, and variables n, m
should be replaced with actual numbers.
Definition 3.6 (The Copy program) Similar to move, copy copies the contents of one register onto the end of another. However, the original register is not emptied. The downside is that we need a third, auxiliary register to perform this operation. The program copy_{n,m,k} copies the contents of Rn to Rm, using Rk as an auxiliary register. The idea is simple: emulate move, but move the contents of Rn to both Rm and Rk; at the end, move the contents back from Rk to Rn. copy_{n,m,k} =
C(n)     // Cases on Rn
J(8)     // Case empty (forward 8, to the move_{k,n} subroutine)
J(4)     // Case 1 (forward 4, to the Case 1 implementation)
A(m,#)   // Case #: write # to Rm and Rk
A(k,#)
J(-5)    // Back 5 (to the Cases statement, for the loop)
A(m,1)   // Case 1 implementation: write 1 to Rm and Rk
A(k,1)
J(-8)    // Back 8 (to the Cases statement, for the loop)
move_{k,n}
Notice that for copy to work correctly, the auxiliary register must be empty before
its execution.
Definition 3.7 (The Write program) write is a program such that for every x ∈ D,
JJwriteK(x)K() = x
In words, write is a program that, with input x, outputs a program that outputs x. It is not difficult to come up with such a program: for every 1 read, write an Add 1 instruction, and similarly for every # read. We need to use an auxiliary register to build the result, and then move the result to R1. write =
C(1)     // Cases on R1
J(9)     // Case empty. Forward 9, to the move_{2,1} subroutine.
J(5)     // Case 1. Forward 5, to the Case 1 implementation.
A(2,1)   // Case #: write '1##' to R2. Add '1' to R2.
A(2,#)   // Add '#' to R2.
A(2,#)   // Add '#' to R2.
J(-6)    // Back 6, to the Cases statement.
A(2,1)   // Case 1 implementation: write '1#' to R2. Add '1' to R2.
A(2,#)   // Add '#' to R2.
J(-9)    // Back 9, to the Cases statement.
move_{2,1}
Again, the program above does not strictly follow 1# syntax. We will use the names of known programs, where possible, to make code shorter. To be valid code, we would need to write out the whole move program at the end.
Jp ∣ qK(x) = JqK(JpK(x)),
where ∣ is the concatenation operator, provided that on all inputs p and q never stop incorrectly (so either they halt, or the execution never stops).
s-1-1 in 1#
In 1#, s11 =
move_{1,3}
move_{2,1}
write
move_{1,2}
JwriteK(move_{1,2})
move_{2,1}
move_{3,1}
Let's follow the execution of s11 with input (p, s) ∈ D2. To do this, we note the state of the text register machine with tuples (x, y, z), meaning that R1 stores x, R2 stores y and R3 stores z, with the rest of the registers empty. Here, we only keep track of the contents of registers 1, 2 and 3 because the s11 program does not use any other register.
And so for d ∈ D
If p̄ =
move_{2,4}    /* Since s11 uses registers 1, 2, 3, we move the
                 second argument to R4, as we will need it later. */
copy_{1,2,3}  /* We want to run Js11K(q, q),
                 where q is the first argument. */
s11
move_{4,2}    // Bring back the original second argument, to run p
p
We have for q, d ∈ D
Jp̄K(q, d) →
At the beginning, the state is the input: (q, d, ϵ, ϵ)
After move_{2,4}: (q, ϵ, ϵ, d)
After copy_{1,2,3}: (q, q, ϵ, d)
After s11: (Js11K(q, q), ϵ, ϵ, d)
After move_{4,2}: (Js11K(q, q), d, ϵ, ϵ)
After p: (JpK(Js11K(q, q), d), ..., ..., ...)
So Jp̄K(q, d) = JpK(Js11K(q, q), d)
2. As in the proof, take p′ = Js11K(p̄, p̄). It makes no sense to state here explicitly what p′ looks like because, unlike the SRT implementation in TINY, here the explicit implementation yields no additional information about p′.
copy_{1,2,3}
s11
move_{4,2} )
move_{2,1}
Comparing this code with the code for p̄ in the last section, the equation JbarK(p) = p̄ is straightforward. Following the proof of the SRT, which tells us that Js11K(p̄, p̄) is a fixed point for p, we can now compute fixed points.
JpK(JsrtK(p), d) = JJsrtK(p)K(d)
Define srt =
bar
copy_{1,2,3}
s11
The proof of the SRT tells us that Js11K(p̄, p̄) is a fixed point for p. ◻
Definition 3.8 (The diag program) diag is a program which, for every x ∈ P gms
JJdiagK(x)K() = JxK(x)
JJdiagK(x)K() = JJwriteK(x)∣xK()
= JxK(JJwriteK(x)K())
= JxK(x)
In 1#, diag =
copy_{1,3,2}
write
move_{3,1}
So, for x ∈ Pgms:
JdiagK(x) →
At the beginning: (x, ϵ, ϵ)
After copy_{1,3,2}: (x, ϵ, x)
After write: (JwriteK(x), ϵ, x)
After move_{3,1}: (JwriteK(x)∣x, ϵ, ϵ)
So JdiagK(x) = JwriteK(x)∣x
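The diag construction is precisely the classical quine trick. In Python (our own illustration), the string s plays the role of the program x, and s % s plays JwriteK(x)∣x, first writing the text and then "running" it on itself:

# A quine built on the diag idea: running it prints its own two lines exactly.
s = 's = %r\nprint(s %% s)'
print(s % s)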
2 move 1,2
4 move 2,1
5 JwriteK(move4,2 )
6 JwriteK(p)
Though we will not do it step by step, it is easy to check that for r ∈ P gms
And so for d ∈ D
JJq̂K(r)K(d) = JpK(JrK(r), d)
4.1 Preliminaries
The results presented in this chapter hold for any acceptable language, as defined
in Section 2.1.4, with sets of programs and data P gms ⊆ D and computable pairings
with computable projections ⟨⟩. As in Chapter 2, all programs and functions on the
general statements (those that hold for every acceptable Programming Language) are
really unary, using the loose notation of (x1 , ..., xn ) for ⟨x1 , ..., xn ⟩. Of course, this
unary assumption need not be true for the examples given in specific computation
models, which in this chapter will be presented in the WHILE+ language, an extension
of the TINY language studied in Section 3.1.
Var ::= ID
// Expressions
As before, we assume there is a concrete syntax for WHILE+ programs, so that every
program may be expressed uniquely as a member of TA . For the purposes of this
chapter, we do not need to specify this concrete syntax.
Semantics of WHILE+
∀n, m ∈ N, x1 , ..., xn ∈ D
Plainly, WHILE+ has built-in universal and sᵐₙ programs, as required by Rogers' axioms (Section 2.1.4).
Here, x represents a finite sequence ⟨x1 , ..., xn ⟩ of elements of the local environment
such as files, parameters, etc. We call the tuple ⟨p, x⟩ the execution environment.
When a virus comes into the environment, it infects programs. So given a virus v
and a program p, we should be able to define the infected form of p. One approach,
as done by Adleman (pioneer in formalizing computer virology) in [2] is to define the
infected form directly by JvK(p). However, as Bonfante et al. point out in [5], this
definition leaves out some scenarios, as Adleman’s virus do not take into account the
local environment variables when infecting a program. This leads to a (more general)
definition of a computer virus:
Definition 4.1 (Computer Virus) Let B be a computable function. A virus with re-
spect to (wrt) B is a program v satisfying
JB(v, p)K(x) = JvK(p, x) (4.1)
for all p ∈ P gms and x ∈ D.
Observation 4.2.1 Function B in the definition above is called the propagation func-
tion of virus v. The above definition makes sense as it takes into account some factors
of viral infections:
• A virus v infects programs. Such infection is determined by the propagation
function B(v, p).
• If v is viral, then an execution of the infected program B(v, p) over some context x should include an execution of v. Such an execution of v takes into account not only the context x, but also the program which it infected; hence the JvK(p, x).
Running an infected program B(v, p) over a context x returns as result a new context z. This execution of an infected program can really be seen as an execution of the virus v, which takes into account its environment x and the program p which it infected.
JvK(p, x) = JgK(v, p, x)
Proof. Let v be a fixed point for g, which exists by the SRT, so that JvK(p, x) = JgK(v, p, x).
Taking B = Js11 K we have that
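As a toy illustration, reusing the run and srt sketches from the code example in Chapter 2, a blueprint g and its fixed point v satisfy exactly the shape of Equation 4.1. The payload here is deliberately inert, and the whole setup is our own, not the document's formal construction:

# A harmless "blueprint" g(v, y): given the virus's own code v and y = (p, x),
# decide the behaviour of the infected form. Here it only reports sizes.
g = ("def f(v, y):\n"
     "    p, x = y\n"
     "    return 'infected ' + p + ': ' + str(len(v)) + ' bytes of viral code, env ' + str(x)\n")

v = srt(g)                                  # the virus: a fixed point of g (SRT)
y = ('outbox', ['file1', 'file2'])
assert run(v, y) == run(g, v, y)            # JvK(p, x) = JgK(v, p, x)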
Example 4.3.1 (LoveLetter Virus) "LoveLetter" was a computer worm that attacked millions of personal computers in May 2000. It spread as an email message with subject "ILOVEYOU" and an attachment "LOVE-LETTER-FOR-YOU.txt.vbs". The virus damaged local files by overwriting them, and then sent itself to all contacts found in the user's address book.
In [6], an abstract model for LoveLetter is presented. The program to be infected is
the e-mail outbox, as its behavior will be modified to send copies of the virus to the
infected user's contacts. The execution environment is the user's files, which will be affected by the virus in two ways:
Example 4.3.2 (More Blueprint Examples) 1. A virus that deletes everything. This
virus acts independently of the program infected and so it is readily defined
by JvK(p, x) = nil, where we assume nil to be an element of D representing
the "empty" program.
2. A virus that appends itself to every element of the environment. It follows the
equation JvK(p, ⟨x1, ..., xn⟩) = ⟨J○K(v, x1), ..., J○K(v, xn)⟩ (○ is the computable
composition program). This virus can be easily defined through a blueprint
viral specification function, from which we obtain the explicit viral code via a
distribution engine. We would expect the new elements J○K(v, d) to be larger
than the original d. Executing one of those new elements in the environment
appends the virus again to the elements of the environment until it runs out of
memory. Virus "Jerusalem" worked this way [4].
The Explicit Recursion Theorem tells us that there exists a program B such that for
all z, p, x ∈ D:
If program r is such that JrK(z, p, x) = g(B, z, p, x), the SRT tells us there exists a
fixed point v of r. So, for all p, x ∈ D:
JvK(p, x) = JrK(v, p, x)
= g(B, v, p, x)
= JJBK(v, p)K(x),
Definition 4.5 (Smith Virus Distribution Engine) A smith virus distribution is a pair of programs (dv, dB) such that for every program q, (JdvK(q), JdBK(q)) is a smith virus with specification function JqK.
The proof of Theorem 4.3.3 depends on being able to find explicit and regular
fixed points. Since we can find these computably, then we should be able to find
smith viruses computable from their specification functions.
Theorem 4.3.4 (A smith virus distribution engine)
There exists a smith virus distribution engine.
Proof. There exists a program T such that for all z, y, p, x ∈ D:
JT K(y, ⟨z, p⟩, x) = ⟨y, z, p, x⟩.
If ○ is the composition program, then define programs dB, r and dv so that for all q ∈ D:
JdBK(q) = JxrtK(J○K(q, T))
JrK(q) = Js11K(q, JdBK(q))
JdvK(q) = JsrtK(JrK(q)).
And so, for q ∈ D, if B = JdB K(q) and v = Jdv K(q) then for all p, x ∈ D:
JvK(p, x) = JJsrtK(JrK(q))K(p, x)
= JJrK(q)K(JsrtK(JrK(q)), p, x)
= JJrK(q)K(v, p, x)
= JJs11 K(q, JdB K(q))K(v, p, x)
= JqK(JdB K(q), v, p, x)
= JqK(B, v, p, x)
= JJ○K(q, T )K(B, ⟨v, p⟩, x)
= JJ○K(q, T )K(JxrtK(J○K(q, T )), ⟨v, p⟩, x)
= JJJxrtK(J○K(q, T ))K(v, p)K(x)
= JJBK(v, p)K(x),
so Jdv K(q), JdB K(q) is a smith virus with specification function JqK, and dv , dB is a
smith virus distribution. ◻
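The three defining equations translate almost directly into WHILE+; a minimal sketch, assuming xrt and srt name the programs that compute explicit and Kleene fixed points (as used in the proof), and that spec implements s11:

// dB =
read q;
qT := exec(○, q, T);   // the composition J○K(q, T)
B := exec(xrt, qT);    // explicit fixed point finder applied to it
write B;

// dv =
read q;
B := exec(dB, q);
r := spec(q, B);       // Js11K(q, JdBK(q))
v := exec(srt, r);     // Kleene fixed point of the resulting program
write v;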
Example 4.3.3 (A parasitic virus) As Bonfante et al. explain, "Parasitic viruses insert themselves into existing files. When an infected host is executed, first the virus infects a new host, then it gives the control back to the original host" [6]. For this case, we model the data as a triple ⟨p, q, x⟩, where p is the infected host, q is a program to be infected next, and x is the rest of the environment variables, as usual. The definition of virus tells us that a parasitic virus v, along with its propagation program B, satisfies:
JvK(p, q, x) = JJBK(v, p)K(q, x)
By the description of a parasitic virus, the following equation also holds:
JvK(p, q, x) = JpK(JBK(v, q), x)
We can define a parasitic virus via a smith specification program. In WHILE+, q =

read B, v, p, d;
// d is really a tuple <q, x>
q := hd d; x := tl d;
infected_form := exec(B, v, q);
newEnvironment := exec(p, infected_form, x);
write newEnvironment;
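Reading the listing back as an equation, for all B, v, p ∈ D and an environment ⟨q, x⟩ the specification computes JpK(JBK(v, q), x): the new host q is infected first, and control is then returned to the original host p, exactly as the description requires.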
Example 4.3.4 Consider the virus that, no matter what program it infects, propagates by infecting all other pieces of the environment whenever the infected program is run. Such a virus could be described by the following equation:
JqK(B, v, p, x) = JBK(v, x)
In the WHILE+ language we can make a more detailed version, where for all n ∈ N and x1, ..., xn ∈ D the environment is the list ⟨x1, ..., xn⟩:
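One possible rendering, a sketch under the assumption that the environment arrives as the list ⟨x1, ..., xn⟩ and that each component is to be infected through the propagation program B:

// q = (specification: infect every piece of the environment)
read B, v, p, x;
newEnv := nil;
while (x) do
  piece := hd x;
  infected := exec(B, v, piece);   // infect this component with v
  newEnv := cons infected newEnv;  // note: rebuilt in reverse order
  x := tl x;
end;
write newEnv;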
The construction of the padding program Pad uses the following auxiliary WHILE+ programs: a three-argument composition helper (written ○3 here), from which the composition program ○ is obtained by specialization; the first projection π1; and a pairing program t, from which the program ◇ is obtained:

// ○3 =
read p, q, x;
qx := exec(q, x);
out := exec(p, qx);
write out;

// ○ =
read p, q;
comp := spec(○3, p, q);
write comp;

// π1 =
read x;
p := hd x;
write p;

// t =
read q, y, z;
e := exec(q, z);
out := cons e y;
write out;

// ◇ =
read q, y;
out := spec(t, q, y);
write out;
And finally we obtain the Pad program:

// Pad =
read q, y;
diam := exec(◇, q, y);
out := exec(○, π1, diam);
write out;
In particular, for all q, y, x ∈ D:
JJPadK(q, y)K(x) = Jπ1K(JJ◇K(q, y)K(x)) = Jπ1K(⟨JqK(x), y⟩) = JqK(x),
so JJPadK(q, y)K = JqK, while the code JPadK(q, y) changes with y. ◻
Again, just as with monomorphic blueprint viruses, we want to computably find evolving distributions given a specification q.
Example 4.4.1 (Mutating LoveLetter) Let's revisit the LoveLetter virus from Example 4.3.1. This is a virus that overwrites all files in the environment and sends copies of itself to all the contacts. In this revisited form, the code sent to the contacts is not an exact copy of the virus, but a mutated code. Recall the Pad program from Section 4.4, where JJPadK(q, y)K = JqK for all q, y ∈ D. This will work as our “mutator”.
EvolvingLoveLetterSpecificationProgram :=

read ev, i, mb, x;
d := hd x; @bk := tl x;   // the files and the address book
newFiles := nil;          // build the new file structure
virus := exec(ev, i);     // the actual virus of generation i
while (d) do
  d := tl d;  /* We do not care about the original file, we
                 just need to overwrite every one. */
  newFiles := cons virus newFiles;
end;
// Create the next version of the virus.
nextKey := cons nil i;
newVirus := exec(Pad, virus, nextKey);
// Loop to modify the mailbox.
while (@bk) do
  @ := hd @bk;
  newMail := cons @ newVirus;
  mb := cons newMail mb;
  @bk := tl @bk;
end;
/* The new environment is made up of the modified outbox and
   the overwritten files. */
newEnvironment := cons mb newFiles;
write newEnvironment;
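The mutation step leans on the Pad property recalled above: JJPadK(virus, nextKey)K = JvirusK, so every copy mailed out behaves exactly like the current virus while its code varies with the key, changing the virus's signature from one generation to the next.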
Using an evolving blueprint virus distribution engine cv, we can transform this specification program into the code of the corresponding evolving blueprint virus.
Proof. Define a program r′ such that for all y, z, i, p, x ∈ D, Jr′K(z, i, ⟨y, p⟩, x) = JqK(z, y, i, p, x). By Theorem 2.2.5 there exists a nested explicit fixed point eB for r′, with ev obtained analogously, so that the corresponding fixed point equations hold. From these two equations we conclude that eB, ev is an evolving smith virus specified by q. ◻
As with the previous cases, we want to find evolving smith viruses computably
from their specification functions.
Definition 4.10 (Evolving smith virus distribution engine) An evolving smith virus
distribution engine is a pair cB , cv such that for all programs q, JcB K(q), Jcv K(q) is
an evolving smith virus specified by q.
Theorem 4.4.5 (Existence of evolving smith virus distribution engines)
There exists an evolving smith virus distribution engine.
Proof. As is routine by this point, we use the fact that we can compute nested and regular explicit fixed points. Let T be a program such that for all y, z, i, p, x ∈ D, JTK(z, i, ⟨y, p⟩, x) = ⟨z, y, i, p, x⟩. Let r be a program such that for all q ∈ D, JrK(q) = J○K(q, T). Define cB so that for all q ∈ D, eB = JcBK(q) is a nested explicit fixed point for the program JrK(q), so that the nested fixed point equations hold for all y, i, p, x ∈ D. Defining cv analogously, if, for some q ∈ D, ev = JcvK(q), the corresponding equations hold for all i, p, x ∈ D, and so JcBK(q), JcvK(q) is an evolving smith virus specified by q, for all q ∈ D. ◻
Example 4.4.2 (Revisiting the Parasitic Virus) Recall the original parasitic virus from Example 4.3.3, given by the equations
JvK(p, q, x) = JJBK(v, p)K(q, x) and JvK(p, q, x) = JpK(JBK(v, q), x).
We modify it so that the infection of a new host q is done with a virus of the “next generation”. Fix some element 1 ∈ D, and use the notation i + 1 = ⟨i, 1⟩ for all i ∈ D; i + 1 gives us a way of changing generations. Using the notation vi = JevK(i) and Bi = JeBK(i), a mutating parasitic virus is defined by the equation:
JviK(p, q, x) = ⟨JBiK(JPadK(vi+1, i), q), x⟩
In WHILE+, EvolvingParasiticVirusSpecification =

// EvolvingParasiticVirusSpecification =
read eB, ev, i, p, d;
// d is really a tuple ⟨q, x⟩
q := hd d; x := tl d;
nextIndex := cons i nil;            // the index i + 1 (taking 1 = nil)
Bi := exec(eB, i);                  // propagation program of generation i
vNext := exec(ev, nextIndex);       // the virus of generation i + 1
mutation := exec(Pad, vNext, i);    // its mutated code
infected_form := exec(Bi, mutation, q);
newEnvironment := cons infected_form x;
write newEnvironment;
For an Adleman virus A, each infected form JAK(p) either injures, behaving the same over every host, so that for all q and x,
JJAK(p)K(x) = JJAK(q)K(x),
or imitates its host:
JJAK(p)K(x) = JpK(x).
A virus v in the sense of Definition 4.1 propagates like an Adleman virus A when, for all p, x ∈ D:
JvK(p, x) = JJAK(p)K(x).
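For intuition, the imitation clause is satisfied, degenerately, by the program that returns its host unchanged; a minimal WHILE+ sketch (our own illustration, not taken from [5]):

// A = (a degenerate Adleman "virus" that only imitates)
read p;
write p;   // JAK(p) = p, hence JJAK(p)K(x) = JpK(x)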
In fact, we can easily prove something a bit stronger: that we can pass from Adleman's viruses to Bonfante's viruses computably.
Proposition 4.5.3 ([5], Not every virus propagates like an Adleman virus) There exists a virus v (in the sense of Definition 4.1) such that there is no program A ∈ D such that A is an Adleman virus and JvK(p, x) = JJAK(p)K(x) for all p, x ∈ D.
Proof. Let θ ∈ D be a program such that for all x, y, z ∈ D, JθK(x, y, z) = ⟨y, x, z⟩. Define a virus v such that for all p, q, r, d ∈ D,
Since r ≠ q, we have r = λ(q) ≠ q, and so it must be that r = λ(q) = JAK(q). This should hold for all q ≠ r, which is clearly a contradiction, since we could pick a third program t different from both r and q, and we would have r = λ(q) = t. ◻
The virus presented above is called the Wagger virus by Bonfante et al.
Closing Comments
We have studied a classification of viruses by Bonfante et al. [6], seeing in the process how their approach to formal computer virology relates closely to the study of recursion theorems: they play a large part in the construction of different kinds of viruses, and at the same time help us capture key viral properties such as self-replication and mutation. Through examples like the Wagger virus we see how this definition encompasses viral cases not captured by previous models, most notably Adleman's [2], which remains a benchmark in the formalization of computer viruses to this day. The generality of the definition comes at a cost, as any computable function is a virus with respect to the propagation function Js11K. This generality poses a problem for the detection of viruses as well, as shown in the last section of [5], where, for example, the following result is presented:
3. Results: To date, there seem to be only vague results regarding Bonfante's viruses. There are some decidability theorems in [5] which, although interesting and by no means trivial, contribute little in practical terms, as they center on the recursiveness or non-recursiveness of sets of viruses with respect to very particular, but not viral (in the practical sense), propagation functions.
We think that these three problems have a common root: the detachment of Bonfante's model from the practical side of computer viruses. We believe more focus should be put on connecting the formal virus model with the real-world definition, in the hope of obtaining more concrete, relevant results on the subject or, at worst, of concluding that this formal model is not appropriate, learning from the experience, and continuing the search for a suitable formal definition of computer viruses.
Chapter 5
Conclusion
We have made a thorough study of the classic theory leading up to Kleene's Second Recursion Theorem (Theorem 1.3.2). We have seen how the simple statement
ϕk(y) = f(k, y)
has a wide array of applications and implications. In Section 2.2 we showed some seemingly pathological implications of the SRT, such as the existence of self-writing programs. In Chapter 3 we studied the implementation of the SRT in two completely different computation models, following Jones' work [11], including the very limited TINY language, where many of those “pathological” results still hold. In Chapter 2 we worked our way up to the definition of acceptable programming languages, for which we showed many results regarding fixed points of computable functions, and the ability to compute such fixed points given any program. These recursion theorems proved to be of great relevance in Chapter 4, where we studied a formalization of computer viruses developed by Bonfante et al. [4], and proved interesting results regarding the construction of viruses by the application of recursion theorems. There is a wide array of applications based on Kleene's Second Recursion Theorem, both theoretical and practical, as Moschovakis shows in [13]. Indeed, as the aforementioned author states, Kleene's Second Recursion Theorem is amazing.
Bibliography
[4] Guillaume Bonfante, Matthieu Kaczmarek, and Jean-Yves Marion. Toward an abstract computer virology. In Theoretical Aspects of Computing – ICTAC 2005, pages 579–593. Springer, 2005.
[5] Guillaume Bonfante, Matthieu Kaczmarek, and Jean-Yves Marion. On abstract computer virology from a recursion theoretic perspective. Journal in Computer Virology, 1(3-4):45–54, 2006.
[9] Jean H. Gallier. Lecture notes in elementary recursive function theory, 2010.
[11] Neil D. Jones. A Swiss pocket knife for computability. arXiv preprint arXiv:1309.5128, 2013.
[12] Yves Marcoux. Composition is almost (but not quite) as good as s-1-1. Theoretical Computer Science, 120(2):169–195, 1993.
[13] Yiannis N. Moschovakis. Kleene's amazing second recursion theorem. The Bulletin of Symbolic Logic, 16(2):189–239, 2010.
[14] Lawrence S. Moss. Recursion theorems and self-replication via text register machine programs. Bulletin of the EATCS, 89:171–182, 2006.
[15] Hartley Rogers. Theory of Recursive Functions and Effective Computability, volume 5. McGraw-Hill, New York, 1967.