
Kleene’s Second Recursion Theorem:

Overview, Implementation and


Applications to Computer Virology

Santiago Cerón Uribe

A thesis presented in partial fulfillment of the requirements for the degree of


Mathematician

Advisor: Maricarmen Martínez


Departamento de Matemáticas
Universidad de los Andes
Bogotá, Colombia
July 27, 2016
Kleene’s Second Recursion Theorem: Overview,
Implementation and Applications to Computer Virology

Santiago Cerón Uribe

Submitted for the degree of Mathematician


July 27, 2016

Abstract

This self-contained document presents the theory leading up to, involving, and derived from Kleene's Second Recursion Theorem (SRT), which states that for any partial recursive function f(x, y) there exists a "fixed point" k such that f(k, y) = ϕk(y), where ϕk is the k-th partial recursive function as indexed via Gödel Numberings of Turing Machines. Following a classic introduction to the theory of computability via partial recursive functions on the natural numbers and Turing Machines, we present an array of recursion and computability results (including the SRT) for acceptable programming languages. Following Jones [10], we show implementations of the SRT in the particular computation models TINY ([3]) and 1# ([14]). Finally, we give an overview of the work of Bonfante et al. ([4], [5], [6]) on abstract computer virology, and show how the SRT and other recursion theorems are associated with the study and construction of computer viruses.
Contents

Abstract

1 A Biased Introduction to Computability
  1.1 Recursive and Computable functions
    1.1.1 Recursive Functions
    1.1.2 Effectively Computable Functions
  1.2 Computation Models
    1.2.1 Unlimited Register Machines
    1.2.2 Turing Machines
  1.3 Coding and the theorems of Recursion
    1.3.1 Gödel Numberings
    1.3.2 Important Results about Turing Machines
    1.3.3 The Church-Turing Thesis

2 Computability and Recursion on General Sets
  2.1 Acceptable Programming Languages
    2.1.1 Preliminaries
    2.1.2 Acceptable Indexings
    2.1.3 Computability with non-numerical inputs
    2.1.4 Acceptable Programming Languages
  2.2 Results for acceptable programming languages

3 Swiss Pocket Knife
  3.1 The TINY language for tree-structured data
    3.1.1 Tree-structured data
    3.1.2 TINY grammar and syntax
    3.1.3 Semantics of TINY
    3.1.4 TINY programs as data
    3.1.5 SRT in TINY
      s-1-1 in TINY
      Implementing SRT
  3.2 1#, a language for text register machines
    3.2.1 1# grammar and syntax
    3.2.2 Semantics of 1#
    3.2.3 1# programs as data
    3.2.4 SRT with s11
      s-1-1 in 1#
      First proof of SRT
      An implementation of SRT: A computable fixed point finder
    3.2.5 Moss's Proof of SRT

4 The SRT in Computer Virology
  4.1 Preliminaries
    4.1.1 The WHILE+ Language
      Semantics of WHILE+
      WHILE+ as an Acceptable Programming Language
  4.2 Defining Computer Viruses
  4.3 Monomorphic Viruses
    4.3.1 Blueprint duplication and distribution engines
    4.3.2 Smith Viruses
  4.4 Polymorphic Viruses
    4.4.1 Evolving Blueprint Virus
    4.4.2 Evolving Smith Virus
  4.5 Adleman's Viruses

5 Conclusion
Chapter 1

A Biased Introduction to Computability

As theoretical background, we introduce some basic concepts of computability theory that lead to the statement (and proof) of Kleene's Second Recursion Theorem. We follow very closely Barry Cooper's presentation in Computability Theory [8]. Some notation and explanations of concepts were taken from Xavier Caicedo's Elementos de Lógica y Computabilidad [7]. Sipser's Introduction to the Theory of Computation [16] and Rogers's Theory of Recursive Functions and Effective Computability [15] have also been used as general references.

1.1 Recursive and Computable functions


1.1.1 Recursive Functions
Definition 1.1 (Partial and Total Functions) A partial function f ∶ A → B is a relation f ⊆ A × B such that for each x ∈ A either:

• There exists a unique y ∈ B such that (x, y) ∈ f . In this case, we say that f is defined at x, denoted f(x) ↓, and that f evaluated at x is y (notation: f(x) = y).

• There is no y ∈ B such that (x, y) ∈ f . In this case, we say f is not defined at x and we write f(x) ↑.

A total function is a partial function f ∶ A → B such that for every x ∈ A, f(x) ↓. In this document, we refer to partial functions simply as functions.

Definition 1.2 (Primitive Recursive Functions)

1. The basic primitive recursive functions are:

(a) The zero function defined by

Z(x) = 0, ∀x ∈ N,

(b) The successor function defined by

S(x) = x + 1, ∀x ∈ N,

(c) The projection functions P^k_i ∶ N^k → N defined by

P^k_i(x1, ..., xk) = xi, for k ≥ 1 and i = 1, ..., k.

2. A primitive recursive function is either a basic function, or a function that can be obtained from basic functions by a finite sequence of applications of the following rules:

(a) Composition: If g ∶ N^m → N and hi ∶ N^k → N for i = 1, ..., m are primitive recursive, and f(x⃗) = g(h1(x⃗), ..., hm(x⃗)) for all x⃗ ∈ N^k, then f is primitive recursive.

(b) Primitive recursion: If g ∶ N^k → N and h ∶ N^{k+2} → N are primitive recursive, and f ∶ N^{k+1} → N is defined by

f(x⃗, 0) = g(x⃗) for all x⃗ ∈ N^k
f(x⃗, y + 1) = h(x⃗, y, f(x⃗, y)),

then f is primitive recursive.

The set of all primitive recursive functions will be denoted by PRIM hereafter.

Observation 1.1.1 The following functions can be proven to be primitive recursive:

1. Addition: x + y.

2. Multiplication: x × y.

3. Exponentiation: x^y.

4. Recursive difference:

m −̇ n = m − n if m ≥ n, and m −̇ n = 0 otherwise.
5. Absolute Difference: ∣x − y∣.

6. Order functions:

(a) eq(x, y) = 1 if x = y, eq(x, y) = 0 otherwise.


(b) The analogous functions for ≤, <, ≥ and >.

7. sg(n) = 0 if n = 0, sg(n) = 1 otherwise.

8. The complement s̄g(n) = 0 if n ≠ 0, s̄g(n) = 1 otherwise.


9. Bounded sums: If f(x⃗, y) ∈ PRIM, then the function h(x⃗, z) = ∑_{y=0}^{z} f(x⃗, y) is primitive recursive.

10. The "divides" function:

m∣n = 1 if m divides n, and m∣n = 0 otherwise.

Definition 1.3 (The Minimalisation operator µ) Let g ∶ N^{k+1} → N. For x⃗ ∈ N^k, define the operation µ by

µz[g(x⃗, z) = 0] = z0 ⇔def g(x⃗, z0) = 0 and for all z < z0, g(x⃗, z) ↓ ≠ 0.

In other words, µ works as a search operator: for each x⃗ ∈ N^k, it computes the minimum z such that g(x⃗, z) = 0. This search may or may not be successful, since the set {z ∈ N ∶ g(x⃗, z) = 0} might be empty for some x⃗. With the µ operator we can define a new function f ∶ N^k → N, where f(x⃗) = µz[g(x⃗, z) = 0]. The Minimalisation rule says that if g is partial recursive, then so is f.

Definition 1.4 (Partial Recursive functions) A function f ∶ N^k → N is partial recursive if it belongs to PRIM, or if it can be defined by applying a finite sequence of Composition, Primitive Recursion and Minimalisation to partial recursive functions. The class of all partial recursive functions will be denoted by PREC. If f ∈ PREC is total, it is said to be recursive.

Definition 1.5 ((Primitive) Recursive sets) A set S ⊆ Nk is said to be (primitive)


recursive if its characteristic function χS ∶ Nk → {0, 1} is (primitive) recursive.

Example 1.1.1 The set P of all prime numbers is primitive recursive. This follows from the definition of a prime number (n > 1 is prime if it has exactly 2 divisors) and from the fact that the function D(n) ∶= "the number of divisors of n" is primitive recursive. The latter is easily proven by noting that D(n) = ∑_{i=0}^{n} i∣n. With this we have χP(n) = s̄g(D(n) −̇ 2) × sg(n −̇ 1).
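As a quick sanity check, this construction can be transcribed directly into executable form. The following Python sketch is ours (the names sg, sgbar, monus and divides are illustrative, and D follows the bounded sum literally):

    def sg(n):     return 0 if n == 0 else 1           # sg, from item 7
    def sgbar(n):  return 1 if n == 0 else 0           # its complement s̄g
    def monus(m, n): return m - n if m >= n else 0     # recursive difference

    def divides(m, n):          # the "divides" function m|n
        return 1 if m != 0 and n % m == 0 else 0

    def D(n):                   # number of divisors, as the bounded sum of i|n
        return sum(divides(i, n) for i in range(n + 1))

    def chi_P(n):               # characteristic function of the primes
        return sgbar(monus(D(n), 2)) * sg(monus(n, 1))

    print([n for n in range(20) if chi_P(n)])   # [2, 3, 5, 7, 11, 13, 17, 19]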

Definition 1.6 (A Tuple Encoding Scheme) A tuple encoding scheme is a set of recursive functions ⟨_⟩ ∶ N^n → N that to every tuple x⃗ ∈ N^n assign a natural number ⟨x⃗⟩ such that:

1. There exist partial recursive projections π1, π2, ... ∶ N → N such that πi(⟨x1, ..., xi, ..., xn⟩) = xi for all x1, ..., xn ∈ N, 1 ≤ i ≤ n. Moreover, there exists a partial recursive function π ∶ N^2 → N such that for all i, x ∈ N, π(i, x) = πi(x).

2. Given a natural number y, there exist at most one n ∈ N and one tuple x⃗ ∈ N^n such that ⟨x⃗⟩ = y.

3. The set E = {y ∈ N ∶ ∃n ∈ N ∃x⃗ ∈ N^n ∶ ⟨x⃗⟩ = y} is recursive.

4. The function L ∶ N → N defined by

L(y) = k > 0 ⇔def ∃x⃗ ∈ N^k ∶ ⟨x⃗⟩ = y, and L(y) = 0 otherwise,

is partial recursive.

For a tuple x⃗ ∈ N^n, we say that ⟨x⃗⟩ is the "code" or "encoding" of x⃗. We now define explicitly a Tuple Encoding Scheme, as it will be of great relevance in our later work (Section 1.3.1).
Proposition 1.1.1
There exists a Tuple Encoding Scheme.
Proof.

1. Let p ∶ N → N be the prime numbering function, that is, p(i) ∶= "the i-th prime number". This function is recursive, as it can be defined by

p(0) = 2
p(n + 1) = µz[(s̄g(z −̇ p(n)) + s̄g(χP(z))) = 0],

where P is the set of prime numbers. Let's examine the second part of the definition:

s̄g(z −̇ p(n)) = 0 ⇔ z −̇ p(n) ≠ 0 ⇔ z > p(n)
s̄g(χP(z)) = 0 ⇔ χP(z) ≠ 0 ⇔ z ∈ P

So the expression says "p(n + 1) is the minimum z such that z is prime and strictly greater than p(n)", which is exactly what we need.
Associated to p we define the function p̄(x) ∶= "the number of primes less than x". This function is readily defined by

p̄(0) = 0
p̄(x + 1) = p̄(x) + χP(x).

Notice that for all i ∈ N, p̄(p(i)) = i.

2. Now for n ∈ N define

⟨x1, ..., xn⟩ = ∏_{i=1}^{n} p(i − 1)^{xi + 1}.

Clearly ⟨_⟩ is recursive for all n ∈ N.

3. Consider, for y ∈ N, the function

mp(y) = µz[(s̄g(χP(z)) + sg(z∣y)) = 0].

Then mp is recursive and mp(y) is the minimum prime that does not divide y. If y is a correctly encoded tuple, this means that all primes less than mp(y) should divide y. This is readily checked by the functions

pd(y) = ∑_{i=0}^{p̄(mp(y))} p(i)∣y ("the number of prime divisors of y")

ξ(y) = s̄g(∣pd(y) − p̄(mp(y))∣ + 2 −̇ y).

We can see that ξ(y) = 1 if and only if y ≥ 2 and ALL of the primes less than mp(y) divide y, and ξ(y) = 0 otherwise. This is a necessary and sufficient condition for y to be the encoding of some tuple, so χE = ξ and therefore E is a recursive set. The function L is now easily defined as well, by setting

L(y) = p̄(mp(y)) if y ∈ E, and L(y) = 0 otherwise.

Finally, defining ord_b(y) ∶= µz[(b^z ∣ y) = 0] as the minimum z such that b^z does not divide y, define for 1 ≤ i ≤ n the projection πi by

π(i, y) = πi(y) = ord_{p(i−1)}(y) −̇ 2 if y ∈ E, and π(i, y) = πi(y) ↑ otherwise.

(For y ∈ E the exponent of p(i − 1) in y is xi + 1, so the least z with p(i − 1)^z not dividing y is xi + 2, and the −̇ 2 recovers xi.) ◻
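The scheme just constructed is easy to experiment with. Below is a direct Python transcription (ours); it uses plain loops and trial division instead of the µ-operator definitions, so it sketches the behaviour of ⟨_⟩, L and πi rather than their recursive constructions:

    def p(i):                       # the i-th prime, 0-based: p(0) = 2
        n, found = 1, -1
        while found < i:
            n += 1
            if all(n % d for d in range(2, n)):
                found += 1
        return n

    def encode(xs):                 # <x1,...,xn> = product of p(i-1)^(xi + 1)
        c = 1
        for i, x in enumerate(xs):
            c *= p(i) ** (x + 1)
        return c

    def length(y):                  # L(y), assuming y is a valid code in E
        n = 0
        while y >= 2 and y % p(n) == 0:
            n += 1
        return n

    def pi(i, y):                   # projection pi_i (1-based), for y in E:
        q, z = p(i - 1), 0          # the exponent of p(i-1) in y is x_i + 1
        while y % q == 0:
            y, z = y // q, z + 1
        return z - 1

    y = encode((3, 0, 5))
    print(y, length(y), [pi(i, y) for i in (1, 2, 3)])   # 750000 3 [3, 0, 5]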

1.1.2 Effectively Computable Functions

Here we use the intuitive notion of what an algorithm is, namely a set of instructions described in some language that prescribe how to systematically perform a task.

Definition 1.7 (Effectively Computable Functions) A partial function f ∶ N^k → N is said to be effectively computable if there is some algorithm that, when executed with input x⃗ ∈ N^k, eventually stops and outputs f(x⃗) whenever f(x⃗) ↓, and never stops whenever f(x⃗) ↑. We will denote the class of effectively computable functions by COMP.

Definition 1.8 (Effectively Computable set) A set S ⊆ Nk is effectively computable


if its characteristic function is effectively computable.
In the early 1930s, Alonzo Church conjectured that any sufficiently general model
of computable functions gives the same class of functions, PREC.

Church’s Thesis PREC = COMP. A function is partial recursive if and only if it is


effectively computable.

1.2 Computation Models


We now introduce two classic models of computation that will later on be relevant to
our research. The idea behind them is that they provide an abstract approximation
to the intuition of automated computation of PREC functions.

Observation 1.2.1 One way of thinking about the automated computation of func-
tions is through the use of machines, as presented in the models below. There are
other intuitive ways of approaching such concept, such as programming languages,
as discussed in Section 2.1.4.

1.2.1 Unlimited Register Machines


Definition 1.9 (Unlimited Register Machine (URM) programs) A URM is made up
of registers R1 , R2 , ... that store natural numbers r1 , r2 , .... A URM Program is a
finite sequence of instructions (I1 , ..., In ) that are executed sequentially starting at
I1 , each instruction having one of the following basic types:

1. Zero [Z(x)]: Stores 0 in Rx . (In other words, makes rx = 0).

2. Successor [S(x)]: Makes rx = rx + 1.

3. Transfer [T (y, x)]: Makes rx = ry (the content of Ry is transferred to Rx ).

4. Jump [J(x, y, q)]: Suppose Ik = J(x, y, q). If rx = ry , then the next instruction
to be executed is Ik+q . Else, execute the next (Ik+1 ) instruction.

URM Program Execution and Halting Let P = (I1 , ..., In ) be a URM program
where Ii is a basic instruction for i = 1, ..., n.

• Executing a program requires an initial state of the URM, which will be called
input and can be expressed by a tuple (r1 , ..., rk ) ∈ Nk and means that in the
URM, the content of Ri is ri for i = 1, ..., k. As a convention, we will assume
that Rm initially stores 0 for m > k. The execution of P then proceeds by sequentially following its instructions, starting from I1, modifying the contents of the registers and making jumps appropriately. The execution stops when the program tries to go to a nonexistent instruction (whether after the last instruction or before the first).

• To consider that P stops correctly, we will impose the further condition that
the program stops by trying to execute the instruction right after In . This will
be called halting. More formally, an execution of program P halts if one of the
following happens at some point:

1. The program goes to In and In is a Zero, Successor or Transfer instruction.


In this case the instruction is executed and the program halts.

2. The program goes to In , In is of the form J(x, y, q) and rx ≠ ry .


3. The program goes to some Im (1 ≤ m ≤ n) of the form J(x, y, q) where
rx = ry and q = n + 1 − m.

Examples:

1. Program (J(1, 1, 1)) does absolutely nothing and halts.

2. Programs (J(1, 1, −1)) and (J(1, 1, 2)) also do nothing, but stop improperly.

3. Program (J(1, 1, 1), J(1, 1, −1)) does nothing but jump back and forth between its two instructions, causing an infinite loop, so execution never stops.
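These definitions are concrete enough to animate. The following small interpreter is our own illustrative Python sketch (not part of the source material): instructions are tuples such as ('J', 1, 1, 1), registers are 1-indexed, and the distinction between halting properly and stopping improperly follows the convention above:

    from collections import defaultdict

    def run_urm(prog, inputs, max_steps=100_000):
        regs = defaultdict(int)                  # unused registers hold 0
        for i, v in enumerate(inputs, start=1):
            regs[i] = v
        k = 1                                    # next instruction (1-based)
        for _ in range(max_steps):
            if k == len(prog) + 1:
                return regs[1]                   # halted properly: output in R1
            if not 1 <= k <= len(prog):
                return None                      # stopped improperly
            ins = prog[k - 1]
            if ins[0] == 'Z':   regs[ins[1]] = 0;            k += 1
            elif ins[0] == 'S': regs[ins[1]] += 1;           k += 1
            elif ins[0] == 'T': regs[ins[2]] = regs[ins[1]]; k += 1
            else:                                # ('J', x, y, q): relative jump
                x, y, q = ins[1:]
                k = k + q if regs[x] == regs[y] else k + 1
        raise TimeoutError("no halt within step bound")   # e.g. example 3

    print(run_urm([('J', 1, 1, 1)], [7]))   # example 1: halts properly, r1 = 7
    print(run_urm([('J', 1, 1, 2)], [7]))   # example 2: stops improperly, None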

Definition 1.10 (URM computability) For every k ∈ N+, a URM program P defines a partial function JP Kk ∶ N^k → N as follows:

JP Kk(x⃗) = z ⇔def the execution of P with initial state x⃗ eventually halts with z in R1.

A function f ∶ Nk → N is said to be URM-computable if there exists a URM program


P such that f = JP Kk . For ease of notation, we omit the k in JP Kk if it can be deduced
from the context.
The set of all URM-computable functions will be denoted by URMCOMP.

Properties of URMs and URM computability

Now that we have the basic definitions about URMs, we explore some further prop-
erties to show that the functions that are URM-computable are exactly the PREC
functions.

Observation 1.2.2 (Basic PRIM functions are in URMCOMP) The programs that exe-
cute the basic PRIM functions are straightforward to write:

1. Zero function z(x) = 0: If P = (Z(1)) then JP K(x) = 0 ∀x ∈ N.

2. Successor function S(x) = x + 1: If P = (S(1)) then JP K = S.

3. Projection functions P^k_i(x1, ..., xk) = xi: for i ∈ N∗, the program (T (i, 1)) computes P^k_i for all k ∈ N∗.

Definition 1.11 (URM Program Standard Form) Let P = (I1 , ..., In ) be a URM pro-
gram. If for every jump instruction Im = J(x, y, q) we have 1 ≤ m + q ≤ n + 1 then we
say that P is in standard form.
From now on we will only deal with URM programs in standard form. This does
not make us lose any generality, as we can easily find for any URM program P a
program P ′ in standard form such that JP Kk = JP ′ Kk for all k ∈ N.

Definition 1.12 (Some auxiliary definitions) The following definitions will come in
handy when proving theorems about URM programs.

1. URM Program Join: Given P = (I1 , ..., In ) and Q = (J1 , ..., Jm ) URM programs,
define the join of P and Q by P ∣Q = (I1 , ..., In , J1 , ..., Jm ). Since we are
assuming the programs are in standard form, we know that during the execution
of a program, there will be no jumps to the middle of another one. (The only
“cross-program” jump allowed in P ∣Q is jumping during the execution of P to
the first instruction of Q).

2. ρ(P ): For P a URM program, define ρ(P ) to be the largest index k of a register Rk used by P .

3. ∣P ∣: The length (number of instructions) of program P .

4. For b ≥ a, Z[a; b] is a program that cleans registers from Ra to Rb . Namely,


we can take Z[a; b] = (Z(a), Z(a + 1), ..., Z(b)).

5. Shifting a program: Given a program P and some l ∈ N, we define the shift of P by l as the program P >> l that contains the same instructions as P , except that every time some instruction in P uses register Rm, the respective instruction in P >> l uses register Rm+l. For example, let P = (Z(1), S(2), T (3, 4), J(3, 4, −1)). Then P >> 2 = (Z(3), S(4), T (5, 6), J(5, 6, −1)).

Lemma 1.2.1
Let g ∶ N2 → N and h1 , h2 ∶ N → N be URM-computable functions. Then f (x) =
g(h1 (x), h2 (x)) is URM-computable.
Proof. Let Pg, Ph1, Ph2 be URM programs that compute g, h1 and h2 respectively. Let L = max{ρ(Pg), ρ(Ph1), ρ(Ph2), 2}. Notice that Ph1 only uses registers in [1; L] and Ph2 >> L only uses registers in [L + 1; 2L]. Now, consider the program

Q = (T (1, L + 1))∣Ph1 ∣(Ph2 >> L)

Notice that if executed with some input x ∈ N, Q halts if and only if h1(x) ↓ and h2(x) ↓. Moreover, if it does, then at the end of the execution R1 stores h1(x) and RL+1 stores h2(x). With this in mind, we can see that the program

Pf = Q∣(T (L + 1, 2))∣Z[3; 2L]∣Pg

indeed computes f, that is, f = JPf K1. ◻
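To see the construction at work, we can replay it on the interpreter sketched earlier. In the fragment below (ours), we take h1 = h2 = S and a four-instruction addition program for g, so Pf should compute f(x) = (x + 1) + (x + 1); note that because jumps are relative, the join P∣Q really is plain concatenation:

    def shift(prog, l):            # P >> l: rename every register Rm to Rm+l
        out = []
        for ins in prog:
            if ins[0] in ('Z', 'S'): out.append((ins[0], ins[1] + l))
            elif ins[0] == 'T':      out.append(('T', ins[1] + l, ins[2] + l))
            else:                    out.append(('J', ins[1] + l, ins[2] + l, ins[3]))
        return out

    P_S   = [('S', 1)]                                             # h(x) = x + 1
    P_add = [('J', 3, 2, 4), ('S', 1), ('S', 3), ('J', 1, 1, -3)]  # g(x, y) = x + y
    L = 3                                  # max of the rho values, as in the proof

    Q  = [('T', 1, L + 1)] + P_S + shift(P_S, L)
    Pf = Q + [('T', L + 1, 2)] + [('Z', r) for r in range(3, 2 * L + 1)] + P_add
    print(run_urm(Pf, [5]))                # (5 + 1) + (5 + 1) = 12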

Proposition 1.2.2 (URMCOMP is closed under Composition)


If g ∶ N^m → N and hi ∶ N^k → N for i = 1, ..., m are URM-computable, and f(x⃗) = g(h1(x⃗), ..., hm(x⃗)), then f is URM-computable.

Proof idea. The previous lemma proves a particular case. For the general scenario, where m need not be 2 and k need not be 1, the proof is similar. The idea is to simulate m URMs using only one URM, so that we can execute Ph1, ..., Phm "independently", making them use disjoint sets of registers in their executions. (So the L we defined, for example, would be L = max{ρ(Pg), ρ(Ph1), ..., ρ(Phm), k, m}.) ◻

Lemma 1.2.3
If g ∶ N → N and h ∶ N2 → N are URM-computable then if f ∶ N2 → N is defined by

f (x, 0) = g(x)
f (x, y + 1) = h(x, y, f (x, y))

f is URM-computable.
Proof. The idea is to store the original inputs in some registers, a counter that lets us know whether we have finished computing the function in another, and finally the last computed value of the function. We want to compute f(x, y). We will store x in R1 and y in R2. R3 will contain a number r3 indicating that R6 stores the number f(x, r3) (this condition should hold every time before we execute step 6 of the program described below). Let Pg and Ph be URM programs that compute g and h respectively. Let L = max{ρ(Pg), ρ(Ph)}. Then the following program Pf computes f:

1. (T (1, 4), T (2, 5)) (Copy input to R4 and R5 )

2. Pg >> 3 (Execute Pg shifted by 3, so that it takes its input from R4 and outputs its result in R4 )

3. J(2, 3, ∣Z[5; L + 3]∣ + 3 + ∣Ph∣ + 3) (This jump tells us whether we have finished the computation, with the result in R4 . If so, go to the final instruction; if not, continue execution.)

4. Z[5; L + 3] (Clean the registers).

5. (T (4, 6), T (1, 4), T (3, 5)) (Set up the registers to compute h. Now r4 = x, r5 = r3, r6 = f(x, r3 ))

6. Ph >> 3 (Execute Ph shifted by 3. The output should be in R4 ).

7. (S(3), J(1, 1, −(∣Ph∣ + 3 + ∣Z[5; L + 3]∣ + 2))) (Increase the counter and jump back to step (3) to check whether the computation is done.)

8. (T (4, 1)) (Computation is done; the result is in R4 and should be transferred to R1 before finishing.) ◻


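Stripped of the register bookkeeping, the program above implements the familiar iterative evaluation of primitive recursion. A compact Python rendering of that loop (ours):

    def prim_rec(g, h):
        # f(x, 0) = g(x);  f(x, y + 1) = h(x, y, f(x, y))
        def f(x, y):
            acc = g(x)                  # the value kept in R6, counter r3 = 0
            for r3 in range(y):
                acc = h(x, r3, acc)     # one pass through steps 4-7 above
            return acc
        return f

    mult = prim_rec(lambda x: 0, lambda x, y, fxy: fxy + x)   # multiplication
    print(mult(6, 7))                                         # 42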

Proposition 1.2.4 (URMCOMP is closed under Primitive Recursion)

If g ∶ N^k → N and h ∶ N^{k+2} → N are URM-computable, and f ∶ N^{k+1} → N is defined by

f(x⃗, 0) = g(x⃗) for all x⃗ ∈ N^k
f(x⃗, y + 1) = h(x⃗, y, f(x⃗, y)),

then f is URM-computable.
Proof idea. The previous lemma proves the case k = 1. The proof for the general case is analogous. The main ideas are the same: copy the input into another set of registers and calculate h there (via shifts), keeping a "counter" to know when the computation has completed. ◻
With the previous two propositions, and taking into account that the basic primitive recursive functions are trivially also URM-computable, we get the following result.
Theorem 1.2.5 (PRIM ⊆ URMCOMP)
Every primitive recursive function is URM-computable.

Lemma 1.2.6
Let g ∶ N2 → N be URM-computable. Then the function f ∶ N → N given by
f (x) = µm[g(x, m) = 0] is URM-computable.
Proof. We will store the inputs for g in R1 and R2. The program Pf described next computes f:

Pf = (T (1, 3), T (2, 4))
   ∣ Pg >> 2
   ∣ Z[4; ρ(Pg) + 2]
   ∣ J(3, 4, 3) (Jump to end)
   ∣ S(2)
   ∣ J(1, 1, −(2 + ∣Z[4; ρ(Pg) + 2]∣ + ∣Pg∣ + 2)) (Jump to beginning)
   ∣ T (2, 1) (Execution is done; the answer, the current value of the counter, is moved from R2 to R1)

The idea is to search one by one until we find the first y0 (stored in R2) such that g(x, y0) = 0. If no such y0 exists, then Pf never halts, which is precisely what we need. ◻
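In programming terms, Pf is exactly an unbounded search loop; this short Python sketch (ours) makes the possible divergence explicit:

    def mu(g, x):
        # least m with g(x, m) = 0; loops forever when no such m exists,
        # just as Pf never halts in that case
        m = 0
        while g(x, m) != 0:
            m += 1
        return m

    print(mu(lambda x, m: abs(m * m - x), 49))   # least m with m^2 = 49 -> 7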
Proposition 1.2.7 (URMCOMP is closed under Minimalisation)
Let g ∶ N^{k+1} → N be URM-computable. Then the function f ∶ N^k → N given by f(x⃗) = µm[g(x⃗, m) = 0] is URM-computable.
Proof idea. Again, the proof of the previous Lemma, which is the case k = 1 of this proposition, gives the ideas for the general case: copy the input into some other registers, compute g there, and then check whether the result is 0. If it is, execution is done; if not, try executing g with the next value of m. ◻
Now we have some even stronger information about URMCOMP.

Theorem 1.2.8 (PREC ⊆ URMCOMP)

Every partial recursive function is URM-computable.
In fact, the converse of Theorem 1.2.8 is also true, and so PREC = URMCOMP (the
idea behind this is discussed in 1.3.3). Since URM programs capture a concrete
algorithmic way of calculating functions, it follows that URMCOMP ⊆ COMP. Church’s
thesis states that PREC = COMP, and so we obtain the following version of the Thesis:

Church’s Thesis for URM programs URMCOMP = COMP

1.2.2 Turing Machines


Introduced by Alan Turing, and though harder to understand and implement than URMs, Turing Machines are one of the most standard models used to define computability. We deviate from the classical presentation of Turing Machines in favor of Cooper's "program-like" approach, which is specifically crafted for working with functions of natural numbers [8].

Definition 1.13 (Turing Machine) A Turing Machine consists of:

1. An infinite (linear, two-sided) tape, divided into cells. Each cell stores either a 1 or a 0.

2. Tape symbols 0 and 1, the symbols that can be stored in the cells.

3. A reading head which is able to move around the tape and read the symbols
contained in the cells, and write symbols in the cells.

4. A set of states q0 , q1 , .... At the start of any step of a computation, the reading
head is specified to be in one of these states.

5. Action symbols: Used by the program to tell the reading head what to do in
relation to its current cell

• L: Move left one cell.


• R: Move right one cell.
• 1: Print 1 in current cell.
• 0: Print 0 in current cell.

Observation 1.2.3 (More general Turing Machines) We can allow more symbols to be stored in the cells of a TM. Having defined a finite set of tape symbols {S0, ..., Sn}, there should be, for each i = 0, ..., n, an action symbol Si to print Si on the current cell.

Definition 1.14 (Quadruples and TM Programs) Let T be a Turing Machine. Quadruples are the atomic instructions of TM Programs. They are tuples of the form

Q = qi SAqj

where qi and qj are states, S is a tape symbol and A is an action symbol. This
quadruple expresses the instruction: If T is in state qi reading S, then perform
action A and change the internal state to qj .
A set X of quadruples is said to be consistent if

qi SAqj , qi SA′ qk ∈ X ⇒ A = A′ and qj = qk

A Turing Machine Program or TM Program is a finite, consistent set of quadruples.


Throughout this document, we may use the words Turing Machine (TM) Program and
just Turing Machine interchangeably. Whichever we use, we will be referring to a set
of quadruples P and a Turing Machine with a tape head that obeys the instructions
of P .
Just as we did with URMs, we now define how a TM Program is executed, and what
it means for it to halt.

TM Program Execution and Halting Let P = {Q1 , ..., Qn } be a TM Program.


Executing a program P requires an initial state of the TM, which will be called input
and can be expressed by a tuple x⃗ = (x1 , ..., xk ) ∈ Nk . We input x⃗ by writing the
string

1...1 (x1 + 1 times) 0 1...1 (x2 + 1 times) 0 ... 0 1...1 (xk + 1 times),
x1 + 1 times x2 + 1 times xk + 1 times

placing the tape head on the leftmost 1 and setting the internal state to q0 . We can
assume that the rest of the tape is filled with 0s. At any point during the execution of
the program, the tape head will be reading some tape symbol S in some internal state qi. If there is some Q ∈ P of the form qi S A qj, where A is an action symbol and qj an internal state, then Q is applied, performing action A and setting
the internal state to qj . Execution of the program continues as long as there are
applicable quadruples. If at some point of the computation there are no applicable
quadruples, we say the execution halts.
Analogous to URMs, we now define what it means for a function to be Turing-
computable.

Definition 1.15 (Turing computability) For each k ∈ N, a TM program P defines a partial function JP Kk ∶ N^k → N as follows:

• If executing P with input x⃗ eventually halts, and when it does there are z 1's left on the tape, then JP Kk(x⃗) = z.

• Otherwise, if the execution never halts, JP Kk(x⃗) ↑.

A function f ∶ N^k → N is Turing computable if there exists a TM program Pf such that f = JPf Kk. Again, we just write JP K if k can be deduced from the context.
The set of all Turing computable functions will be denoted by TMCOMP.
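As with URMs, these definitions can be animated directly. The sketch below (ours) stores a quadruple program as a dictionary mapping (state, symbol) to (action, next state), which makes consistency automatic, and implements the input convention of Section 1.2.2 together with the count-the-1s output of Definition 1.15:

    def run_tm(prog, xs, max_steps=100_000):
        tape, pos = {}, 0
        for x in xs:                      # write blocks of x+1 ones, 0-separated
            for _ in range(x + 1):
                tape[pos] = 1
                pos += 1
            pos += 1
        pos, state = 0, 0                 # head on the leftmost 1, state q0
        for _ in range(max_steps):
            sym = tape.get(pos, 0)
            if (state, sym) not in prog:  # no applicable quadruple: halt
                return sum(tape.values())     # number of 1s left on the tape
            act, state = prog[(state, sym)]
            if   act == 'L': pos -= 1
            elif act == 'R': pos += 1
            else:            tape[pos] = act  # print action: act is 0 or 1
        raise TimeoutError("no halt within step bound")

    # The zero program Z of Proposition 1.2.9, one quadruple q S A q' per entry:
    Z = {(0, 1): (0, 1), (1, 0): ('R', 0), (0, 0): ('R', 2), (2, 1): (0, 1)}
    print(run_tm(Z, [3]), run_tm(Z, [2, 5]))    # 0 0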

Proposition 1.2.9 (Basic PRIM functions are in TMCOMP)

1. There is a TM program Z such that for every k ∈ N and every x⃗ ∈ N^k, JZKk(x⃗) = 0.

2. There is a TM program S such that for every x ∈ N, JSK(x) = x + 1.

3. For every k ∈ N and every 1 ≤ i ≤ k, there is a TM program Π^k_i such that JΠ^k_i Kk(x1, ..., xk) = xi.
Proof.

1. Consider the program Z:

q0 1 0 q1, q1 0 R q0 (read a 1, replace it with a 0, and move right)
q0 0 R q2, q2 1 0 q1 (if a 0 is read, go right; if a 1 follows, go back to the loop)

It is easy to check that program Z computes the zero function.

2. The successor function is easy to compute: read 1s until a 0 is read, and there place an extra 1:

q0 1 R q0
q0 0 1 q1

3. For k ∈ N and i ≤ k, define Π^k_i as follows. For each j with 0 ≤ j < i − 1, add the quadruples:

q_{2j} 1 0 q_{2j+1}
q_{2j+1} 0 R q_{2j}
q_{2j} 0 R q_{2(j+1)}

We use i − 1 since the states are 0-based. The above subroutine "cleans up" the entries x1, ..., x_{i−1}. With our input convention, after running the above portion of the program, the state of the tape will be (xi, x_{i+1}, ..., xk) and the tape head will be reading the beginning of xi. Next, we need to read xi, leaving it untouched:

q_{2(i−1)} 1 R q_{2(i−1)}
q_{2(i−1)} 0 R q_{2i}

After reading xi, erase the rest of the tape:

q_{2i} 1 0 q_{2i+1}
q_{2i+1} 0 R q_{2i}
q_{2i} 0 R q_{2i+2}
q_{2i+2} 1 0 q_{2i+1}

◻


We would like to prove now that TMCOMP is closed under Composition, Primitive Recursion and Minimalisation. For these, we need a way of composing Turing
Machines. We face the problem that although the input is given in some standard
form, the output is not. To solve this, we state the following Proposition, proved as
Example 2.4.14 of [8]. Though important, the proof for this is purely technical. For
the following definitions, we will introduce the use of extra symbols, as discussed in
Observation 1.2.3. These symbols will not affect our definition of J_K.

Proposition 1.2.10 (TM programs with nice outputs)

Let T be a TM Program. Then there exists a TM Program T∗ such that if JT Kn(x⃗) ↓, then executing T∗ with input x⃗ halts with a block of JT Kn(x⃗) consecutive 1s followed by only 0s, and the reading head placed on the leftmost 1. Additionally, the tape head never moves left of its starting position, and there is a unique state qω such that every time T∗ halts, it has internal state qω, and there is no quadruple in T∗ with a state qk with k > ω. We will call qω the final state. If execution of T with input x⃗ does not halt, then execution of T∗ does not either. In other words, JT Kn = JT∗ Kn, but T∗ leaves a nicely formatted output.

Definition 1.16 (Some auxiliary TM program definitions) Let P = {Q1, Q2, ..., Qn} be a TM program.

1. P >> k: Let k ∈ N. For a quadruple Ql of the form qi S A qj, define Ql >> k = q_{i+k} S A q_{j+k}, and define the program P >> k = {Q1 >> k, ..., Qn >> k}.

2. ρ(P ): Define ρ(P ) to be the largest index k of an internal state qk used by P .

3. The Shift Right program, denoted Shift. Given an input

1...1 (n times) 0 ...

executing Shift leaves as result

# 1...1 (n times) 0 ...

with the tape head over the # symbol. It is easy to write this program.

Lemma 1.2.11
If g ∶ N^2 → N and h1, h2 ∶ N → N are TM computable, then f ∶ x ↦ g(h1(x), h2(x)) is TM computable.
Proof sketch. Let Pg , Ph1 , Ph2 compute g, h1 , h2 respectively. By Proposition 1.2.10
we can assume that these programs compute g,h1 and h2 with certain uniformity in
the output (as stated in the Proposition). Consider the following subroutines that
make up our desired program:

• Copy the input. For this, we use extra tape symbols a and #. For each 1 read, place an a, go to the end of the input, place a 1, and go back to the rightmost a, where the process starts again. At the end, the internal state should be q5, the tape will contain

1...1 (x + 1 times) # 1...1 (x + 1 times)

and the tape head will be on the leftmost 1 of the second block.

q0 1 a q1    q1 a R q1
q1 1 R q1    q1 0 R q2
q2 1 R q2    q2 0 1 q3
q3 1 L q3    q3 0 L q3
q3 a 1 q4    q4 1 R q0
q0 0 # q0
q0 # R q5

• Compute h2(x). Add the quadruples of the program Ph2 >> 5, and append an extra 1 at the end of the tape. At the end of this execution, the tape will look like

1...1 (x + 1 times) # 1...1 (h2(x) + 1 times)

and the tape head will be right after the # symbol.

• Compute h1 (x). Now we would like to compute h1 (x) using the input that we
copied. There are several things that should be taken into account here:

– This needs to happen right after the execution of Ph2 >> 5. Set M =
ρ(Ph2 >> 5). By Proposition 1.2.10, we can assume that execution always
halts on some fixed state qω2 ∈ [5, M ]. Add the quadruple

q_{ω2} 1 1 q_{M+1}

– We can now compute h1 by adding the quadruples of Ph1 >> (M + 1).


We face an additional problem: during the computation of h1 , the tape
head may go over the # symbol and damage the computation of h2 (x). To
avoid this, we need to shift to the right the block of h2 (x) + 1 consecutive
1s placed after the # symbol, every time the tape head reads a # in a
state between M + 1 and N , where N = ρ(Ph1 >> (M + 1)). This needs
to be done carefully, as the program needs to "remember" the state which
read the # symbol, to come back to it after the shift is complete.
At the end of this process, by adding the appropriate quadruples, the tape
will contain

1...1 (h1(x) + 1 times) # 1...1 (h2(x) + 1 times)

and the tape head will be placed on the leftmost 1, in some internal state q_{ω1}.

• Compute g. First, we have to change the # symbol to a 0. Let K be an index greater than all of the state indexes previously used. Start by adding the following quadruples to make the change:

q_{ω1} 1 a q_K      q_K 1 R q_K
q_K # 0 q_K        q_K 0 L q_{K+1}
q_{K+1} 1 L q_{K+1}   q_{K+1} a 1 q_{K+2}

After running the subroutine above, the tape will contain

1...1 (h1(x) + 1 times) 0 1...1 (h2(x) + 1 times),

the tape head will be over the leftmost 1 and the internal state will be q_{K+2}. Adding the quadruples of Pg >> (K + 2) completes the program. ◻


Proposition 1.2.12 (TMCOMP is closed under Composition)
If g ∶ N^m → N and hi ∶ N^k → N for i = 1, ..., m are Turing-computable, and f(x⃗) = g(h1(x⃗), ..., hm(x⃗)), then f is Turing-computable.

Proof idea. As in previous cases, we proved the particular case, where m = 2 and k = 1. The proof for the general case follows a similar idea: copy the input m times and then compute each hi separately on a delimited section of the tape. We use the extra symbol # to mark these limits. ◻
Notice that even when proving an "easy" particular case, and skipping some of the most grueling technical details, the construction in the proof above is complicated and difficult to follow. For the next propositions, we will give wordier explanations of the Turing Machines, hoping to provide enough detail. To write out complete descriptions, one just needs to break up the "word" arguments into actual quadruples. The ideas for constructing such quadruples are contained in the proof of Lemma 1.2.11.

Lemma 1.2.13
If g ∶ N → N and h ∶ N^3 → N are TM computable, then f ∶ N^2 → N defined by

f(x, 0) = g(x)
f(x, y + 1) = h(x, y, f(x, y))

is TM computable.
Proof sketch. The idea is to separate the tape into sections: the first two will
contain the input x, y. The third section will contain some y0 ≤ y indicating that
the fourth section contains the value of f (x, y0 ). As we will see, we will need more
sections, but this is the general idea. Let Pg and Ph be TM programs that compute
g and h respectively, outputting as described in Proposition 1.2.10.

1. Start out with the input

1...1 (x + 1 times) 0 1...1 (y + 1 times)

2. Copy and modify the contents of the tape so the end result looks like this:

1...1 (x + 1 times) # 1...1 (y + 1 times) # 1 0...0 (y times) # 1...1 (x + 1 times)

3. Place the tape head at the beginning of the last section of the tape, run Pg, and add a final 1, to obtain

1...1 (x + 1 times) # 1...1 (y + 1 times) # 1 0...0 (y times) # 1...1 (g(x) + 1 times)

4. Check if the second and third sections of the tape are equal. If so, return the
contents of the fourth section, minus a one. To do this, clean everything that
is before the leftmost 1 of the fourth section. Otherwise, continue to next step.

5. The tape is in the following state, for some y0 < y:

1...1 (x + 1 times) # 1...1 (y + 1 times) # 1...1 (y0 + 1 times) 0...0 (y − y0 times) # 1...1 (f(x, y0) + 1 times)

Add a 1 to the third section so the tape looks like

1...1 (x + 1 times) # 1...1 (y + 1 times) # 1...1 (y0 + 2 times) 0...0 (y − y0 − 1 times) # 1...1 (f(x, y0) + 1 times)

6. Copy, shift and modify the contents of the tape so that they now are

1...1 (x + 1 times) # 1...1 (y + 1 times) # 1...1 (y0 + 2 times) 0...0 (y − y0 − 1 times) # 1...1 (x + 1 times) 0 1...1 (y0 + 1 times) 0 1...1 (f(x, y0) + 1 times)

Place the head at the leftmost 1 of the fourth section (the beginning of the second x + 1 block).

7. We can now run (a shifted version of) Ph and add a final 1, so that the result is

1...1 (x + 1 times) # 1...1 (y + 1 times) # 1...1 (y0 + 2 times) 0...0 (y − y0 − 1 times) # 1...1 (h(x, y0, f(x, y0)) + 1 times)

8. Go to Step 4. ◻


Proposition 1.2.14 (TMCOMP is closed under Primitive Recursion)
If g ∶ N^k → N and h ∶ N^{k+2} → N are Turing-computable, and f ∶ N^{k+1} → N is defined by

f(x⃗, 0) = g(x⃗) for all x⃗ ∈ N^k
f(x⃗, y + 1) = h(x⃗, y, f(x⃗, y)),

then f is TM computable.
Proof idea. Once more, we showed how to prove the particular case k = 1. For the general case we can use the same idea. We don't need to add more sections to the tape; we just keep a copy of the full original input at the beginning of the tape, as in the particular case. ◻
The previous propositions prove the following theorem.
Theorem 1.2.15 (PRIM ⊆ TMCOMP)
Every primitive recursive function is Turing-computable.

Lemma 1.2.16
Let g ∶ N2 → N be Turing-computable. Then the function f ∶ N → N given by
f (x) = µm[g(x, m) = 0] is Turing-computable.

Proof. Again we use the trick of partitioning the tape and copying the input. Let Pg be a TM program that computes g as in Proposition 1.2.10.

1. Start with input x:

1...1 (x + 1 times) 0 ...

2. Copy the input and modify the tape so it ends up looking like this:

1...1 (x + 1 times) # 1 # 1...1 (x + 1 times) 0 1

3. The tape contents are now of the form

1...1 (x + 1 times) # 1...1 (m + 1 times) # 1...1 (x + 1 times) 0 1...1 (m + 1 times)

for some m ∈ N.

4. Place the tape head on the leftmost 1 of the third section (right after the second # symbol), run (a shifted version of) Pg and add an extra 1 to the end, so the result is

1...1 (x + 1 times) # 1...1 (m + 1 times) # 1...1 (g(x, m) + 1 times)

If g(x, m) = 0 (in other words, if there is only one 1 after the second # symbol), then clean up the tape by zeroing everything from the leftmost 1 to the first #, and from the second # symbol to the last (and only) 1 to its right. This sets m as the output. Otherwise, go to the next step.

5. Shift the third section of the tape and add a 1 to the second one, so the tape now looks like

1...1 (x + 1 times) # 1...1 (m + 2 times) # 1...1 (g(x, m) + 1 times)

6. Copy the first two sections of the tape so that the tape contents now are

1...1 (x + 1 times) # 1...1 (m + 2 times) # 1...1 (x + 1 times) 0 1...1 (m + 2 times)

7. Go to step 3.

By following these steps it is clear that if there exists an m such that g(x, m) = 0, then the TM described above eventually halts and outputs such an m. ◻

Proposition 1.2.17 (TMCOMP is closed under Minimalisation)

Let g ∶ N^{k+1} → N be Turing-computable. Then the function f ∶ N^k → N given by f(x⃗) = µm[g(x⃗, m) = 0] is Turing-computable.
Proof. As customary, we considered the particular case where k = 1. For the general
case, we don’t even need to keep more sections on the tape, as long as we keep a
copy of the original input at the beginning of the tape. ◻
Theorem 1.2.18 (PREC ⊆ TMCOMP)
Every partial recursive function is computable by a Turing machine.
The converse of this theorem is also true and so PREC = TMCOMP. A proof for this
fact, without the need of Church’s Thesis, is discussed later in this document after
Theorem 1.3.2.

1.3 Coding and the theorems of Recursion


1.3.1 Gödel Numberings
We start by defining a coding that assigns to each TM program a (possibly non-unique) number from which, using an algorithm, we can recover the original program. In Proposition 1.1.1 we defined a recursive way of encoding tuples in N^n into N, as well as partial recursive functions to "decode" such mappings, recovering the original tuple. Gödel numberings, as we will see, rely heavily on being able to do this. For the remainder of this section, let ⟨_⟩ be the tuple encoding function, along with its projection functions πi and "length" function L, as defined in Proposition 1.1.1.

Definition 1.17 (Gödel numbers for Turing Machines) Define the following codes
for tape symbols, action symbols and internal states:

gn(L) = 2
gn(R) = 3
gn(qi) = 2i + 4 (the even numbers greater than 2)
gn(Si) = 2i + 5 (the odd numbers greater than 3)

Now, given a quadruple Q = qi SAqj we can code it as

gn(Q) = ⟨gn(qi ), gn(S), gn(A), gn(qj )⟩.

With this coding we ensure that two different quadruples get different numberings. Moreover, given gn(Q) we can use the recursive projection functions πi to unequivocally recover the original Q.
Now, given a TM program P = {Q0 , Q1 , ..., Qn } we can code it as

c = ⟨gn(Q0 ), gn(Q1 ), ..., gn(Qn )⟩



Notice that the coding of a program is not unique, as it depends on the order we choose for the quadruples. Define then a relation ≃g between TM programs and natural numbers where P ≃g c if and only if P's quadruples can be ordered as (Q0, Q1, ..., Qn) so that the equation above (c = ⟨gn(Q0), ..., gn(Qn)⟩) holds. If P ≃g c, we say that c is a Gödel Numbering or index for P . Notice that if P and P′ are different TM programs and P ≃g c for some c ∈ N, then P′ ≃/g c. So different TM programs have disjoint sets of indexes.
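Using the tuple-encoding sketch from Section 1.1.1, the numbering takes only a few lines of Python (our transcription; a program index is the code of one chosen ordering of its quadruples):

    def gn_state(i):  return 2 * i + 4         # even numbers greater than 2
    def gn_symbol(i): return 2 * i + 5         # odd numbers greater than 3

    def gn_action(a):                          # L, R, or a print action Si
        return {'L': 2, 'R': 3}[a] if a in ('L', 'R') else gn_symbol(a)

    def gn_quadruple(qi, s, a, qj):            # gn(Q) for Q = qi S A qj
        return encode((gn_state(qi), gn_symbol(s), gn_action(a), gn_state(qj)))

    def gn_program(quads):                     # <gn(Q0), ..., gn(Qn)>
        return encode(tuple(gn_quadruple(*q) for q in quads))

    # Even the single quadruple q0 1 R q0 gets a 13-digit code: Goedel
    # numberings are effective, not efficient.
    print(gn_quadruple(0, 1, 'R', 0))   # 2^5 * 3^8 * 5^4 * 7^5 = 2205414540000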
Gödel numbers provide us with a recursive way of indexing all Turing Machines.

Definition 1.18 (Turing Machine indexes) For e ∈ N, the e-th Turing Machine Pe is defined by

Pe = P if P ≃g e for some (unique) P, and Pe = ∅ otherwise.

We introduce new notation for the J_K operator: for e ∈ N define

JeKk = JPe Kk

For the proofs of the following theorems, we use the primitive recursive function eq
as defined in 1.1.1.
Lemma 1.3.1
The set of Turing Machine indexes T i = {e ∈ N ∶ Pe ≠ ∅} is recursive.
Proof sketch. First we define a function q such that, for x ∈ N, q(x) = 1 if x = gn(Q) for some quadruple Q, and q(x) = 0 otherwise. The value q(x) is computed by:
1. Checking that L(x) = 4. If it is not, then q(x) = 0.

2. Checking that the first and fourth projections (π1 (x) and π4 (x)) are codes for
internal states, that is, that they are even numbers greater than 2. If not, then
q(x) = 0.

3. Checking that the second projection is a code for a tape symbol, that is, it is
an odd number greater than 3. If not, then q(x) = 0.

4. Checking that the third projection is a code for an action symbol, that is, it is
either 2 or 3 or an odd number greater than 3. If not, then q(x) = 0.

If all of the conditions above hold, then q(x) = 1. We can now check, for any x ∈ N, whether it is the encoding of a set of quadruples by checking that each of the projections is a valid quadruple. We do this by taking the sum

∑_{i=1}^{L(x)} q(π(i, x))

and checking that it is equal to L(x). If it is not, or if L(x) = 0, then χTi(x) = 0. Finally, we should check that the set of quadruples is consistent and thus represents a valid Turing Machine program. We do this by checking that for any two different quadruples qSAr and q′S′A′r′ we have q ≠ q′ or S ≠ S′; for example, by computing the sum

∑_{i=1}^{L(x)−1} ∑_{j=i+1}^{L(x)} eq(⟨π1(Qi), π2(Qi)⟩, ⟨π1(Qj), π2(Qj)⟩),

where Qk = π(k, x), and checking that it is 0. With enough care, one can now define recursively the function χTi using the ideas presented above. ◻
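The innermost check, the function q, is easy to state with the decoding helpers length and pi from our earlier tuple-encoding sketch. The fragment below (ours) realizes steps 1-4 of the proof:

    def is_quadruple_code(x):                  # the function q from the proof
        if length(x) != 4:                     # step 1: must code a 4-tuple
            return 0
        a, b, c, d = (pi(i, x) for i in (1, 2, 3, 4))
        state  = lambda v: v >= 4 and v % 2 == 0     # code of an internal state
        symbol = lambda v: v >= 5 and v % 2 == 1     # code of a tape symbol
        action = lambda v: v in (2, 3) or symbol(v)  # code of an action symbol
        return int(state(a) and symbol(b) and action(c) and state(d))

    print(is_quadruple_code(gn_quadruple(0, 1, 'R', 0)))   # 1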
Theorem 1.3.2 (The Enumeration Theorem)
The function f ∶ N^{1+n} → N defined by f(e, x⃗) = JeKn(x⃗) for all x⃗ ∈ N^n is partial recursive.
Proof idea. Fix n ∈ N∗. The idea is to recursively code the states of a TM program execution as numbers, similar to how we code the programs themselves. Suppose we have an index e corresponding to some Turing Machine with tape alphabet A = {S0, S1, ...}. Recall that in Definition 1.17 we already defined codings gn(qi) and gn(Si) for states and tape symbols.
gn(qi ) and gn(Si ) for states and tape symbols.

Coding the state of a Turing Machine:

• Given a word w = s0 s1 ... sm ∈ A∗, define its coding as

gn(w) = ⟨gn(s0), ..., gn(sm)⟩

• The state of a Turing Machine is completely determined by the current internal


state, tape content and tape head position. Now, even though the tape of a Turing Machine is infinite, during the execution of a Turing Machine program only finitely many cells hold symbols different from S0 = 0. Therefore, if we have a tuple φ = (q, wL, σ, wR) where

– q is the current internal state.


– wL is the tape content to the left of the tape head. Assume that everything
to the left of wL is 0.
– σ is the symbol that the head is reading.
– wR is the tape content to the right of the head. Assume that everything
to the right of wR is 0,

then φ describes completely the state of the execution of a TM program. We


can code such φ by:

gn(q, wL , σ, wR ) = ⟨gn(q), gn(wL ), gn(σ), gn(wR )⟩

In a similar way to the proof of Lemma 1.3.1, we could show that the set of
codes of tape states is recursive. We can also define a recursive initialization

function init ∶ Nn → N such that for all x1 , ..., xn ∈ N, init(x1 , ..., xn ) is the
code of the initial tape state with input (x1 , ..., xn ), that is, the code of a tape
state with

1. q0 as the current state.


2. No content to the left of the tape head (wL is the empty word).
3. 1 as the current symbol that the head is reading.
4. wR as the rest of the string representation of the tuple x1 , ..., xn in the
tape, as in 1.2.2.

Calculating the next step in the execution (as a partial recursive function):
We will define then a partial recursive function next ∶ N2 → N which on input (e, s),
where s = gn(q, wL , σ, wR ), will simulate executing one step of Turing Machine Pe
from state (q, wL , σ, wR ). More explicitly, given s ∈ N, the function does the following:

1. Tries to factor s as the coding of a TM state s = gn(q, wL, σ, wR), recovering q, σ, wL, wR. This is done via the recursive projection functions, so for example gn(q) = π1(s) and gn(wR) = π4(s). If s is not a valid state code, then next(e, s) ↑.

2. Tries to factor e as a coding of a Turing Machine program (see 1.3.1). If e is


not a valid program code then next(e, s) ↑. Otherwise try finding a matching
quadruple qσAqj . If there is no matching quadruple then next(e, s) = 0.

3. Finally, if e and s were valid codes and a matching quadruple qσAqj was found, calculate the next state φ′ = (qj, wL′, σ′, wR′) of the execution:

• If A = Sk, then the next state is φ′ = (qj, wL, Sk, wR).
• If A = R, then φ′ = (qj, wLσ, σ′, wR′), where wR = σ′wR′.
• If A = L, then φ′ = (qj, wL′, σ′, σwR), where wL = wL′σ′.

In this case, next(e, s) ↓ and next(e, s) = gn(φ′).

In summary:

• next(e, s) ↑ if e is not the code of a Turing Machine program, or s is not the code of a tape state.

• next(e, s) = 0 if there is no quadruple in the Turing Machine program Pe matching the state coded by s (so execution cannot continue).

• next(e, s) = s′ ≠ 0 if there is a matching quadruple in Pe, and s′ is the code of the result of executing one step of Pe starting at the state coded by s.

Executing a Turing Machine program (as a partial recursive function):

Now we define a partial recursive function run so that for all e, s, i ∈ N, run(e, s, i) ↑ if e or s are not valid TM program and tape state codes, and otherwise

run(e, s, 0) = s
run(e, s, i + 1) = next(e, run(e, s, i)).

So run(e, s, i) simulates, when possible, i transitions of Pe starting from the tape state coded by s. We also define a partial recursive function steps that tells us how many "steps" a TM program executes, starting from some tape state, before halting:

steps(e, s) = µk[run(e, s, k) = 0].

Finally, the function final ∶ N^{1+n} → N defined as

final(e, x⃗) = run(e, init(x⃗), steps(e, init(x⃗)) −̇ 1)

returns the code of the final tape state of a Turing Machine after the successful execution of program Pe with the string representation of x⃗ ∈ N^n as input (the state one transition before run first returns 0). We finally define the function f(e, x⃗) of the statement of this theorem as the number of 1's in the tape state coded by final(e, x⃗). This is done via another recursive function: given a word w ∈ A∗, w = s1...sk, we defined gn(w) = ⟨gn(s1), ..., gn(sk)⟩, so a "one-counting" function oc can be defined recursively by

oc(z) = ∑_{i=1}^{L(z)} eq(π(i, z), gn(1)).

This function clearly counts the 1's in w ∈ A∗ given its code gn(w). With it, define

f(e, x⃗) = oc(π2(sF)) + eq(π3(sF), gn(1)) + oc(π4(sF)),

where sF = final(e, x⃗). ◻
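The next/run/steps/final pipeline can be prototyped without the arithmetization by keeping each configuration φ = (q, wL, σ, wR) as an actual tuple; in the proof, every such tuple is replaced by its code gn(φ). A Python sketch (ours), with programs in the dictionary form used earlier and None playing the role of next = 0:

    def next_state(prog, phi):            # one transition of the machine
        q, wL, sigma, wR = phi
        if (q, sigma) not in prog:
            return None                    # no applicable quadruple
        a, qj = prog[(q, sigma)]
        if a == 'L':
            return (qj, wL[:-1], wL[-1] if wL else 0, (sigma,) + wR)
        if a == 'R':
            return (qj, wL + (sigma,), wR[0] if wR else 0, wR[1:])
        return (qj, wL, a, wR)             # print action

    def final(prog, phi):                  # run(e, s, steps(e, s) - 1)
        while (nxt := next_state(prog, phi)) is not None:
            phi = nxt
        return phi

    # f(e, x) counts the 1s in the final configuration, as oc does on codes:
    phi0 = (0, (), 1, (1, 1, 1))           # init(3): head on the first of four 1s
    q, wL, s, wR = final(Z, phi0)          # Z is the zero program from earlier
    print(wL.count(1) + (s == 1) + wR.count(1))    # 0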

Turing computable functions are partial recursive (TMCOMP = PREC)

Proof. Let f ∶ N^n → N be a Turing computable (partial) function. Then there exists a Turing Machine Pe such that f = JPe Kn = JeKn, which is partial recursive by the same reasoning as in the proof of Theorem 1.3.2. Together with Theorem 1.2.18, we conclude TMCOMP = PREC. ◻

1.3.2 Important Results about Turing Machines


Theorem 1.3.3 (The Universal Turing Machine)
There exists a Universal Turing Machine, that is, a Turing Machine program U which, given any input (e, x), simulates the computation of Pe with input x. In other words, for all e, x ∈ N
JU K2 (e, x) = JeK(x)

Proof. By the Enumeration Theorem 1.3.2, we know that the function f (e, x) = JeK(x)
is partial recursive. In section 1.2.2 we proved that PREC ⊆ TMCOMP and so there is a
TM U that computes f . ◻

Observation 1.3.1 (Importance of the Universal Machine) In [15], Rogers gives some
very interesting insight on the implications of Theorem 1.3.3.

[Theorem 1.3.3] has a nontrivial practical significance. It shows that, for


computing partial functions of one variable, there is a critical degree
of “mechanical complexity” beyond which all further complexity can be
absorbed into increased size of program and increased use of memory
storage. (Hartley Rogers)

When talking about "mechanical complexity", Rogers refers to that of the Turing Machine U, which provides a bound on the number of internal states and of instructions that the tape head needs to be able to perform in order to compute ALL computable functions of one variable.

Observation 1.3.2 (Turing Machines with nice outputs revisited) For the last theorems, notably the Enumeration Theorem 1.3.2, we did not care about the output of Turing machines being nicely formatted, as in Proposition 1.2.10. Indeed, there are indexes e ∈ N for which Pe is a Turing Machine that does not output in a nice format, yet the previous theorems hold for these indexes. However, this nice output property is desirable for our later work. Luckily, the following proposition solves this issue. We state it without proof; one can be written following the proof of Proposition 1.2.10, found in [8].

Proposition 1.3.4 (A nice output converter)


There exists a partial recursive function nice ∶ N → N such that, for every e ∈ N, if e
is a valid TM program code, then nice(e) is the code of a Turing Machine program
such that Jnice(e)Kn = JeKn for all n ∈ N, but Pnice(e) outputs with nice format (as in
Proposition 1.2.10).

Theorem 1.3.5 (Recursive Composition)

There exists a partial recursive function Comp such that for every x, y ∈ N,

JComp(x, y)K = JxK ○ JyK

Proof. Without loss of generality, we can assume that Py outputs with nice format (Proposition 1.2.10), as we could write nice(y) instead of y and the proof would yield the same result. The idea is simply to take the Turing machines Px and Py and output Py ∪ (Px >> ω), where ω ∈ N is the final state of Py.

• First, given y, we need to find the final state of Py . By definition, such state
should have the greatest index of all states in quadruples of Py . Following

the definition of Gödel Numberings (Definition 1.17), it can be proved that there is a partial recursive function state such that, for any quadruple Q = qi S A qj, it recovers the state qj, that is, state(gn(Q)) = j. By the definition of projections in Proposition 1.1.1, there is a partial recursive function π such that π(i, z) = πi(z) for all i, z ∈ N. Let n = L(y) be the number of quadruples in Py, so that gn(Qi) = π(i, y) for i = 1, ..., n. We can then find the final state ω recursively by

om(y) = ω = µk[ ∑_{i=1}^{n} (state(gn(Qi)) −̇ k) = 0 ],

the least k that dominates every state index of Py.

• It can easily be proved that there is a partial recursive function shft such that for a quadruple Q = qi S A qj and a number k, shft(gn(Q), k) = gn(Q >> k), where Q >> k results from adding k to the state indexes of Q. So define

Comp(x, y) = y × ∏_{i=1}^{L(x)} p(i + L(y) − 1)^{shft(π(i, x), om(y)) + 1}. ◻
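At the level of quadruple sets (rather than their codes), Comp is just a union with a state shift. In the dictionary representation of our earlier sketches, and assuming py has nice output with unique final state omega, the whole construction is:

    def comp_tm(px, py, omega):
        # P_y followed by P_x >> omega: the initial state q0 of P_x is renamed
        # to q_omega, the state in which P_y halts, so control passes to P_x.
        shifted = {(q + omega, s): (a, r + omega) for (q, s), (a, r) in px.items()}
        return {**py, **shifted}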


Theorem 1.3.6 (The s-1-1 Theorem)
There exists a recursive function s11 ∶ N^2 → N such that for all e, y, z ∈ N,

Js11(e, y)K(z) = JeK2(y, z)

Proof. For a given x ∈ N, there is a Turing Machine Wx that, when executed with initial tape

1...1 (k times)

for some k ∈ N, halts with the resulting tape

1...1 (x + 1 times) 0 1...1 (k times)

and with the tape head on the leftmost 1. Let's see that we can actually define a recursive function w such that for every x ∈ N, Jw(x)K = JWx K. There is a Turing machine which shifts the tape contents to the right by one place, leaving the tape head in its original position; let s be an index for such a machine. There is also a Turing machine with nice output which simply writes a 1 wherever the tape head is, without moving it; let o be an index for it. With these two machines, define w by

w(0) = Comp(o, Comp(s, s))
w(n + 1) = Comp(o, Comp(s, w(n))),

so that Jw(x)K = JWx K. Now define

s11(e, y) = Comp(e, w(y))

to obtain the desired result. ◻

Observation 1.3.3 (Meaning of s-1-1) The s11 function acts as a specializer which fixes one variable of a function of two variables. It not only tells us that such specialized functions are computable, but also that we can recursively specialize functions of two variables into functions of one variable.
Corollary 1.3.7 (Application of s − 1 − 1)
Let f ∶ N2 → N be recursive. There exists a recursive g ∶ N → N such that
f (x, y) = Jg(x)K(y) for all x, y ∈ N.
Proof. Since f is recursive, f = JeK² for some e ∈ N. Use s11 as defined in Theorem 1.3.6 to define g(x) = s11 (e, x). Then g is recursive and we have

f (x, y) = JeK2 (x, y)


= Js11 (e, x)K(y) (by the s-1-1 Theorem 1.3.6)
= Jg(x)K(y)


Theorem 1.3.6 can in fact be generalized: we can take a function of m + n variables and fix m of them, leaving as a result a function of n variables. This is the S-m-n Theorem, which we state below. The proof follows the same ideas as the proof for s11, with added technical difficulty, so we do not present it here.
Theorem 1.3.8 (The s-m-n theorem)
There exists a computable function s^m_n such that for every e ∈ N, x⃗ ∈ Nᵐ and y⃗ ∈ Nⁿ:

Js^m_n(e, x⃗)Kⁿ(y⃗) = JeKᵐ⁺ⁿ(x⃗, y⃗)

We now state one of the main interests of this work, Kleene’s Second Recursion
Theorem, or SRT for short. This is the formulation and proof of the SRT used by
Jones in [11].

Theorem 1.3.9 (Kleene’s Second Recursion Theorem)


Let f ∈ PREC. Then there exists k ∈ N (which we will call a fixed point for f ) such
that for all x ∈ N:

f (k, x) = JkK(x)

Proof. Consider the function g(y, x) = f (s11 (y, y), x). By Theorem 1.3.2, there
exists an index p such that

g(y, x) = JpK2 (y, x)


= Js11 (p, y)K(x)

Let k = s11 (p, p). We have then that:



f (k, x) = f (s11 (p, p), x)
= g(p, x)
= Js11 (p, p)K(x)
= JkK(x)

So k is a fixed point for f . ◻


Sometimes the SRT is formulated in a different way, known as Rogers' Fixed Point Theorem. We can prove that the SRT implies Rogers' Theorem.

Corollary 1.3.10 (Rogers' Fixed Point Theorem)

Let f ∈ PREC be total. Then there exists a fixed point k such that Jf (k)K(x) = JkK(x) for all x ∈ N.
Proof. Define g(y, x) = Jf (y)K(x). Since g is partial recursive, by the SRT there exists k ∈ N such that:

JkK(x) = g(k, x) = Jf (k)K(x)

1.3.3 The Church-Turing Thesis


With the Enumeration Theorem (1.3.2) we proved that PREC = TMCOMP. Notice that
the key fact about Turing Machines that our proof of the Enumeration Theorem uses
is that we can recursively code the states of a program execution as natural numbers,
which in turn is possible because of the existence of a tuple encoding function. Using
the same idea, we can code the states of execution of other models of computation,
which would yield as results Enumeration Theorems for such models. For example,
we could do analogous proofs using URMs or λ-calculus. The fact that all of these models end up computing the same set of functions, namely PREC, led Alonzo Church to propose what is now called the Church-Turing Thesis.

Church-Turing Thesis Every effectively computable function is partial recursive.

So it does not matter which computation model we use; the functions computed are the same: PREC = TMCOMP = URMCOMP = COMP.
Chapter 2

Computability and Recursion on


General Sets

Introduction
Looking towards general applications, we would like to work with functions over sets other than N, usually the words of a language over some alphabet A. When thinking about computability as the ability to describe algorithms effectively in "some" language, we immediately think of popular programming languages in use today (Java, C++, etc.). Programs in those languages are not limited to functions over the natural numbers; instead, they work with inputs and outputs in some set D of data. In this work, we need to deal with these general kinds of algorithms. This chapter discusses the notion of computability for such functions, and introduces Rogers' axioms, which characterize the "acceptable" programming languages in which many results related to Kleene's Second Recursion Theorem hold.

2.1 Acceptable Programming Languages


In Section 1.3.1 we defined Gödel numbers as a way of indexing Turing Machines, and thus (via the Church-Turing Thesis) the partial recursive functions. Some important results followed, notably the Enumeration Theorem (1.3.2) and the s-m-n Theorem (1.3.8). These results seem tied to the coding scheme and the model of computation used, so the question of whether they hold for other codings or models naturally arises. Following Rogers' work [15], accompanied by more modern presentations in [9] and [12], we present properties of such codings under which the results mentioned above hold, allowing us to state results that are independent of the computation model used.


2.1.1 Preliminaries
Definition 2.1 (Pairing Function) A function ⟨_, _⟩ ∶ N² → N is a pairing function if it is total recursive and there exist partial recursive projections π1, π2 such that πi(⟨x1, x2⟩) = xi for i = 1, 2.
Definition 2.2 (Pairing Scheme) A pairing scheme is a set of total recursive functions ⟨_⟩ ∶ Nⁿ → N with partial recursive projections πᵢⁿ such that for every i, n ∈ N: πᵢⁿ(⟨x1, ..., xi, ..., xn⟩) = xi.
Note that every Tuple Encoding Scheme as in Proposition 1.1.1 is a Pairing Scheme.
Observation. Every pairing function gives rise to a pairing scheme by defining:
⟨x1 ⟩ = x1
⟨x1 , x2 , ..., xn ⟩ = ⟨x1 , ⟨x2 , ..., xn ⟩⟩.

There exist many pairing functions. From now on, assume ⟨_, _⟩ to be an arbitrary
but fixed pairing function. Its explicit definition will be irrelevant for our work.

Working only with unary functions Given a recursive function f ∶ Nᵏ → N, define a function f′ ∶ N → N as f′(X) = f(π1(X), π1(π2(X)), ..., π2^{k−1}(X)), so that for every x1, ..., xk ∈ N, f(x1, ..., xk) = f′(⟨x1, ..., xk⟩). f′ is a unary recursive function that behaves like f. With this construction, we can justify working only with unary functions, as they are expressive enough. We easily obtain unary versions of Theorems 1.3.3, 1.3.8 and 1.3.5. We prove just one, since the other proofs are analogous.
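As an illustration, one standard choice of pairing function is Cantor's. The Python sketch below (an assumption made for illustration; the thesis never fixes a particular pairing) implements it together with the nested tuple encoding and the unary conversion f′ just described:

def pair(x, y):                     # Cantor's pairing function
    return (x + y) * (x + y + 1) // 2 + y

def unpair(z):                      # the projections pi_1, pi_2
    w = 0                           # find the diagonal w = x + y
    while (w + 1) * (w + 2) // 2 <= z:
        w += 1
    y = z - w * (w + 1) // 2
    return w - y, y

def tup(*xs):                       # <x1, ..., xk> = <x1, <x2, ...>>
    t = xs[-1]
    for x in reversed(xs[:-1]):
        t = pair(x, t)
    return t

def unary(f, k):                    # the unary f' with f'(<xs>) = f(xs)
    def f1(z):
        xs = []
        for _ in range(k - 1):
            x, z = unpair(z)
            xs.append(x)
        return f(*xs, z)
    return f1

add3 = lambda a, b, c: a + b + c
assert unary(add3, 3)(tup(1, 2, 3)) == 6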
Theorem 2.1.1 (Unary Universal Machine)
There exists a partial recursive function U such that for every e, x ∈ N
U (⟨e, x⟩) = JeK(x)
Proof. By Theorem 1.3.3 there exists a partial recursive function u such that
u(e, x) = JeK(x) for all e, x ∈ N. Defining U (z) = u(π1 (z), π2 (z)) we obtain the
result. ◻
Theorem 2.1.2 (Unary S-m-n Theorem)
For every m, n ∈ N, there exists a recursive function s^m_n ∶ N → N such that, for every e, x1, ..., x_{m+n} ∈ N:

Js^m_n(⟨e, x1, ..., xm⟩)K(⟨x_{m+1}, ..., x_{m+n}⟩) = JeK(⟨x1, ..., x_{m+n}⟩)
Theorem 2.1.3 (Unary Recursive Composition)
There exists a partial recursive function c such that for every i, j ∈ N,
Jc(⟨i, j⟩)K = JiK ○ JjK
For the remainder of this section, all functions shall be considered unary. For ease
of notation, we will write f (x1 , ..., xk ) instead of f (⟨x1 , ..., xk ⟩).

2.1.2 Acceptable Indexings


Definition 2.3 (Indexing) Let PREC(1) be the set of unary partial recursive functions. An indexing (of the recursive functions) is a surjective function π ∶ N → PREC(1). As notation, we write πi ∶= π(i) for i ∈ N.

The "standard" indexing (ϕ) The Gödel numberings from the last chapter define an indexing of PREC(1). We fix this indexing as the standard one and call it ϕ, so that ϕi = JPi K, where Pi is the i-th Turing Machine as defined in 1.3.1.

Definition 2.4 (Some definitions for indexings) Let π be an indexing. We say that
π is:
1. Effective if there exists a recursive function f ∶ N → N such that π = ϕ ○ f .

2. Programmable if there exists a recursive function g ∶ N → N such that ϕ = π ○ g.

3. Acceptable if it is effective and programmable.

4. Universal if there exists a recursive function uπ ∶ N → N such that

uπ (e, x) = πe (x) ∀e, x ∈ N

uπ is called a universal function for π.

5. An indexing with the s11 property if there exists a recursive function sπ ∶ N → N


such that

πsπ (e,x) (y) = πe (x, y) ∀e, x, y ∈ N

sπ is called an s11 or specializer function for π.

6. An indexing with recursive composition if there exists a recursive function


cπ ∶ N → N such that

πcπ (e,i) = πe ○ πi ∀e, i ∈ N.

cπ is called a recursive composition function for π.

Observation 2.1.1 (ϕ has all the properties) We have already proved that ϕ is uni-
versal, has the s11 property and has recursive composition (Theorems 2.1.1, 2.1.2,
2.1.3). It is also trivially effective, programmable and thus acceptable.

Observation 2.1.2 (acceptable indexings are effectively equivalent) Let π, ψ be two


acceptable indexings. It is easy to see from the definitions that there exist recursive
functions f, g ∶ N → N such that

π =ψ○f
ψ = π ○ g,

so not only do π and ψ compute the same functions, but there is an effective way to go back and forth between them. Effectiveness and programmability are usually referred to as Rogers' axioms for acceptable indexings.

Theorem 2.1.4 (Universal ↔ Effective)


An indexing π is universal if and only if it is effective.
Proof. (⇒) Suppose π is universal and let uπ be a recursive universal function as in Definition 2.4. Since uπ is recursive, there exists a natural number U such that ϕU = uπ. By Theorem 2.1.2, ϕ has the s11 property, so it has a recursive specializer function sϕ. For i ∈ N define f (i) = sϕ (U, i), and so for all x ∈ N,

ϕf (i) (x) = ϕsϕ (U,i) (x)


= ϕU (i, x)
= uπ (i, x)
= πi (x).

(⇐) Suppose now that π is effective, and so there exists a recursive f such that
ϕ ○ f = π. By Theorem 2.1.1, ϕ is universal and so it has a recursive universal
function uϕ . Defining the function u(e, x) = uϕ (f (e), x) we have that for all e, x ∈ N,

u(e, x) = uϕ (f (e), x)
= ϕf (e) (x)
= πe (x),

so u is a recursive universal function for π. ◻

Theorem 2.1.5 (Programmable, s11 and composition are equivalent for effective indexings)
Let π be an effective (and thus universal) indexing. The following are equivalent:

1. π is programmable.

2. π has recursive composition.

3. π has the s11 property.

4. π is acceptable.

Proof. We will prove (4) ⇔ (1) ⇒ (2) ⇒ (3) ⇒ (1). The equivalence (1) ⇔ (4)
comes from the definition of an acceptable indexing. Since π is effective, then there
exists a recursive f such that ϕ ○ f = π.
Programmable ⇒ recursive composition: Suppose π is programmable, and so there
exists a recursive function g such that π ○ g = ϕ. Let cϕ be a recursive composition

function for ϕ. Define c(i, j) ∶= g(cϕ (f (i), f (j))). Clearly c is recursive and we have
for all i, j, x ∈ N,

πc(i,j) (x) = πg(cϕ (f (i),f (j))) (x)


= ϕcϕ (f (i),f (j)) (x)
= ϕf (i) (ϕf (j) (x))
= πi (πj (x)) = (πi ○ πj )(x),

so c is a recursive composition function for π.


Recursive composition ⇒ s11 property: [12] Suppose π has a recursive composition
function c. Define α(z) ∶= ⟨0, z⟩ and β(y, z) ∶= ⟨y + 1, z⟩. Both are recursive and
so there exist π indexes a, b ∈ N such that α = πa and β = πb . Define the function
h ∶ N → N as follows:

h(0) = a
h(x + 1) = c(b, h(x)).

Since c is recursive, then so is h. First, we prove by induction that πh(x) (z) = ⟨x, z⟩
for every x, z ∈ N. By definition, πh(0) (z) = πa (z) = ⟨0, z⟩. Now, supposing πh(x) (z) =
⟨x, z⟩ we have

πh(x+1) (z) = πc(b,h(x)) (z)


= πb (πh(x) (z))
= β(⟨x, z⟩)
= ⟨x + 1, z⟩.

Defining sπ (e, x) ∶= c(e, h(x)), we have that sπ is clearly recursive, and for all e, x, y ∈
N,

πsπ (e,x) (y) = πc(e,h(x)) (y)


= πe (πh(x) (y))
= πe (⟨x, y⟩),

so sπ is a recursive specializer for π.


s11 property ⇒ programmable: Suppose π has a recursive specializer sπ . Let U be
a π index of a universal function uϕ for ϕ, so that πU = uϕ . Define g(i) ∶= sπ (U, i).
Clearly g is recursive and we have for all i, x ∈ N,

πg(i) (x) = πsπ (U,i) (x)


= πU (i, x)
= uϕ (i, x)
= ϕi (x),

so π is programmable. ◻

2.1.3 Computability with non-numerical inputs


In practice, we would like to work with algorithms with inputs/outputs that are not
limited to the natural numbers. For the rest of this document, we assume we are work-
ing with computable functions (in the informal sense of 1.1.2) over some enumerable
set D.

Definition 2.5 (Programming Language) A programming language is a tuple (D, J_K)


where J_K is a function J_K ∶ D → (D → D) called its semantic function.
A programming language is just a way of interpreting elements of D, which can
be called programs, as functions over D. In modern computer terms, the semantic
function can be seen as a compiler that takes a program p ∈ D and outputs an
executable file JpK. This executable takes inputs in D, processes them according to
program p, and outputs elements of D.

Some conventions and definitions

1. J_K-computability: Given a function f ∶ D → D, we say f is J_K computable if


there exists a program p ∈ D such that f = JpK.

2. There is a fixed computable bijection σ ∶ D → N with computable inverse. For i ∈ N we write di ∶= σ−1(i) for the i-th element of D.

3. Pairing function for D: As in Section 2.1.1, we need a computable pairing


function ⟨⟩ ∶ D2 → D, though its explicit definition is irrelevant. We just need
it to coincide with the pairing function used for natural numbers, namely, we
choose it so that

⟨di , dj ⟩ = dk ⇔ ⟨i, j⟩ = k, for all i, j, k ∈ N

4. A numbering for programs: Define π ∶ N → (N → N) so that for all i, x ∈ N:

π(i)(x) = πi (x) = σJσ −1 (i)Kσ −1 (x)

For all i, x, y ∈ N:

πi (x) = y ⇔ Jdi K(dx ) = dy

Definition 2.6 (Turing Completeness) We say J_K is Turing complete if the number-
ing π defined above is an indexing of the unary recursive functions.

Observation 2.1.3 It is easy to see from our last definition and the Church-Turing thesis that a programming language is Turing complete if and only if every effectively computable (in the informal, algorithmic sense) function on D, and no other, is J_K computable.

2.1.4 Acceptable Programming Languages


Definition 2.7 (Acceptable programming language) A programming language is ac-
ceptable if its associated numbering π, as defined in the last Section, is an acceptable
indexing.

Observation 2.1.4 (Acceptable programming languages are effectively equivalent)


Just as discussed in 2.1.2, given two acceptable programming languages, there is an
effective way to go back and forth between them. This gives us the freedom of not
choosing a specific computation model for our work, as properties proved for some
acceptable language (regarding its semantic function) will hold as well for other
acceptable languages.

Theorem 2.1.6 (Another characterization of acceptable programming languages)


A programming language with semantic function J_K is acceptable if and only if:

1. J_K is Turing Complete.

2. There exists a universal program U ∈ D such that for every p, d ∈ D

JU K(p, d) = JpK(d)

3. It has the s11 property, namely, there exists a program s11 ∈ D such that for every
p, x, y ∈ D,

JJs11 K(p, x)K(y) = JpK(x, y)

Proof. The proof follows from Theorem 2.1.5 and the facts

Jdi K(dx ) = dy ⇔ πi (x) = y


⟨di , dj ⟩ = dk ⇔ ⟨i, j⟩ = k.

We prove only one direction, as the other follows analogously.


(⇐) Since J_K is Turing complete, the numbering π is an indexing of PREC(1). Since there is a universal program U ∈ D, there must be a number u such that du = U. So Jdu K(dp , dx ) = Jdp K(dx ) for all p, x ∈ N implies that πu (p, x) = πp (x) for all p, x ∈ N, so π is a universal indexing. Similarly, there must be a number s ∈ N such that ds = s11, and so JJds K(dp , dx )K(dy ) = Jdp K(dx , dy ) implies that π_{πs(p,x)}(y) = πp (x, y) for all p, x, y ∈ N, so π has a recursive specializer and hence it is acceptable. ◻

Observation 2.1.5 (Acceptability is independent of the choice of σ) Suppose (D, J_K)


is acceptable (with respect to the fixed bijection σ and its associated numbering π σ ).
Let ρ ∶ D → N be a computable bijection with computable inverse. Then the number-
ing associated to ρ, π ρ , is an acceptable indexing of PREC(1).

Proof. Recall that the numberings are defined as follows, for all i ∈ N:

π σ (i) = σ ○ Jσ −1 (i)K ○ σ −1
π ρ (i) = ρ ○ Jρ−1 (i)K ○ ρ−1 .

Since (D, J_K) is acceptable (with respect to σ), then π σ is an acceptable indexing
(of PREC(1)). We show that there exist recursive functions f, g ∶ N → N such that

πσ = πρ ○ f
π ρ = π σ ○ g.

1. Since J_K is Turing Complete, there exists a program κ ∈ D such that for all
p, x ∈ D:

JκK(p, x) = (ρ−1 ○ σ ○ JpK ○ σ −1 ○ ρ)(x).

Since (D, J_K) is acceptable, there exists a specializer program s11 , and so the
function f ∶ N → N defined by

f (i) = ρ(Js11 K(κ, σ −1 (i)))

is computable, and hence recursive by the Church-Turing Thesis. Also, note


that for all i, x ∈ N,

π ρ (f (i))(x) = (ρ ○ Jρ−1 (f (i))K ○ ρ−1 )(x)


= (ρ ○ JJs11 K(κ, σ −1 (i))K ○ ρ−1 )(x)
= (ρ ○ JκK)(σ −1 (i), ρ−1 (x))
= (ρ ○ ρ−1 ○ σ ○ Jσ −1 (i)K ○ σ −1 ○ ρ ○ ρ−1 )(x)
= (σ ○ Jσ −1 (i)K ○ σ −1 )(x)
= π σ (i)(x),

and so π σ = π ρ ○ f for some recursive f ∶ N → N.

2. The proof of the existence of a recursive g ∶ N → N such that π ρ = π σ ○ g is


analogous to the previous one. We now define program κ so that

JκK(p, x) = (σ −1 ○ ρ ○ JpK ○ ρ−1 ○ σ)(x),

for p, x ∈ D, and define

g(i) = σ(Js11 K(κ, ρ−1 (i))).

Repeating the process in the previous part of the proof, we find that π ρ = π σ ○ g,
where g ∶ N → N is recursive.



A subset of programs In the real world, not every element of D represents a "valid" program. For example, not all ASCII sequences represent C++ programs; however, every computable function over ASCII sequences can be computed with a C++ program. For this reason we may distinguish a subset P gms ⊆ D of valid programs, such that trying to execute some d in D − P gms makes no sense (that is, JdK(x) ↑ for all x ∈ D), and all J_K computable functions are computed by some p ∈ P gms. It is clear then that if (D, J_K) is acceptable, all "witnesses" of acceptability (the universal and s11 programs, and the programs for Turing Completeness) are in P gms.

Another way of defining computability


Another way of defining computability over general (enumerable) sets D is to directly
extend the theory presented in Chapter 1, so that Turing Machines work with tape
symbols in D and compute functions over D. Encoding such machines as elements
of D would then give us an indexing {ϕd ∶ d ∈ D}, which we would set as standard,
that is, a function over D is computable if and only if it corresponds to some ϕd .
Adapting slightly the proofs on Chapter 1, we could prove again results such as the
existence of a universal machine and the s11 property. Finally, under this approach, a programming language with semantic function J_K would be defined to be acceptable if and only if there are computable functions f, g ∶ D → D such that J_K = ϕ ○ f and ϕ = J_K ○ g. Neither approach is better than the other, as they end up being equivalent.

2.2 Recursion and Computability Results for acceptable


programming languages
Fix P gms, D, J_K as part of an acceptable programming language. We now present some theorems related to the SRT. Most of these theorems are used in some practical way later on in Chapter 4.

Kleene’s Second Recursion Theorem

We now prove the SRT for any acceptable programming language. The proof is essentially that of Theorem 1.3.9.
Theorem 2.2.1 (SRT for acceptable programming languages)
Let n ∈ N∗ and let p ∈ P gms. There exists p′ ∈ P gms such that for every x⃗ ∈ Dⁿ:

JpK(p′, x⃗) = Jp′K(x⃗)

Proof. Fix n ∈ N∗ and let p ∈ P gms. By Turing Completeness, there exists a program p̄ such that Jp̄K(q, x⃗) = JpK(Js11K(q, q), x⃗) for all q ∈ P gms, x⃗ ∈ Dⁿ. Set p′ = Js11K(p̄, p̄). We have that

Jp′K(x⃗) = JJs11K(p̄, p̄)K(x⃗)
= Jp̄K(p̄, x⃗)
= JpK(Js11K(p̄, p̄), x⃗)
= JpK(p′, x⃗)

So p′ is a fixed point for JpK. ◻

Observation 2.2.1 (Use of Rogers' axioms in the above proof) Though Rogers' axioms are indeed enough to prove the SRT, notice that:

1. The universal program was never used.

2. We do need an s11 program, but no universal program.

3. The only place where Turing Completeness is used is to find the program p̄.

So, as we will see in the next chapter, the SRT may hold even in languages that are far from being Turing Complete, and for which we need not prove the existence of a universal program.

Applications of the SRT Let p ∈ P gms and let p′ be a fixed point of p as given by the SRT.

1. Self-reproducing program: Suppose that for all q, d ∈ D, JpK(q, d) = q. Then


Jp′ K(d) = JpK(p′ , d) = p′ . p′ is a self-reproducing program.

2. Self-recognizing program: Suppose that for q, d ∈ D, JpK(q, d) = 1 if q = d and


0 otherwise. Then, Jp′ K(d) = JpK(p′ , d) = 1 if and only if p′ = d.

3. Interchanging programs and data: If JpK(q, d) = JU K(d, q) where U is the


universal program, then Jp′ K(d) = JU K(d, p′ ) = JdK(p′ ) for all d ∈ D.

4. Removing Recursion Suppose that D = N. Consider the following function


defined by primitive recursion.

f (0, y) = g(y)
f (x + 1, y) = h(f (x, y), x, y)

Our previous work proved that if h and g are computable then so is f . The
SRT provides another way to show closure under recursion. Suppose p is a
program such that:

JpK(e, 0, y) = g(y)
JpK(e, x + 1, y) = h(JeK(x, y), x, y)

Notice that p contains no self-reference. By the SRT, we have that

Jp′ K(x + 1, y) = JpK(p′ , x + 1, y) = h(Jp′ K(x, y), x, y),

and likewise Jp′ K(0, y) = g(y). So p′ is a program that satisfies the recursive definition.

We apply the SRT when we need to find programs that somehow use their own code. The restrictions are thus less about what the resulting code is, and more about how this code is used. The SRT allows us to define how a program behaves on a parameter x (that is, Jp′ K(x)) in terms of a computable function that uses both the parameter x and the code of the program p′ itself.
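To make application 1 tangible, here is a minimal Python sketch of the construction behind p′ = Js11K(p̄, p̄): a two-argument template specialized on its own text. Python here merely stands in for an arbitrary acceptable language; the sketch is illustrative, not the thesis's formal construction.

# The template plays the role of a program p(k, x) that outputs k; formatting
# it with its own source plays the role of the specialization s11(pbar, pbar).
template = 'template = {0!r}\nprint(template.format(template))'
print(template.format(template))    # prints exactly these two lines

Executed as a two-line script, this prints its own source: a self-reproducing program obtained without ever quoting the whole program by hand.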

Computing Fixed Points Not only can we prove the existence of fixed points for any computable function in an acceptable programming language, we can also compute them, as seen in the next theorem. This will prove useful later on in Chapter 4.

Theorem 2.2.2 (Computing SRT)


Let n ∈ N∗ . There exists a computable function srt such that for every program p
and every x⃗ ∈ Dn

JpK(srt(p), x⃗) = Jsrt(p)K(⃗


x)

Proof. The key point in the proof of the SRT (Theorem 2.2.1) is, from a program p, to obtain a program p̄ such that

Jp̄K(q, x⃗) = JpK(Js11K(q, q), x⃗)

By Turing Completeness, there exists a program γ such that for p, q ∈ P gms, x⃗ ∈ Dⁿ:

JγK(p, q, x⃗) = JU K(p, Js11K(q, q), x⃗)
= JpK(Js11K(q, q), x⃗)

where U is the universal program. Defining bar(p) = Js11K(γ, p), we obtain a computable function that from p finds p̄. So now, defining

srt(p) = Js11K(bar(p), bar(p))

we have a function that computes fixed points for p. ◻
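The functions bar and srt can be played out concretely in any language with an exec-style universal program and a string-level specializer. The following Python sketch is illustrative only: the names run, s11, bar and srt, and the representation of programs as Python source strings defining a function f, are assumptions of the sketch, not part of the thesis.

S11_SRC = '''
def s11(p, s):
    # Specializer: returns program text q with run(q, *x) == run(p, s, *x).
    return (
        "P, S = " + repr(p) + ", " + repr(s) + "\\n"
        "def f(*x):\\n"
        "    env = {}\\n"
        "    exec(P, env)\\n"
        "    return env['f'](S, *x)\\n"
    )
'''
exec(S11_SRC)   # define s11 here from the very text the generated programs embed

def run(p, *args):
    """The universal program: execute program text p on args."""
    env = {}
    exec(p, env)
    return env['f'](*args)

def bar(p):
    """Builds pbar with run(pbar, q, *x) == run(p, s11(q, q), *x)."""
    return (
        S11_SRC
        + "P = " + repr(p) + "\n"
        + "def f(q, *x):\n"
        + "    env = {}\n"
        + "    exec(P, env)\n"
        + "    return env['f'](s11(q, q), *x)\n"
    )

def srt(p):
    """Fixed-point finder: run(p, srt(p), *x) == run(srt(p), *x)."""
    pbar = bar(p)
    return s11(pbar, pbar)

# Example: the fixed point of p(k, x) = (k, x) gets its own text as k.
p = "def f(k, x):\n    return (k, x)\n"
k, x = run(srt(p), 42)
assert k == srt(p) and x == 42

# Removing recursion (application 4): factorial with no self-reference.
fact = srt("def f(e, n):\n"
           "    env = {}\n"
           "    exec(e, env)\n"
           "    return 1 if n == 0 else n * env['f'](n - 1)\n")
assert run(fact, 5) == 120

The first assert is exactly the fixed-point equation of the theorem: the program srt(p) receives its own text as its first argument.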

Other recursion theorems


Theorem 2.2.3 (Explicit Recursion Theorem)
Let q ∈ P gms. There exists a program e such that for all y ∈ D, x⃗ ∈ Dⁿ:

JJeK(y)K(x⃗) = JqK(e, y, x⃗)

Proof. Let q′ be a program such that Jq′K(⟨z, y⟩, x⃗) = JqK(z, y, x⃗). By Turing Completeness there exists a program p such that JpK(z, y) = Js11K(q′, z, y) for all z, y ∈ D. Let e be a fixed point for p given by the SRT. We have JeK(y) = Js11K(q′, e, y), and so

JJeK(y)K(x⃗) = JJs11K(q′, e, y)K(x⃗)
= Jq′K(⟨e, y⟩, x⃗)
= JqK(e, y, x⃗)
◻


We will call the resulting program e of the last theorem an explicit fixed point. As with 2.2.2, we can find such explicit fixed points computably from a program q.
Theorem 2.2.4 (Computable Explicit Recursion)
There exists a program xrt such that for every q ∈ P gms, y ∈ D and x⃗ ∈ Dⁿ:

JJJxrtK(q)K(y)K(x⃗) = JqK(JxrtK(q), y, x⃗)

Proof. Let ○ ∈ P gms be the composition program JJ○K(p, q)K(x) = JpK(JqK(x)). Since
we have computable pairings ⟨⟩ and projections, there exists a program r such that
for all z, y, x⃗ ∈ D

JrK(⟨y, z⟩, x⃗) = ⟨y, z, x⃗⟩.

Define programs pF , xrt such that for all q ∈ D,

JpF K(q) = Js11 K(s11 , J○K(q, r))
JxrtK(q) = JsrtK(JpF K(q)).

Let z, y, x⃗ ∈ D and q ∈ P gms. Then

JJpF K(q)K(z, y) = JJs11 K(s11 , J○K(q, r))K(z, y)


= Js11 K(J○K(q, r), z, y),

so if e = JxrtK(q) = JsrtK(JpF K(q)), then we have that

JeK(y) = JJpF K(q)K(e, y)


= Js11 K(J○K(q, r), e, y),

and so

x) = JJs11 K(J○K(q, r), e, y)K(⃗


JJeK(y)K(⃗ x)
= JJ○K(q, r)K(⟨e, y⟩, x⃗)
= JqK(JrK(⟨e, y⟩, x⃗))
= JqK(e, y, x⃗)



About the explicit recursion theorem The explicit recursion theorem is a stronger version of the SRT (hence some authors know it as the strong recursion theorem [17]). We can think of the explicit fixed point e as a program provider: in reality we have a family of programs {JeK(x) ∶ x ∈ D} such that each of them, when executed with some parameter y, takes into account not only y, but possibly also its own code JeK(x), the code of the provider e, and even the code of other programs of the family.

Corollary 2.2.5 (Nested Explicit Recursion Theorem)


Let q ∈ P gms. There exists e ∈ D such that for all i, x, y ∈ D

JJJeK(i)K(x)K(y) = JqK(e, i, x, y)

We call e from the above Corollary a nested explicit fixed point. We prove a slightly
stronger version, which tells us how to computably find nested explicit fixed points.

Corollary 2.2.6 (Computable Nested Explicit Recursion Theorem)


There exists a program nest ∈ D such that for every q ∈ P gms,

JJJJnestK(q)K(i)K(x)K(y) = JqK(JnestK(q), i, x, y)

Proof. Define programs T, t, r, F such that for every q, l, i, x, y ∈ D,

JT K(q, l, i, x) = Js11 K(q, l, i, x)


JF K(⟨l, i, x⟩, y) = ⟨l, i, x, y⟩
JtK(q) = J○K(q, F )
JrK(q) = Js11 K(T, JtK(q)).

Defining

JnestK(q) = JxrtK(JrK(q))

We have that for all q ∈ D, if e = JnestK(q), then for all i, x, y ∈ D:

JJJeK(i)K(x)K(y) = JJJJxrtK(JrK(q))K(i)K(x)K(y)
= JJJrK(q)K(e, i, x)K(y)
= JJJs11 K(T, JtK(q))K(e, i, x)K(y)
= JJT K(JtK(q), e, i, x)K(y)
= JJs11 K(JtK(q), e, i, x)K(y)
= JJtK(q)K(⟨e, i, x⟩, y)
= JqK(e, i, x, y),

so JnestK(q) is a nested explicit fixed point for q. ◻



Theorem 2.2.7 (Extended Recursion Theorem)

Let n ∈ N, let g1 , ..., gn be computable functions and let q ∈ P gms. Then there exists a computable function φ such that for all x, y ∈ D:

Jφ(y)K(x) = JqK(y, φ(g1 (y)), ..., φ(gn (y)), x)

Proof. Let f (z, y, x) = JqK(y, JzK(g1 (y)), ..., JzK(gn (y)), x). There must be a program q̄ that computes f. Applying the Explicit Recursion Theorem (2.2.3) to q̄, there exists a program e such that

JJeK(y)K(x) = f (e, y, x)
= JqK(y, JeK(g1 (y)), ..., JeK(gn (y)), x)

Setting φ = JeK we have our result. ◻

Theorem 2.2.8 (Smullyan's Double Recursion Theorem)

Let p, q ∈ P gms. There exist programs p′, q′ such that for all x ∈ D:

Jp′ K(x) = JpK(p′ , q′ , x)
Jq′ K(x) = JqK(p′ , q′ , x)

We will call the resulting programs p′, q′ from the last theorem double fixed points for p and q. We prove a stronger form of the theorem which, as with 2.2.2, allows us to compute double fixed points.

Theorem 2.2.9 ([17] Computable Smullyan's Double Recursion Theorem)

There exists a program smu such that for all programs p, q, JsmuK(p, q) = ⟨p′, q′⟩, where

Jp′ K(x) = JpK(p′ , q′ , x)
Jq′ K(x) = JqK(p′ , q′ , x)

Proof. By Turing Completeness and the s11 program, there exist programs t, r1 , r2
such that for all x, y, z ∈ D,

JJtK(x, y)K(z) = JxK(x, y, z)


Jr1 K(x, y, z) = ⟨JtK(x, y), JtK(y, x), z⟩
Jr2 K(x, y, z) = ⟨JtK(y, x), JtK(x, y), z⟩

Define smu such that for all p, q ∈ D

JsmuK(p, q) = ⟨JtK(J○K(p, r1 ), J○K(q, r2 )), JtK(J○K(q, r2 ), J○K(p, r1 ))⟩.

Given p, q and letting

p′ = JtK(J○K(p, r1 ), J○K(q, r2 ))
q ′ = JtK(J○K(q, r2 ), J○K(p, r1 )),

we have that for all z ∈ D:

Jp′ K(z) = JJtK(J○K(p, r1 ), J○K(q, r2 ))K(z)


= JJ○K(p, r1 )K(J○K(p, r1 ), J○K(q, r2 ), z)
= JpK(Jr1 K(J○K(p, r1 ), J○K(q, r2 ), z))
= JpK(JtK(J○K(p, r1 ), J○K(q, r2 )), JtK(J○K(q, r2 ), J○K(p, r1 )), z)
= JpK(p′ , q ′ , z).

In an identical fashion, we see that Jq ′ K(z) = JqK(p′ , q ′ , z), and so we have that smu
is a double fixed point finder program. ◻

About double fixed points The SRT allows us to define programs that refer to their own code when executing. The double recursion theorem allows us to define a pair of programs that may refer both to their own code and to each other's code. For example, using the SRT we can find a program p′ such that for all x ∈ D, Jp′ K(x) = J○K(p′ , x). Using the double recursion theorem we can find a pair of programs p, q ∈ D such that for all x ∈ D, JpK(x) = J○K(q, x) and JqK(x) = J○K(x, p).
Chapter 3

Swiss Pocket Knife

Introduction
Following the discussion of Chapter 2, we now turn our attention to the implementation of Kleene's Theorem in specific models of computation (with specific programming languages). We follow Jones's Swiss Pocket Knife for Computability [11], which discusses the complexity of the SRT. We study two concrete computation models:

• The TINY language, by Bonfante and Greenbaum [3], which works on tree-
structured data.

• The 1# language, by Lawrence Moss [14], a minimalistic language that works


on text register machines over the alphabet {1, #}.

3.1 The TINY language for tree-structured data


3.1.1 Tree-structured data
Given an alphabet A, let TA be the set of binary trees with leaves in A. Trees will be denoted using parenthesis notation:

• For a ∈ A, a is the tree containing only one node, a.

• If t1 , t2 ∈ TA then (t1 ⋅ t2 ) is the tree with a root node that as left child has tree
t1 , and as right child tree t2 .

The projection functions work as usual: πi (t1 ⋅ t2 ) = ti and πi (a) = a for i = 1, 2 and a ∈ A. The size of a tree is defined intuitively as ∣a∣ = 1 and ∣(t1 ⋅ t2 )∣ = ∣t1 ∣ + ∣t2 ∣ + 1. The word a1 a2 ...an ∈ A∗ is encoded in TA∪{nil} as (a1 ⋅ (a2 ⋅ (... ⋅ (an ⋅ nil))))
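For instance, in Python (an illustrative sketch, taking tuples as tree nodes and strings as leaves):

def encode_word(w):
    """Encode a word as the nested tree (a1 . (a2 . (... . (an . nil))))."""
    t = 'nil'
    for a in reversed(w):
        t = (a, t)
    return t

assert encode_word('ab') == ('a', ('b', 'nil'))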


List notation As deeply parenthesized structures are hard to read, we will some-
times use list notation for elements in TA .
• () stands for nil.

• (t1 t2 ...tn ) stands for (t1 ⋅ (t2 ⋅ (...(tn ⋅ nil)))).

3.1.2 TINY grammar and syntax


The syntax of TINY is given by the following grammar (in BNF [1]), where Var is some set of acceptable variable names (in our case, words over the Latin alphabet):
 
// Variables and Constants
x ∈ Var
t ∈ TA
// Expressions
Exp ::= x | '\'' t | 'cons' Exp Exp | 'hd' Exp | 'tl' Exp
// Commands
Cmd ::= x ':=' Exp | Cmd ';' Cmd
// Program
Program ::= 'read' x (',' x)* ';' Cmd '; write' x
 
Note: In the definition of Exp we write \' to "escape" the quote, as would be done in a high-level language. The grammar rule says that ′t, for t ∈ TA, is an Expression.

In general, a TINY program looks like:


 
read X1 , X2 , ..., Xn ; C ; write out
 
From now on, we refer to X, Y as variables in Var, and t as an element of TA .

Definition 3.1 (List syntactic sugar) For ease of notation define, for expressions E1 , ..., En , the expression:

list(E1 , ..., En ) ∶= cons(E1 , cons(E2 , ..., cons(En , ′())...))

3.1.3 Semantics of TINY


We follow the presentation on [3]. We need to define how the semantic function J_K
works with programs on TINY.

Definition 3.2 (Stores) The "state" of an execution of a TINY program is determined by the value of each variable. A store is a function σ ∶ Var → TA representing one such state. The set of all stores will be denoted by G.
Given σ ∈ G, t ∈ TA and X ∈ Var, define the store σ[X ↦ t] by:

σ[X ↦ t](Y) = t if Y = X, and σ[X ↦ t](Y) = σ(Y) otherwise.

Semantics of Expressions The semantics of an expression E in a current configuration σ, denoted JEKσ, is defined by:

JXKσ = σ(X)
J′tKσ = t
Jcons E F Kσ = (JEKσ ⋅ JF Kσ)
Jhd EKσ = π1 (JEKσ)
Jtl EKσ = π2 (JEKσ)

Informally, hd gets the head (or left child) of an expression, tl gets the tail (right child), and cons builds a tree with the specified children. As a quick example:

Jhd tl cons X Y Kσ = π1 (σ(Y))
Jtl hd cons X Y Kσ = π2 (σ(X))

Semantics of Commands Each command updates the store, so for C ∈ Cmd we have
JCK ∶ G → G. It is defined as follows:

JX ∶= EKσ = σ[X ↦ JEKσ]


For C, D ∈ Cmd: JC; DKσ = JDK(JCKσ)

Informally, X ∶= E sets the value of X to E and C; D represents executing C followed


by D.

Semantics of Programs Finally, we define how J_K works for programs. Consider a program p = read X1 , ..., Xn ; C; write Y. Given t1 , ..., tn ∈ TA , take the initial configuration (store) σ0 (t1 , ..., tn ) where σ0 (Xi ) = ti for i = 1, ..., n, and σ0 (Z) = nil for every other variable Z. Finally, we define

JpKⁿ(t1 , ..., tn ) = (JCKσ0 (t1 , ..., tn ))(Y)
JpKᵐ(t1 , ..., tm ) ↑ if m ≠ n
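To make these semantics concrete, here is a small Python evaluator for TINY. It is a sketch under assumed representations (trees are nested tuples or leaf strings; expressions and commands are tagged tuples mirroring the concrete syntax introduced below), not part of the thesis's formal development.

def eval_exp(e, store):
    tag = e[0]
    if tag == 'var':   return store.get(e[1], 'nil')
    if tag == 'quote': return e[1]
    if tag == 'cons':  return (eval_exp(e[1], store), eval_exp(e[2], store))
    if tag in ('hd', 'tl'):
        v = eval_exp(e[1], store)
        if not isinstance(v, tuple):
            return v                        # pi_i(a) = a on leaves
        return v[0] if tag == 'hd' else v[1]
    raise ValueError(tag)

def eval_cmd(c, store):
    if c[0] == ':=':                        # (':=', X, E)
        store[c[1]] = eval_exp(c[2], store)
    else:                                   # (';', C, D): run C, then D
        eval_cmd(c[1], store)
        eval_cmd(c[2], store)
    return store

def run(program, *inputs):
    xs, body, out = program                 # ((X1 ... Xn), C, Y)
    store = dict(zip(xs, inputs))           # every other variable is 'nil'
    return eval_cmd(body, store).get(out, 'nil')

# Example: swap the two children of the input tree.
swap = (('X',),
        (';', (':=', 'L', ('hd', ('var', 'X'))),
         (';', (':=', 'R', ('tl', ('var', 'X'))),
          (':=', 'Y', ('cons', ('var', 'R'), ('var', 'L'))))),
        'Y')
assert run(swap, ('a', 'b')) == ('b', 'a')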

3.1.4 TINY programs as data


As we have presented it so far, programs in TINY cannot be treated as data, as they do not follow a tree structure. To fix this, we follow Jones's approach from [10]. He defines a concrete syntax, a way of representing TINY programs in a tree structure.

Definition 3.3 (Concrete Syntax for TINY programs) Consider a program p = read X1 , ..., Xn ; C; write Y. The concrete syntax representation of p is given by (using the list notation introduced in 3.1.1)

p = ((X1 ... Xn ) C Y )

where C is replaced by the concrete syntax representation of the command C, obtained as follows:



• X ↦ X for X ∈ Var.
• ′t ↦ (quote t) for t ∈ TA .
• cons E F ↦ (cons E F ).
• hd E ↦ (hd E).
• tl E ↦ (tl E).
• X ∶= E ↦ (∶= X E).
• C; D ↦ (; C D).

On the right-hand sides, E, F, C and D stand for the (recursively obtained) representations of the corresponding subexpressions and subcommands.

Using this syntax we now have P gms ⊆ D = TA∪{nil,quote,cons,hd,tl,∶=,;}∪Var .


We will call the regular TINY syntax informal syntax.

3.1.5 SRT in TINY


We now return to the discussion in [11] to implement the SRT in TINY. TINY is not Turing Complete: as there are no loops or similar constructs, the running time of a TINY program is determined purely by its length, not by the size of the input, so TINY is very limited in terms of the functions it computes (which makes the fact that the SRT holds in TINY all the more remarkable). Therefore TINY is not an acceptable programming language, and the proof of the SRT for acceptable programming languages does not suffice to show that the SRT holds in TINY. We can, however, use the ideas from that proof to find fixed points of TINY programs.

s − 1 − 1 in TINY

First, we will need a specializer, that is, an s11 program such that for every program p ∈ P gms and s, d ∈ D: JpK²(s, d) = JJs11 K(p, s)K(d). Consider a program p = read q, d; Cp ; write out. Suppose we want to fix q = s for some s ∈ D. The resulting specialized program could be p∗ = read d; q := ′s; Cp ; write out. We need a general formulation for s11. In concrete syntax, we have the program

p = ((q d) Cp out)

and we want some program

s11 = read pgm, s; Cspec ; write outpgm

such that

Js11 K²(p, s) = p∗ = ((d) q ∶= ′s; Cp out)

So define the body of the specializer Cspec :


 
inputvar := hd hd pgm ;     /* inputvar is the first argument of 'pgm',
                               the one that will be fixed */
C := hd tl pgm ;            /* C contains the body of program 'pgm' */
outputvar := hd tl tl pgm ; /* outputvar contains the name of the
                               output variable of 'pgm' */
initialise := list ( ':= , inputvar , list ( 'quote , s ) )
                            /* the part that "fixes" variable inputvar */
body := list ( '; , initialise , C )
outpgm := list ( tl hd pgm , body , outputvar )
                            /* tl hd pgm contains the argument list of
                               'pgm' minus its first argument */
 
Let us evaluate the execution of s11 by keeping track of the values that the variables acquire. All non-mentioned variables are either nil or not relevant.

Js11 K(p, s) →
At the beginning: [pgm = p]
After inputvar: [pgm = p, inputvar = q]
After C: [pgm = p, inputvar = q, C = Cp ]
After outputvar: [pgm = p, inputvar = q, C = Cp , outputvar = out]
After initialise: [pgm = p, inputvar = q, C = Cp , outputvar = out, initialise = (∶= q (quote s))]
After body: [pgm = p, outputvar = out, body = (; (∶= q (quote s)) Cp )]
After outpgm: [outpgm = ((d) (; (∶= q (quote s)) Cp ) out)]

Implementing SRT

Now that we have an s11 program, we are ready to implement the SRT. We follow the steps of the proof of Theorem 2.2.1.
Let p = read q, d; Cp ; write out. We want to find some p′ such that

JpK²(p′ , d) = Jp′ K(d)

1. Write a program p̄ such that

Jp̄K²(q, d) = JpK²(Js11 K²(q, q), d)

Define p̄ =
 
read q , d ;
/* First, we need to run s11 with arguments (q, q). So we set its
   inputs to (q, q) and run its body (as in 3.1.5). */
pgm := q ; s := q ; Cspec ;
/* The result of the last operation will be in variable outpgm. We
   need to run the body of p with arguments (outpgm, d) and output
   the result, which will be stored in variable out. */
q := outpgm ; Cp ;
write out
 

Indeed, p̄ is defined as we needed it.

2. Write the program p′ = Js11 K(p̄, p̄). By taking p̄ in its concrete syntax and executing s11 we get our p′, which we already proved to be a fixed point for p. But we can also write p′ directly, using the fact that p′ = Js11 K(p̄, p̄) and that Jp′ K(d) = JpK(p′ , d). p′ =
 
read d ;
/* Set the arguments of s11 to p̄ and execute its body. */
pgm := 'p̄ ; s := 'p̄ ; Cspec ;
/* outpgm now holds the result of Js11 K(p̄, p̄), i.e. p′ itself. We then
   run the body of p with arguments (outpgm, d). */
q := outpgm ; Cp ; write out

One can check that computing Js11 K²(p̄, p̄) would indeed yield exactly the program written above.

Observation 3.1.1 (Self-reproduction) Note that in p′, the statement q ∶= outpgm assigns to q the value of Js11 K(p̄, p̄), which is p′ itself. So there is a code segment in p′ that assigns to q the entire text of p′.

3.2 1#, a language for text register machines


In Section 1.2.1 we defined URMs (Unlimited Register Machines). Unlimited text register machines work in a similar fashion, but registers store words over some alphabet instead of natural numbers. So an unlimited text register machine consists of a set of registers R1 , R2 , ..., each storing some word w1 , w2 , .... In [14], Lawrence Moss defines 1#, a language to work on UTRMs (Unlimited Text Register Machines) over the alphabet A = {1, #}; our data set is the set of words D = A∗. Here, we give an alternative presentation of Moss's 1#.

3.2.1 1# grammar and syntax


The syntax of 1# is given by the grammar (in EBNF format):
 
// Constants
a ∈ {1, #}
n ∈ N
z ∈ Z∗
// Commands and Programs
Cmd := 'A(' n ',' a ')' | 'J(' z ')' | 'C(' n ')'
Program := (Cmd)*
 
In general, a 1# program looks like p = I1 I2 ...In where each Ii is a Cmd.

3.2.2 Semantics of 1#
Executing 1# programs Execution of 1# programs is analogous to the execution of the URM programs defined in 1.2.1. The semantics of 1# are most easily explained informally. Consider a 1# program p = I1 ... IL . Execution of p with inputs a1 , ..., an ∈ A∗ is done by sequentially executing its instructions, starting with I1 , with Ri storing word ai for i = 1, ..., n and the rest of the registers empty. The execution of an instruction Ik is given by:

• Add instruction: If Ik = A(n, a), then append a to the content of register Rn


and go to the next instruction. If there is no next instruction, execution halts.

• Jump instruction: If Ik = J(z), then execute instruction Ik+z . If there is no


such instruction, execution stops.

• Cases instruction: If Ik = C(n), read (and remove) the first letter of Rn :

– If Rn was empty, go to instruction (Ik+1 ).


– If a 1 was read, jump two instructions (go to Ik+2 ).
– If a # was read, jump three instructions (go to Ik+3 ).

As before, if at any point we try to execute a nonexistent instruction, the execution stops.

Correct and incorrect stopping of program execution Programs may or may not
stop. As in 1.2.1, we require that programs stop by trying to execute an instruction
just after the last instruction of the program. Stopping an execution this way is called
halting. Programs may stop in other ways, in which case we will say they stopped
incorrectly. For example, consider the following program, which for all inputs will
stop incorrectly by trying to execute an instruction before the first one.
 
J(-1)
 

Definition 3.4 (Halting) Execution of some program p = I1 ... IL halts if one of the following happens at some point:

1. The program goes to IL and IL is an Add instruction (A(n, a)).

2. The program goes to Ik and Ik is a Jump instruction of the form J(z), where k + z = L + 1.

3. The program goes to Ik and Ik is a Cases instruction of the form C(n), and:

• Register Rn is empty and k + 1 = L + 1.
• The first letter of Rn (before executing Ik ) is a '1' and k + 2 = L + 1.
• The first letter of Rn (before executing Ik ) is a '#' and k + 3 = L + 1.

Now we can define, for a program p, n ∈ N and input w⃗ ∈ Dⁿ:

JpKⁿ(w⃗) = z ⇔def the execution of p with initial state w⃗ halts with z in R1.
JpKⁿ(w⃗) ↑ otherwise (if execution does not stop, or stops incorrectly).

As notation, JpK() = JpK(ε), where ε is the empty word.
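The semantics above translate directly into a small interpreter. The following Python sketch is an illustration only: the tuple representation of instructions is an assumption, and the step budget is a convenience for experimenting. It distinguishes halting from stopping incorrectly exactly as in Definition 3.4.

# Instructions: ('A', n, a), ('J', z), ('C', n); registers are 1-indexed
# strings over {1, #}. None stands for an undefined result (incorrect stop,
# or the step budget running out).
def run(prog, *inputs, max_steps=100_000):
    L = len(prog)
    regs = {i + 1: w for i, w in enumerate(inputs)}
    k = 1                                   # instruction pointer, 1-indexed
    for _ in range(max_steps):
        if k == L + 1:                      # halted correctly
            return regs.get(1, '')
        if not 1 <= k <= L:                 # stopped incorrectly
            return None
        ins = prog[k - 1]
        if ins[0] == 'A':                   # A(n, a): append a to Rn
            regs[ins[1]] = regs.get(ins[1], '') + ins[2]
            k += 1
        elif ins[0] == 'J':                 # J(z): relative jump
            k += ins[1]
        else:                               # C(n): cases on first letter of Rn
            w = regs.get(ins[1], '')
            if w == '':
                k += 1
            else:
                regs[ins[1]] = w[1:]
                k += 2 if w[0] == '1' else 3
    return None

# move1,2 from Definition 3.5 below empties R1 into R2 and halts:
move12 = [('C', 1), ('J', 6), ('J', 3), ('A', 2, '#'),
          ('J', -4), ('A', 2, '1'), ('J', -6)]
assert run(move12, '1#1') == ''             # output register R1 is empty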

3.2.3 1# programs as data


As with TINY programs, 1# programs as we defined them cannot be treated as data. So, as we did with TINY in Section 3.1.4, we define a concrete syntax, a way of representing 1# programs as elements of D. This syntax is how Lawrence Moss originally presents 1# in [14]. The translation from "informal" to concrete syntax is straightforward. If p = I1 ... IL then, in concrete syntax, p is the concatenation of the representations of I1 , ..., IL , with:

• A(n, 1) = 1ⁿ#
• A(n, #) = 1ⁿ##
• J(z) = 1ᶻ### if z > 0
• J(z) = 1⁻ᶻ#### if z < 0
• C(n) = 1ⁿ#####
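Assuming the same instruction-tuple representation as in the interpreter sketch of the previous subsection, the translation to concrete syntax takes a few lines of Python (illustrative only):

def encode(prog):
    """Translate instruction tuples into Moss's concrete 1# syntax."""
    out = []
    for ins in prog:
        if ins[0] == 'A':                   # A(n, a)
            out.append('1' * ins[1] + ('#' if ins[2] == '1' else '##'))
        elif ins[0] == 'J':                 # J(z)
            z = ins[1]
            out.append('1' * abs(z) + ('###' if z > 0 else '####'))
        else:                               # C(n)
            out.append('1' * ins[1] + '#####')
    return ''.join(out)

assert encode([('C', 1), ('J', -1)]) == '1#####1####'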

SRT in 1#
In [14], Moss provides a proof of the SRT in 1# without the use of s11. Here, we show both ways of implementing the SRT in 1#: one following the proof of Theorem 2.2.1, which uses s11, and the other provided by Moss.
We start by defining some auxiliary programs that will be useful later on.

Definition 3.5 (The Move program) Given n, m ∈ N with n ≠ m, the moven,m pro-
gram writes the contents of Rn onto the end of Rm , emptying Rn in the process.
moven,m =
 
C(n)    // Cases on Rn
J(6)    // Case empty (forward 6, to the end)
J(3)    // Case 1 (forward 3, to the Case 1 implementation)
A(m,#)  // Case # (write # to Rm)
J(-4)   // Back 4 (to the Cases statement, for the loop)
A(m,1)  // Case 1 implementation (write 1 to Rm)
J(-6)   // Back 6 (to the Cases statement, for the loop)


 

Of course, the program above does not strictly follow 1# syntax, and variables n, m
should be replaced with actual numbers.

Definition 3.6 (The Copy program) Similar to move, copy copies the contents of one register onto the end of another. However, the original register is not emptied. The downside is that we need a third, auxiliary register to perform this operation. The program copyn,m,k copies the contents of Rn to Rm, using Rk as an auxiliary register. The idea is simple: emulate move, but now move the contents from Rn to both Rm and Rk. At the end, move the contents back from Rk to Rn. copyn,m,k =
 
C(n)    // Cases on Rn
J(8)    // Case empty (forward 8, to the move_{k,n} subroutine)
J(4)    // Case 1 (forward 4, to the Case 1 implementation)
A(m,#)  // Case # (write # to Rm and Rk)
A(k,#)
J(-5)   // Back 5 (to the Cases statement, for the loop)
A(m,1)  // Case 1 implementation (write 1 to Rm and Rk)
A(k,1)
J(-8)   // Back 8 (to the Cases statement, for the loop)
movek,n
 
Notice that for copy to work correctly, the auxiliary register must be empty before
its execution.

Definition 3.7 (The Write program) write is a program such that, for every x ∈ D, JwriteK(x) = y where

JyK() = x

In words, write is a program that, on input x, outputs a program that outputs x. It is not difficult to come up with such a program: for every 1 read, emit an A(1, 1) instruction (the text 1#), and for every # read, emit an A(1, #) instruction (the text 1##). We use an auxiliary register to build the result, and then move the result to R1. write =
 
C(1)    // Cases on R1
J(9)    // Case empty. Forward 9, to the move_{2,1} subroutine.
J(5)    // Case 1. Forward 5, to the Case 1 implementation.
A(2,1)  // Case #: write '1##' (= A(1,#)) to R2. Add '1' to R2.
A(2,#)  // Add '#' to R2.
A(2,#)  // Add '#' to R2.
J(-6)   // Back 6, to the Cases statement.
A(2,1)  // Case 1: write '1#' (= A(1,1)) to R2. Add '1' to R2.
A(2,#)  // Add '#' to R2.
J(-9)   // Back 9, to the Cases statement.
move2,1
 
Again, the program above does not strictly follow 1# syntax. We will use the names
of known programs, if possible, to make code shorter. To be valid code, we would
need to write out the whole move program at the end.
3.2. 1#, a language for text register machines 53

Observation 3.2.1 (Concatenation of 1# programs) As Jones notes in [11], one key aspect of the implementation of important programs in 1# is the fact that, for p, q ∈ P gms and x ∈ D,

Jp ∣ qK(x) = JqK(JpK(x))

where ∣ is the concatenation operator, provided that on all inputs p and q never stop incorrectly (so either they halt, or the execution never stops).

3.2.4 SRT with s11


Instead of showing that 1# is an acceptable programming language, we emulate the
work on TINY of Section 3.1.5 to implement SRT. We first need to implement s11 .
Although he does not use it to prove the SRT, Moss shows how to implement s11 in
1#.

s − 1 − 1 in 1#

In 1#, s11 =
 
move1,3
move2,1
write
move1,2
JwriteK(move1,2)
move2,1
move3,1
 
Let us follow the execution of s11 with input (p, s) ∈ D². To do this, we note the state of the text register machine with tuples (x, y, z), meaning that R1 stores x, R2 stores y, R3 stores z, and the rest of the registers are empty. We only keep track of registers 1, 2 and 3 because the s11 program does not use any other register.

At the beginning, the state is the input: (p, s, ε)
After move1,3: (ε, s, p)
After move2,1: (s, ε, p)
After write: (JwriteK(s), ε, p)
After move1,2: (ε, JwriteK(s), p)
After JwriteK(move1,2): (move1,2, JwriteK(s), p)
After move2,1: (move1,2 ∣ JwriteK(s), ε, p)
After move3,1: (move1,2 ∣ JwriteK(s) ∣ p, ε, ε)

We have then that

Js11 K(p, s) = move1,2 ∣ JwriteK(s) ∣ p



And so, for d ∈ D:

JJs11 K(p, s)K(d) →
At the beginning, the state is the input: (d, ε)
After move1,2: (ε, d)
After JwriteK(s): (s, d)
After p: (JpK(s, d), ...)

So JJs11 K(p, s)K(d) = JpK(s, d).

First proof of SRT

Again, we follow the steps of the proof of Theorem 2.2.1.

1. Write a program p̄ such that

Jp̄K²(q, d) = JpK²(Js11 K²(q, q), d)

Define p̄ =
 
move2,4    /* Since s11 uses registers 1, 2, 3, we move the
              second argument to R4, as we will need it later. */
copy1,2,3  /* We want to run Js11K(q, q),
              where q is the first argument. */
s11
move4,2    // Bring back the original second argument to run p
p
 

We have, for q, d ∈ D:

Jp̄K(q, d) →
At the beginning, the state is the input: (q, d, ε, ε)
After move2,4: (q, ε, ε, d)
After copy1,2,3: (q, q, ε, d)
After s11: (Js11 K(q, q), ε, ε, d)
After move4,2: (Js11 K(q, q), d, ε, ε)
After p: (JpK(Js11 K(q, q), d), ...)

So Jp̄K(q, d) = JpK(Js11 K(q, q), d)

2. As in the proof, set p′ = Js11 K(p̄, p̄). It makes no sense to state explicitly what p′ looks like here because, unlike the SRT implementation in TINY, the explicit implementation yields no additional information about p′.

An implementation of SRT: A computable fixed point finder

By Theorem 2.2.2 we know that, in an acceptable programming language, we can find fixed points computably. Its proof, however, uses Turing Completeness and the universal program. Still, we can implement the SRT in 1# without needing to prove that 1# is an acceptable programming language. The proof above depends mostly on being able to construct p̄ for each p ∈ P gms. Notice that not only can we describe how to do it, we can write a 1# program that does it. Call this program bar, so that JbarK(p) = p̄. bar =
 
move1,2
JwriteK( move2,4
         copy1,2,3
         s11
         move4,2 )
move2,1
 

Comparing this code with the code for p̄ in the last section, the equation JbarK(p) = p̄ is straightforward. Following the proof of the SRT, which tells us that Js11 K(p̄, p̄) is a fixed point for p, we can now compute fixed points.

Proposition 3.2.1 (The SRT program)


The srt program is a fixed point computer. More explicitly, for all p ∈ P gms, d ∈
D we have

JpK(JsrtK(p), d) = JJsrtK(p)K(d)

Define srt =
 
bar
copy1,2,3
s11
 

Proof. Let p ∈ P gms. We follow the execution of srt with input p:

At the beginning: (p, ε, ε)
After bar: (JbarK(p), ε, ε) = (p̄, ε, ε)
After copy1,2,3 ∣ s11: (Js11 K(p̄, p̄), ε, ε)

The proof of the SRT tells us that Js11 K(p̄, p̄) is a fixed point for p. ◻

3.2.5 Moss’s Proof of SRT


The key to Moss’s construction is, for some program p, showing that there exists a
program q̂ such that for all r ∈ P gms and d ∈ D we have JJq̂K(r)K(d) = JpK(JrK(r), d).
We start by defining the diag program:

Definition 3.8 (The diag program) diag is a program such that, for every x ∈ P gms,

JJdiagK(x)K() = JxK(x)

Here, we write it so that JdiagK(x) = JwriteK(x) ∣ x, and then:

JJdiagK(x)K() = JJwriteK(x) ∣ xK()
= JxK(JJwriteK(x)K())
= JxK(x)

Define (a possible implementation of) diag =


 
copy1,3,2
write
move3,1
 
So, for x ∈ P gms:

JdiagK(x) →
At the beginning: (x, ε, ε)
After copy1,3,2: (x, ε, x)
After write: (JwriteK(x), ε, x)
After move3,1: (JwriteK(x) ∣ x, ε, ε)

So JdiagK(x) = JwriteK(x) ∣ x

Construction of q̂ Now define the aforementioned q̂ by


 
diag
move1,2
JwriteK(move1,4)
move2,1
JwriteK(move4,2)
JwriteK(p)
 
Though we will not do it step by step, it is easy to check that for r ∈ P gms

Jq̂K(r) = move1,4 ∣ JdiagK(r) ∣ move4,2 ∣ p

And so for d ∈ D

JJq̂K(r)K(d) = JpK(JrK(r), d)

Proof of SRT If now we set p′ = Jq̂K(q̂) we have, for d ∈ D

Jp′ K(d) = JJq̂K(q̂)K(d)


= JpK(Jq̂K(q̂), d)
= JpK(p′ , d)
Chapter 4

The SRT in Computer Virology

Finally, we look at a practical application of Kleene's Theorem by studying an abstract characterization of computer viruses. Though this has been attempted by plenty of authors, we follow the presentation of Bonfante et al., developed throughout a series of papers: [4], [5] and [6].

4.1 Preliminaries
The results presented in this chapter hold for any acceptable language, as defined
in Section 2.1.4, with sets of programs and data P gms ⊆ D and computable pairings
with computable projections ⟨⟩. As in Chapter 2, all programs and functions on the
general statements (those that hold for every acceptable Programming Language) are
really unary, using the loose notation of (x1 , ..., xn ) for ⟨x1 , ..., xn ⟩. Of course, this
unary assumption need not be true for the examples given in specific computation
models, which in this chapter will be presented in the WHILE+ language, an extension
of the TINY language studied in Section 3.1.

4.1.1 The WHILE+ Language


The syntax of WHILE+ is given by the following grammar:
 
// Variables and Constants
t ∈ TA
Var ::= ID
// Expressions
Exp ::= Var | '\'' t | 'cons' Exp Exp | 'hd' Exp | 'tl' Exp |
        'exec_n' '(' Exp_0 ',' Exp_1 ',' ... ',' Exp_n ')' |
        'spec^m_n' '(' Exp_0 ',' Exp_1 ',' ... ',' Exp_m ')'     (n ≥ 1)
// Commands
Cmd ::= Var ':=' Exp | Cmd ';' Cmd | 'while' Exp 'do' Cmd 'end' |
        'if' Exp 'then' Cmd 'else' Cmd 'end'
// Program
Program ::= 'read' Var (',' Var)* ';' Cmd '; write' Var


 


As before, we assume there is a concrete syntax for WHILE+ programs, so that every
program may be expressed uniquely as a member of TA . For the purposes of this
chapter, we do not need to specify this concrete syntax.

Semantics of WHILE+

Let D = TA , where A is an alphabet containing all Unicode characters, plus the


WHILE+ keywords (’read’, ’write’, ...). Recall from Section 3.1 that the state of an
execution of a WHILE+ program is given by the values of the variables, which are
stored in stores, that is, functions σ ∶ Var → D. As before, let G be the set of all
stores.

Semantics of Expressions: WHILE+ contains two new expressions (with respect to TINY), with semantics:

Jexec_n(E0 , E1 , ..., En )Kσ = JJE0 KσKⁿ(JE1 Kσ, ..., JEn Kσ)   for all n ∈ N.
JJspec^m_n(E0 , E1 , ..., Em )KσK(x1 , ..., xn ) = JJE0 KσKᵐ⁺ⁿ(JE1 Kσ, ..., JEm Kσ, x1 , ..., xn )   for all n, m ∈ N and x1 , ..., xn ∈ D.
Plainly, WHILE+ has built-in universal and s^m_n programs, as required by Rogers' axioms (Section 2.1.4).

Semantics of Commands: WHILE+ introduces two new commands (with respect to TINY), with semantics:

Jif E then C1 else C2 endKσ = JC2 Kσ if JEKσ = nil, and JC1 Kσ otherwise.

Jwhile E do C endKσ = σ if JEKσ = nil, and JC; while E do C endKσ otherwise.

These new commands work as their regular counterparts in high-level programming languages, with nil evaluating to false and everything else to true.

WHILE+ as an Acceptable Programming Language

WHILE+'s syntax makes it acceptable almost by definition, as it has built-in universal (exec), s^m_n (spec) and pairing (cons) constructs. Since it also has conditional branching and can store an arbitrary number of variables, it can be proven to be Turing Complete [3], though this will not be used in any of the examples.

4.2 Defining Computer Viruses


Following Bonfante’s model, we have a scenario where a program p is executed within
a local environment x, modifying it so that the new environment is JpK(x), if JpK(x) ↓.

Here, x represents a finite sequence ⟨x1 , ..., xn ⟩ of elements of the local environment, such as files, parameters, etc. We call the tuple ⟨p, x⟩ the execution environment. When a virus comes into the environment, it infects programs. So given a virus v and a program p, we should be able to define the infected form of p. One approach, taken by Adleman (a pioneer in formalizing computer virology) in [2], is to define the infected form directly as JvK(p). However, as Bonfante et al. point out in [5], this definition leaves out some scenarios, as Adleman's viruses do not take into account the local environment when infecting a program. This leads to a (more general) definition of a computer virus:

Definition 4.1 (Computer Virus) Let B be a computable function. A virus with re-
spect to (wrt) B is a program v satisfying
JB(v, p)K(x) = JvK(p, x) (4.1)
for all p ∈ P gms and x ∈ D.

Observation 4.2.1 Function B in the definition above is called the propagation function of virus v. The above definition makes sense as it takes into account some factors of viral infections:

• A virus v infects programs. Such infection is determined by the propagation function B(v, p).

• If v is viral, then an execution of the infected program B(v, p) over some context x should include an execution of v. Such an execution of v takes into account not only the context x, but also the program which it infected; hence the JvK(p, x).

Running an infected program B(v, p) over a context x returns as result a new context z. This execution of an infected program can really be seen as an execution of the virus v, which takes into account its environment x and the program p which it infected.

4.3 Monomorphic Viruses


Though this term is not explicitly used by Bonfante et al. in their work, it is clear that they divide viruses into polymorphic and non-polymorphic, or monomorphic, ones. Monomorphic viruses duplicate/propagate themselves without modifying their code. This concept will become clearer once we describe polymorphic viruses in the next section.

4.3.1 Blueprint duplication and distribution engines


Definition 4.2 (Blueprint Virus) A virus v is a blueprint virus if for some computable
function g, for all p ∈ P gms and x ∈ D,
JvK(p, x) = g(v, p, x)

g is the virus specification function for v.


Blueprint viruses do not use their propagation function in their specification.

Proposition 4.3.1 (Existence of Blueprint Viruses)


For every program g, there exists a virus v such that

JvK(p, x) = JgK(v, p, x)

Proof. Let v be a fixed point for g, which exists by the SRT, so that JvK(p, x) = JgK(v, p, x).
Taking B = Js11 K we have that

JgK(v, p, x) = JvK(p, x) = JJs11 K(v, p)K(x) = JB(v, p)K(x)

so v is a virus wrt Js11 K. ◻
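To make this concrete, here is a hypothetical one-line specification program g for a virus that ignores the infected program p and replaces the whole environment with a single copy of its own code; the proposition then guarantees a fixed point v with JvK(p, x) = JgK(v, p, x) = ⟨v⟩:

// g: a hypothetical blueprint specification
read v, p, x;
out := cons v nil;
write out;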

Definition 4.3 (Virus distribution engine) A distribution engine is a program dv such


that for every viral specification program g, Jdv K(g) is a virus wrt a fixed propagation
function B.
Proposition 4.3.2 (Existence of distribution engines)
There exists a viral distribution engine dv such that for every virus specification
program g, Jdv K(g) is a blueprint virus for JgK wrt Js11 K.
Proof. Theorem 2.2.2 does precisely what we need, which is to compute fixed points.
Let dv = srt2 as in the Theorem, and so

JgK(Jdv K(g), p, x) = JJdv K(g)K(p, x)

for all g, p ∈ P gms, x ∈ D. ◻


The virus distribution engine is useful in practical terms, as it allows us to
effectively obtain blueprint viruses from their specification programs, and not
merely know of their existence.
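For example (a sketch, assuming srt2 from Theorem 2.2.2 is available as a program literal, just as P ad and ○ are used in later examples):

read g;
virus := exec(srt2, g);   // JvirusK(p, x) = JgK(virus, p, x)
write virus;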

Example 4.3.1 (LoveLetter Virus) “LoveLetter” was a computer worm that attacked
millions of personal computers in May 2000. It spread as an email message with
subject “ILOVEYOU” and an attachment “LOVE-LETTER-FOR-YOU.txt.vbs”. The virus
damaged local files by overwriting them, and then sent itself to all contacts found
in the user’s address book.
In [6], an abstract model for LoveLetter is presented. The program to be infected is
the e-mail outbox, as its behavior will be modified to send copies of the virus to the
infected user’s contacts. The execution environment consists of the user’s files, which
will be affected by the virus in two ways:

• The local address book will be read to find the contacts to send the virus to.

• The local files will be overwritten.



More specifically, we model the following entities:

1. An e-mail is a tuple m = ⟨@, y⟩, associating a destination address @ and an


e-mail content y.

2. The outbox will be represented as a list of e-mails mb = ⟨m1 , ..., mk ⟩. To send


a mail m, we add it to the outbox (via cons mb m).

3. A local file structure d.

4. A local address book @bk = ⟨@1 , ..., @n ⟩.

5. The local environment is presented as x = ⟨d, @bk⟩.

We now define the LoveLetterSpecificationProgram:=


 
read v, mb, x;
d := hd x; @bk := tl x;
newFiles := nil;   // Build the new file structure.
while (d) do
  d := tl d;       /* We do not care about the original file, we just
                      need to overwrite every one. */
  newFiles := cons v newFiles;
end;
// Loop to modify the mailbox.
while (@bk) do
  @ := hd @bk;
  newMail := cons @ v;
  mb := cons newMail mb;
  @bk := tl @bk;
end;
/* The new environment is made up of the modified outbox and
   the overwritten files. */
newEnvironment := cons mb newFiles;
write newEnvironment;
 
Using the distribution engine, Jdv K(LoveLetterSpecificationProgram) yields
the LoveLetter virus in WHILE+ .

Example 4.3.2 (More Blueprint Examples) 1. A virus that deletes everything. This
virus acts independently of the program infected and so it is readily defined
by JvK(p, x) = nil, where we take nil to represent the "empty" environment.
A one-line specification for it is sketched after this example.

2. A virus that appends itself to every element of the environment. It follows the
equation JvK(p, ⟨x1 , ..., xn ⟩) = ⟨J○K(v, x1 ), ..., J○K(v, xn )⟩ (○ is the computable
composition program). This virus can easily be defined through a blueprint
viral specification function, from which we obtain the explicit viral code via a
distribution engine. We would expect the new elements J○K(v, d) to be larger
than the original d. Executing one of those new elements in the environment
appends the virus again to the elements of the environment until it runs out of
memory. The virus "Jerusalem" worked this way [4].
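The promised sketch for the first example, a hypothetical blueprint specification whose fixed point deletes everything:

// g: specification of the "delete everything" virus
read v, p, x;
out := nil;
write out;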

4.3.2 Smith Viruses


Definition 4.4 (Smith virus) A smith virus is a pair of programs v, B such that for
some computable function g, for all p, x ∈ D:

JvK(p, x) = JJBK(v, p)K(x) (v is a virus wrt JBK)


JvK(p, x) = g(B, v, p, x).

g is known as a smith virus specification function.


Being able to specify smith viruses via a specification function g is justified using
the Explicit Recursion Theorem (2.2.3).

Theorem 4.3.3 ( [5] Existence of smith viruses)


Let g be a computable function. There exists a smith virus v, B such that g(B, v, p, x) =
JvK(p, x) for all p, x ∈ D.
Proof. By Turing Completeness and the computable pairing/projection functions,
there is a program t such that for all y, z, p, x ∈ D:

JtK(y, ⟨z, p⟩, x) = g(y, z, p, x).

The Explicit Recursion Theorem tells us that there exists a program B such that for
all z, p, x ∈ D:

JJBK(⟨z, p⟩)K(x) = JtK(B, ⟨z, p⟩, x)


= g(B, z, p, x).

If program r is such that JrK(z, p, x) = g(B, z, p, x), the SRT tells us there exists a
fixed point v of r. So, for all p, x ∈ D:

JvK(p, x) = JrK(v, p, x)
= g(B, v, p, x)
= JJBK(v, p)K(x),

so v, B is a smith virus with specification g. ◻


As with Blueprint viruses, given a smith specification function g, we would like
to compute a smith virus with such specification.

Definition 4.5 (Smith Virus Distribution Engine) A smith virus distribution engine is a
pair of programs dv , dB such that for every program q, Jdv K(q), JdB K(q) is a smith
virus with specification function JqK.

The proof of Theorem 4.3.3 depends on being able to find explicit and regular
fixed points. Since we can find these computably, we should be able to compute
smith viruses from their specification functions.
Theorem 4.3.4 (A smith virus distribution engine)
There exists a smith virus distribution engine.
Proof. There exists a program T such that for all z, y, p, x ∈ D:
JT K(y, ⟨z, p⟩, x) = ⟨y, z, p, x⟩.
If ○ is the composition program then define programs dB , r and dv so that for all
q ∈ D:
JdB K(q) = JxrtK(J○K(q, T ))
JrK(q) = Js11 K(q, JdB K(q))
Jdv K(q) = JsrtK(JrK(q)).
So, for q ∈ D, if B = JdB K(q) and v = Jdv K(q), then for all p, x ∈ D:
JvK(p, x) = JJsrtK(JrK(q))K(p, x)
= JJrK(q)K(JsrtK(JrK(q)), p, x)
= JJrK(q)K(v, p, x)
= JJs11 K(q, JdB K(q))K(v, p, x)
= JqK(JdB K(q), v, p, x)
= JqK(B, v, p, x)
= JJ○K(q, T )K(B, ⟨v, p⟩, x)
= JJ○K(q, T )K(JxrtK(J○K(q, T )), ⟨v, p⟩, x)
= JJJxrtK(J○K(q, T ))K(v, p)K(x)
= JJBK(v, p)K(x),
so Jdv K(q), JdB K(q) is a smith virus with specification function JqK, and dv , dB is a
smith virus distribution. ◻
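In use (a hypothetical sketch, writing dB and dv for the engine programs of the proof): given a smith specification program q, the pair is obtained by running both engines.

read q;
B := exec(dB, q);
v := exec(dv, q);
pair := cons v B;   // ⟨v, B⟩ is a smith virus with specification JqK
write pair;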

Example 4.3.3 (A parasitic virus) As Bonfante et al. explain, "Parasitic viruses in-
sert themselves into existing files. When an infected host is executed, first the virus
infects a new host, then it gives the control back to the original host" [6]. For this
case, we model the environment as a triple ⟨p, q, x⟩ where p, q are programs and x
is the rest of environment variables, as usual. The definition of virus tells us that a
parasitic virus v, along with its propagation program B satisfy:
JvK(p, q, x) = JJBK(v, p)K(q, x)
By the description of a parasitic virus, the following equation also holds:
JvK(p, q, x) = JpK(JBK(v, q), x)
We can define a parasitic virus via a smith specification program. In WHILE+ , q =

 
read B, v, p, d;
// d is really a tuple ⟨q, x⟩
q := hd d; x := tl d;
infected_form := exec(B, v, q);
newEnvironment := exec(p, infected_form, x);
write newEnvironment;
 

Example 4.3.4 Consider a virus that, no matter what program it infects, propagates
to all other pieces of the environment when the infected program is run.
Such a virus could be described by the following equation:

JJBK(v, p)K(x) = JBK(v, x)

which in turn is easily expressed via a smith specification program:

JqK(B, v, p, x) = JBK(v, x)

In the WHILE+ language we can make a more detailed version, where for all n ∈ N
and x1 , ..., xn ∈ D:

JJBK(v, p)K(⟨x1 , ..., xn ⟩) = ⟨JBK(v, x1 ), ..., JBK(v, xn )⟩.


 
read B, v, p, x;
newEnvironment := nil;
while (x) do
  infectedElement := exec(B, v, hd x);
  newEnvironment := cons infectedElement newEnvironment;
  x := tl x;
end;
write newEnvironment;
 

4.4 Polymorphic Viruses


The viruses that we have considered so far do not modify their code when propagating.
Polymorphic viruses may modify their code before propagating, for example mutating
it so that the function computed stays the same but the code is different, making
them harder to detect. This can be accomplished via padding programs.

Definition 4.6 (Padding Programs) A padding program P ad is a program such that


for all q, y ∈ D:

JJP adK(q, y)K = JqK,

and if ⟨q, y⟩ ≠ ⟨q ′ , y ′ ⟩ then JP adK(q, y) ≠ JP adK(q ′ , y ′ ).



Lemma 4.4.1 (A Padding Program)


There exists a Padding Program P ad.
Proof. Let t be a program that for all q, y, x ∈ D, JtK(⟨q, y⟩, x) = ⟨JqK(x), y⟩. Define
◇ and P ad such that for q, y ∈ D, J◇K(q, y) = Js11 K(t, ⟨q, y⟩), and

JP adK(q, y) = J○K(π1 , J◇K(q, y)),

where π1 is the computable projection program. We now have a program different
from q, and

JJP adK(q, y)K(x) = JJ○K(π1 , J◇K(q, y))K(x)


= Jπ1 K(JJ◇K(q, y)K(x))
= Jπ1 K(JJs11 K(t, ⟨q, y⟩)K(x))
= Jπ1 K(JtK(⟨q, y⟩, x))
= Jπ1 K(⟨JqK(x), y⟩)
= JqK(x). ◻

P ad in WHILE+ . We follow the steps of the above proof to construct a P ad program
in WHILE+ , starting with the composition program ○.

   
// ○aux : three-argument composition evaluator
read p, q, x;
qx := exec(q, x);
out := exec(p, qx);
write out;

// ○ : given p and q, returns a program computing JpK ∘ JqK
read p, q;
comp := spec(○aux , p, q);
write comp;
 

More auxiliary program definitions:

   
// t : JtK(⟨q, y⟩, x) = ⟨JqK(x), y⟩
read d, x;
q := hd d; y := tl d;
e := exec(q, x);
out := cons e y;
write out;

// π1 : first projection
read x;
p := hd x;
write p;
 
 
// ◇ : J◇K(q, y) = Js11 K(t, ⟨q, y⟩)
read q, y;
pair := cons q y;
out := spec(t, pair);
write out;
 
Finally, we obtain the P ad program. P ad =

 
// Pad =
read q, y;
diam := exec(◇, q, y);
out := exec(○, π1 , diam);
write out;
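As a usage sketch (hypothetical keys; any two distinct elements of D work), P ad lets us mint syntactically distinct but semantically identical copies of a program q:

read q;
key1 := nil;
key2 := cons nil nil;
m1 := exec(Pad, q, key1);
m2 := exec(Pad, q, key2);
// Jm1 K = Jm2 K = JqK, yet m1 , m2 and q are pairwise distinct programs
out := cons m1 m2;
write out;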
 

4.4.1 Evolving Blueprint Virus


Definition 4.7 (Evolving Blueprint Virus) Let q be a program. An evolving blueprint
virus with specification q is a program ev such that for all i, p, x ∈ D:

Jev K(i) is a virus

JJev K(i)K(p, x) = JqK(ev , i, p, x)

Analogously to Proposition 4.3.1, the existence of an evolving blueprint virus for


any given q is due to a recursion theorem. In this case, it follows from the Explicit
Recursion Theorem 2.2.3.
Proposition 4.4.2 (Existence of Evolving Blueprint Viruses)
Let q be a program. There exists an evolving blueprint virus ev with specification q.
Proof. This is a direct result of Theorem 2.2.3. Given q, there exists an explicit fixed
point ev for q such that for all i, z ∈ D:

JJev K(i)K(z) = JqK(ev , i, z)

In particular, for p, x ∈ D:

JJev K(i)K(⟨p, x⟩) = JqK(ev , i, ⟨p, x⟩) = JqK(ev , i, p, x). ◻

Again, just as with monomorphic blueprint viruses, we want to computably obtain evolv-
ing blueprint viruses from a given specification q.

Definition 4.8 (Evolving Blueprint Virus Distribution Engine) An evolving blueprint


virus distribution engine is a program cv such that for every program q, Jcv K(q) is an
evolving blueprint virus with specification q.
The existence of Evolving Blueprint Virus Distribution Engines comes from being
able to compute explicit fixed points, as in Theorem 2.2.4.

Theorem 4.4.3 (Existence of evolving blueprint virus distribution engines)


There exists an evolving blueprint virus distribution engine cv .
Proof. Take cv = xrt as in Theorem 2.2.4: for every q, ev = Jcv K(q) satisfies
JJev K(i)K(z) = JqK(ev , i, z) for all i, z ∈ D, which for z = ⟨p, x⟩ is precisely the
evolving blueprint condition. ◻

Example 4.4.1 (Mutating LoveLetter) Let’s revisit the LoveLetter virus from Exam-
ple 4.3.1. This is a virus that overwrites all files in the environment, and sends copies
of itself to all the contacts. In this revisited form, the code sent to all the contacts
is not an exact copy of the virus, but a mutated code. Recall the P ad program from
Lemma 4.4.1, where JJP adK(q, y)K = JqK for all q, y ∈ D. This will work as our “mutator”.
EvolvingLoveLetterSpecificationProgram:=
 
read ev , i, mb, x;
d := hd x; @bk := tl x;
newFiles := nil;          // Build the new file structure.
virus := exec(ev , i);    // The actual virus.
while (d) do
  d := tl d;              /* We do not care about the original file, we
                             just need to overwrite every one. */
  newFiles := cons virus newFiles;
end;
// Create the new version of the virus.
nextKey := cons nil i;
newVirus := exec(Pad, virus, nextKey);
// Loop to modify the mailbox.
while (@bk) do
  @ := hd @bk;
  newMail := cons @ newVirus;
  mb := cons newMail mb;
  @bk := tl @bk;
end;
/* The new environment is made up of the modified outbox and
   the overwritten files. */
newEnvironment := cons mb newFiles;
write newEnvironment;
 
Using an evolving blueprint virus distribution engine cv , we can transform this
program into the code of the corresponding evolving blueprint virus.

4.4.2 Evolving Smith Virus


Definition 4.9 (Evolving Smith Virus) Let q be a program. An evolving smith virus
specified by q is a pair ev , eB such that for all i, p, x ∈ D:

Jev K(i), JeB K(i) is a smith virus


JJev K(i)K(p, x) = JqK(eB , ev , i, p, x)

The existence, given a program q, of an evolving smith virus specified by q, is


given by the Nested Explicit Recursion Theorem 2.2.5.

Theorem 4.4.4 (Existence of evolving smith virus)


Given a program q, there exists an evolving smith virus ev , eB specified by q.

Proof. Define a program t such that for all y, z, i, p, x ∈ D, JtK(z, i, ⟨y, p⟩, x) = JqK(z, y, i, p, x).
By Theorem 2.2.5 there exists a nested explicit fixed point eB for t, so that

JJJeB K(i)K(y, p)K(x) = JtK(eB , i, ⟨y, p⟩, x) = JqK(eB , y, i, p, x).

Now, defining a program r such that for all y, i, p, x ∈ D, JrK(y, i, p, x) = JqK(eB , y, i, p, x),
and applying the Explicit Recursion Theorem (2.2.3), we find a fixed point ev satisfying

JJev K(i)K(p, x) = JqK(eB , ev , i, p, x).

From these two equations we conclude that ev , eB is an evolving smith virus specified
by q. ◻
As with the previous cases, we want to find evolving smith viruses computably
from their specification functions.

Definition 4.10 (Evolving smith virus distribution engine) An evolving smith virus
distribution engine is a pair cB , cv such that for all programs q, JcB K(q), Jcv K(q) is
an evolving smith virus specified by q.
Theorem 4.4.5 (Existence of evolving smith virus distribution engines)
There exists an evolving smith virus distribution engine.
Proof. As is routine by this point, we use the fact that we can compute nested
and regular explicit fixed points. Let T be a program that for all y, z, i, p, x ∈ D,
JT K(z, i, ⟨y, p⟩, x) = ⟨z, y, i, p, x⟩. Let r be a program that for all q ∈ D, JrK(q) =
J○K(q, T ). Define cB so that for all q:

JcB K(q) = JnestK(JrK(q)).

eB = JcB K(q) is a nested explicit fixed point for program JrK(q), so for all y, i, p, x ∈ D:

JJrK(q)K(eB , i, ⟨y, p⟩, x) = JqK(eB , y, i, p, x) = JJJeB K(i)K(y, p)K(x).

Now for all q define

Jcv K(q) = JxrtK(Js11 K(q, JcB K(q))),

and if, for some q ∈ D, ev = Jcv K(q), we have that for all i, p, x ∈ D:

JJev K(i)K(p, x) = JJJxrtK(Js11 K(q, JcB K(q)))K(i)K(p, x)
= JJs11 K(q, JcB K(q))K(ev , i, p, x)
= JqK(JcB K(q), ev , i, p, x)
= JqK(eB , ev , i, p, x),

and so JcB K(q), Jcv K(q) is an evolving smith virus specified by q, for all q ∈ D. ◻

Example 4.4.2 (Revisiting the Parasitic Virus) Recall the original Parasitic Virus
from Example 4.3.3, given by the equations

JvK(p, q, x) = JJBK(v, p)K(q, x)

JvK(p, q, x) = JpK(JBK(v, q), x).

We modify it so that the infection of a new host q is done with a virus of the “next
generation”. Fix some element 1 ∈ D, and use the notation i + 1 = ⟨i, 1⟩ for all i ∈ D.
i + 1 gives us a way of changing generations. Use the notation vi = Jev K(i) and
Bi = JeB K(i). A mutating parasitic virus is defined by the equations:

Jvi K(p, q, x) = JJBi K(vi , p)K(q, x)

Jvi K(p, q, x) = JpK(JBi K(JP adK(vi+1 , i), q), x).

In WHILE+ , EvolvingParasiticVirusSpecification=
 
read eB , ev , i, p, d;
// d is really a tuple ⟨q, x⟩
q := hd d; x := tl d;
nextIndex := cons i 1;            // i + 1 = ⟨i, 1⟩
Bi := exec(eB , i);               // this generation’s propagation program
vNext := exec(ev , nextIndex);    // the next-generation virus
mutation := exec(Pad, vNext, i);
infected_form := exec(Bi , mutation, q);
newEnvironment := exec(p, infected_form, x);
write newEnvironment;
 

4.5 Adleman’s Viruses


We provide a characterization of Adleman viruses encompassing ideas from [4] and [5].

Definition 4.11 (Adleman Virus) A program A is an Adleman virus if for each x ∈ D


one of the following holds:

1. Injure: For all p, q ∈ D:

JJAK(p)K(x) = JJAK(q)K(x).

2. Imitate: For all p ∈ D:

JJAK(p)K(x) = JpK(x)

3. Infect: For all p ∈ D:

JJAK(p)K(x) = JλA K(JpK(x)),

where λA is a program such that for all n ∈ N and x1 , ..., xn ∈ D,
JλA K(x1 , ..., xn ) = ⟨λ(x1 ), ..., λ(xn )⟩, where for each i, either λ(xi ) = xi
or λ(xi ) = JAK(xi ).

So running an infected program on an environment first emulates the execution
of the original program over the environment, and then each element of the
resulting environment is either infected or left untouched.
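As a minimal sanity check of the definition (a sketch, not taken from [2]): the one-line program below returns the infected program unchanged, so JAK(p) = p and the Imitate condition JJAK(p)K(x) = JpK(x) holds for every x ∈ D, making A a (harmless) Adleman virus.

// A: the trivial, imitate-only Adleman virus
read p;
write p;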

Adleman’s definition describes the behavior of a virus A by the behavior of its


infected programs JAK(p). In this sense, Bonfante’s definition would seem more
general, as its infected programs are given by a propagation function JBK(v, p). This
is formalized in the following Proposition.

Proposition 4.5.1 (Adleman’s Viruses are viral propagation programs)


Let A be an Adleman virus. There exists a virus v (in the sense of Definition 4.1)
such that for all p, x ∈ D

JvK(p, x) = JJAK(p)K(x)

In fact, we can easily prove something a bit stronger: the ability to pass from
Adleman’s viruses to Bonfante’s viruses computably.

Proposition 4.5.2 (An Adleman virus translator)


There exists a program T r such that for all Adleman virus A, JT rK(A) is a virus such
that for all p, x ∈ D:

JJT rK(A)K(p, x) = JJAK(p)K(x),

i.e. JT rK(A) is a virus wrt the propagation program JBK(v, p) = JAK(p).


Proof. Let T be a program that for all a, p, x ∈ D, JT K(a, p, x) = JJaK(p)K(x). Define
for a ∈ D:

JT rK(a) = Js11 K(T, a),

and so we have that for all A, p, x ∈ D

JJT rK(A)K(p, x) = JJAK(p)K(x),

so JT rK(A) is a virus wrt the propagation program JBK(v, p) = JAK(p). ◻

Proposition 4.5.3 ( [5] Not every virus propagates like an Adleman virus)
There exists a virus v (in the sense of Definition 4.1) such that there is no Adleman
virus A ∈ D with JvK(p, x) = JJAK(p)K(x) for all p, x ∈ D.

Proof. Let θ ∈ D be a program such that for all x, y, z ∈ D, JθK(x, y, z) = ⟨y, x, z⟩. Define
a virus v such that for all p, q, r, d ∈ D,

JvK(p, q, r, d) = JθK(JpK(q, r, d)).

So v is a virus (wrt Js11 K) that disorganizes the output of the programs it infects.
We show by contradiction that v does not propagate like an Adleman virus: suppose
there is an Adleman virus A such that JvK(p, x) = JJAK(p)K(x). Let id be the identity
program, JidK(d) = d, and inc a program such that for q, r, d ∈ D, JincK(q, r, d) = ⟨q, r, d + 1⟩
(d + 1 as in Example 4.4.2). Let q, r, d ∈ D with r ≠ q. Since
JJAK(id)K(q, r, d) = ⟨r, q, d⟩ ≠ JidK(q, r, d), the Imitate condition does not hold.
Neither does the Injure condition, as JJAK(id)K(q, r, d) = ⟨r, q, d⟩ ≠ ⟨r, q, d + 1⟩ =
JJAK(inc)K(q, r, d). So the Infect condition must hold, and so

⟨r, q, d⟩ = JJAK(id)K(q, r, d) = JλA K(q, r, d) = ⟨λ(q), λ(r), λ(d)⟩.

Since r ≠ q, we get λ(q) = r ≠ q, so it must be that λ(q) = JAK(q), i.e. r = JAK(q). This
should happen for all r ≠ q, which is a contradiction: taking a third element t different
from both r and q, we would also obtain t = JAK(q) = r. ◻

The virus presented above is called the Wagger Virus by Bonfante et al.
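For concreteness, a WHILE+ sketch of θ, assuming triples ⟨x, y, z⟩ are represented as right-nested pairs cons x (cons y z):

// θ: swaps the first two components of a triple
read w;
x := hd w;
rest := tl w;
y := hd rest;
z := tl rest;
xz := cons x z;
out := cons y xz;
write out;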

Closing Comments
We have studied a classification of viruses by Bonfante et al. [6], seeing in the pro-
cess how their approach to formal computer virology relates closely to the study of
recursion theorems: they play a big part in the construction of different kinds of
viruses, and at the same time help us capture key viral properties such as self-
replication and mutation. Through examples like the Wagger Virus we see how this
definition encompasses viral cases not captured by previous models, most notably
Adleman’s [2], which remains to this day a benchmark in the formalization of com-
puter viruses. The generality of the definition comes at a cost, as any computable
function is a virus with respect to the propagation function Js11 K. This generality
poses a problem for the detection of viruses as well, as shown in the last section
of [5], where, for example, the following result is presented:

Theorem 4.5.4 (Π2 -complete propagation functions)


Given a propagation function B, let VB be the set of viruses wrt B. There exists a
propagation function B such that VB is neither recursive nor recursively enumerable.
However, the authors later propose that it may be worthwhile to continue the study
of this virus formalization through the classification of viruses and the restriction of
their propagation functions.

Challenges to solve. The problem of giving a formal definition of a computer virus
is still open. We chose to review and present Bonfante’s approach, as it seems to be
the most widely accepted among modern authors, and it allowed us to explore the
applications of the different recursion theorems. However, we find three issues with
Bonfante et al.’s work that are still to be solved:

1. Restriction: Computer viruses are of a practical (rather than theoretical) nature,
and even in practical terms there is no consensus on an exact definition of a
computer virus. So giving a precise formal definition of a virus is, at present,
simply impossible. We do know, however, some things that computer viruses are
not. In particular, it can be accepted that not every program is a computer
virus, and in this apparently basic respect, Bonfante’s definition fails. This
does not mean that it should be completely discarded, but rather worked upon
and restricted: not every propagation function is, nor should be, viral. Although
a classification has been attempted, as presented in this document, it is based
on the construction of the viruses (a theoretical approach) instead of the action
of the viruses, which is what is of practical interest.

2. Justification: Bonfante’s definition attempts to give a formal model of a practical
concept. As such, more work should be done in showing that such a definition can
successfully model practical scenarios. The examples of “real-life” viruses
presented are scarce, and so the question of whether Bonfante’s model is appro-
priate remains unanswered. It would be reasonable to provide more examples
of popular computer viruses, modeled by the equation
JvK(p, x) = JB(v, p)K(x). So far, very few are presented, so even if we had con-
crete results about Bonfante’s viruses, we would not know if they are relevant
to the practical study of computer virology. That leads us to the third point:

3. Results: To date, there seem to be only vague results regarding Bon-
fante’s viruses. There are some decidability theorems in [5] which, although
interesting and by no means trivial, provide little to no contribution in practical
terms, as they are centered around the recursiveness or non-recursiveness of
sets of viruses with respect to very particular propagation functions that are
not viral in the practical sense.

We think that these three problems have a common root: the detachment of Bonfante’s
model from the practical side of computer viruses. We believe more focus should
be put on connecting the formal virus model with the real-world notion, in hopes
of getting more concrete, relevant results on the subject, or, at worst, to conclude
that this formal model is not appropriate, learn from the experience and continue
the search for a suitable formal definition of computer viruses.
Chapter 5

Conclusion

We have made a thorough study of the classic theory leading up to Kleene’s Second
Recursion Theorem (1.3.2). We have seen how the simple statement

∀p ∃p′ ∀x ∶ JpK(p′ , x) = Jp′ K(x)

has a wide array of applications and implications. In Section 2.2 we showed some
seemingly pathological implications of the SRT, such as the existence of self-writing
programs. In Chapter 3, we studied the implementation of the SRT in two completely
different computation models, following Jones’ work [11], including the very limited
TINY language, where many of those “pathological” results still hold. In Chapter 2 we
worked our way to the definition of acceptable programming languages, for which we
showed many results regarding fixed points of computable functions and the ability
to compute such fixed points given any program. These recursion theorems proved
to be of great relevance in Chapter 4, where we studied a formalization of computer
viruses developed by Bonfante et al. [4], and proved interesting results regarding the
construction of viruses by the application of recursion theorems. There is a broad
range of applications based on Kleene’s Second Recursion Theorem, both theoretical
and practical, as Moschovakis shows in [13]. Indeed, as the aforementioned author
states, Kleene’s Second Recursion Theorem is amazing.

Bibliography

[1] Backus–Naur form. Available at https://en.wikipedia.org/wiki/Backus%E2%80%93Naur_Form.

[2] Leonard M. Adleman. An abstract theory of computer viruses. In Advances in
Cryptology — CRYPTO ’88. Springer, 1988.

[3] Guillaume Bonfante, Mohamed El-Aqqad, Benjamin Greenbaum, and Mathieu


Hoyrup. Immune systems in computer virology. In Evolving Computability, pages
127–136. Springer, 2015.

[4] Guillaume Bonfante, Matthieu Kaczmarek, and Jean-Yves Marion. Toward an abstract
computer virology. In Theoretical Aspects of Computing – ICTAC 2005, pages
579–593. Springer, 2005.

[5] Guillaume Bonfante, Matthieu Kaczmarek, and Jean-Yves Marion. On abstract com-
puter virology from a recursion theoretic perspective. Journal in Computer Virol-
ogy, 1(3–4):45–54, 2006.

[6] Guillaume Bonfante, Matthieu Kaczmarek, and Jean-Yves Marion. A classifi-


cation of viruses through recursion theorems. In Computation and Logic in the
Real World, pages 73–82. Springer, 2007.

[7] Xavier Caicedo. Elementos de lógica y calculabilidad. Universidad de San


Buenaventura de Medellin (USB), 1990.

[8] S Barry Cooper. Computability theory. CRC Press, 2003.

[9] Jean H. Gallier. Lecture notes in elementary recursive function theory, 2010.

[10] Neil D Jones. Computability and complexity from a programming perspective. In


Proof and System-Reliability, pages 79–135. Springer, 2002.

[11] Neil D Jones. A swiss pocket knife for computability. arXiv preprint
arXiv:1309.5128, 2013.

[12] Yves Marcoux. Composition is almost (but not quite) as good as s-1-1. Theo-
retical computer science, 120(2):169–195, 1993.


[13] Yiannis N Moschovakis. Kleene’s amazing second recursion theorem. The Bul-
letin of Symbolic Logic, 16(2):189–239, 2010.

[14] Lawrence S Moss. Recursion theorems and self-replication via text register
machine programs. Bulletin of the EATCS, 89:171–182, 2006.

[15] Hartley Rogers. Theory of recursive functions and effective computability, vol-
ume 5. McGraw-Hill New York, 1967.

[16] Michael Sipser. Introduction to the Theory of Computation, volume 2. Thomson


Course Technology Boston, 2006.

[17] Raymond M. Smullyan. Recursion Theory for Metamathematics. Oxford University Press, 1993.
