
Polymorphic Type Inference

(Polymorf Type Inferens)


Rasmusk Kock Thygesen, 201909745
Timur Bas, 201906748

Bachelor Report (15 ECTS) in Computer Science


Advisor: Lars Birkedal
Department of Computer Science, Aarhus University
June 2022

AU AARHUS
UNIVERSITY
DEPARTMENT OF COMPUTER SCIENCE
Abstract

This thesis contributes to the understanding of how the Hindley-Milner type system works, based
on a language that is an extended version of the lambda calculus with let-polymorphism. The
proofs of the consistency between the dynamic semantics and static semantics of the language
are given for the new expressions introduced in the language. In particular, the new expressions
are integers, booleans, strings, tuples, the fst operation, and the snd operation.
Furthermore, two type inference algorithms are discussed, namely W and Wopt. The latter
is an optimized version of the former. Soundness proofs are given for W. Implementations
for both algorithms are given in OCaml.
Finally, the full implementation of both algorithms can be found on our GitHub page here.

Rasmusk Kock Thygesen and Timur Bas,


Aarhus, June 2022.

ii
Contents

Abstract ii

1 Introduction 1

2 Language Definition & Semantics 2


2.1 The Grammar of the Language . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2
2.2 Semantics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3
2.2.1 Dynamic Semantics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3
2.2.2 Static Semantics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4
2.2.2.1 Types . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
2.2.2.2 Type Environment . . . . . . . . . . . . . . . . . . . . . . . . . . 6
2.2.2.3 The Tyvar Map . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
2.2.2.4 Substitutions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
2.2.2.5 Instantiation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
2.2.2.6 Generalization . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
2.2.2.7 Derivation Examples . . . . . . . . . . . . . . . . . . . . . . . . . 10
2.3 Consistency Proof . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10

3 Type Inference - Algorithm W 13


3.1 The Algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
3.1.1 Unification . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
3.1.2 Pseudocode . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14
3.2 Soundness Proof . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15
3.3 Implementation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20
3.3.1 Code . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20

4 Optimizing Algorithm W 25
4.1 The Algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25
4.1.1 Union-Find Data Structure . . . . . . . . . . . . . . . . . . . . . . . . . . 25
4.1.2 Unification . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25
4.1.3 Pseudocode . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26
4.2 Implementation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27

5 Conclusion 30

Bibliography 31

Appendices 32

iii
A Operations 33
A.1 Tyvars Table . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33
A.2 Substitution Table . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33
A.3 Unification Table for W . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33
A.4 Unification Table for Wopt . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34

B Generalization 35
B.1 Full Derivation of Example 12 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35

C Type Inference Examples 36


C.1 Monomorphic . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36
C.2 Polymorphic . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37

D Additional Proofs 44
D.1 Consistency Proof for the snd(e1 , e2 ) . . . . . . . . . . . . . . . . . . . . . . . . . 44
D.2 Soundness Proof for snd(e1 , e2 ) . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45

E Practicalities of W and Wopt 46


E.1 Examples Used For Test Correctness of W and Wopt . . . . . . . . . . . . . . . . 46
E.2 Pretty Printer for W . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 46
E.3 Pretty Printer for Wopt . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 47

iv
Chapter 1

Introduction

In many programming languages, the programmer has to make sure that the types of the variables
are consistent in order to avoid type errors. One way to make it easier to work with types is to
infer them automatically, so that the programmer can omit explicit type annotations. A compiler
will then be able to deduce which variables have which types, and an editor might even show the
programmer the types of different variables. This can give a better overview of the code, and the
programmer can receive on-the-fly error messages that mention the expected type, which can make
it easier to figure out how to fix errors. One problem with type inference is that it becomes harder
to infer types in more expressive languages. The Hindley-Milner type system is based on the
lambda calculus and is expressive enough to support polymorphic functions in the form of
let-polymorphism, while still being simple enough to admit a practical type inference algorithm
that can give the programmer hints in real time while writing code.
In Chapter 2 of this thesis, we will introduce the Hindley-Milner type system by defining the
grammar of the language and present both the dynamic and static semantics. We will go
over a few concepts that will help us build the algorithm such as substitutions, instantiation,
generalization, and unification. After this, a proof of the consistency between the dynamic
semantics and the static semantics will be presented. We will provide both pseudocode and
a full working implementation of the type inference algorithm known as algorithm W, as well
as a proof of the soundness of the algorithm, in Chapter 3. In Chapter 4 we will then look into
optimizing algorithm W by using a Union-Find data structure instead of substitutions, for which
we have also provided both pseudocode and a full working implementation. In the appendices of
this thesis, we have summaries of some operations, examples of type inference, more detailed
examples, additional proofs, and some extra practical code such as tests and pretty printers for
both algorithms.

1
Chapter 2

Language Definition & Semantics

In order to reason about type systems and type inference, we will have to define a language
and its semantics. The language we will consider in this thesis is a relatively small functional
language. In this chapter, we will formally define the language by defining its grammar and
semantics. Lastly, we will prove consistency between the dynamic and the static semantics.

2.1 The Grammar of the Language


The grammar of the language defines exactly which expressions are allowed in the language.
Before presenting the grammar, we have to introduce four sets. Let Var be the set of program
variables and the rest is self-explanatory. Thus, we have

x ∈ Var = {a, b, . . . , x, y, . . . }
i ∈ Z
b ∈ Boolean = {true, false}
s ∈ String = {"myVariable", "myVariable2", . . .}

The grammar of the language Exp, ranged over by e, is defined using a context-free grammar
whose production rules are given in Backus-Naur form:
⟨e⟩ ::= i Integer
| b Boolean
| s String
| x Variable
| λx.e1 Lambda Abstraction
| e1 e2 Application
| let x = e1 in e2 Let
| (e1 , e2 ) Tuple
| fst (e1 , e2 ) First Tuple
| snd (e1 , e2 ) Second Tuple
Note that this grammar is an extended version of the lambda calculus with let-polymorphism.
In particular, we have extended the grammar with the first three production rules and the last
three production rules. Parentheses are used to disambiguate expressions.

2
2.2 Semantics
In this section, we present the semantics of the language. The semantics of this language can be
separated into two groups, namely the dynamic semantics and the static semantics. The former
deals with the evaluation of programs while the latter deals with type checking programs.

2.2.1 Dynamic Semantics


In order to present the inference rules we have to introduce the semantic objects that the rules
in this section will be built upon.

Figure 2.1: Semantic Objects

bv ∈ BasVal = Z ∪ Boolean ∪ String


v ∈ Val = BasVal + Clos + Val × Val
[x, e, E] ∈ Clos = Var × Exp × Env
E ∈ Env = Var → Val
r ∈ Results = Val + {wrong}

Figure 2.1 presents the semantic objects being used in the inference rules. Note that the result
wrong is not a value but merely an indicator of a nonsensical evaluation, for instance, trying to
do the fst operation on a lambda abstraction.
Each inference rule has the general form

  P1   · · ·   Pn
  ─────────────────    (n ≥ 0)
         C

where the premises P1, . . . , Pn together allow us to infer the conclusion C. Each
premise is either a sequent or a side condition written using standard mathematical notation.
The inference rules for the dynamic semantics are the following

D-Lookup
  x ∈ Dom E
  ─────────────────
  E ⊢ x → E(x)

D-Lookup-Wrong
  x ∉ Dom E
  ─────────────────
  E ⊢ x → wrong

D-Lambda
  ─────────────────────
  E ⊢ λx.e → [x, e, E]

D-App
  E ⊢ e1 → [x0, e0, E0]    E ⊢ e2 → v0    E0 ± {x0 ↦ v0} ⊢ e0 → r
  ────────────────────────────────────────────────────────────────
  E ⊢ e1 e2 → r

D-App-Wrong1
  E ⊢ e1 → [x0, e0, E0]    E ⊢ e2 → wrong
  ─────────────────────────────────────────
  E ⊢ e1 e2 → wrong

D-App-Wrong2
  E ⊢ e1 → w    w ∈ (Val \ Clos) ∪ {wrong}
  ─────────────────────────────────────────
  E ⊢ e1 e2 → wrong

D-Let
  E ⊢ e1 → v1    E ± {x ↦ v1} ⊢ e2 → r
  ───────────────────────────────────────
  E ⊢ let x = e1 in e2 → r

D-Let-Wrong
  E ⊢ e1 → wrong
  ───────────────────────────────
  E ⊢ let x = e1 in e2 → wrong

D-Tuple
  E ⊢ e1 → v1    E ⊢ e2 → v2
  ─────────────────────────────
  E ⊢ (e1, e2) → (v1, v2)

D-Tuple-Wrong1
  E ⊢ e1 → wrong
  ─────────────────────────
  E ⊢ (e1, e2) → wrong

D-Tuple-Wrong2
  E ⊢ e1 → v1    E ⊢ e2 → wrong
  ───────────────────────────────
  E ⊢ (e1, e2) → wrong

D-Fst
  E ⊢ e1 → v1    E ⊢ e2 → v2
  ─────────────────────────────
  E ⊢ fst(e1, e2) → v1

D-Fst-Wrong1
  E ⊢ e1 → wrong
  ─────────────────────────
  E ⊢ fst(e1, e2) → wrong

D-Fst-Wrong2
  E ⊢ e1 → v1    E ⊢ e2 → wrong
  ───────────────────────────────
  E ⊢ fst(e1, e2) → wrong

D-Snd
  E ⊢ e1 → v1    E ⊢ e2 → v2
  ─────────────────────────────
  E ⊢ snd(e1, e2) → v2

D-Snd-Wrong1
  E ⊢ e1 → wrong
  ─────────────────────────
  E ⊢ snd(e1, e2) → wrong

D-Snd-Wrong2
  E ⊢ e1 → v1    E ⊢ e2 → wrong
  ───────────────────────────────
  E ⊢ snd(e1, e2) → wrong

D-Fst
  E ⊢ (e1, e2) → (v1, v2)
  ─────────────────────────
  E ⊢ fst(e1, e2) → v1

D-Snd
  E ⊢ (e1, e2) → (v1, v2)
  ─────────────────────────
  E ⊢ snd(e1, e2) → v2

D-Int
  ────────────
  E ⊢ i → i

D-Bool
  ────────────
  E ⊢ b → b

D-String
  ────────────
  E ⊢ s → s

The notation E ⊢ e → r is a ternary relation between elements of Env, Exp, and Results,
respectively. One can read it as: under the environment E, evaluating e results in r.
The domain and range of any map f are denoted Dom f and Range f respectively. For any
finite or infinite maps f, g we define f ± g to be the map f modified by g. The resulting
map has domain Dom(f ± g) = Dom f ∪ Dom g and range Range(f ± g) =
Range g ∪ {f(x) : x ∈ Dom f ∧ x ∉ Dom g}. We use ± because the + suggests that the
domain can grow beyond that of f, and the − because if f and g have the same element in their
domains then the value from g is used, so a value from f may disappear.

Example 1 (The ± operator). Let f = {x 7→ α, y 7→ β} and g = {y 7→ int, z 7→ γ} then


f ± g = {x 7→ α, y 7→ int, z 7→ γ}.
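To make the evaluation rules concrete, the following is a small, self-contained OCaml sketch of our own (it is not part of the thesis's implementation) covering the D-Int, D-Lookup, D-Lambda, D-App and D-Let rules. Extending the association list plays the role of the ± operator on environments, and the exception stands in for the result wrong.

(* A minimal evaluator sketch (ours, not the thesis's code) for a fragment
   of the dynamic semantics. *)
type exp =
  | Int of int
  | Var of string
  | Lambda of string * exp
  | App of exp * exp
  | Let of string * exp * exp

type value =
  | VInt of int
  | VClos of string * exp * env        (* the closure [x, e, E] *)
and env = (string * value) list        (* E : Var -> Val *)

exception Wrong                        (* stands for the result `wrong` *)

let rec eval (env : env) (e : exp) : value =
  match e with
  | Int i -> VInt i                                     (* D-Int *)
  | Var x ->                                            (* D-Lookup / D-Lookup-Wrong *)
      (try List.assoc x env with Not_found -> raise Wrong)
  | Lambda (x, body) -> VClos (x, body, env)            (* D-Lambda *)
  | App (e1, e2) ->                                     (* D-App / D-App-Wrong2 *)
      (match eval env e1 with
       | VClos (x, body, env0) ->
           let v = eval env e2 in
           eval ((x, v) :: env0) body                   (* E0 ± {x ↦ v} *)
       | _ -> raise Wrong)
  | Let (x, e1, e2) ->                                  (* D-Let *)
      let v1 = eval env e1 in
      eval ((x, v1) :: env) e2                          (* E ± {x ↦ v1} *)

let _ = eval [] (Let ("id", Lambda ("x", Var "x"), App (Var "id", Int 1)))
(* evaluates to VInt 1 *)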

2.2.2 Static Semantics


The basic Hindley-Milner type system is defined by the following set of inference rules

S-Lookup
  x ∈ Dom Γ    Γ(x) > τ
  ────────────────────────
  Γ ⊢ x : τ

S-Lambda
  Γ ± {x ↦ τ′} ⊢ e1 : τ
  ────────────────────────
  Γ ⊢ λx.e1 : τ′ → τ

S-App
  Γ ⊢ e1 : τ′ → τ    Γ ⊢ e2 : τ′
  ─────────────────────────────────
  Γ ⊢ e1 e2 : τ

S-Let
  Γ ⊢ e1 : τ1    Γ ± {x ↦ ClosΓ τ1} ⊢ e2 : τ
  ──────────────────────────────────────────────
  Γ ⊢ let x = e1 in e2 : τ

4
However, due to the extension of the grammar, we have to define one rule per new expression

S-Tuple
  Γ ⊢ e1 : τ1    Γ ⊢ e2 : τ2
  ─────────────────────────────
  Γ ⊢ (e1, e2) : τ1 × τ2

S-Fst
  Γ ⊢ (e1, e2) : τ1 × τ2
  ─────────────────────────
  Γ ⊢ fst (e1, e2) : τ1

S-Snd
  Γ ⊢ (e1, e2) : τ1 × τ2
  ─────────────────────────
  Γ ⊢ snd (e1, e2) : τ2

S-Int
  ──────────────
  Γ ⊢ i : int

S-Bool
  ───────────────
  Γ ⊢ b : bool

S-String
  ─────────────────
  Γ ⊢ s : string

Now that the type system is well-defined we can pick parts of the rules (e.g. the ClosΓ τ1 ) in
order to explain their meanings.

2.2.2.1 Types
Let TyCon be a finite set of nullary type constructors also known as the basic types. A nullary
type constructor simply means that the type itself takes zero types as parameter. Furthermore,
let TyVar be an infinite set of type variables.

π ∈ TyCon = {int, bool, string}


α ∈ TyVar = {β, γ, δ, . . .}

A type τ and a type scheme σ are defined as follows

τ ::= π | α | τ1 → τ2 | τ1 × τ2
σ ::= τ | ∀α.σ1

We denote the set of τ ’s and σ’s as Type and TypeScheme respectively.


A type τ can represent a nullary type constructor, a type variable, an abstraction, or a product.
Note that for the case of abstraction and product the types τ1 , τ2 that occur in them are defined
recursively with respect to τ .

Example 2 (Type). τ = (β → bool) × (string → γ)


Moreover, a type scheme σ is either a type τ or a type scheme that is quantified with type
variables. Generally we write ∀α1 . . . ∀αn.τ, which we abbreviate as ∀α1, . . . , αn.τ.
If the set of quantified type variables is empty we simply write τ; implicitly any type τ is
quantified, written ∀.τ, but we will usually not write this.

Example 3 (Type schemes). Let σ1, σ2 be type schemes with σ1 = ∀α.α → α and σ2 = int → int.
Additionally, there is the concept of bound and free type variables in σ. If σ = ∀α1, . . . , αn.τ
then all {α1 , . . . , αn } are said to be bound in σ. Contrarily, a type variable α is free if it is not
bound and occurs in τ .

Example 4 (Free and bound type variables). Let σ = ∀β.(β → bool) × (string → γ); then γ
is a free type variable in σ and β is a bound type variable in σ.

5
2.2.2.2 Type Environment
A type environment Γ contains information about which program variables have which type
schemes. Thus, it is a finite map from program variables to type schemes i.e. Γ : Var →
TypeScheme. Lastly, a judgement of the form Γ ⊢ e : τ is read as follows — Under the type
environment Γ the expression e is well-typed with τ .

Example 5 (Type environment). Γ = {x 7→ ∀α, β.α → β}

2.2.2.3 The Tyvar Map


The type variables map, tyvars, is a map from a type to the set of type variables occurring in the
given type. In the following we will show the operations we can do with the tyvars map.
tyvars(τ)
By case distinction on τ we get four cases:

τ = π
    The result of the operation is the empty set ∅.
τ = α
    The result of the operation is the singleton set {α}.
τ = τ1 → τ2
τ = τ1 × τ2
    The result of these two operations is defined as tyvars(τ1) ∪ tyvars(τ2). See Appendix
    A for a summary of the tyvars operations on types.

tyvars(σ)
If σ = ∀α1 . . . αn.τ then we have that

  tyvars(σ) = tyvars(∀α1 . . . αn.τ) = tyvars(τ) \ {α1, . . . , αn}

which is the set of type variables that occur in τ and are free in σ.

tyvars(Γ)
This operation is defined as

  tyvars(Γ) = ⋃ { tyvars(σ) : σ ∈ Range Γ }

Example 6 (The tyvars map applied on a type environment). Let Γ = {x 7→ ∀α, γ.(α × β) × γ}
then

tyvars(Γ) = tyvars(∀α, γ.(α × β) × γ)


= tyvars((α × β) × γ) \ {α, γ}
= (tyvars(α × β) ∪ tyvars(γ)) \ {α, γ}
= (tyvars(α) ∪ tyvars(β) ∪ tyvars(γ)) \ {α, γ}
= {α, β, γ} \ {α, γ}
= {β}

6
Furthermore, whenever we write fresh type variables we mean type variables that do not exist
in the tyvars(Γ) set nor in any of the bound variables in the range of Γ. Finally, a type τ is a
monotype µ ∈ Type if tyvars(τ ) = ∅.
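The tyvars map on types transcribes almost directly into OCaml. The following sketch is our own rendering, written against the typ and SS definitions from types.ml in Section 3.3.1; it mirrors the find_tyvars and find_free_tyvars functions of the actual implementation.

(* Sketch (ours): the tyvars map, assuming the typ and SS definitions
   from types.ml in Section 3.3.1. *)
open Types

let rec tyvars (tau : typ) : SS.t =
  match tau with
  | TyCon _ -> SS.empty                                    (* tyvars(π) = ∅ *)
  | TyVar alpha -> SS.singleton alpha                      (* tyvars(α) = {α} *)
  | TyFunApp { t1; t2 } -> SS.union (tyvars t1) (tyvars t2)
  | TyTuple { t1; t2 } -> SS.union (tyvars t1) (tyvars t2)

(* tyvars on a type scheme removes the bound type variables *)
let tyvars_scheme (TypeScheme { tyvars = bound; tau }) =
  SS.diff (tyvars tau) bound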

2.2.2.4 Substitutions
A substitution S is a map from type variables to types, S : TyVar → Type. We denote the identity
substitution by ID. Moreover, every substitution acts as the identity outside its domain. A
substitution S is ground if every type τ in its range is a monotype. Similarly to the last section,
we will define the substitution operations. Parentheses around the argument may be omitted,
i.e. we write Sτ for S(τ).
S(τ )
By case distinction of τ we find out what the result of the operation would be.
τ =π
The result of the substitution applied to the nullary type constructor π is π.
τ =α
If {α 7→ τ1 } ⊆ S then Sα = τ1 else Sα = α.
Example Let S = {β 7→ γ}. If α = β then Sα = Sβ = γ. However, if α = γ then
Sα = Sγ = γ.
τ = τ1 → τ2
The result of the substitution applied to an abstraction is defined as Sτ1 → Sτ2 . For
the sake of proofs introduced later we explicitly write it as a definition

Definition 2.2.1. For any substitution S and any type of the form τ1 → τ2 then
S(τ1 → τ2 ) = Sτ1 → Sτ2
τ = τ1 × τ2
The result of the substitution applied to a tuple is defined as Sτ1 × Sτ2 . For the sake
of proofs introduced later we explicitly write it as a definition

Definition 2.2.2. For any substitution S and any type of the form τ1 × τ2 then
S(τ1 × τ2 ) = Sτ1 × Sτ2
See Appendix A for a summary of the substitution operations on types.
S(σ)

For this operation, we will use the notation {αi 7→ βi } = {α1 7→ β1 , . . . , αn 7→ βn } where n is
the number of bound variables in the respective σ. We would like to define Sσ such that the
following definition holds for σ1 = σ and σ2 = Sσ.

Definition 2.2.3. Let σ1 = ∀α1 . . . αn.τ1 and σ2 = ∀β1 . . . βm.τ2 be type schemes and S be a
substitution. We write σ1 −S→ σ2 if all of the following hold

1. m = n

2. {αi ↦ βi} is a bijection and βi ∉ ⋃ { tyvars(τ) : τ ∈ Range S }

3. (S ± {αi ↦ βi})τ1 = τ2

7
Definition 2.2.4. Given a substitution S, a typescheme σ = ∀α1 . . . αn .τ , and n fresh variables
β1 , . . . , βn , Sσ = S(∀α1 . . . αn .τ ) = ∀β1 . . . βn .(S ± {αi 7→ βi })τ
Lemma 2.2.1. For any type scheme σ and any substitution S, σ −S→ Sσ.

Proof of lemma 2.2.1. The first condition obviously holds since we use 1, . . . , n for the
numbering of the bound variables in both σ and Sσ. The second condition holds since β1, . . . , βn
are fresh type variables and thus are not present anywhere else. We see that in our case τ1 = τ
and τ2 = (S ± {αi ↦ βi})τ1, which is exactly what the third condition requires. Thus, the third
condition holds as well.

Example 7 (Substitution on a type scheme). Let σ = ∀α1 .α → α1 and S = {α 7→ bool, α1 7→ int}


then

Sσ = S(∀α1 .α → α1 )
= ∀β1 .(S ± {α1 7→ β1 })(α → α1 )
= ∀β1 .({α 7→ bool, α1 7→ int} ± {α1 7→ β1 })(α → α1 )
= ∀β1 .{α 7→ bool, α1 7→ β1 }(α → α1 )
= ∀β1 .bool → β1

Note that the ± operator prioritizes the second operand which is why we get {α 7→ bool, α1 7→
int} ± {α1 7→ β1 } = {α 7→ bool, α1 7→ β1 }.

S(Γ)
Similarly to Sσ, we would like to define SΓ such that the following definition holds for Γ1 = Γ
and Γ2 = SΓ.
Definition 2.2.5. Let Γ1, Γ2 be type environments. We write Γ1 −S→ Γ2 if the following holds

1. Dom Γ1 = Dom Γ2

2. ∀x ∈ Dom Γ1, Γ1(x) −S→ Γ2(x)

Definition 2.2.6. Given a substitution S and a type environment Γ,

  SΓ = ⋃ { {x ↦ S(Γ(x))} : x ∈ Dom Γ }

Lemma 2.2.2. For any type environment Γ and any substitution S, Γ −S→ SΓ.

Proof of lemma 2.2.2. The lemma is a special case of definition 2.2.5. By definition 2.2.6,
Dom SΓ = Dom Γ, so the first condition holds. The second condition in our case is
Γ(x) −S→ SΓ(x), which by our definition is the same as Γ(x) −S→ S(Γ(x)). Since Γ is
a map to type schemes we get σ −S→ Sσ for some σ, and therefore the second condition holds
by lemma 2.2.1.

8
Example 8 (Substitution on a type environment). Let Γ = {x 7→ ∀α1 .α → α1 , y 7→ ∀α1 , α2 .α →
int} and S = {α 7→ int} then

SΓ = {x 7→ S(Γ(x))} ∪ {y 7→ S(Γ(y))}
= {x 7→ S(∀α1 .α → α1 )} ∪ {y 7→ S(∀α1 , α2 .α → int)}
= {x 7→ ∀β1 .int → β1 } ∪ {y 7→ ∀β1 , β2 .int → int} By definition 2.2.4
= {x 7→ ∀β1 .int → β1 , y 7→ ∀β1 , β2 .int → int}

Sn . . . S2 S1
With the operations defined, we can look at the composition of substitutions. It is defined
generally as follows

  Sn . . . S2 S1 = ⋃ { {α ↦ Sn . . . S2 τ} : α ∈ Dom S1, τ = S1 α }

The order of evaluation is from right to left. For instance, for three substitutions S1, S2, S3
and any type τ we have S3 S2 S1 τ = S3(S2(S1 τ)).

Example 9 (Composite substitution). Assume three substitutions S1 , S2 , S3

S1 = {α 7→ β}
S2 = {β 7→ γ}
S3 = {γ 7→ δ}

and let the composite substitution be S3 S2 S1 then

S3 S2 S1 = {α 7→ S3 S2 β}
= {α 7→ S3 γ}
= {α 7→ δ}
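The composition can be made concrete with a small OCaml sketch of our own (the thesis's actual implementation is the substitution.ml file shown in Section 3.3.1; the type and function names below are illustrative only). The assertion replays Example 9.

(* Sketch (ours): substitutions as maps from type variables to types,
   with application and right-to-left composition. *)
module M = Map.Make (Int)

type typ = TyCon of string | TyVar of int | Fun of typ * typ | Tup of typ * typ

let rec apply s t =
  match t with
  | TyCon _ -> t
  | TyVar a -> (match M.find_opt a s with Some u -> u | None -> t)
  | Fun (t1, t2) -> Fun (apply s t1, apply s t2)
  | Tup (t1, t2) -> Tup (apply s t1, apply s t2)

(* compose s2 s1 behaves like applying s1 first and then s2 *)
let compose s2 s1 =
  M.union (fun _ _ v2 -> Some v2) s2 (M.map (apply s2) s1)

(* Example 9: S3 S2 S1 maps α to δ (here α = 0, β = 1, γ = 2, δ = 3). *)
let s1 = M.singleton 0 (TyVar 1)
let s2 = M.singleton 1 (TyVar 2)
let s3 = M.singleton 2 (TyVar 3)
let () = assert (apply (compose s3 (compose s2 s1)) (TyVar 0) = TyVar 3)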

2.2.2.5 Instantiation
The > operator in the S-Lookup rule indicates that a type τ1 is an instantiation of a type scheme
σ = ∀α1 . . . αn .τ2 written as σ > τ1 . More specifically, τ1 is an instantiation of σ if there exists a
substitution S with domain {α1 , . . . , αn } and range {β1 , . . . , βn } where each of the type variables
in the range are fresh such that τ1 = Sτ2 .

Example 10 (Instantiation). Let σ = ∀α.α → α and Γ = {x 7→ ∀β.β → int, y 7→ ∀γ.γ × γ} and


S = {α ↦ δ}. Obviously, δ is a fresh type variable. We have that τ1 = S(α → α) = Sα → Sα =
δ → δ.

2.2.2.6 Generalization
In the S-Let rule, the ClosΓ operation on any type τ is defined as ClosΓ τ = ∀α1 , . . . , αn .τ where
{α1 , . . . , αn } is defined as tyvars(τ ) \ tyvars(Γ). Thus, this operation tries to generalize over each
type variable in τ but only if it does not already exist in the free type variables of Γ.

9
Example 11 (Simple generalization). If τ = α → α and Γ = ∅ then to compute ClosΓ τ we first
compute tyvars(τ ) \ tyvars(Γ) = {α} \ ∅ = {α}. Thus, ClosΓ τ = ∀α.α → α.

Example 12 (Complicated generalization). A more complicated version would be if τ = α →


(int × β) → (γ × string) → δ → int and Γ = {x 7→ ∀η.γ → η, y 7→ ∀α.α → α} then

tyvars(τ ) \ tyvars(Γ) = tyvars(α → (int × β) → (γ × string) → δ → int)


\ tyvars(∀η.γ → η) ∪ tyvars(∀α.α → α)
= {α, β, γ, δ} \ {γ}
= {α, β, δ}

which means we can quantify τ with α, β, δ and thus ClosΓ τ = ∀α, β, δ.α → (int × β) →
(γ × string) → δ → int. See Appendix B for a more detailed derivation.

2.2.2.7 Derivation Examples


In Appendix C, we show a monomorphic and polymorphic type inference derivation for two
different expressions. In the examples, we make use of what we have covered so far. It is strongly
encouraged to take a look at these examples.

2.3 Consistency Proof


Although the static semantics is independent of the dynamic semantics we would still like to
have consistency between them. The proofs presented in this section imply that a well-typed
program cannot evaluate to wrong.
Before going into the proofs, we need a relation between values and types

Definition 2.3.1. We say that v has monotype µ written ⊨ v : µ if one of the following holds

• v = bv and bv has the correct type, i.e. for instance for bv = "myVariable" we have µ = string, etc.

• v = [x, e, E] and µ = µ1 → µ2 for some monotypes µ1 , µ2 and


for all v1 , r if ⊨ v1 : µ1 and E ± {x 7→ v1 } ⊢ e → r then r ̸= wrong and ⊨ r : µ2
This is extended as follows.

Definition 2.3.2. Let Γ◦ be a type environment that ranges over all closed type environments
then we define the following

⊨ v : τ                 if for all total ground substitutions S we have ⊨ v : Sτ
⊨ v : ∀α1 . . . αn.τ    if ⊨ v : τ
⊨ E : Γ◦                if Dom E = Dom Γ◦ and ⊨ E(x) : Γ◦(x) for all x ∈ Dom E
⊨ E : Γ                 if ⊨ E : Γ◦ and for a ground substitution S we have Γ −S→ Γ◦

We present consistency proofs for tuples, fst, and snd. The proofs for the other expressions can
be found in [4] and the base cases for integers, booleans, and strings are trivial and therefore
omitted.

Theorem 2.3.1 (Consistency of Static and Dynamic Semantic).


If ⊨ E : Γ and Γ ⊢ e : τ and E ⊢ e → r then r ̸= wrong and ⊨ r : τ

10
Proof of theorem 2.3.1. By structural induction on e.

For all cases, assume that


⊨E:Γ (2.1)

e = (e1 , e2 )

The type inference of the expression must have been of the form
S-Tuple
Γ ⊢ e1 : τ1 Γ ⊢ e2 : τ2
(2.2)
Γ ⊢ (e1 , e2 ) : τ1 × τ2

The evaluation must have been one of the following

D-Tuple
E ⊢ e1 → v1 E ⊢ e 2 → v2
(2.3)
E ⊢ (e1 , e2 ) → (v1 , v2 )

D-Tuple-Wrong1
E ⊢ e1 → wrong E ⊢ e 2 → v2
(2.4)
E ⊢ (e1 , e2 ) → wrong

D-Tuple-Wrong2
E ⊢ e 1 → v1 E ⊢ e2 → wrong
(2.5)
E ⊢ (e1 , e2 ) → wrong

We see that e1 → r1 and e2 → r2 for some r1 , r2 ∈ Results.

By our I.H. and 2.1, the first premise in 2.2, (2.3 / 2.4 / 2.5) we get

r1 ̸= wrong (2.6)
⊨ r1 : τ1 (2.7)

This means the evaluation cannot have been 2.4.

Furthermore, by our I.H and 2.1, the second premise in 2.2, (2.3 / 2.5) we get

r2 ̸= wrong (2.8)
⊨ r2 : τ2 (2.9)

This means the evaluation cannot have been 2.5. The evaluation must have been 2.3, meaning
that r1 = v1 , r2 = v2 , and r = (v1 , v2 ) ∈ Val × Val = Val. Since a Val cannot be wrong then
r ̸= wrong.

From 2.7 and 2.9 we have that ⊨ r1 : τ1 and ⊨ r2 : τ2 respectively. Since r = (v1 , v2 ) = (r1 , r2 )
and ⊨ (r1 , r2 ) : τ1 × τ2 , we must have that ⊨ r : τ1 × τ2 .

e = fst(e1 , e2 )

The type inference of the expression must have been of the form

11
S-Fst
Γ ⊢ (e1 , e2 ) : τ1 × τ2
(2.10)
Γ ⊢ fst(e1 , e2 ) : τ1

The evaluation must have been one of the following

D-Fst
E ⊢ e 1 → v1 E ⊢ e 2 → v2
(2.11)
E ⊢ fst(e1 , e2 ) → v1

D-Fst-Wrong1
E ⊢ e1 → wrong E ⊢ e2 → v2
(2.12)
E ⊢ fst(e1 , e2 ) → wrong

D-Fst-Wrong2
E ⊢ e 1 → v1 E ⊢ e2 → wrong
(2.13)
E ⊢ fst(e1 , e2 ) → wrong
We see that e1 → r1 and e2 → r2 for some r1 , r2 ∈ Results.

By our I.H. and 2.1, 2.10, (2.11 / 2.12 / 2.13) we get

r1 ̸= wrong (2.14)
⊨ r1 : τ1 (2.15)

This means the evaluation cannot have been 2.12.

By our I.H. and 2.1, 2.10, (2.11 / 2.13) we get

r2 ̸= wrong (2.16)
⊨ r2 : τ2 (2.17)

This means the evaluation cannot have been 2.13. The evaluation must have been 2.11, meaning
that r1 = v1 , r2 = v2 , and r = v1 ∈ Val. Since a Val cannot be wrong then r ̸= wrong.

From 2.15 we have that ⊨ r1 : τ1 . Since r = v1 = r1 and ⊨ r1 : τ1 , we must have that ⊨ r : τ1 .

e = snd(e1 , e2 )

Similar to the e = fst(e1 , e2 ) case and thus omitted here. However, the proof for this case can be
found in Appendix D.

12
Chapter 3

Type Inference - Algorithm W

With the language and its semantics defined, we will now dig into how an actual implementation
of a type inference algorithm for the language from Chapter 2 works.
In this chapter, we will present an inefficient type inference algorithm and soundness proofs for
it. We will not provide a completeness proof for the algorithm, but in [1] such proofs are presented
for the lambda calculus with let-polymorphism.

3.1 The Algorithm


In this section, we present the first type inference algorithm, denoted W. It is heavily based on
substitutions, but also on instantiation, generalization, and an operation called unification. W is
an inefficient algorithm due to the heavy usage of substitutions; in Chapter 4 we will
investigate how to achieve a more efficient algorithm. The following two sections present the idea
of unification and the pseudocode for W.

3.1.1 Unification
The idea of unification is to unify two types τ1 , τ2 i.e. making them equivalent to each other. In
W, the unification makes use of substitutions. Particularly, unification in the algorithm is defined
as Unify : (Type × Type) → Substitution where the range is the set of all substitutions.
The possible operations for Unify are defined below
Unify(π1 , π2 )
If π1 = π2 then we return the empty substitution ID, otherwise we fail.
Unify(α1 , α2 )
If α1 = α2 then we return the empty substitution ID otherwise we make α1 equal to α2 and thus
return {α1 7→ α2 }.

Unify(α, τ )
Unify(τ, α)

If α ∈ tyvars(τ ) then we fail, otherwise we make α equal to τ and thus return {α 7→ τ }.

Unify(τ11 → τ12 , τ21 → τ22 )


Unify(τ11 × τ12 , τ21 × τ22 )

13
The two operations above do the following. First, we get S1 = Unify(τ11, τ21) and then S2 =
Unify(S1 τ12, S1 τ22). If both succeed then the substitution composition S2 S1 is returned.
In all other possible cases the unification fails. The operations' semantics are taken from [3]. One
thing to note is the criterion we have in the Unify(α, τ) and Unify(τ, α) operations. The criterion
is needed to prevent the creation of infinite types. For instance, consider τ = α × α; then Unify(α, τ)
must fail since there is no finite type solving the symbolic equation α = (α × α).

Example 13 (Unification). Let the Unify function take the following two types as parameter
τ1 = int → α and τ2 = int → (β × γ) then

Unify(τ1, τ2) = Unify(int → α, int → (β × γ))
             = Unify(α, β × γ)          since Unify(int, int) = ID
             = {α ↦ β × γ}              since α ∉ tyvars(β × γ)

Definition 3.1.1. If S = Unify(τ1, τ2) then Sτ1 = Sτ2.


This definition states that the two types τ1 , τ2 we provide to the Unify function are equal under
the returned substitution, i.e. Sτ1 = Sτ2 .

3.1.2 Pseudocode
The pseudocode for W is presented below

W(Γ, e) = case e of
  i =⇒ (ID, int)
  b =⇒ (ID, bool)
  s =⇒ (ID, string)
  x =⇒
      if x ∉ Dom Γ then fail
      else let ∀α1 . . . αn.τ = Γ(x)
               β1 . . . βn be new
           in (ID, {αi ↦ βi}τ)
  λx.e1 =⇒
      let α be a new type variable
          (S1, τ1) = W(Γ ± {x ↦ α}, e1)
      in (S1, S1(α) → τ1)
  e1 e2 =⇒
      (S1, τ1) = W(Γ, e1)
      (S2, τ2) = W(S1 Γ, e2)
      let α be a new type variable
      S3 = Unify(S2(τ1), τ2 → α)
      in (S3 S2 S1, S3(α))
  let x = e1 in e2 =⇒
      (S1, τ1) = W(Γ, e1)
      (S2, τ2) = W(S1 Γ ± {x ↦ ClosS1 Γ τ1}, e2)
      in (S2 S1, τ2)
  (e1, e2) =⇒
      (S1, τ1) = W(Γ, e1)
      (S2, τ2) = W(S1 Γ, e2)
      (S2 S1, S2 τ1 × τ2)
  fst (e1, e2) =⇒
      let (S1, τ1 × τ2) = W(Γ, (e1, e2))
      in (S1, τ1)
  snd (e1, e2) =⇒
      let (S1, τ1 × τ2) = W(Γ, (e1, e2))
      in (S1, τ2)
  _ =⇒ fail

Figure 3.1: Pseudocode for W


where _ is the wildcard used to show all other cases, for instance, fst e1 where e1 is an expression
and not a tuple.
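As a quick sanity check (our own worked trace, not taken from the thesis text), running the pseudocode on the identity function from the empty type environment proceeds as follows:

W(∅, λx.x):
  let α be a new type variable
  (S1, τ1) = W({x ↦ α}, x) = (ID, α)       (the variable case; Γ(x) = α has no bound type variables)
  result: (S1, S1(α) → τ1) = (ID, α → α)

so W infers the type α → α for λx.x, which the let case may later generalize to ∀α.α → α via Clos.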

14
3.2 Soundness Proof
In this section we prove the soundness of W for each expression in our grammar. W is sound in
the following sense

Theorem 3.2.1 (Soundness of W). If (S, τ) = W(Γ, e) succeeds and Γ −S→ Γ′ then Γ′ ⊢ e : τ.

If W succeeds for some Γ, e and returns some S, τ, and the relation Γ −S→ Γ′ is satisfied, then e
is well-typed with τ under Γ′. To prove an expression is sound we will also need the following
lemma from [4].

Lemma 3.2.2. If Γ ⊢ e : τ and Γ −S→ Γ′ then Γ′ ⊢ e : Sτ.

We will not provide a proof for this lemma; however, a proof for let expressions is provided in [4].
Furthermore, we will use lemma 2.2.2, which states that Γ −S→ SΓ always holds.

Proof of theorem 3.2.1. By structural induction on e.

e=i

From the algorithm we have


(ID, int) = W(Γ, i) (3.1)
By lemma 2.2.2 where Γ = Γ and S = ID
Γ −ID→ IDΓ (3.2)

By filling in 3.1, 3.2 in 3.2.1 we get that we want to prove IDΓ ⊢ i : int.

By our S-Int rule we have the conclusion since it has no premise(s)


S-Int
(3.3)
Γ ⊢ i : int

By letting Γ = IDΓ1 we have IDΓ ⊢ i : int which is exactly what we wanted to prove.

e=b

From the algorithm we have


(ID, bool) = W(Γ, b) (3.4)
By lemma 2.2.2 where Γ = Γ and S = ID
Γ −ID→ IDΓ (3.5)

By filling in 3.4, 3.5 in 3.2.1 we get that we want to prove IDΓ ⊢ b : bool.

By our S-Bool rule we have IDΓ ⊢ b : bool which is what we wanted to show.

e=s

From the algorithm we have


(ID, string) = W(Γ, s) (3.6)
¹ This is allowed because the Γ in the S-Int rule is universally quantified, i.e. it stands for an arbitrary type environment Γ.

15
By lemma 2.2.2 where Γ = Γ and S = ID
Γ −ID→ IDΓ (3.7)

By filling in 3.6, 3.7 in 3.2.1 we get that we want to prove IDΓ ⊢ s : string

By our S-String rule we have IDΓ ⊢ s : string which is what we wanted to show.

e=x

From our algorithm we have


(ID, {αi 7→ βi }τ ) = W(Γ, x) (3.8)
By lemma 2.2.2 where Γ = Γ and S = ID
Γ −ID→ IDΓ (3.9)

By filling in 3.8, 3.9 in 3.2.1 we get that we want to prove

ID Γ ⊢ x : {αi 7→ βi }τ (3.10)

By our S-Lookup rule


S-Lookup
x ∈ Dom Γ Γ(x) > τ
(3.11)
Γ⊢x:τ
we have two premises. From the algorithm we have that

if x ∉ Dom Γ then fail else let ∀α1 . . . αn.τ = Γ(x) (3.12)

Let us consider the two cases: x ∉ Dom Γ and x ∈ Dom Γ. Assume x ∉ Dom Γ; then by
3.12 W fails. However, by assumption W succeeds (by Theorem 3.2.1) and therefore we have a
contradiction. Thus, x ∈ Dom Γ. Furthermore, we have from the algorithm

β1 . . . βn be new (3.13)

and the type from 3.10 corresponds to τ in Γ(x) > τ .

Thus, we have that the two premises hold and by the conclusion from 3.11 we have IDΓ ⊢ x :
{αi 7→ βi }τ which is what we wanted to prove.

e = λx.e1

We want to show S1 Γ ⊢ λx.e1 : S1 α → τ1 .

From our algorithm we have

(S1 , τ1 ) = W(Γ ± {x 7→ α}, e1 ) (3.14)

By lemma 2.2.2 where Γ = Γ ± {x 7→ α} and S = S1


Γ ± {x ↦ α} −S1→ S1(Γ ± {x ↦ α}) (3.15)

By our I.H. and 3.14, 3.15 we get

S1 (Γ ± {x 7→ α}) ⊢ e1 : τ1 (3.16)

16
By applying the substitution S1 on α and Γ we get

S1 Γ ± {x 7→ S1 α} ⊢ e1 : τ1 (3.17)

Thus, by letting 3.17 be the premise of the S-Lambda rule


S-Lambda
Γ ± {x 7→ τ ′ } ⊢ e1 : τ
(3.18)
Γ ⊢ λx.e1 : τ ′ → τ

we can conclude S1 Γ ⊢ λx.e1 : S1 α → τ1 which is what we wanted to show.

e = e1 e2

We want to show S3 S2 S1 Γ ⊢ e1 e2 : S3 α.

From our algorithm we have


(S1 , τ1 ) = W(Γ, e1 ) (3.19)
By lemma 2.2.2 where Γ = Γ and S = S1 we have
Γ −S1→ S1 Γ (3.20)

By our I.H. and 3.19, 3.20 we have


S1 Γ ⊢ e1 : τ1 (3.21)
From our algorithm we have
(S2 , τ2 ) = W(S1 Γ, e2 ) (3.22)
By lemma 2.2.2 where Γ = S1 Γ and S = S2 we have
S1 Γ −S2→ S2 S1 Γ (3.23)

By our I.H. and 3.22, 3.23 we have


S2 S1 Γ ⊢ e2 : τ2 (3.24)
By lemma 3.2.2 and 3.21, 3.23 we have

S2 S1 Γ ⊢ e1 : S2 τ1 (3.25)

From our algorithm we have

let α be a new type variable


(3.26)
S3 = Unify(S2 (τ1 ), τ2 → α)

By definition 3.1.1 and 3.26 we have

S3 S2 τ1 = S3 (τ2 → α) (3.27)

By definition 2.2.1 and the RHS of the equality sign in 3.27 we have

S3 S2 τ1 = S3 (τ2 → α) = S3 τ2 → S3 α (3.28)

By lemma 2.2.2 where Γ = S2 S1 Γ and S = S3 we have


S2 S1 Γ −S3→ S3 S2 S1 Γ (3.29)

17
By lemma 3.2.2 and 3.25, 3.29 we have

S3 S2 S1 Γ ⊢ e1 : S3 S2 τ1 (3.30)

By the equality presented in 3.28 and by replacing the type in 3.30 we get

S3 S2 S1 Γ ⊢ e1 : S3 τ2 → S3 α (3.31)

By lemma 3.2.2 and 3.24, 3.29 we have

S3 S2 S1 Γ ⊢ e2 : S3 τ2 (3.32)

Thus, by letting 3.31 and 3.32 be the premises of the S-App rule
S-App
Γ ⊢ e1 : τ ′ → τ Γ ⊢ e2 : τ ′
Γ ⊢ e1 e2 : τ

we can conclude S3 S2 S1 Γ ⊢ e1 e2 : S3 α which is what we wanted to show.

e = (let x = e1 in e2 )

We want to show S2 S1 Γ ⊢ let x = e1 in e2 : τ2 .

From our algorithm we have


(S1 , τ1 ) = W(Γ, e1 ) (3.33)

By lemma 2.2.2 where Γ = Γ and S = S1 we have


Γ −S1→ S1 Γ (3.34)

By our I.H. and 3.33, 3.34 we have


S1 Γ ⊢ e1 : τ1 (3.35)

From our algorithm we have

(S2 , τ2 ) = W(S1 Γ ± {x 7→ ClosS1 Γ τ1 }, e2 ) (3.36)

By definition of the Closτ operation it can not make the algorithm stop and we can proceed
without any issues. Thus, by lemma 2.2.2 where Γ = S1 Γ ± {x 7→ ClosS1 Γ τ1 } and S = S2 we
have
S1 Γ ± {x ↦ ClosS1 Γ τ1} −S2→ S2(S1 Γ ± {x ↦ ClosS1 Γ τ1}) (3.37)

By our I.H. and 3.36, 3.37 we have

S2 (S1 Γ ± {x 7→ ClosS1 Γ τ1 }) ⊢ e2 : τ2 (3.38)

Which is the same as


S2 S1 Γ ± {x 7→ ClosS2 S1 Γ S2 τ1 } ⊢ e2 : τ2 (3.39)

By lemma 2.2.2 where Γ = S1 Γ and S = S2 we have


S1 Γ −S2→ S2 S1 Γ (3.40)

18
By lemma 3.2.2 and 3.35, 3.40 we have

S2 S1 Γ ⊢ e1 : S2 τ1 (3.41)

Thus, by letting 3.41, 3.39 be the premises of the S-Let rule


S-Let
Γ ⊢ e1 : τ1 Γ ± {x 7→ ClosΓ τ1 } ⊢ e2 : τ
Γ ⊢ let x = e1 in e2 : τ

we can conclude that S2 S1 Γ ⊢ let x = e1 in e2 : τ2 which is what we wanted to show.

e = (e1 , e2 )

We want to show S2 S1 Γ ⊢ (e1 , e2 ) : S2 τ1 × τ2 .

From our algorithm we have


(S1 , τ1 ) = W(Γ, e1 ) (3.42)

By lemma 2.2.2 where Γ = Γ and S = S1 we have


Γ −S1→ S1 Γ (3.43)

By our I.H. and 3.42, 3.43 we have


S1 Γ ⊢ e1 : τ1 (3.44)

From our algorithm we have


(S2 , τ2 ) = W(S1 Γ, e2 ) (3.45)

By lemma 2.2.2 where Γ = S1 Γ and S = S2 we have


S1 Γ −S2→ S2 S1 Γ (3.46)

By lemma 3.2.2 and 3.44, 3.46 we have

S2 S1 Γ ⊢ e1 : S2 τ1 (3.47)

By our I.H. and 3.45, 3.46 we have


S2 S1 Γ ⊢ e2 : τ2 (3.48)

Thus, by letting 3.47, 3.48 be the premises of the S-Tuple rule


S-Tuple
Γ ⊢ e1 : τ1 Γ ⊢ e2 : τ2
Γ ⊢ (e1 , e2 ) : τ1 × τ2

we can conclude S2 S1 Γ ⊢ (e1 , e2 ) : S2 τ1 × τ2

e = fst(e1 , e2 )

We want to show S1 Γ ⊢ fst (e1 , e2 ) : τ1

From our algorithm we have


(S1 , τ1 × τ2 ) = W(Γ, (e1 , e2 )) (3.49)

19
By lemma 2.2.2 where Γ = Γ and S = S1 we have
Γ −S1→ S1 Γ (3.50)

By our I.H. and 3.49, 3.50 we have

S1 Γ ⊢ (e1 , e2 ) : τ1 × τ2 (3.51)

Thus, by letting 3.51 be the premise of the S-Fst rule


S-Fst
Γ ⊢ (e1 , e2 ) : τ1 × τ2
Γ ⊢ fst (e1 , e2 ) : τ1

we can conclude S1 Γ ⊢ fst (e1 , e2 ) : τ1

e = snd(e1 , e2 )

Similar to the e = fst(e1 , e2 ) case and therefore omitted. However, the proof for this case can be
found in Appendix D.

3.3 Implementation
In this section we will implement W for the language defined in Chapter 2. We will make use of
the pseudocode for W defined in this chapter and the language and its semantics.

3.3.1 Code
Our chosen programming language for the implementation is OCaml. Its own type system is based
on the Hindley-Milner type system, which makes it an obvious candidate for implementing type
inference for another language based on the same system. The relevant files for W are presented
in the following.

20
The types defined in Section 2.2.2.1 are implemented as follows

types.ml

module SS = Set.Make(Int)

exception Fail of string

type tyvar = int

type tycon =
  | Int
  | Bool
  | String

type typ =
  | TyCon of tycon
  | TyVar of tyvar
  | TyFunApp of {t1: typ; t2: typ}
  | TyTuple of {t1: typ; t2: typ}

type typescheme =
  TypeScheme of {tyvars: SS.t; tau: typ}

type program_variable = string

The grammar defined in Section 2.1 is implemented as an AST (Abstract Syntax Tree)

ast.ml

open Types

type bas_val =
  | Int of int
  | Bool of bool
  | String of string

type exp =
  | BasVal of bas_val
  | Var of program_variable
  | Lambda of { id : program_variable; e1 : exp }
  | App of { e1 : exp; e2 : exp }
  | Let of { id : program_variable; e1 : exp; e2 : exp }
  | Tuple of { e1 : exp; e2 : exp }
  | Fst of exp
  | Snd of exp

21
The type environment Γ defined in Section 2.2.2.2 is implemented as follows

typeEnv.ml

open Types

module Gamma =
  Map.Make (struct
    type t = program_variable
    let compare = compare
  end)

type ts_map = typescheme Gamma.t

let wrap_monotype tau =
  TypeScheme {tyvars=SS.empty; tau}

let empty = Gamma.empty

let add k v t = Gamma.add k v t

let look_up k t = Gamma.find_opt k t

let remove k t = Gamma.remove k t

let bindings t = Gamma.bindings t

let map m t = Gamma.map m t

let counter = ref 0
let get_next_tyvar () =
  counter := !counter + 1;
  !counter
let reset () = counter := 0

The substitution S and its operations as defined in Section 2.2.2.4 are implemented as follows

substitution.ml

open Types

module TE = TypeEnv

module Substitution =
  Map.Make (struct
    type t = tyvar
    let compare = compare
  end)

type map_type = typ Substitution.t
let empty: map_type = Substitution.empty
let add k v t : map_type =
  Substitution.add k v t
let look_up k t: typ option =
  Substitution.find_opt k t
let remove k t: map_type =
  Substitution.remove k t
let get_or_else k t ~default =
  match look_up k t with
  | Some v -> v
  | None -> default
let apply t typ: typ =
  let rec subst typ' =
    match typ' with
    | TyCon _ -> typ'
    | TyVar tv ->
      get_or_else tv t ~default:typ'
    | TyFunApp {t1; t2} ->
      TyFunApp {t1 = subst t1; t2 = subst t2}
    | TyTuple {t1; t2} ->
      TyTuple {t1 = subst t1; t2 = subst t2}
  in
  subst typ
let apply_to_typescheme t
    (TypeScheme{tyvars; tau}) =
  TypeScheme{tyvars; tau=apply t tau}
let apply_to_gamma t gamma =
  TE.map (apply_to_typescheme t) gamma
let map m (t: map_type) = Substitution.map m t
let union a b c =
  Substitution.union a b c
let compose (s2: map_type) (s1: map_type) =
  union (fun _ _ v2 -> Some v2) s2
    (map (fun v -> apply s2 v) s1)
let bindings t = Substitution.bindings t
let of_seq s = Substitution.of_seq s
let of_list l = of_seq (List.to_seq l)

A small helper file

utils.ml

open Types

module TE = TypeEnv

let new_tyvar () = TyVar (TE.get_next_tyvar())

let ( +- ) gamma (id, ts) = TE.add id ts gamma

let ( !& ) t = TE.wrap_monotype t

let ( => ) t1 t2 = TyFunApp { t1; t2 }

let ( ** ) t1 t2 = TyTuple { t1; t2 }

let combine_sets sets =
  List.fold_left
    (fun a b -> SS.union a b) SS.empty sets

let assoc_or_else bindings key ~default =
  Option.value (List.assoc_opt key bindings)
    ~default:default

22
algorithmW.ml

open Types
open Utils

module A = Ast
module S = Substitution

let rec find_tyvars tau = match tau with
  | TyCon _ -> SS.empty
  | TyVar alpha -> SS.singleton alpha
  | TyFunApp { t1; t2 } ->
    SS.union (find_tyvars t1) (find_tyvars t2)
  | TyTuple { t1; t2 } ->
    SS.union (find_tyvars t1) (find_tyvars t2)

let find_free_tyvars (TypeScheme {tyvars; tau}) =
  SS.diff (find_tyvars tau) tyvars

let clos gamma tau =
  let free_tyvars_tau = find_tyvars tau in
  let free_tyvars_gamma =
    combine_sets (List.map (fun (_, v) ->
      find_free_tyvars v) (TE.bindings gamma)) in
  TypeScheme { tyvars = SS.diff
    free_tyvars_tau free_tyvars_gamma; tau }

let occurs_check tyvar tau =
  if SS.mem tyvar (find_tyvars tau) then
    raise (Fail "recursive unification")

let rec unify t1 t2 =
  match t1, t2 with
  | TyCon c1, TyCon c2 -> if c1 = c2 then
      S.empty else raise (Fail "cannot unify")
  | TyVar tv1, TyVar tv2 -> if tv1 = tv2 then
      S.empty else S.add tv1 t2 S.empty
  | TyVar tv, _ -> occurs_check tv t2;
      S.add tv t2 S.empty
  | _, TyVar tv -> occurs_check tv t1;
      S.add tv t1 S.empty
  | TyFunApp { t1 = t11; t2 = t12 },
    TyFunApp { t1 = t21; t2 = t22 }
  | TyTuple { t1 = t11; t2 = t12 },
    TyTuple { t1 = t21; t2 = t22 } ->
      let s1 = unify t11 t21 in
      let s2 =
        unify (S.apply s1 t12) (S.apply s1 t22) in
      S.compose s2 s1
  | _ -> raise (Fail "unify _ case")

let specialize (TypeScheme{tyvars; tau}) =
  let bindings =
    List.map (fun tv ->
      (tv, new_tyvar())) (SS.elements tyvars) in
  let subst = S.of_list bindings in
  (S.empty, S.apply subst tau)

let infer_type exp =
  let rec w gamma exp =
    match exp with
    | A.Var id -> (
        match TE.look_up id gamma with
        | None -> raise (Fail "id not in type environment")
        | Some ts -> specialize ts)
    | A.Lambda { id; e1 } ->
        let alpha = new_tyvar () in
        let s1, tau1 =
          w (gamma +- (id, !&alpha)) e1 in
        (s1, S.apply s1 alpha => tau1)
    | A.App { e1; e2 } ->
        let (s1, tau1) = w gamma e1 in
        let (s2, tau2) =
          w (S.apply_to_gamma s1 gamma) e2 in
        let alpha = new_tyvar () in
        let s3 =
          unify (S.apply s2 tau1) (tau2 => alpha)
        in (S.compose s3 (S.compose s2 s1),
            S.apply s3 alpha)
    | A.Let { id; e1; e2 } ->
        let (s1, tau1) = w gamma e1 in
        let s1_gamma =
          S.apply_to_gamma s1 gamma in
        let (s2, tau2) =
          w (s1_gamma +- (id, clos s1_gamma tau1)) e2
        in (S.compose s2 s1, tau2)
    | A.Tuple { e1; e2 } ->
        let (s1, tau1) = w gamma e1 in
        let (s2, tau2) =
          w (S.apply_to_gamma s1 gamma) e2 in
        (S.compose s2 s1,
         (S.apply s2 tau1) ** tau2)
    | A.Fst e1 -> (
        let (s1, tau1) = w gamma e1 in
        match tau1 with
        | TyTuple { t1; _ } -> (s1, t1)
        | _ -> raise (Fail "expected tuple"))
    | A.Snd e1 -> (
        let (s1, tau1) = w gamma e1 in
        match tau1 with
        | TyTuple { t2; _ } -> (s1, t2)
        | _ -> raise (Fail "expected tuple"))
    | A.BasVal b ->
        match b with
        | Int _ -> (S.empty, TyCon Int)
        | Bool _ -> (S.empty, TyCon Bool)
        | String _ -> (S.empty, TyCon String)
  in
  snd (w TE.empty exp)

23
The workhorse function is infer_type which contains the recursive function w corresponding to
the pseudocode we presented in Section 3.1.2. The other functions are dedicated to unification
as defined in Section 3.1.1, the Clos operation, the tyvars map (finding free type variables etc.),
and instantiation as defined in Section 2.2.2.
The code and files presented so far are the ones that cover the implementation of W, however,
practicalities such as the examples we used to test our implementation are not crucial for our
thesis, but can be found in Appendix E. The whole implementation i.e. all the files used for
implementing W can be found on our GitHub page here.
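To illustrate how the implementation is invoked, the following is a usage sketch of our own (it is not one of the thesis files). It builds the polymorphic example from Appendix C.2, let id = λx.x in (id 1, id "hello"), with the constructors from ast.ml, and assumes it is placed in a context where infer_type, the Ast alias A and the types from types.ml are in scope, as in algorithmW.ml.

(* Usage sketch (ours): infer the type of  let id = λx.x in (id 1, id "hello") *)
let example =
  A.Let { id = "id";
          e1 = A.Lambda { id = "x"; e1 = A.Var "x" };
          e2 = A.Tuple { e1 = A.App { e1 = A.Var "id"; e2 = A.BasVal (A.Int 1) };
                         e2 = A.App { e1 = A.Var "id"; e2 = A.BasVal (A.String "hello") } } }

let inferred = infer_type example
(* inferred = TyTuple { t1 = TyCon Int; t2 = TyCon String }, i.e. int × string,
   which agrees with the derivation done by hand in Appendix C.2 *)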

24
Chapter 4

Optimizing Algorithm W

4.1 The Algorithm


In the words of Robin Milner himself, "W is hardly an efficient algorithm; substitutions are applied too
often" [2]. In this chapter, we present a data structure as an alternative to substitutions,
and we present an optimized version of algorithm W, which we denote Wopt.

4.1.1 Union-Find Data Structure


One way to optimize algorithm W is to implement the substitutions as a data structure. Par-
ticularly, the data structure we will use is called Union-Find. The idea behind Union-Find is
to have disjoint sets of nodes. This means we have sets of nodes where all nodes in a set are
connected. The data structure supports the creation of new sets, linking two nodes, and finding
the representative node for the set containing a given node. In our case, a node is a type and we
can Link a type τ1 to a new type τ2 . Calling Find on any type τ in a set will now return the
representative type in that set which is equivalent to performing the substitutions. This means
we do not need to use explicit substitutions in our algorithm anymore.
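To make these operations concrete, here is a small, generic union-find sketch of our own; it is not the thesis's code, and Section 4.2 instead stores the link directly inside the TyVar constructor. The node and constructor names are illustrative.

(* Generic union-find sketch (ours): a node is either a Root carrying a
   value or a Link pointing at another node. *)
type 'a node = { mutable parent : 'a parent }
and 'a parent = Root of 'a | Link of 'a node

let make_set x = { parent = Root x }          (* creation of a new set *)

let rec find n =                              (* find, with path compression *)
  match n.parent with
  | Root _ -> n
  | Link p ->
      let r = find p in
      n.parent <- Link r;                     (* point directly at the root *)
      r

let link n1 n2 =                              (* link n1's set to n2's set *)
  let r1 = find n1 and r2 = find n2 in
  if r1 != r2 then r1.parent <- Link r2

let () =
  let a = make_set "a" and b = make_set "b" in
  link a b;
  assert (find a == find b)                   (* a and b now share a representative *)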

4.1.2 Unification
When using substitutions, our Unify function from 3.1.1 would either fail or return a substitution
that unifies the two types. With Union-Find, we do not have substitutions, so instead we link
the types as needed. Let Unifyopt : (Type × Type) → unit be the unification function
for Wopt, where () is the value of the unit type defined in OCaml. The possible operations for Unifyopt are
defined below
Unifyopt (π1 , π2 )

If π1 = π2 then () else Fail

Unifyopt (α1 , α2 )

If α1 = α2 then () else Link(α1 , α2 )

Unifyopt (α, τ )
Unifyopt (τ, α)

If α ∈ tyvars(τ ) then Fail else Link(α, τ )

25
Unifyopt (τ11 → τ12 , τ21 → τ22 )
Unifyopt (τ11 × τ12 , τ21 × τ22 )

We call Unifyopt(τ11′, τ21′) and then Unifyopt(τ12′, τ22′), where the prime symbol is used to specify the
find action on a type, i.e. τ′ means Find(τ).

Example 14 (Unification). Let the Unifyopt function take the following two types as parameter
τ1 = int → α and τ2 = int → (β × γ) then Unifyopt (τ1 , τ2 ) will do the Link(α, β × γ) operation
i.e. putting α into the set with β × γ so we get {α, β × γ} and whenever we do the Find(α)
operation it returns β × γ since it is now the representative of that set.

4.1.3 Pseudocode
We will again use the prime symbol to specify the find action on a type, i.e. τ ′ means Find(τ ).

W(Γ, e) = case e of
  i =⇒ int
  b =⇒ bool
  s =⇒ string
  x =⇒
      if x ∉ Dom Γ then fail
      else let ∀α1 . . . αn.τ = Γ(x)
               β1 . . . βn be new
           in {αi ↦ βi}τ
  λx.e1 =⇒
      let α be a new type variable
          τ1 = W(Γ ± {x ↦ α}, e1)
      in α′ → τ1
  e1 e2 =⇒
      τ1 = W(Γ, e1)
      τ2 = W(Γ, e2)
      let α be a new type variable
      Unify(τ1′, τ2 → α)
      in α
  let x = e1 in e2 =⇒
      τ1 = W(Γ, e1)
      τ2 = W(Γ ± {x ↦ ClosΓ τ1}, e2)
      in τ2
  (e1, e2) =⇒
      τ1 = W(Γ, e1)
      τ2 = W(Γ, e2)
      τ1 × τ2
  fst e1 =⇒
      τ1 = W(Γ, e1)
      if typeof(τ1) = τ2 × τ3 then τ2 else fail
  snd e1 =⇒
      τ1 = W(Γ, e1)
      if typeof(τ1) = τ2 × τ3 then τ3 else fail

Figure 4.1: Pseudocode for Wopt

26
4.2 Implementation
The files ast.ml and typeEnv.ml have not been changed, but substitution.ml has obviously
been deleted. In the following, we present the files that have been changed and we only write out
the parts and/or functions that have been changed substantially.
We decided to implement the Union-Find data structure directly into the types instead of
wrapping all types in nodes since we know that only tyvars can be linked to another type and
this way we will not have to unwrap the types all the time.
Note that we have implemented path compression in the Find operation. Path compression
makes sure that the pointer always points to the root instead of an intermediary.
types.ml
...
type typ =
...
| TyVar of tyvar ref
...

and tyvar = Int of int | Link of typ


...

(* Union-find *)
let rec find typ: typ = match typ with
| TyVar ({contents = Link t} as kind) ->
let root = find t in
kind := Link root;
root
| _ -> typ

Small change since we changed our TyVar type


utils.ml
...
let new_tyvar () =
TyVar (ref (Int (TE.get_next_tyvar())))
...

A lot has changed since we use pointers, no substitution maps and a different approach to
unification
algorithmW.ml
open Types
open Utils

module A = Ast

let rec find_tyvars tau = match ~$tau with


| TyCon _ -> SS.empty
| TyVar {contents = Int alpha} -> SS.singleton alpha
| TyVar _ -> raise (Fail "find_tyvars link")
| TyFunApp { t1; t2 } -> SS.union (find_tyvars t1) (find_tyvars t2)
| TyTuple { t1; t2 } -> SS.union (find_tyvars t1) (find_tyvars t2)

27
...

let rec unify t1 t2 =


match t1, t2 with
| TyCon c1, TyCon c2 -> if c1 = c2 then () else raise (Fail "cannot unify")
| TyVar tv1, TyVar tv2 -> if tv1 = tv2 then () else union tv1 t2
| TyVar ({contents = Int i} as tv), _ -> occurs_check i t2; union tv t2
| _, TyVar ({contents = Int i} as tv) -> occurs_check i t1; union tv t1
| TyFunApp { t1 = t11; t2 = t12 }, TyFunApp { t1 = t21; t2 = t22 }
| TyTuple { t1 = t11; t2 = t12 }, TyTuple { t1 = t21; t2 = t22 } ->
unify ~$t11 ~$t21;
unify ~$t12 ~$t22
| _ -> raise (Fail "unify _ case")

let specialize (TypeScheme{tyvars; tau}) =


let bindings = List.map (fun tv -> (tv, new_tyvar())) (SS.elements tyvars) in
let rec subst_tyvars tau =
let tau = ~$tau in
match tau with
| TyCon _ -> tau
| TyVar {contents = Int i} -> assoc_or_else bindings i ~default:tau
| TyVar _ -> raise (Fail "specialize link")
| TyFunApp {t1; t2} -> subst_tyvars t1 => subst_tyvars t2
| TyTuple {t1; t2} -> subst_tyvars t1 ** subst_tyvars t2
in
subst_tyvars tau

let infer_type exp =


let rec w gamma exp =
match exp with
| A.Var id -> (
match TE.look_up id gamma with
| None -> raise (Fail "id not in type environment")
| Some ts -> specialize ts)
| A.Lambda { id; e1 } ->
let alpha = new_tyvar () in
let tau1 = w (gamma +- (id, !&alpha)) e1 in
~$alpha => tau1
| A.App { e1; e2 } ->
let tau1 = w gamma e1 in
let tau2 = w gamma e2 in
let alpha = new_tyvar () in
unify ~$tau1 (tau2 => alpha);
~$alpha
| A.Let { id; e1; e2 } ->
let tau1 = w gamma e1 in
let tau2 = w (gamma +- (id, clos gamma tau1)) e2 in
tau2
| A.Tuple { e1; e2 } ->
let tau1 = w gamma e1 in
let tau2 = w gamma e2 in
tau1 ** tau2
| A.Fst e1 -> (
let tau1 = w gamma e1 in

28
match tau1 with TyTuple { t1; _ } -> t1 | _ -> raise (Fail "expected tuple"))
| A.Snd e1 -> (
let tau1 = w gamma e1 in
match tau1 with TyTuple { t2; _ } -> t2 | _ -> raise (Fail "expected tuple"))
| A.BasVal b ->
match b with
| Int _ -> TyCon Int
| Bool _ -> TyCon Bool
| String _ -> TyCon String
in
w TE.empty exp

The examples used for testing are the same examples used in W and there is a small change in
the Pretty Printer. The examples and the Pretty Printer can be found in Appendix E. The full
implementation of Wopt can be found on our GitHub page here.

29
Chapter 5

Conclusion

In this thesis, we have described the Hindley-Milner type system and the grammar of its language, and
extended the language to include integers, booleans, strings, tuples, fst, and snd. We have defined dynamic and static
semantics for the language and introduced concepts and operations such as type environments
and substitutions to help us formulate and implement type inference for the language in practice.
Formal definitions, as well as examples, have been presented for the described concepts and
operations, and we have proven consistency between our dynamic and static semantics. Unification
was introduced and its function in our algorithm, as well as its operations, were described. After
this, we presented pseudocode for a non-optimized algorithm for type inference in our language
which we call algorithm W. The soundness of algorithm W was proven for every expression form, and we
presented our working implementation of the algorithm with explanations as needed.
After having a working algorithm, we looked into optimizing it by using a Union-Find data
structure instead of substitutions. We introduced the Union-Find data structure shortly along
with its three operations, namely the creation of new sets, linking two nodes, and finding the
representative node in a set. We then related this to types in our language and presented
pseudocode for the new, optimized algorithm which we denoted Wopt . After presenting the
pseudocode for the optimized algorithm, we adapted our implementation to use a Union-Find
data structure with path compression and presented the main differences in our new working
implementation. Lastly, we added details, examples, and summaries to the appendices.

30
Bibliography

[1] Luis Manuel Martins Damas. Type assignment in programming languages. chapter 2, pages
74–82. 1984.
[2] Robin Milner. A theory of type polymorphism in programming. Journal of Computer and
System Sciences, 17(3):369, 1978.
[3] Peter Sestoft. Programming language concepts for software developers. page 117, 2010.
[4] Mads Tofte. Operational semantics and polymorphic type inference. chapter 2. 1988.

31
Appendices

32
Appendix A

Operations

A.1 Tyvars Table

Parameter        Result
π                ∅
α                {α}
τ1 → τ2          tyvars(τ1) ∪ tyvars(τ2)
τ1 × τ2          tyvars(τ1) ∪ tyvars(τ2)

A.2 Substitution Table


Parameter        Result
π                π
α                if {α ↦ τ} ⊆ S then τ else α
τ1 → τ2          S(τ1) → S(τ2)
τ1 × τ2          S(τ1) × S(τ2)

A.3 Unification Table for W


Parameter                     Result
π1, π2                        if π1 = π2 then ID else Fail
α1, α2                        if α1 = α2 then ID else {α1 ↦ α2}
α, τ  or  τ, α                if α ∈ tyvars(τ) then Fail else {α ↦ τ}
τ11 → τ12, τ21 → τ22          let S1 = Unify(τ11, τ21) in Unify(S1 τ12, S1 τ22) S1
τ11 × τ12, τ21 × τ22          let S1 = Unify(τ11, τ21) in Unify(S1 τ12, S1 τ22) S1
otherwise                     Fail

33
A.4 Unification Table for Wopt
We will use the prime symbol to specify the find action on a type, i.e. τ ′ means Find(τ ).

Parameter                     Result
π1 = π2                       if π1 = π2 then () else Fail
α1 = α2                       if α1 = α2 then () else Link(α1, α2)
α = τ  or  τ = α              if α ∈ tyvars(τ) then Fail else Link(α, τ)
τ11 → τ12 = τ21 → τ22         Unify(τ11′, τ21′); Unify(τ12′, τ22′)
τ11 × τ12 = τ21 × τ22         Unify(τ11′, τ21′); Unify(τ12′, τ22′)
otherwise                     Fail

34
Appendix B

Generalization

B.1 Full Derivation of Example 12

tyvars(τ) \ tyvars(Γ)
  = tyvars(α → (int × β) → (γ × string) → δ → int) \ tyvars(Γ)
  = (tyvars(α) ∪ tyvars(int × β) ∪ tyvars(γ × string) ∪ tyvars(δ) ∪ tyvars(int)) \ tyvars(Γ)
  = ({α} ∪ tyvars(int) ∪ tyvars(β) ∪ tyvars(γ) ∪ tyvars(string) ∪ {δ}) \ tyvars(Γ)
  = {α, β, γ, δ} \ tyvars(Γ)
  = {α, β, γ, δ} \ ⋃ { tyvars(σ) : σ ∈ Range Γ }
  = {α, β, γ, δ} \ (tyvars(∀η.γ → η) ∪ tyvars(∀α.α → α))
  = {α, β, γ, δ} \ ((tyvars(γ → η) \ {η}) ∪ (tyvars(α → α) \ {α}))
  = {α, β, γ, δ} \ (((tyvars(γ) ∪ tyvars(η)) \ {η}) ∪ (tyvars(α → α) \ {α}))
  = {α, β, γ, δ} \ (({γ, η} \ {η}) ∪ (tyvars(α → α) \ {α}))
  = {α, β, γ, δ} \ ({γ} ∪ (tyvars(α → α) \ {α}))
  = {α, β, γ, δ} \ ({γ} ∪ ((tyvars(α) ∪ tyvars(α)) \ {α}))
  = {α, β, γ, δ} \ ({γ} ∪ ({α} \ {α}))
  = {α, β, γ, δ} \ ({γ} ∪ ∅)
  = {α, β, γ, δ} \ {γ}
  = {α, β, δ}

35
Appendix C

Type Inference Examples

With all the details in place we can now present some type inference examples when given an
expression e.

C.1 Monomorphic
e = λx.(λy.xy)1
We start off with the S-Lambda rule

Γ ± {x 7→ α → β} ⊢ (λy.xy)1 : γ
Γ ⊢ λx.(λy.xy)1 : (α → β) → γ

note that we have to guess the type of x for later use. Furthermore, by the S-App rule we
have

Γ ⊢ λy.xy : δ → γ Γ⊢1:δ
Γ ± {x 7→ α → β} ⊢ (λy.xy)1 : γ
Γ ⊢ λx.(λy.xy)1 : (α → β) → γ

and by the S-Int rule we find out that 1 has type int and therefore we replace each occurrence of
δ with int, thus we have

Γ ⊢ λy.xy : int → γ
Γ ⊢ 1 : int
Γ ± {x 7→ α → β} ⊢ (λy.xy)1 : γ
Γ ⊢ λx.(λy.xy)1 : (α → β) → γ

by the S-Lambda rule we get

Γ ± {y 7→ int} ⊢ xy : γ
Γ ⊢ λy.xy : int → γ Γ ⊢ 1 : int
Γ ± {x 7→ α → β} ⊢ (λy.xy)1 : γ
Γ ⊢ λx.(λy.xy)1 : (α → β) → γ

36
by the S-App rule we get

Γ⊢x:ϵ→γ Γ⊢y:ϵ
Γ ± {y 7→ int} ⊢ xy : γ
Γ ⊢ λy.xy : int → γ Γ ⊢ 1 : int
Γ ± {x 7→ α → β} ⊢ (λy.xy)1 : γ
Γ ⊢ λx.(λy.xy)1 : (α → β) → γ

by the S-Lookup rule we get

x ∈ Dom Γ Γ(x) > ϵ → γ y ∈ Dom Γ Γ(y) > ϵ


Γ⊢x:ϵ→γ Γ⊢y:ϵ
Γ ± {y 7→ int} ⊢ xy : γ
Γ ⊢ λy.xy : int → γ Γ ⊢ 1 : int
Γ ± {x 7→ α → β} ⊢ (λy.xy)1 : γ
Γ ⊢ λx.(λy.xy)1 : (α → β) → γ

when evaluating Γ(x) > ϵ → γ and Γ(y) > ϵ our Γ = {x 7→ α → β, y 7→ int}, thus those two
operations gives

Γ(x) > ϵ → γ =⇒
ϵ=α γ=β
Γ(y) > ϵ =⇒ ϵ = int

By replacing the type variables accordingly, the resulting derivation tree looks as follows

x ∈ Dom Γ Γ(x) > int → β y ∈ Dom Γ Γ(y) > int


Γ ⊢ x : int → β Γ ⊢ y : int
Γ ± {y 7→ int} ⊢ xy : β
Γ ⊢ λy.xy : int → β Γ ⊢ 1 : int
Γ ± {x 7→ int → β} ⊢ (λy.xy)1 : β
Γ ⊢ λx.(λy.xy)1 : (int → β) → β

and the inferred type of the expression is (int → β) → β.

C.2 Polymorphic
e = let id = λx.x in (id 1, id ”hello”)
By the S-Let rule we have

Γ ⊢ λx.x : β Γ ± {id 7→ ClosΓ β} ⊢ (id 1, id ”hello”) : α


Γ ⊢ let id = λx.x in (id 1, id ”hello”) : α

By the S-Lambda rule we get

37
Γ ± {x 7→ γ} ⊢ x : δ
Γ ± {id 7→ ClosΓ β} ⊢ (id 1, id ”hello”) : α
Γ ⊢ λx.x : β
Γ ⊢ let id = λx.x in (id 1, id ”hello”) : α

Note that β = γ → δ, so we replace each occurrence of β with γ → δ

Γ ± {x 7→ γ} ⊢ x : δ
Γ ± {id 7→ ClosΓ γ → δ} ⊢ (id 1, id ”hello”) : α
Γ ⊢ λx.x : γ → δ
Γ ⊢ let id = λx.x in (id 1, id ”hello”) : α

let Γ1 = Γ ± {x 7→ γ} and by the S-Lookup rule we get

x ∈ Dom Γ1 Γ1 (x) > δ


Γ ± {x 7→ γ} ⊢ x : δ
Γ ± {id 7→ ClosΓ γ → δ} ⊢ (id 1, id ”hello”) : α
Γ ⊢ λx.x : γ → δ
Γ ⊢ let id = λx.x in (id 1, id ”hello”) : α

where Γ1 (x) > δ implies δ = γ; thus we replace each occurrence of δ with γ

x ∈ Dom Γ1 Γ1 (x) > γ


Γ ± {x 7→ γ} ⊢ x : γ
Γ ± {id 7→ ClosΓ γ → γ} ⊢ (id 1, id ”hello”) : α
Γ ⊢ λx.x : γ → γ
Γ ⊢ let id = λx.x in (id 1, id ”hello”) : α

Now we have to compute the ClosΓ (γ → γ) operation in order to type the tuple; we have

tyvars(γ → γ) \ tyvars(Γ) =(tyvars(γ) ∪ tyvars(γ)) \ tyvars(Γ) =


({γ} ∪ {γ}) \ tyvars(Γ) =
{γ} \ tyvars(Γ) =
{γ} \ ∅ =
{γ}

and therefore ClosΓ γ → γ = ∀γ.γ → γ.


Let Γ2 = Γ ± {id 7→ ∀γ.γ → γ} and by the S-Tuple rule we get

x ∈ Dom Γ1 Γ1 (x) > γ


Γ ± {x 7→ γ} ⊢ x : γ Γ2 ⊢ id 1 : ϵ Γ2 ⊢ id ”hello” : ζ
Γ ⊢ λx.x : γ → γ Γ ± {id 7→ ∀γ.γ → γ} ⊢ (id 1, id ”hello”) : α
Γ ⊢ let id = λx.x in (id 1, id ”hello”) : α

where we replace all occurrences of α with ϵ × ζ

x ∈ Dom Γ1 Γ1 (x) > γ
Γ ± {x 7→ γ} ⊢ x : γ Γ2 ⊢ id 1 : ϵ Γ2 ⊢ id ”hello” : ζ
Γ ⊢ λx.x : γ → γ Γ ± {id 7→ ∀γ.γ → γ} ⊢ (id 1, id ”hello”) : ϵ × ζ
Γ ⊢ let id = λx.x in (id 1, id ”hello”) : ϵ × ζ

by the S-App rule on both expressions we get

x ∈ Dom Γ1 Γ1 (x) > γ Γ2 ⊢ id : η → ϵ Γ2 ⊢ 1 : η Γ2 ⊢ id : θ → ζ Γ2 ⊢ ”hello” : θ


Γ ± {x 7→ γ} ⊢ x : γ Γ2 ⊢ id 1 : ϵ Γ2 ⊢ id ”hello” : ζ
Γ ⊢ λx.x : γ → γ Γ ± {id 7→ ∀γ.γ → γ} ⊢ (id 1, id ”hello”) : ϵ × ζ
Γ ⊢ let id = λx.x in (id 1, id ”hello”) : ϵ × ζ

by the S-Lookup, S-Int, and S-String rules, and by replacing η = int and θ = string, we
get

id ∈ Dom Γ2 Γ2 (id) > int → ϵ id ∈ Dom Γ2 Γ2 (id) > string → ζ
x ∈ Dom Γ1 Γ1 (x) > γ Γ2 ⊢ id : int → ϵ Γ2 ⊢ 1 : int Γ2 ⊢ id : string → ζ Γ2 ⊢ ”hello” : string
Γ ± {x 7→ γ} ⊢ x : γ Γ2 ⊢ id 1 : ϵ Γ2 ⊢ id ”hello” : ζ
Γ ⊢ λx.x : γ → γ Γ ± {id 7→ ∀γ.γ → γ} ⊢ (id 1, id ”hello”) : ϵ × ζ
Γ ⊢ let id = λx.x in (id 1, id ”hello”) : ϵ × ζ
We now perform the instantiations

Γ2(id) > int → ϵ:      with S1 = {α ↦ ι},  S1(α → α) = ι → ι = int → ϵ,  so ι = int and ι = ϵ
Γ2(id) > string → ζ:   with S2 = {α ↦ κ},  S2(α → α) = κ → κ = string → ζ,  so κ = string and κ = ζ

Thus ϵ = int and ζ = string, and the resulting derivation tree looks like

id ∈ Dom Γ2 Γ2 (id) > int → int id ∈ Dom Γ2 Γ2 (id) > string → string
x ∈ Dom Γ1 Γ1 (x) > γ Γ2 ⊢ id : int → int Γ2 ⊢ 1 : int Γ2 ⊢ id : string → string Γ2 ⊢ ”hello” : string
Γ ± {x 7→ γ} ⊢ x : γ Γ2 ⊢ id 1 : int Γ2 ⊢ id ”hello” : string
Γ ⊢ λx.x : γ → γ Γ ± {id 7→ ∀γ.γ → γ} ⊢ (id 1, id ”hello”) : int × string
Γ ⊢ let id = λx.x in (id 1, id ”hello”) : int × string
The expression e thus has the type int × string.
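The instantiation steps Γ2(id) > int → ϵ and Γ2(id) > string → ζ above replace the bound variable of ∀γ.γ → γ with a fresh type variable before the comparison. A sketch of instantiation is given below, reusing tau, IntMap, and apply from the sketch in Appendix A.3, TypeScheme and SS from the sketch in Appendix B.1, and fresh_tyvar from the sketch after C.1; the function name is illustrative.

(* Instantiation (the Γ(x) > τ step): build a substitution that maps every
   bound variable of the scheme to a fresh type variable, then apply it to
   the body of the scheme. *)
let instantiate (TypeScheme { tyvars; tau }) : tau =
  let s = SS.fold (fun a acc -> IntMap.add a (fresh_tyvar ()) acc) tyvars IntMap.empty in
  apply s tau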

Appendix D

Additional Proofs

D.1 Consistency Proof for snd(e1 , e2 )


The type inference of the expression must have been of the form

S-Snd
Γ ⊢ (e1 , e2 ) : τ1 × τ2
(D.1)
Γ ⊢ snd(e1 , e2 ) : τ2

and the evaluation must have been one of the following

D-Snd
E ⊢ e1 → v1 E ⊢ e 2 → v2
(D.2)
E ⊢ snd(e1 , e2 ) → v2

D-Snd-Wrong1
E ⊢ e1 → wrong E ⊢ e2 → v2
(D.3)
E ⊢ snd(e1 , e2 ) → wrong

D-Snd-Wrong2
E ⊢ e 1 → v1 E ⊢ e2 → wrong
(D.4)
E ⊢ snd(e1 , e2 ) → wrong
We see that e1 → r1 and e2 → r2 for some r1 , r2 .
By our I.H. and 2.1, D.1, (D.2 / D.3 / D.4) we get
r1 ̸= wrong (D.5)
⊨ r1 : τ1 (D.6)
This means the evaluation cannot have been D.3.
By our I.H. and 2.1, D.1, (D.2 / D.4) we get
r2 ̸= wrong (D.7)
⊨ r2 : τ2 (D.8)
This means the evaluation cannot have been D.4. The evaluation must have been D.2, meaning
that r1 = v1 , r2 = v2 , and r = v2 ∈ Val. Since a Val cannot be wrong, r ̸= wrong.
From D.8 we have that ⊨ r2 : τ2 . Since r = v2 = r2 and ⊨ r2 : τ2 , we must have that ⊨ r : τ2 .

D.2 Soundness Proof for snd(e1 , e2 )
We want to show S1 Γ ⊢ snd (e1 , e2 ) : τ2
From our algorithm we have
(S1 , τ1 × τ2 ) = W(Γ, (e1 , e2 )) (D.9)
By lemma 2.2.2 where Γ = Γ and S = S1 we have
    S1
Γ −−−→ S1 Γ                                                                   (D.10)

By our I.H. and D.9, D.10 we have

S1 Γ ⊢ (e1 , e2 ) : τ1 × τ2 (D.11)

Thus, by letting D.11 be the premise of the S-Snd rule


S-Snd
Γ ⊢ (e1 , e2 ) : τ1 × τ2
Γ ⊢ snd (e1 , e2 ) : τ2

we can conclude S1 Γ ⊢ snd (e1 , e2 ) : τ2 .

Appendix E

Practicalities of W and Wopt

E.1 Examples Used For Test Correctness of W and Wopt


The examples.ml file is 566 lines long and is therefore not presented directly in this appendix;
click here to view the examples used to verify the correctness of W and Wopt .

E.2 Pretty Printer for W


The prettyPrinter.ml file contains functions that output strings with useful information. For
instance, the string_of_tau function is used to display the result type τ that W returns.
open Types
open TypeEnv
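(* In string_of_tau below, trav returns a pair (s, p): s is the rendered string
   and p is a precedence flag (0 for atoms, 1 for arrows and tuples) that the
   recursive calls use to decide where parentheses are required. *)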

let string_of_tau tau_node =


let rec trav tau =
match tau with
| TyCon s -> (
match s with Int -> "int" | Bool -> "bool" | String -> "string"), 0
| TyVar i -> string_of_int i, 0
| TyFunApp { t1; t2 } ->
let a1, a2 = trav t1 in
let b1, b2 = trav t2 in
let string_a = if a2 > 0 then "(" ^ a1 ^ ")" else a1 in
let string_b = if b2 > 1 then "(" ^ b1 ^ ")" else b1 in
string_a ^ " -> " ^ string_b, 1
| TyTuple { t1; t2 } ->
let a1, a2 = trav t1 in
let b1, b2 = trav t2 in
let string_a = if a2 > 0 then "(" ^ a1 ^ ")" else a1 in
let string_b = if b2 > 0 then "(" ^ b1 ^ ")" else b1 in
string_a ^ " x " ^ string_b, 1
in
fst (trav tau_node)

let print_tau tau = print_string (string_of_tau tau ^ "\n")

let string_of_typescheme (TypeScheme { tyvars; tau }) =
let tyvars = String.concat ", " (List.map (fun x -> string_of_int x) (SS.elements tyvars))
in "forall " ^ tyvars ^ " . " ^ (string_of_tau tau)
let print_typescheme typescheme = print_string (string_of_typescheme typescheme ^ "\n")

let string_of_tyvars tyvars =


let elems = String.concat ", " (List.map (fun x -> string_of_int x) tyvars) in
"{ " ^ elems ^ " }"
let print_tyvars tyvars = print_string (string_of_tyvars tyvars ^ "\n")

let string_of_gamma gamma =


let elems = String.concat ", " (List.map (fun (k, v) -> k ^ " -> " ^ string_of_typescheme v)
(Gamma.bindings gamma)) in "{ " ^ elems ^ " }"

let print_gamma gamma = print_string (string_of_gamma gamma ^ "\n")

E.3 Pretty Printer for Wopt


This Pretty Printer is almost identical to the one in E.2.
open Types
open TypeEnv
open Utils
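(* We take ~$ (defined in Utils) to be the find operation from Appendix A.4: it
   resolves a type through its links, so trav below should only ever meet
   unresolved type-variable cells; a still-linked TyVar therefore raises Fail. *)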

let string_of_tau tau_node =


let rec trav tau =
match ~$tau with
| TyCon s -> (
match s with Int -> "int" | Bool -> "bool" | String -> "string"), 0
| TyVar {contents = Int i} -> string_of_int i, 0
| TyVar _ -> raise (Fail "string_of_tau tyvar link")
| TyFunApp { t1; t2 } ->
let a1, a2 = trav t1 in
let b1, b2 = trav t2 in
let string_a = if a2 > 0 then "(" ^ a1 ^ ")" else a1 in
let string_b = if b2 > 1 then "(" ^ b1 ^ ")" else b1 in
string_a ^ " -> " ^ string_b, 1
| TyTuple { t1; t2 } ->
let a1, a2 = trav t1 in
let b1, b2 = trav t2 in
let string_a = if a2 > 0 then "(" ^ a1 ^ ")" else a1 in
let string_b = if b2 > 0 then "(" ^ b1 ^ ")" else b1 in
string_a ^ " x " ^ string_b, 1
in
fst (trav tau_node)

let print_tau tau = print_string (string_of_tau tau ^ "\n")

let string_of_typescheme (TypeScheme { tyvars; tau }) =


let tyvars = String.concat ", " (List.map (fun x -> string_of_int x) (SS.elements tyvars)) in
"forall " ^ tyvars ^ " . " ^ (string_of_tau tau)
let print_typescheme typescheme = print_string (string_of_typescheme typescheme ^ "\n")

let string_of_tyvars tyvars =

let elems = String.concat ", " (List.map (fun x -> string_of_int x) tyvars) in
"{ " ^ elems ^ " }"
let print_tyvars tyvars = print_string (string_of_tyvars tyvars ^ "\n")

let string_of_gamma gamma =


let elems = String.concat ", " (List.map (fun (k, v) -> k ^ " -> " ^ string_of_typescheme v)
(Gamma.bindings gamma)) in "{ " ^ elems ^ " }"
let print_gamma gamma = print_string (string_of_gamma gamma ^ "\n")
