Principles of Programming Languages

Lecture Notes

Halil Özmen

Version: 2023-12-28
This document contains a brief summary of the subjects of the "Principles of Programming Languages"
course. These notes are intended to be used as a checklist by the instructor who teaches this course.
They are not a replacement for the textbooks or the notes taken by the students in the classroom.
Students can use these notes as a complement to their own notes and the recommended textbooks.

TABLE OF CONTENTS
1. Introduction to Principles of Programming Languages
1.1. Computer Architecture
1.2. History of Programming Languages
1.3. Programming Domains
1.4. Programming Language Categories
1.5. Language Evaluation Criteria
1.6. Implementation Methods
1.7. Syntax Analysis
1.8. Lexical Analysis
2. Data Types and Variables
2.1. Data Types
2.2. Variables
2.3. Type Checking
2.4. Scope
2.5. Storage Bindings and Lifetime
2.6. Exceptions and Exception Handling
3. Subprograms, Parameter Passing and Polymorphism
3.1. Subprograms and Parameter Passing
3.2. Polymorphism
4. Functional Programming Languages
4.1. ML
4.2. Haskell
5. Imperative and Object-Oriented Programming Languages
5.1. Imperative Programming
5.2. Object-Oriented Programming
6. Logic Programming Languages
6.1. Prolog
A. Regular Expressions
B. OCaml

References:
• Concepts of Programming Languages, 11th Ed., Robert W. Sebesta
• Principles of Programming Languages, M. Archana
• CS303 Lecture Notes of Assoc. Prof. Hilal Kazan
• Various web sites and text books

Principles of Programming Languages 2 Halil Özmen


1. Introduction to Principles of Programming Languages

Objectives of the "Principles of Programming Languages" course:

• study the common concepts of programming languages,
• understand how to compare and evaluate different programming languages,
• become familiar with the major programming paradigms (e.g. functional, imperative, logic),
• build a basis for other topics such as compiler design, software engineering, OO design and
human-computer interaction.

The course does not aim to teach specific programming languages.

Reasons for Studying Concepts of Programming Languages


• To increase capacity to express ideas while programming
Programmers, in the process of developing software, are constrained if they know a single
or very few programming languages. The language in which they develop software places
limits on the kinds of control structures, data structures, and abstractions they can use;
thus, the forms of algorithms they can construct are likewise limited. Awareness of a wider
variety of programming language features can reduce such limitations in software
development. Programmers can increase the range of their software development thought
processes by learning new language constructs.
• To improve background for choosing appropriate languages
Some programmers do not have a good understanding of the concepts of programming
languages and know only a few languages. The result is that many programmers, when
given a choice of languages for a new project, use the language with which they are most
familiar, even if it is poorly suited for the project at hand. If these programmers were
familiar with a wider range of languages and language constructs, they would be better
able to choose the language with the features that best address the problem.
• To increase ability to learn new languages
The process of learning a new programming language can be lengthy and difficult,
especially for someone who is comfortable with only one or two languages and has never
examined programming language concepts in general. Once a thorough understanding of
the fundamental concepts of languages is acquired, it becomes far easier to see how
these concepts are incorporated into the design of the language being learned. For
example, programmers who understand the concepts of object-oriented programming will
have a much easier time learning Ruby (Thomas et al., 2013) than those who have never
used those concepts.
• To better understand the significance of implementation
In some cases, an understanding of implementation issues leads to an understanding of
why languages are designed the way they are. In turn, this knowledge leads to the ability
to use a language more intelligently, as it was designed to be used. We can become better
programmers by understanding the choices among programming language constructs and
the consequences of those choices. Certain kinds of program bugs can be found and fixed
only by a programmer who knows some related implementation details.
• To better use languages that are already known.
Most contemporary programming languages are large and complex. Accordingly, it is
uncommon for a programmer to be familiar with and use all of the features of a language
he or she uses. By studying the concepts of programming languages, programmers can
learn about previously unknown and unused parts of the languages they already use and
begin to use those features.



1.1. Computer Architecture
Most modern computers are based on the von Neumann architecture.

The von Neumann Architecture is based on three key concepts:
1. Data and instructions are stored in a single read-write memory.
2. The contents of this memory are addressable by location, without regard to the type of data
contained there.
3. Execution occurs in a sequential fashion (unless explicitly modified) from one instruction to the
next.

Fig. 1.1. von Neumann Architecture (simplified). Labels in the figure: Computer, Central Processing
Unit (CPU), Memory, System Interconnection, Input/Output Modules, Input/Output (I/O) Devices.

Fig. 1.2. von Neumann Architecture. Labels in the figure: Central Processing Unit (CPU), Control
Unit, Arithmetic and Logic Unit, Registers, Input and Output Modules, Memory (stores both
instructions and data); arrows labeled "Instructions and data" and "Results of operations".

Fetch-Execute Cycle
The execution of a machine code program on a von Neumann architecture computer occurs in a
process called the fetch-execute cycle:
    initialize the program counter
    repeat forever
        fetch the instruction pointed to by the program counter
        increment the program counter to point at the next instruction
        decode the instruction
        execute the instruction
    end repeat
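
The cycle can be tried out on a toy machine. The following Python sketch is only an illustration: the
instruction set (LOAD, ADD, STORE, HALT), the single accumulator register and the memory layout are
assumptions made for this example and do not correspond to any real processor. Note that one list
holds both the instructions and the data, as in the von Neumann architecture.

```python
# A toy von Neumann machine: one memory holds both instructions and data.
# The instruction set (LOAD, ADD, STORE, HALT) is invented for this sketch.

def run(memory):
    pc = 0          # initialize the program counter
    acc = 0         # a single accumulator register
    while True:     # "repeat forever"; HALT leaves the loop
        instruction = memory[pc]        # fetch the instruction the PC points to
        pc += 1                         # increment the PC to the next instruction
        opcode, operand = instruction   # decode the instruction
        if opcode == "LOAD":            # execute the instruction
            acc = memory[operand]
        elif opcode == "ADD":
            acc += memory[operand]
        elif opcode == "STORE":
            memory[operand] = acc
        elif opcode == "HALT":
            return memory
        else:
            raise ValueError(f"unknown opcode: {opcode}")

# Cells 0-3 hold the program, cells 4-6 hold data (same memory for both).
memory = [
    ("LOAD", 4),
    ("ADD", 5),
    ("STORE", 6),
    ("HALT", None),
    2, 3, 0,
]
run(memory)
print(memory[6])   # the sum of cells 4 and 5
```

Every instruction has to be fetched from memory before it can be executed, which is the traffic
between memory and processor that the bottleneck discussion refers to.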

von Neumann Architecture Bottleneck:


• The speed of the connection between a computer’s memory and its processor often determines
the speed of the computer.
• Because instructions can be executed faster than they can be moved to the processor for
execution, this is the primary limiting factor in the speed of von Neumann architectures.



1.2. History of Programming Languages
The first high-level programming language was Plankalkül, created by Konrad Zuse between
1942 and 1945. However, this language was not implemented at the time.
The first high-level language to have an associated compiler was created by Corrado Böhm in
1951, for his PhD thesis.
The first commercially available language was FORTRAN (FORmula TRANslation), developed
in 1954 - 1956 (first developed in 1954, but first manual appeared in 1956) by a team led by
John Backus at IBM.

Major Programming Languages:


1951 - Assembly Language    1969 - B (forerunner of C)     1990 - Python
1956 - FORTRAN              1970 - Pascal                  1991 - Visual Basic
1958 - LISP                 1972 - C                       1995 - Ruby
1958 - ALGOL                1978 - SQL (query language)    1993 - R
1959 - COBOL                1980 - C++                     1995 - Java
1962 - APL                  1983 - Ada                     1995 - PHP
1962 - SNOBOL               1984 - MATLAB                  1996 - OCaml
1964 - BASIC                1987 - Perl                    2002 - Scratch
1964 - PL/I                 1990 - Haskell                 2005 - F# ....

Fig. 1.3. Genealogy of Common High Level Programming Languages



1.3. Programming Domains
Computers have been applied to a myriad of different areas, from controlling nuclear power plants
to providing video games in mobile phones. Because of this great diversity in computer use,
programming languages with very different goals have been developed. In this section, we briefly
discuss a few of the areas of computer applications and their associated languages.
Programming Domains        Languages
Scientific Applications    FORTRAN, Algol; C, C++; Matlab, R, Maple, Octave
Business Applications      COBOL, SAP
Scripting Languages        sh, bash; awk, Perl, Python
Artificial Intelligence    Lisp, Prolog
Web Software               HTML, PHP, JavaScript

1.4. Programming Language Categories

Essential Programming Language Paradigms:


• Imperative Programming Languages
    - Central features are variables, assignment statements, and iteration
    - Include scripting languages
    - Include the visual languages
    - Examples: Fortran, Algol, Pascal, C, Visual Basic .NET
• Functional Programming Languages
    - Main means of making computations is by applying functions to given parameters
    - Functional Programming is a programming paradigm that treats computation as the
      evaluation of mathematical functions and avoids changing state and mutable data. In
      functional code, the output value of a function depends only on the arguments that are
      passed to the function.
    - Examples: Lisp, Scheme, ML (Caml, OCaml), F#, Haskell
• Logic Programming Languages
    - Rule-based (rules are specified in no particular order)
    - Examples: Prolog, Datalog, Absys, Gödel, ROOP
• Object-Oriented Programming Languages
    - Data abstraction, inheritance, late binding
    - Examples: Java, C++
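
The difference between the imperative and the functional style can be seen on one small task. The
sketch below computes a sum of squares twice in Python; the function names are chosen for this
example only. The first version relies on a variable, assignment and iteration; the second builds the
result by applying functions and never reassigns a variable.

```python
from functools import reduce

def sum_of_squares_imperative(numbers):
    total = 0                    # a variable whose state changes
    for n in numbers:            # iteration
        total = total + n * n    # assignment statement
    return total

def sum_of_squares_functional(numbers):
    # No assignment to mutable state: the result is built by applying functions,
    # and it depends only on the argument passed in.
    return reduce(lambda acc, n: acc + n * n, numbers, 0)

print(sum_of_squares_imperative([1, 2, 3]))
print(sum_of_squares_functional([1, 2, 3]))
```

Python is used here only because it supports both styles; it is not itself a purely functional
language.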

More Programming Paradigms


• Markup Programming Languages
    - HTML, XML
• Concurrent Programming Languages
    - Ada, Occam
• Multi-Paradigm Programming Languages
    - Matlab (imperative and functional), R

Programming languages list ("Hello World" programs): https://helloworldcollection.github.io/



1.5. Language Evaluation Criteria
• Readability
The ease with which programs can be read and understood.
• Writability
The ease with which a language can be used to create programs.
• Reliability
Conformance to specifications (i.e., performs to its specification).
• Cost
The ultimate total cost.

Readability
The ease with which programs can be read and understood.
• Overall simplicity:
A manageable set of features and constructs, minimal feature multiplicity, minimal
operator overloading.
• Orthogonality:
Orthogonality in a programming language means that a relatively small set of
primitive constructs can be combined in a relatively small number of ways to build
the control and data structures of the language. Furthermore, every possible
combination of primitives is legal and meaningful.
• Data Types: Adequate facilities for defining data types and data structures
• Syntax Design:
Special words and methods of forming compound statements
Form and meaning: self-descriptive constructs, meaningful keywords

Writability
The ease with which a language can be used to create programs.
• Simplicity and Orthogonality: Few constructs, a small number of primitives and set of rules.
To be able to design a solution to a complex problem after learning only a simple set
of primitive constructs. A smaller number of primitive constructs and a consistent set
of rules for combining them (that is, orthogonality) is much better than having a large
number of primitives.
• Support for abstraction: The ability to define and use complex structures or operations in
ways that allow details to be ignored.
E.g. use of a subprogram to implement a sort algorithm that’s needed several times
in a program.
• Expressivity: convenient ways of specifying computations
Strength and number of operators and predefined functions
E.g.: count++ rather than count = count + 1 in C

Reliability
Conformance to specifications (i.e., performs to its specification)
• Type Checking: Type checking is simply testing for type errors in a given program, either
by the compiler or during program execution. Compile-time type checking is more
desirable.
• Exception Handling: The ability of a program to intercept run-time errors (as well as other
unusual conditions detectable by the program), take corrective measures, and then
continue is an obvious aid to reliability.
• Aliasing: Presence of two or more distinct referencing methods for the same memory
location (which is dangerous)
• Readability and Writability: A language that does not support natural ways of expressing
an algorithm will require the use of unnatural approaches, and hence reduced reliability.
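
Of the reliability factors above, aliasing is the easiest to demonstrate in a few lines. In the Python
sketch below (all names are made up for the example), two names refer to the same list object, so a
change made through one name is silently visible through the other. Making an explicit copy removes
the alias.

```python
def append_total(values):
    # Mutates the list it receives; the caller's list is the same object.
    values.append(sum(values))
    return values

prices = [10, 20]
report = prices              # alias: a second name for the same list object
append_total(report)
print(prices)                # the change made through "report" is visible here
print(report is prices)      # both names refer to one object

safe = list(prices)          # an explicit copy breaks the alias
append_total(safe)
print(len(prices), len(safe))
```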



Cost
The ultimate total cost:
• Cost of training programmers to use the language
• Cost of writing programs in the language
• Cost of compiling programs in the language
• Cost of testing programs in the language
• Cost of executing programs written in a language (speed at run time)
• Cost of the language implementation system (IDE). Availability of free compilers.
• Reliability: poor reliability leads to high costs
• Cost of maintaining programs

Language Evaluation Criteria - Others:


• Portability
The ease with which programs can be moved from one implementation to another
• Generality
The applicability to a wide range of applications
• Well-definedness
The completeness and precision of the language's official definition

Language Design Trade-Offs


The following pairs of criteria conflict with each other.
• Reliability vs. cost of execution
Example: Java demands all references to array elements be checked for proper indexing, which
leads to increased execution costs
• Readability vs. writability
Example: APL provides many powerful operators (and a large number of new symbols), allowing
complex computations to be written in a compact program but at the cost of poor readability
• Writability (flexibility) vs. reliability
Example: C++ pointers are powerful and very flexible but are unreliable

1.6. Implementation Methods

Fig. 1.4. Implementation Methods. The figure shows three flows:
Compilation: Source Program → Compiler → Executable (Machine language); Input data → Executable →
Results.
Interpretation: Source Program and Input data → Interpreter → Results.
Hybrid Implementation: Source Program → Code Generator (Translator) → Intermediate Code;
Intermediate Code and Input data → Interpreter (Run-time engine) → Results.



• Compilation
Programs are translated into machine language. Then this machine language file is executed.
E.g.: C, C++
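• Pure Interpretation
Evaluate a program directly by an interpreter, without a conversion to a native form.
E.g.: shell languages such as sh and bash. Python, OCaml and F# are often used interactively in this
style, although their main implementations first translate the source to bytecode or machine code.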
• Hybrid Implementation Systems
Translate high-level language programs to an intermediate language designed to allow easy
interpretation.
Faster than pure interpretation because the source language statements are decoded only once.
E.g.: Java

Advantages and Disadvantages of Implementation Methods

Pure Compilation
Advantages:
• Good runtime performance: compiles down to machine language
Disadvantages:
• Longer edit-debug cycle due to the need of full compilation
• Software distribution is harder due to platform dependence

Pure Interpretation
Advantages:
• Short edit-debug cycle: no compilation step is needed
• Debugging is easy: the error message can easily indicate the source line of the error
• Platform independence; however, an interpreter must be available for each supported platform
(not a problem for the programmer)
Disadvantages:
• Low runtime performance: statement decoding is complex and time-consuming, and the same
statement has to be decoded every time it is executed
• Source code has to be distributed
• Needs more space: the source program and the symbol table are kept during interpretation

Hybrid Implementation (Compilation and Interpretation)


Advantages:
• Platform independence
• Performance can get closer to pure compilation via the use of JIT
JIT (just-in-time compilation): programs are first translated to an intermediate language; during
execution, the intermediate representation of the code is compiled down to machine code when it
is called. The machine code version is kept for subsequent calls.
Disadvantages:
• A large development effort is needed to make it perform close to pure compilation

E.g. Java has bytecode as its intermediate code and the JVM as its virtual machine. Most JVM
implementations provide a JIT compiler embedded into the JVM.
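
The translate-once, interpret-many idea behind hybrid implementation can be observed with Python's
built-in compile() function and the standard dis module. This is only a sketch: CPython (the
reference Python implementation) is used because it exposes its bytecode translator directly, and it
illustrates the two separate steps, not the JVM and not JIT compilation. The variable names and the
statement being compiled are assumptions for the example.

```python
import dis

source = "total = price * quantity + 1"

# Translation step: the source text is decoded once into a code object
# that holds bytecode (CPython's intermediate code).
code = compile(source, "<example>", "exec")

# Inspect the intermediate instructions (exact opcodes vary by Python version).
dis.dis(code)

# Interpretation step: the same code object is run several times
# without decoding the source text again.
for price, quantity in [(2, 3), (5, 4)]:
    env = {"price": price, "quantity": quantity}
    exec(code, env)
    print(env["total"])
```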

Mixed compilation / interpretation


• Some languages can have both compiled and interpreted implementations.
• The interpreter is used to develop and debug programs
• Then, after a relatively bug-free state is reached, the programs are compiled to increase
their execution speed.



Compilation Process
The steps are:
• Lexical Analysis
• Syntax Analysis
• Intermediate Code Generation and Semantic Analysis
• Code Generation

Lexical Analyzer (scanner): gathers the characters of the source program into lexical units:
identifiers, special words, operators.
Syntax Analyzer (parser): takes the lexical units and produces a parse tree.
Semantic Analyzer: checks for errors that are difficult or impossible to detect during syntax
analysis, such as type errors.
Optimization: makes the programs faster or smaller.

Fig. 1.5. The Compilation Process


Formal Language:
A formal language consists of words (strings, lexemes) whose letters are taken from an alphabet (∑)
and are well-formed according to a specific set of rules.
• Alphabet: a finite set of symbols. ∑ denotes the alphabet. E.g. ∑ = {A, B, ..., Z, a, b, ..., z}
• String (lexeme): a finite sequence of symbols from some alphabet.
• Language: a set of strings.
• All programming languages, like Java, C, and Python, are formal languages.
Programming languages must be precise (unlike natural languages). Precision is required for:
• Syntax: the structure of statements in a computer language, or the format of the language.
• Semantics: meaning of these expressions, statements.

Regular Language:
A regular language is a language that can be described by a regular expression or, equivalently,
recognized by a deterministic or non-deterministic finite automaton (a finite state machine).
Almost all programming languages are non-regular.

Lexical Analysis vs Syntax Analysis:


• The lexical analyzer deals with small-scale language constructs, such as names and numeric
literals. Regular expressions can be used in lexical analysis.
• The syntax analyzer deals with the large-scale constructs, such as expressions, statements,
and program units. BNF notations can be used in syntax analysis.

There are three reasons why lexical analysis is separated from syntax analysis:
1. Simplicity: Techniques for lexical analysis are less complex than those required for syntax
analysis, so the lexical analysis process can be simpler if it is separate. Also, removing the low-
level details of lexical analysis from the syntax analyzer makes the syntax analyzer both smaller
and less complex.
2. Efficiency: Although it pays to optimize the lexical analyzer, because lexical analysis requires a
significant portion of total compilation time, it is not fruitful to optimize the syntax analyzer.
Separation facilitates this selective optimization.
3. Portability: Because the lexical analyzer reads input program files and often includes buffering
of that input, it is somewhat platform dependent. However, the syntax analyzer can be platform
independent. It is always good to isolate machine-dependent parts of any software system.



1.7. Syntax Analysis
Syntax: the form or structure of the expressions, statements, and program units.
• A sentence is a string of characters over some alphabet.
• A language is a set of sentences.
• A lexeme is the lowest level syntactic unit of a language (e.g., sum, begin, =, +, *).
• A token is a category of lexemes (e.g., identifier).

Languages over Binary Numbers
Alphabet: ∑ = {0, 1}
Example languages:
Set of all possible combinations:
L0 = {0, 1, 00, 01, 10, 11, 000, 001, 010, ...}
All binary numbers without leading zeros:
L1 = {0, 1, 10, 11, 100, 101, 110, 111, 1000, ...}
All binary numbers with two digits:
L2 = {00, 01, 10, 11}
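
The three example languages over the alphabet {0, 1} happen to be regular, so each of them can be
described by a regular expression and membership can be tested mechanically. In the Python sketch
below, the regular expressions and the helper name member are choices made for this illustration;
they are one possible description of L0, L1 and L2, not the only one.

```python
import re

# Regular expressions chosen for this sketch (not taken from the notes):
L0 = re.compile(r"[01]+")        # every non-empty string over {0, 1}
L1 = re.compile(r"0|1[01]*")     # binary numbers without leading zeros
L2 = re.compile(r"[01]{2}")      # binary numbers with exactly two digits

def member(language, string):
    """Recognizer: does the whole string belong to the language?"""
    return language.fullmatch(string) is not None

for s in ["0", "10", "01", "1000", "12"]:
    print(s, member(L0, s), member(L1, s), member(L2, s))
```

The string "12" is rejected by all three languages because the symbol 2 is not in the alphabet.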
Language Recognizers
• A recognition device reads input strings of the language and decides whether the input strings
belong to the language.
• Example: syntax analysis part of a compiler.
Language Generators
• A device that generates sentences of a language.
• One can determine if the syntax of a particular sentence is correct by comparing it to the structure
of the generator.

Grammar:
A grammar is a generative device for defining languages. The sentences of the language are
generated through a sequence of applications of the rules, beginning with a special nonterminal of the
grammar called the start symbol.
Derivation: This sequence of rule applications is called a derivation.

Backus-Naur Form and Context-Free Grammars


In computer science, Backus–Naur Form (BNF) or Backus Normal Form is a metasyntax notation for
context-free grammars, often used to describe the syntax of languages used in computing, such as
computer programming languages, document formats, instruction sets and communication protocols.

Context-Free Grammar:
A context-free grammar is a set of recursive rules used to generate patterns of strings. Context-free
grammars can describe all regular languages and more, but they cannot describe all possible
languages. Context-free grammars were developed by Noam Chomsky (mid 1950s), and BNF was
invented by John Backus to describe ALGOL 58 (1959).

A BNF specification is a set of derivation rules, written as:

<symbol> → __expression__    or    <symbol> ::= __expression__

Example: BNF of if statement (C):


<if_stmt> → if ( <logic_expr> ) <stmt>
<if_stmt> → if ( <logic_expr> ) <stmt> else <stmt>
or:
<if_stmt> → if ( <logic_expr> ) <stmt> | if ( <logic_expr> ) <stmt> else <stmt>

Recursive BNF rule:


<ident_list> → identifier | identifier, <ident_list>
This defines <ident_list> as either a single token (identifier) or an identifier followed by a
comma and another instance of <ident_list>.
<arithmetic_expr> → <identifier> | <int_literal> | ( <arithmetic_expr> ) |
<arithmetic_expr> <arithmetic_op> <arithmetic_expr>
This defines <arithmetic_expr> as either a single identifier, an integer literal, a parenthesized
<arithmetic_expr>, or an <arithmetic_expr> followed by an <arithmetic_op> followed by an
<arithmetic_expr>.



In a grammar for a complete programming language, the start symbol represents a complete program
and is often named <program>.
Example of a BNF (grammar) for a small language:
<program> → begin <stmt_list> end
<stmt_list> → <stmt> | <stmt> ; <stmt_list>
<stmt> → <var> = <expression>
<var> → a | b | c | d
<expression> → <var> | <var> + <var> | <var> - <var>
In this language, the only statement type is assignment statement, the only variable names are a, b, c
and d, and the only operators are + and -.

A derivation of a program in this language: (For: begin a = b + c ; b = d end)

<program> → begin <stmt_list> end
          → begin <stmt> ; <stmt_list> end
          → begin <var> = <expression> ; <stmt_list> end
          → begin a = <expression> ; <stmt_list> end
          → begin a = <var> + <var> ; <stmt_list> end
          → begin a = b + <var> ; <stmt_list> end
          → begin a = b + c ; <stmt_list> end
          → begin a = b + c ; <stmt> end
          → begin a = b + c ; <var> = <expression> end
          → begin a = b + c ; b = <expression> end
          → begin a = b + c ; b = <var> end
          → begin a = b + c ; b = d end
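
A grammar can also be used in the opposite direction, as a recognizer. The Python sketch below is one
possible recursive-descent recognizer for the small language: each nonterminal becomes one function.
It assumes that lexemes are separated by spaces, as in the sentence derived above, and the function
names are chosen for this example.

```python
VARS = {"a", "b", "c", "d"}

def parse_program(text):
    """Return True if text is a sentence of the small grammar above."""
    tokens = text.split()      # assumes lexemes are separated by spaces
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def expect(lexeme):
        nonlocal pos
        if peek() != lexeme:
            raise SyntaxError(f"expected {lexeme!r}, found {peek()!r}")
        pos += 1

    def var():                 # <var> -> a | b | c | d
        nonlocal pos
        if peek() not in VARS:
            raise SyntaxError(f"expected a variable, found {peek()!r}")
        pos += 1

    def expression():          # <expression> -> <var> | <var> + <var> | <var> - <var>
        nonlocal pos
        var()
        if peek() in ("+", "-"):
            pos += 1
            var()

    def stmt():                # <stmt> -> <var> = <expression>
        var()
        expect("=")
        expression()

    def stmt_list():           # <stmt_list> -> <stmt> | <stmt> ; <stmt_list>
        nonlocal pos
        stmt()
        if peek() == ";":
            pos += 1
            stmt_list()

    try:
        expect("begin")        # <program> -> begin <stmt_list> end
        stmt_list()
        expect("end")
        return pos == len(tokens)
    except SyntaxError:
        return False

print(parse_program("begin a = b + c ; b = d end"))   # the sentence derived above
print(parse_program("begin a = b + c + d end"))       # not in the language
```

The second input is rejected because the grammar allows at most one operator per expression.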

Leftmost Derivation:
The replaced nonterminal is always the leftmost nonterminal in the previous sentential form.
The derivation continues until the sentential form contains no nonterminals. That sentential form,
consisting of only terminals, or lexemes, is the generated sentence.

Parse Tree
A parse tree is an ordered, rooted tree that represents the syntactic structure of a string
according to some context-free grammar.

Ambiguous Grammars:
• If a grammar results in derivations with different parse trees for the same string, then the
grammar is ambiguous.
• Ambiguous grammars create problems in parsing, since different parse trees result in different
semantics. E.g.
Precedence: 3+4*5 means 3+(4*5) since * has higher precedence than +,
thus multiplication should be deeper in the parse tree:
      +
     / \
    3   *
       / \
      4   5
Associativity: 10-4-3 means (10-4)-3 since - is left-associative,
thus the first subtraction should be deeper in the parse tree.
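
One common way to remove this kind of ambiguity is to rewrite the grammar with one nonterminal per
precedence level. The Python sketch below assumes such a layered grammar (expr, term, factor; these
names are chosen for the example) and builds the tree as nested tuples of the form (operator, left,
right). It assumes well-formed input and does no error reporting.

```python
import re

def parse(text):
    """Build a parse tree as nested tuples: (operator, left, right)."""
    tokens = re.findall(r"\d+|[-+*/()]", text)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        token = tokens[pos]
        pos += 1
        return token

    def factor():                        # factor -> number | ( expr )
        token = advance()
        if token == "(":
            tree = expr()
            advance()                    # skip the closing ")"
            return tree
        return int(token)

    def term():                          # term -> factor { (*|/) factor }
        tree = factor()
        while peek() in ("*", "/"):
            op = advance()
            tree = (op, tree, factor())  # the loop makes * and / left-associative
        return tree

    def expr():                          # expr -> term { (+|-) term }
        tree = term()
        while peek() in ("+", "-"):
            op = advance()
            tree = (op, tree, term())    # the loop makes + and - left-associative
        return tree

    return expr()

print(parse("3+4*5"))    # multiplication ends up deeper in the tree
print(parse("10-4-3"))   # the first subtraction ends up deeper in the tree
```

Because term is called from inside expr, every * or / is grouped before the surrounding + or -, which
is how the layered grammar encodes precedence.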

Exercise: (E denotes expression.)

E → a | b | c | d | e | E+E | E-E | E*E | E/E | (E)
Draw the parse tree for each of:
(b-c)*a
d*(d+b)-a
(a+b)*(c-d)
a+(b-c)*d
(a+b)/(c-d*e)



1.8. Lexical Analysis
Lexical analyzers extract lexemes from a given input string and produce the corresponding tokens.
Most lexical analyzers are subprograms that locate the next lexeme in the input, determine its
associated token code, and return them to the caller, which is the syntax analyzer. So, each call to the
lexical analyzer returns a single lexeme and its token.
• A lexical analyzer is essentially a pattern matcher. A pattern matcher attempts to find a substring
of a given string of characters that matches a given character pattern.
• Lexical analysis is the first phase of a compiler. Its input is the source code as a stream of
characters, possibly already modified by a language preprocessor. The lexical analyzer breaks
this text into a series of lexemes/tokens and discards whitespace and comments in the
source code.

There are three approaches to building a lexical analyzer:

1. Write a formal description of the token patterns of the language using a descriptive language
related to regular expressions, and let a software tool (e.g. lex) generate the lexical analyzer
from it.
2. Design a state transition diagram that describes the token patterns of the language and
write a program that implements the diagram.
3. Design a state transition diagram that describes the token patterns of the language and
construct a table-driven implementation of the state diagram.

A state transition diagram, or just state diagram, is a directed graph. State diagrams are graphical
representations of finite automata.

Lexemes and Tokens:

Consider the following C or Java statement:
result = oldsum - value / 100;
or:
result=oldsum-value/100;
Whitespace between lexemes does not affect the result of lexical analysis.

Lexeme    Token
result    IDENT
=         ASSIGN_OP
oldsum    IDENT
-         SUB_OP
value     IDENT
/         DIV_OP
100       INT_LIT
;         SEMICOLON
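
The lexeme/token table can be reproduced with a small pattern-matching lexical analyzer. The Python
sketch below uses the standard re module; the token names follow the table, while the regular
expressions, the extra SKIP category for whitespace and the function name tokenize are assumptions
made for this example.

```python
import re

# Token names follow the table above; the regular expressions are assumptions.
TOKEN_SPEC = [
    ("INT_LIT",   r"\d+"),
    ("IDENT",     r"[A-Za-z_][A-Za-z0-9_]*"),
    ("ASSIGN_OP", r"="),
    ("SUB_OP",    r"-"),
    ("DIV_OP",    r"/"),
    ("SEMICOLON", r";"),
    ("SKIP",      r"\s+"),      # whitespace is matched and then discarded
]
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))

def tokenize(source):
    """Return a list of (lexeme, token) pairs."""
    result = []
    pos = 0
    while pos < len(source):
        match = MASTER.match(source, pos)
        if match is None:
            raise ValueError(f"unexpected character {source[pos]!r}")
        if match.lastgroup != "SKIP":
            result.append((match.group(), match.lastgroup))
        pos = match.end()
    return result

for lexeme, token in tokenize("result = oldsum - value / 100;"):
    print(lexeme, token)
```

Both spellings of the statement, with and without spaces, produce the same list of pairs.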



Finite Automata
A finite automaton (FA) is a simple idealized machine used to recognize patterns within input taken
from some character set (or alphabet) C.
The job of an FA is to accept or reject an input depending on whether the pattern defined by the FA
occurs in the input.
In short, a finite automaton is a machine for recognizing regular languages.
A finite automaton consists of:
• a finite set S of N states,
• a special start state,
• a set of final (or accepting) states,
• a set of transitions T from one state to another, labeled with chars in C.
(See regular expressions.)

An FA can be represented graphically, with nodes for states, and arcs for transitions.
Start state:
Marked by an incoming arrow that comes from no other state;
There is exactly one start state.
Final states:
Drawn with a double circle;
There can be 0 or more final states;
Any state, including the start state, can be a final state.

DFA - Deterministic Finite Automata:


In a DFA, there is at most one transition for a specific input symbol from the current state, so the
path through the automaton for a given input is unique.
Deterministic refers to the uniqueness of the computation.

NFA - Non-Deterministic Finite Automata:

A finite automaton is called an NFA when there can be several paths for a specific input from the
current state to the next state.

Example: (Refer to regular expressions)


Recognizes language: (a|b|...|z)((a|b|...|z)|(0|1|...|9))*, i.e. [a-z][a-z0-9]*
Accepts: a, a1b, counter481, x1a2b3z4w, t444
Rejects: 1a, A.
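
The automaton above can also be written as a small table-driven recognizer. The following Python sketch is only an illustration: the state names, the char_class helper and the accepts function are hypothetical names chosen here, not part of the course material.

```python
import string

# States of the DFA: START = start state, IN_IDENT = accepting state, DEAD = reject.
START, IN_IDENT, DEAD = 0, 1, -1

def char_class(ch):
    """Map a character to the input class used by the transition table."""
    if ch in string.ascii_lowercase:
        return "letter"
    if ch in string.digits:
        return "digit"
    return "other"

# Transition table: (state, character class) -> next state.
# Any pair that is missing from the table leads to the DEAD state.
TRANSITIONS = {
    (START, "letter"): IN_IDENT,
    (IN_IDENT, "letter"): IN_IDENT,
    (IN_IDENT, "digit"): IN_IDENT,
}

def accepts(text):
    """Return True if text matches [a-z][a-z0-9]*."""
    state = START
    for ch in text:
        state = TRANSITIONS.get((state, char_class(ch)), DEAD)
        if state == DEAD:
            return False
    return state == IN_IDENT
```

With this table, accepts("counter481") is True, while accepts("1a") and accepts("A") are False, matching the accept and reject lists above.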

Exercise:
What language is this? (Express solution as a regexp.)

(Examine "a lexical analyzer system for simple arithmetic expressions" written in C, on pages 190-194
in the "4.2 Lexical Analysis" section of "Concepts of Programming Languages", 11th Ed., Robert W.
Sebesta.)



2. Data Types and Variables
Introduction
• A data type defines a collection of data objects and a set of predefined operations on those
objects
• A descriptor is the collection of the attributes of a variable
• An object represents an instance of a user-defined (abstract data) type
• One design issue for all data types: What operations are defined and how are they specified?

Types:
• Primitive data types: can't be decomposed into other subvalues.
• Composite data types: struct, tuples, arrays, functions
• Recursive data types: lists, trees, etc.

2.1. Data Types


Primitive Data Types:
• Almost all programming languages provide a set of primitive data types.
• Primitive data types: Those not defined in terms of other data types.
• Some primitive data types are merely reflections of the hardware.
• Others require only a little non-hardware support for their implementation.

Integer
• Almost always an exact reflection of the hardware so the mapping is trivial
• There may be as many as eight different integer types in a language
• Java's signed integer sizes: byte, short, int, long

Floating Point
• Model real numbers, but only as approximations
• Languages for scientific use support at least two floating-point types (e.g., float and
double; sometimes more)
• Usually exactly like the hardware, but not always
• IEEE Floating-Point Standard 754 (1985, 2008, 2019)

Boolean
• Range of values: two elements, one for "true" and one for "false"
• Could be implemented as bits, but often as bytes

Character
• Stored as numeric coding
• Most commonly used coding: ASCII
• An alternative, 16-bit coding: Unicode
– Includes characters from most natural languages,
– Originally used in Java; C# and JavaScript also support Unicode.

Complex
• Some languages support a complex type, e.g., Fortran and Python
• Each value consists of two floats, the real part and the imaginary part
• Literal form (in Python): (7 + 3j), where 7 is the real part and 3 is the imaginary part

Decimal
• For business applications (money)
– Essential to COBOL, C# offers a decimal data type.
• Store a fixed number of decimal digits, in coded form (BCD - binary coded decimal)
• Advantage: accuracy
• Disadvantages: limited range, wastes memory, slow computation
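
The accuracy difference between floating-point and decimal types can be observed directly. The sketch below uses Python's standard decimal module; note that this module is a software decimal type and is not necessarily stored as BCD, so it illustrates only the accuracy point, not the storage format. The variable names are chosen here for illustration.

```python
from decimal import Decimal

# Binary floating point cannot represent 0.1 exactly, so the sum is only approximate.
float_sum = 0.1 + 0.2
float_is_exact = (float_sum == 0.3)

# Decimal arithmetic keeps the decimal digits exactly.
decimal_sum = Decimal("0.1") + Decimal("0.2")
decimal_is_exact = (decimal_sum == Decimal("0.3"))

# A complex value consists of a real part and an imaginary part, both floats.
z = 7 + 3j
real_part, imaginary_part = z.real, z.imag
```

Here float_is_exact is False and decimal_is_exact is True, which is why decimal types are preferred for money.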



Character String Types
• Values are sequences of characters
• Design issues:
– Is it a primitive type or just a special kind of array?
- C, C++: not primitive, char arrays.
- Python: primitive data type.
– Should the length of strings be static or dynamic?

Name (Identifier):
• A name is a string of characters used to identify some entity in the program.
• Languages often have various restrictions on names to make scanning and parsing easier.
• Names cannot include special characters.

2.2. Variables
Attributes of variables:
• name: identifier
• address: machine memory address of the variable
• value: contents of the memory associated with the variable
• type: range of values the variable can have
• scope: range of statements where the variable is visible

Binding:
Binding is an association between a name and what it refers to.
int x; /* x is bound to a stack location containing an int */
int f (int) { ... } /* f is bound to a function */
class C { ... } /* C is bound to a class */
let x = e1;; (* x is bound to e1 *)

Binding time is the time at which a binding takes place.


• Language design time: e.g. + operator is bound to the summation operation
• compile time: e.g. a variable is bound to its type
• link time: e.g. an extern function call is bound to the function definition
• run time: e.g. a variable is bound to its value

Static and Dynamic Type Binding:


• Static Type Binding: occurs before runtime and remains unchanged throughout program
execution.
The type may be specified by either an explicit or an implicit declaration
An explicit declaration is a program statement used for declaring the types of variables.
An implicit declaration is a default mechanism for specifying types of variables (the first
appearance of the variable in the program).
• Dynamic Type Binding: if the binding first occurs during runtime or can change in the course of
program execution.
Specified through an assignment statement (PHP, Python, JavaScript).
PHP:                        Python:
$num = "248";               list = [2, 4.33, 6, 8]
$num = 84.6;                list = 17.3
$num = array(7, 4, 8);      list = 360



Comparison of Static and Dynamic Typing:
• Static typing is more efficient: dynamic typing requires (possibly repeated) run-time type checks,
whereas static typing requires only compile-time type checks, whose cost is minimal.
• Static typing is more secure: the compiler can certify that the program contains no type errors.
• Dynamic typing provides greater flexibility where the types of the data are not known in advance.
• In practice, the greater security and efficiency of static typing are often considered to outweigh
the greater flexibility of dynamic typing.

2.3. Type Checking

Type Checking
• Type checking is the activity of ensuring that the operands of an operator are of compatible
types.
• A compatible type is one that is either legal for the operator, or is allowed under language rules
to be implicitly converted, by compiler-generated code, to a legal type. This automatic
conversion is called coercion.
• A type error is the application of an operator to an operand of an inappropriate type.
• If all type bindings are static, nearly all type checking can be static.
• If type bindings are dynamic, type checking must be dynamic.
• A programming language is strongly typed if type errors are always detected.

Weak vs. Strong Typing


• Weak typing allows one type to be treated as another or provides (many) implicit casts.
Example: int treated as bool in C, C++, Python, PHP, etc.
• Strong typing prevents one type from being treated as another (also known as type safe).
Example: int not treated as bool in Java, OCaml.
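
The ideas of dynamic type binding, dynamic type checking and "int treated as bool" can be seen together in a few lines of Python. This is a minimal sketch; the helper names describe and add_text_and_number are hypothetical.

```python
def describe(value):
    """Return the name of the type currently bound to value."""
    return type(value).__name__

# Dynamic type binding: the same name is bound to three different types.
num = "248"
kinds = [describe(num)]
num = 84.6
kinds.append(describe(num))
num = [7, 4, 8]
kinds.append(describe(num))

# bool is a subclass of int in Python, so booleans and integers mix freely.
mixed = True + 1

def add_text_and_number():
    """Applying + to a str and an int is a type error, detected only when this runs."""
    try:
        return "248" + 1
    except TypeError as err:
        return "TypeError: " + str(err)
```

After running, kinds holds the type names str, float and list, mixed is 2, and add_text_and_number() returns a message starting with "TypeError", because Python detects this type error at run time rather than coercing the operands.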

Type Conversions
If the types are the same, there are no issues.
If the types are different, a conversion is needed.
Coercion: implicit conversions. Can happen in assignments, arithmetic operations, function calls.
Coercions are supposed to be applied only when they preserve semantics.
E.g. lengthening a short value to a longer type preserves semantics, but shortening might or
might not preserve the semantics.
e.g. in C:
int x = 2;
double y = 3.5;
y = x * y; // in x * y, x is coerced into a double

Cast: explicit conversions


e.g. in C:
int x = 2;
double y = 3.5;
x = x * ((int) y); // y is cast into an int



2.4. Scope
Scope
The scope of a variable is the range of statements where the variable is visible.
A variable is visible in a statement if it can be referenced or assigned in that statement.
Static Scope:
Scope of a variable can be determined prior to execution.
A variable is visible within the code block it is defined, and it refers to its closest binding, going
from inner to outer scope in the program text.
Languages like C, C++, Java, and OCaml are statically scoped.
Dynamic Scope:
Dynamic scoping is based on the calling sequence of subprograms, not on their spatial
relationship to each other. In dynamic scope, a non-local identifier refers to the binding made in
the most recently called subprogram that is still active.
The dynamic scope can be determined only at runtime.
Dynamic scoping is uncommon in modern languages. It was used in early dialects of Lisp and is
still available as an option in Perl and Common Lisp.
Blocks:
The block concept allows a section of code (called block) to have its own local variables whose
scope is minimized to the block. Such variables are typically stack dynamic, so their storage is
allocated when the section is entered and deallocated when the section is exited.
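Blocks are allowed in C, C++, Java and C#. C and C++ also allow a nested block to redeclare a name
that is declared in an enclosing block; Java and C# do not.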
Global Scope:
Some languages allow a program structure that is a sequence of function definitions, in which
variable definitions can appear outside the functions. Definitions outside functions in a file create
global variables, which potentially can be visible to those functions. C, C++, PHP, Python, etc.

Examples: Python

Example 1:
    def f():
        print(s)

    s = "Hello"
    f()
Output:
    Hello
Here, s is global, because s is not defined (or assigned) in f.

Example 2:
    def f():
        s = "World"
        print(s)

    s = "Hello"
    f()
    print(s)
Output:
    World
    Hello
s in f is local, because s is defined in f.

Example 3:
    def f():
        print(s)
        s = "World"
        print(s)

    s = "Hello"
    f()
    print(s)
Output:
    UnboundLocalError: cannot access local variable 's' where it is not associated with a value
In f, s is assigned, so s is local to f, but it is used before it is assigned.

Example 4:
    def f():
        global s
        print(s)
        s = "World"
        print(s)

    s = "Hello"
    f()
    print(s)
Output:
    Hello
    World
    World
s is global. The assignment to s in f changes the global s.

Example 5:
    def f():
        s = "World"
        print(s)

    f()
    print(s)
Output:
    World
    NameError: name 's' is not defined
s is local to f, it only exists in f.

Example 6:
    def f(x, y):
        global a
        a = 10
        x, y = y, x
        b = 11
        print(a, b, x, y)

    a, b, x, y = 1, 2, 3, 4
    f(5, 6)
    print(a, b, x, y)
Output:
    10 11 6 5
    10 2 3 4
In f, a is global; b, x and y are all local.
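Example of static scope in C:
A variable is visible within the code block in which it is defined, and it refers to its closest
binding, going from inner to outer scope in the program text.
    int i = 7;
    {
        int j;
        {
            float i = 2.4;   /* hides the outer i inside this block */
            j = (int) i;     /* refers to the inner i: j becomes 2 */
        }
        int k = i + 1;       /* refers to the outer i: k becomes 8 */
    }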

Example: What values would be printed from the following program if:
• static scope is used?
• dynamic scope is used?

    int m = 60;
    int n = 100;
    function first()
    {
        print("In first, n is ", n);
    }

    function second(int n)
    {
        print("In second, m is ", m);
        print("In second, n is ", n);
        first();
    }

    print("In main program, n is ", n);
    second(1);
    first();

Static scope:
    In main program, n is 100
    In second, m is 60
    In second, n is 1
    In first, n is 100
    In first, n is 100

Dynamic scope:
    In main program, n is 100
    In second, m is 60
    In second, n is 1
    In first, n is 1
    In first, n is 100
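
Python itself is statically scoped, but the dynamic-scope answer can be checked by simulating the lookup rule with an explicit stack of environments. This is only a sketch under that assumption: env_stack, lookup and run_dynamic are hypothetical helper names, and each call pushes a dictionary holding the names that the subprogram binds.

```python
# Bottom of the stack: the global environment of the example program.
env_stack = [{"m": 60, "n": 100}]

def lookup(name):
    """Dynamic scoping: search from the most recent activation back to the globals."""
    for env in reversed(env_stack):
        if name in env:
            return env[name]
    raise NameError(name)

def first(out):
    env_stack.append({})                 # first binds no names of its own
    try:
        out.append(f"In first, n is {lookup('n')}")
    finally:
        env_stack.pop()

def second(n, out):
    env_stack.append({"n": n})           # the parameter n is bound in this activation
    try:
        out.append(f"In second, m is {lookup('m')}")
        out.append(f"In second, n is {lookup('n')}")
        first(out)
    finally:
        env_stack.pop()

def run_dynamic():
    """Run the main program and return the printed lines as a list."""
    out = [f"In main program, n is {lookup('n')}"]
    second(1, out)
    first(out)
    return out
```

Calling run_dynamic() yields the five lines listed under "Dynamic scope": when first is called from second, the nearest binding of n on the stack is the parameter of second.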

2.5. Storage Bindings and Lifetime


Lifetime of a variable:
The lifetime of a variable is the time during which the variable stays in memory and is therefore
accessible during program execution.
Static allocation:
Stored at fixed absolute addresses. Global variables, static variables, explicit constants.
Stack-based allocation:
Lifetime spans the period between the invocation and return of the function. Parameters, local
variables.
For subprogram calls, the stack-based allocation mechanism works as follows:
• Before the call, the caller pushes the return address and arguments onto the stack.
• After being called, the subroutine initializes local variables.
• Before returning, the subroutine cleans up local data.
• After the call returns, the caller retrieves return values and restores the stack to its state
before the call.
Heap-based allocation:
Objects created / destroyed at arbitrary times (dynamic allocation).



Storage Bindings and Lifetime

Static Variables:
Static variables are bound to memory cells before program execution begins and remain bound
to the same memory cells throughout execution. Storage requirements known at compile time.
Advantages:
• efficiency (direct addressing), no run-time overhead for allocation & deallocation,
• history-sensitive: maintain values between successive function calls.
Disadvantage:
• storage cannot be shared among variables.

Stack-Dynamic Variables:
Stack-dynamic variables are those whose storage bindings are created when their declaration
statements are elaborated.
Storage is allocated & deallocated in last-in first-out order, from the run-time stack.
Local variables, parameters, temporary variables.
Advantages:
• allows recursion,
• conserves storage.
Disadvantages:
• Overhead of allocation and deallocation,
• Subprograms cannot be history sensitive,
• Inefficient references (indirect addressing).

Explicit Heap-Dynamic Variables:


Allocated and deallocated by explicit directives, specified by the programmer, which take effect
during execution. Referenced only through pointers or references.
All objects in Java, dynamic objects in C++.
Advantage: provides for dynamic storage management.
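Disadvantages: inefficient (cost of references, allocation and deallocation) and unreliable
(pointers and references are difficult to use correctly).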

Implicit Heap-Dynamic Variables:


Allocation and deallocation caused by assignment statements. All their attributes are bound
every time they are assigned. Location: Heap memory.
All variables in APL, all strings and arrays in Perl and JavaScript.
Advantage: flexibility.
Disadvantages: Inefficient, because all attributes are dynamic, and loss of error detection.

Memory allocation for an example C program



Heap-based allocation methods:
• First fit: select the first block large enough to satisfy the request.
• Best fit: select the smallest block large enough to satisfy the request.
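
The two selection policies differ only in which suitable block they pick. The sketch below models the free list as a plain list of block sizes; the function names and the sample sizes are hypothetical and chosen for illustration.

```python
def first_fit(free_blocks, request):
    """Return the index of the first free block that can hold request, or None."""
    for index, size in enumerate(free_blocks):
        if size >= request:
            return index
    return None

def best_fit(free_blocks, request):
    """Return the index of the smallest free block that can hold request, or None."""
    best = None
    for index, size in enumerate(free_blocks):
        if size >= request and (best is None or size < free_blocks[best]):
            best = index
    return best

free_blocks = [300, 120, 500, 150]   # hypothetical sizes of the free blocks
```

For a request of 140, first_fit selects index 0 (the 300 block) and best_fit selects index 3 (the 150 block). First fit stops at the first match, while best fit must examine the whole list.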

(Figure: First Fit and Best Fit)

Heap Allocation
• The heap is finite: if too many objects are allocated on the heap, it will run out of space.
• Solution: deallocate space when it is no longer needed.
• Methods:
– Manual deallocation, e.g. with free (C), delete (C++), dispose (Pascal)
– Automatic deallocation via garbage collection (Java, C#, Scheme, ML, Perl)
– Semi-automatic deallocation, using destructors (C++, Ada)
Automatic because the destructor is called at certain points automatically
Manual because the programmer writes the code for the destructor

Manual deallocation:
• The programmer is in charge of deciding when heap storage can be freed. (free function in C)
• Manual deallocation is dangerous. Two types of mistakes:
– storage is not freed even though it is no longer needed (memory leak)
– storage is freed but referred to later (dangling pointer): the programmer accidentally
deallocates a block of memory that is still in use.

Requirements for Automatic Garbage Collection:


• It should identify most garbage.
• Anything it identifies as garbage must be garbage.
• It should impose a low added time overhead.
• During garbage collection, the program may be paused, but these pauses should be short.

Garbage collection algorithms:


• Mark / Sweep algorithm
• Copying
• Reference Counting
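
As an illustration of the first algorithm, the following sketch runs a mark phase and a sweep phase over a toy heap. The heap is modeled as a dictionary that maps each object name to the names it references; this representation and the names mark, sweep, heap and roots are assumptions made for the example, not how a real collector stores objects.

```python
def mark(heap, roots):
    """Mark phase: collect every object reachable from the roots."""
    marked = set()
    worklist = list(roots)
    while worklist:
        obj = worklist.pop()
        if obj not in marked:
            marked.add(obj)
            worklist.extend(heap[obj])   # follow the outgoing references
    return marked

def sweep(heap, marked):
    """Sweep phase: remove every unmarked object and return the removed names."""
    garbage = sorted(obj for obj in heap if obj not in marked)
    for obj in garbage:
        del heap[obj]
    return garbage

# A -> B -> C is reachable from the root; D and E only reference each other.
heap = {"A": ["B"], "B": ["C"], "C": [], "D": ["E"], "E": ["D"]}
roots = ["A"]
collected = sweep(heap, mark(heap, roots))
```

Here D and E are collected even though they reference each other. Pure reference counting would not reclaim such a cycle, because each of the two objects keeps the other's count above zero.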

2.6. Exceptions and Exception Handling

Types of Program Errors:


• Syntax errors: errors due to the fact that the syntax of the language is not respected.
• Semantic errors: improper use of program statements.
• Logical errors: The program does not generate the requested result.



Exceptions
• Exception: occurrence of an unusual event during the execution of the program
• Exception handling in PLs is an error handling technique.
• An exception is raised when its associated event occurs.
• Handling the exception involves executing an exception handling procedure, which typically
changes the program flow.
• Alternatives to built-in exception handling:
– return codes
– pass a label parameter to all subprograms
– pass an exception handling subprogram to all subprograms

Exception Handling
• In a language without exception handling, when an exception occurs, control goes to the
operating system, where a message is displayed and the program is terminated.
• In a language with exception handling, programs are allowed to trap some exceptions, thereby
providing the possibility of fixing the problem and continuing.

Detection:
• All syntax errors and some of the semantic errors (the static semantic errors) are detected by the
compiler.
• Other semantic errors (the dynamic semantic errors) and the logical errors cannot be detected
by the compiler, and hence they are detected only when the program is executed.

Return codes in C:
• In C, a common convention is that functions return an "int" status value.
• A return value of 0 indicates that the function completed successfully.
• Each negative return value generally indicates a different error.
• Values that need to be returned from the function are generally passed back through pointer
parameters (C's way of achieving "pass by reference").

Advantages of built-in exception handling


• Error detection code is tedious to write and it clutters the program
• Exception handling encourages programmers to consider many different possible errors
• Exception propagation allows a high level of reuse of exception handling code

Design question: Resumption vs Termination


• Resumption: an exception handler can resume computation at the place where the exception
was thrown
• Termination: throwing an exception terminates execution at the point of the exception.

Example: Exception handling in C++


It was added to C++ in 1990. Earlier versions of C++ did not support exception handling.
General form:
    try {
        // code that is expected to raise an exception
    }
    catch (formal parameter) {
        // handler code
    }
    ...
    catch (formal parameter) {
        // handler code
    }

Example:
    int fac(int n) {
        if (n < 0) throw (-1);                    // negative argument
        else if (n > 12) throw ("n too large");   // 13! does not fit in a 32-bit int
        else if (n == 0) return 1;                // base case of the recursion
        else return n * fac(n - 1);
    }

    void g(int n) {
        int k;
        try { k = fac(n); }
        catch (int i) { cout << "negative value invalid"; }
        catch (const char *s) { cout << s; }      // a string literal is thrown as const char *
        catch (...) { cout << "unknown exception"; }
        ...
    }
Unhandled exceptions
An unhandled exception is propagated to the caller of the function in which it is raised.
This propagation continues to the main function.
If no handler is found, the default handler is called.

Exception handling in Java:


• Java offers a predefined set of exceptions that can be thrown during program execution.
• All exceptions are objects of classes that are descendants of the Throwable class.
• Errors represent events that cannot be controlled by the programmer (for example,
OutOfMemoryError), while exceptions can be handled during the execution of the program.

try {
// code that might throw multiple exceptions
}
catch ([Type of Exception 1] e) { // e.g. FileNotFoundException
// what to do if exception is thrown
}
catch ([Type of Exception 2] e2) { // e.g. IOException
// what to do if exception is thrown
}
finally {
// statements here always get executed, regardless of what happens in the try block.
// Can be used to close files or to release other system resources, etc...
}

Exception Handling in Python:


• Exceptions are objects; the base class is BaseException.
• Most predefined exceptions are derived from Exception, and user-defined exceptions should be
derived from it as well.
• Predefined subclasses of Exception are ArithmeticError (subclasses are OverflowError,
ZeroDivisionError, and FloatingPointError) and LookupError (subclasses are IndexError and
KeyError).

try:
- The try block
except Exception1:
- Handler for Exception1
except Exception2:
- Handler for Exception2
...
else:
- The else block (no exception is raised)
finally:
- the finally block (do it no matter what)
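
A runnable instance of this structure is sketched below. The function name safe_divide and the trace list are hypothetical; the trace only records which blocks of the try statement were executed.

```python
def safe_divide(a, b):
    """Divide a by b and record which blocks of the try statement run."""
    trace = []
    result = None
    try:
        result = a / b
    except ZeroDivisionError:
        trace.append("except ZeroDivisionError")
    except TypeError:
        trace.append("except TypeError")
    else:
        trace.append("else")          # runs only if no exception was raised
    finally:
        trace.append("finally")       # runs in every case
    return result, trace
```

safe_divide(8, 2) runs the else and finally blocks, safe_divide(1, 0) runs the ZeroDivisionError handler and then finally, and safe_divide("x", 2) runs the TypeError handler and then finally.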



3. Subprograms, Parameter Passing and Polymorphism
3.1. Subprograms and Parameter Passing
A subprogram is a sequence of instructions whose execution is invoked from one or more remote
locations in a program, with the expectation that when the subprogram execution is complete,
execution resumes at the instruction after the one that invoked the subprogram.
A subprogram definition has:
• header: name, parameters (formals), return type;
• body: code.
E.g.: int triple(int x) { return x * 3; }
A subprogram call has:
• name, parameters (actuals)
E.g.: triple(4);

Models of parameter passing
• in mode: actuals are copied into formals during the initiation of the call.
• out mode: formals are copied into actuals during the return of the call.
• inout mode: actuals are copied into formals during the initiation of the call and formals are
copied into actuals during the return of the call.

Implementations of Parameter Passing:
• Pass-by-value: in-mode
The value of the actual parameter is used to initialize the corresponding formal parameter, which
then acts as a local variable in the subprogram. Normally implemented by copying the actual
parameter to the stack location of the formal parameter.
• Pass-by-reference: a variation of in-out mode.
There is no copy: the formal references the actual, which means they are aliases, so when one
changes the other one changes too. This method transmits an access path (address) to the
called subprogram.
Aliasing: Two variables are aliased if they refer to the same storage location.
• Pass-by-result: out-mode
When a parameter is passed by result, no value is transmitted to the subprogram.
The corresponding formal parameter acts as a local variable.
When control is transferred back to the caller, its value is transmitted to the actual parameter.
• Pass-by-value-result: in-out mode
Combination of pass-by-value and pass-by-result.
Actual parameters are copied to the corresponding formal parameters, which have local storage.
When control is transferred back to the caller, the values of the formal parameters are
transmitted to the caller's actual parameters.
• Pass-by-name:
The symbolic "name" of a variable is passed, which allows it both to be accessed and updated.
Rarely used today.



Pass-By-Value:
    #include <stdio.h>

    // Function prototype:
    void func1(int n);

    int main(void)
    {
        int n = 7;
        printf("%d\n", n);
        func1(n);              // a copy of n is passed
        printf("%d\n", n);
        return(0);
    } // end main

    void func1(int n)
    {
        n = n * 2;             // changes only the local copy
        printf("%d\n", n);
    } // end func1
Output:
    7
    14
    7

Pass-By-Reference (using a pointer):
    #include <stdio.h>

    // Function prototype:
    void func1(int *n);

    int main(void)
    {
        int n = 7;
        printf("%d\n", n);
        func1(&n);             // the address of n is passed
        printf("%d\n", n);
        return(0);
    } // end main

    void func1(int *n)
    {
        *n = *n * 2;           // changes the caller's n
        printf("%d\n", *n);
    } // end func1
Output:
    7
    14
    14

Pass-By-Result:
    // C#: the out specifier is used to indicate
    // the pass-by-result method.
    void Fixer(out int x, out int y)
    {
        x = 19;
        y = 32;
    }

    void Main(string[] args)
    {
        int a = 4, b = 7;
        Fixer(out a, out b);   // out is also required at the call
        Console.WriteLine(a);
        Console.WriteLine(b);
    }
Output:
    19
    32

Pass-By-Value-Result:
    begin
        integer n;
        procedure p(k: integer);
        begin
            n := n+1;
            k := k+4;
            print(n);
        end;
        n := 0;
        p(n);
        print(n);
    end;
Output of this program for each parameter passing method:
    By-Value    By-Reference    By-Value-Result
    1           5               1
    1           5               4
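
The three result columns can be reproduced by tracing the program by hand. Python does not offer these passing modes, so the sketch below only simulates them: the dictionary store stands for the memory cell of the global n, the local name k stands for the formal parameter, and the function run is a hypothetical name.

```python
def run(mode):
    """Simulate the pseudocode program under one parameter-passing mode."""
    store = {"n": 0}               # n := 0
    printed = []

    if mode == "value":
        k = store["n"]             # copy in
        store["n"] += 1            # n := n+1
        k += 4                     # k := k+4 changes only the local copy
        printed.append(store["n"]) # print(n) inside p
    elif mode == "reference":
        # k is an alias of n, so both statements update the same cell.
        store["n"] += 1            # n := n+1
        store["n"] += 4            # k := k+4
        printed.append(store["n"])
    elif mode == "value-result":
        k = store["n"]             # copy in
        store["n"] += 1            # n := n+1
        k += 4                     # k := k+4
        printed.append(store["n"])
        store["n"] = k             # copy out when p returns
    else:
        raise ValueError(mode)

    printed.append(store["n"])     # print(n) in the main block
    return printed
```

run("value") gives [1, 1], run("reference") gives [5, 5] and run("value-result") gives [1, 4], in agreement with the table.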

Parameter passing methods for major languages:


C      Pass-by-value.
       Pass-by-reference is achieved by using pointers as parameters.
C++    Pass-by-value.
       Pass-by-reference is achieved by using a special pointer type called reference type.
Java   All parameters are passed by value.
       For objects, the value passed is a reference to the object, so the called method can change
       the object (in effect, pass-by-reference) but cannot make the caller's variable refer to a
       different object.
C#     Default method: pass-by-value.
       Pass-by-reference is specified by preceding both a formal parameter and its actual
       parameter with ref.



Subprograms as Parameters
It is sometimes convenient to pass subprogram names as parameters.
E.g. you would like to evaluate a mathematical function for a set of points, you shouldn’t write a
separate subprogram for each such math function.
Some programming languages allow passing subprograms as arguments to other subprograms.
In C and C++, functions cannot be passed as parameters, but pointers to functions can be passed. In
this way type checking is ensured.

Parameters that are Subprogram Names: Referencing Environment


To pass a subprogram as a parameter, the system passes a closure: a reference to the subprogram
body along with a pointer to the environment of definition of the subprogram.

Design Issues for Functions:
• Are side effects allowed?
– Parameters should always be in-mode to reduce side effects (like Ada).
– Side effect: Any assignment that changes a value in memory is a side effect.
• What types of return values are allowed?
– Most imperative languages restrict the return types.
– C functions return any type except arrays and functions.
– C++ is like C but also allows user-defined types.
– Java and C# do not have functions, but they have methods that can return any type.
– Python and Ruby treat methods as first-class objects, so methods can be returned, as well as
objects of any class.

How to pass Python function as a function argument?


https://www.tutorialspoint.com/How-to-pass-Python-function-as-a-function-argument
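
In Python, a function is an ordinary value, so it can be passed and returned directly. The sketch below follows the example of evaluating a mathematical function for a set of points, and it also shows a closure; the names evaluate and make_scaler are hypothetical.

```python
import math

def evaluate(func, points):
    """Apply func to every point; func is passed like any other argument."""
    return [func(x) for x in points]

def make_scaler(factor):
    """Return a closure: scale remembers factor from the environment of its definition."""
    def scale(x):
        return factor * x
    return scale

squares = evaluate(lambda x: x * x, [1, 2, 3])
roots = evaluate(math.sqrt, [1, 4, 9])
tripled = evaluate(make_scaler(3), [1, 2, 3])
```

The same evaluate subprogram serves all three functions, so no separate subprogram per math function is needed. The call make_scaler(3) shows why a closure must carry its defining environment: scale still needs factor after make_scaler has returned.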

3.2. Polymorphism
Definitions of Polymorphism:
• In programming language theory, polymorphism is the provision of a single interface to entities of
different types or the use of a single symbol to represent multiple different types.
• Polymorphism refers to the ability of a function or data structure to accommodate data of different
types.

Major classes of polymorphism:
• Ad hoc polymorphism:
Ad hoc polymorphism is a kind of polymorphism in which polymorphic functions can be applied
to arguments of different types, because a polymorphic function can denote a number of distinct
and potentially heterogeneous implementations depending on the type of argument(s) to which it
is applied. When applied to object-oriented or procedural concepts, it is also known as function
overloading or operator overloading.
• Parametric polymorphism: not specifying concrete types and instead using abstract symbols that
can substitute for any type.
Parametric polymorphism occurs when a routine, type or class definition is parameterized by
one or more types. It allows the actual parameter type to be selected by the user. This way, it is
possible to define generic types or functions, which are expressed by using type variables for the
parameter types.
• Subtype polymorphism (inclusion polymorphism): when a name denotes instances of many
different classes related by some common superclass.
Found in object-oriented programming languages. Supported through inheritance.
Any function with an object as parameter is polymorphic.
If the formal parameter is of class A,
then the actual parameter (argument) may be any object of a subclass of A.
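
Subtype polymorphism can be sketched with a small class hierarchy. Python is dynamically typed, so the formal parameter is not declared to be of class Shape; the example assumes by convention that total_area receives Shape objects. All class and function names here are hypothetical.

```python
class Shape:
    def area(self):
        raise NotImplementedError

class Rectangle(Shape):
    def __init__(self, width, height):
        self.width, self.height = width, height

    def area(self):
        return self.width * self.height

class Square(Rectangle):
    def __init__(self, side):
        super().__init__(side, side)   # a square is a rectangle with equal sides

def total_area(shapes):
    """Works for any objects whose class is a subclass of Shape."""
    return sum(shape.area() for shape in shapes)
```

total_area([Rectangle(2, 3), Square(4)]) returns 22: the same function handles instances of different classes that share the superclass Shape.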
Overloading
• Multiple copies of a function
• Same function name
• Different number and/or type of parameters
• The return type is irrelevant in the disambiguation process.
• Arguments determine the function to be invoked
    static void print(int x) { ... }
    static void print(double x) { ... }
    print(7);          // invokes 1st print
    print(3.14159);    // invokes 2nd print
• C++, Java and C# include predefined overloaded subprograms.
• C++, Java and C# allow users to write multiple versions of subprograms with the same name.
• Disadvantages:
– makes code large and hard to maintain
– bugs need to be fixed in many places

Operator Overloading
• Treat operators as functions
• Behaviour differs depending on operand type
Example in Java:
1+2 // integer addition
2.7 + 3.14159 // double (float) addition
"Hello " + "world" // string concatenation

Example of parametric polymorphism (in C++):


    #include <iostream>
    using namespace std;

    template <class T>
    T maximum(T a, T b) {
        if (a >= b) { return a; }
        else { return b; }
    }

    int main() {
        cout << "maximum(4, 7) -> " << maximum(4, 7) << endl;
        cout << "maximum(8.4, 3.14) -> " << maximum(8.4, 3.14) << endl;
        cout << "maximum('K', 'T') -> " << maximum('K', 'T') << endl;
        return 0;
    }
Output:
    maximum(4, 7) -> 7
    maximum(8.4, 3.14) -> 8.4
    maximum('K', 'T') -> T
The type variable T declared by template <class T> makes maximum a generic function: at each call,
T is replaced by the type of the arguments (int, double and char above). The function takes two
parameters (a and b) of type T and returns a value of type T.



4. Functional Programming Languages
Functional programming is a programming paradigm where programs are constructed by composing
and applying functions. It is a declarative programming paradigm in which function definitions are trees
of expressions that map values to other values, rather than a sequence of imperative statements
which update the running state of the program.
Lambda calculus (also written as λ-calculus) is a formal system in mathematical logic for
expressing computation based on function abstraction and application using variable binding
and substitution. It was introduced by the mathematician Alonzo Church in the 1930s as part of
his research into the foundations of mathematics.

4.1. ML
ML (MetaLanguage) was originally designed in the 1970s by Robin Milner at the University of
Edinburgh as a metalanguage for a program verification system named Logic for Computable
Functions (LCF).
ML is primarily a functional language, but it also supports imperative programming. Unlike Lisp and
Scheme, the type of every variable and expression in ML can be determined at compile time. Types
are associated with objects rather than names. Types of names and expressions are inferred from
their context.
Unlike Lisp and Scheme, ML does not use the parenthesized functional syntax that originated with
lambda expressions. Rather, the syntax of ML resembles that of the imperative languages, such as
Java and C++.
Miranda was developed by David Turner in the early 1980s. Miranda is based partly on the languages
ML, SASL, and KRC.
Haskell is a purely functional language, having no variables and no assignment statement, and is
based in large part on Miranda. Another distinguishing characteristic of Haskell is its use of lazy
evaluation, which means that no expression is evaluated until its value is required.
Caml, and its dialect OCaml (which supports object-oriented programming), descended from ML.
Finally, F# is a relatively new typed language based directly on OCaml. F# is a .NET
language with direct access to the whole .NET library.

In ML:
• Everything is an expression
• Everything evaluates to a value
• Everything has a type

Interacting with ML:


"Read-Eval-Print" Loop (REPL):
Repeat
System reads expression e
System evaluates e to get value v
System prints value v and type t

ML's Holy Trinity:


1. Enter an expression e
2. ML infers a type t or emits an error
3. ML evaluates expression e down to a value v
4. Value v is guaranteed to have type t

ML Base types: integer, string, boolean


Type checking can be static or dynamic. In ML it is static, and an int is not coerced to a float.
Example: OCaml: 4 + 2.8;;
Error: This expression has type float but an expression was expected of type int



4.2. Haskell
Haskell, created in the late 1980s, is a purely functional language, having no variables and no
assignment statement.
Haskell uses lazy evaluation, which means that no expression is evaluated until its value is required.

Haskell is:
• Functional
– Functions are first-class, that is, functions are values which can be
used in exactly the same ways as any other sort of value.
– Haskell programs are centered around evaluating expressions
rather than executing instructions.
• Pure
Haskell expressions are always referentially transparent, that is:
– No mutation! Everything (variables, data structures, …) is immutable.
– Expressions never have "side effects" (like updating global variables or printing to the
screen).
– Calling the same function with the same arguments results in the same output every time.
• Lazy Evaluation
In Haskell, expressions are not evaluated until their results are actually needed. This is a simple
decision with far-reaching consequences. Some of the consequences include:
– It is easy to define a new control structure just by defining a function.
– It is possible to define and work with infinite data structures.
– It enables a more compositional programming style.
• Statically typed
– Every Haskell expression has a type, and types are all checked at compile-time.
– Programs with type errors will not even compile, much less run.
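
The idea of working with infinite data structures under lazy evaluation can be approximated in Python with generators. This is only a rough analogy: Python as a whole is evaluated eagerly, and only the generator produces its values on demand. The names naturals, squares_of and first_five are hypothetical.

```python
from itertools import count, islice

def naturals():
    """An unbounded sequence 0, 1, 2, ...; values are produced only on demand."""
    return count(0)

def squares_of(numbers):
    """Lazily map each number to its square."""
    for n in numbers:
        yield n * n

# Only the first five elements of the infinite sequence are ever computed.
first_five = list(islice(squares_of(naturals()), 5))
```

first_five is [0, 1, 4, 9, 16]. In Haskell the same effect needs no special construct, because every expression is evaluated lazily.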

Online Haskell sites:


https://play.haskell.org
https://www.tutorialspoint.com/compile_haskell_online.php

Haskell Basics:
• Basic Syntax
A Haskell program consists of function definitions followed by main body.
Do not use tab characters anywhere in the program except inside comments.
    func1 ... = ...     --\
    ...                 -- > function definitions
    funcn ... = ...     --/
    main = do           -- main program; "do" starts a block of statements
        stmt_1          --\
        ...             -- > indentation must be equal in all statements
        stmt_n          --/
• Comments:
The characters "--" followed by any sequence of characters up to end of line.
The symbol "{-" followed by any sequence of characters (including new lines) up to "-}".
• Declaring Values
let keyword is used to declare values:
let n = 42
let m = n + 6
let msg = "Result= " ++ show m
Haskell show function converts to string: show 48 evaluates to "48"



• Operators:
Arithmetic Operators: + - * /
Power Operators: ^ (non-negative integer exponent) ^^ (fractional base, integer exponent)
** (floating-point base and exponent, result is floating-point)
Relational Operators: == /= < <= > >=
Logical Operators: not && (and) || (or)
String and List Concatenation Operator: ++
Range Operator: .. Generates values by increasing successively by 1.
[4..7] result: [4,5,6,7] [4..3] result: []
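[4..7.49] result: [4.0,5.0,6.0,7.0]
[4..7.5] result: [4.0,5.0,6.0,7.0,8.0] (Attention to 8.0)
[4.4..7] result: [4.4,5.4,6.4,7.4] (Attention to 7.4)
For fractional ranges, values are generated up to the upper limit + 0.5, which explains 8.0 and 7.4 above.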
Precedence of Operators:
Precedence   Operator                            Description                 Associativity
10 highest   f x                                 Function application        Left
9            !!                                  Index operator              Left
9            .                                   Function composition        Right
8            ^ ^^ **                             Power                       Right
7            * / `div` `mod` `quot` `rem`        Arithmetic operators        Left
6            + -                                 Arithmetic operators        Left
5            : ++                                Append to list              Right
4            == /= < <= >= > `elem` `notElem`    Comparisons, Element        None
4            <*> <$>                             Functor ops                 Left
3            &&                                  Logical AND                 Right
3            <|>                                 Choice (Alternative)        Left
2            ||                                  Logical OR                  Right
1            >> >>=                              Monadic ops                 Left
1            =<<                                 Monadic ops                 Right
0            $ $! `seq`                          Function application, seq   Right

Functions:
A function returns the value of the expression written after the equal sign (=):
    funcname arg1 arg2 ... argn = <expr>

Get the type of a function (in the GHCi interpreter):
    :t funcname

Examples:
double x = x + x
inc x = x + 1
in_range x min max = x >= min && x <= max
double 7 -- 14
inc 7 -- 8
in_range 4 2 7 -- True
in_range 8 2 7 -- False

Sometimes it is necessary to declare the type of a function and the types of its parameters before defining it.
In this declaration, the last type is the result type of the function; the previous ones are the types of the parameters.
stringToInt :: String -> Int -- function type definition
stringToInt s = read s -- followed by function
stringToDouble :: String -> Double -- function type definition
stringToDouble s = read s -- followed by function
quadratic :: Double -> Double -> Double -> Double -> Double
quadratic x a b c = a * (x ** 2) + b * x + c
Calls of these functions:
let a = stringToInt "771" + 1 -- a will be 772
print(stringToDouble "7.71" + 3) -- 10.71
let y = quadratic 2 2 (-3) 5 -- y will be 7.0



if ... then ... else: can be used anywhere an expression can be used.
max2 x y = if x >= y then x else y -- in function
max3 x y z = if x >= y && x >= z then x else if y >= z then y else z
main = do
let k = if n >= m then n else m -- in let statement
print(if 47 > 32 then "aaaa" else "bbbb") -- in print

Built-in Functions: (this is not the complete list of built-in Haskell functions)
Function   Description                                     Example
length     Length of a string or list                      length "abcd" -- 4
mod        Returns modulus of two integers                 mod 48 10 -- 8
div        Integer division of two integers                div 48 10 -- 4
even       Returns True if parameter is even               even 42 -- True
odd        Returns True if parameter is odd                odd 42 -- False
sqrt       Square-root, returns double                     sqrt 16 -- 4.0
max        Returns maximum of two values                   max 4 3 -- 4
min        Returns minimum of two values                   min "good" "hi" -- "good"
gcd        Returns the greatest common divisor             gcd 20 48 -- 4
lcm        Returns the lowest common multiple              lcm 20 48 -- 240
head       Returns the head (first element) of a list      head [4,5,6,7,8] -- 4
tail       Returns the tail (all except head) of a list    tail [4..8] -- [5,6,7,8]
elem       Element of list: True or False                  elem 42 [1..40] -- False
take       Returns a list by taking the first n elements   take 4 [8..20] -- [8,9,10,11]

Currying:
When a function has multiple arguments, the function consumes one argument at a time. This is
called currying the function.
Curried vs Uncurried:
mult3 a b c = a * b * c -- Curried Function
mult3u (a, b, c) = a * b * c -- Uncurried Function
main = do
let a = mult3 3 4 5 -- 60
let b = mult3u (3, 4, 5) -- 60
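
Because a curried function consumes one argument at a time, it can be applied to fewer arguments than it declares; the result is a new function waiting for the rest (partial application). A minimal sketch, where mult3by2 and times6 are hypothetical names introduced only for illustration:

```haskell
mult3 :: Int -> Int -> Int -> Int
mult3 a b c = a * b * c

-- Supplying only the first argument yields a function of two arguments.
mult3by2 :: Int -> Int -> Int
mult3by2 = mult3 2

-- Supplying two arguments yields a function of one argument.
times6 :: Int -> Int
times6 = mult3 2 3

main :: IO ()
main = do
    print (mult3by2 4 5)                 -- 40
    print (times6 7)                     -- 42
    print (map (mult3 1 10) [1, 2, 3])   -- [10,20,30]
```

The uncurried version mult3u cannot be used this way, since it needs the whole tuple at once.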

Data Types:
name :: <type> is read as: "name is of type <type>"
Type      Description                        Example
Int       Machine-sized integers             i :: Int
                                             let i = -78
Integer   Arbitrary-precision integers       n :: Integer
                                             let n = 12345678909876543210987340828798724
Float     Single precision floating point    squares :: Float -> Float -> Float
          number                             squares x y = x*x + y*y
                                             main = print (squares 2 3.8)
Double    Double precision floating point    squares :: Double -> Double -> Double
          number                             squares x y = x*x + y*y
                                             main = print (squares 2 3.8)
Bool      Boolean.                           bignum n = n >= 1000000
          Either True or False.              bignum 777 -- False
Char      Character                          :t 'a'
          delimited by single-quotes         'a' :: Char
String    String of characters               length "Galaxy" -- 6
          delimited by double-quotes.        ['a', 'x', 'e'] is equivalent to "axe"
          length function gives size.
Type Conversions:
- Conversion from Integer to Int: fromInteger
    let age = fromInteger year - byear -- Convert year from Integer to necessary type
- Conversion to string: function show converts data of any type (that is an instance of Show) to string
    let ns = show 44 -- "44"
    let xs = show 3.14159 -- "3.14159"
    let lst = [4, 3, 2, 4]
    let lststr = show lst -- "[4,3,2,4]"
- Conversion from string to integer or double: write functions based on read and specify types.
    stringToInt :: String -> Int
    stringToInt s = read s -- stringToInt is defined in terms of the read function
    stringToDouble :: String -> Double
    stringToDouble s = read s -- stringToDouble is defined in terms of the read function
    let a = stringToInt "778" -- 778
    let x = stringToDouble "24.7048" -- 24.7048
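
Note that read stops the program with a runtime error when the string cannot be parsed. A minimal sketch of a safer conversion, assuming the standard function readMaybe from the Text.Read module, which evaluates to Nothing instead of failing. The helper parseIntOr is a hypothetical name introduced for illustration:

```haskell
import Text.Read (readMaybe)

-- Hypothetical helper: parse an Int, falling back to a default value
-- when the string is not a valid integer.
parseIntOr :: Int -> String -> Int
parseIntOr def s = case readMaybe s of
    Just n  -> n
    Nothing -> def

main :: IO ()
main = do
    print (readMaybe "778" :: Maybe Int)          -- Just 778
    print (readMaybe "24.7048" :: Maybe Double)   -- Just 24.7048
    print (readMaybe "abc" :: Maybe Int)          -- Nothing
    print (parseIntOr 0 "12x")                    -- 0
```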

Data Structures:
• Lists
[e1, e2, ..., en]
All elements of a list must be of the same type.
let a = [1, 2, 1, 4]
let b = 7 : 4 : 2 : 8 : []
let p = 7 : 4 : [2, 8, 7]
let q = [] : 7 : 4 : 2 : 8 : 6 : 1 -- ERROR
let c = 8 : 2 : [1, 2, 3]
let s = ["A4", "Galaxy", "abc", "777"]
-- Usage of range operator "..":
let b = [-4..4] -- [-4,-3,-2,-1,0,1,2,3,4]
-- List comprehension:
let c = [x*x | x <- [4..10]] -- [16,25,36,49,64,81,100]
length function gives the number of elements in a list: let alen = length [...]
let a = [1, 2, 1, 4, 8, 7]
let alen = length a -- alen will be 6
Indexing: Indexes start with 0. Index operator is !!.
let a = [1, 2, 1, 4, 8, 7]
let n = a !! 3 -- n will be 4
Appending two lists: ++ operator
let aa = [...] ++ [...]
Appending an element to a list: (1) create a list containing element, (2) append two lists.
let list1 = [...]
let n = 4
let list2 = list1 ++ [n] -- append to end
let list3 = [n] ++ list1 -- append to beginning
Functions can be used to append an element to a list:
    -- Append an element to the end of list:
    appendEnd v lst = lst ++ [v]
    -- Append an element to the beginning of list:
    appendBegin v lst = [v] ++ lst
    main = do
        let list1 = [2, 7, 8, 2, 5, 6]
        let n = 4
        let list2 = appendEnd n list1 -- append to end
        let list3 = appendBegin n list1 -- append to beginning
        print (list2, list3)



• Tuples
(e1, e2, ..., en)
The elements of a tuple may be of different types.
let t1 = (1, 2, 1, 4)
let t2 = ("A4", "Galaxy", 44, "777", 'T')
let t3 = (48, "Galaxy", 2.71828, '+', 4, 7)
mult3u (a, b, c) = a * b * c -- a tuple as parameter
The number of elements of a tuple cannot be obtained with the length function.
It is not meaningful to get the length of a tuple anyway.

Pattern Matching
Pattern matching applies to values. It is used to recognize the form of a value and lets the
computation be guided accordingly, associating with each pattern an expression to compute.
The same function written with if, with guards, and with case:
    sum1n n =
        if n == 0 then 0
        else n + sum1n (n-1)
    sum1nv2 n
        | n == 0 = 0
        | otherwise = n + sum1nv2 (n-1)
    sum1nv3 n = case n of
        0 -> 0
        _ -> n + sum1nv3 (n-1)
If the function has a single argument, then pattern matching can be done as follows, using logical
expressions (guards) after |. Logical expressions must cover all possible cases (must be exhaustive).
    sign x | x > 0 = 1
           | x == 0 = 0
           | x < 0 = -1
If the function has one or more arguments, then pattern matching can also be done using case:
    -- Using case:
    f123 x s = case x of
        1 -> "one " ++ s
        2 -> "two " ++ s
        3 -> "three " ++ s
        _ -> "other " ++ s
    -- Using guards:
    f123a x s
        | x == 1 = "one " ++ s
        | x == 2 = "two " ++ s
        | x == 3 = "three " ++ s
        | otherwise = "other " ++ s
    -- count using case:
    count lst v = case lst of
        [] -> 0
        x : xs -> count xs v + if x == v then 1 else 0
    -- Alternative definition of count using patterns and guards:
    count [] _ = 0
    count (x:xs) v
        | x == v = 1 + count xs v
        | otherwise = count xs v

Define a Haskell function isEmpty that gets a list as argument and evaluates to True if the given
list is empty, and to False if the list is not empty. Do not use if, use pattern matching.
    isEmpty lst = case lst of
        [] -> True
        _ : _ -> False
Examples:
    isEmpty []        -- True
    isEmpty ["xyz"]   -- False
    isEmpty [2..8]    -- False

Define a Haskell function equal1st2nd that gets a list as argument and evaluates to True if the
first two elements of the list are equal, and to False if they are not equal. If the list has fewer
than two elements, the function evaluates to False.
    equal1st2nd lst = case lst of
        [] -> False
        [_] -> False
        x : y : _ -> x == y
    main = do
        print (equal1st2nd [4]) -- False
        print (equal1st2nd [4, 2, 7, 2, 1, 8, 2, 5]) -- False
        print (equal1st2nd [4, 4]) -- True
        print (equal1st2nd [4, 4, 7, 2, 1, 8, 4, 5]) -- True
        print (equal1st2nd ["at", "at", "www", "in", "a", "the"]) -- True



Recursion
In Haskell, recursion is no harder than using loops in imperative languages, and is frequently easier.

Forward Recursion
In forward recursion, the function first calls itself recursively on all recursive components, and then
builds the final result from the partial results.
I.e.: Wait until the whole structure has been traversed (recursively) to start building the answer.
    sum1n n =
        if n == 0 then 0
        else n + sum1n(n-1)
    main = do
        let a = sum1n 3
        print(a)
Evaluation of sum1n 3:
    sum1n 3
    3 + sum1n 2
        2 + sum1n 1
            1 + sum1n 0
                0
            1 + 0
        2 + 1
    3 + 3
    6

Call Stacks
While a program runs, there is a call stack of function calls that have
started but not yet returned.
• Calling a function f pushes an instance of f on the stack (with the return point in the
program),
• When a call to f finishes, it is popped from the stack.
These stack-frames store information such as the value of local variables and "what is left to do"
in the function.
Due to recursion, multiple stack-frames may be calls to the same function.

naiveSumList lst =
if (lst == []) then 0
else (head lst) + naiveSumList (tail lst)

Tail Recursion
• A recursive function is tail-recursive if all recursive calls are the last thing that the function does.
• Tail recursion generally requires extra "accumulator" arguments to pass partial results.
• May require an auxiliary function!
• The general idea is to write your recursive function such that the value returned by the recursive
call is what's returned by your function, i.e., there's no pending operation in the function waiting
for the value returned by the recursive call.
• That way, the function can say, "Don't bother with me anymore, just take the answer from my
recursive call as the result. You can just forget all of my state information."



sumList lst = sumLoop lst 0
sumLoop lst acc = -- This is tail recursive!
if lst == [] then acc
else sumLoop (tail lst) (acc + (head lst))

Why do we care?
Reusing the stack frame of the tail-recursive function is known as tail call optimization.
It is an automatic optimization applied by many compilers and interpreters.

Experiment:
Write a function that takes a list, and returns the sum of the elements of this list.
sumList lst =
if lst == [] then 0
else (head lst) + sumList (tail lst)

main = do
let a = [1..100000000]
let suma = sumList a
print(suma)
ERRORS: (https://play.haskell.org/)
Main: Heap exhausted;
Main: Current maximum heap size ....
sumListTail lst acc = -- This is tail recursive!
if lst == [] then acc
else sumListTail (tail lst) (acc + (head lst))

main = do
let a = [1..100000000]
let suma = sumListTail a 0
print(suma)
Output:
5000000050000000
So, there really is a difference between tail vs. forward recursion.
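
Because Haskell is lazy, an accumulator such as acc + head lst may itself be kept as an unevaluated expression, and whether it is evaluated early can depend on compiler optimization settings. A minimal sketch that forces the accumulator at every step, assuming the standard functions seq and foldl' (from Data.List). The name sumStrict is hypothetical:

```haskell
import Data.List (foldl')

-- Tail-recursive sum whose accumulator is forced at every step with seq,
-- so no chain of unevaluated additions can build up.
sumStrict :: [Integer] -> Integer
sumStrict lst = go lst 0
  where
    go []       acc = acc
    go (x : xs) acc = let acc' = acc + x
                      in acc' `seq` go xs acc'

main :: IO ()
main = do
    print (sumStrict [1 .. 1000000])       -- 500000500000
    print (foldl' (+) 0 [1 .. 1000000])    -- 500000500000
```

foldl' packages the same accumulator pattern as a library function, so the auxiliary function does not have to be written by hand.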

How to write tail recursive functions?


To write tail-recursive functions, we need to answer the following question:
What information do I need to pass from the caller to the callee (i.e. from the lower stack frame
to the upper stack frame) so that I won't need the caller again, and can simply throw it away?
    sumListT lst acc =
        if lst == [] then acc
        else sumListT (tail lst) (acc + (head lst))
    main = do
        let a = [1..1000]
        let suma = sumListT a 0 -- Accumulator here
        print(suma) -- 500500

Better Programming of Tail Recursive Functions


Use an auxiliary function to hide the accumulator from the user:
sumList lst = sumListT lst 0 -- Call tail recursive with acc
sumListT lst acc = -- This is tail recursive!
if lst == [] then acc
else sumListT (tail lst) (acc + head lst)
main = do
let a = [1..1000]
let total = sumList a
print(total)



Creating New Data Types:
The Haskell keyword "data" is used to define new data types.
    -- data type for a 2-dimensional point:
    data Point = Point Double Double deriving (Show, Eq)
    --   [1]     [2]   [3]    [3]
[1]: Type constructor.
[2]: Data constructor.
[3]: Types of data elements.
deriving (Show, Eq): to print values of the type and to compare them for equality.
If a type has more than one data constructor, then they are separated by the | character.

Example: (see h_data_01.hs)


    -- Data type for weekdays:
    data Weekday = Monday | Tuesday | Wednesday | Thursday | Friday |
                   Saturday | Sunday deriving (Show, Eq)

    -- Data type for some geometric shapes:
    data Shape = Circle Double | Square Double |
                 Rectangle Double Double
                 deriving (Show, Eq)
    area (Circle r) = 3.14159 * r * r
    area (Square a) = a * a
    area (Rectangle a b) = a * b
    main = do
        let today = Tuesday
        putStrLn("Today is " ++ show today)
        let c1 = Circle 2 -- here data constructor is used
        let s1 = Square 4 -- here data constructor is used
        let r1 = Rectangle 8.2 4 -- here data constructor is used
        let c1area = area c1
        print c1area
        let s1area = area s1
        print s1area
        let r1area = area r1
        print r1area
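
Functions over such types are usually written with one equation per data constructor. A minimal self-contained sketch in the same style as area; the helpers isWeekend and perimeter are hypothetical names added only for illustration:

```haskell
data Weekday = Monday | Tuesday | Wednesday | Thursday | Friday |
               Saturday | Sunday deriving (Show, Eq)

-- Hypothetical helper: constructors without fields are matched by name.
isWeekend :: Weekday -> Bool
isWeekend Saturday = True
isWeekend Sunday   = True
isWeekend _        = False

data Shape = Circle Double | Square Double | Rectangle Double Double
    deriving (Show, Eq)

-- Hypothetical helper: each pattern binds the fields of one constructor.
perimeter :: Shape -> Double
perimeter (Circle r)      = 2 * pi * r
perimeter (Square a)      = 4 * a
perimeter (Rectangle a b) = 2 * (a + b)

main :: IO ()
main = do
    print (isWeekend Tuesday)                  -- False
    print (map isWeekend [Saturday, Sunday])   -- [True,True]
    print (perimeter (Square 3))               -- 12.0
```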

Node:
Node is a recursive constructor. For example, in a sequence type Seq a whose constructor is
declared as Node a (Seq a), Node receives a value of type a as its first parameter on construction.
The second parameter to Node is of type Seq a, which is the type that Node itself constructs,
thus Seq a becomes a recursive data structure.

Recursive Data Types:


Binary tree data structure: (see h_data_binary_tree_01.hs)
    data BTree a = Nil | Node (BTree a) a (BTree a)
        deriving (Show, Eq)
    main = do
        let t1 = Node (Node Nil 2 Nil) 4 (Node (Node Nil 6 Nil) 7 (Node Nil 8 Nil))
        print(t1)
The tree t1:
        4
       / \
      2   7
         / \
        6   8
Output:
    Node (Node Nil 2 Nil) 4 (Node (Node Nil 6 Nil) 7 (Node Nil 8 Nil))

Write a function treeSum that evaluates to the sum of all values in a binary tree.
    treeSum t = case t of
        Nil -> 0
        Node left v right -> v + treeSum left + treeSum right
Write a function mirror that mirrors a binary tree (reverses left and right).
    mirror t = case t of
        Nil -> Nil
        Node left v right -> Node (mirror right) v (mirror left)

Write a function contains that evaluates to True if a binary tree contains a given value.
    contains t x = case t of
        Nil -> False
        Node left v right ->
            if x == v then True else contains left x || contains right x
    main = do
        let t1 = Node (Node Nil 2 Nil) 4 (Node (Node Nil 6 Nil) 7 (Node Nil 8 Nil))
        print(contains t1 6) -- True
        print(contains t1 7) -- True
        print(contains t1 5) -- False
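
Other tree functions follow the same recursive pattern: one case for Nil and one for Node. A minimal sketch of two further functions on the same BTree type. The names toList and depth are hypothetical and not part of the course files:

```haskell
data BTree a = Nil | Node (BTree a) a (BTree a)
    deriving (Show, Eq)

-- In-order traversal: left subtree, node value, right subtree.
toList :: BTree a -> [a]
toList Nil                 = []
toList (Node left v right) = toList left ++ [v] ++ toList right

-- Number of levels in the tree (0 for the empty tree).
depth :: BTree a -> Int
depth Nil                 = 0
depth (Node left _ right) = 1 + max (depth left) (depth right)

main :: IO ()
main = do
    let t1 = Node (Node Nil 2 Nil) 4 (Node (Node Nil 6 Nil) 7 (Node Nil 8 Nil))
    print (toList t1)   -- [2,4,6,7,8]
    print (depth t1)    -- 3
```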

Print Functions:
Function   Description                          Example
putStrLn   Prints a string then a newline       putStrLn "Hello Galaxy!"
                                                putStrLn ("Hello " ++ "Galaxy!")
putStr     Prints a string, but not a newline   putStr "Hello Galaxy!"
putChar    Prints a character
print      Prints its argument                  print a     print(x, y, z)

While using these print functions, strings and non-string data can be printed together by
converting non-string data to string with "show" function.
Program:
    main = do
        putStrLn "Hello Galaxy!"
        print "Hello Galaxy!"
        let a = 2
        print a
        let b = "abba"
        let c = 2.71828
        print (a, b, c)
        putStr "Good work"
        print c
        putStrLn "========="
        putStrLn ("a=" ++ show a ++ ", " ++ b ++ ", e=" ++ show c)
Output:
    Hello Galaxy!
    "Hello Galaxy!"
    2
    (2,"abba",2.71828)
    Good work2.71828
    =========
    a=2, abba, e=2.71828

Input Functions
These functions read input from the standard input device (normally the user’s keyboard).
readLn :: Read a => IO a
getLine :: IO String
getChar :: IO Char

Example program that reads an integer, a double and a string:


    main :: IO ()
    main = do
        putStrLn "Enter an integer: "
        n <- readLn :: IO Int
        putStrLn ("You entered: " ++ show n)
        putStrLn ("Double: " ++ show (n*2))

        putStrLn "Enter a double: "
        x <- readLn :: IO Double
        putStrLn ("You entered: " ++ show x)
        putStrLn ("Double: " ++ show (x*2))

        putStrLn "Enter a line: "
        s <- getLine :: IO String
        putStrLn ("You entered: " ++ s)
        putStrLn ("Length: " ++ show (length s))




5. Imperative and Object-Oriented Programming Languages

5.1. Imperative Programming


Imperative programming languages are, to varying degrees, abstractions of the underlying von
Neumann computer architecture. The architecture’s two primary components are its memory, which
stores both instructions and data, and its processor, which provides operations for modifying the
contents of the memory. The abstractions in a language for the memory cells of the machine are
variables.
Unlike declarative programming, which describes "what" a program should accomplish, imperative
programming explicitly focuses on describing "how" a program operates step by step to accomplish it.
Programs written this way often compile to binary executables that run more efficiently since all CPU
instructions are themselves imperative statements.

Imperative Programming Paradigm


• The central features are:
  - variables which model the memory cells
  - assignment statements
  - iterative form of repetition
• Operands are piped from memory to the CPU and the result of evaluating the expression is
moved back to a memory cell.
• Iteration is fast because instructions are stored in adjacent cells of memory.

Major imperative programming languages: Fortran, Algol, Cobol, Pascal, C, Basic, Python, PHP, ...

Control flow structures


In computer science, control flow (or flow of control) is the order in which individual statements,
instructions or function calls of an imperative program are executed or evaluated.
The emphasis on explicit control flow distinguishes an imperative programming language from a
declarative programming language.
Go to (goto):     Fortran, C, C++, PHP
Selection
    if            All imperative languages
    switch        C, C++, Java, PHP
Loops (repetition statements)
    do            Fortran (similar to for loop)
    for           C, C++, Java, Python, PHP, etc.
    while         C, C++, Java, Python, PHP, Fortran 77 (as do while), etc.
    do - while    C, C++, Java, PHP, etc. (Python has no do - while statement)

History of imperative and object-oriented languages


The earliest imperative languages were the machine languages of the original computers. In these
languages, instructions were very simple, which made hardware implementation easier but hindered
the creation of complex programs.
FORTRAN, developed by John Backus at International Business Machines (IBM) starting in 1954, was
the first major programming language to remove the obstacles presented by machine code in the
creation of complex programs. FORTRAN was a compiled language that allowed named variables,
complex expressions, subprograms, and many other features now common in imperative languages.
Over the next two decades, many other major high-level imperative programming languages were
developed. In the late 1950s and 1960s, ALGOL was developed in order to allow mathematical
algorithms to be more easily expressed and even served as the operating system's target language for
some computers. COBOL (1960) and BASIC (1964) were both attempts to make programming syntax
look more like English.



In the 1970s, Pascal was developed by Niklaus Wirth, and C was created by Dennis Ritchie. Wirth
went on to design Modula-2 and Oberon. For the needs of the United States Department of Defense,
Jean Ichbiah and a team at Honeywell began designing Ada in 1978. The specification was first
published in 1983, with revisions in 1995, 2005, and 2012.
The 1980s saw a rapid growth in interest in object-oriented programming. These languages were
imperative in style, but added features to support objects. Smalltalk-80, originally conceived by Alan
Kay in 1969, was released in 1980, by the Xerox Palo Alto Research Center (PARC). Drawing from
concepts in another object-oriented language Simula (which is considered the world's first object-
oriented programming language, developed in the 1960s), Bjarne Stroustrup designed C++, an object-
oriented language based on C. Design of C++ began in 1979 and the first implementation was
completed in 1983. In the late 1980s and 1990s, the notable imperative languages drawing on
object-oriented concepts were Perl, released by Larry Wall in 1987; Python, released by Guido van
Rossum in 1991; PHP, released by Rasmus Lerdorf in 1995; Java, by James Gosling in 1995; JavaScript,
by Brendan Eich (Netscape), and Ruby, by Yukihiro "Matz" Matsumoto, both released in 1995.

Fortran IV program example: (https://en.wikibooks.org/wiki/Fortran/Fortran_examples)


C AREA OF A TRIANGLE - HERON'S FORMULA
C INPUT - CARD READER UNIT 5, INTEGER INPUT, ONE BLANK CARD FOR END-OF-DATA
C OUTPUT - LINE PRINTER UNIT 6, REAL OUTPUT
C INPUT ERROR DISPLAY ERROR MESSAGE ON OUTPUT
  501 FORMAT(3I5)
  601 FORMAT(4H A= ,I5,5H  B= ,I5,5H  C= ,I5,8H  AREA= ,F10.2,
     $13H SQUARE UNITS)
  602 FORMAT(10HNORMAL END)
  603 FORMAT(23HINPUT ERROR, ZERO VALUE)
      INTEGER A,B,C
   10 READ(5,501) A,B,C
      IF(A.EQ.0 .AND. B.EQ.0 .AND. C.EQ.0) GO TO 50
      IF(A.EQ.0 .OR. B.EQ.0 .OR. C.EQ.0) GO TO 90
      S = (A + B + C) / 2.0
      AREA = SQRT( S * (S - A) * (S - B) * (S - C) )
      WRITE(6,601) A,B,C,AREA
      GO TO 10
   50 WRITE(6,602)
      STOP
   90 WRITE(6,603)
      STOP
      END

Imperative vs Functional Programming


Imperative: the execution of a program is both evaluation of expressions and changing a state
(values of variables)
• "stateful" programming
• evaluation can have side effects e.g. x = x + 1
• enables efficient time/space consumption but is harder to reason about
Functional: the execution of a program is just evaluation of expressions without changing the
values of variables
• "stateless" programming e.g. fun x = x + 1
• no side effects
• pure if all constructs are strictly declarative i.e., no side-effects
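
The contrast can be sketched in Haskell itself, which also offers explicit mutable cells inside IO. This is a minimal illustration, assuming the standard modules Data.IORef and Control.Monad; the names sumImperative and sumFunctional are hypothetical:

```haskell
import Data.IORef (newIORef, modifyIORef', readIORef)
import Control.Monad (forM_)

-- Imperative style: a mutable cell is updated step by step (stateful).
sumImperative :: Int -> IO Int
sumImperative n = do
    total <- newIORef 0
    forM_ [1 .. n] $ \i ->
        modifyIORef' total (+ i)   -- corresponds to: total = total + i
    readIORef total

-- Functional style: the result is simply the value of an expression.
sumFunctional :: Int -> Int
sumFunctional n = sum [1 .. n]

main :: IO ()
main = do
    s <- sumImperative 100
    print s                     -- 5050
    print (sumFunctional 100)   -- 5050
```

The types show the difference: the stateful version lives in IO, while the stateless version is an ordinary function from Int to Int.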

Pure and Impure Functions


• A pure function is one without any side effects, whose result depends only on its arguments.
e.g. strlen with the same string always returns the same number.
• Impure functions: an impure function may have side effects or may return different results for the
same parameter(s).
e.g. malloc: if you call it with the same number, it won't return the same pointer to you, because it
relies on the internal state (objects allocated on the heap, the allocation method in use etc.)



Pure functions
There are various theoretical advantages of having pure functions. One advantage is that if a
function is pure, then if it is called several times with the same arguments, the compiler only
needs to actually call the function once.

Pure vs Impure functions:


f is pure.              g is impure.
def f(x):               def g(x):
    return x + 1            if isTuesday():
                                return x + 1
                            else:
                                return x

5.2. Object-Oriented Programming


Object-Oriented Programming (OOP):
• Based on the concept of "objects", which can contain:
  - Data: in the form of fields (often known as attributes or properties)
  - Code: in the form of procedures (often known as methods).
Methods are attached to the objects, and can access and modify the object's data fields.
There is usually a special name such as this or self used to refer to the current object.

OOP languages are diverse, but the most popular ones are class-based, meaning that objects are
instances of classes, which also determine their types. In these languages, computer programs are
designed by defining classes from which instances of objects are created.
Many of the OOP languages (such as C++, Java, Python, etc.) are multi-paradigm and they support
object-oriented programming to a greater or lesser degree, typically in combination with imperative
and procedural programming.
Significant object-oriented languages: Simula, Smalltalk, C++, Java, C#, JavaScript, Objective-C,
Object Pascal, Perl, PHP, Python, Visual Basic.NET.

Concepts of OOP Languages:


• Abstract data types (usually called classes)
• Objects that are class instances.
Objects store data in fields and behaviour in methods specified by their classes.
• Encapsulation
Encapsulation is hiding internals of an object from the user of the object.
• Inheritance
Inheritance is a core principle of object-oriented programming where a class can inherit
properties and behavior from another class. It enables customization of behaviour.
It promotes code reuse and hierarchical relationships among classes.
• Polymorphism
Polymorphism refers to the ability of a function or data structure to accommodate data of
different types.
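
Polymorphism is not limited to object-oriented languages. As a hedged sketch in Haskell (the language used for the examples in these notes), a function can work for every element type, and a type class can provide one implementation per type. The names count and Describable are hypothetical, and this does not show the subtype polymorphism of class hierarchies:

```haskell
-- Parametric polymorphism: one definition works for lists of any type.
count :: [a] -> Int
count []       = 0
count (_ : xs) = 1 + count xs

-- Ad hoc polymorphism: a type class with one implementation per type.
class Describable a where
    describe :: a -> String

instance Describable Bool where
    describe True  = "yes"
    describe False = "no"

instance Describable Int where
    describe n = "the number " ++ show n

main :: IO ()
main = do
    print (count "abc")               -- 3
    print (count [True, False])       -- 2
    putStrLn (describe True)          -- yes
    putStrLn (describe (42 :: Int))   -- the number 42
```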

Advantages of OO Programming:
• It reduces conceptual load:
It reduces the amount of detail the programmer must think about.
• It provides fault and change containment:
It limits the portion of a program that needs to be looked at when debugging.
It limits the portion of a program that needs to be changed when changing the behaviour of
an object without changing its interface.
• It provides independence of program components and thus facilitates code reuse.



Inheritance
• Using inheritance we can define a new derived class (subclass or child class) based on an
existing base class (parent class or superclass).
• The derived class is a subtype of its parent class, and has an is-a relationship with it. It:
  - Inherits all fields and methods of the superclass (except the private ones),
  - Can define additional fields and methods, and
  - Can override existing fields and methods.
• Purpose: Extend or specialize the behaviour of the superclass.
• Advantage: reuse the code / features from a previously defined class.

Single vs Multiple Inheritance:

• Inheritance allows programmers to define a class hierarchy.
• Single inheritance: if a new class is a subclass of a single parent.
• Multiple inheritance: if a class has more than one parent class.
• If only single inheritance is allowed, the hierarchy is a tree.
• If multiple inheritance is allowed, the hierarchy is an acyclic graph.
(Figure: class hierarchy in which push_button is derived from widget.)

Syntax of Inheritance:
• C++
class push_button : public widget { ... }
class DrawThread : public Thread, public Drawing { ... } // Multiple inheritance
• Java
public class push_button extends widget { ... }
• Python
class push_button(widget):
...

Dynamic Binding:

Visibility in C++:
Three visibility levels:
• Private methods/fields are visible to members of objects of the same class and to friends.
• Protected methods/fields are visible to members of objects of the same class or derived classes
and to friends.
• Public methods/fields are visible to the whole world.
Friends: A class can declare other classes and functions to be its friends, thereby providing them with
access to its private and protected members.



OOP Support:
• C++
  - Highly detailed control over access to class members (access controls within the class, during
    derivation, friend classes and functions)
  - Supports multiple inheritance
• Java
  - All objects are heap-dynamic and are referenced through reference variables.
  - Whereas C++ classes can be defined to have no parent, that is not possible in Java. All Java
    classes must be subclasses of Object or its descendants.
  - Garbage collection
  - Base classes are always public.
  - Protected members are visible in derived classes and in the same package.
  - No notion of friends.
  - Inheritance:
    o Only single inheritance is supported.
    o An abstract class category named interface provides some of the benefits of multiple
      inheritance.
    o An interface can include only method declarations and named constants; it cannot contain
      constructors or nonabstract methods (before Java 8, which added default methods). E.g.
          public interface Comparable <T> {
              public int compareTo (T b);
          }
    o A class doesn't inherit an interface, it implements it.
    o A class can implement any number of interfaces.



6. Logic Programming Languages
Logic programming is a programming paradigm which is largely based on
formal logic.
Logic Programming:
• Uses an abstract model, dealing with objects and their relationships.
• The syntax is basically that of logic formulae.
• It computes by deduction from the clauses.
• Logic and control can be separated.
Logic programming languages are frequently used in AI.

Any program written in a logic programming language is a set of sentences in logical form, expressing
facts and rules about some problem domain.
If H is head and B1, B2, B3, ... are the elements of the body, then a rule is written as
H :- B1, B2, …, Bn. which means: "H is true, when B1, B2, ..., Bn all are true".
The rules are written in the form of logical clauses, where head and body are present.
On the other hand, facts are like the rules, but without any body. So, an example of fact is:
H. which means "H is true".

6.1. Prolog
Prolog, which stands for PROgramming in LOGic, is a logical and declarative
programming language.
Prolog has its roots in first-order logic or first-order predicate calculus. The language
was conceived in Marseilles, France in the early 1970s by a group led by Alain Colmerauer.
A Prolog program consists of facts and rules (logical relationships) that describe the problem, rather
than instructions that compute how to find a solution.
A logical relationship describes the relationships which hold for the given application (or problem).

Online Prolog compilers: https://swish.swi-prolog.org/


https://www.tutorialspoint.com/execute_prolog_online.php

Basics of Prolog:
• Knowledge base: The knowledge base (or database) is a collection of facts and rules.
A Prolog program is a knowledge base.
• Facts: The fact is a predicate that is true. A fact declares something to be true.
E.g. if we say, "Jane is female" or “Jane is parent of Tom”, then these are facts.
female(jane). male(tom).
parent(jane, tom). likes(tom, pizza).
• Rules: A rule states conditions for something to be true. Rules are extensions of facts that
contain conditional clauses. To satisfy a rule all conditions should be met (true).
mother(X, Y) :- female(X), parent(X, Y).
grandmother(X, Y) :- female(X), parent(X, Z), parent(Z, Y).
This implies that for X to be the grandmother of Y, X should be female and X should be parent of
Z and Z should be a parent of Y.
• Questions (queries): To run a Prolog program, some questions are needed, and those
questions can be answered by the given facts and rules. The answer is yes / true or no / false.
?- grandmother(jane, ann). % is Jane grandmother of Ann?

Prolog programs answer questions:



Syntax and Elements of Prolog:
Prolog sentences must end with a period.
Comments:
The character % followed by any sequence of characters up to end of line.
The symbol "/*" followed by any sequence of characters (including new lines) up to "*/".
Symbols:
Prolog expressions are comprised of the following truth-functional symbols, which have the
same interpretation as in the predicate calculus.
English   Prolog      Predicate Calculus
if        :-          -->
and       ,           ^
or        ;           v
not       \+  not     ~

Variables:
A variable in Prolog is a string of letters, digits, and underscores (_) beginning either with a
capital letter or with an underscore. Examples: X , First_name , Z2, Jane, _ , _the , _k48.
The variable _ is called the anonymous variable, used as a "don't-care" variable, when the
value isn't important (anything). Every occurrence of _ represents a different variable.
Atoms: An atom is either:
o A string of characters beginning with a lower-case letter and made up of letters, digits,
and the underscore character. E.g.: jane, goodWork, c3p0, adana_kebap.
o An arbitrary sequence of characters enclosed in single quotes. E.g.: 'the', 'Hello World',
'2 + 2 = 4', ' ', '&^%&#@$ &* '.
o A string of special characters. Examples: @= ====> ; :- are all atoms. Some
atoms, such as ; and :- have a pre-defined meaning.
Functor: The word functor is used to refer to the atom at the start of a structure.
E.g.: In likes(mary, pizza) , likes is the functor.
In Prolog, predicates (functors) are not functions, and they do not
return values, they evaluate to either true or false. The values are
obtained from the parameters of functors.
Numbers: integers and floating point numbers.
Integers: …, -2, -1, 0, 1, 2, 3, …
Floating point numbers are not particularly important in typical Prolog applications.
Operators:
is: forces evaluation of arithmetic expressions. In contrast, = only unifies its two sides (symbolic
matching); it does not evaluate anything.
?- X is 2 + 4. Then: X = 6. ?- X = 2 + 4. Then: X = 2+4.
Arithmetic Operators: + - * / ** (power) // (int div) mod (modulus) sqrt max
?- A is (2 ** 8 - 12) mod 100. % Result: A = 44
Identicality Operators: == \== Eg: jane \== ann (true) 4 == 2+2. (false)
Comparison Operators (numeric): =:= =\= < =< > >= Eg: 4 =:= 2+2. (true)
Comparison Operators (string): @< @=< @> @>= Eg: 'aa' @> 'ZZZZ'. (true)
Logical Operators: (see Symbols above) , (and) ; (or) not
Precedence of some operators: (0 to 1200; 0 has highest precedence.)
200 xfx **
400 yfx * / // div mod rem
500 yfx + -
700 xfx < = =.. =@= \=@= =:= =< == =\= > >= @< @=< @> @>= \= \== is
1000 xfy ,
1100 xfy ;
(See: https://www.swi-prolog.org/pldoc/man?section=operators)



Prolog Built-in Predicates:
Predicate Description
var(X) succeeds if X is currently an un-instantiated variable.
nonvar(X) succeeds if X is not a variable, or is already instantiated.
atom(X) is true if X currently stands for an atom
number(X) is true if X currently stands for a number
integer(X) is true if X currently stands for an integer
float(X) is true if X currently stands for a real number.
atomic(X) is true if X currently stands for a number or an atom.
compound(X) is true if X currently stands for a structure.
ground(X) succeeds if X does not contain any un-instantiated variables.
write('...') Writes the string on screen.

Facts, Rules, Queries:


Facts: A fact declares something to be true.
relation(object1, object2...).
female(jane). % Jane is female.
parent(jane, jim). % Jane is parent of Jim.
likes(john, jane). % John likes Jane.
not(likes(john,meat)). % John does not like meat.
likes(X, susie). % Everyone likes Susie.
likes(ann, Y). % Ann likes everyone/everything.
Rules: A rule states conditions for something to be true.
rule_name(object1, object2, ...) :- fact/rule(object1, object2, ...).
rule_name(object1, object2, ...) :- fact/rule(object1, ...), fact/rule(object1, ...) ... .
is_bigger(X, Y) :- X > Y.
mother(X, Y) :- female(X), parent(X, Y).
X is mother of Y, if X is female and X is parent of Y.
grandfather(X, Y) :- father(X, Z), parent(Z, Y).
grandmother(X, Y) :- female(X), parent(X, Z), parent(Z, Y).
Two rules like:      P :- Q.
                     P :- R.
can be written as:   P :- Q ; R.
Two rules like:      P :- Q, R.
                     P :- S, T, U.
can be written as:   P :- (Q, R) ; (S, T, U).
Queries (Questions):
Queries are some questions on the relationships between objects and object properties.
Answering a query means proving that the goal represented by that query can be satisfied.
Queries are answered based on facts and rules. The answer is either yes/true or no/false.
To answer a query, Prolog executes this algorithm:
o If a goal matches with a fact, then it is satisfied.
o If a goal matches the head of a rule, then it is satisfied if the goal represented by the
rule's body is satisfied.
o If a goal consists of several subgoals separated by commas, then it is satisfied if all
its subgoals are satisfied.
o When satisfying a goal that uses a built-in predicate, Prolog also performs any action
associated with it (e.g. writing on the screen with the write predicate).
Queries start with: ?-
?- X is 12 ** 2. % Result: 144 (144.0 on systems where ** always yields a float)
?- X is 48 // 10. % Result: 4
?- X is 48 mod 10. % Result: 8
?- likes(john, banana). % Does John like banana?
?- likes(X, banana). % Who likes banana?



Example:
Consider the following argument that has two premises and a conclusion:
Socrates is a man.
All men are mortal. (If someone is man, then he is mortal.)
Hence, Socrates is mortal.
Translated to Prolog: man(socrates).
mortal(X) :- man(X).
Conclusion can be formulated as a query: ?- mortal(socrates).
Yes
Goal execution:
1. The query mortal(socrates) is made the initial goal.
2. Prolog looks for the first matching fact or head of rule, and finds mortal(X). It does
variable instantiation X = socrates.
3. This variable instantiation is extended to the rule's body, so man(X) becomes
man(socrates).
4. So, new goal is: man(socrates).
5. Success, because man(socrates) is a fact itself.
6. Therefore, the initial goal succeeds.

Backtracking:
If an attempt to satisfy a goal fails, or if we ask for more answers, Prolog tries to re-satisfy the
goal by backtracking, as follows:
1. Move back along the search path to where the unifying clause was found.
2. Un-instantiate any variables that have been instantiated during the search process.
3. Start another search for another unifying clause from where the current unifying clause was
found (the place marked in the database).

Recursion in Prolog:
Prolog rules can be recursive.
Example:
descendant_of(X, Y) :- child_of(X, Y). % Base case
descendant_of(X, Y) :- child_of(X, Z), descendant_of(Z, Y). % Recursive case
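Write the base case first: Prolog tries clauses from top to bottom, so a recursive clause placed
first may be tried needlessly or, in some programs, recurse forever.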

Simple Prolog Program Example:


A little Prolog program consisting of four facts:
% Facts:
bigger(elephant, horse).
bigger(horse, donkey).
bigger(donkey, dog).
bigger(donkey, monkey).
We can query this program:
?- bigger(donkey, dog). Answer: Yes
?- bigger(monkey, elephant). Answer: No
?- bigger(elephant, monkey). Answer: No
The last answer is not what we want.
Let's add the following two rules, to define a new predicate is_bigger as transitive closure of
bigger (via recursion):
% Rules:
is_bigger(X, Y) :- bigger(X, Y).
is_bigger(X, Y) :- bigger(X, Z), is_bigger(Z, Y).
Now queries:
?- is_bigger(elephant, monkey). Answer: Yes



Animals that are smaller than horse:
?- is_bigger(horse, X). Answer: X = donkey X = dog X = monkey
Animals that are smaller than horse, and bigger than donkey:
?- is_bigger(horse, X), is_bigger(X, donkey). Answer: None
(See bigger_01.pl)

Another Prolog Program Example: (see family_01.pl)


% Facts:
female(jane). female(mary). female(ann). female(pam).
male(paul). male(tom). male(peter). male(jim).
parent(jane,paul).
parent(tom,paul).
parent(tom,mary).
parent(paul,ann).
parent(paul,peter).
parent(peter,jim).
parent(pam,jim).
% Rules:
mother(X, Y) :- parent(X, Y), female(X).
father(X, Y) :- parent(X, Y), male(X).
haschild(X) :- parent(X, _).
sister(X, Y) :- parent(Z, X), parent(Z, Y), female(X), X \== Y.
brother(X, Y) :- parent(Z, X), parent(Z, Y), male(X), X \== Y.
grandmother(X, Y) :- female(X), parent(X, Z), parent(Z, Y).

Queries:
?- mother(jane, paul). ?- parent(paul, X).
?- mother(paul, ann). ?- sister(ann, X).
?- father(paul, ann). ?- sister(mary, X).
?- sister(ann, peter). ?- sister(X, peter).
?- brother(paul, mary). ?- grandmother(jane, X).
?- brother(tom, jane). ?- grandmother(X, peter).
?- grandmother(jane, ann). ?- grandmother(X, Y).
?- grandmother(tom, ann).

Trace: The "trace" command sets the debugger on, i.e. the intermediate steps are shown by the
Prolog compiler. To switch the debugger off, run "notrace" command.
?- trace.
The debugger will first creep -- showing everything (trace)
?- grandmother(jane, X).
1 1 Call: grandmother(jane,_23) ?        % press Enter at each "?" prompt
2 2 Call: female(jane) ?
2 2 Exit: female(jane) ?
3 2 Call: parent(jane,_116) ?
3 2 Exit: parent(jane,paul) ?
4 2 Call: parent(paul,_23) ?
4 2 Exit: parent(paul,ann) ?
1 1 Exit: grandmother(jane,ann) ?
X = ann ? a                              % a for all solutions, ; for next solution
1 1 Redo: grandmother(jane,ann) ?
4 2 Redo: parent(paul,ann) ?
4 2 Exit: parent(paul,peter) ?
1 1 Exit: grandmother(jane,peter) ?
X = peter
yes
{trace}



Exercises related to family relations: (Based on the above facts, write rules for ...)
• Write rule for child (X is child of Y). child(X, Y) :- ...
• Write rule for son (X is son of Y). son(X, Y) :- ...
• Write rule for daughter (X is daughter of Y). daughter(X, Y) :-
• Write rule for wife (X is wife of Y). wife(X, Y) :-
• Write rule for husband (X is husband of Y). husband(X, Y) :-
• Write rule for grandparent. grandparent(X, Y) :-
• Write rule for grandfather. grandfather(X, Y) :-
• Write rule for grandchild. grandchild(X, Y) :-
• Write rule for uncle. uncle(X, Y) :-
• Write rule for aunt. aunt(X, Y) :-
• Write rule for nephew (son of one's brother or sister). nephew(X, Y) :-
• Write rule for niece (daughter of one's brother or sister). niece(X, Y) :-

More Exercises:
• Write a Prolog program to compute factorial. ?- factorial(4, F). → F = 24

Lists:
Enclosed in square brackets, elements separated by comma (,).
[e1, e2, ... , en] % An n-element list in Prolog
[] % Empty list
[7, 4, p, 12, q, r, 2, 8]. % List may contain mixed types.
The head and tail of a list: The head is the first element and all the rest is tail.
The vertical bar (|) separates the head and tail parts.
?- [p,q,r,s] = [Head | Tail].
Head = p
Tail = [q,r,s]
[red, green, blue, purple, white] % can be written as:
[red | [green, blue, purple, white]]
[red | REST] % REST becomes [green, blue, purple, white]
Nested lists may be created: [a, b, [c, d, e], f, [g, h]]

List Unification: two lists unify iff they have the same structure and the corresponding elements unify.
?- [jane, likes, fish] = [P,Q,R].
P = jane Q = likes R = fish
?- [a, b, c] = [X | Y]. % X is a, and Y is [b,c]

Prolog Built-in List Predicates: (this is not the complete list)
Predicate             Description and example
length(L, N)          Number of elements of a list.
                      ?- length([2,4,6,7,8], N).                  % N = 5
member(X, L)          True if X is a member of list L.
                      ?- member(4, [2,4,8,6,7]).                  % true
last(L, X)            True if X is the last element of L.
                      ?- last([2,4,8,6,7], 4).                    % false
                      ?- last([2,4,8,6,7], X).                    % X = 7
nth0(N, L, X)         True if X is the N'th element of L (index starting from 0).
                      ?- nth0(3, [2,4,8,6,7], X).                 % X = 6
nth1(N, L, X)         True if X is the N'th element of L (index starting from 1).
                      ?- nth1(3, [2,4,8,6,7], X).                 % X = 8
append(L1, L2, L3)    Appends two lists to obtain a new list.
                      ?- append([7,4,8,2], [a,[b,8],4], L3).      % L3 = [7,4,8,2,a,[b,8],4]
reverse(L1, L2)       Reverses a list.
                      ?- reverse([2,4,7,8,4], L2).                % L2 = [4,8,7,4,2]
flatten(L1, L2)       Flattens a nested list.
                      ?- flatten([a, [b, c], d, e, [f, g, h]], L2).   % L2 = [a,b,c,d,e,f,g,h]
numlist(A, B, List)   Makes a list of the numbers from A to B (inclusive).
                      ?- numlist(4, 8, L).                        % L = [4, 5, 6, 7, 8]
sum_list(List, Sum)   Sum of all numbers in a list of numbers.
                      ?- sum_list([7,4,2,8,4], S).                % S = 25
max_list(List, Max)   Largest number in a list of numbers. Fails if the list is empty.
                      ?- max_list([7,4,2,8,4], M).                % M = 8
min_list(List, Min)   Smallest number in a list of numbers. Fails if the list is empty.
                      ?- min_list([7,4,2,8,4], M).                % M = 2
bagof(T, G, L)        Binds L to the list of all instances of term T satisfying the goal G.
                      stu(jane). stu(tom). stu(ann).
                      students(L) :- bagof(S, stu(S), L).         % L = [jane, tom, ann]
setof(T, G, L)        Binds L to the sorted list of all unique instances of term T satisfying
                      the goal G.
                      stu(jane). stu(tom). stu(ann).
                      students(L) :- setof(S, stu(S), L).         % L = [ann, jane, tom]
findall(T, G, L)      Similar to bagof, but it succeeds with L = [] when the goal has no solution
                      and it does not backtrack over free variables of the goal. Often used when
                      the goal is complex (with 2 or more predicates).
                      stuGradesW(Stu, Crs, GrWList) :-
                          findall((Gr, W), (exam(Crs, Exam, W), grade(Stu, Crs, Exam, Gr)),
                                  GrWList).

List Examples: (See "list_01.pl", "list_02.pl" and "list_03.pl" Prolog programs.)


• Write predicate list_length to get the length (number of elements) of a list:
list_length([],0).
list_length([_|TAIL],N) :- list_length(TAIL,NT), N is NT + 1.
Queries for this predicate:
?- list_length([8, 7, 1, 2, 3, 4], Len). % Len = 6
?- list_length([jane, joe, ann, bob], Len). % Len = 4

• Write predicate is_member to check for membership of a list:


is_member(X, [X|_]).
is_member(X, [_|REST]) :- is_member(X, REST).
In English: is_member(X,List) holds iff (a) X is the first element of List, or (b) X is a member of
the rest of List.
Queries for this predicate:
?- is_member(b, [a,b,c]). % yes
?- is_member(d, [a,b,c]). % no
?- is_member(b, [a,[b,c]]). % no
?- is_member([b, c], [a, [b, c]]). % yes

• Write predicate list_sum to get the sum of the numbers in a list:


list_sum([], 0). % the sum of the empty list is zero:
% sum of the list with head H and tail T is N, if the sum of the list T is M and N is M+H.
list_sum([H | T], N) :- list_sum(T, M), N is M+H.
A query for this predicate:
?- list_sum([10, 20.4, 30, 40], N). % N = 100.4

• Append a list L2 at the end of another list L1 and put the resultant list in L3.
% If L1 is empty, resultant list will be equal to L2 (base case).
append_list([], L2, L2).
append_list([H | T], L2, [H | L3]) :- append_list(T, L2, L3).
A query for this predicate:
?- append_list([1,2,3,4], [8,7,2], L). % Result: L = [1, 2, 3, 4, 8, 7, 2]
Another query for this predicate (all ways of splitting a list in two):
?- append_list(L1, L2, [a,b,c]).
L1 = []
L2 = [a,b,c] ? a          % a for all solutions, ; for next solution
L1 = [a]
L2 = [b,c]
L1 = [a,b]
L2 = [c]
L1 = [a,b,c]
L2 = []

• Write predicate getNth to get the N'th element of a list, assuming indexes start from 1.
getNth([H | _], 1, H). % Base case.
getNth([_ | T], N, X) :- N > 1, N1 is N - 1, getNth(T, N1, X).
Queries for this predicate:
?- getNth([2,4,6,7,8], 1, X). % X = 2
?- getNth([2,4,6,7,8], 4, X). % X = 7
?- getNth([2,4,6,7,8], 6, X). % no

• Print all elements of a list: ?-print_list([a,b,c]). Output: a b c


print_list([]) :- nl. % nl = newline
print_list([H|T]) :- write(H), write(' '), print_list(T).

Exercises:
• Write a Prolog program to find the last element of a list. ?- last_list([a,b,c,d], X). → X = d
• Write a Prolog program to duplicate all elements of a list. ?- dupli([a,b], L2). → L2 = [a,a,b,b]
• Write a Prolog program to check whether a list is a subset of another list. (Hint: Use the
member predicate.)
?- subset(L1, L2). will be true iff all elements of L1 exist in L2.
• Write a Prolog program to check whether two lists are disjoint (i.e. they have no common
elements). (Hint: Use the member predicate.)
?- disjoint(L1, L2). will be true iff there is no common element in L1 and L2.

Prolog exercises with solutions:


• https://www.ic.unicamp.br/~meidanis/courses/mc336/2009s2/prolog/problemas/
• https://athena.ecs.csus.edu/~mei/logicp/exercises.html



A. Regular Expressions

A regular expression (shortened as regex or regexp, sometimes referred to as rational expression) is a
sequence of characters that specifies a match pattern in text.
Regular expressions are used to specify tokens.
They can also be used for input validation, or by string-searching algorithms for "find" or "find and
replace" operations on strings.

Sets of strings that can be defined by regular expressions are called "regular languages".

Regular expressions: (special chars: | - . ? * + \ ^ [ ] ( ) $ / )


Abbrev.     Equivalent     Description or Meaning                            Examples
abc                        Search for substring "abc"                        "abc", "abcd", "ktabc", "xyabcdef"
|                          Alternation (or); used with grouping ()           s(e|i|alu)t --> set sit salut
(...)                      Grouping
[akt]       a|k|t          Set; matches one of "a", "k" or "t"               "a", "k", "t", "as", "te", "ok", "take"
[0-4]       0|1|2|3|4      Range; matches one of "0", "1", "2", "3" or "4"   [0-9] [A-Z] [A-Za-z] [A-Za-z0-9_]
[^...]                     Negated set; matches one character not in ...     [^akt] : any character except a, k, t
.                          Any character (usually except newline)            a.i --> ali, aai, a+i, a6i, a>i, a*i
?                          Zero or one (i.e. optional)                       fa?st --> fst fast
*                          Zero or more                                      fa*st --> fst fast faast faaast ...
+                          One or more                                       fa+st --> fast faast faaast ...
{n}                        Preceding item matched exactly n times            [A-Z]{2}[0-9]{4} --> TS8014
{min,max}                  Preceding item matched min to max times           [0-9]{1,3} --> 0 7 42 888 ...
{min,}                     Preceding item matched min or more times          [A-Z]{2,} --> IN FOUR LIMITED
{,max}                     Preceding item matched up to max times            v{,4} --> "" "v" "vv" "vvv" "vvvv"
\d          [0-9]          Any decimal digit [0123456789]                    \d \d\d \d* \d{4,6}
\D                         Any character that is not a decimal digit
\w          [0-9A-Za-z_]   Alphanumerics and underscore
\W                         Any "non-word" character
\s          [ \t\n\r\f]    Any whitespace character
\S                         Any character that is not a whitespace character
\b                         At word boundary (beginning or end of a word)     \ben or en\b or \ben\b
\t                         Tab
\n                         New line (line feed, LF)

More regular expression items in PHP:


Pattern     Description or Meaning     Examples
/.../i      Case insensitive           /at/i --> at At aT AT
^...        Starts with                ^ta --> ta tar targ target ...
...$        Ends with                  ali$ --> ali mali somali ...

Exercise:
Write a regular expression that defines the language of all decimal numbers like 3.14 -0 4722 +2.75 ...
But not numbers lacking integer part, and not numbers with a decimal point but lacking fractional part.
So, not numbers like 47. .274 . Leading and trailing zeros are allowed: 007 0.0 008.00 2.700



Example regular expression definitions for scanning:
DIGIT:         [0-9]
LETTER:        [A-Za-z]
ALPHANUMERIC:  [A-Za-z0-9]
IDENTIFIER:    (LETTER|_)(ALPHANUMERIC|_)*
PLUS:          '+'
MINUS:         '-'
MULT:          '*'
DIV:           '/'
OP:            ('+'|'-'|'*'|'/')
ASSIGN:        "="
LPARAN:        '('
RPARAN:        ')'
INTEGER:       MINUS? DIGIT+
NUMBER:        MINUS? (DIGIT+ | (DIGIT* '.' DIGIT+) | (DIGIT+ '.' DIGIT*))
NEWLINE:       '\n'
NONNEWLINE:    [^\n]
NONSTAR:       [^\*]
NONSTARORDIV:  [^\*/]
WHITESPACE:    [ \t\n\r]
COMMENT:       ("//" NONNEWLINE* NEWLINE)
             | ("/*" (NONSTAR | '*'+ NONSTARORDIV)* '*'+ '/')
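A token definition such as IDENTIFIER can also be checked by hand-written code. The following is a minimal sketch in OCaml (the language of Appendix B), not a real scanner; the helper names is_letter, is_digit and is_identifier are hypothetical and chosen only for this illustration.

```ocaml
(* Hand-written check for  IDENTIFIER: (LETTER|_)(ALPHANUMERIC|_)*
   All names below are illustrative, not part of any library. *)
let is_letter c = (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z');;
let is_digit c = c >= '0' && c <= '9';;

let is_identifier s =
  let n = String.length s in
  (* true if every character from position i onward is ALPHANUMERIC or '_' *)
  let rec rest_ok i =
    i >= n
    || ((is_letter s.[i] || is_digit s.[i] || s.[i] = '_') && rest_ok (i + 1))
  in
  (* first character must be LETTER or '_'; the empty string is rejected *)
  n > 0 && (is_letter s.[0] || s.[0] = '_') && rest_ok 1;;

let () =
  Printf.printf "%b %b %b\n"
    (is_identifier "_k48") (is_identifier "x") (is_identifier "2fast");;
(* prints: true true false *)
```

The first test mirrors (LETTER|_) and the recursive helper mirrors (ALPHANUMERIC|_)*, so the code follows the shape of the regular expression.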



B. OCaml
OCaml is a dialect of ML.
OCaml programs can be both interpreted and compiled.
The interpreter is a so-called REPL, a read-eval-print-loop. It reads what we type, evaluates our input,
prints the results on the screen, then waits for the user’s next input.

Basics:
Everything has a type and evaluates to a value.
In the toplevel, each OCaml definition or expression ends with a double semicolon ";;". Example: let e = 2.718281;;
Comments: (* .... *)
Documentation comments: (** .... *)

Toplevel:
The toplevel is like a calculator or command-line interface to OCaml.
The toplevel is handy for trying out small pieces of code without going to the trouble of launching
the OCaml compiler. The toplevel can be called REPL, which stands for read-eval-print-loop: it
reads programmer input, evaluates it, prints the result, and then repeats.
In a terminal window, type utop to start the toplevel. Press Control-D (or enter command
"#quit;;") to exit the toplevel.
Creating, compiling, and testing large programs will require more powerful tools.
Online OCaml toplevel: https://try.ocamlpro.com/
Online OCaml IDE: https://www.tutorialspoint.com/compile_ocaml_online.php
https://coderpad.io/languages/ocaml/
https://ocaml.org/play (difficult to clear output window)

Declaration: let
Declaration is used to "bind" a value to a name. The association of a name with a value is a
"binding". Declarations are made using the let keyword. After a declaration is made, the bound
name can be used when declaring other names and in subsequent expressions.
The word "name" is deliberately used, not "variable". This is because in OCaml, once bound, the
value of a name cannot be changed.
let n = 100;; (* val n : int = 100 *)
let pi = 3.1415926535;; (* val pi : float = 3.1415926535 *)
let city = "Antalya";; (* val city : string = "Antalya" *)
let em = '!';; (* val em : char = '!' *)

Data Types: int, float, char, string, bool


• int operators: + - * /
4        5 + 2        14 / 10
• float operators: +. -. *. /.
3.14159        1. +. 3.2        2.718281 *. 4.2        72. /. 0.0048
• char delimited by single quotes '
'A' '8' '!' '('
• string delimited by double quotes "; concatenation operator: ^
"Galaxy"        "Peace" ^ " in " ^ "Galaxy"
"Galaxy" ^ String.make 1 '!';; (* concat a char to string. "Galaxy!" *)
String.make 4 'A';; (* Make string by repeating char. "AAAA" *)
let s = "Peace in Galaxy!";; (* val s : string = "Peace in Galaxy!" *)
String.sub s 4 7;; (* substring from index 4, length 7. - : string = "e in Ga" *)
• bool true false
4 <> 5 (* - : bool = true *)
"good" = "very good" (* - : bool = false *)



Operators
int arithmetic operators: + - * / mod
float arithmetic operators: +. -. *. /.
string concatenation operator: ^
comparison operators: = <> > >= < <=
logical operators: && || not

let a = 2 + 4;; (* val a : int = 6 *)


let x = 4.2 *. 7.1;; (* val x : float = 29.82 *)
let city = "Ant" ^ "alya";; (* val city : string = "Antalya" *)
let p = (4 <> 7) && (8 > 6);; (* val p : bool = true *)

if
Syntax: if e1 then e2 else e3
OCaml if is similar to (e1 ? e2 : e3) in C. If e1 is true, then it evaluates to e2, otherwise to e3.
let a = 4;; (* val a : int = 4 *)
let b = a * 2 - 7;; (* val b : int = 1 *)
if a > b then 10 else 20;; (* - : int = 10 *)
if not(a > b) then 10 else 20;; (* - : int = 20 *)
let c = if a > b then 10 else 20;; (* val c : int = 10 *)
let d = if a >= b then a else b;; (* val d : int = 4 *)

Type Conversions
Conversion         Function           Example                           Toplevel output
float to int       int_of_float       let x = 8.7;;                     val x : float = 8.7
                                      let n = int_of_float x;;          val n : int = 8
int to float       float_of_int       let k = -7;;                      val k : int = -7
                                      let x = float_of_int k;;          val x : float = -7.
string to int      int_of_string      let z = "-124";;                  val z : string = "-124"
                                      let n = int_of_string z;;         val n : int = -124
string to float    float_of_string    let spi = "3.14159";;             val spi : string = "3.14159"
                                      let pi = float_of_string spi;;    val pi : float = 3.14159
int to string      string_of_int      let n = 360;;                     val n : int = 360
                                      let sn = string_of_int n;;        val sn : string = "360"
float to string    string_of_float    let e = -2.718281;;               val e : float = -2.718281
                                      let se = string_of_float e;;      val se : string = "-2.718281"
char to string     String.make 1 ch   let c = 'E';;                     val c : char = 'E'
                                      let sc = String.make 1 c;;        val sc : string = "E"

Scope
Variable declarations in OCaml bind variables within a scope, the part of the program where the
variable stands for the value it is bound to. For example, when we write let x = e1 in e2,
the scope of the identifier x is the expression e2. Within that scope, the identifier x stands for
whatever value v the expression e1 evaluated to. Since x = v, OCaml evaluates the let
expression by rewriting it to e2, but with the value v substituted for the occurrences of x. For
example, the expression let x = 2 in x + 3 is evaluated to 2 + 3, and then the result value is 5.

let x = 4 + 3 in x * 2;; (* - : int = 14 x is undefined outside*)


x;; (* Error: Unbound value x *)
let x = 7;; (* val x : int = 7 x is defined; *)
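The scoping rule can be illustrated with a few more bindings. This is a small sketch; the names y, z and nested are chosen only for this example.

```ocaml
let x = 7;;                      (* top-level binding *)
let y = let x = 2 in x + 3;;     (* the inner x = 2 is visible only after "in" *)
let z = x + 1;;                  (* uses the top-level x, which is still 7 *)

let nested =
  let a = 4 in
  let b = a * 2 in               (* a is in scope here *)
  a + b;;                        (* a and b are both unknown outside this expression *)

let () = Printf.printf "%d %d %d\n" y z nested;;   (* prints: 5 8 12 *)
```

The inner binding of x does not change the top-level x; it only hides it inside the expression after "in".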

Tuples
A tuple is a sequence of values that may be of different types, separated by commas.
A tuple may be enclosed in parentheses.
4, 8, 7;; (* - : int * int * int = (4, 8, 7) *)
(4, 8, 7);; (* - : int * int * int = (4, 8, 7) *)
4, 7.2, "good";; (* - : int * float * string = (4, 7.2, "good") *)
(4, 7.2, "good");; (* - : int * float * string = (4, 7.2, "good") *)
(4, 2.8), "nice";; (* - : (int * float) * string = ((4, 2.8), "nice") *)
((4, 2.8), "nice");; (* - : (int * float) * string = ((4, 2.8), "nice") *)

let tp2 = (4, 7);; (* Define a name for tuple *)


let tp3 = (4, "Hi", 2.71828);; (* Define a name for tuple *)
(* Decompose tuple to names *)
let (a, b, c) = tp3;; (* val a : int = 4 val b : string = "Hi" val c : float = 2.71828 *)

Define tuple type:


type time = int * int * int;;
let t : time = (10, 42, 7);;
type student = int * string * string;;
let st1 : student = (202001942, "Jane Fonda", "CS");;
let st2 : student = (202001884, "Albert Einstein", "CS");;

Tuples can be nested:


let d = ((1, "wr", 7.7), (9.6, 7), "w");;
d;; (* - : (int * string * float) * (float * int) * string = .... *)
let (p, (_, s), _) = d;; (* _ matches anything *)
p;; (* - : int * string * float = (1, "wr", 7.7) *)
s;; (* - : int = 7 *)
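Tuple patterns can also be used directly in function arguments. The sketch below assumes that the three components of the time type stand for hours, minutes and seconds (the notes do not say so); to_seconds and of_seconds are hypothetical helper names.

```ocaml
type time = int * int * int;;    (* assumed here: hours, minutes, seconds *)

(* Decompose the tuple in the argument position. *)
let to_seconds ((h, m, s) : time) = h * 3600 + m * 60 + s;;

(* Build a tuple as the result. *)
let of_seconds n : time = (n / 3600, (n mod 3600) / 60, n mod 60);;

let t : time = (10, 42, 7);;
let () = Printf.printf "%d\n" (to_seconds t);;   (* 10*3600 + 42*60 + 7 = 38527 *)
```

Applying of_seconds to the result of to_seconds t gives back (10, 42, 7).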

Functions
A function definition has the following form:
let f x = .... where f is the function name and x is its argument.
let f x y z = .... where f is the function name and x, y and z are arguments.

Primitive operators are in fact functions:


(+);; - : int -> int -> int = <fun>
(+.);; - : float -> float -> float = <fun>
(-);; - : int -> int -> int = <fun>
( * );; - : int -> int -> int = <fun>
( *. );; - : float -> float -> float = <fun>
(^);; - : string -> string -> string = <fun>
(<);; - : 'a -> 'a -> bool = <fun>

Nameless (or anonymous) functions:


When an anonymous function is defined, the function has no name and keyword fun is used.
(fun n -> n * 2) 8;; (* - : int = 16 *)
(fun x -> x *. 2.) 7.2;; (* - : float = 14.4 *)

Currying:
When a function has multiple arguments, the function consumes one argument at a time. This is
called currying the function.
let mult a b = a * b;;
val mult : int -> int -> int = <fun>
let mult a = (fun b -> a * b);;
val mult : int -> int -> int = <fun>
mult 4;;
- : int -> int = <fun>
(mult 4) 3;;
- : int = 12
mult 4 3;;
- : int = 12
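One practical use of currying is partial application: supplying only the first argument yields a new function. A short sketch, where the names double and triple are chosen only for illustration:

```ocaml
let mult a b = a * b;;

let double = mult 2;;    (* partial application: double has type int -> int *)
let triple = mult 3;;

let () =
  Printf.printf "%d %d %d\n" (double 21) (triple 5) (double (triple 7));;
(* prints: 42 15 42 *)
```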

Curried vs Uncurried:
let addThree a b c = a + b + c;; (* Curried *)
val addThree : int -> int -> int -> int = <fun>
addThree 4 7 5;;
- : int = 16

let addTriple (a, b, c) = a + b + c;; (* Uncurried *)


val addTriple : int * int * int -> int = <fun>
addTriple (4, 7, 5);;
- : int = 16

Functions as Arguments:
Functions can be arguments of other functions.
let add2 n = n + 2;;
val add2 : int -> int = <fun>
let thrice f x = f(f(f(x)));;
val thrice : ('a -> 'a) -> 'a -> 'a = <fun>
let thrice f x = f(f(f x));; (* same function, written with fewer parentheses *)
thrice add2 4;;
- : int = 10 (* What is 'a in this case? *)
thrice (fun s -> "Hi! " ^ s) "Jane";;
- : string = "Hi! Hi! Hi! Jane" (* What is 'a in this case? *)

Functions Returning Functions:


let div2 n = n / 2;;
let mul3add1 n = n * 3 + 1;;
let func1 b = if b then div2 else mul3add1;;
let g = func1 true;; (* func1 returns div2 *)
g 88;; (* here div2 is used *)
let h = func1 false;; (* func1 returns mul3add1 *)
h 77;; (* here mul3add1 is used *)
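A function can also build and return a new function from its arguments. The sketch below repeats the three definitions above so that it can run on its own; step and compose are hypothetical names introduced for this example.

```ocaml
let div2 n = n / 2;;
let mul3add1 n = n * 3 + 1;;
let func1 b = if b then div2 else mul3add1;;

(* Choose the function by testing n, then apply it to n. *)
let step n = (func1 (n mod 2 = 0)) n;;

(* compose takes two functions and returns a new function. *)
let compose f g = fun x -> f (g x);;
let step2 = compose step step;;   (* applies step twice *)

let () = Printf.printf "%d %d %d\n" (step 88) (step 77) (step2 77);;
(* prints: 44 232 116 *)
```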

Function Exercises:
Define an OCaml function halfF that gets an int and evaluates to the half of its argument (as a float).
let halfF n = float_of_int n /. 2.;;
val halfF : int -> float = <fun>
halfF 7;;
- : float = 3.5

Define an OCaml function squareF that gets a float and evaluates to the square of its
argument.
let squareF x = x *. x;;
val squareF : float -> float = <fun>
squareF 2.5;;
- : float = 6.25

Define an OCaml function max that gets two arguments (of any type), and evaluates to the
maximum of its arguments.
let max x y = if x >= y then x else y;;
val max : 'a -> 'a -> 'a = <fun>
max 4 7;;
- : int = 7
max 44.8 24.8;;
- : float = 44.8
max "Tale" "abcd";;
- : string = "abcd" (* Because 'A' ... 'Z' < 'a' ... 'z' *)
Remark: OCaml already has built-in functions max and min that get two arguments and
evaluate to the maximum and minimum of their two arguments.

Define an OCaml function revpair that takes a pair (tuple of two elements) and returns its
reverse.
let revpair p =
let (a, b) = p
in (b, a);;
val revpair : 'a * 'b -> 'b * 'a = <fun>
revpair (4, 7);; (* - : int * int = (7, 4) *)
revpair ("Hi", 7.4);; (* - : float * string = (7.4, "Hi") *)

Lists
An OCaml list is an ordered sequence of values all of which have the same type. They are
implemented as singly-linked lists.
[]                              Empty list.
[e1; e2; ...; en]               List elements are written in [] and separated by semicolons.
e1 :: e2 :: ... :: en :: []     e1, e2, ..., en are elements.
e1 :: e2 or h :: t              e1 is the head element and e2 is the tail (the rest).
                                Usually used in pattern matching.

Length of list: List.length [8;7;4;2];; (* - : int = 4 *)


Append to beginning: 7 :: [1;2;3];; (* - : int list = [7; 1; 2; 3] *)
Append to end: [1;2;3] @ [7];; (* - : int list = [1; 2; 3; 7] *)
Append two lists: [7;8;9] @ [1;2;3];; (* - : int list = [7; 8; 9; 1; 2; 3] *)
List type:
All elements of a list must have the same type. If element type is t, then the type of the list is
t list. It should be read from right to left: t list is read as a list of t's. "t list list" is a list of lists of t's.
In the rule for [ ], recall that 'a is a type variable: it stands for an unknown type.
[] : 'a list List of alpha types ('a is read alpha)
[4; 7; 2; 8; 5; 4; 6];; (* - : int list = [4; 7; 2; 8; 5; 4; 6] *)
[4.; 7.2; 8.5; 4.68];; (* - : float list = [4.; 7.2; 8.5; 4.68] *)
['T'; '('; '8'; ' '; '+'; 'e'];; (* - : char list = ['T'; '('; '8'; ' '; '+'; 'e'] *)
["peace"; "in"; "galaxy"; "!"];; (* - : string list = ["peace"; "in"; "galaxy"; "!"] *)
7 :: [4; 2; 8];; (* - : int list = [7; 4; 2; 8] *)
24 :: 40 :: 7 :: 12 :: [];; (* - : int list = [24; 40; 7; 12] *)
24 :: 40 :: 7 :: 12 :: [8; 4];; (* - : int list = [24; 40; 7; 12; 8; 4] *)
[4; 2; 8] :: 7;; (* Error: ........ expected of type int list list *)
[2, 7];; (* - : (int * int) list = [(2, 7)] *)
[(2, 7); (8, 4)];; (* - : (int * int) list = [(2, 7); (8, 4)] *)
[(2, 7); (8, 4, 6)];; (* Error: This expression has type 'a * 'b * 'c but .... expected type int * int *)
[("Jane", 84); ("Ali", 76)];; (* - : (string * int) list = [("Jane", 84); ("Ali", 76)] *)
[[4; 7]; [2; 8; 5]; []; [6; 3; 7; 2]];; (* - : int list list = [[4; 7]; [2; 8; 5]; []; [6; 3; 7; 2]] *)
[[4; 7]; [2.4; 8.]; [6; 3; 7; 2]];; Error: This expression has type float but expected of type int
[("Jane", 1936, 168); ("Peter", 1982, 194)];; (* - : (string * int * int) list = [(....); (....)] *)
[("Jane", 1936, 168); ("Peter", 1982, 194); (1974, "Alpha", 88)];;
Error: This expression has type int but an expression was expected of type string
List.of_seq (String.to_seq "Dune Trilogy");; (* Create list from chars of string *)
- : char list = ['D'; 'u'; 'n'; 'e'; ' '; 'T'; 'r'; 'i'; 'l'; 'o'; 'g'; 'y']



Pattern Matching
Pattern matching applies to values. It is used to recognize the form of this value and lets the
computation be guided accordingly, associating with each pattern an expression to compute.
General structure:
match something with
| pattern1 -> result1
| pattern2 -> action; result2 (* after action, result2 *)
| pattern3 -> result3

Define an OCaml function isEmpty that gets a list as argument and evaluates to true if the given
list is empty, and to false if the list is not empty.
let isEmpty lst =
match lst with
| [] -> true
| _ :: _ -> false;;
val isEmpty : 'a list -> bool = <fun>
(* or: *)
let isEmpty = function
| [] -> true
| _ :: _ -> false;;
val isEmpty : 'a list -> bool = <fun>

isEmpty [];;
- : bool = true
isEmpty [7];;
- : bool = false
isEmpty [7; 4; 8; 2];;
- : bool = false

Define an OCaml function equal1st2nd that gets a list as argument and evaluates to true if the
first two elements of the list are equal, and false if they are not equal. If the list has fewer than
two elements, the function will evaluate to false.
let equal1st2nd lst =
match lst with
| [] -> false
| [x] -> false
| x :: y :: _ -> if x = y then true else false;;
val equal1st2nd : 'a list -> bool = <fun>

equal1st2nd [];;
- : bool = false
equal1st2nd [7];;
- : bool = false
equal1st2nd ["a"; "the"; "a"];;
- : bool = false
equal1st2nd [2.4; 2.4; 4.; 7.8];;
- : bool = true

(* Solution-2: *)
let equal1st2nd lst =
match lst with
| x :: y :: _ -> if x = y then true else false
| _ -> false;;
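Patterns can also distinguish lists by their number of elements, or pick out the head of a list. The two functions below are a small sketch; describe and first_or are hypothetical names used only here.

```ocaml
(* Classify a list by how many elements it has. *)
let describe lst =
  match lst with
  | [] -> "empty"
  | [_] -> "one element"
  | [_; _] -> "two elements"
  | _ :: _ :: _ :: _ -> "three or more elements";;

(* First element of the list, or the given default if the list is empty. *)
let first_or default lst =
  match lst with
  | [] -> default
  | x :: _ -> x;;

let () = print_endline (describe [7; 4; 8; 2]);;        (* three or more elements *)
let () = Printf.printf "%d\n" (first_or 0 [7; 4; 8]);;  (* 7 *)
```

The four patterns in describe cover every possible list, so the compiler reports no missing case.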



Recursion
In OCaml, recursion is no harder than using a loop.

Forward Recursion
In forward recursion, the function first calls itself on all recursive components, and then
builds the final result from the partial results.
I.e.: It waits until the whole structure has been traversed (recursively) before it starts building the answer.
let rec sum n = if n = 0 then 0 else n + sum (n-1);;
sum 3;;
Evaluation trace:
sum 3
3 + sum 2
    2 + sum 1
        1 + sum 0
            0
        1 + 0
    2 + 1
3 + 3
6

Call Stacks
While a program runs, there is a call stack of function calls that have started but not yet returned.
• Calling a function f pushes an instance of f on the stack (with the return point in the program),
• When a call to f finishes, it is popped from the stack.
These stack-frames store information such as the value of local variables and "what is left to do"
in the function.
Due to recursion, multiple stack-frames may be calls to the same function.

Tail Recursion
• A recursive function is tail-recursive if all recursive calls are the last
thing that the function does.
• Tail recursion generally requires extra "accumulator" arguments to
pass partial results.
• May require an auxiliary function!
• The general idea is to write your recursive function such that the
value returned by the recursive call is what’s returned by your
function
• i.e., there’s no pending operation in the function waiting for
the value returned by the recursive call.
• That way, the function can say, “Don’t bother with me anymore, just
take the answer from my recursive call as the result. You can just
forget all of my state information.”
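
As a minimal sketch of the points above, the forward-recursive sum function shown earlier can be rewritten with an accumulator. The names sumT and aux are illustrative and are not taken from these notes.

```ocaml
(* Tail-recursive sum of 1..n: the partial result travels in acc,
   so nothing is left to do after the recursive call returns. *)
let sumT n =
  let rec aux m acc =
    if m = 0 then acc
    else aux (m - 1) (acc + m)   (* the recursive call is the last action *)
  in
  aux n 0;;
```

Because no addition is pending when aux calls itself, each call can reuse the frame of its caller, and sumT should handle large n without growing the stack.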



Why do we care?
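Reusing the stack frame of the caller for a tail call is known as tail call optimization.
The OCaml compiler and toplevel apply it automatically, as do many (but not all) other language
implementations. With it, a tail-recursive function runs in constant stack space, however deep the
recursion goes.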

Experiment:
Write a function that takes a value x and an integer n, and returns a list of length n whose
elements are all x.
let rec makeList x n =
if n = 0 then []
else x :: makeList x (n-1);;
val makeList : 'a -> int -> 'a list = <fun>
makeList "hi" 5;;
- : string list = ["hi"; "hi"; "hi"; "hi"; "hi"]
makeList 7 4;;
- : int list = [7; 7; 7; 7]
makeList 7 1234567;;
Stack overflow during evaluation (looping recursion?).

let rec makeListT x n acc =
if n = 0 then acc
else makeListT x (n-1) (x::acc);;
val makeListT : 'a -> int -> 'a list -> 'a list = <fun>
makeListT "hi" 5 [];;
- : string list = ["hi"; "hi"; "hi"; "hi"; "hi"]
makeListT 7 4 [];;
- : int list = [7; 7; 7; 7]
makeListT 7 1234567 [];;
- : int list =
[7; 7; 7; 7; 7; 7; 7; 7; 7; 7; 7; 7; 7; 7; 7; 7; 7; 7; 7; 7; 7; 7;
7; 7; 7; 7; 7; 7; 7; 7; 7; 7; 7; 7; 7; 7; 7; 7; 7; 7; 7; 7; 7; 7;
7; 7; 7; 7; 7; 7; 7; 7; 7; 7; 7; 7; 7; 7; 7; 7; 7; 7; 7; 7; 7; 7;
7; 7; 7; 7; 7; 7; 7; 7; 7; 7; 7; 7; 7; 7; 7; 7; 7; 7; 7; 7; 7; 7;
...]
So, there really is a difference between tail recursion and forward recursion.

How to write tail recursive functions?


To write tail-recursive functions, we need to answer the following question:
What information do I need to pass from the caller to the callee (i.e. from the lower stack frame
to the upper stack frame) so that I won't need the caller again, and can simply throw it away?
let rec squareF lst =
match lst with
| [] -> []
| h :: t -> (h*h) :: squareF t;;
val squareF : int list -> int list = <fun>

squareF [1;2;3;4];;
- : int list = [1; 4; 9; 16]

let rec squareT lst acc =
match lst with
| [] -> acc
| h :: t -> squareT t (acc @ [h*h]);;
val squareT : int list -> int list -> int list = <fun>

squareT [1;2;3;4] [];;
- : int list = [1; 4; 9; 16]
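
Note that acc @ [h*h] copies the whole accumulator at every step, so squareT takes time quadratic in the length of the list. A common alternative, sketched here under the hypothetical name squareT2, adds each square to the front of the accumulator and reverses the result once at the end:

```ocaml
(* Tail-recursive squaring in linear time.
   squareT2 and aux are illustrative names, not part of the notes. *)
let squareT2 lst =
  let rec aux l acc =
    match l with
    | [] -> List.rev acc                 (* restore the original order once *)
    | h :: t -> aux t ((h * h) :: acc)   (* constant-time cons, tail call *)
  in
  aux lst [];;
```

The accumulator is built in reverse order, which is why the single List.rev is needed in the base case.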



Convert the following functions to tail-recursive form.

(* Forward recursive: *)
let rec factorial n =
  if n = 0 then 1
  else n * factorial (n - 1);;
(* Tail recursive: *)
let rec factorial n acc =
  if n = 0 then acc
  else factorial (n - 1) (n * acc);;
val factorial : int -> int -> int = <fun>
factorial 4 1;; (* - : int = 24 *)

(* Forward recursive: *)
let rec power x n =
  if n = 0 then 1
  else x * power x (n - 1);;
(* Tail recursive: *)
let rec power x n acc =
  if n = 0 then acc
  else power x (n - 1) (x * acc);;
val power : int -> int -> int -> int = <fun>
power 2 10 1;; (* - : int = 1024 *)

(* Forward recursive: *)
let rec fib n =
  if n = 0 then 1
  else if n = 1 then 1
  else fib (n - 1) + fib (n - 2);;
(* Tail recursive: *)
let rec fib n nm1 nm2 =
  if n = 0 then nm2
  else if n = 1 then nm1
  else fib (n - 1) (nm1 + nm2) nm1;;
val fib : int -> int -> int -> int = <fun>
fib 0 1 1;; (* - : int = 1 *)
fib 1 1 1;; (* - : int = 1 *)
fib 2 1 1;; (* - : int = 2 *)
fib 3 1 1;; (* - : int = 3 *)
fib 8 1 1;; (* - : int = 34 *)

Better Programming of Tail Recursive Functions


Use an auxiliary function to hide the accumulator from the user:
let factorial n =
  let rec fact m acc =
    if m = 0 then acc
    else fact (m - 1) (m * acc)
  in fact n 1;;
val factorial : int -> int = <fun>

let fib n =
  let rec fibo n nm1 nm2 =
    if n = 0 then nm2
    else if n = 1 then nm1
    else fibo (n - 1) (nm1 + nm2) nm1
  in fibo n 1 1;;
val fib : int -> int = <fun>

Recursive Functions With Pattern Matching


A recursive function is a function that calls itself.
In OCaml, the rec keyword is used to define a recursive function.

Define an OCaml function factorial to compute the factorial of an int number.
let rec factorial n =
match n with
| 0 -> 1
| m -> m * factorial (m - 1);;
val factorial : int -> int = <fun>

factorial 0;;
- : int = 1
factorial 10;;
- : int = 3628800

Define an OCaml function printMult that gets an integer n and a string, and prints the string
n times on separate lines.
let rec printMult n s =
match n with
| 0 -> ()
| _ -> print_endline s; printMult (n-1) s;;
val printMult : int -> string -> unit = <fun>

printMult 4 "Peace in Galaxy!";;
Peace in Galaxy!
Peace in Galaxy!
Peace in Galaxy!
Peace in Galaxy!
- : unit = ()

Recursive Functions With Pattern Matching Related With Lists


The list is matched with the possible patterns. Usually the first pattern is [] (empty list).
Function to compute the sum of the elements of an int list:
let rec sumList lst =
match lst with
| [] -> 0
| h :: t -> h + sumList t;;
val sumList : int list -> int = <fun>

sumList [];;
- : int = 0
sumList [7; 1; 4];;
- : int = 12

Folding Left and Folding Right


fold_left is a very important function that is used frequently.
fold_left f a [x1; x2; ...; xn] evaluates to
f (... (f (f a x1) x2) ...) xn
So, f takes two arguments: (1) the accumulated value over the list from the left, (2) the current
element of the list. a is the initial value of accumulation which is also the result if the list is empty.
let rec fold_left f a lst =
match lst with
| [] -> a
| h :: t -> fold_left f (f a h) t;;
val fold_left : ('a -> 'b -> 'a) -> 'a -> 'b list -> 'a = <fun>

fold_left ( * ) 1 [2; 3; 7];;
- : int = 42

fold_right is another very important function that is used frequently.
fold_right f [x1; x2; ...; xn] a evaluates to
f x1 (f x2 (...(f xn a)...))
So, f takes two arguments: (1) the current element of the list, (2) the accumulated value over the
list from the right. a is the initial value of accumulation which is also the result if the list is empty.
let rec fold_right f lst a =
match lst with
| [] -> a
| h :: t -> f h (fold_right f t a);;
val fold_right : ('a -> 'b -> 'b) -> 'a list -> 'b -> 'b = <fun>

fold_right (+) [4; 7; 8] 0;;
- : int = 19
fold_right ( * ) [2; 3; 7] 1;;
- : int = 42

fold_left and fold_right are already defined in the List library!
List.fold_left;;
- : ('a -> 'b -> 'a) -> 'a -> 'b list -> 'a = <fun>
List.fold_right;;
- : ('a -> 'b -> 'b) -> 'a list -> 'b -> 'b = <fun>

let lst = [1;0;9;0;7;4;0];;
List.fold_right (fun a b -> if a=0 then b else a::b) lst [];; (* drop the zeros *)
- : int list = [1; 9; 7; 4]
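
The two folds give the same result for operators such as + and *, so the difference in grouping is easier to see with subtraction. The following sketch uses the hypothetical names fromLeft and fromRight, and also shows the earlier sumList written as a fold:

```ocaml
(* Sum of a list written with fold_left; 0 is the initial accumulator. *)
let sumList lst = List.fold_left (+) 0 lst;;

(* fold_left groups from the left: ((10 - 1) - 2) - 3 *)
let fromLeft = List.fold_left (-) 10 [1; 2; 3];;

(* fold_right groups from the right: 1 - (2 - (3 - 10)) *)
let fromRight = List.fold_right (-) [1; 2; 3] 10;;
```

Here fromLeft is 4 and fromRight is -8, which follows directly from the two expansions given in the comments.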

Reverse a list by using fold_left and fold_right:


List.fold_left (fun a b -> ????) [] [1;2;3;4;5;6];;
List.fold_right (fun a b -> ????) [1;2;3;4;5;6] [];;
Solutions:
List.fold_left (fun a b -> b :: a) [] [1;2;3;4;5;6];;
- : int list = [6; 5; 4; 3; 2; 1]
List.fold_right (fun a b -> b @ [a]) [1;2;3;4;5;6] [];;
- : int list = [6; 5; 4; 3; 2; 1]

Defining Own Data Types


Users can define custom data types by using the type keyword and specifying the constructors.
Constructor names start with an uppercase letter. Values of these types can be used in pattern matching.

type weekday = Monday | Tuesday | Wednesday | Thursday | Friday |
               Saturday | Sunday;;

Monday;;
- : weekday = Monday
let today = Thursday;;
val today : weekday = Thursday

let day_after day =
match day with
| Monday -> Tuesday | Tuesday -> Wednesday | Wednesday -> Thursday
| Thursday -> Friday | Friday -> Saturday | Saturday -> Sunday
| Sunday -> Monday;;
val day_after : weekday -> weekday = <fun>
day_after Tuesday;;
- : weekday = Wednesday
day_after today;;
- : weekday = Friday

Exercise: Write a function named isweekend: weekday -> bool.
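
One possible solution is sketched below; it is only one of several reasonable answers. The type definition is repeated so that the fragment can be run on its own.

```ocaml
type weekday = Monday | Tuesday | Wednesday | Thursday | Friday
             | Saturday | Sunday;;

(* Or-patterns let several constructors share one result. *)
let isweekend day =
  match day with
  | Saturday | Sunday -> true
  | Monday | Tuesday | Wednesday | Thursday | Friday -> false;;
```

Listing the five working days explicitly, instead of using the wildcard _, is a design choice: if a constructor is later added to weekday, the compiler reports the match as non-exhaustive.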

Exercise: Define a data type shape, that can be a circle, square or a triangle. Circle has a
radius, square has a side length, and triangle has three sides (all float).
type shape =
Circle of float
| Square of float
| Triangle of float * float * float;;

let c = Circle 7.2;;
val c : shape = Circle 7.2
let s = Square 6. and t = Triangle (2., 3., 4.);;
val s : shape = Square 6.
val t : shape = Triangle (2., 3., 4.)

let area s =
match s with
| Circle r -> 3.14159 *. r *. r
| Square a -> a *. a
| Triangle (a,b,c) -> let s = (a +. b +. c) /. 2. (* s is the semi-perimeter *)
in sqrt(s *. (s -. a) *. (s -. b) *. (s -. c));;
val area : shape -> float = <fun>

area c;;
- : float = 162.8600256
area s;;
- : float = 36.
area (Triangle(3., 4., 5.));;
- : float = 6.

Exercise: Given this new data type:


type personalInfo = Address of int * string
| Phone of string
| Age of int;;
Define function street (personalInfo -> string) that returns the street name for an address, and
the empty string for any other kind of value.
let street p = match p with
| Address (num, streetname) -> streetname
| _ -> "";;

let home = Address (26, "Agac");;
val home : personalInfo = Address (26, "Agac")
street home;;
- : string = "Agac"
let myphone = Phone "2422450278";;
val myphone : personalInfo = Phone "2422450278"
street myphone;;
- : string = ""

Recursive Data Types


Binary tree data structure:
type tree = Leaf of int
| Node of (tree * tree);;

let myTree = Node(Node(Leaf 4, Node(Leaf 5, Leaf 8)),
                  Node(Leaf 9, Leaf 12));;

Write a function contains that evaluates to true if a binary tree contains a given value.
let rec contains t n =
match t with
| Leaf i -> i = n
| Node (t1, t2) -> contains t1 n || contains t2 n;;
val contains : tree -> int -> bool = <fun>
contains myTree 6;;
- : bool = false
contains myTree 8;;
- : bool = true

Write a function flatten that traverses a binary tree and creates a list of its values.
let rec flatten t =
match t with
| Leaf n -> [n]
| Node(t1,t2) -> flatten t1 @ flatten t2;;
let myTree = Node(Node(Leaf 4, Node(Leaf 5, Leaf 8)),
Node(Leaf 9, Leaf 12));;
flatten myTree;;
- : int list = [4; 5; 8; 9; 12]

Write a function mirror that mirrors a binary tree (reverses left and right at every node).
let rec mirror t =
match t with
| Leaf n -> Leaf n
| Node(t1, t2) -> Node(mirror t2, mirror t1);;

myTree;;
- : tree = Node (Node (Leaf 4, Node (Leaf 5, Leaf 8)), Node (Leaf 9, Leaf 12))
mirror myTree;;
- : tree = Node (Node (Leaf 12, Leaf 9), Node (Node (Leaf 8, Leaf 5), Leaf 4))

Write a mapping function for binary trees.
let rec treeMap f t =
match t with
| Leaf n -> Leaf (f n)
| Node(t1, t2) -> Node(treeMap f t1, treeMap f t2);;
val treeMap : (int -> int) -> tree -> tree = <fun>

treeMap (fun n -> n*2) myTree;;
- : tree = Node (Node (Leaf 8, Node (Leaf 10, Leaf 16)), Node (Leaf 18, Leaf 24))

Write a function treeSum that evaluates to the sum of all values in a binary tree.
let rec treeSum t =
match t with
| Leaf n -> n
| Node(t1, t2) -> treeSum t1 + treeSum t2;;
val treeSum : tree -> int = <fun>

treeSum myTree;;
- : int = 38

Polymorphic Data Types


Polymorphic data types are homogeneous: all values stored in one tree must have the same type.
(E.g. Node(4, Leaf 'x', Leaf 6) gives a type error.)
type 'a tree = Leaf of 'a
| Node of ('a * 'a tree * 'a tree);;

let intTree = Node(4, Node(7, Leaf 7, Leaf 2),
                   Node(3, Leaf 9, Node(6, Leaf 12, Leaf 6)));;
let flTree = Node(3.14, Leaf 7.2, Node(2.7, Leaf 4.2, Leaf 3.6));;
let chTree = Node('a', Node('b', Leaf 'c', Leaf 'd'), Leaf 'e');;
let strTree = Node("a", Node("in", Leaf "at", Leaf "the"), Leaf "on");;
(* strTree: root "a" with children "in" and "on"; "in" has leaves "at" and "the" *)

let rec size t =
match t with
| Leaf a -> 1
| Node(a, t1, t2) -> 1 + size t1 + size t2;;

size intTree;;
- : int = 9
size strTree;;
- : int = 5

let rec flatten t =
match t with
| Leaf a -> [a]
| Node(a, t1,t2) -> flatten t1 @ [a] @ flatten t2;;

flatten strTree;;
- : string list = ["at"; "in"; "the"; "a"; "on"]

Write a function contains that evaluates to true if a tree contains a given value.
let rec contains t x =
match t with
| Leaf a -> a = x
| Node(a,t1,t2) -> a = x || contains t1 x || contains t2 x;;
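
A note on equality: in OCaml, = compares values structurally (by contents), while == compares them physically (the same block in memory). A search function such as contains normally needs structural equality. The sketch below illustrates the difference; the names s1, s2, sameContents, sameBlock and found are illustrative only.

```ocaml
type 'a tree = Leaf of 'a
             | Node of ('a * 'a tree * 'a tree);;

(* contains written with structural equality. *)
let rec contains t x =
  match t with
  | Leaf a -> a = x
  | Node (a, t1, t2) -> a = x || contains t1 x || contains t2 x;;

let s1 = String.make 2 'z';;         (* a freshly allocated string *)
let s2 = String.make 2 'z';;         (* a second string, same contents *)
let sameContents = (s1 = s2);;       (* structural comparison *)
let sameBlock = (s1 == s2);;         (* physical comparison *)
let found = contains (Node (s1, Leaf "at", Leaf "on")) s2;;
```

Here sameContents is true but sameBlock is false, because s1 and s2 are separate allocations. With = in contains, found is true; with == it would be false for these two strings.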

Printing
Printing is of limited use at the toplevel, which already displays the value of each expression.
It is meaningful when an OCaml program is run.
print_int 7;; print_string "Hello World!";; print_string "\n";;
print_float 2.7182;; print_endline "Hello World!";; print_endline "";;
print_char 'T';;

Printf.printf "%d: %s %F\n" 24 "Jane" 8.7;;
Printf.printf "%i: %s %F\n" 24 "Jane" 8.74;; (* %i or %d for int *)
Printf.printf "%d: %s %f\n" 24 "Jane" 8.7;;
Printf.printf "%04d: %-10s %10.2f\n" 24 "Jane" 8.7;;
Printf.printf "%04i: %-10s %10.2f\n" 824 "Albert" 20.647;;

Output:
24: Jane 8.7
24: Jane 8.74
24: Jane 8.700000
0024: Jane             8.70
0824: Albert          20.65
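
The width specifiers in the last two format strings pad their output with spaces: %04d pads an int with zeros to 4 digits, %-10s left-aligns a string in 10 columns, and %10.2f right-aligns a float in 10 columns with 2 decimals. A small sketch with Printf.sprintf, which returns the formatted text as a string instead of printing it, makes the padding easy to inspect (line1 and line2 are illustrative names):

```ocaml
(* Same formats as above, but the result is kept as a string. *)
let line1 = Printf.sprintf "%04d: %-10s %10.2f" 24 "Jane" 8.7;;
let line2 = Printf.sprintf "%04i: %-10s %10.2f" 824 "Albert" 20.647;;

(* Each line is 4 + 2 + 10 + 1 + 10 = 27 characters long. *)
let len1 = String.length line1;;
let len2 = String.length line2;;
```

Both strings have the same length, which is why the columns line up when such lines are printed one under the other.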

Define an OCaml function printInt that prints an integer number and moves to the next line.
let printInt n = print_int n; print_endline "";;

Define an OCaml function printis that prints an integer number followed by a string.
let printis n s = print_int n; print_string s;;

printis 77 "\n";; (* prints int and goes to the next line *)
printis 77 " ";; (* prints int, followed by a space *)



Define an OCaml function print1n that takes an int n and a string as arguments, and prints the int
numbers from 1 to n, each followed by the string.
Use printis defined above.
let rec print1n n s = ????;;
val print1n : int -> string -> unit = <fun>

print1n 10 "\n";; (* prints 1 to 10 on separate lines *)
print1n 10 " ";; (* prints 1 to 10 separated by space *)

(* Solution: *)
let rec print1n n s =
match n with
| 0 -> ()
| v -> print1n (v-1) s; printis v s;;

Input
Input has little use at the toplevel. It is useful when an OCaml program is run.

let s = read_line ();; (* read whole line as string *)
let n = read_int ();; (* read int *)
let x = read_float ();; (* read float *)
print_endline s;; (* print string *)
print_int (n * 2);; (* print double of input int *)
print_endline "";;
print_float (x /. 2.);; (* print half of input float *)
print_endline "";;

Exercises:
1. Provide values (other than the empty list) to form lists of the given types.
????;;
- : int list
????;;
- : int list list
????;;
- : (int * float) list
????;;
- : (int * string) list
????;;
- : string list list
????;;
- : (int * string list) list
????;;
- : (int * string list) list list

2. Convert the following functions to tail recursive form.

let rec reverse lst =
  match lst with
  | [] -> []
  | h :: t -> reverse t @ [h];;

(* Tail recursive form of reverse: *)
let reverse lst =
  let rec aux lst acc =
    match lst with
    | [] -> acc
    | h :: t -> aux t (h :: acc)
  in aux lst [];;

let rec cntZeros lst =
  match lst with
  | [] -> 0
  | 0 :: t -> 1 + cntZeros t
  | h :: t -> cntZeros t;;
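
For cntZeros, one possible tail-recursive form follows the same pattern as reverse: an auxiliary function carries the running count. This is a sketch of one answer, with aux and acc as illustrative names.

```ocaml
(* Count the zeros in an int list with an accumulator. *)
let cntZeros lst =
  let rec aux l acc =
    match l with
    | [] -> acc                    (* the count so far is the result *)
    | 0 :: t -> aux t (acc + 1)    (* a zero: increment, then tail call *)
    | _ :: t -> aux t acc          (* any other value: skip it *)
  in
  aux lst 0;;
```

The addition happens before the recursive call rather than after it, so no work is pending when aux calls itself.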
