Unit 1
1.6.6 Parameter Passing Mechanisms
• Actual parameters: the arguments used in the call of a procedure.
• Formal parameters: the parameters used in the procedure definition.
• Example:
  main() { func(a, b); }        /* a, b here are the actual parameters */
  void func(int a, int b) { }   /* a, b here are the formal parameters */
1) Call by value
2) Call by reference
Call by value
• A copy of the actual parameter's value is passed to the function.
• Changes made inside the function do not affect the original variable.
• Example:

  def modify(x):
      x = x + 5
      print("Inside function:", x)

  a = 10
  modify(a)
  print("Outside function:", a)

  The function modified only the copy, not the original a.
Call by reference
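This heading has no worked example in these notes, so here is a minimal sketch in the same style as the call-by-value example. Strictly, Python passes references to objects ("call by object reference"), so an in-place change to a mutable argument is visible to the caller, which illustrates call-by-reference behavior; the names below are made up for the example.

```python
# Call-by-reference-style behavior: the function and the caller share
# the same list object, so an in-place change is visible outside.
def modify(values):
    values[0] = values[0] + 5          # mutates the shared list
    print("Inside function:", values)

nums = [10]
modify(nums)
print("Outside function:", nums)       # prints [15]: the caller sees the change
```

Unlike call by value, the change made inside the function here affects the caller's data.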
• One token for each keyword; the pattern for a keyword is the same as the keyword itself.
• comparison is a token representing all the comparison operators like <=, >=, etc. The specific operators are the lexemes belonging to the comparison token.
• id (identifier) is a token representing all the identifiers, which start with a letter followed by letters and digits. "pi", "score", and "D2" are specific lexemes belonging to the identifier (id) token.
• number is a token matching all the numeric values, like 3.14. Each numeric value is a lexeme belonging to the number token.
• literal is a token matching all the lexemes starting with " and ending with ".
2.1.1 TOKENS, PATTERNS, LEXEMES
• A token is a pair consisting of a token name and an optional attribute value. The
token name is an abstract symbol representing a kind of lexical unit, e.g., a
particular keyword, or a sequence of input characters denoting an identifier. The
token names are the input symbols that the parser processes.
• A pattern is a description of the form that the lexemes of a token may take. In the
case of a keyword as a token, the pattern is just the sequence of characters that
form the keyword. For identifiers and some other tokens, the pattern is a more
complex structure that is matched by many strings.
• A lexeme is a sequence of characters in the source program that matches the
pattern for a token and is identified by the lexical analyzer as an instance of that
token.
• TOKEN represents a category, like operators, keywords, etc.
• PATTERN is a regular expression which describes the lexemes of a token.
• LEXEME is the actual sequence of characters from the source code that matches the pattern.
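The three terms can be seen together in a small sketch, assuming (as in the bullets above) that the pattern for the id token is "a letter followed by letters and digits":

```python
import re

# PATTERN: a regular expression describing the lexemes of the id token.
ID_PATTERN = re.compile(r"[A-Za-z][A-Za-z0-9]*")

# LEXEMES: actual character sequences from the source that match the pattern.
for lexeme in ["pi", "score", "D2"]:
    print(lexeme, "matches id:", ID_PATTERN.fullmatch(lexeme) is not None)

print("2D matches id:", ID_PATTERN.fullmatch("2D") is not None)  # starts with a digit
```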
• printf ("Total = %d\n”, score) ;
• both printf and score are lexemes matching the pattern for token id, and
• "Total = %d\n” is a lexeme matching literal token
• When multiple lexemes match a pattern, the lexical analyzer provides extra information to help
the compiler identify the specific lexeme.
• For example, both 0 and 1 match the number token pattern, but the exact lexeme found is
important for code generation.
• The lexical analyzer returns both the token name (for parsing) and an attribute value (to
describe the lexeme for later translation).
2.1.1 TOKENS, PATTERNS, LEXEMES
• Example 3.2: The token names and associated attribute values for the Fortran statement
  E = M * C ** 2
  are written below as a sequence of pairs:
  <id, pointer to symbol-table entry for E>
  <assign_op>
  <id, pointer to symbol-table entry for M>
  <mult_op>
  <id, pointer to symbol-table entry for C>
  <exp_op>
  <number, integer value 2>
• In certain pairs, especially operators, punctuation, and keywords, there is no need for an attribute value. In this example, the token number has been given an integer-valued attribute. In practice, a typical compiler would instead store a character string representing the constant and use as an attribute value for number a pointer to that string.
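The pairing of token names with attribute values can be reproduced with a small tokenizer sketch; the token names, the regular expressions, and the symbol table below are simplified assumptions for illustration, not a real compiler's data structures.

```python
import re

# Minimal sketch: turn "E = M * C ** 2" into (token-name, attribute) pairs.
TOKEN_SPEC = [
    ("number",    r"\d+"),
    ("id",        r"[A-Za-z][A-Za-z0-9]*"),
    ("exp_op",    r"\*\*"),        # listed before mult_op so ** wins over *
    ("mult_op",   r"\*"),
    ("assign_op", r"="),
    ("skip",      r"\s+"),
]
MASTER = re.compile("|".join(f"(?P<{n}>{p})" for n, p in TOKEN_SPEC))

def tokenize(source):
    symtab, tokens = {}, []
    for m in MASTER.finditer(source):
        name, lexeme = m.lastgroup, m.group()
        if name == "skip":
            continue
        if name == "id":
            # attribute: an index standing in for a symbol-table pointer
            tokens.append((name, symtab.setdefault(lexeme, len(symtab))))
        elif name == "number":
            tokens.append((name, int(lexeme)))   # integer-valued attribute
        else:
            tokens.append((name, None))          # operators need no attribute
    return tokens

print(tokenize("E = M * C ** 2"))
```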
2.1.2 LEXICAL ERRORS
• A lexical analyzer alone can't easily detect source-code errors without help
from other components.
• For example, encountering fi in a C program could be a misspelled if or an
undeclared identifier.
• Since fi is a valid identifier, the lexical analyzer returns it as id, leaving error
detection to the parser or later compiler phases.
• If no token pattern matches the remaining input, the lexical analyzer can't
proceed.
• In panic mode recovery, characters are deleted until a valid token is found.
This may confuse the parser but is often sufficient in interactive environments.
• Other functions performed by lexical analysis are keeping track of line
numbers, stripping out white spaces like redundant blanks and tabs, and
deleting comments.
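Panic-mode deletion can be sketched as below; the token patterns and the recovery loop are illustrative assumptions, not a production scanner.

```python
import re

# Illustrative token patterns; a real scanner would have many more.
TOKEN_RE = re.compile(r"[A-Za-z][A-Za-z0-9]*|\d+|[+\-*/=();]")

def next_token(source, pos):
    """Panic-mode recovery: if no pattern matches at pos, delete
    characters one at a time until some valid token is found."""
    deleted = 0
    while pos < len(source):
        m = TOKEN_RE.match(source, pos)
        if m:
            return m.group(), m.end(), deleted
        pos += 1              # delete the offending character
        deleted += 1
    return None, pos, deleted

tok, pos, deleted = next_token("@#x = 1", 0)
print(tok, deleted)    # recovers the identifier x after deleting 2 characters
```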
2.2 LEXICAL ANALYSIS: INPUT BUFFERING
• Lexical Analyzer scans the characters of the source program to
discover tokens.
• But often one or more characters beyond the next lexeme must be examined before the next token itself can be determined.
• So the lexical analyzer reads the input from an input buffer.
[Figure: an input buffer holding the characters p r i n t f]
[Figure: patterns for tokens being matched against the buffered input]
Construct transition diagrams for the following regular expressions
1) digit → [0-9]
   Ans: state 1 --[0-9]--> state 2 (accepting)
2) digits → digit+
   Ans: state 1 --[0-9]--> state 2 (accepting), with a [0-9] loop on state 2
3) number → digit . (digit)*
   Ans: state 1 --[0-9]--> state 2; state 2 --.--> state 3 (accepting), with a [0-9] loop on state 3
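The third diagram can be simulated directly in code. The state numbers follow the diagram; treating end-of-input in state 3 as acceptance is a simplifying assumption (a real scanner would retract on the first non-matching character and return the token).

```python
# Simulate the transition diagram for number -> digit . (digit)*
# States: 1 --[0-9]--> 2 --'.'--> 3, with a [0-9] loop on state 3.
def is_number(s):
    state = 1
    for ch in s:
        if state == 1 and ch.isdigit():
            state = 2
        elif state == 2 and ch == ".":
            state = 3
        elif state == 3 and ch.isdigit():
            state = 3
        else:
            return False          # no transition: reject
    return state == 3             # accept only in state 3

print(is_number("3.14"))   # True: matches digit . (digit)*
print(is_number("314"))    # False: no '.' after the first digit
```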
Construct transition diagrams for the following regular expressions
6) if → state 1 --i--> state 2 --f--> state 3 (accepting)
7) then → state 1 --t--> state 4 --h--> state 5 --e--> state 6 --n--> state 7 (accepting)
8) else → state 1 --e--> state 8 --l--> state 9 --s--> state 10 --e--> state 11 (accepting)
Construct Transition diagram for the following:
relop → < | > | <= | >= | = | == | <> | !=
Ans: From the start state:
• on '<', move to a state that accepts <= if '=' follows, <> if '>' follows, and < otherwise (retracting one character);
• on '>', move to a state that accepts >= if '=' follows, and > otherwise (retracting one character);
• on '=', accept == if another '=' follows, and = otherwise (retracting one character);
• on '!', accept != if '=' follows.
2.5 TRANSITION DIAGRAMS
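The relop transition diagram can be turned into code by testing one character of lookahead in each state; the returned token names (LE, NE, LT, …) are illustrative choices, not fixed by the notes.

```python
def relop(source, pos=0):
    """Recognize a relational operator starting at pos, following the
    relop transition diagram. Returns (token, next_pos) or None."""
    n = len(source)
    ch = source[pos] if pos < n else ""
    nxt = source[pos + 1] if pos + 1 < n else ""   # one character of lookahead
    if ch == "<":
        if nxt == "=": return ("LE", pos + 2)
        if nxt == ">": return ("NE", pos + 2)
        return ("LT", pos + 1)                     # retract: '<' alone
    if ch == ">":
        if nxt == "=": return ("GE", pos + 2)
        return ("GT", pos + 1)                     # retract: '>' alone
    if ch == "=":
        if nxt == "=": return ("EQEQ", pos + 2)
        return ("EQ", pos + 1)                     # retract: '=' alone
    if ch == "!":
        if nxt == "=": return ("NEQ", pos + 2)
        return None                                # '!' alone is not a relop here
    return None

print(relop("<= b"))    # ('LE', 2)
```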
THE ROLE OF A PARSER
• The parser receives a string of tokens from the lexical analyzer and verifies that it follows the grammar of the source language.
• It reports syntax errors clearly and attempts to recover so that it can continue processing the program.
• For well-formed programs, the parser constructs a parse tree (explicitly or implicitly) and passes it to the rest of the compiler for further processing.
• The parser and the rest of the front end may be implemented as a single module.
ROLE OF A PARSER
• There are three types of parsers: universal, top-down, and bottom-up.
• Universal parsers can handle any grammar but are too inefficient for production compilers.
• Top-down parsers build parse trees from the root to the leaves, while bottom-up parsers build from the leaves to the root.
• Both scan the input from left to right.
SYNTAX ERROR HANDLING
Consider the grammar:
expression → expression + term | expression - term | term
term → term * factor | term / factor | factor
factor → ( expression ) | id
Where,
V = {expression, term, factor}
T = {+, -, *, /, (, ), id}
S = {expression}
DERIVATIONS, PARSE TREES
• Leftmost derivation (LMD): a step-by-step process in which the leftmost non-terminal in a string is replaced first at every step, according to the grammar rules of a language.
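A short worked example, assuming the standard expression grammar (expression → expression + term | term; term → term * factor | factor; factor → ( expression ) | id):

Leftmost derivation of id + id * id (at each step the leftmost non-terminal is replaced):
expression ⇒ expression + term
           ⇒ term + term
           ⇒ factor + term
           ⇒ id + term
           ⇒ id + term * factor
           ⇒ id + factor * factor
           ⇒ id + id * factor
           ⇒ id + id * id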