CT 1
Answer all the questions. Take and state suitable assumptions, if needed.
No clarifications will be provided during the examination.
1. Parse the string aabbde by executing the non-deterministic recursive descent parser under the grammar
specified below. Apply the productions in the increasing order of the Rule #. Clearly show the functions
invoked by the parser and the backtracking steps (if any). There is no need to write the code for the functions.
Rule # Production
1. S → Aa
2. S → Ce
3. A → aaB
4. A → aaba
5. B → bbb
6. C → aaD
7. D → bbd
[5]
Ans:
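The requested trace can be reproduced mechanically. The following is a minimal Python sketch, assuming the simple backtracking scheme in which each nonterminal function tries its productions in increasing rule order, resets the input position on failure, and returns on the first success. The names GRAMMAR, parse_nonterminal and accepts are illustrative and not part of the question.

```python
# Productions listed in increasing order of Rule #, as in the question.
GRAMMAR = {
    "S": [["A", "a"], ["C", "e"]],
    "A": [["a", "a", "B"], ["a", "a", "b", "a"]],
    "B": [["b", "b", "b"]],
    "C": [["a", "a", "D"]],
    "D": [["b", "b", "d"]],
}


def parse_nonterminal(symbol, text, pos, trace, depth=0):
    """Try each production of `symbol` in rule order.

    Returns the input position after a successful match, or None if
    every production fails.
    """
    indent = "  " * depth
    for body in GRAMMAR[symbol]:
        trace.append(f"{indent}{symbol}() tries {symbol} -> {''.join(body)} at position {pos}")
        cur = pos
        for sym in body:
            if sym in GRAMMAR:                      # nonterminal: call its function
                cur = parse_nonterminal(sym, text, cur, trace, depth + 1)
            elif cur < len(text) and text[cur] == sym:
                cur += 1                            # terminal matched
            else:
                cur = None                          # terminal mismatch
            if cur is None:
                break
        if cur is not None:
            return cur
        # Failure: reset the input pointer and try the next production.
        trace.append(f"{indent}{symbol}() backtracks to position {pos}")
    return None


def accepts(text):
    """Return (accepted, trace) for a parse starting from S."""
    trace = []
    end = parse_nonterminal("S", text, 0, trace)
    return end == len(text), trace


ok, trace = accepts("aabbde")
print("\n".join(trace))
print("accepted" if ok else "rejected")
```

Under these assumptions the trace shows S() trying Rule 1, A() failing on Rules 3 and 4, and the parse succeeding through Rule 2 with C() and D().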
2. Consider the following grammar with terminal symbols {a, b, d, g, h}, and nonterminals {S, A, B, C}.
Here, S is the start symbol. The productions of the grammar are given below. The bodies of two productions,
called Body1 and Body2, are missing.
S → Body1 | CbB | Ba
A → da | Body2
B → g | ϵ
C → h | ϵ
The following table lists the FIRST and FOLLOW of the nonterminals.
Nonterminal  FIRST                FOLLOW
S            {d, g, h, b, a, ϵ}   {$}
A            {d, g, h, ϵ}         {h, g, $}
B            {g, ϵ}               FOLLOW(S) ∪ {a} ∪ FIRST(C) ∪ FOLLOW(A)
C            {h, ϵ}               FIRST(B) ∪ FOLLOW(S) ∪ FOLLOW(A) ∪ {b}
Using this table, derive the missing bodies Body1 and Body2 of the two productions given above. Show all
the steps of your derivation.
[5]
Ans:
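Body2: A → BC [First of A includes g and h, where First of B is {g, ϵ} and First of C is {h, ϵ}.
First of A also includes ϵ, which da cannot supply, so Body2 must be nullable.]
Body1: S → ABC [First of S includes g and h, which implies B and C in the production body. First of S
includes d, which implies A in the production body. First of S includes ϵ, and neither CbB nor Ba is
nullable, so Body1 must be nullable. Follow of A is {h, g, $}, which contains First of B and First of C,
so B and C come after A. Follow of B includes First of C, so C comes after B. This decides the order ABC.]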
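The derived bodies can be checked by recomputing FIRST and FOLLOW and comparing against the table. The following is a minimal Python sketch, assuming Body1 = ABC and Body2 = BC as in the answer above. The helper names (first_of_sequence, compute_first, compute_follow) are illustrative.

```python
EPS = "ϵ"
END = "$"

# Grammar with the two missing bodies filled in (assumption: Body1 = ABC, Body2 = BC).
# An empty list stands for the ϵ-production.
GRAMMAR = {
    "S": [["A", "B", "C"], ["C", "b", "B"], ["B", "a"]],
    "A": [["d", "a"], ["B", "C"]],
    "B": [["g"], []],
    "C": [["h"], []],
}


def first_of_sequence(seq, first):
    """FIRST of a sequence of grammar symbols, given the current FIRST sets."""
    out = set()
    for sym in seq:
        sym_first = first[sym] if sym in GRAMMAR else {sym}
        out |= sym_first - {EPS}
        if EPS not in sym_first:
            return out
    out.add(EPS)  # every symbol in seq was nullable (or seq is empty)
    return out


def compute_first():
    """Iterate to a fixed point over all productions."""
    first = {nt: set() for nt in GRAMMAR}
    changed = True
    while changed:
        changed = False
        for head, bodies in GRAMMAR.items():
            for body in bodies:
                new = first_of_sequence(body, first)
                if not new <= first[head]:
                    first[head] |= new
                    changed = True
    return first


def compute_follow(first, start="S"):
    """Iterate to a fixed point; `start` receives the end marker."""
    follow = {nt: set() for nt in GRAMMAR}
    follow[start].add(END)
    changed = True
    while changed:
        changed = False
        for head, bodies in GRAMMAR.items():
            for body in bodies:
                for i, sym in enumerate(body):
                    if sym not in GRAMMAR:
                        continue
                    rest = first_of_sequence(body[i + 1:], first)
                    new = rest - {EPS}
                    if EPS in rest:          # everything after sym can vanish
                        new |= follow[head]
                    if not new <= follow[sym]:
                        follow[sym] |= new
                        changed = True
    return follow


first = compute_first()
follow = compute_follow(first)
for nt in GRAMMAR:
    print(nt, sorted(first[nt]), sorted(follow[nt]))
```

If the computed sets agree with the table, the chosen bodies are consistent with it. The check does not by itself show that they are the only consistent choice.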
3. (a) Consider a programming language, which supports the following tokens.
Write down the regular definitions of the tokens INT, NE_REAL, E_REAL, ID, ID_N.
[2]
Ans:
INT : digit(digit)*[.digit]?(digit)*[E[+/-]?digit]?digit*
NE_REAL : (epsilon/+/-)digit(digit)*[.digit]?(digit)*
E_REAL : (epsilon/+/-)digit(digit)*[.digit]?(digit)*[E[+/-]?digit]?digit*
ID: letter+
ID_N: letter(letter+digit)*
For the code snippet below, write down the stream of tokens generated by the lexical analyzer. Write each
input token as <token_name, lexeme>.
(b) Consider a programming language L which supports three tokens T1, T2 and T3 defined by the regular
expressions T1 = a?(b|c)*a, T2 = b?(a|c)*b, T3 = c?(b|a)*c. Consider a string w = accbbbccaabc in the
language L. Arrange these tokens in such a way that the tokens generate the string w, satisfying the
following two conditions. (i) The number of tokens should be minimized, and (ii) the tokens cannot be
repeated. [It is not necessary that you have to use all these three tokens.]
[3]
Ans:
T1 generates accbbbcca and T3 generates abc, so the sequence of tokens is T1T3.
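The answer above can be confirmed by brute force. The following is a minimal Python sketch, assuming each token may be used at most once and that the lexemes must cover w exactly. It tries every ordering of 1, 2, then 3 distinct tokens and stops at the smallest count that works. The names TOKENS, segmentations and shortest_tokenizations are illustrative.

```python
import re
from itertools import permutations

# Token definitions copied from the question.
TOKENS = {
    "T1": re.compile(r"a?(b|c)*a"),
    "T2": re.compile(r"b?(a|c)*b"),
    "T3": re.compile(r"c?(b|a)*c"),
}


def segmentations(word, names, start=0):
    """Yield lists of lexemes that split word[start:] into one lexeme per token name, in order."""
    if not names:
        if start == len(word):
            yield []
        return
    pattern = TOKENS[names[0]]
    for end in range(start + 1, len(word) + 1):
        # fullmatch with pos/endpos tests exactly the slice word[start:end].
        if pattern.fullmatch(word, start, end):
            for rest in segmentations(word, names[1:], end):
                yield [word[start:end]] + rest


def shortest_tokenizations(word):
    """Return all (token order, lexemes) pairs that use the fewest distinct tokens."""
    for count in range(1, len(TOKENS) + 1):
        found = [
            (order, lexemes)
            for order in permutations(TOKENS, count)
            for lexemes in segmentations(word, order)
        ]
        if found:
            return found
    return []


for order, lexemes in shortest_tokenizations("accbbbccaabc"):
    print(order, lexemes)
```

No single token matches all of w, because w starts with a and ends with c while containing all three letters, so two tokens is the minimum.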
4. (a) Consider the following grammar with terminal symbols {a, b, c, d} and nonterminal symbols {X, Y},
where X is the start symbol. Eliminate left recursion from the grammar, and write down the transformed
grammar.
X → Ya ∣ Xa ∣ c
Y → Yb | Xb | d
[2]
Ans:
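X → Ya | Xa | c … eliminate the immediate left recursion in X
X → YaX′ | cX′
X′ → aX′ | ϵ
Y → Yb | YaX′b | cX′b | d … substitute the new X-productions into Y → Xb
Y → cX′bY′ | dY′ … eliminate the immediate left recursion in Y
Y′ → bY′ | aX′bY′ | ϵ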
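The transformation can be reproduced programmatically. The following is a minimal Python sketch of the standard procedure, assuming the nonterminals are processed in the order X, Y: substitute earlier nonterminals that appear in leftmost position, then remove immediate left recursion. The function name eliminate_left_recursion is illustrative, and the code writes the new nonterminals with an ASCII apostrophe (X', Y') instead of the prime symbol.

```python
def eliminate_left_recursion(grammar, order):
    """Return an equivalent grammar without left recursion.

    grammar maps each nonterminal to a list of bodies (lists of symbols).
    An empty list stands for the ϵ-production.
    """
    result = {}
    for i, head in enumerate(order):
        bodies = [list(b) for b in grammar[head]]
        # Replace a leading earlier nonterminal by each of its (already fixed) bodies.
        for earlier in order[:i]:
            expanded = []
            for body in bodies:
                if body and body[0] == earlier:
                    expanded += [alt + body[1:] for alt in result[earlier]]
                else:
                    expanded.append(body)
            bodies = expanded
        # Split into head -> head alpha (recursive) and head -> beta (others).
        recursive = [b[1:] for b in bodies if b and b[0] == head]
        others = [b for b in bodies if not b or b[0] != head]
        if recursive:
            prime = head + "'"
            result[head] = [b + [prime] for b in others]
            result[prime] = [b + [prime] for b in recursive] + [[]]
        else:
            result[head] = bodies
    return result


GRAMMAR = {
    "X": [["Y", "a"], ["X", "a"], ["c"]],
    "Y": [["Y", "b"], ["X", "b"], ["d"]],
}

transformed = eliminate_left_recursion(GRAMMAR, ["X", "Y"])
for head, bodies in transformed.items():
    print(head, "->", " | ".join("".join(b) or "ϵ" for b in bodies))
```

Processing the nonterminals in a different order would give a different but equivalent grammar.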
(b) Prove or disprove with justification: The following grammar is LL(1). Do not construct the parsing
table. Here, S is the only nonterminal symbol.
S → aSbS | bSaS | ϵ
[2]
Ans:
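Follow(S) = {a, b, $}
First(aSbS) = {a} and First(bSaS) = {b}. Because S → ϵ is a production, LL(1) requires First(aSbS) and
First(bSaS) to be disjoint from Follow(S). First(aSbS) = {a} overlaps with Follow(S), so the grammar is
not LL(1).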
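An independent check is that the grammar is ambiguous, and an ambiguous grammar cannot be LL(1). The following is a minimal Python sketch that counts the distinct parse trees of a string under S → aSbS | bSaS | ϵ. The function name count_parse_trees and the test string abab are illustrative choices, not part of the question.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def count_parse_trees(w):
    """Number of distinct parse trees deriving w from S in S -> aSbS | bSaS | ϵ."""
    if w == "":
        return 1  # only S -> ϵ derives the empty string
    # A nonempty string must use aSbS (if it starts with a) or bSaS (if it starts with b).
    closer = {"a": "b", "b": "a"}.get(w[0])
    if closer is None:
        return 0
    total = 0
    for k in range(1, len(w)):
        if w[k] == closer:
            # w = w[0] + (inner S) + closer + (trailing S)
            total += count_parse_trees(w[1:k]) * count_parse_trees(w[k + 1:])
    return total


print(count_parse_trees("abab"))
```

A count greater than 1 for any string means that string has more than one parse tree. For abab the two trees come from matching the first a with either the first or the second b.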