CS3342a1 copy
1. (15pt) Assume the following date formats; for simplicity, each of D, M, Y can be any decimal digit:
- full format: DD/MM/YYYY, YYYY/MM/DD, or MM/YYYY/DD
- short format: DD/MM or MM/YYYY.
(a) (5pt) Write a regular expression that describes dates in all formats above (one expression for
all). You can use d to represent a decimal digit 0, 1, ..., 9.
(b) (5pt) Build a scanner in the form of a DFA with the fewest states for the above date formats;
the scanner identifies full-format and short-format dates.
(c) (5pt) Modify the above DFA to include the additional short date format YYYY/MM. The
DFA should still have the fewest states.
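As a way to sanity-check a regular expression or DFA for Question 1, the five formats can be written down as digit "shapes" and compared directly. The sketch below is only an illustrative test oracle, not a solution; the names `shape`, `classify`, `FULL_SHAPES` and `SHORT_SHAPES` are hypothetical and do not come from the assignment.

```python
DIGITS = "0123456789"

# The formats from Question 1, with every D, M, Y written as "d".
FULL_SHAPES = {"dd/dd/dddd", "dddd/dd/dd", "dd/dddd/dd"}
SHORT_SHAPES = {"dd/dd", "dd/dddd"}


def shape(s):
    """Map each digit to 'd', keep '/', and mark anything else as '?'."""
    out = []
    for ch in s:
        if ch in DIGITS:
            out.append("d")
        elif ch == "/":
            out.append("/")
        else:
            out.append("?")
    return "".join(out)


def classify(s):
    """Return 'full', 'short', or None if s matches none of the formats."""
    sh = shape(s)
    if sh in FULL_SHAPES:
        return "full"
    if sh in SHORT_SHAPES:
        return "short"
    return None
```

A DFA built for part (b) should accept exactly the strings for which `classify` returns a non-None value; part (c) would add one more shape to `SHORT_SHAPES`.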
2. (15pt) Consider a language containing integers, floats and ranges (digit is any decimal digit 0..9);
there are four token types:
- integer : digit digit∗
- float: digit∗ (digit . + . digit)digit∗
- int-range: integer..integer
- float-range: float..float
Construct:
(a) (10pt) a DFA and
(b) (5pt) a scanner table
for this language (like those on slide 28).
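To illustrate what a table-driven scanner looks like in code, here is a minimal sketch that handles only the integer token (digit digit∗). The table layout (rows are states, columns are character classes) is an assumption and may differ from the one on the course slides; the names `TABLE`, `ACCEPTING`, `char_class` and `scan` are hypothetical.

```python
def char_class(ch):
    """Collapse input characters into the column labels of the table."""
    return "digit" if ch in "0123456789" else "other"


# Rows are states, columns are character classes.
# A missing entry means "no transition", i.e. an error.
TABLE = {
    "start": {"digit": "in_int"},
    "in_int": {"digit": "in_int"},
}

# Accepting states and the token type each one recognizes.
ACCEPTING = {"in_int": "integer"}


def scan(text):
    """Run the table on the whole string; return the token type or None."""
    state = "start"
    for ch in text:
        state = TABLE.get(state, {}).get(char_class(ch))
        if state is None:
            return None
    return ACCEPTING.get(state)
```

Extending this to floats and ranges would mean adding a column for "." and more rows; the driver function `scan` would stay the same, which is the main point of the table-driven design.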
3. (20pt) Consider the switch statement in C-like languages, defined here according to this grammar:
1. P −→ Switch $$
2. Switch −→ switch ( expr ) { Cases Default }
3. Cases −→ Cases Case
4. Cases −→ ε
5. Case −→ case const: Stmt
6. Stmt −→ stmt; break;
7. Stmt −→ stmt;
8. Stmt −→ ε
9. Default −→ default: stmt;
10. Default −→ ε
(a) (2pt) Compute the sets first(X) and follow(X), for all non-terminals X, and the sets
predict(p) for all production rules p.
(b) (3pt) Prove that G is not an LL(1) grammar. Describe all conflicts.
(c) (5pt) Build an LL(1) grammar, G1, equivalent to G. Compute, for G1, the sets first(X)
and follow(X), for all non-terminals X, and the sets predict(p) for all production rules p.
(d) (5pt) Show how the switch statement can be used, in general, to simulate any if..then..else
statement.
(e) (5pt) How is it possible that the switch statement can simulate if’s and have an LL(1)
grammar, since we said that there is no top-down grammar for if..then..else statements?
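Hand-computed first sets for Question 3 can be cross-checked mechanically. The sketch below uses the standard fixed-point iteration and encodes the grammar G as a Python dictionary. The tokenization is an assumption (for example, "const:" is split into the two terminals "const" and ":"), and the names `EPS`, `G` and `first_sets` are hypothetical. It does not compute follow or predict sets.

```python
EPS = "ε"

# Grammar G from Question 3; each right-hand side is a list of symbols,
# and the empty list stands for an ε-production.
G = {
    "P": [["Switch", "$$"]],
    "Switch": [["switch", "(", "expr", ")", "{", "Cases", "Default", "}"]],
    "Cases": [["Cases", "Case"], []],
    "Case": [["case", "const", ":", "Stmt"]],
    "Stmt": [["stmt", ";", "break", ";"], ["stmt", ";"], []],
    "Default": [["default", ":", "stmt", ";"], []],
}


def first_sets(grammar):
    """Compute first(X) for every non-terminal X by fixed-point iteration."""
    first = {nt: set() for nt in grammar}
    changed = True
    while changed:
        changed = False
        for nt, alternatives in grammar.items():
            for rhs in alternatives:
                before = len(first[nt])
                all_nullable = True
                for sym in rhs:
                    if sym in grammar:  # non-terminal
                        first[nt] |= first[sym] - {EPS}
                        if EPS not in first[sym]:
                            all_nullable = False
                            break
                    else:  # terminal
                        first[nt].add(sym)
                        all_nullable = False
                        break
                if all_nullable:
                    first[nt].add(EPS)
                if len(first[nt]) != before:
                    changed = True
    return first
```

Calling `first_sets(G)` returns a dictionary mapping each non-terminal to its first set; the same function can be reused on a rewritten grammar G1 by changing only the dictionary.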
4. (50pt) Consider dated C++ comments, which are the usual C++ comments with a mandatory date
in DD/MM/YYYY format occurring anywhere in the first line of the comment; D, M, Y can be
any decimal digits. Here are some examples:
(a) // 31/12/2023 ... working on New Year’s eve; there must be more to life ...!
(b) /* Modifed todai 00/00/0000
There shuld be no mor erors. */
Write a Python program dcom_rm.py to remove all dated comments from a C++ program. The program
should work as follows:
where inputC.cpp is any (correct) C++ program and inputC_rm.cpp is the same program with
dated comments removed. Regular comments, non-dated, are allowed in the input but are not
removed.
You are not allowed to use any regular expression capabilities of Python, or external libraries, as
that would defeat the purpose of the question.
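One building block for Question 4 is detecting a DD/MM/YYYY date without regular expressions. The sketch below checks every 10-character window of a comment's first line by hand. It assumes the date does not need to be delimited by spaces, which the assignment does not specify, and the names `is_date_at` and `first_line_has_date` are hypothetical. Locating the comments themselves (and skipping string literals) is left out.

```python
DIGITS = "0123456789"


def is_date_at(text, i):
    """Return True if text[i:i+10] has the shape dd/dd/dddd."""
    window = text[i:i + 10]
    if len(window) < 10:
        return False
    for k, ch in enumerate(window):
        if k in (2, 5):
            # Positions 2 and 5 must hold the separators.
            if ch != "/":
                return False
        elif ch not in DIGITS:
            # Every other position must be a decimal digit.
            return False
    return True


def first_line_has_date(comment):
    """Check only the first line of the comment text for a date."""
    first_line = comment.split("\n", 1)[0]
    return any(is_date_at(first_line, i) for i in range(len(first_line) - 9))
```

The explicit comparison against `DIGITS` is used instead of `str.isdigit` because the latter also accepts non-ASCII digit characters.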
Notes: !!! Submit two files: one pdf file with responses to Assignment 1 Q1-3 and one text file
with code as required above to Assignment 1 Q4 (code); anything else receives 0pt.
Solutions should be typed but high-quality hand-written solutions are acceptable.
JFLAP: You are allowed to use JFLAP to help you solve the assignment. You still need to explain
your solution clearly. Also, make sure you understand what it does; JFLAP will not be available
during exams!
LLMs: You are allowed to use LLMs (Large Language Models), such as ChatGPT, but, again,
they will not be available during exams.
LaTeX: For those interested, the best program for scientific writing is LaTeX. It is far superior to
all the other programs, it is free, and you can start using it in minutes; here is an introduction:
https://tobi.oetiker.ch/lshort/lshort.pdf. It is also available online at https://www.overleaf.com/.