Homework 1: Syntax and semantics
| Assigned: | 2007-01-11 |
| Due: | 2007-01-23 in class |
| Optional: | yes |
This homework assignment is optional, but recommended: It helps you learn the concepts, helps you discover if there is anything you don't understand, and is a good way to prepare for the midterm and final exams. If you turn the homework in by the due date, the TA will check your answers and give feedback.
1. Lexical analysis
The following are the lexical rules for a small language called Epsilon. The first group of lines defines some regular expressions; the second group of lines are the lexical rules, some of which refer to the regular expressions defined previously.
whitespace [ \t]
digit [0-9]
identificator [a-zA-Z_][a-zA-Z0-9_]*
comment (\/\*(.|\n)*\*\/|--.*)
%%
procedure return PROCEDURE;
function return FUNCTION;
begin return BEGIN_T;
end return END;
while return WHILE;
do return DO;
if return IF;
then return THEN;
else return ELSE;
write return WRITE;
read return READ;
return return RETURN;
odd return ODD;
var return VAR;
"!=" return NEQ;
"<" return LT;
">" return GT;
"<=" return LEQ;
">=" return GEQ;
":=" return ASSIGN;
[=+*/();,.\[\]-] return yytext[0];
"-"?{digit}+ return VALUE;
{identificator} return IDENT(yytext);
{whitespace}
{comment}
Give the stream of tokens for the following programs.
1.1.
var a1; begin read a1; write a1 * a1 end.
1.2.
var i;
begin
read i;
while i > 0 do
begin
i := i - 1;
end
end.
2. Syntax analysis
The following syntax rules define the grammar for Epsilon.
/* Resolve the shift/reduce conflics if...then/if...then...else by
shifting else instead of reducing if...then. We do this by giving
higher precedence to else. */
%nonassoc THEN
%nonassoc ELSE
%%
/* The following is the grammar. */
program : decl_list compound_stmt '.'
| compound_stmt '.'
;
decl_list : decl_list declaration
| declaration
;
declaration : vars
| proc
;
proc : head ';' body ';'
;
head : PROCEDURE IDENT formal_params
| FUNCTION IDENT formal_params
;
formal_params : '(' ident_list ')'
| '(' ')'
;
body : vars compound_stmt
| compound_stmt
;
compound_stmt : BEGIN_T statements END
;
ident_list : ident_decl
| ident_decl ',' ident_list
;
vars : VAR ident_list ';'
;
statements : statements ';' statement
| statement
;
statement : ident_expr ASSIGN expression
| IDENT actual_params
| IF condition THEN statement
| IF condition THEN statement ELSE statement
| WHILE condition DO statement
| WRITE expression
| READ ident_expr
| RETURN expression
| compound_stmt
|
;
actual_params : '(' param_list ')'
| '(' ')'
;
param_list : ident_expr
| ident_expr ',' param_list
;
condition : ODD expression
| expression '=' expression
| expression NEQ expression
| expression LT expression
| expression GT expression
| expression LEQ expression
| expression GEQ expression
;
expression : expression '+' expression
| expression '-' expression
| expression '*' expression
| expression '/' expression
| '-' expression %prec '*'
| '+' expression %prec '*'
| '(' expression ')'
| IDENT actual_params
| ident_expr
| VALUE
;
ident_expr : IDENT '[' expression ']'
| IDENT
;
ident_decl : IDENT '[' VALUE ']'
| IDENT
;
2.1.
Draw parse trees for the following programs and program fragments.
2.1(a)
(Same program as in 1.1.)
2.1(b)
while i > 0 do
begin
i := i - 1
end
2.1(c)
if a >= 0 then
if a > 5 then
write 1;
else
write 2
2.1(d)
fahrenheit := 32 + (9 / 5) * celcius;
2.2.
The parse trees you found in 2.1 are very large and complex. Simplifying the tree can make it easier for the compiler to perform type checking, code generation and other tasks. (The simplified tree is sometimes called a syntax tree.) Simplify the parse tree from 2.1(a) in the following ways:
- Remove terminal nodes that were necessary for the parser to find the logical structure of the program, but which give no extra information once the parse tree is in place. For instance, the BEGIN and END terminals can be removed from the compound_stmt subtree because a compound_stmt always begins with BEGIN and ends with END.
- Other terminal nodes give information that can be incorporated into nodes higher up in the tree. For instance, the fragment a1 * a1 yields a parse tree rooted in an expression node; the '*' terminal can be removed if we replace the expression node with a multiplication_expression node.
- Many of the internal nodes that only have only one child are there only to represent "is-a" relationships. These can be removed. For instance, the fragment a1 * a1 yields a parse tree that can be replaced with a multiplication_expression node that has two children, both of which are IDENT(a1).
3. Axiomatic semantics
3.1.
Find the weakest precondition for the assignment y := 3 * x + 11 given the postcondition y >= 20.
3.2.
Consider the following program:
var sum, i, n;
begin
read n;
sum := 0;
i := 0;
while i < n do
begin
i := i + 1;
sum := sum + i;
end;
write sum
end.
Note that the program computes 1 + 2 + ... + n, which can also be written n(n + 1) / 2.
3.2(a)
For the program to be correct, what is the postcondition of the loop (i.e., the predicate that must be true after the last iteration of the loop finishes)?
3.2(b)
Find a reasonable loop invariant. (Note that true is a loop invariant, but not a reasonable one.)
3.2(c)
What is the precondition of the loop corresponding to the loop invariant you found in 3.2(b)? In other words, what is the weakest predicate that must be true for your loop invariant to be satisfied before the first iteration of the loop?
4. Denotational semantics
4.1
Consider the assignment i := i + 1. Let M be the function
that describes the state before the assignment: M(i) is the
value of the variable
4.2
Consider the assignment sum := sum + i. Let M be the function that describes the state before the assignment. What is M' = dsem(sum := sum + i, M)? (Write it in terms of M.)
4.3
Now consider the following loop:
while i < n do begin i := i + 1; sum := sum + i; end
For brevity, we write the loop as while B do L in the following. Let M be the function that describes the state before the assignment. What is M' = dsem(while B do L, M)? (Write it in terms of M).