Computer Science 301 - 2014 - Test for week 3 - Solutions

This should take no longer than 25 minutes and is simply intended to probe whether you have understood the material in the prac questions. YOU MAY NOT USE YOUR TEXTBOOK, NOTES OR A COMPUTER. Answer on the question sheet (3 sides). You may write in pencil if you like.

This test was very badly done in general (with some notable exceptions). Please study the answers carefully.

1. You were expecting this one, I bet! What are your name (surname) and student number? [1]

Pat (Terry) 63T0844

2. Part of the grammar for the extended Parva language you met in the practical might have been expressed as follows:

   PRODUCTIONS
     Parva            = "void" "main" "(" ")" Block .
     Block            = "{" { Statement } "}" .
     Statement        = (   Block | WhileStatement | IfStatement | RepeatStatement
                          | ReadStatement | WriteStatement | Assignment
                          | ReturnStatement | EmptyStatement
                          | ConstDeclaration | VarDeclarations
                        ).
     IfStatement      = "if" "(" Condition ")" Statement
                        { "elseif" "(" Condition ")" Statement }
                        [ "else" Statement ] .
     WhileStatement   = "while" "(" Condition ")" Statement .
     RepeatStatement  = "repeat" { Statement } "until" "(" Condition ")" ";" .

What (if anything) would be the effect of replacing the productions as follows? [4]

     Parva            = "void" "main" "(" ")" Block .
     Block            = "{" { Declaration | Statement } "}" .
     Declaration      = ConstDeclaration | VarDeclarations .
     Statement        = (   WhileStatement | IfStatement | RepeatStatement
                          | ReadStatement | WriteStatement | Assignment
                          | ReturnStatement | EmptyStatement
                        ).
     IfStatement      = "if" "(" Condition ")" Block
                        { "elseif" "(" Condition ")" Block }
                        [ "else" Block ] .
     WhileStatement   = "while" "(" Condition ")" Block .
     RepeatStatement  = "repeat" Block "until" "(" Condition ")" ";" .

Very few submissions got this right, so it is useful to comment on some of the misconceptions. The main point that was quite often missed is that the second set of productions forces you to have braces around the subsidiary statement (or statements) in if, while and repeat statements. You have to write, for example

if (a == b) { c = d; } else { c = f; }

rather than the simpler, C-like code

if (a == b) c = d; else c = f;

which is allowed by the first set of productions. The first set does not forbid writing the braces, of course, as Statement is still able to derive Block in that set. However, the point of all this is that by requiring the braces one eliminates the "dangling else" ambiguity and non-LL(1) problem from which the first set of productions suffers. If you don't believe me, try submitting the grammars to Coco/R. Another point of interest is that, in the new set of productions Statement can no longer derive Block, so that blocks cannot be nested - rather silly code like

{ a = b; c = d; { e = f; } g = h; }

would be allowed by the first set of productions, but not by the second set.

Wrong answers received might have included:

(a) "The second set requires you to declare variables before you use them". Alas, no. Both sets are completely equivalent in that respect, and allow declarations and statements to be freely mingled.

In general, productions like

     A = B | C .
     B = "x" | "y" .
     C = "p" | "q" .

and

     A = "x" | "y" | "p" | "q" .
     A = "y" | "p" | "x" | "q" .

or indeed, any other permutations of these, do not force any ordering at all.

(b) "The first set allow you to write meaningless code like

if (a == b) int c;

with only a declaration instead of a statement". Well, yes, but the same apparently useless/meaningless code is allowed by the second set as

if (a == b) { int c; }

Few languages will prevent the dedicated idiots among us from writing meaningless or misleading code, though I suggest C++ is worse in this respect than some other languages I could mention. What do you make of the statements in this gem, which compiles quite happily and illustrates several potentially hard to find bugs?

   int silly() { return (int) silly + silly(); }

   void main () {
     int a, b;
     if (a);           while (silly);         while (silly());
     while (a) {}      if (a) 3;              if (a) a+b;
     if (a) int x; else int y;                42;      -6;
     silly;            silly();               a+++--b;
     ++a + + + - - --b;
   }

3. Show how one might specify a numeric literal in Cocol, allowing for the possibility of numbers of any of the following forms [4]

      123   1.23   1.   .23   1.2E12  1.2E-12   1E6   1E-6   .234E6   .234E-6

Many students had not appreciated two key ideas:

(a) We are looking for a TOKENS definition of a numeric literal, not a PRODUCTIONS definition. A numeric literal is a single token formed of contiguous digits, with no spaces permitted within it (the sequences of tokens permitted and defined by PRODUCTIONS rules can have spaces between them).

(b) One cannot define one token in terms of others. This point was made in lectures, but probably not emphasized enough.

All of this means that we have to find one (big) regular expression-like way of specifying the pattern of a numeric literal, and, since there are many permissible forms of this, the definition is, indeed, quite complicated. Here are two ways of doing it:

    CHARACTERS
      digit  = "0123456789" .

    TOKENS
      number =   "." digit { digit } [ ( "e" | "E" ) [ "+" | "-" ] digit { digit } ]
               | digit { digit } [ "." { digit } ] [ ( "e" | "E" ) ["+" | "-"] digit { digit } ] .

or

    TOKENS
      number =   ( "." digit { digit } | digit { digit } [ "." { digit } ] )
                    [ ( "e" | "E" ) ["+" | "-"] digit { digit } ] .

It takes some skill (and experience) to make sure that one gets all the options correct, while ruling out a definition that allows totally erroneous numbers like 1.2.3.45, for example. Here is a definition that looks seductively correct:

    TOKENS
      number = [ digit { digit } ] [ "." { digit } ] [ ( "e" | "E" ) ["+" | "-"] digit { digit } ] .

but this allows a "number" to be completely "empty" - devoid of any characters at all.

4. Develop a Cocol grammar for describing a list of Boolean expressions, like those you should remember from Boolean Algebra courses you may have taken. In this context, allow your Boolean expressions to be terminated with an = operator, and to include parentheses, single-letter variables, the constants FALSE and TRUE (sometimes written as 0 and 1), and the operators AND (written either as AND or "&&" or as a "dot", or simply omitted), OR (written either as OR or as "||" or as a "plus") and NOT, written either as a prefix NOT or as a suffix apostrophe. So some examples of Boolean expressions might be

         a AND B OR (C OR NOT D) =     a . b + (c + d') =     a b + (c + d') =
         NOT (a OR b) AND TRUE =       (a + b)' . 1 =         (a + b)' AND 1 =
         b AND NOT C AND D =           b . c' . d =           b c' d =

Note that there is a precedence ordering among Boolean operators - parentheses take precedence over NOT, which takes precedence over AND, which takes precedence over OR. [8]

This was very disappointing. I had hoped that after you had seen the "calculator" task and the "Parva" task in the prac that you would have realized that what was required was a rather straightforward variation on the former, with the operators NOT, AND and OR taking the place of leading minus, * and + respectively. But no, the answers received ranged from good ones from students who had clearly done the practical to complete rubbish, presumably from people who had not. Ah well.

Here is a complete solution - actually this goes a bit further to allow NOT to be represented by ! as well as by NOT. Note that the NOT/! operator appears as a prefix (before the operand) while the ' operator appears as a suffix (after the operand) so they are not interchangeable, as some of you thought.

    COMPILER Bool $CN

    CHARACTERS
      letter     = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz" .

    TOKENS
      variable   = letter .

    IGNORE CHR(0) .. CHR(31)

    PRODUCTIONS
      Bool       = { Expression "=" } EOF .
      Expression = Term { Or Term } .
      Term       = Factor { [ And ] Factor } .
      Factor     = Not Factor | Primary { "'" } .
      Primary    = True | False | variable | "(" Expression ")" .
      True       = "TRUE"   |  "1" .
      False      = "FALSE"  |  "0" .
      And        = "AND"  |  "&&"  |  "." .
      Or         = "OR"   |  "||"  |  "+" .
      Not        = "NOT"  | "!" .
    END Bool.

5. The PRODUCTIONS sections in your grammar should have made use of the EBNF "meta brackets" { } and [ ] in the production rules. How would you rewrite the PRODUCTIONS section without making any use of parentheses ( ) or of either of the { } or [ ] form of meta brackets? [6]

Hint: use right recursive productions, and assume that Cocol has a way of representing the empty string e.

This was extremely badly done, yet the transformations required are always easy. If you have a right hand side like

P = Something { Option } SomethingElse .

introduce a new nonterminal RepeatedOptions in place of { Option }

and then rewrite productions as

P = Something RepeatedOptions SomethingElse .
RepeatedOptions = Option RepeatedOptions | e .

Similarly, and even more simply, if you have something like

Q = Startoff [ OneOption ] Finish .

introduce a new nonterminal Possible in place of [ OneOption]

and then rewrite productions as

Q = Startoff Possible Finish .
Possible = OneOption | e .

With that idea, the PRODUCTIONS section given above might be rewritten mechanically as

    PRODUCTIONS
      Bool         = Expressions EOF .
      Expressions  = Expression "=" Expressions | e
      Expression   = Term MoreTerms .
      MoreTerms    = Or Term MoreTerms | e  .
      Term         = Factor MoreFactors .
      MoreFactors  = And Factor MoreFactors | e .
      Factor       = Not Factor | Primary Quotes .
      Quotes       = "'" Quotes | e .
      Primary      = True | False | variable | "(" Expression ")" .
      True         = "TRUE"   |  "1" .
      False        = "FALSE"  |  "0" .
      And          = "AND"  |  "&&"  |  "."  |  e .
      Or           = "OR"   |  "||"  |  "+" .
      Not          = "NOT"  | "!" .
    END Bool.


Home  © P.D. Terry