Computer Science 3 - 2007 - Test on Prac 21 - Solutions

This should take no longer than 25 minutes and is simply intended to probe whether you have understood the material in the prac questions. You may use your textbook if you think it will help. Answer on the question sheet (3 sides).

1. You were expecting this one, I bet! What are your name (surname) and student number? [1]

Pat (Terry) 63T0844

2. What do you understand by the term ambiguous grammar? [2]

There must be at least one string in the language that the grammar defines which can be parsed in at least
two ways. There may be many such strings in general (see the example in grammar G9 in section 6.4), but
the presence of just one such string is enough to cause ambiguity in the grammar as a whole (which may be
one reason why it is hard to establish whether an arbitrary, large, grammar is ambiguous).

3. Part of the grammar for the extended Parva language you met in the practical might have been expressed as follows:

   PRODUCTIONS
     Block            = "{" { Statement } "}" .
     Statement        = (   Block | WhileStatement | IfStatement | DoWhileStatement
                          | ReadStatement | WriteStatement | Assignment
                          | ReturnStatement | EmptyStatement
                          | ConstDeclaration | VarDeclarations
                        ).
     IfStatement      = "if" "(" Condition ")" Statement [ "else" Statement ] .
     WhileStatement   = "while" "(" Condition ")" Statement .
     DoWhileStatement = "do" Statement "while" "(" Condition ")" ";" .

What (if anything) would be the effect of replacing the productions as follows? [4]

     Parva            = "void" "main" "(" ")" Block .
     Block            = "{" { Declaration | Statement } "}" .
     Declaration      = ConstDeclaration | VarDeclarations .
     Statement        = (   WhileStatement | IfStatement | DoWhileStatement
                          | ReadStatement | WriteStatement | Assignment
                          | ReturnStatement | EmptyStatement
                        ).
     IfStatement      = "if" "(" Condition ")" Block [ "else" Block ] .
     WhileStatement   = "while" "(" Condition ")" Block .
     DoWhileStatement = "do" Block "while" "(" Condition ")" ";" .

Very few people got this completely right, so it is useful to comment on some of the misconceptions. The second set of productions forces one to have braces around the subsidiary statement (or statements) in if, while and do-while statements. One has to write, for example

if (a == b) { c = d; } else { c = f; }

rather than the simpler, C-like code

if (a == b) c = d; else c = f;

which is allowed by the first set of productions. The first set does not forbid writing the braces, of course, as Statement is still able to derive Block. However, the point of all this is that by requiring the braces one eliminates the "dangling else" ambiguity and non-LL(1) problem from which the first set of productions suffers. If you don't believe me, try submitting the grammars to Coco/R.

Wrong answers received included

(a) "The second set requires you to declare variables before you use them". Alas, no. Both sets are completely equivalent in that respect, and allow declarations and statements to be freely mingled.

In general, productions like

     A = B | C .
     B = "x" | "y" .
     C = "p" | "q" .

and

     A = "x" | "y" | "p" | "q" .
     A = "y" | "p" | "x" | "q" .

or indeed, any other permutations of these, do not force any ordering at all.

(b) "The first set allow you to write meaningless code like

if (a == b) int c;

with only a declaration instead of a statement". Well, yes, but the same apparently useless/meaningless code is allowed by the second set as

if (a == b) { int c; }

Few languages will prevent the dedicated idiots among us from writing meaningless or misleading code, though I suggest C++ is worse in this respect than some other languages I could mention. What do you make of the statements in this gem, which compiles quite happily and illustrates several potentially hard to find bugs?

   int silly() { return (int) silly + silly(); }

   void main () {
     int a, b;
     if (a);           while (silly);         while (silly());
     while (a) {}      if (a) 3;              if (a) a+b;
     if (a) int x; else int y;                42;      -6;
     silly;            silly();               a+++--b;
     ++a + + + - - --b;
   }

4. In the prac you have just completed you were asked to produce a Cocol grammar to describe a set of BNF productions. Here are two attempts at doing this. Are they equivalent? If not, can you find an example of a BNF production rule that one would accept and the other would reject? [3]

   COMPILER BNF $CN
   /* Grammar to describe BNF productions */

   CHARACTERS
     letter = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz" .
     alpha  = letter + "0123456789_" .
     eol    = CHR(10) .
     space  = CHR(32) .

   IGNORE CHR(9) + CHR(11) .. CHR(31)

   TOKENS
     EOL         = eol .
     nonterminal = "<" letter { alpha | space } ">" .
     terminal    = alpha { alpha } .

   PRODUCTIONS /* Attempt 1 */
     BNF         = { Production } EOF .
     Production  = nonterminal "::=" Expression EOL .
     Expression  = Term { "|" Term } .
     Term        = Factor { Factor } .
     Factor      = nonterminal | terminal | "eps" .
   END BNF.

   PRODUCTIONS /* Attempt 2 */
      BNF        = { Production } EOF .
      Production = nonterminal "::=" Expression EOL .
      Expression = Term { "|" Term } .
      Term       = Factor { Factor } | "eps" .
      Factor     = nonterminal | terminal .
   END BNF.

Many people missed the point - the "eps" is a terminal here - it is not being used as a meta symbol meaning "null string".

They are not equivalent. Attempt 1 will allow rules like

<nonterm> ::= term eps term eps eps eps

while Attempt 2 will only allow one eps to appear in an option, as in

<nonterm> ::= term | eps

It will, however, allow nonsense like

<nonterm> ::= term | eps | eps | eps

If one wants to restrict this (and still remain LL(1)) it turns out to be a bit awkward:

   PRODUCTIONS /* attempt 3 */
      BNF        = { Production } EOF .
      Production = nonterminal "::=" Expression EOL .
      Expression = [ "eps" "|" ] Term { "|" Term } .
      Term       = Factor { Factor } .
      Factor     = nonterminal | terminal .
   END BNF.

5. The PRODUCTIONS sections above make use of the EBNF "meta brackets" { } in the production rules. How would you rewrite the PRODUCTIONS section for Attempt 2 without making use of parentheses ( ) or of either of the {} or [] form of meta brackets? [5]

This was badly done, yet the transformations required are always easy. If you have a right hand side like

P = Something { Option } SomethingElse .

introduce a new nonterminal RepeatedOptions in place of { Option }

and then rewrite things as

P = Something RepeatedOptions SomethingElse .
RepeatedOptions = Option RepeatedOptions | e .

Similarly, and even more simply, if you have something like

Q = Startoff [ OneOption ] Finish .

introduce a new nonterminal Possible in place of [ OneOption ]

and then rewrite things as

Q = Startoff Possible Finish .
Possible = OneOption | e .

With that idea, the last of the PRODUCTIONS sections given above might become

   PRODUCTIONS /* Attempt 2 modified */
      BNF          = Productions EOF .
      Productions  = Production Productions | e .
      Production   = nonterminal "::=" Expression EOL .
      Expression   = Term OtherTerms .
      OtherTerms   = "|" Term OtherTerms | e .
      Term         = Factor OtherFactors | "eps" .
      OtherFactors = Factor OtherFactors | e .
      Factor       = nonterminal | terminal .
   END BNF.

6. Complete the following Cocol specification for a program that would recognize a list of pop songs of my era (your parents might have sung along to some of them too). Here is a sample of such a list (I don't remember the years of all of them, so one might leave the year out - and some were recorded several times): [5]

"Please, Please Me" (The Beatles) [1963 ] .
"I wanna hold your hand" (The Beatles) .
"She loves you, yeah, yeah, yeah" ( The Beatles ).
"All I have to do is dream" (Everly Brothers) , (Roy Orbison) [ 1959 ].
"All I have to do is dream" (Richard Chamberlain) [ 1963 ].
"Sheila" (Tommy Roe) [ 1962 ] .
"Young Girl" (Gary Puckett and the Union Gap) .
"You ain't nothing but a Hound Dog" (Elvis Presley) [ 1956 ] .

The trick here is to define appropriate character sets and tokens, then it all becomes very easy:

   COMPILER Nostalgia

     CHARACTERS
       Control  = CHR(0) .. CHR(31) .
       InTitle  = ANY - Control - '"' .
       InArtist = ANY - Control - "()" .
       Digits   = "0123456789" .
     TOKENS
       Title  = '"' InTitle { InTitle } '"' .
       Artist = "(" InArtist { InArtist } ")" .
       Year   = Digit Digit Digit Digit .

     IGNORE Control

     PRODUCTIONS
       Nostalgia = { PopTune } EOF.
       PopTune   = Title Artist { "," Artist } [ "[" Year "]" ] .
     END Nostalgia /* they don't write them like they used to do! */.

There are, of course, other ways to try to do this. The above solution shows how, since the tokens are clearlu demarcated by quotes at either end or by parentheses at either end, it makes some sense to treat Title and Artist as simple tokens - which, unusually for these sorts of grammars, can contain embedded spaces - as well as all sorts od other printable characters, perhaps, like single quotes, digits, hyphens and the like.

Several people missed the fact that a year would have exactly four digits, and was best specified as above. The examples showed a variety of ways of adding the year, with suggestive spaces - for example [1963 ] and [ 1962 ] - so that it is preferable to leave the Year token as the digit part, and add the enclosing [ ] brackets as shown above, rather than defining, for example

     TOKENS
       Year   = "[" Digit Digit Digit Digit "]" .

Some submissions suggested a grammar with productions like

     PRODUCTIONS
       Nostalgia = { PopTune } EOF.
       PopTune   = Title Artist [ "[" Year "]" , { "[" Year "]" } ] .
     END Nostalgia /* they don't write them like they used to do! */.

but the examples suggested that a tune could be recorded by several artists on one year, rather than by one artist in several years. Come to think of it, a better way of recording tunes would be to record the artist along with the year, leading to a grammar like

     PRODUCTIONS
       Nostalgia   = { PopTune } EOF.
       PopTune     = Title Recording { "," Recording } .
       Recording   = Artist [ "[" Year "]" ] .
     END Nostalgia /* they don't write them like they used to do! */.

Would this grammar also describe the sample list of tunes above? If not, where not?


Home  © P.D. Terry