Computer Science 3 - 2004 - Test on Prac 21 - Solutions

This should take no longer than 25 minutes and is simply intended to probe whether you have understood the material in the prac questions. You may use your textbook if you think it will help. Answer on the questions sheet (3 sides).

1. You were expecting this one, I bet! What are your name (surname) and student number? [1]

Pat (Terry) 63T0844

2. What do you understand by the term ambiguous grammar? [2]

There must be at least one string in the language that the grammar defines which can be parsed in at least
two ways. There may be many such strings in general (see the example in grammar G9 in section 6.4), but
the presence of just one such string is enough to cause ambiguity in the grammar as a whole (which may be
one reason why it is hard to establish whether an arbitrary, large, grammar is ambiguous).

3. Part of the grammar for the extended Parva language you met in the practical might have been expressed as follows:

   PRODUCTIONS
     Block            = "{" { Statement } "}" .
     Statement        = (   Block | WhileStatement | IfStatement | RepeatStatement
                          | ReadStatement | WriteStatement | Assignment
                          | ReturnStatement | EmptyStatement
                          | ConstDeclaration | VarDeclarations
                        ).
     IfStatement      = "if" "(" Condition ")" Statement
                        { "elsif" "(" Condition ")"  Statement } [ "else" Statement ] .
     WhileStatement   = "while" "(" Condition ")" Statement .
     RepeatStatement  = "repeat" { Statement } "until" "(" Condition ")" ";" .

What (if anything) would be the effect of replacing the productions as follows? [4]

     Parva            = "void" "main" "(" ")" Block .
     Block            = "{" { Declaration | Statement } "}" .
     Declaration      = ConstDeclaration | VarDeclarations .
     Statement        = (   WhileStatement | IfStatement | RepeatStatement
                          | ReadStatement | WriteStatement | Assignment
                          | ReturnStatement | EmptyStatement
                        ).
     IfStatement      = "if" "(" Condition ")" Block
                        { "elsif" "(" Condition ")" Block } [ "else" Block ] .
     WhileStatement   = "while" "(" Condition ")" Block .
     RepeatStatement  = "repeat" Block "until" "(" Condition ")" ";" .

Very few people got this completely right, so it is useful to comment on some of the misconceptions. The second set of productions forces one to have braces around the subsidiary statement (or statements) in if, while and repeat statements. One has to write, for example

if (a == b) { c = d; } else { c = f; }

rather than the simpler, C-like code

if (a == b) c = d; else c = f;

which is allowed by the first set of productions. The first set does not forbid writing the braces, of course, as Statement is still able to derive Block. However, the point of all this is that by requiring the braces one eliminates the "dangling else" ambiguity and non-LL(1) problem from which the first set of productions suffers. If you don't believe me, try submitting the grammars to Coco/R.

Wrong answers received included

(a) "The second set requires you to declare variables before you use them". Alas, no. Both sets are completely equivalent in that respect, and allow declarations and statements to be freely mingled.

In general, productions like

     A = B | C .
     B = "x" | "y" .
     C = "p" | "q" .

and

     A = "x" | "y" | "p" | "q" .
     A = "y" | "p" | "x" | "q" .

or indeed, any other permutations of these, do not force any ordering at all.

(b) "The first set allow you to write meaningless code like

if (a == b) int c;

with only a declaration instead of a statement". Well, yes, but the same apparently useless/meaningless code is allowed by the second set as

if (a == b) { int c; }

Few languages will prevent the dedicated idiots among us from writing meaningless or misleading code, though I suggest C++ is worse in this respect than some other languages I could mention. What do you make of the statements in this gem, which compiles quite happily and illustrates several potentially hard to find bugs?

   int silly() { return (int) silly + silly(); }

   void main () {
     int a, b;
     if (a);           while (silly);         while (silly());
     while (a) {}      if (a) 3;              if (a) a+b;
     if (a) int x; else int y;                42;      -6;
     silly;            silly();               a+++--b;
     ++a + + + - - --b;
   }

4. In the prac you have just completed you were asked to produce a Cocol grammar to describe a set of BNF productions. Here are two attempts at doing this. Are they equivalent? If not, can you find an example of a BNF production rule that one would accept and the other would reject? [3]

   COMPILER BNF $CN
   /* Grammar to describe BNF productions */

   CHARACTERS
     letter = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz" .
     alpha  = letter + "0123456789_" .
     eol    = CHR(10) .
     space  = CHR(32) .

   IGNORE CHR(9) + CHR(11) .. CHR(31)

   TOKENS
     EOL         = eol .
     nonterminal = "<" letter { alpha | space } ">" .
     terminal    = alpha { alpha } .

   PRODUCTIONS /* Attempt 1 */
     BNF         = { Production } EOF .
     Production  = nonterminal "::=" Expression EOL .
     Expression  = Term { "|" Term } .
     Term        = Factor { Factor } .
     Factor      = nonterminal | terminal | "eps" .
   END BNF.

   PRODUCTIONS /* Attempt 2 */
      BNF        = { Production } EOF .
      Production = nonterminal "::=" Expression EOL .
      Expression = Term { "|" Term } .
      Term       = Factor { Factor } | "eps" .
      Factor     = nonterminal | terminal .
   END BNF.

Many people missed the point - the "eps" is a terminal here - it is not being used as a meta symbol meaning "null string".

They are not equivalent. Attempt 1 will allow rules like

<nonterm> ::= term eps term eps eps eps

while Attempt 2 will only allow one eps to appear in an option, as in

<nonterm> ::= term | eps

It will, however, allow nonsense like

<nonterm> ::= term | eps | eps | eps

If one wants to restrict this (and still remain LL(1)) it turns out to be a bit awkward:

   PRODUCTIONS /* attempt 3 */
      BNF        = { Production } EOF .
      Production = nonterminal "::=" Expression EOL .
      Expression = [ "eps" "|" ] Term { "|" Term } .
      Term       = Factor { Factor } .
      Factor     = nonterminal | terminal .
   END BNF.

5. The PRODUCTIONS sections above make use of the EBNF "meta brackets" { } in the production rules. How would you rewrite the PRODUCTIONS section for Attempt 2 without making use of parentheses ( ) or of either of the {} or [] form of meta brackets? [5]

This was badly done, yet the transformations required are always easy. If you have a right hand side like

P = Something { Option } SomethingElse .

introduce a new nonterminal RepeatedOptions in place of { Option }

and then rewrite things as

P = Something RepeatedOptions SomethingElse .
RepeatedOptions = Option RepeatedOptions | e .

Similarly, and even more simply, if you have something like

Q = Startoff [ OneOption ] Finish .

introduce a new nonterminal Possible in place of [ OneOption ]

and then rewrite things as

Q = Startoff Possible Finish .
Possible = OneOption | e .

With that idea, the last of the PRODUCTIONS sections given above might become

   PRODUCTIONS /* Attempt 2 modified */
      BNF          = Productions EOF .
      Productions  = Production Productions | e .
      Production   = nonterminal "::=" Expression EOL .
      Expression   = Term OtherTerms .
      OtherTerms   = "|" Term OtherTerms | e .
      Term         = Factor OtherFactors | "eps" .
      OtherFactors = Factor OtherFactors | e .
      Factor       = nonterminal | terminal .
   END BNF.

6. Complete the following Cocol specification for a program that would recognize validly constructed railway trains. Railway trains are headed by one or two locomotives, and fall into two categories - passenger trains and goods trains. Goods trains consist of a series of open and closed trucks, in any mixture, followed by optional petrol tankers and then a mandatory brake van. Passenger trains consist of at least three coaches, but maybe more, though the last one of all must be a guard's coach. [5]

   COMPILER Spoornet
     PRODUCTIONS
       Spoornet = { Train } EOF.
       /* ... */
   END Spoornet.

Several people did not realise that a train needs a locomotive to pull it, and left the locomotives out of it altogether; others did not realise that a guard's coach is also a coach (semantically) and so can contribute to the minimum of three coaches needed to make up a passenger train. Ah well. I suppose some of you have never travelled by train?

   COMPILER Spoornet
     PRODUCTIONS
       Spoornet = { Train } EOF.
       Train = "loco" [ "loco" ] ( GoodsPart | PassengerPart ) .
       GoodsPart = { "open" | "closed" } { "petrol" } "brake" .
       PassengerPart = "coach" "coach" { "coach" } "guardCoach" .
   END Spoornet.

It may be of interest to comment on a few common mistakes. Some people wrote a production like this

Train = LocoPart GoodsPart | PassengerPart .

forgetting that the "precedence" of the | operator is low - which means that the production is equivalent to

Train = ( LocoPart GoodsPart) | ( PassengerPart ) .

leaving passenger trains without any locomotives. Some people wrote productions something like

GoodsPart = { { "open" } { "closed" } } [ { "petrol" } ] "brake" .

which rather overdoes the bracketing, and in fact yields non LL(1) grammars. Finally, a common mistake was

GoodsPart = { "open" } { "closed" } [ "petrol" ] "brake" .

which (a) forces all the open trucks to come before the closed trucks and (b) only allows one petrol truck.

It might also be useful to demonstrate an LL(1) analysis. Rewrite the productions as

   COMPILER Spoornet
     PRODUCTIONS
       Spoornet      = Trains EOF.
       Trains        = Train Trains | e .
       Train         = "loco" secondLoco Tail .
       secondLoco    = "loco" | e .
       Tail          = GoodsPart | PassengerPart  .
       GoodsPart     = Trucks Tankers "brake" .
       Trucks        = Truck  Trucks | e.
       Truck         = "open" | "closed" .
       Tankers       = "petrol" Tankers | e .
       PassengerPart = "coach" "coach" MoreCoaches  "guardCoach" .
       MoreCoaches   = "coach" MoreCoaches | e .
   END Spoornet.

To apply Rule 1 we look at the productions for Trains, secondLoco, Tail, Trucks, Truck, Tankers, MoreCoaches - all of which have alternatives. But it is easy to see that the alternatives in each case have disjoint sets. To apply Rule 2 we look at the productions for Trains, secondLoco, Trucks, Tankers, MoreCoaches - all of which are nullable. But

    FIRST(Trains) = { "loco" }
    FOLLOW(Trains) = { EOF }

    FIRST(secondLoco) = { "loco" }
    FOLLOW(secondLoco) = { "open", "closed", "petrol", "coach", "brake" }

    FIRST(Trucks) = { "open", "closed" }
    FOLLOW(Trucks) = { "petrol", "brake" }

    FIRST(Tankers) = { "petrol" }
    FOLLOW(Tankers) = { "brake" }

    FIRST(MoreCoaches) = { "coach" }
    FOLLOW(MoreCoaches) = { "guardCoach" }

so Rule 2 is satisfied in all cases where we have to apply it.


Home  © P.D. Terry