Test on week 22

Computer Science 3 - 2007 - Test on Prac 22 - Solutions

This should take no longer than about 35 minutes and is simply intended to probe whether you have understood the material in the prac questions. You may use your textbook if you think it will help. Answer on the questions sheet (4 sides).

This test was not well done. Clearly the IS project had a bad effect on far too many people's CS studies this week. Please make sure you study these solutions carefull, and if you don't understand, come and ask for help.

1. Sorry, I am getting past it and have forgotten again. What are your name (surname) and student number? [1]

Pat (Terry) 63T0844

2. A very quick and dirty solution to the stack assembler grammar was picked out of the rubbish bin early this morning. The PRODUCTIONS section read

   PRODUCTIONS
     Program = { Statement } .
     Statement = [ [ "{" ] number [ "}" ] ] Opcode [ number | string | word ] EOL .
     Opcode = "LDA" | "LDV" | "BZE" | (* all the rest were there ... *)
              "STO" | "INPI" | "PRNS" .
   END Program.

What critical message do you think the marker would have scribbled on this, if it had been handed in? (Assume that the TOKENS, COMMENTS, CHARACTERS and IGNORE sections are correct - focus your attention simply on the PRODUCTIONS as given here). [3]

Clearly this one, at least, had been encountered in the practical itself. The worst problem is that no distinction is drawn between the various classes of opcode and as this is so easily specified, it should have been done. A lesser problem is that the optional numbers at the start of a line could have appeared between unbalanced braces. Lastly, the grammar did not allow for statements to be properly labelled.

Here is one better solution; there are plenty more like it.

      PRODUCTIONS
        Program     = { [ Statement ] EOL } .
        Statement   = [ CodeAddress ] [ word ] Operation .
        Operation   =   Opcode1
                      | Opcode2 number
                      | Opcode3 ( number | word )
                      | Opcode4 string .
        Opcode1     = "STO" | "INPI" | "LDV" (* all the ones like this *) .
        Opcode2     = "LDC" | "LDA"  (* all the ones like that *) .
        Opcode3     = ( "BZE" | "BRN" ) ( number | word ) .
        Opcode4     = "PRNS" .
        CodeAddress = number | "{" number "}" .
      END Program.

3. Consider the following grammar expressed in EBNF for describing the progress of an excellent compiler course offered at a University Near You:

Course = Introduction Section { Section } Conclusion .
Introduction = "handout" [ "lecture" ] .
Section = { "lecture" | "practical" "test" | "tutorial" | "handout" } "test" .
Conclusion = [ "panic" ] "examination" .

(a) Is this an LL(1) grammar? If not, where does it break the rules? [3]

Most people seemed to see that this was not an LL(1) grammar, but did not do the analysis correctly The component [ "lecture" ] in Introduction is nullable, and

FIRST( [ "lecture" ] ) = { "lecture" },

while

FOLLOW( [ "lecture" ] ) = { "lecture" , "practical", "tutorial" , "handout" , "test" }.

Note that Introduction is NOT nullable (many people thouhght it was), and that there is no problem with the nullable component of Section, or with the nullable component within Conclusion. Some students said that because a Section could start with "practical" "test" and be followed by "test" that the grammar was non LL(1), but this is not true. Be careful where and how you test Rule 2.

(b) If it breaks the rules, do you suppose a recursive descent parser would have trouble recognizing the history of a course? Explain. [2]

A point I wish to stress here is that a recursive descent parser can handle some non-LL(1) grammars sensibly. The most obvious example of this is the "dangling else" grammar (go and read that part of the text again!). In this particular case, too, there would be no trouble. If the string to be parsed began with "handout" "lecture" the parser would simply associate the lecture with an Introduction, and not with a following Section (this is completely analogous with the dangling else problem, of course).

(c) Do you think it is possible to complete the course described by this grammar without attending any lectures? (Explain your reasoning.) [2]

Yes, indeed. "lecture" only appears within nullable components of the grammar.

(d) What is the shortest form the course could take? [2]

"handout" "test" "examination"

is the minimum one that is acceptable to the parser, one which would result if all the nullable options were exercised.

(d) (Bonus mark) So why are some of you missing some of my lectures? [1]

There is no agreement on the correct solution here. Probably the closest is that some people are just too lazy to attend, or have stayed up too late and over sleep. Crazy, really? I'm willing to bet that if you pay for a drink at the Rat you would drink it; since you've paid for a whole lot of drinks at the Fountain of Knowledge, why don't you drink those too? Those who submitted no solution but who are known always to attend earned the bonus mark anyway. This must be almost the first time I have awarded marks for leaving an answer blank. It is not likely to happen again!

5. Grammars don't get much simpler than this next one - at least to write down. It has one nonterminal A and two terminals ( and )

A = "(" A ")" A | e .

(a) Is it an LL(1) grammar? If not, why not? [4]

Lots of people got this wrong, which suggests that there is still a lot of confusion about where and how to apply tests for the LL(1) condition. The production for A has two alternatives. The FIRST sets for these two alternatives are { "(" } and the empty set F, so Rule 1 is trivially satisfied. Since A is nullable we must test Rule 2 (we only need do this if a non-terminal is nullable; people miss this point). To determine FOLLOW(A) look at the right hand sides of the productions for instances of A. This only happens in the situation where A is followed by ")", so FOLLOW(A) = { ")" } and so Rule 2 is satisfied, and the grammar is LL(1). Some people dream up the idea that FOLLOW(A) = { "(" , ")" } but I don't understand why.

(b) What language do you suppose it describes (write down a few strings derived from A and see the hopefully obvious pattern!) [3]

It's the language of "matched parentheses". Strings of nested parentheses are permitted, such as ( ( ( ) ) ), but so are strings like ( ) ( ) ( ), and of course combinations of these like ( ( ( ) ) ) ( ) ( ( ) ( ) ). Cute, really - amazing what a simple grammar can do. Note that the "language" cannot contain any As in the strings it produces; these can only appear in the intermediate sentential forms. This mistake was made by several people.

It cannot be an ambiguous grammar - we have shown it to be an LL(1) grammar, and hence it must be both deterministic and non-ambiguous.

5. A weary CSC 301 student has been set the task of deriving a grammar to describe a list of trains. Trains fall into three categories - passenger trains, freight trains, and mixed passenger and freight trains. A train may have one (or more) locomotives. Behind the locomotive(s) in mixed and freight trains come one or more freight trucks. A freight train is terminated by a guard's van; a mixed train on the other hand follows the freight trucks with one or more passenger coaches, the last of which should be a passenger brake van. A passenger train has no freight trucks, but only a sequence of coaches and the passenger brake van behind the locomotive(s). Freight trucks come in various forms - open trucks, fuel trucks, coal trucks, cattle trucks and cold trucks. In the interests of safety, there is a regulation to the effect that fuel trucks may not be marshalled immediately behind the locomotives, or immediately in front of a passenger coach.

After many hours of wrestling with this she has come up with the following idea (which happens to be a valid grammar to describe such a train). But has she produced an LL(1) grammar? Justify your answer in detail (don't just guess or wave your arms!) [8]

    COMPILER Train $CN
    /* Grammar for simple railway train
       P.D. Terry, Rhodes University, 2007 */

    IGNORECASE
    COMMENTS FROM "(*" TO "*)" NESTED
    IGNORE CHR(0) .. CHR(31)

    PRODUCTIONS
      Train        = LocoPart [ SafeLoad | "guard" | Passengers ] "." .
      LocoPart     = "loco" { "loco" } .
      Passengers   = { "coach" } "brake" .
      SafeLoad     = SafeTruck RestSafeLoad .
      RestSafeLoad = SafeLoad | "guard" | Passengers | Fuel .
      Fuel         = FuelTruck { FuelTruck } ( SafeLoad | "guard" ) .
      SafeTruck    = "coal" | "open" | "cattle" | "cold" .
      FuelTruck    = "fuel" .
    END Train.

Here is one way to solve the problem - as usual, rewrite the grammar without using the meta-brackets

    COMPILER Train $TF
    /* Grammar for simple railway train
       P.D. Terry, Rhodes University, 2007 */

    IGNORECASE
    COMMENTS FROM "(*" TO "*)" NESTED
    IGNORE CHR(0) .. CHR(31)

    PRODUCTIONS
      Train        = LocoPart Payload "." .                      /* 1           */
      Payload      = SafeLoad | "guard" | Passengers |  .        /* 2 3 4       */
      LocoPart     = "loco" Locos .                              /* 5           */
      Locos        = "loco" Locos | .                            /* 6 7         */
      Passengers   = Coaches "brake" .                           /* 9           */
      Coaches      = "coach" Coaches | .                         /* 9 10        */
      SafeLoad     = SafeTruck RestSafeLoad .                    /* 11          */
      RestSafeLoad = SafeLoad | "guard" | Passengers | Fuel .    /* 12 13 14    */
      Fuel         = FuelTruck Fuels FuelTail .                  /* 15          */
      FuelTail     = SafeLoad | "guard" .                        /* 16 17       */
      Fuels        = "fuel" Fuels | .                            /* 18 19       */
      SafeTruck    = "coal" | "open" | "cattle" | "cold" .       /* 20 21 22 23 */
      FuelTruck    = "fuel" .                                    /* 24          */
    END Train.

To check Rule 1 we look at the productions for Payload, RestSafeLoad, FuelTail, SafeTruck. Several of these obviously satisfy Rule 1, but for the record here are the full details:

First(Payload1) = FIRST(SafeLoad) = { "coal", "open", "cattle", "cold" }
First(Payload2) = { " guard" }
First(Payload3) = FIRST(Passengers) = {"brake", "coach" }

First(RestSafeLoad1) = FIRST(SafeLoad) = { "coal", "open", "cattle", "cold" }
First(RestSafeLoad2) = { "guard" }
First(RestSafeLoad3) = FIRST(Passengers) = {"brake", "coach" }
First(RestSafeLoad4) = First(Fuel) = { "fuel" }

First(FuelTail1) = FIRST(SafeLoad) = { "coal", "open", "cattle", "cold" }
First(FuelTail2) = { "guard" }

These productions all satisfy Rule 1; and the productions for SafeTruck clearly satisfy Rule 1, as the alternatives are all simple distinct terminals.
To check Rule 2 we must examine FIRST and FOLLOW sets for the nullable non-terminals Payload, Locos, Coaches, Fuels. These are easily seen to be

FIRST(Payload) = { "guard", "brake", "coach", "coal", "open", "cattle", "cold" }
FOLLOW(Payload" = { "." }

FIRST(Locos) = { "loco" }
FOLLOW(Locos) = { ".", "guard", "brake", "coach", "coal", "open", "cattle", "cold" }

FIRST(Coaches) = { "coach" }
FOLLOW(Coaches) = { "brake" }

FIRST(Fuels) = { "fuel" }
FOLLOW(Fuels) = { "guard", "coal", "open", "cattle", "cold" }

so Rule 2 is satisfied for all the nullable non-terminals.

It is worth an anecdote - I have used the exercise of finding an LL(1) grammar for a safe train over many years (at discreet intervals). It took quite a long time before somebody found one, although once one has seen it, the solution appears straightforward. Many attempts founder - requiring extra trucks, at least two of something or other, and so on.