Test on Practical 21

Computer Science 3 - 2005 - Test on Prac 21

Please answer all questions in the spaces provided (four sides). Text books may not be used. Max 40.

This test was intended to check your understanding of important concepts in formal language theory and the construction of grammars, with some questions relating back to earlier sections. It was not well done, pointing to an unwillingness on the part of some students to study the theoretical parts in the book.

1. You must have swotted for hours for this hot spot! What are your name and student number? [1]

          Pat Terry  663T0844

2. A language is to be defined over an alphabet {a, b}. Give a regular expression (in standard notation) that would describe

(a) possible strings in the language that would contain exactly one occurrence of a [2]

b^* a b^*

(b) possible strings in the language that would contain at least one occurrence of a [2]

(a | b)^* a (a | b)^* or b^* a (a | b)^*

3. Repeat question 2 using Cocol notation [4]

          CHARACTERS
             Alphabet = "ab" .   /* not really needed */
          TOKENS
             ExactlyOneA = { "b" } "a" { "b" } .
             AtLeastOneA = { "a" | "b" } "a" { "a" | "b" } .
          or AtLeastOneA = { "b" } "a" { Alphabet } .

4. A language L restricted by a grammar G is defined by L(G) = {N, T, S, P}. Give a concise mathematical definition (not a lengthy English description) of what is meant if one were to claim that a string s was

(a) a Sentential Form [2]

A sentential form is the goal or start symbol, or any string that can be derived from it:

Sentential form = s | S Þ^* s Ù s Î (N È T )^*

(b) a Sentence [2]

A sentence is a string derivable from S that contains only terminals.

Sentence = w | S Þ^* w Ù w Î T ^*

5. (a) What restrictions imposed on the grammar G of question 4 would be needed for us to claim that it was a "reduced grammar". Once again, give a concise mathematical description. [3]

A reduced grammar is one for which

S Þ^* a A b

for some strings a and b, and

A Þ^* g

for some g Î T ^* , A Î N.

(b) Explain very briefly in English what practical effect the restrictions above have. [3]

The first condition imposes the restriction that every non-terminal is "reachable", that is, will appear in at least one sentential form dervable from the goal symbol. The second condition imposes the restriction that every terminal is "terminating", that is, if the relevant productions are applied to it, can lead to a string consisting only of terminals.

6. (a) What is meant by the statement "a grammar is ambiguous"? [2]

There must be at least one string in the language that the grammar defines which can be parsed in at least two ways. There may be many such strings in general (see the example in grammar G9 in section 6.4), but the presence of just one such string is enough to cause ambiguity in the grammar as a whole (which may be one reason why it is hard to establish whether an arbitrary, large, grammar is ambiguous).

(b) Why is the following grammar that describes a list of Roman numbers ambiguous? [2]

            Roman  = Number { "," Number }  "."  EOF .
            Number = StartI | StartV | StartX .
            StartI = "I" ( "V" | "X" | [ "I" [ "I" ] ] ) .
            StartV = "V" [ "I" ] [ "I" ] [ "I" ] .
            StartX = "X" .

Either of the strings V I I and V I can be produced in three ways

One rather hoped that students would see the enormous hint given in the production for StartI, frankly!

            StartV = "V" [ "I" [ "I" [ "I" ] ] ] .

7. In the land of Phool the official language is Phoolish. A Phoolish professor noticed that many of his students had not mastered the syntax of Phoolish well. Tired of correcting their many syntactical mistakes, he decided to challenge the students and asked them to write a program that could check the syntactical correctness of any sentence they wrote. Similar to the nature of Phools, the syntax of Phoolish is also pleasantly simple. Here are the rules:

0. The characters in the language are the 26 letters of the standard western alphabet and the language distinguishes between lower and upper case letters.
1. Every lower case letter is a simple correct sentence.
2. If s is a correct sentence, then so is a sentence consisting of an upper case vowel followed by s, that is As Es Is Os or Us
3. If s and t are correct sentences, then so is a sentence consisting of an upper case consonant followed by st, for example Bst Cst Dst ... Zst.
4. Rules 0 to 3 are the only rules that determine the syntactical correctness of a sentence.

Complete the following Cocol description of this language [8]

        COMPILER Phoolish  /* given in test */
        CHARACTERS         /* if necessary or useful */
        TOKENS             /* if necessary or useful */
        IGNORE CHR(0) .. CHR(31)
        PRODUCTIONS
           Phoolish =
        END Phoolish.

This one is deceptively simple, once you see that it is necessary to choose character sets and token definitions appropriately.

        COMPILER Phoolish $CN
        CHARACTERS
          letter    = "abcdefghijklmnopqrstuvwxyz" .
          consonant = "BCDFGHJKLMNPQRSTVWXYZ" .
          vowel     = "AEIOU" .
        TOKENS
          Lletter    = letter .
          Uconsonant = consonant .
          Uvowel     = vowel .
        IGNORE CHR(0) .. CHR(31)
        PRODUCTIONS
          Phoolish = Lletter | Uvowel Phoolish | Uconsonant Phoolish Phoolish .
        END Phoolish.

Notice that the names of character sets cannot appear in the PRODUCTIONS section, seductive though this may appear. So the following is incorrect:

COMPILER Phoolish $CN CHARACTERS letter = "abcdefghijklmnopqrstuvwxyz" . consonant = "BCDFGHJKLMNPQRSTVWXYZ" . vowel = "AEIOU" . IGNORE CHR(0) .. CHR(31) PRODUCTIONS /* INCORRECT !!!!!!!! */ Phoolish = letter | vowel Phoolish | consonant Phoolish Phoolish . END Phoolish.

8. Here is a grammar that describes a railway train (not the one you were expected to produce in the recent practical, but more or less on the same track (excuse the pun!)).

   COMPILER Train
     PRODUCTIONS
       Train         = "loco" [ "loco" ] ( GoodsPart | PassengerPart ) .
       GoodsPart     = { "open" | "closed" } { "fuel" } "brake" .
       PassengerPart = "coach" "coach" { "coach" } "guardCoach" .
   END Train.

Rewrite the productions to give an equivalent grammar that avoids the use of the meta-brackets [ ] and { } [4]

This question, at least, was quite well done. There are, of course, several ways in which this can be done. Here is one of them:

   COMPILER Train
     PRODUCTIONS
       Train         = "loco" OptionalLoco ( GoodsPart | PassengerPart ) .
       OptionalLoco  = "loco" | e .
       GoodsPart     = FirstTrucks LastTrucks .
       FirstTrucks   = ( "open" | "closed" ) FirstTrucks | e .
       LastTrucks    = "fuel" LastTrucks | "brake" .
       PassengerPart = "coach" "coach" LastCoaches .
       LastCoaches   = "coach" LastCoaches | "guardCoach" .
   END Train.

9. (a) What is meant by the statement "a non-terminal is nullable"? [2]

It is possible to derive an empty string by applying appropriate productions to A and to the sentential forms derivable from A. Of course one may be able to produce non-empty strings as well; in general this will be the case. A non-terminal that can derive only the empty string is not very useful. Mathematically the definition of nullable can be expressed:

A Þ^* e

(b) In the grammar defined by the following productions, identify (giving reasons) the nullable non- terminals [2]

            A = B "a" | C D .
            B = [ A ] .
            C = B { "a" } .
            D = "a" B .

The nullable non-terminals are B (fairly obviously) and C (less obviously, but since B is nullable and the sequence { "a" } is nullable the concatenation of the two is also nullable).