Computer Science 3 - 2008 - Test on Prac 21

Please answer all questions in the spaces provided (four sides). Text books may not be used. Max 40.

This test was intended to check your understanding of important concepts in formal language theory and the construction of grammars, with some questions relating back to earlier sections. It was not well done, pointing to an unwillingness on the part of some students to study the theoretical parts in the book.

1. You must have swotted for hours for this hot spot! What are your name and student number? [1]

          Pat Terry  63T0844

2. A language is to be defined over an alphabet {a, b}. Give a regular expression (in standard notation) that would describe

(a) possible strings in the language that would contain exactly one occurrence of a [2]

b* a b*

(b) possible strings in the language that would contain at least one occurrence of a [2]

(a | b)* a (a | b)*         or     b* a (a | b)*

3. Repeat question 2 using Cocol notation [4]

          CHARACTERS
             Alphabet = "ab" .   /* not really needed */
          TOKENS
             ExactlyOneA = { "b" } "a" { "b" } .
             AtLeastOneA = { "a" | "b" } "a" { "a" | "b" } .
          or AtLeastOneA = { "b" } "a" { Alphabet } .

4. A language L restricted by a grammar G is defined by L(G) = {N, T, S, P}. Give a concise mathematical definition (not a lengthy English description) of what is meant if one were to claim that a string s was

(a) a Sentential Form [2]

A sentential form is the goal or start symbol, or any string that can be derived from it:

Sentential form =  s |  S Þ* s Ù s Î (N  È T )* 

(b) a Sentence [2]

A sentence is a string derivable from S that contains only terminals.

Sentence =  w |  S Þ* w Ù w Î T * 

5. (a) What restrictions imposed on the grammar G of question 4 would be needed for us to claim that it was a "reduced grammar". Once again, give a concise mathematical description. [3]

A reduced grammar is one for which

S Þ* a A b

for some strings a and b, and

A Þ* g

for some g Î T * , A Î N.

(b) Explain very briefly in English what practical effect the restrictions above have. [3]

The first condition imposes the restriction that every non-terminal is "reachable", that is, will appear in at least one sentential form dervable from the goal symbol. The second condition imposes the restriction that every terminal is "terminating", that is, if the relevant productions are applied to it, can lead to a string consisting only of terminals.

6. Here is an attempt to describe the PVM assembler language you fought with last week. The PRODUCTIONS section could be greatly improved. Suggests how this could be done. [5]

    COMPILER PVMAsm
    /* Grammar for subset of PVM assembler language
       P.D. Terry, Rhodes University, 2008 */
    CHARACTERS
      Control     = CHR(0) .. CHR(31) .
      Printable   = ANY - Control .
      Digits      = "0123456789" .
      LF          = CHR(10) .
      CR          = CHR(13) .
    IGNORE CHR(9) + CHR(11) .. CHR(13)
    TOKENS
      Number      = [ "-" ] Digits { Digits } .
      String      = '"' { Printable } '"' .
      EOL         = LF .
      Comment     = ";" { Printable } CR .
    PRODUCTIONS
      PVMAsm      = { Statement } .
      Statement   = [ Number ] [ Instruction ] [ Comment ] EOL .
      Instruction = Opcode [ Number | String ] .
      OpCode      = "LDA" | "LDV" | "DSP " | ...  /* assume the list is completed */ .
    END PVMAsm.

We need to distinguish between the various kinds of opcode. Like this, perhaps:

    PRODUCTIONS
      PVMAsm      = { Statement } .
      Statement   = [ Number ] Instruction [ Comment ] EOL .
      Instruction = TwoWord | OneWord | PrintString .
      TwoWord     = ( "LDA" | "LDC" | "DSP" | /* and others like this */ ) Number .
      OneWord     = "STO" | "LDV" | "HLT" | /* and others like this */ .
      PrintString = "PRNS" String .
    END PVMAsm.

7. In the land of Phool the official language is Phoolish. A Phoolish professor noticed that many of his students had not mastered the syntax of Phoolish well. Tired of correcting their many syntactical mistakes, he decided to challenge the students and asked them to write a program that could check the syntactical correctness of any sentence they wrote. Similar to the nature of Phools, the syntax of Phoolish is also pleasantly simple. Here are the rules:

0. The characters in the language are the 26 letters of the standard western alphabet and the language distinguishes between lower and upper case letters.

1. Every lower case letter is a simple correct sentence.

2. If s is a correct sentence, then so is a sentence consisting of an upper case vowel followed by s, that is As Es Is Os or Us

3. If s and t are correct sentences, then so is a sentence consisting of an upper case consonant followed by st, for example Bst Cst Dst ... Zst.

4. Rules 0 to 3 are the only rules that determine the syntactical correctness of a sentence.

Complete the following Cocol description of this language [8]

        COMPILER Phoolish  /* given in test */
        CHARACTERS         /* if necessary or useful */
        TOKENS             /* if necessary or useful */
        IGNORE CHR(0) .. CHR(31)
        PRODUCTIONS
           Phoolish =
        END Phoolish.

This one is deceptively simple, once you see that it is necessary to choose character sets and token definitions appropriately.

        COMPILER Phoolish $CN
        CHARACTERS
          letter    = "abcdefghijklmnopqrstuvwxyz" .
          consonant = "BCDFGHJKLMNPQRSTVWXYZ" .
          vowel     = "AEIOU" .
        TOKENS
          Lletter    = letter .
          Uconsonant = consonant .
          Uvowel     = vowel .
        IGNORE CHR(0) .. CHR(31)
        PRODUCTIONS
          Phoolish = Lletter | Uvowel Phoolish | Uconsonant Phoolish Phoolish .
        END Phoolish.

Notice that the names of character sets cannot appear in the PRODUCTIONS section, seductive though this may appear. So the following is incorrect:

        COMPILER Phoolish $CN
        CHARACTERS
          letter    = "abcdefghijklmnopqrstuvwxyz" .
          consonant = "BCDFGHJKLMNPQRSTVWXYZ" .
          vowel     = "AEIOU" .
        IGNORE CHR(0) .. CHR(31)
        PRODUCTIONS  /* INCORRECT !!!!!!!! */
          Phoolish = letter | vowel Phoolish | consonant Phoolish Phoolish .
        END Phoolish.

8. Here is a grammar that describes a railway train (not the one you were expected to produce in the recent practical, but more or less on the same track (excuse the pun!)).

   COMPILER Train
     PRODUCTIONS
       Train         = "loco" [ "loco" ] ( GoodsPart | PassengerPart ) .
       GoodsPart     = { "open" | "closed" } { "fuel" } "brake" .
       PassengerPart = "coach" "coach" { "coach" } "guardCoach" .
   END Train.

Rewrite the productions to give an equivalent grammar that avoids the use of the meta-brackets [ ] and { } [4]

This question, at least, was quite well done. There are, of course, several ways in which this can be done. Here is one of them:

   COMPILER Train
     PRODUCTIONS
       Train         = "loco" OptionalLoco ( GoodsPart | PassengerPart ) .
       OptionalLoco  = "loco" | e .
       GoodsPart     = FirstTrucks LastTrucks .
       FirstTrucks   = ( "open" | "closed" ) FirstTrucks | e .
       LastTrucks    = "fuel" LastTrucks | "brake" .
       PassengerPart = "coach" "coach" LastCoaches .
       LastCoaches   = "coach" LastCoaches | "guardCoach" .
   END Train.

Notice, however, that there is the world of difference between the following productions. If you can't see why, then come to ask.

       FirstTrucks   = ( "open" | "closed" ) FirstTrucks | e .

       FirstTrucks   = ( "open" | "closed" ) ( FirstTrucks | e ) .

       FirstTrucks   = ( "open" | "closed" | e ) FirstTrucks .

9. (a) What is meant by the statement "a non-terminal is nullable"? [2]

It is possible to derive an empty string by applying appropriate productions to A and to the sentential forms derivable from A. Of course one may be able to produce non-empty strings as well; in general this will be the case. A non-terminal that can derive only the empty string is not very useful. Mathematically the definition of nullable can be expressed:

A Þ* e

(b) In the grammar defined by the following productions, identify (giving reasons) the nullable non- terminals [2]

            A = B "a" | C D .
            B = [ A ] .
            C = B { "a" } .
            D = "a" B .

The nullable non-terminals are B (fairly obviously) and C (less obviously, but since B is nullable and the sequence {  "a" } is nullable the concatenation of the two is also nullable).


Home  © P.D. Terry