Computer Science 301 - 2014

Second Tutorial for week 6 - LL(1) grammar checks again


Stop watching videos and playing computer games - listen to steam radio

As an example of checking the LL(1) conditions for a grammar expressed in EBNF, let us consider how we might describe the daily activities of SAFM - "your news and information leader" (more realistically described as a non- stop party political broadcast). Hour by hour Tim, Tsepiso, Isaac, Nancy, Ashraf, Elvis, Karabo and their comrades present what are generously called shows. These include music and advertisements (since people pay no licence fees, the station has to raise revenue through advertising). What passes as "information" is provided by news bulletins, weather reports and talk shows. News bulletins consist of a stream of stories, while talk shows consist of interchanges between the host and listeners who are keen enough to phone in - with the person hosting the exchange getting the first and last word on the subject, of course.

Items like "advert", "rain" and "music" are really in the category of the lexical terminals which a scanner (in the person of someone listening to the radio) would recognize as key symbols while parsing a broadcast. So one attempt to describe a day's activities might be on the lines of

     Radio        =   { TalkShow | NewsBulletin | "music" | "advert" } EOF .
     NewsBulletin =   "advert" NewsItem { NewsItem } [ Weather ] Filler .
     NewsItem     =   "ANC" [ "cosatu" ] | "strike" | [ "cosatu" ] "ANC" | "zuma" | "corruption" | "murder" .
     TalkShow     =   "host" { "listener" "host" } [ Filler ] .
     Filler       =   "music" | "advert"  .
     Weather      =   { "snow" | "rain" | "cloudy" | "windy" } .

Analyze this grammar in detail. One way is to begin by recasting it in a form that introduces further non- terminals that allow for the elimination of the [ ] and { } meta-brackets. If it proves out to be non-LL(1), try to find an essentially equivalent description that is LL(1), or argue why this should be impossible.


Palindromes

Palindromes are character strings that read the same from either end, like "Hannah" or my brother's favourite line when he did the CTM ads: "Bob Bob". The following represent various attempts to find grammars that describe palindromes made only of the letters a and b:

     (1)        Palindrome = "a"  Palindrome  "a" | "b"  Palindrome  "b" .
     (2)        Palindrome = "a"  Palindrome  "a" | "b"  Palindrome  "b" | "a" | "b" .
     (3)        Palindrome = "a" [ Palindrome ] "a" | "b" [ Palindrome ] "b" .
     (4)        Palindrome = [ "a"  Palindrome  "a" | "b"  Palindrome  "b" | "a" | "b" ] .

Which grammars achieve their aim? If they do not, explain why not. Which of them are LL(1)? Can you find other (perhaps better) grammars that describe palindromes and which are LL(1)?


Reverse Polish Notation

You are now familiar with RPN or "Reverse Polish Notation" as a notation that can describe expressions without the need for parentheses. The notation eliminates parentheses by using "postfix" operators after the operands. To evaluate such expressions one uses a stack architecture, such as forms the basis of the PVM machine studied in the course. Examples of RPN expressions are:

3 4 +        - equivalent to    3 + 4
3 4 5 + *    - equivalent to    3 * (4 + 5)

In many cases an operator is taken to be "binary" - applied to the two preceding operands - but the notation is sometimes extended to incorporate "unary" operators - applied to one preceding operand:

4 sqrt       - equivalent to    sqrt(4)
5 -          - equivalent to    -5

Here are two attempts to write grammars describing an RPN expression:

          (G1)     RPN       =     RPN RPN binOp
                                 | RPN unaryOp
                                 | number .
                   binOp     =   "+" | "-" | "*" | "/" .
                   unaryOp   =   "-" | "sqrt" .

and

          (G2)     RPN       =   number REST .
                   REST      =   [ number REST binOp REST | unaryOp ].
                   binOp     =   "+" | "-" | "*" | "/" .
                   unaryOp   =   "-" | "sqrt" .

Are these grammars equivalent? Is either (or both) ambiguous? Do either or both conform to the LL(1) conditions? If not, explain clearly where the rules are broken, and come up with an LL(1) grammar that describes RPN notation, or else explain why it might be necessary to modify the language itself to overcome any problems you have uncovered.


Meet the family in another format

Consider the following grammar:

   COMPILER Home
   IGNORE CHR(0) .. CHR(31)
   PRODUCTIONS
     Home      = Family { Pets } [ Vehicle ] "house" .
     Pets      = "dog" [ "cat" ] | "cat" .
     Vehicle   = ( "scooter" | "bicycle" ) "fourbyfour" .
     Family    = Parents { Children } .
     Parents   = [ "Dad" ] [ "Mom" ] | "Mom" "Dad" .
     Child     =   "Helen" | "Margaret" | "Alice" | "Robyn" | "Cathy"
                 | "Janet" | "Anne" | "Ntombizodwa" | "Ntombizanele" .
   END Home.

Analyse this grammar in detail. Is it LL(1) compliant? If not, why not?


Home  © P.D. Terry