Practical for week 22

Computer Science 3 - 2012

Programming Language Translation

Practical for Week 22, beginning 24 September 2012

Hand in your solutions to the second part of this practical before lunch time on your next practical day, correctly packaged in a transparent folder with your cover sheets. Please do NOT come to a practical and spend the first hour printing or completing solutions from the previous week's exercises. Since the practical will have been done on a group basis, please hand in one copy of the cover sheet for each member of the group. These will be returned to you in due course, signed by the marker. Please make it clear whose folder you have used for the electronic submission, for example g03A1234. Lastly, please resist the temptation to carve up the practical, with each group member only doing one task. The group experience is best when you discuss each task together.

Objectives:

In this practical you are to

familiarize you with the rules and restrictions of LL(1) parsing, and
help you understand the concept of ambiguity, and
get more experience in writing simple grammars.

You will need this prac sheet and your text book. As usual, copies of the prac sheet are also available at http://www.cs.ru.ac.za/courses/CSc301/Translators/trans.htm.

Outcomes:

When you have completed this practical you should understand

how to apply the LL(1) rules manually and automatically;
how to use Coco/R more effectively;

To hand in:

This week you are required to hand in, besides the cover sheet:

The solutions to tasks 1, 2, 3 and 4 by 08h00 tomorrow morning on the attached sheet or an edited copy.
Listings of your solutions to the grammar problems, produced on the laser printer by using the LPRINT utility (take care; lines can get quite wide!)
Electronic copies of your grammar files (ATG files), stored in a folder under one of the group's student numbers.

I do NOT require listings of any Java code produced by Coco/R.

Keep the prac sheet and your solutions until the end of the semester. Check carefully that your mark has been entered into the Departmental Records.

You are referred to the rules for practical submission which are clearly stated in our Departmental Handbook. However, for this course pracs must be posted in the "hand-in" box outside the laboratory and not given to demonstrators.

A rule not stated there, but which should be obvious, is that you are not allowed to hand in another group's or student's work as your own. Attempts to do this will result in (at best) a mark of zero and (at worst) severe disciplinary action and the loss of your DP. You are allowed - even encouraged - to work and study with other students, but if you do this you are asked to acknowledge that you have done so. You are expected to be familiar with the University Policy on Plagiarism, which you can consult by following the links at:

        http://www.scifac.ru.ac.za/            or from     http://www.ru.ac.za/

The first few tasks do not need you to use a computer, nor should you. Do them by hand.

Task 1 - Meet the family

Consider the following grammar:

   COMPILER Home
   IGNORE CHR(0) .. CHR(31)
   PRODUCTIONS
     Home      = Family { Pets } [ Vehicle ] "house" .
     Pets      = "dog" [ "cat" ] | "cat" .
     Vehicle   = ( "scooter" | "bicycle" ) "fourbyfour" .
     Family    = Parents { Children } .
     Parents   = [ "Dad" ] [ "Mom" ] | "Mom" "Dad" .
     Child     =   "Helen" | "Margaret" | "Alice" | "Robyn" | "Cathy"
                 | "Janet" | "Anne" | "Ntombizodwa" | "Ntombizanele" .
   END Home.

Analyse this grammar in detail. Is it LL(1) compliant? If not, why not?

Task 2 - Expressions - again

The following grammar attempts to describe expressions incorporating the familiar operators with their correct precedence and associativity.

   COMPILER Expression
   IGNORE CHR(0) .. CHR(31)
   PRODUCTIONS
     Expression = Term    { ( "+" | "-" ) Term  } .
     Term       = Factor  { ( "*" | "/" ) Factor } .
     Factor     = Primary [ "^" Expression ] .
     Primary    = "a" | "b" | "c" .
  END Expression.

Is this an ambiguous grammar? (Hint: try to find an expression that can be parsed in more than one way). Is it an LL(1) grammar? If not, why not, and can you find a suitable grammar that is LL(1)?

Task 3 - Palindromes

Palindromes are character strings that read the same from either end, like "Hannah" or my brother Peter's favourite line when he did the CTM advertisements: "Bob Bob". The following represent various attempts to find grammars that describe palindromes made only of the letters a and b:

     (1)        Palindrome = "a"  Palindrome  "a" | "b"  Palindrome  "b" .
     (2)        Palindrome = "a"  Palindrome  "a" | "b"  Palindrome  "b" | "a" | "b" .
     (3)        Palindrome = "a" [ Palindrome ] "a" | "b" [ Palindrome ] "b" .
     (4)        Palindrome = [ "a"  Palindrome  "a" | "b"  Palindrome  "b" | "a" | "b" ] .

Which grammars achieve their aim? If they do not, explain why not. Which of them are LL(1)? Can you find other (perhaps better) grammars that describe palindromes and which are LL(1)?

Task 4 - Pause for thought

Which of the following statements are true? Justify your answer.

(a) An LL(1) grammar cannot be ambiguous.
(b) A non-LL(1) grammar must be ambiguous.
(c) An ambiguous language cannot be described by an LL(1) grammar.
(d) It is possible to find an LL(1) grammar to describe any non-ambiguous language.

Hand in your solutions to tasks 1 through 4 before continuing.

Task 5 - Grab a mug of hot Coco and press on

There are several files that you need, zipped up this week in the file PRAC22.ZIP (Java version) or PRAC22C.ZIP (C# version)

Copy the prac kit into a newly created directory/folder in your file space in the usual way:

             j:
             md  prac22
             cd  prac22
             copy  i:\csc301\trans\prac22.zip
             unzip  prac22.zip

You will find the executable version of Coco/R and batch files for running it, frame files, and various sample programs and grammars, including ones for the grammars given in tasks 1, 2 and 3.

After unpacking this kit attempt to make the parsers, as you did last week. Ask the demonstrators to show you how to get Coco/R to show you the FIRST and FOLLOW sets for the non-terminals of the grammar, and verify that the objections (if any) that Coco/R raises to these grammars are the same as you have determined by hand.

Task 6 - Eliminating meta-brackets

The file EBNF.ATG contains a Cocol grammar that describes EBNF using EBNF conventions, which might be familiar from lectures. Try this out "as is" to begin with, for example:

         cmake EBNF
         crun EBNF EBNF.TXT
         crun EBNF EBNF.BAD  -L

Next, use the grammar as a guide to develop a new system that will recognise or reject a set of productions like those in EBNF.txt, but with your Cocol grammar written in such a way that the PRODUCTIONS section does not itself use any of the Wirth "meta brackets" (that is { } and [ ]). As a hint, you will have to set up an equivalent grammar, using right-recursive production rules. Check that you get the same results as before when you run the program on test data.

Task 7 - Describe BNF

Use the original EBNF grammar from task 6 as a guide to develop a new system that will recognize or reject a set of productions written one to a line in "traditional" BNF notation, like those below:

      <name>               ::= <title> <first part> <surname>
      <title>              ::= Mr | Miss | Ms | Dr | Prof | eps
      <first part>         ::= <name> | <first part> <name> | eps
      <surname>            ::= <name>
      <name>               ::= <letter> | <letter> <name>
      <letter>             ::= <first half letter> | <second half letter>
      <first half letter > ::= a | b | c | d | e | f | g | h | i | j | k | l | m
      <second half letter> ::= n | o | p | q | r | s | t | u | v | w | x | y | z

Examples like this can be found in the files BNF.txt and BNF.bad. Note that we have used eps to denote the Greek letter e (epsilon) which cannot easily be represented in the character set you have available.

Task 8 - Reverse Polish Notation

You may be familiar with RPN or "Reverse Polish Notation" as a notation that can describe expressions without the need for parentheses. The notation eliminates parentheses by using "postfix" operators after the operands. To evaluate such expressions one uses a stack architecture, such as forms the basis of the PVM machine studied in the course. Examples of RPN expressions are:

3 4 + - equivalent to 3 + 4
3 4 5 + * - equivalent to 3 * (4 + 5)

In many cases an operator is taken to be "binary" - applied to the two preceding operands - but the notation is sometimes extended to incorporate "unary" operators - applied to one preceding operand:

4 sqrt - equivalent to sqrt(4)
5 - - equivalent to -5

Here are two attempts to write grammars describing an RPN expression:

          (G1)     RPN       =     RPN RPN binOp
                                 | RPN unaryOp
                                 | number .
                   binOp     =   "+" | "-" | "*" | "/" .
                   unaryOp   =   "-" | "sqrt" .

and

          (G2)     RPN       =   number REST .
                   REST      =   [ number REST binOp REST | unaryOp ].
                   binOp     =   "+" | "-" | "*" | "/" .
                   unaryOp   =   "-" | "sqrt" .

Are these grammars equivalent? Is either (or both) ambiguous? Do either or both conform to the LL(1) conditions? If not, explain clearly where the rules are broken, and come up with an LL(1) grammar that describes RPN notation, or else explain why it might be necessary to modify the language itself to overcome any problems you have uncovered.

Hand in sheet: - Available in the kit in the ASCII file HANDIN.TXT if you prefer

Task 1 - Meet the family What form does your grammar take when you eliminate the meta-brackets? Which, if any, productions break the LL(1) rules, and why? Can you find an equivalent grammar that does obey the LL(1) constraints? If so, give it. If not, explain why you think it canot be done. Task 2 - Expressions - again Is this an ambiguous grammar? If so, why? is it an LL(1) grammar? If not, why not, and can you find a suitable grammar that is LL(1)? Task 3 - Palindromes Does grammar 1 describe palindromes? If not, why not? Is it an LL(1) grammar? If not, why not? Does grammar 2 describe palindromes? If not, why not? Is it an LL(1) grammar? If not, why not? Does grammar 3 describe palindromes? If not, why not? Is it an LL(1) grammar? If not, why not? Does grammar 4 describe palindromes? If not, why not? Is it an LL(1) grammar? If not, why not? Can you find a better grammar to describe palindromes? If so, give it, if not, explain why not. Task 4 Which of the following statements are true? Justify your answers. (a) An LL(1) grammar cannot be ambiguous. (b) A non-LL(1) grammar must be ambiguous. (c) An ambiguous language cannot be described by an LL(1) grammar. (d) It is possible to find an LL(1) grammar to describe any non-ambiguous language.