Computer Science 3 - 2005

Programming Language Translation


Practical for Week 21, beginning 19 September 2005 - Solutions

Complete sources to these solutions can be found on the course WWW pages in the files PRAC21A.ZIP or PRAC21AC.ZIP


Task 2 - Extensions to the Simple Calculator

Exponentiation is a stronger operation than multiplication, so it has to be introduced deeper into the hierarchy. Functions like sqrt also take precedence over other operations:

   COMPILER Calc  $CN
   /* Simple four function calculator - extended
      P.D. Terry, Rhodes University, 2005 */

   CHARACTERS
     digit      = "0123456789" .
     hexdigit   = digit + "ABCDEF" .

   TOKENS
     decNumber  = digit { digit } .
     hexNumber  = "$" hexdigit { hexdigit } .

   IGNORE CHR(0) .. CHR(31)

   PRODUCTIONS
     Calc       = { Expression "=" } EOF .
     Expression = Term { "+" Term  |  "-" Term } .
     Term       = Factor { "*" Factor |  "/" Factor } .
     Factor     = Primary [ "^" Factor ] .
     Primary    =   decNumber | hexNumber
                  | "(" Expression ")"
                  | "sqrt" "(" Expression ")" .
   END Calc.


Task 3 - Highland Gatherings

The solutions received were mixed. Some didn't capture the idea that there should be N competitions each with M bands. A typical over simplified attempt looks like this

     Gath1         = { Competition } .
     Competition   = SlowQuick |  MSR |  Medley .
     SlowQuick     = "SlowMarch" "March"  .
     MSR           = "March" "Strathspey" "Reel" [ "March" ] .
     Medley        = OneTune { OneTune } .
     OneTune       =   "March" | "SlowMarch" | "Jig" | "Hornpipe" | "Reel"
                     | "Strathspey" { "Strathspey" } "Reel" .

If we want to introduce the idea that there are multiple competitions with multiple bands, then surely it makes sense to try

   COMPILER Gath2 $CN
   /* Describes the Pipe Band events at a Highland Gathering
      P.D. Terry, Rhodes University, 2005 */

   IGNORE CHR(0) .. CHR(31)

   PRODUCTIONS
     Gath2       = { Competition } .
     Competition = Band { Band } .
     Band        = SlowQuick | MSR | Medley .
     SlowQuick   = "SlowMarch" "March" .
     MSR         = "March" "Strathspey" "Reel" [ "March" ] .
     Medley      = OneTune { OneTune } .
     OneTune     =   "March" | "SlowMarch" | "Jig" | "Hornpipe" | "Reel"
                   | "Strathspey" { "Strathspey" } "Reel" .

   END Gath2.

This is badly non LL(1) (It represents a continual wail of noise). We need to get some sort of break between bands at least:

   COMPILER Gath3 $CN
   /* Describes the Pipe Band events at a Highland Gathering
      P.D. Terry, Rhodes University, 2005 */

   IGNORE CHR(0) .. CHR(31)

   PRODUCTIONS

     Gath3       = { "AnnounceCompetition" Competition } .
     Competition = Band { Band } .
     Band        = "AnnounceBand" ( SlowQuick | MSR | Medley ).
     SlowQuick   = "SlowMarch" "March" "break" .
     MSR         = "March" "Strathspey" "Reel" [ "March" ] "break" .
     Medley      = OneTune { OneTune } "break" .
     OneTune     =   "March" | "SlowMarch" | "Jig" | "Hornpipe" | "Reel"
                   | "Strathspey" { "Strathspey" } "Reel" .

   END Gath3.

Even this is non-LL(1). We can get an LL(1) grammar if we make the right kind of announcements:

   COMPILER Gath4 $CN
   /* Describes the Pipe Band events at a Highland Gathering
      P.D. Terry, Rhodes University, 2005 */

   IGNORE CHR(0) .. CHR(31)

   PRODUCTIONS

     Gath4       = { "AnnounceCompetition" Competition } .
     Competition = Band { Band } .
     Band        =   "AnnounceSlow"  SlowQuick
                   | "AnnounceMSR" MSR
                   | "AnnounceMedley" Medley .
     SlowQuick   = "SlowMarch" "March" "break" .
     MSR         = "March" "Strathspey" "Reel" [ "March" ] "break" .
     Medley      = OneTune { OneTune } "break" .
     OneTune     =   "March" | "SlowMarch" | "Jig" | "Hornpipe" | "Reel"
                   | "Strathspey" { "Strathspey" } "Reel" .

   END Gath4.

Here is yet another solution:

   COMPILER Gath5 $CN
   /* Describes the Pipe Band events at a Highland Gathering
      P.D. Terry, Rhodes University, 2005 */

   IGNORE CHR(0) .. CHR(31)

   PRODUCTIONS

     Gath5         = { Competition } .
     Competition   =   "AnnounceSlow"   SlowQuickComp
                     | "AnnounceMSR"    MSRComp
                     | "AnnounceMedley" MedleyComp .
     SlowQuickComp = SlowQuick { SlowQuick } .
     SlowQuick     = "SlowMarch" "March" "break" .
     MSRComp       = MSR { MSR } .
     MSR           = "March" "Strathspey" "Reel" [ "March" ] "break" .
     MedleyComp    = Medley { Medley } .
     Medley        = OneTune { OneTune } "break" .
     OneTune       =   "March" | "SlowMarch" | "Jig" | "Hornpipe" | "Reel"
                     | "Strathspey" { "Strathspey" } "Reel" .

   END Gath5.

In all these notice that we have not fallen into the trap of defining:

     OneTune     =   "March" | "SlowMarch" | "Jig" | "Hornpipe" | "Reel"
                   | "Strathspey" { "Strathspey" } "Reel" { "Reel" } .

which is non-LL(1) again, or of trying to write

     OneTune     =   "March" | "SlowMarch" | "Jig" | "Hornpipe" |
                   | "Strathspey" { "Strathspey" } "Reel" { "Reel" } .

which is LL(1), but precludes reels being played without a strathspey immediately before them.


Task 4 - Alternative description of EBNF

As hinted in the prac sheet, the trick here is to rework the productions that use meta braces for repetition by ones that use right recursion (to avoid LL(1) errors. Here is one possibility:

   COMPILER EBNF $CN
   /* Parse a set of EBNF productions
      P.D. Terry, Rhodes University, 2005 */

   CHARACTERS
     letter   = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz" .
     lowline  = "_" .
     control  = CHR(0) .. CHR(31) .
     digit    = "0123456789" .
     noquote1 = ANY - "'" - control .
     noquote2 = ANY - '"' - control .

   TOKENS
     nonterminal = letter { letter | lowline | digit } .
     terminal    = "'" noquote1 { noquote1 } "'" | '"' noquote2 { noquote2 } '"' .

   COMMENTS FROM "(*" TO "*)"  NESTED

   IGNORE control

   PRODUCTIONS
     EBNF        = Productions EOF .
     Productions = Production Productions | .
     Production  = nonterminal "=" Expression "." .
     Expression  = Term MoreTerms .
     MoreTerms   = "|" Term MoreTerms | .
     Term        = Factor MoreFactors .
     MoreFactors = Factor MoreFactors | .
     Factor      =   nonterminal
                   | terminal
                   | "[" Expression "]"
                   | "(" Expression ")"
                   | "{" Expression "}" .
   END EBNF.

The above grammar matches the one given in the prac sheet. It does not, however, have the property of being able to describe itself any longer - we might argue that we need to be able to describe a nullable factor (the original could not do this). This might be achieved as follows:

   ...

   PRODUCTIONS
     EBNF        = Productions EOF .
     Productions = Production Productions | .
     Production  = nonterminal "=" Expression "." .
     Expression  = Term MoreTerms .
     MoreTerms   = "|" Term MoreTerms | .
     Term        = Factor Term | .
     Factor      =   nonterminal
                   | terminal
                   | "[" Expression "]"
                   | "(" Expression ")"
                   | "{" Expression "}" .
   END EBNF.

but notice that this also allows one to write production rules like

A = b | c | | | .

which you might argue is a bit silly. Further reflection on this is left as a useful exercise!


Task 5 - Extensions to Parva

These were very simple really. We cannot avoid the LL(1) warning for the else option in the IfStatement. This is the well-known "dangling else" problem that is a feature of many real languages. Fortunately - unlike some other LL(1) violations - it causes no harm.

   COMPILER Parva $CN
   /* Parva level 1 grammar  - Extended for prac 21
      P.D. Terry, Rhodes University, 2005
      Grammar only */

   CHARACTERS
     lf         = CHR(10) .
     backslash  = CHR(92) .
     control    = CHR(0) .. CHR(31) .
     letter     = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz" .
     digit      = "0123456789" .
     stringCh   = ANY - '"' - control - backslash .
     charCh     = ANY - "'" - control - backslash .
     printable  = ANY - control .

   TOKENS

/* Most people expressed the next one like this

     identifier = letter { letter | digit | "_" ( letter | digit ) } .

   Technically this is not quite what was asked.  The restriction is really that an
   identifier cannot end with an underscore.  Identifiers like Pat_____Terry are allowed: */

     identifier = letter { letter | digit | "_" { "_" } ( letter | digit ) } .

     number     = digit { digit } .
     stringLit  = '"' { stringCh | backslash printable } '"' .
     charLit    = "'" ( charCh   | backslash printable ) "'" .

   COMMENTS FROM "//" TO lf
   COMMENTS FROM "/*" TO "*/"

   IGNORE CHR(9) .. CHR(13)

   PRODUCTIONS
     Parva             = "void" identifier "(" ")" Block .
     Block             = "{" { Statement } "}" .

/* Labels are easily added */

     Label             = number ":" .
     Statement         = { Label }
                         (   Block | ConstDeclarations | VarDeclarations | Assignment
                           | IfStatement | WhileStatement | ReturnStatement | HaltStatement
                           | DoWhileStatement | ForStatement | GoToStatement
                           | ReadStatement | WriteStatement
                           | ";"
                         ) .

     ConstDeclarations = "const" OneConst { "," OneConst } ";" .
     OneConst          = identifier "=" Constant .
     Constant          = number | charLit | "true" | "false" | "null" .
     VarDeclarations   = Type OneVar { "," OneVar } ";" .
     OneVar            = identifier [ "=" Expression ] .
     Assignment        = Designator "=" Expression ";" .
     Designator        = identifier [ "[" Expression "]" ] .

/* The extension to the IfStatement is easy, though it leads to a non-critical LL(1) warning */

     IfStatement       = "if" "(" Condition ")" Statement [ "else" Statement ] .
     WhileStatement    = "while" "(" Condition ")" Statement .
     ReturnStatement   = "return" ";" .
     HaltStatement     = "halt" ";" .

/* Remember that the DoWhileStatement and GoToStatement end with a semicolon! */

     GoToStatement     = "goto" number ";" .
     DoWhileStatement  = "do" Statement "while" "(" Condition ")" ";" .

/* The ForStatement really needs to avoid using "Assignment" as many people tried to do */

     ForStatement      = "for" "(" [ BasicType ] identifier "=" Expression ";"
                         Condition ";" identifier "=" Expression ")" Statement .

     ReadStatement     = "read" "(" ReadElement { "," ReadElement } ")" ";" .
     ReadElement       = stringLit | Designator .
     WriteStatement    = "write" "(" WriteElement { "," WriteElement } ")" ";" .
     WriteElement      = stringLit | Expression .
     Condition         = Expression .

/* The extended Expression grammar was not well done, except by a few.  Note the use of
   the { } metabrackets in all but one of the following.  Clearly you are not really
   familiar with Java precedence rules.  I don't blame you - there are too many levels! */

     Expression        = AndExp { "||" AndExp } .
     AndExp            = EqlExp { "&&" EqlExp } .
     EqlExp            = RelExp { EqlOp RelExp } .
     RelExp            = AddExp [ RelOp AddExp ] .  // This must be [ ] not { }
     AddExp            = MulExp { AddOp MulExp } .
     MulExp            = Factor { MulOp Factor } .
     Factor            = Primary | ( "+" | "-" | "!" ) Factor .
     Primary           =   Designator | Constant
                         | "new" BasicType "[" Expression "]"
                         | "(" Expression ")" .

     Type              = BasicType [ "[]" ] .
     BasicType         = "int" | "bool" .
     AddOp             = "+" | "-" .
     MulOp             = "*" | "/" | "%" .

/* Java has the "equality" operators at a different level from the "relational" ones: */

     EqlOp             = "==" | "!=" .
     RelOp             = "<" | "<=" | ">" | ">=" .
   END Parva.


Task 6 - Spoornet are looking for programmers

The wheels came off in many solutions. It is quite hard to get right, and one cannot easily find an LL(1) grammar that really matches the problem as set. Your situation was not helped by a rather poorly phrased question, which confused some groups, nor by the fact that the question implied that a guard could come after a fuel truck, but in the specimen trains this was commented as being invalid. I apologise. Here is a simple first attempt, but with no safety regulations.

   COMPILER Train1 $CN
   /* Grammar for simple railway trains
      P.D. Terry, Rhodes University, 2005 */

   IGNORECASE
   COMMENTS FROM "(*" TO "*)" NESTED
   IGNORE CHR(0) .. CHR(31)

   PRODUCTIONS
     Train1         = { OneTrain } EOF .
     OneTrain       = LocoPart ( Passengers | FreightOrMixed ) "." .
     LocoPart       = "loco" { "loco" } .
     FreightOrMixed = Truck { Truck } ( "guard" | Passengers ) .
     Passengers     = { "coach" } "brake" .
     Truck          = "coal" | "open" | "cattle" | "fuel" | "cold" .
   END Train1.

Here is an attempt at safety. But this one insists on at least two safe trucks in any train, and is not LL(1)

   PRODUCTIONS
     Train2         = { OneTrain } EOF .
     OneTrain       = LocoPart ( Passengers | FreightOrMixed ) "." .
     LocoPart       = "loco" { "loco" } .
     FreightOrMixed = SafeTruck { AnyTruck } LastPart .
     LastPart       = "guard" | SafeTruck Passengers .
     Passengers     = { "coach" } "brake" .
     SafeTruck      = "coal" | "open" | "cattle" | "cold" .
     AnyTruck       = SafeTruck | "fuel" .
   END Train2.

Why is it not LL(1)? We could apply all the theory of chapter 7, but maybe an example will suffice. Suppose we have a valid train like

    loco coal coal coal coal coach brake

The first coal truck is parsed by the leading SafeTruck in GoodsPart. The next two coal trucks must be parsed by the repetitive part { AnyTruck }, but you can probably see that the last coal truck would have to be parsed by the alternative within LastPart. Unfortunately an LL(1) parser can't see far enough ahead to make that decision, and would be tempted to treat this last coal truck as part of the { AnyTruck } sequence.

Here is one that is LL(1)

   PRODUCTIONS
     Train3         = { OneTrain } EOF .
     OneTrain       = LocoPart ( Passengers | FreightOrMixed ) "." .
     LocoPart       = "loco" { "loco" } .
     FreightOrMixed = SafeTruck MoreTrucks HumanPart .
     MoreTrucks     = { FuelTruck { FuelTruck } SafeTruck | SafeTruck } .
     HumanPart      = "guard" | Passengers .
     Passengers     = { "coach" } "brake" .
     SafeTruck      = "coal" | "open" | "cattle" | "cold" .
     FuelTruck      = "fuel" .
   END Train3.

where you might notice that we have used a production rather like that used in describing the sequence of tunes in the Medley competition in Task 4. At first you might thing that this is, at last, a correct solution. But no, it isn't quite. This solution does not allow you to have a train like

    loco loco open fuel fuel guard .

as the last fuel truck in a sequence has to be followed by at least one safe truck. The grammar does, however, allow trains like

    loco open coach coach brake .

with only one truck in the goods section.

It is remarkable that something that at first sight looks so simple should turn out to be frustratingly difficult. Not being able to find an LL(1) grammar is not a train smash - one quite often cannot find an LL(1) grammar for a language. But it's usually worth a try, as parsers for LL(1) grammars are so easy to write. Here is a solution that seems to meet all requirements.

   PRODUCTIONS
     Train4       = { OneTrain } EOF .
     OneTrain     = LocoPart [ SafeLoad | "guard" | Passengers ] "." .
     LocoPart     = "loco" { "loco" } .
     Passengers   = { "coach" } "brake" .
     SafeLoad     = SafeTruck RestSafeLoad .
     RestSafeLoad = SafeLoad | "guard" | Passengers | Fuel .
     Fuel         = FuelTruck { FuelTruck } ( SafeLoad | "guard" ) .
     SafeTruck    = "coal" | "open" | "cattle" | "cold" .
     FuelTruck    = "fuel" .
   END Train4.


Home  © P.D. Terry