RHODES UNIVERSITY


Computer Science 301 - 2006 - Programming Language Translation

Well, here you are. Here is the free information you have all been waiting for, with some extra bits of advice:


Section B [ 90 marks ]

As you should know by now, having seen an example in an earlier practical, a pretty-printer is a form of translator that takes source code and translates this into object code that is expressed in the same language as the source. This probably does not sound very useful! However, the main aim is to format the "object code" neatly and consistently, according to some simple conventions, making it far easier for humans to understand.

For example, a system that will read a set of EBNF productions, rather badly laid out as

    Goal={One}.
    One
    = Two "plus" Four ".".

    Four =Five {"is" Five|(Six Seven)}.
    Five = Six [Six       ].
    Six=  Two| Three    |"(*" Four "*)"|'(' Four ')'.

and produce the much neater output

    Goal
      = { One } .

    One
      = Two "plus" Four "." .

    Four
      = Five { "is" Five | ( Six Seven ) } .

    Five
      = Six  [ Six ] .

    Six
      = Two | Three | "(*" Four "*)" | '(' Four ')' .

is easily developed by using the following Cocol grammar

  import Library.*;

  COMPILER EBNF $CN
  /* Pretty-print a set of EBNF productions
     P.D. Terry, Rhodes University, 2006 */

    static int indentation = 0;

    public static void append(String s) {
    // Append string s to results file
      IO.write(s);
    }

    public static void newLine() {
    // Write line mark to results file but leave indentation as before
      int i = 1;
      IO.writeLine();
      while (i <= indentation) { IO.write(' '); i++; }
    }

    public static void indentNewLine() {
    // Write line mark to results file and prepare to indent further lines
    // by two spaces more than before
      indentation += 2;
      newLine();
    }


    public static void exdentNewLine() {
    // Write line mark to results file and prepare to indent further lines
    // by two spaces less
      indentation -= 2;
      newLine();
    }

    public static void indent() {
    // Increment indentation level by 2
      indentation += 2;
    }

    public static void exdent() {
    // Decrement indentation level by 2
      indentation -= 2;
    }

  CHARACTERS
    letter   = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz" .
    lowline  = "_" .
    control  = CHR(0) .. CHR(31) .
    digit    = "0123456789" .
    noquote1 = ANY - "'" - control .
    noquote2 = ANY - '"' - control .

  TOKENS
    nonterminal = letter { letter | lowline | digit } .
    terminal    = "'" noquote1 { noquote1 } "'" | '"' noquote2 { noquote2 } '"' .

  COMMENTS FROM "(*" TO "*)"  NESTED

  IGNORE control

  PRODUCTIONS
    EBNF
    = { Production                        (. newLine(); .)
      } EOF .

    Production                            (. String name; .)
    = NT<out name>                        (. append(name); .)
    "="                                   (. indentNewLine(); append("= "); .)
    Expression  "."                       (. append(" ."); exdentNewLine(); .)
    .

    Expression
    = Term { "|"                          (. append(" | "); .)
      Term } .

    Term
    = Factor {                            (. append(" "); .)
      Factor } .

    Factor                                (. String name; .)
    =   NT<out name>                      (. append(name); .)
      | terminal                          (. append(token.val); .)
      | "["                               (. append("[ "); .)
        Expression "]"                    (. append(" ]"); .)
      | "("                               (. append("( "); .)
        Expression ")"                    (. append(" )"); .)
      | "{"                               (. append("{ "); .)
        Expression "}"                    (. append(" }"); .)
    .

    NT<out String name>
    =  nonterminal                        (. name = token.val; .)
    .

  END EBNF.

The "code generator" is embodied in the simple methods append, indent, exdent and so on, which for simplicity have simply been incorporated at the start of the grammar. A more sophisticated system might direct the output to a named file in the manner familiar to you from earlier practicals.

As the first step in the examination you are invited to experiment with the above system, code for which is included in the examination kit, and then to go on to develop a pretty-printer for programs expressed in the version of Parva described in the following grammar. A version of this grammar - spread out to make the addition of actions easy - can also be found in the examination kit, along with the necessary frame files.

  COMPILER Parva $NC
  /* Parva level 1 grammar  - Coco/R for C# or Java */

  CHARACTERS
    lf         = CHR(10) .
    backslash  = CHR(92) .
    control    = CHR(0) .. CHR(31) .
    letter     = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz" .
    digit      = "0123456789" .
    stringCh   = ANY - '"' - control - backslash .
    charCh     = ANY - "'" - control - backslash .
    printable  = ANY - control .
  TOKENS
    identifier = letter { letter | digit | "_" } .
    number     = digit { digit } .
    stringLit  = '"' { stringCh | backslash printable } '"' .
    charLit    = "'" ( charCh   | backslash printable ) "'" .

  COMMENTS FROM "//" TO lf
  COMMENTS FROM "/*" TO "*/"
  IGNORE CHR(9) .. CHR(13)

  PRODUCTIONS
    Parva = "void" Ident "(" ")" "{" { Statement } "}" .
    Statement =   Block | ";" | Assignment | ConstDeclarations | VarDeclarations
                | IfStatement | WhileStatement | DoWhileStatement | ForStatement
                | BreakStatement | ContinueStatement | HaltStatement | ReturnStatement
                | ReadStatement | WriteStatement .
    Block = "{" { Statement } "}" .
    ConstDeclarations = "const" OneConst { "," OneConst } ";" .
    OneConst  = Ident "=" Constant .
    Constant = IntConst | CharConst | "true" | "false" | "null" .
    VarDeclarations = Type OneVar { "," OneVar } ";" .
    Type = BasicType [ "[]" ] .
    BasicType = "int" | "bool" | "char" .
    OneVar = Ident [ "=" Expression ] .
    Assignment = Designator ( "=" Expression | "++" | "--" ) ";" .
    Designator = Ident [  "[" Expression "]" ] .
    IfStatement = "if" "(" Condition ")" Statement [ "else" Statement ] .
    WhileStatement = "while" "(" Condition ")" Statement .
    DoWhileStatement = "do" Statement "while" "(" Condition ")" ";" .
    ForStatement = "for" Ident "=" Expression ( "to" | "downto" ) Expression Statement .
    BreakStatement = "break" ";" .
    ContinueStatement = "continue" ";" .
    HaltStatement = "halt" ";" .
    ReturnStatement = "return" ";" .
    ReadStatement = "read" "(" ReadElement { "," ReadElement } ")" ";" .
    ReadElement = ( StringConst | Designator ) .
    WriteStatement = "write" "(" WriteElement { "," WriteElement } ")" ";" .
    WriteElement = ( StringConst | Expression ) .
    Condition = Expression .
    Expression = AndExp { "||" AndExp } .
    AndExp = EqlExp { "&&" EqlExp } .
    EqlExp = RelExp { EqualOp RelExp } .
    RelExp = AddExp [ RelOp AddExp ] .
    AddExp = MultExp { AddOp MultExp } .
    MultExp = Factor { MulOp Factor } .
    Factor = Primary | "+" Factor | "-" Factor | "!" Factor .
    Primary =   Designator | Constant | "new" BasicType "[" Expression "]"
              | "(" ( "char" ")" Factor | "int" ")" Factor | Expression ")" ) .
    AddOp = "+" | "-" .
    MulOp = "*" | "/" | "%" .
    EqualOp = "==" | "!=" .
    RelOp = "<" | "<=" | ">" | ">=" .
    Ident = identifier .
    StringConst = stringLit .
    CharConst = charLit .
    IntConst = number .
  END Parva.

You should find that this is quite easy to do, leading to a system that might be able to convert a "messy" little Parva program like

     void main(){int b,c,d;
       b=        10;
       while (b>0) { write("b = ", b); b--;
       if (b == 4) write(" hit four!\n")
       ;       else write(" on we go\n");}
       }

into the rather more readable code

     void main() {
       int b, c, d;
       b = 10;
       while (b > 0) {
         write("b = ", b);
         b--;
         if (b == 4)
           write(" hit four!\n");
         else
           write(" on we go\n");
       }
     }

Hints: (a) Do not change the grammar to include any of the other extensions you made in Prac 25. (b) One test for correctness is to take the output from the pretty-printer and pretty-print that again (ie take the output file from one run and use it as the input file for another run). You should, of course, just get the same result again.