Computer Science 3 - 2007

Programming Language Translation


Practical for Week 24, beginning 8 October 2007 - solutions

As usual, the sources of full solutions for these problems may be found on the course web page as the file PRAC24A.ZIP or PRAC24AC.ZIP.

While there were some splendid submissions, there were also some very weak ones, so please study the suggestions below, as the ability to add attributes and actions to grammars is crucially important if you are to use a tool like Coco.

Many people had not done as requested and provided specimen output, which at least would have given some indication of how well their systems worked.


Task 3 - The extended calculator.

Most people had little difficulty with this. Many used two "parallel" arrays, one of double values and one of boolean values, to solve the problem of not using variables before they had been given values. An alternative solution just to show how it might be done using an array of objects representing the encapsulated properties of each variable would be as follows:

   import Library.*;
   import java.util.*;

   COMPILER Calc  $CN
   /* Simple four function calculator with 26 memory cells - extended
      P.D. Terry, Rhodes University, 2007 */

     static class MemCell {
       public double value = 0.0;
       public boolean defined = false;
     }

     static MemCell[] mem = new MemCell[26];

   CHARACTERS
     digit      = "0123456789" .
     letter     = "ABCDEFGHIJKLMNOPQRSTUVWXYZ" .

   TOKENS
     Number     = digit { digit } [ "." { digit } ] .
     Variable   = letter .

   IGNORE CHR(0) .. CHR(31)

   PRODUCTIONS
     Calc                               (. int index = 0; double value = 0.0;
                                           for (int i = 0; i < 26; i++) mem[i] = new MemCell(); .)
     = { (
             Variable                   (. index = token.val.charAt(0) - 'A'; .)
             WEAK "="
             Expression<out value>      (. mem[index].value = value;
                                           mem[index].defined = true; .)
           | "print"
             Expression<out value>      (. IO.writeLine(value); .)
          ) SYNC ";"
       } EOF .

     Expression<out double expVal>      (. double expVal1 = 0.0; .)
     = Term<out expVal>
       {   "+" Term<out expVal1>        (. expVal += expVal1; .)
         | "-" Term<out expVal1>        (. expVal -= expVal1; .)
       } .

     Term<out double termVal>           (. double termVal1 = 0.0; .)
     = Factor<out termVal>
       {   "*" Factor<out termVal1>     (. termVal *= termVal1; .)
         | "/" Factor<out termVal1>     (. if (termVal1 == 0.0) SemError("division by zero");
                                           else termVal /= termVal1; .)
       } .

     Factor<out double factVal>         (. factVal = 0.0;
                                           double factVal2; .)
     =   Number                         (. try {
                                             factVal = Float.parseFloat(token.val);
                                           } catch (NumberFormatException e) {
                                             factVal = 0; SemError("number out of range");
                                           } .)
       | Variable                       (. int index = token.val.charAt(0) - 'A';
                                           if (!mem[index].defined) SemError("undefined value");
                                           else factVal = mem[index].value; .)
       | "(" Expression<out factVal>
         ")"
       | "sqrt"
         "(" Expression<out factVal>    (. if (factVal < 0) {
                                             SemError("square root of negative value");
                                             factVal = 0.0;
                                           }
                                           else factVal = Math.sqrt(factVal); .)
         ")"
       | "max"
         "(" Expression<out factVal>
             ","
             Expression<out factVal2>   (. if (factVal2 > factVal) factVal = factVal2; .)
         ")" .

   END Calc.

This has altered the grammar to demand that a semicolon follow each statement so that it can be used as a synchronization point.


Tasks 4, 5 - Internet anyone

This was very straightforward. Here is a solution using parameters to the various parsing methods:

   import Library.*;
   import java.util.*;

   COMPILER Check5 $CN
   // Check lists of IP numbers and computer names for duplicates and errors
   // P.D. Terry, Rhodes University, 2007
   // This version uses parameters
   //
   // Typical data
   // 146.231.122.131   bungee.ru.ac.za       #comments appear like this
   //                   humfac.ru.ac.za       #alternative name
   // 146.231.122.75    bungee.ru.ac.za       #invalid - name already used
   // 146.231.122.1123  pdt1.ict.ru.ac.za     #invalid IP address
   // 146.231.122.156   pdt2.ict.ru.ac.za
   // 146.231.122.156   pdt3.ict.ru.ac.za     # non-unique IP address

   IGNORECASE

   CHARACTERS
     digit   = "0123456789" .
     letter  = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz" .
     eol     = CHR(10) .
   TOKENS
     number = digit { digit } .
     name   = letter { letter | digit | "." | "-" } .

   COMMENTS FROM "#" TO eol

   IGNORE CHR(0) .. CHR(31)

   PRODUCTIONS
     Check5                        (. ArrayList<Integer> IPList = new ArrayList<Integer> ();
                                      ArrayList<String>  NameList = new ArrayList<String>(); .)
     =
     { Entry<IPList, NameList> }
     EOF                           (. if (Successful()) IO.writeLine("all correct"); .)
     .

     Entry<. ArrayList<Integer> IPL, ArrayList<String> NL .>
     = IPNumber<IPL> Name<NL> { Name<NL> } .

     Name<. ArrayList<String> NL .>
     =
     name                          (. String s = token.val;
                                      if (NL.contains(s)) SemError("duplicate computer name");
                                      else NL.add(s); .)
     .

     IPNumber<. ArrayList<Integer> IPL .>
     =                             (. int n, m; .)
     Number<out n>
     "." Number<out m>             (. n = 256 * n + m; .)
     "." Number<out m>             (. n = 256 * n + m; .)
     "." Number<out m>             (. n = 256 * n + m;
                                      if (IPL.contains(n)) SemError("duplicate IP number");
                                      else IPL.add(n); .)
     .

     Number<out int n>
     =  number                     (. try {
                                        n = Integer.parseInt(token.val);
                                      } catch (NumberFormatException e) {
                                        n = 256;
                                      }
                                      if (n > 255) SemError("number too large"); .)
     .

   END Check5.

This problem is so simple that we can get away with using "static" (global) lists:

   import Library.*;
   import java.util.*;

   COMPILER Check6 $CN
   // Check lists of IP numbers and computer names for duplicates and errors
   // P.D. Terry, Rhodes University, 2007
   // This version uses static ArrayLists, but no parameters
   //
   // Typical data
   // 146.231.122.131   bungee.ru.ac.za       #comments appear like this
   //                   humfac.ru.ac.za       #alternative name
   // 146.231.122.75    bungee.ru.ac.za       #invalid - name already used
   // 146.231.122.1123  pdt1.ict.ru.ac.za     #invalid IP address
   // 146.231.122.156   pdt2.ict.ru.ac.za
   // 146.231.122.156   pdt3.ict.ru.ac.za     # non-unique IP address

   static ArrayList <Integer> IPList   = new ArrayList <Integer>();
   static ArrayList <String>  NameList = new ArrayList <String>();

   IGNORECASE

   CHARACTERS
     digit   = "0123456789" .
     letter  = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz" .
     eol     = CHR(10) .

   TOKENS
     number = digit { digit } .
     name   = letter { letter | digit | "." | "-" } .

   COMMENTS FROM "#" TO eol

   IGNORE CHR(0) .. CHR(31)

   PRODUCTIONS
     Check6
     =
     { Entry }
     EOF                           (. if (Successful()) IO.writeLine("all correct"); .)
     .

     Entry
     = IPNumber Name { Name } .

     Name
     =
     name                          (. String s = token.val;
                                      if (NameList.contains(s)) SemError("duplicate computer name");
                                      else NameList.add(s); .)
     .

     IPNumber
     =                             (. int n, m; .)
     Number<out n>
     "." Number<out m>             (. n = 256 * n + m; .)
     "." Number<out m>             (. n = 256 * n + m; .)
     "." Number<out m>             (. n = 256 * n + m;
                                      if (IPList.contains(n)) SemError("duplicate IP number");
                                      else IPList.add(n); .)
     .

     Number<out int n>
     =  number                     (. try {
                                        n = Integer.parseInt(token.val);
                                      } catch (NumberFormatException e) {
                                        n = 256;
                                      }
                                      if (n > 255) SemError("number too large"); .)
     .

   END Check6.


Tasks 6 - Eliminating metabrackets

The basic idea is quite simple - simply copy the text in the productions to a list of strings stroring the tokens and metabrackets for each production. However, when the { RepeatedOption } or [ SingleOption ] construction is encountered, invent a name, store this in the string instead, and then start a stylised production for that name whose right hand side contains the RepeatedOption or SingleOption. The fact that the whole system is recursive handles the problem of options within options and so on effectively "for free".

   COMPILER EBNF $CN
   /* Convert a set of EBNF productions to use BNF conventions, and carry out
      some rudimentary checks on their being properly defined.
      P.D. Terry, Rhodes University, 2007 */

   CHARACTERS
     cr       = CHR(10) .
     lf       = CHR(13) .
     letter   = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz" .
     lowline  = "_".
     digit    = "0123456789" .
     noquote1 = ANY - "'" - cr - lf .
     noquote2 = ANY - '"' - cr - lf .

   TOKENS
     nonterminal = letter {letter | lowline | digit} .
     terminal    = "'" noquote1 {noquote1} "'" | '"' noquote2 {noquote2} '"' .

   COMMENTS FROM "(*" TO "*)"  NESTED

   IGNORE CHR(9) .. CHR(13)

   PRODUCTIONS
     EBNF
     = { Production }
       EOF                                 (. if (Successful()) {
                                                Table.listProductions();
                                                Table.testProductions();
                                              }
                                           .)
       .

     Production
     = SYNC nonterminal                    (. String name = token.val;
                                              int i = Table.add(name, Table.LHS);
                                              Table.addText(name, i);
                                              Table.addText("= ", i);
                                           .)
       WEAK "=" Expression<name, i>
       SYNC "."                            (. Table.addText(".\n", i); .)
       .

     Expression<String s, int i>
     = Term<s, i> { WEAK "|"               (. Table.addText("|", i); .)
       Term<s, i> } .

     Term<String s, int i> = [ Factor<s, i> { Factor<s, i> } ] .

     Factor<String s, int i>               (. String name;
                                              int j; .)
     =   nonterminal                       (. name = token.val;
                                              j = Table.add(name, Table.RHS);
                                              Table.addText(name, i);
                                           .)
       | terminal                          (. name = token.val;
                                              Table.addText(name, i);
                                           .)
       | "["                               (. name = Table.newName(s, "Opt");
                                              Table.addText(name, i);
                                              j = Table.add(name, Table.BOTH);
                                              Table.addText(name, j);
                                              Table.addText("= ", j); .)
         Expression<s, j>
         "]"                               (. Table.addText("|", j);
                                              Table.addText(".\n", j); .)
       | "("                               (. Table.addText("(", i); .)
         Expression<s, i>
         ")"                               (. Table.addText(")", i); .)
       | "{"                               (. name = Table.newName(s, "Seq");
                                              Table.addText(name, i);
                                              j = Table.add(name, Table.BOTH);
                                              Table.addText(name, j);
                                              Table.addText("= ", j);
                                              Table.addText("(", j); .)
         Expression<s, j>                  (. Table.addText(")", j);
                                              Table.addText(name, j);
                                              Table.addText("|", j);
                                              Table.addText(".\n", j);
                                           .)
         "}"
       .

   END EBNF.

The table handler to match this might be coded as follows. Note the simple device of counting how many times each non-terminal is added to the left hand and right hand side of a production. this makes checking for undefined and redefined and unreachable non-terminals quite easy.

    // Table handler for EBNF -> BNF style converter
    // P.D. Terry, Rhodes University, 2007

    package EBNF;

    import java.util.*;
    import Library.*;

      class Table {

        static class Entry {
          public int left, right;
          public String name;
          public ArrayList<String> text;
          public Entry(String name) {
            this.left   = 0;
            this.right  = 0;
            this.name   = name;
            this.text   = new ArrayList<String>();
          }
        } // Entry

        public static final int
          LHS  = 1,
          RHS  = 2,
          BOTH = 3;

        static int extraNames = 0;
        static ArrayList<Entry> list = new ArrayList<Entry>();
        static int maxLength = 0;

        public static String newName(String oldName, String ext) {
          extraNames++;
          return oldName + ext + extraNames;
        }

        public static int add(String name, int where) {
        // Search for, and possibly add a nonterminal "name" to the table, according as
        // to "where" in a production it was found, and return the position as the
        // index where the name was either added previously or added by this call
        // debug: IO.writeLine("add " + name + " " + where);
          int position = 0;
          while (position < list.size() && !name.equals(list.get(position).name)) position++;
          if (position >= list.size()) {
            list.add(new Entry(name));
            if (name.length() > maxLength) maxLength = name.length();
          }
          switch (where) {
            case LHS  :
              list.get(position).left++; break;
            case RHS  :
              list.get(position).right++; break;
            case BOTH :
              list.get(position).left++;
              list.get(position).right++; break;
          }
          return position;
        }

        public static void addText(String str, int i) {
        // add str to the list of strings for the rule in the table at position i
        // debug: IO.writeLine("addtext " + str);
          list.get(i).text.add(str);
        }

        public static void listProductions() {
        // List all productions to standard output file
          for (int i = 0; i < list.size(); i++)
            if (list.get(i).text.size() > 0) {
              int paren = 0;
              IO.write(list.get(i).text.get(0), -maxLength);
              for (int j = 1; j < list.get(i).text.size(); j++) {
                if (list.get(i).text.get(j).equals("(")) paren++;
                if (list.get(i).text.get(j).equals(")")) paren--;
                if (list.get(i).text.get(j).equals("|") && paren == 0) { // neater output for alternatives
                  IO.writeLine();
                  IO.write(' ', maxLength + 1);
                }
                IO.write(" " + list.get(i).text.get(j));
              }
            }
          IO.writeLine();
        }

        public static void testProductions() {
        // Check that all non terminals have appeared once on the left side of
        // each production, and at least once on the right side of each production
          boolean OK = true; // optimistic
          int i;
          for (i = 0; i < list.size(); i++) {
            if (list.get(i).left == 0) {
              OK = false;
              IO.writeLine("(* " + list.get(i).name + " never defined *)");
            }
            else if (list.get(i).left > 1) {
              OK = false;
              IO.writeLine("(* " + list.get(i).name + " defined more than once *)");
            }
          }
          for (i = 1; i < list.size(); i++) {
            if (list.get(i).right == 0) {
              OK = false;
              IO.writeLine("(* " + list.get(i).name + " cannot be reached *)");
            }
          }
          if (!OK) IO.writeLine("\n(* Cannot be correct grammar *)");
        }

      } // Table

Another solution might be as follows. Here we have taken cognisance of the fact that there is nothing really wrong with productions like

      Supper =  "pizza" .
      Supper =  "lasagna" .

which are of course equivalent to

      Supper = "pizza" | "lasagna" .

so that if a further rule is discovered for a nonterminal we can simply concatenate the alternatives together.

The table handler needs some changes (only the changes are shown here)

    public static void listProductions() {
    // List all productions to standard output file
      for (int i = 0; i < list.size(); i++)
        if (list.get(i).text.size() > 0) {
          int paren = 0;
          IO.write(list.get(i).text.get(0), -maxLength);
          for (int j = 1; j < list.get(i).text.size(); j++) {
            if (list.get(i).text.get(j).equals("(")) paren++;
            if (list.get(i).text.get(j).equals(")")) paren--;
            if (list.get(i).text.get(j).equals("|") && paren == 0) { // neater output for alternatives
              IO.writeLine();
              IO.write(' ', maxLength + 1);
            }
            IO.write(" " + list.get(i).text.get(j));
          }
          IO.writeLine(" .");  // ------ added
        }
      IO.writeLine();
    }

    public static int positionLHS(String name) { // ------------ added
    // Return the position in the table where name can be found on the LHS,
    // or -1 if not yet entered found in that way
    // debug: IO.writeLine("positionLHS " + name + " " + size());
      int position = 0;
      while (position < list.size() && !name.equals(list.get(position).name)) position++;
      return (position >= list.size() || list.get(position).left == 0 ? -1 : position);
    }

The grammar also needs a few changes:

      Production
      = SYNC nonterminal                    (. String name = token.val;
                                               int i = Table.positionLHS(name);
                                               if (i == -1) {
                                                 i = Table.add(name, Table.LHS);
                                                 Table.addText(name, i);
                                                 Table.addText("= ", i);
                                               } else {
                                                 Table.addText("|", i);
                                               }
                                            .)
        WEAK "=" Expression<name, i>
        SYNC "."
        .


      Factor<String s, int i>               (. String name;
                                               int j; .)
      =   nonterminal                       (. name = token.val;
                                               j = Table.add(name, Table.RHS);
                                               Table.addText(name, i);
                                            .)
        | terminal                          (. name = token.val;
                                               Table.addText(name, i);
                                            .)
        | "["                               (. name = Table.newName(s, "Opt");
                                               Table.addText(name, i);
                                               j = Table.add(name, Table.BOTH);
                                               Table.addText(name, j);
                                               Table.addText("= ", j); .)
          Expression<s, j>
          "]"                               (. Table.addText("|", j); .)
        | "("                               (. Table.addText("(", i); .)
          Expression<s, i>
          ")"                               (. Table.addText(")", i); .)
        | "{"                               (. name = Table.newName(s, "Seq");
                                               Table.addText(name, i);
                                               j = Table.add(name, Table.BOTH);
                                               Table.addText(name, j);
                                               Table.addText("= ", j);
                                               Table.addText("(", j); .)
          Expression<s, j>                  (. Table.addText(")", j);
                                               Table.addText(name, j);
                                               Table.addText("|", j);
                                            .)
          "}"
        .


Home  © P.D. Terry