Computer Science 3 - 2004

Programming Language Translation


Practical for Week 24, beginning 11 October 2004

Hand in this prac sheet before lunch time on your next practical day, correctly packaged in a transparent folder with your solutions and the "cover sheets". Please do NOT come to a practical and spend the first hour printing or completing solutions from the previous week's exercises. Since the practical will have been done on a group basis, please hand in one copy of the cover sheet for each member of the group. These will be returned to you in due course, signed by the marker.


Objectives:

Ja, well, no, fine. All you folks who thought this was a bit of a jorl so far, or have been thinking that IS projects are an excuse for back-sliding - think again. In this practical you are to

You will need this prac sheet and your text book. As usual, copies of the prac sheet are also available at http://www.cs.ru.ac.za/CSc301/Translators/trans.htm.


Outcomes:

When you have completed this practical you should understand


To hand in:

This week you are required to hand in, besides the cover sheets (one per group member):

I do NOT require listings of any Java code produced by Coco/R.

Keep the prac sheet and your solutions until the end of the semester. Check carefully that your mark has been entered into the Departmental Records.

You are referred to the rules for practical submission which are clearly stated on page 13 of our Departmental Handbook. However, for this course pracs must be posted in the "hand-in" box outside the laboratory and not given to demonstrators.

A rule not stated there, but which should be obvious, is that you are not allowed to hand in another student's work as your own. Attempts to do this will result in (at best) a mark of zero and (at worst) severe disciplinary action and the loss of your DP. You are allowed - even encouraged - to work and study with other students, but if you do this you are asked to acknowledge that you have done so. You are expected to be familiar with the University Policy on Plagiarism, which you can consult at:

        http://www.scifac.ru.ac.za/plag.htm 


Task 1 - Create a working directory and unpack the prac kit

There are several files that you need, zipped up in the file PRAC24.ZIP (Java) or PRAC24C.ZIP (C#).


Task 2 - Study the handout (that will cost you!)

(This task is based on a problem set in an examination some years back. In the examination, candidates were given the basic grammar and asked to suggest the actions that had to be added.)

As you should be aware, IP addresses as used in Internet communication are typically expressed in "dotted quad" form, as exemplified by 146.231.128.6. The IP address actually represents a 32 bit integer; the four numbers in the quadruple corresponding to successive 8 bit components of this integer. For humans, machine addressing is usually more memorable when expressed in "DNS" format, as exemplified by terrapin.ru.ac.za. Some systems maintain tables of matching addresses, for example

         146.231.122.131   bungee.ru.ac.za       #comments appear like this
         146.231.128.6     terrapin.ru.ac.za
         146.231.56.10     thistle-sp.ru.ac.za
         147.28.0.62       psg.com

When we moved our CS and IS departments to new premises in Hamilton on the famous 9/11 in 2001, a decision was made to rename and uniquely renumber all the many machines in our possession. Our system administrators tried to draw up a table like the one above, which was then merged with the existing table in the IT division. Unfortunately, a few mistakes were made, which caused havoc until they were ironed out. For example, there were lines reading

         146.231.122.1123  pdt1.ict.ru.ac.za     #invalid IP address
         146.231.122.156   pdt2.ict.ru.ac.za
         146.231.122.156   pdt3.ict.ru.ac.za     #non-unique IP address

The ATG files below shows how Coco/R might be used to develop a system that would enable a file in this format to be checked quickly, and the errors identified. One shows how parameters may be passed between parsing methods, and the other how static variables in the parser class may be accessed from several methods:

    import Library.*;
    import java.util.*;

    COMPILER Check1 $CN
    // Put your names and a description here
    // This version uses parameters

    IGNORECASE
    CHARACTERS
      digit   = "0123456789" .
      letter  = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz" .
      eol     = CHR(10) .
    TOKENS
      number = digit { digit } .
      name   = letter { letter | digit | "." | "-" } .
    COMMENTS FROM "#" TO eol
    IGNORE CHR(0) .. CHR(31)
    PRODUCTIONS
      Check1                        (. ArrayList IPList = new ArrayList(); .)
      = { Entry<IPList> } EOF       (. if (Successful()) IO.writeLine("all correct"); .) .

      Entry<ArrayList IPL>
      = IPNumber<IPL> name .

      IPNumber<ArrayList IPL>
      =                             (. int n, m; .)
      Number<out n>
      "." Number<out m>             (. n = 256 * n + m; .)
      "." Number<out m>             (. n = 256 * n + m; .)
      "." Number<out m>             (. n = 256 * n + m;
                                       if (IPL.contains(new Integer(n))) SemError("duplicate IP number");
                                       else IPL.add(new Integer(n)); .) .

      Number<out int n>
      =  number                     (. try {
                                         n = Integer.parseInt(token.val);
                                         } catch (NumberFormatException e) {
                                         n = 256;  // illegal anyway
                                         }
                                       if (n > 255) SemError("number too large"); .) .
    END Check1.

----------------------------------------------------------------------------

   import Library.*;
   import java.util.*;

   COMPILER Check2 $CN
   // Put your names and a description here
   // This version uses a static Arraylist, but no parameters

   static ArrayList IPList = new ArrayList();

   IGNORECASE  // as before
   CHARACTERS  // as before
   TOKENS      // as before
   COMMENTS    // as before
   IGNORE      // as before
   PRODUCTIONS
     Check2
     = { Entry } EOF               (. if (Successful()) IO.writeLine("all correct"); .) .

     Entry
     = IPNumber name .

     IPNumber
     =                             (. int n, m; .)
     Number<out n>
     "." Number<out m>             (. n = 256 * n + m; .)
     "." Number<out m>             (. n = 256 * n + m; .)
     "." Number<out m>             (. n = 256 * n + m;
                                      if (IPList.contains(new Integer(n))) SemError("duplicate IP number");
                                      else IPList.add(new Integer(n)); .) .

     Number<out int n>
     ... as before

   END Check2.

Start off by studying these grammars carefully, and then making and executing the program.

As before, you should be able to use Coco/R to generate and then compile source for an application like this, as exemplified by

cmake Check1 or cmake Check1

A command like

crun Check1 ipdata.txt

will run the program Check1 and try to parse the file ipdata.txt, sending error messages to the screen. Giving the command in the form

crun Check1 ipdata.txt -L

will send an error listing to the file listing.txt, which might be more convenient. Try this out.


Task 3 Extending the application

As you may know, multiple names may be given to a host (associated with one IP number) on the Internet. On the basis that we might wish to do this for the computers in our building, but still only assign each name once, extend the application so that it might react sensibly to data of a form exemplified below.

          146.231.122.131   bungee.ru.ac.za       #comments appear like this
                            humfac.ru.ac.za       #alternative name
                            www.humfac.ru.ac.za   #and yet another name
          146.231.122.75    bungee.ru.ac.za       #invalid - name already used
          146.231.128.6     terrapin.ru.ac.za
          146.231.56.10     thistle-sp.ru.ac.za
          147.28.0.62       psg.com
          147.28.0.67       psg.com               #invalid - name already used
          146.231.122.1123  pdt1.ict.ru.ac.za     #invalid IP address
          146.231.122.156   pdt2.ict.ru.ac.za
          146.231.122.156   pdt3.ict.ru.ac.za     # non-unique IP address

Hint: This is not hard to do. Use the ArrrayList class to create two different lists.


Task 4 - Towards marketing a versatile calculator

All you have ever wanted - turn an enormous Wintel machine into a calculator!

Since the question in Practical 22 that extended expression grammars in Parva was not very well done, let us try to reinforce the concepts by constructing a calculator that can (a) deal with Boolean or Integer expressions (b) store and retrieve values in variables of the user's choice (c) guard against mixed-mode errors (such as dividing Boolean values or applying the NOT operator to an integer value).

Health warning. This is the most complex prac yet in this course, so do it in stages. But DO it!

A grammar describing the input to such a calculator (Calc.atg) appears below. The hierarchy of the set of productions for Expression imposes a precedence ordering on operators similar to the one used in C++/C#/Java.

     import Library.*;

     COMPILER Calc $NC
     //  Put your names and a description here

     static int toInt (boolean b) {
     // return 0 or 1 according as b is false or true
       return b ? 1 : 0;
     }

     static boolean toBool (int i) {
     // return false or true according as i is 0 or 1
       return i == 0 ? false : true;
     }

     CHARACTERS
       digit      = "0123456789" .
       letter     = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz" .
     TOKENS
       number     = digit { digit } .
       identifier = letter { letter | digit } .
     IGNORE CHR(0) .. CHR(31)
     PRODUCTIONS
       Calc       = { Print | Assignment } "quit" .

       Assignment = Variable "=" Expression SYNC ";" .
       Print      = "print" Expression { WEAK ","  Expression } SYNC ";" .

       Expression = AndExp { "||" AndExp } .
       AndExp     = EqlExp { "&&" EqlExp } .
       EqlExp     = RelExp { EqlOp RelExp } .
       RelExp     = AddExp [ RelOp AddExp ] .
       AddExp     = MultExp { AddOp MultExp } .
       MultExp    = UnaryExp { MulOp UnaryExp } .
       UnaryExp   = Factor | "+" UnaryExp | "-" UnaryExp | "!" UnaryExp .
       Factor     = Variable | Number | "true" | "false" | "(" Expression ")" .
       Variable   = identifier .
       Number     = number .
       MulOp      = "*" | "/" | "%"  .
       AddOp      = "+" | "-" .
       RelOp      = "<" | "<=" | ">" | ">=" .
       EqlOp      = "==" | "!=" .

     END Calc.

To make up a parser proceed in the usual way

cmake Calc
crun Calc calc.txt
crun Calc calc.bad -L

If you simply do that the results will not be very interesting - you will be able to parse expressions, and detect syntactically incorrect expressions, but not much more.

Add actions to this grammar so that it will be able to evaluate the expressions. To start with, assume that expressions will not make use of any "variables". (Calc.atg is already "spread out" to help in this regard).

Hints:

Bonus: Can anything go wrong in evaluating simple expressions? If so, how do you detect the fact and report it?


Task 5

Extend the system so that values can be stored in and retrieved from variables.

Hints:


Task 6

Just because in C++ you can mix integer and Boolean operands in almost any way in expressions does not mean that this is sensible - and of course you cannot really do that in Java or C# either. Extend your system so that it keeps track of the types of operands and expressions, and forbids mixing of the wrong types.

Hint: You might like to define an enumeration

      static final int
        noType   = 0,
        intType  = 1,
        boolType = 2;

to represent the various types that might be encountered. intType and boolType can be associated with correctly formed and defined operands and expressions, while noType can be associated with expressions and operands where the type is incorrect. Thus for example

      4 + 5                is of type  intType
      true || false        is of type  boolType
      4 == 5               is of type  boolType
      4 && 5               is of type  noType
      (4 == 5) && !false   is of type  boolType


Appendix - simple use of the ArrayList class

The following program shows how the ArrayList class can be used to construct a list of records of people's names and ages, and then to display this list and interrogate it. Note the use of casting operations!

  import Library.*;
  import java.util.*;

  class ShowArrayList {
  // Demonstrate simple application of the ArrayList class

    static class Entry {
      public String name;
      public int age;                                     // other fields could be added

      public Entry(String name, int age) {                // constructor
        this.name = name;
        this.age = age;
      }
    }

    static ArrayList list = new ArrayList();              // global for simplicity here!

    public static int position(String name) {
    // Finds position of entry with search key name in list, or -1 if not found
      int i = list.size() - 1;                            // index of last entry
      while (i >= 0 &&                                    // short-circuit protection
             !name.equals(((Entry)list.get(i)).name))     // must cast before extracting field
        i--;                                              // will reach -1 if no match
      return i;
    }

    public static void main (String[] args) {
    // Build a list of people's names and ages
      do {
        String name = IO.readWord();
        int age = IO.readInt();
        IO.readLn();
        if (!IO.eof()) list.add(new Entry(name, age));    // add to end of list
      } while (!IO.eof());

      IO.writeLine(list.size() + " items stored");        // interrogate size of list

      for (int i = 0; i < list.size(); i++) {             // display each entry
        Entry e = (Entry) list.get(i);                    // retrieve (via a cast) an item at position i
        IO.write(e.name, -16); IO.writeLine(e.age);       // -16 means "left justify"
      }
      IO.writeLine();

      int where = position("Pat");                        // find the silly fellow!
      if (where < 0) IO.writeLine("Pat not found")
      else {
        Entry pat = (Entry) list.get(where);
        IO.writeLine("Pat found at position " + where + ". He is " + pat.age + " years old");
      }
    }
  }


Home  © P.D. Terry