RHODES UNIVERSITY


Computer Science 301 - 2010 - Programming Language Translation

Well, here you are. Here is the free information you have all been waiting for, with some extra bits of advice:


Section B [ 70 marks ]

Extending Parva to allow for enumeration types

Most computer languages provide simple, familiar, notations for handling arithmetic, character and Boolean types of data. Variables, structures and arrays can be declared of these basic types; they may be passed from one routine to another as parameters, and so on.

Some languages, notably Pascal, Modula-2, C, C++, and Ada, allow programmers the flexibility to define what are often known as enumeration types, or simply enumerations. Here are some examples to illustrate this idea:

     TYPE  (* Pascal or Modula-2 *)
       COLOURS = ( Red, Orange, Yellow, Green, Blue, Indigo, Violet );
       INSTRUMENTS = ( Drum, Bass, Guitar, Trumpet, Trombone, Saxophone, Bagpipe );
     VAR
       Walls, Ceiling, Roof : COLOURS;
       JazzBand : ARRAY [0 .. 40] OF INSTRUMENTS;

or the equivalent

     typedef  /* C or C++ */
       enum { Red, Orange, Yellow, Green, Blue, Indigo, Violet } COLOURS;
     typedef
       enum { Drum, Bass, Guitar, Trumpet, Trombone, Saxophone, Bagpipe } INSTRUMENTS;
     COLOURS      Walls, Ceiling, Roof;
     INSTRUMENTS  JazzBand[41];

Sometimes the variables are declared directly in terms of the enumerations:

     VAR  (* Pascal or Modula-2 *)
       CarHireFleet : ARRAY [1 .. 100] OF ( Golf, Tazz, Sierra, BMW316 );

     enum CARS { Golf, Tazz, Sierra, BMW316 } CarHireFleet[101];  /* C or C++ */

Java got into the act rather later; the original version of Java did not provide the facility that is now manifest in Java and C#, where we might declare:

     enum COLOURS { Red, Orange, Yellow, Green, Blue, Indigo, Violet };
     enum INSTRUMENTS { Drum, Bass, Guitar, Trumpet, Trombone, Saxophone, Bagpipe };
     COLOURS      Walls, Ceiling, Roof;
     INSTRUMENTS[] JazzBand = new INSTRUMENTS[41];

The big idea here is to introduce a distinct, usually rather small, set of values which a variable can legitimately be assigned. Internally these values are represented by small integers - in the case of the CarHireFleet example the "value" of Golf would be 0, the value of Tazz would be 1, the value of Sierra would be 2, and so on.

In the C/C++ development of this idea the enumeration, in fact, results in nothing more than the creation of an implicit list of const int declarations. Thus the code

        enum CARS { Golf, Tazz, Sierra, BMW316 } CarHireFleet[101]; 

is semantically completely equivalent to

      const int Golf = 0; const int Tazz = 1; 
      const int Sierra = 2, const int BMW316 = 3; 
      int CarHireFleet[101]; 

and to all intents and purposes this gains very little, other than possible readability - an assignment like

      CarHireFleet[N] = Tazz; 

might, of course, convey more to a reader than the semantically identical

      CarHireFleet[N] = 1; 

In the much more rigorous Pascal and Modula-2 approach one would not be allowed this freedom; one would be forced to write

      CarHireFleet[N] := Tazz; 

Furthermore, whereas in C/C++ one could write code with rather dubious meaning like

      CarHireFleet[4] = 45;             /* Even though 45 does not correspond to any known car! */ 
      CarHireFleet[1] = Tazz / Sierra;  /* Oh come, come! */ 
      Walls = Sierra;                   /* Whatever turns you on is allowed in C++ */ 

in Pascal, Modula-2, Java and C# one cannot perform arithmetic on variables of these types directly, or assign values of one type to variables of an explicitly different type. In short, the idea is to promote "safe" programming - if variables can meaningfully only assume one of a small set of values, the compiler (and/or run-time system) should prevent the programmer from writing (or executing) meaningless statements.

Clearly there are some operations that could have sensible meaning. Looping and comparison statements like

      if (Walls == Indigo) Redecorate(Blue); 

or

      for (Roof = Red; Roof <= Violet; Roof++) DiscussWithNeighbours(Roof); 

or

      if (JazzBand[N] >= Saxophone) Shoot(JazzBand[N]); 

might reasonably be thought to make perfect sense - and would be easy to "implement" in terms of the underlying integer values.

In fact, the idea of a limited enumeration is already embodied in the standard character (and, in some languages, Boolean) types - type Boolean is, in a sense the enumeration of the values {0, 1} identified as {false, true}, although this type is so common that the programmer is not required to declare the type explicitly. Similarly, the character type is really an enumeration of a sequence of (typically) ASCII codes, and so on.

Although languages that support enumeration types forbid programmers from abusing variables and constants of any enumeration types that they might declare, the idea of "casting" allows programmers to bypass the security where necessary. The reader will be familiar with the (rather strange) notation in the C family of languages that allows code like

      char uc = (char) 65;                 // 'A' 
      char lc = (char) ( (int) uc + 32 );  // 'a' 

In Pascal and Modula-2 a standard function ORD(x) can be applied to a value of an enumeration type to do little more than cheat the compiler into extracting the underlying integral value. This, and the inverse operation of cheating the compiler into thinking that it is dealing with a user-defined value when you want to map it from an integer, are exemplified by Modula-2 code like

      IF (ORD(Bagpipe) > 4) THEN ..... 
      Roof := VAL(COLOURS, I + 5)); 

Rather annoyingly, in Pascal and Modula-2 one cannot read and write values of enumeration types directly - one has to use these casting functions and switching statements to achieve the desired effects.

Enumerations are a "luxury" - clearly they are not really needed, as all they provide is a slightly safer way of programming with small integers. Not surprisingly, therefore, they are not found in languages like Oberon (simplified from Modula-2).

In recent times you have studied and extended a compiler for a small language, Parva, in the implementation of which we have repeatedly stressed the ideas and merits of safe programming.

How would you add the ability to define enumeration types in Parva programs and to implement these types, at the same time providing safeguards to ensure that they could not be abused? Initially, strive to develop a system that will allow

You may wish to read up a little more on enumeration types as they are used in languages like Modula-2 and Pascal. Enumeration types in Java and C# are rather more complex in their full ramifications, however.


Hints: