Computer Science 3 - 2000

Programming Language Translation


Practical for Week 24, beginning 9 October 2000

Hand in this prac sheet on your next practical day, correctly packaged in a transparent folder and with your solutions. This prac sheet forms the "cover sheet". Since the practical will have been done on a group basis, please hand in one copy of the prac sheet for each member of the group. These will be returned to you in due course, signed by the marker, and you will be asked to sign to acknowledge that you have received your own copy.

Your surname: (PRINT CLEARLY)          Your prac day:

Your student number: (PRINT CLEARLY)

Your signature:                        Your group partners:


Other people with whom you collaborated (you need not give tutors' names):



Mark awarded:                          Tutor's signature


Objectives

Ja, well, no, fine. All you folks who thought this was a bit of a jol so far, think again. Time to move towards some real compiler-like problems. The objectives of this practical are to allow you to find what life as a Computer Scientist is all about, by extending a simple computer language, producing a pretty-printer and a high level translator in an hour or two, and learning how to make a Boolean calculator and handle symbol tables. And don't panic. It's not as bad as it might appear - some of these tasks literally only require you to write two or three lines of code. But please, please, resist the temptation to dive in and hack. You really will spoil the weekend if you do.

You will need this prac sheet and your text book.


To hand in:

This week you are required to hand in, besides this cover sheet:

I do NOT require listings of any C++ or Pascal code produced by Coco/R.

Keep the prac sheet and your solutions until the end of the semester. Check carefully that your mark has been entered into the Departmental Records.

You are referred to the rules for practical submission which are clearly stated on page 10 of our Departmental Handbook. However, for this course pracs must be posted in the "hand-in" box in the secretary's office for collection by Pat Terry.

A rule not stated there, but which should be obvious, is that you are not allowed to hand in another student's work as your own. Attempts to do this will result in (at best) a mark of zero and (at worst) severe disciplinary action and the loss of your DP. You are allowed - even encouraged - to work and study with other students, but if you do this you are asked to acknowledge that you have done so.


Task 1 Buy the kit (for free!)

In the kit PRAC24C.ZIP or PRAC24P.ZIP you will find even more goodies than usual. For example, you will find the grammar for Clang, level 1, as discussed in section 8.7.2, in Cocol/R format ready for processing by Coco/R. You will also find two attributed grammars. One produces a system that will analyse a set of EBNF productions to check some aspects of correctness - specifically that all non-terminals introduced seem to have been defined, and that all non-terminals appear on the right of at least one production. The other produces a "pretty- printer" for simple Clang programs. There is a further, unattributed, grammar for a simple Clang-like language called Topsy. As in the story about the little girl of that name, this week and in the next two, Topsy will "just grow". (The C++ versions of these grammars are listed in an appendix; listings of the very similar Pascal versions are available on request.)


Task 2 Study the handout (that will cost you!)

Study the sample grammar for analysing EBNF productions to see how the attributes have been added. If you like, make the system and try it out in the usual way:

GENMAKE EBNF BORLAND
MAKE -f EBNF.MAK
EBNF EBNF.TXT

Please ask the demonstrators if there are things you don't understand in the grammar.


Task 3 - Build a Boolean Calculator

Something you have always wanted - a Boolean Calculator that will allow you to do fun things like this

      able = TRUE;
      baker = FALSE;
      charlie = able OR baker;
      dog = charlie AND NOT (baker OR 1);
      PRINT able, TRUE + FALSE, able' . baker'';
      QUIT

that is, the calculator can take a sequence of statements which have simple or complex Boolean expressions either in the form of assignments or as parameters to PRINT statements, the whole caboodle finally terminated with QUIT. The form of the input, in other words, is described by a grammar like this (BOOL.ATG in the kit; the grammar is just an extension of that which you should have produced in Prac 22 anyway, and I am feeling generous!)

   COMPILER Bool

   IGNORE CHR(1) .. CHR(31)

   CHARACTERS
     letter = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz" .

   TOKENS
     identifier = letter { letter } .

   PRODUCTIONS
     Bool       = { Print | Assignment } "QUIT" .
     Print      = "PRINT" OneExpr { WEAK "," OneExpr } SYNC ";" .
     OneExpr    = Expression .
     Assignment = identifier "=" Expression ";" .
     Expression = Term { Or Term } .
     Term       = Factor { [ And ] Factor } .
     Factor     = Not Factor | Primary { "'" } .
     Primary    = identifier | True | False | "(" Expression ")" .
     And        = "AND" | "&" | "." .
     Or         = "OR" | "|" | "+" .
     Not        = "NOT" | "!" .
     True       = "TRUE" | "1" .
     False      = "FALSE" | "0" .
   END Bool.

Your task is now to add the appropriate attributes into this grammar to allow you to generate a complete calculator. This you will be able to run, if you are using C++, by giving a sequence of commands like

GENMAKE BOOL BORLAND or GENMAKE BOOL VISUAL
MAKE -f BOOL.MAK
BOOL BOOL.TXT

or, if you are a Pascal user,

CR BOOL.ATG
TPC /M BOOL
BOOL BOOL.TXT

Okay, so maybe we need a bit more generosity. Here are some suggestions:

It will suffice to write the results of the calculations to the standard output file.

You might start by creating a system that parses expressions that might contain variables, but simply assumes that these all have the value "true". That is, cheat a bit to get going - get the system to evaluate silly expressions like

TRUE OR (NOT FALSE OR !1)

first. Why should this be easier? Well, you will later need to use some sort of table to contain the variables' names and their current values. But if this table does not exist, you can still handle expressions with constants only, because parsing such expressions will never actually pass through the parts of the system that need to look into the table.

When you go on to build the table routines, have a look at what has been done in miscebnf.h or MISCEBNF.PAS and develop an analogous miscbool.h or MISCBOOL.PAS that you can include into your calculator system.

You will notice that you have been given a compiler frame file BOOL.FRM which may need editing if you incorporate semantic error checks. Actually, not "if", but "when" you incorporate such checks. Your system must complain if you try to evaluate an expression that uses variables that have not (yet) had values assigned to them.

Incidentally, this task is something like the one that was set as a "24 hour" question in 1997.


Task 4 - Clang parser and pretty printer (trivial work here)

In case you have not met this concept before, a pretty printer is a "compiler" that takes a source program and "translates" the source into the same language. That probably does not sound very useful, but the clever trick is that the "object code" is formatted neatly, according to some simple conventions. For example, given a program that originally reads like this (don't laugh; I have seen this sort of thing!):

        PROGRAM Backwards; VAR Terminator;

        PROCEDURE Start(VAR Local1,Local2);
            FUNCTION ABS (x); BEGIN IF x < 0 THEN ABS := - x ELSE ABS := x END;

            PROCEDURE Reverse;
              VAR Number;
              BEGIN READ(Number);IF Local1<>ABS(Number) THEN BEGIN Reverse; WRITE(Number)END
              END;

            BEGIN
                  Local1:=Terminator;Reverse
                     END;

          BEGIN
      Terminator         :=       9;Start END.

A pretty printer might produce something like this (the exact layout is a matter of taste:

        PROGRAM Backwards;
          VAR
            Terminator;

          PROCEDURE Start (VAR Local1, Local2);

            FUNCTION ABS (x);
              BEGIN
                IF x < 0 THEN
                  ABS := - x
                ELSE
                  ABS := x
              END;

            PROCEDURE Reverse;
              VAR
                Number;
              BEGIN
                READ(Number);
                IF Local1 <> ABS(Number) THEN
                  BEGIN
                    Reverse;
                    WRITE(Number)
                  END;
              END;

... and so on

In the prac kit you will find two grammars for Clang level 1. The one is unattributed (CLANG.ATG) and is just the one from the textbook. The other, (CLPR.ATG) defines a pretty printer system for simple Clang programs. Essentially it does this by adding "actions" to the grammar within the meta-brackets (. ... .), as shown in the following extract:

            CompoundStatement =
              "BEGIN"                (. Append("BEGIN"); IndentNextLine(); .)
                 Statement
                   { ";"             (. Append(";"); NewLine(); .)
                     Statement }
              "END"                  (. ExdentNextLine(); Append("END"); .)   .

The actions incorporate calls to auxiiliary routines, which are coded up for you in the miscpret.h file (PRETTIER.PAS unit for Pascal fans). At the start of the attribute grammar there is the appropriate code (#include ... or USES ... ) to include these functions into your system.

Study the grammar carefully, and you will, perhaps, start to appreciate how powerful a tool like Coco/R can become.

You can build the system in the usual way:

GENMAKE CLPR BORLAND or GENMAKE CLPR VISUAL
MAKE -f CLPR.MAK
CLPR MESS.CLN

or, if you are a Pascal user,

CR CLPR.ATG
TPC /M CLPR
CLPR MESS.CLN

The output simply appears on the stdout strean; you can pipe it in the usual way:

CLPR MESS.CLN >PRETTY.CLN


Task 5 - More power to the people

Clang, as you are well aware, is a bit restricted. We must have more features or nobody will buy our software!

Carry on to extend your Clang grammar as it is used in the pretty printer to accept the following extensions:

Once you have done all that, you should be able to parse programs like the following (TEST.CLN)

  PROGRAM Better;
    VAR X, Y, Z;
    BEGIN
      X := 0;
      REPEAT
        X++;
        IF X MOD 3 = 0
          THEN WRITE(X, ' is divisible by three')
          ELSE WRITE(X, ' is not divosible by three')
      UNTIL X = 100;
      FOR Y := X - 1 TO X + 4 DO
        BEGIN WRITE(Y); Z := Y / 2 END;
    END.

Of course you won't be able to run them. But you should be able to pretty print them, and also detect syntactic errors in programs that you are given to parse.

Some more hints? Well, it is actually all pretty straightforward. You may have to work a bit to get your grammars to be close to LL(1), and in one case you won't be able to, I don't think. But even then you can still get a decent parser. If you are troubled by this, either read the textbook for the first time, or after you have read the textbook, ask a demonstrator for help.


Task 6 - A million lemmings can't be wrong (trivial work here)

For the next exercise, let us imagine that we have finally decided to bow to pressure to throw out that dreadful Clang language, and start to get with it, by using a language that resembles C++ rather more closely than it does Pascal.

In the kit you will find a syntactic description of one such language, which I have christened Topsy (TOPSY.ATG). In a flash of an eye you can generate a parser for it, and if you look, you will see that it's really very similar to Clang in many respects - just with a few extras here and there. There's even a simple translation of VOTE.CLN in VOTE.TOP for you to parse.

GENMAKE TOPSY BORLAND or GENMAKE TOPSY VISUAL
MAKE -f TOPSY.MAK
TOPSY VOTE.TOP

or, if you are a Pascal user,

CR TOPSY.ATG
TPC /M TOPSY
TOPSY VOTE.TOP


Task 7 - A million lemmings are never satisfied

Extend your Topsy grammar to give you the equivalent goodies to those in the extended Clang in Task 5 - the ++ and -- "statements", the modulus operator, an else option in the if statement, the do-while loop, and the for loop. You should, of course, model your syntax on that familiar from C++ (hey, does this mean we are inventing Topsy++?). It will suffice to limit for loops to the simple ones exemplified by

for (i = initial; i <= final; i++) ...

or

for (i = initial; i >= final; i--) ...

Your parser should then be able to handle code rather like the following (SORT.TOP):

  void main (void) {
  // A simple Topsy++ program for doing a Bubble Sort
    const Max = 20;
    cout << "Type a list of numbers to sort, terminated by a multiple of 8\a\n";
    int List[20], N = 0;
    do {
      cin >> List[N]; N++;
    } while ((List[N-1] % 8 != 0); /* a number divisible by 8 stops the list */
    if (N > 1) {
      bool Sorted = false;         /* local Boolean variable */
      N--;                         /* throw away stopping value */
      int Number = N;
      while (N > 1) {
        int I = 0;                 /* local loop control variable */
        while (I <= N - 2) {
          if (List[I] > List[I + 1]) {
            int Temp = List[I];    /* swap two values using local Temp */
            List[I] = List[I + 1];
            List[I + 1] = Temp;
          }
          I++;
        }
        N--;
        ;
      }
      cout << "Sorted list of " << Number << " numbers\n";;
      int J;
      for ( J = 0; J < Number; J++) cout << List[J] << " ";
      cout << "\n";
    }
    else cout << "No data given\n";
  }


Task 8 - Feed the lemmings

So what about all the Clang fans with millions of lines of applications, and whose idiotic managers have told them that from now on all software development must be done in Topsy? Groan. How do we rewrite millions of lines of code developed in a one language into another one?

Answer - we get a translator to do the job for us. We have to write that translator, of course, but we have a set of powerful tools at our disposal ...

If two high level languages are very similar, a translator from one to the other can often be developed by taking the idea of a pretty printer one stage further - rather than writing the same symbols that it reads, it writes slightly different ones. For example, a Clang CompoundStatement could be translated to the equivalent Topsy version by attributing the production as follows:

            CompoundStatement =
              "BEGIN"                (. Append("{"); IndentNextLine(); .)
                 Statement
                   { ";"             (. NewLine(); .)
                     Statement }
              "END"                  (. ExdentNextLine(); Append("}"); .)   .

Develop a complete Clang - Topsy translator in this way. Test it out by converting your Clang examples to Topsy, and then checking that your Topsy parser will still accept them.

Much of this is very simple. Note that Topsy is actually more "powerful" than Clang, as it has integers, chars and Booleans, it has the ability to initialise variables as you declare them and so on. Clang does not have these features, but that simply makes the translation even easier. There are a few problem areas, he says, encouragingly:


Appendix - C++ versions of grammars

COMPILER EBNF $XCN
/* Describe simple EBNF productions */

/* we may include arbitrary text here, in particular #include directives
   Pascal version have USES clauses to achieve the same effect */

#include "miscebnf.h"

CHARACTERS
  letter   = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz".
  lowline  = "_".
  digit    = "0123456789".
  noquote1 = ANY - "'".
  noquote2 = ANY - '"'.

IGNORE CHR(9) .. CHR(13)

COMMENTS FROM "(*" TO "*)"  NESTED

TOKENS
  nonterminal = letter {letter | lowline | digit} .
  terminal =  "'" noquote1 {noquote1} "'" | '"' noquote2 {noquote2} '"' .

PRODUCTIONS
   EBNF
   =                                     (. TABLE_InitTable() .)
     { Production }                      (. if (Successful())
                                              TABLE_TestProductions(); .)
     EOF .

   Production
   =                                     (. char Name[100]; .)
     SYNC nonterminal                    (. LexString(Name, sizeof(Name) - 1);
                                            TABLE_Add(Name, TABLE_LHS) .)
     WEAK "=" Expression SYNC "." .

   Expression = Term { WEAK "|" Term } .

   Term = [ Factor { Factor } ] .

   Factor
   =                                     (. char Name[100]; .)
       nonterminal                       (. LexString(Name, sizeof(Name) - 1);
                                            TABLE_Add(Name, TABLE_RHS) .)
     | terminal
     | "[" Expression "]"
     | "(" Expression ")"
     | "{" Expression "}" .

END EBNF.


// Various common items for grammar checker

#ifndef MISCEBNF_H
#define MISCEBNF_H

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <stdarg.h>
#include <ctype.h>
#include <limits.h>

#define  boolean  int
#define  bool     int
#define  true     1
#define  false    0
#define  TRUE     1
#define  FALSE    0
#define  maxint   INT_MAX

#if __MSDOS__ || MSDOS || WIN32 || __WIN32__
#  define  pathsep '\\'
#else
#  define  pathsep '/'
#endif

static void appendextension (char *oldstr, char *ext, char *newstr)
// Changes filename in oldstr from PRIMARY.xxx to PRIMARY.ext in newstr
{ int i;
  char old[256];
  strcpy(old, oldstr);
  i = strlen(old);
  while ((i > 0) && (old[i-1] != '.') && (old[i-1] != pathsep)) i--;
  if ((i > 0) && (old[i-1] == '.')) old[i-1] = 0;
  if (ext[0] == '.') sprintf(newstr,"%s%s", old, ext);
    else sprintf(newstr, "%s.%s", old, ext);
}

typedef enum { TABLE_LHS, TABLE_RHS } SIDES;

typedef struct {
  char Name[50];
  int Left, Right;
} ENTRIES;

const int TABLE_MaxProductions = 100;
ENTRIES TABLE_Table[TABLE_MaxProductions];
int TABLE_NEntries;

static void TABLE_InitTable() {
// Initialise table
  TABLE_NEntries = 0;
}

static void TABLE_Add (char *N, SIDES Where) {
// Add a nonterminal name N to the table, according as to Where in a
// production it was found
  if (TABLE_NEntries > TABLE_MaxProductions) {
    fprintf(stderr, "Table Overflow"); exit(1);
  }
  strcpy(TABLE_Table[TABLE_NEntries + 1].Name, N); // sentinel
  TABLE_Table[TABLE_NEntries + 1].Left = 0;
  TABLE_Table[TABLE_NEntries + 1].Right = 0;
  int i = 1;
  while (strcmp(N, TABLE_Table[i].Name)) i++;
  if (i > TABLE_NEntries) TABLE_NEntries++; // new entry
  if (Where == TABLE_LHS) TABLE_Table[i].Left++; else TABLE_Table[i].Right++;
}

static void TABLE_TestProductions() {
// Check that all non terminals have appeared once on the left side of
// each production, and at least once on the right side of each production
  bool OK = TRUE; // optimistic, even if he is a bastard!
  for (int i = 1; i <= TABLE_NEntries; i++) {
//  printf("%s %d %d\n", TABLE_Table[i].Name, TABLE_Table[i].Left, TABLE_Table[i].Right);
    if (TABLE_Table[i].Left == 0) {
      OK = false;
      printf("%s never defined\n", TABLE_Table[i].Name);
    }
    else if (TABLE_Table[i].Left > 1) {
      OK = false;
      printf("%s defined more than once\n", TABLE_Table[i].Name);
    }
  }
  for (i = 2; i <= TABLE_NEntries; i++) {
    if (TABLE_Table[i].Right == 0) {
      OK = false;
      printf("%s cannot be reached\n", TABLE_Table[i].Name);
    }
  }
  if (OK) printf("Seems to be reduced grammar\n");
  else printf("Cannot be reduced grammar\n");
}

#endif /* MISCEBNF_H */

COMPILER ClPr $XCN
/* CLANG level 1 prettyprinter
   P.D. Terry, Rhodes University, 2000 */

#include "miscpret.h"

IGNORE CASE
IGNORE CHR(9) .. CHR(13)

CHARACTERS
  cr         = CHR(13) .
  lf         = CHR(10) .
  letter     = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz" .
  digit      = "0123456789" .
  instring   = ANY - "'" - cr - lf .

COMMENTS FROM "(*" TO "*)"

TOKENS
  identifier = letter { letter | digit } .
  number     = digit { digit } .
  string     = "'" (instring | "''") { instring | "''" } "'" .

PRODUCTIONS
  ClPr
  =  "PROGRAM"                   (. Append("PROGRAM "); .)
     Identifier
     WEAK ";"                    (. Append(";"); IndentNextLine(); .)
     Block
     "."                         (. Append("."); NewLine(); .).

  Block
  =  SYNC { ( ConstDeclarations | VarDeclarations ) SYNC }
     CompoundStatement .

  ConstDeclarations
  =  "CONST"                     (. Append("CONST"); IndentNextLine(); .)
     OneConst
     {                           (. NewLine(); .)
       OneConst }                (. ExdentNextLine(); .) .

  OneConst
  =  Identifier WEAK "="         (. Append(" = ") .)
     Number ";"                  (. Append(";") .) .

  VarDeclarations
  =  "VAR"                       (. Append("VAR"); IndentNextLine(); .)
     OneVar
     { WEAK ","                  (. Append(", ") .)
       OneVar
     }
     ";"                         (. Append(";"); ExdentNextLine(); .) .

  OneVar
  =  Identifier [ UpperBound ] .

  UpperBound
  =  "["                         (. Append("[") .)
     Number
     "]"                         (. Append("]") .) .

  CompoundStatement
  =  "BEGIN"                     (. Append("BEGIN"); IndentNextLine(); .)
        Statement
        { WEAK ";"               (. Append(";"); NewLine(); .)
          Statement }
     "END"                       (. ExdentNextLine(); Append("END") .) .

  Statement
  =  SYNC [   CompoundStatement | Assignment
            | IfStatement       | WhileStatement
            | ReadStatement     | WriteStatement ] .

  Assignment
  =  Variable
     ":="                        (. Append(" := ") .)
     Expression SYNC .

  Variable
  =  Designator .

  Designator
  =  Identifier
     [  "["                      (. Append("[") .)
        Expression
        "]"                      (. Append("]") .)
     ] .

  IfStatement
  =  "IF"                        (. Append("IF ") .)
        Condition
     "THEN"                      (. IndentNextLine(); Append("THEN ") .)
        Statement                (. Exdent .) .

  WhileStatement
  =  "WHILE"                     (. Append("WHILE ") .)
        Condition
     "DO"                        (. Append(" DO ") .)
        Statement .

  ReadStatement
  =  "READ" "("                  (. Append("READ(") .)
      Variable
      { ","                      (. Append(", ") .)
        Variable }
     ")"                         (. Append(")") .) .

  WriteStatement
  =  "WRITE"                     (. Append("WRITE") .)
     [ "("                       (. Append("(") .)
       WriteElement
       { WEAK ","                (. Append(", ") .)
       WriteElement }
       ")"                       (. Append(")") .)
     ] .

  WriteElement
  =  String | Expression .

  Condition
  =  Expression RelOp Expression .

  Expression
  =  (   "+"                     (. Append("+") .)
        Term
      | "-"                      (. Append("-") .)
        Term
      | Term
     ) { AddOp Term } .

  Term
  =  Factor { MulOp Factor } .

  Factor
  =   Designator
    | Number
    | "("                        (. Append("(") .)
      Expression
	      ")"                        (. Append(")") .) .

  AddOp
  =  "+"                         (. Append(" + ") .)
   | "-"                         (. Append(" - ") .) .

  MulOp
  =  "*"                         (. Append(" * ") .)
   | "/"                         (. Append(" / ") .) .

  RelOp
  =  "="                         (. Append(" = ") .)
   | "<>"                        (. Append(" <> ") .)
   | "<"                         (. Append(" < ") .)
   | "<="                        (. Append(" <= ") .)
   | ">"                         (. Append(" > ") .)
   | ">="                        (. Append(" >= ") .) .

  Identifier
  =                              (. char str[100]; .)
     identifier                  (. LexString(str, 100); Append(str); .) .

  Number
  =                              (. char str[100]; .)
     number                      (. LexString(str, 100); Append(str); .) .

  String
  =                              (. char str[100]; .)
     string                      (. LexString(str, 100); Append(str); .) .

END ClPr.

// Various common items

#ifndef MISC_PRETH
#define MISC_PRETH

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <stdarg.h>
#include <ctype.h>
#include <limits.h>

#define  boolean  int
#define  bool     int
#define  true     1
#define  false    0
#define  TRUE     1
#define  FALSE    0
#define  maxint   INT_MAX

#if __MSDOS__ || MSDOS || WIN32 || __WIN32__
#  define  pathsep '\\'
#else
#  define  pathsep '/'
#endif

static void appendextension (char *oldstr, char *ext, char *newstr)
// Changes filename in oldstr from PRIMARY.xxx to PRIMARY.ext in newstr
{ int i;
  char old[256];
  strcpy(old, oldstr);
  i = strlen(old);
  while ((i > 0) && (old[i-1] != '.') && (old[i-1] != pathsep)) i--;
  if ((i > 0) && (old[i-1] == '.')) old[i-1] = 0;
  if (ext[0] == '.') sprintf(newstr,"%s%s", old, ext);
    else sprintf(newstr, "%s.%s", old, ext);
}

// pretty printing auxiliary routines

static int Indentation = 0;

void Append(char String[])
// Append String to results file
{  
  printf("%s",String);
}

void NewLine(void)
// Write line mark to results file but leave indentation as before
{
  int i = 1;
  printf("\n");
  while (i <= Indentation) { printf(" "); i++; }
}

void IndentNextLine(void)
// Write line mark to results file and prepare to indent further lines
// by two spaces more than before
{ 
  Indentation += 2;
  NewLine();
}

void ExdentNextLine(void)
// Write line mark to results file and prepare to indent further lines
// by two spaces less
{  
  Indentation -= 2;
  NewLine();
}

void Indent(void)
// increment indentation level by 2
{
  Indentation += 2;
}

void Exdent(void)
// decrement indentation level by 2
{ 
  Indentation -= 2;
}

#endif /* MISC_PRETH */

Appendix - Pascal versions of Grammars

COMPILER EBNF  $XCN
/* Describe simple EBNF productions and perform simple checks on them
   P.D. Terry, Rhodes University, 2000 */

USES MiscEBNF; (* USES clauses may appear right at the start *)

CHARACTERS
  letter   = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz".
  lowline  = "_".
  digit    = "0123456789".
  noquote1 = ANY - "'".
  noquote2 = ANY - '"'.

IGNORE CHR(9) .. CHR(13)

COMMENTS FROM "(*" TO "*)"  NESTED

TOKENS
  nonterminal = letter {letter | lowline | digit} .
  terminal =  "'" noquote1 {noquote1} "'" | '"' noquote2 {noquote2} '"' .

PRODUCTIONS
   EBNF
   =                                     (. TABLE_InitTable .)
     { Production }                      (. IF Successful THEN
                                              TABLE_TestProductions; .)
     EOF .

   Production                            (. VAR Name : STRING; .)
   = SYNC nonterminal                    (. LexString(Name);
                                            TABLE_Add(Name, TABLE_LHS) .)
     WEAK "=" Expression SYNC "." .

   Expression = Term { WEAK "|" Term } .

   Term = [ Factor { Factor } ] .

   Factor                                (. VAR Name : STRING; .)
   =   nonterminal                       (. LexString(Name);
                                            TABLE_Add(Name, TABLE_RHS) .)
     | terminal
     | "[" Expression "]"
     | "(" Expression ")"
     | "{" Expression "}" .

END EBNF.


UNIT MiscEBNF;

INTERFACE
  TYPE
    SIDES = (TABLE_LHS, TABLE_RHS);

  PROCEDURE TABLE_InitTable;
  (* Initialise table *)

  PROCEDURE TABLE_Add (N : STRING; Where : SIDES);
  (* Add a nonterminal name N to the table, according as to Where in a
     production it was found *)

  PROCEDURE TABLE_TestProductions;
  (* Check that all non terminals have appeared once on the left side of
     each production, and at least once on the right side of each production *)

IMPLEMENTATION

TYPE
  ENTRIES = RECORD
    Name : STRING;
    Left, Right : INTEGER;
  END;

CONST
  TABLE_MaxProductions = 100;

VAR
  TABLE_Table : ARRAY [0 .. TABLE_MaxProductions] OF ENTRIES;
  TABLE_NEntries : INTEGER;

PROCEDURE TABLE_InitTable;
  BEGIN
    TABLE_NEntries := 0;
  END;

PROCEDURE TABLE_Add (N : STRING; Where : SIDES);
  VAR
    I : INTEGER;
  BEGIN
    IF TABLE_NEntries > TABLE_MaxProductions THEN BEGIN
      WriteLn('Table Overflow'); HALT
    END;
    TABLE_Table[TABLE_NEntries + 1].Name := N; (* sentinel *)
    TABLE_Table[TABLE_NEntries + 1].Left := 0;
    TABLE_Table[TABLE_NEntries + 1].Right := 0;
    I := 1;
    WHILE N <> TABLE_Table[I].Name DO INC(I);
    IF I > TABLE_NEntries THEN INC(TABLE_NEntries); (* new entry *)
    IF Where = TABLE_LHS
      THEN INC(TABLE_Table[I].Left)
      ELSE INC(TABLE_Table[I].Right)
  END;

PROCEDURE TABLE_TestProductions;
  VAR
    OK : BOOLEAN;
    I : INTEGER;
  BEGIN
    OK := TRUE; (* optimistic, even IF he is a bastard! *)
    FOR I := 1 TO TABLE_NEntries DO
      IF TABLE_Table[I].Left = 0
        THEN BEGIN
          OK := FALSE;
          WriteLn(TABLE_Table[I].Name, ' never defined');
        END
        ELSE IF TABLE_Table[I].Left > 1 THEN BEGIN
          OK := FALSE;
          WriteLn(TABLE_Table[I].Name, ' defined more than once');
        END;
    FOR I := 2 TO TABLE_NEntries DO
      IF TABLE_Table[I].Right = 0 THEN BEGIN
        OK := FALSE;
        WriteLn(TABLE_Table[I].Name, ' cannot be reached');
      END;
    IF OK
      THEN WriteLn('Seems to be reduced grammar')
      ELSE WriteLn('Cannot be reduced grammar')
  END;

END.

COMPILER ClPr $XCN
/* CLANG level 1 prettyprinter
   P.D. Terry, Rhodes University, 2000 */

USES Prettier;

IGNORE CASE
IGNORE CHR(9) .. CHR(13)

CHARACTERS
  cr         = CHR(13) .
  lf         = CHR(10) .
  letter     = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz" .
  digit      = "0123456789" .
  instring   = ANY - "'" - cr - lf .

COMMENTS FROM "(*" TO "*)"

TOKENS
  identifier = letter { letter | digit } .
  number     = digit { digit } .
  string     = "'" (instring | "''") { instring | "''" } "'" .

PRODUCTIONS
  ClPr
  =  "PROGRAM"                   (. Append('PROGRAM ') .)
     Identifier
     WEAK ";"                    (. Append(';'); IndentNextLine .)
     Block
     "."                         (. Append('.'); NewLine .).

  Block
  =  SYNC { ( ConstDeclarations | VarDeclarations ) SYNC }
     CompoundStatement .

  ConstDeclarations
  =  "CONST"                     (. Append('CONST'); IndentNextLine .)
     OneConst
     {                           (. NewLine .)
       OneConst }                (. ExdentNextLine .) .

  OneConst
  =  Identifier WEAK "="         (. Append(' = ') .)
     Number ";"                  (. Append(';') .) .

  VarDeclarations
  =  "VAR"                       (. Append('VAR'); IndentNextLine .)
     OneVar
     { WEAK ","                  (. Append(', ') .)
       OneVar
     }
     ";"                         (. Append(';'); ExdentNextLine .) .

  OneVar
  =  Identifier [ UpperBound ] .

  UpperBound
  =  "["                         (. Append('[') .)
     Number
     "]"                         (. Append(']') .) .

  CompoundStatement
  =  "BEGIN"                     (. Append('BEGIN'); IndentNextLine .)
        Statement
        { WEAK ";"               (. Append(';'); NewLine .)
          Statement }
     "END"                       (. ExdentNextLine; Append('END') .) .

  Statement
  =  SYNC [   CompoundStatement | Assignment
            | IfStatement       | WhileStatement
            | ReadStatement     | WriteStatement ] .

  Assignment
  =  Variable
     ":="                        (. Append(' := ') .)
     Expression SYNC .

  Variable
  =  Designator .

  Designator
  =  Identifier
     [  "["                      (. Append('[') .)
        Expression
        "]"                      (. Append(']') .)
     ] .

  IfStatement
  =  "IF"                        (. Append('IF ') .)
        Condition
     "THEN"                      (. IndentNextLine; Append('THEN ') .)
        Statement                (. Exdent .) .

  WhileStatement
  =  "WHILE"                     (. Append('WHILE ') .)
        Condition
     "DO"                        (. Append(' DO ') .)
        Statement .

  ReadStatement
  =  "READ" "("                  (. Append('READ(') .)
      Variable
      { ","                      (. Append(', ') .)
        Variable }
     ")"                         (. Append(')') .) .

  WriteStatement
  =  "WRITE"                     (. Append('WRITE') .)
     [ "("                       (. Append('(') .)
       WriteElement
       { WEAK ","                (. Append(', ') .)
       WriteElement }
       ")"                       (. Append(')') .)
     ] .

  WriteElement
  =  String | Expression .

  Condition
  =  Expression RelOp Expression .

  Expression
  =  (   "+"                     (. Append('+') .)
        Term
      | "-"                      (. Append('-') .)
        Term
      | Term
     ) { AddOp Term } .

  Term
  =  Factor { MulOp Factor } .

  Factor
  =   Designator
    | Number
    | "("                        (. Append('(') .)
      Expression
      ")"                        (. Append(')') .) .

  AddOp
  =  "+"                         (. Append(' + ') .)
   | "-"                         (. Append(' - ') .) .

  MulOp
  =  "*"                         (. Append(' * ') .)
   | "/"                         (. Append(' / ') .) .

  RelOp
  =  "="                         (. Append(' = ') .)
   | "<>"                        (. Append(' <> ') .)
   | "<"                         (. Append(' < ') .)
   | "<="                        (. Append(' <= ') .)
   | ">"                         (. Append(' > ') .)
   | ">="                        (. Append(' >= ') .) .

  Identifier
                                 (. VAR str : STRING; .)
  =  identifier                  (. LexString(str); Append(str); .) .

  Number
                                 (. VAR str : STRING; .)
  =  number                      (. LexString(str); Append(str); .) .

  String
                                 (. VAR str : STRING; .)
  =  string                      (. LexString(str); Append(str); .) .

END ClPr.

UNIT Prettier;
(* Pretty printing auxiliary functions for use with pretty printing
   programs generated by COCO from an attributed grammar
   P.D. Terry, Rhodes University, 2000 *)

INTERFACE

  VAR
    results : TEXT;

  PROCEDURE Append (Str : STRING);
  (* Append Str to results file *)

  PROCEDURE IndentNextLine;
  (* Write line mark to results file and prepare to indent further lines
     by two spaces more than before *)

  PROCEDURE ExdentNextLine;
  (* Write line mark to results file and prepare to indent further lines
     by two spaces less *)

  PROCEDURE NewLine;
  (* Write line mark to results file but leave indentation as before *)

  PROCEDURE Indent;
  PROCEDURE Exdent;

IMPLEMENTATION

  VAR
    Indentation : INTEGER;

  PROCEDURE Append (Str : STRING);
    BEGIN
      Write((* results, *) Str);
    END;

  PROCEDURE NewLine;
    VAR
      I : INTEGER;
    BEGIN
      WriteLn (*(results)*) ;
      FOR I := 1 TO Indentation DO Write((* results, *) ' ');
    END;

  PROCEDURE IndentNextLine;
    BEGIN
      INC(Indentation, 2); NewLine
    END;

  PROCEDURE ExdentNextLine;
    BEGIN
      DEC(Indentation, 2); NewLine
    END;

  PROCEDURE Indent;
    BEGIN
      INC(Indentation, 2)
    END;

  PROCEDURE Exdent;
    BEGIN
      DEC(Indentation, 2)
    END;

  BEGIN
    Indentation := 0
  END.