RHODES UNIVERSITY

November Examinations - 1996 - Solutions

Computer Science 3 - Paper 3 - Solutions


Marks overall ranged from about 42% to 78%. Two people scored over 75%, and three of the four failures were very near misses. Class average mark was about 59%, which is about what it usually is.

Section A [ 90 marks ]

A1 Explain what is meant by each of the following. Where possible, give simple examples to illustrate your answers. [18]

See text book

A2 What are the main differences between a compiler and an interpreter? What advantages and disadvantages accrue from the use of an interpreter rather than a compiler? [5]

See text book

A3 Explain what is meant by the FIRST and FOLLOW sets as applied to grammar analysis, describe how these sets are evaluated, and state the rules that a grammar must obey for it to be LL(1). [16]

See text book

A4 Shakespearian plays all have five acts. In each act there may be several scenes, and in each scene appear one or more actors, who gesticulate and make speeches to one another (for the benefit of the audience, of course). Actors come onto the stage at the start of each scene, and come and go as the scene proceeds - to all intents and purposes between speeches - finally leaving at the end of the scene (in the Tragedies some leave dead, of course, but they usually revive themselves for the after-show party). Plays are staged with an interval between the third and fourth acts.

The question on a grammar for describing plays was not very well done (average 50%). In spite of all my efforts, very few people know how to test a grammar for being LL(1). Nobody thought of simply submitting their grammar to Coco/R, which would have been the very easy way to get the solution (and quite legal!).

        (Production set 1)

              Play   = Act Act Act "interval" Act Act .
              Act    = Scene { Scene } .
              Scene  = { "speech" } "entry" { Action } .
              Action = "speech" | "entry" | "exit" | "death" | "gesticulation" .
Of course there can be many variations on this. For example we could think of a speech as something that could be punctuated by other actions (you were actually told to regard other actions as coming between speeches, but that's not really how it always happens). This would lead to
        (Production set 2)

              Play   = Act Act Act "interval" Act Act .
              Act    = Scene { Scene } .
              Scene  = { Speech } "entry" { Speech | Action } .
              Speech = "word" { "word" | Action } .
              Action =  "entry" | "exit" | "death" | "gesticulation" .
Notice that nothing we have so far requires all the actors to leave at the end of any scene (sometimes they don't in real life, either). We could try to get this effect by writing
            Scene  = { Speech } "entry" { Speech | Action } { "exit" } .
but note that the context-free grammar cannot force as many actors to leave as entered - in computer language terms you will/may/should recognize this as the same problem of not being able to guarantee that the number of formal and actual parameters to a procedure agree!

The after-stage party is just "syntactic noise" (pun intended).

        (Production set 3)

              Play   = Act Act Act "interval" Act Act .
              Act    = Scene { Scene } .
              Scene  = Action { Action } .
              Action = "speech" | "entry" | "exit" .
To check the rules, let's look at my set 1 again.
      Play   = Act Act Act "interval" Act Act .
      Act    = Scene { Scene } .
      Scene  = { "speech" } "entry" { Action } .
      Action = "speech" | "entry" | "exit" | "death" | "gesticulation" .
To check Rule 1 we look at the productions for each non-terminal that yields alternatives on its right hand side. There is seemingly only one non-terminal in this category here, namely Action. So we have to look at
      FIRST("speech")  Intersection  FIRST("entry")
      FIRST("speech")  Intersection  FIRST("exit")
      FIRST("speech")  Intersection  FIRST("death")
      FIRST("speech")  Intersection  FIRST("gesticulation")
etc. We do not have to look at FIRST(Action) Intersection anything!

Clearly Rule 1 is satisfied for any pairwise combination taken from the options in the productions for Action.

One has to be careful, very careful, about Rule 2. Rule 2 applies when there are eps productions and alternatives. It does not apply to the productions for Action, and so some people will be lulled into thinking that (since this is the only place where alternative right hand sides are explicitly given), that Rule 2 is never broken. Alas, not so. When you use EBNF and use the useful [ option ] and { repetition } constructs, there are implicit eps's that creep in.

If we eliminate the metabrackets we get something like

      Play        = Act Act Act "interval" Act Act .
      Act         = Scene RestOfScene .
      Scene       = Prologue "entry" Actions .
      RestOfScene = Scene RestOfScene | eps .
      Prologue    = "speech" Prologue | eps .
      Actions     = Action Actions | eps .
      Action      = "speech" | "entry" | "exit" | "death" | "gesticulation" .
in which the following now have to be checked for violations of Rule 2
            Productions                               Check

      RestOfScene = Scene RestOfScene | eps .    (FIRST(RestOfScene) Intersection FOLLOW(RestOfScene) )
      Prologue = "speech" Prologue | eps .       (FIRST(Prologue) Intersection FOLLOW(Prologue) )
      Actions = Action Actions | eps .           (FIRST(Actions) Intersection FOLLOW(Actions)  )

Now:

    First(RestOfScene) = First(Scene) = FIRST(Prologue) U { "entry" } = { "speech", "entry" } }
    Follow(RestOfScene) = FOLLOW(Act) = FIRST(Act) U {"interval" }
                                      = FIRST(Scene) U { "interval" } = { "speech", "entry", "interval" }
so Rule 2 is broken in this case. For the second production in this category
    FIRST(Prologue) = { "speech" }
    FOLLOW(Prologue) = { "entry" }
so Rule 2 is not broken that time, but for the third of these new productions
    FIRST(Actions) = FIRST(Action)  = { "speech", "entry", "exit", "death", "gesticulation" }
    FOLLOW(Actions) = FOLLOW(Scene) = FIRST(RestOfScene) U FOLLOW(Act)
                                    = { "speech", "entry" } U {"speech", "entry" "interval" }
                                    = {"speech", "entry" "interval" }
and Rule 2 is broken again.

With more experience it is possible to check Rule 2 more quickly. Consider the original production set again

      Play   = Act Act Act "interval" Act Act .
      Act    = Scene { Scene } .
      Scene  = { "speech" } "entry" { Action } .
      Action = "speech" | "entry" | "exit" | "death" | "gesticulation" .
and in particular the production
       Act = Scene { Scene } .
In this case Rule 2 really governs whether a parser will be able to determine whether it has to go around the loop corresponding to { Scene } for another iteration. It can only do this if FIRST(Scene) has nothing in common with what we could loosely call FOLLOW(loop), that is, FOLLOW(Scene). From the third production we should quickly see that FIRST(Scene) = { "speech" , "entry" }, and from the second production we should quickly see that FOLLOW(Scene) = FOLLOW(Act) = { "interval" } U FIRST(Act), a set which clearly includes the elements { "speech" , "entry" } again.

Even more experience or common sense might bring you to the same conclusion just from "first principles". If there is no recognizable "end of scene" marker, as a member of the audience one cannot readily tell, when a new Scene starts, whether it is the start of a new Act or not. For that matter, one cannot even tell when a new speech starts, whether there has been a scene change or not! The "language" can be made LL(1) if the theatre raises and lowers the curtain between scenes, or actually changes the furniture on the stage.

      Play   = Act Act Act "interval" Act Act .
      Act    = "dimlights" Scene { Scene } "raiselights" .
      Scene  = "raisecurtain" { "speech" } "entry" { Action } "lowercurtain" .
      Action = "speech" | "entry" | "exit" | "death" | "gesticulation" .
Similarly, there is a "quick" way to see that the production
      Scene  = { "speech" } "entry" { Action } .
gets you into trouble. The first loop (corresponding to { "speech" } ) gives no trouble - one more time round the loop if we hear another "speech", very different from the follower, "entry" (you hear one, you see the other, in real life). The second loop is more of a problem. One more time round the loop if we detect any element of the set of { "speech", "entry", "exit", "death", "gesticulation" }? Well, that would be nice, but after the loop is all over we must start the next Scene, and that will also start with one of { "speech", "entry" }.

A5 Consider the following Cocol/R description of a simple language

         COMPILER A
           CHARACTERS
             digit  = "0123456789" .
             letter = "abcdefgefghijklmnopqrstuvwxyz" .
           TOKENS
             number = digit { digit } .
             identifier  = "a" { letter } .
           PRODUCTIONS
             A = B "." .
             B = identifier | number | "(" C ")" | "(." B ".)" | .
             C = B D .
             D = { "+" B } .
         END A.
The question on constructing a scanner was very badly done (average 27%). Many people constructed a parser, which was awful. Only one thought of simply typing in the grammar from the paper and submitting it to Coco/R to do the construction in about 5 seconds - about the easiest way of earning 12 marks I could imagine. ("Answers may be submitted on any medium" it said). Of course, what I had imagined people would do was to trot out the sort of stuff that was in the handouts on pages 85 - 86, where the exact same case study had been done. But I guess nobody bothered to read any of that or go to or over those lectures.

A6 Recursive descent parsers have to pay particular attention to the problem of error recovery to handle the not-uncommon situations where they are presented with source programs that are syntactically incorrect. Describe a systematic approach that might be taken to handling this problem. [14]

This question was very badly done (average 31%). Only one student seemed to have read or understood that important section properly. Mostly what I got was guesswork and vague comments on Coco/R.


SECTION B [ 60 Marks ]

Please note that there is no obligation to produce a machine readable solution for this section. Coco/R and other files are provided so that you can enhance, refine, or test your solution if you desire. If you choose to produce a machine readable solution, you should create a working directory, unpack EXAM.ZIP, modify any files that you choose, and then copy all the files back to the blank diskette that will be provided. In this case it would help if you were to highlight your changes with (* +++ comments ++ *).

The current meeting of ISO/IEC JTC1/CS22/WG1984 has erupted into chaos. ISO/IEC JTC1/CS22/WG1984 is, as you may recall, the international committee of experts charged with the responsibility of standardizing the language Topsy++ which is poised to Take Over The World.

The trouble started when delegates pointed out that, while they were happy that in Topsy++ each identifier should be declared before it was used, the current definition requires that all identifiers be declared before any are used. Thus, while a fragment of code like

        void main (void) {
          int j, k = 10;
          bool okay;
          j = k + 100;
          okay = j > 55;
        }
is currently legal both in C++ and Topsy++, a fragment like
        void main (void) {
          int j, k = 10;
          j = k + 100;
          bool okay = j > 55;
        }

is currently illegal in Topsy++, even though it would be acceptable in C++.

Further trouble erupted when some delegates pointed out that Topsy++ and its implementations are supposed to be very safe. They have objected to the fact that one is currently allowed to write for loops like

        for (i = 0; i < 10; i++) {
          cout <<< i;  // So far, so good
          i--;        // Oh dear - this will cancel out the effect of i++
        }
which would result in an infinite loop.

These complaints have been referred to a Working Group of delegates from Rhodes University, South Africa. The Committee have requested that the definition of Topsy++, and the prototype implementation under development by the Working Group, be modified so that:

(a) Identifiers may be declared at any convenient point before they are used (that is, declarations and statements may be intermingled).

(b) An identifier may be declared at most once in any scope, but the scope of identifiers should be constrained as tightly as possible. Thus, for example, in code reading

         if (List[i] > List[i+1]) { // swap two elements
           int temp = List[i]; List[i] = List[i+1]; List[i+1] = temp;
         }

the scope of the variable temp should extend only from the point where it is declared to the end of the statement sequence forming the body of the if statement. However, a declaration of temp in a surrounding scope would be legal; the scope of this previous declaration would not extend over the body of the if statement, but would extend over statements that preceded and followed the body of the if statement.

(d) The existence of variables is to be made as economical as possible. Thus, to use the same example as before, storage allocated for the variable temp may be used for other purposes at other stages in the program's execution.

(e) Some means must be found of preventing a for loop control variable from being altered, either by assignment, or by the action of an input statement within the statement that forms the body of the for loop.

Since the standardization of Topsy++ has already taken far too long, the Working Group have been given a strict 24 hour limit to study these requests, at the end of which they are required to come up with satisfactory recommendations.

As a member of this development team, how would you attempt to satisfy the requests of the standardization Committee?

You will, hopefully, be relieved to know that copies of the draft definition of Topsy++ have been made available to you, and will also be available during the reportback meeting. Besides this, machine readable copies of a prototype implementation of Topsy++ are (and will be) available, as well as copies of Coco/R (the tool used to develop this implementation), and documentation on the use of Coco/R. Lastly, the Committee have supplied some fragmentary specimen programs written in Topsy++ that highlight their concerns.

Your suggestions can be presented to the Committee in machine readable form, or you might elect to present a hand written report, showing how the Coco/R specification and any supporting modules need alteration.

As members of the team you have, naturally, been allowed to interact with one another before the 24 hour deadline is reached, but each one of you will be called upon to present your suggestions to the Committee individually at their next 3 hour session, whereafter judgement will be passed on your efforts.

When I set this question I was aware that I ran the risk that I might see the same solution 19 times. As it happened, I distinguished only about 4 groups at work. The one pair was, I thought, pretty astounding, and the top mark awarded was 58/60. They had seen all the issues very clearly, and solved them all with a lot of insight, including the issue of how to determine the maximum stack frame size (which nobody else got right). The average for the question was 42/60. The lowest mark was 30/60, for someone who clearly either did not remember their solution, or did not have time to type it all in (my, you were all very brave!).

The largest group (ie about 12 of you!) had solutions that were not right in a very fundamental way - one does not create a new stack frame every time a CompoundStatement begins execution, and never discard it again (which is what this group all tried to do, clearly having missed the important distinction between the compile-time and run-time aspects of the problem). So that meant quite a loss of marks, I'm sorry to say.

The "for loop" restriction was very well done by all but two people who did not realise that setting Entry1.CannotModify in a local copy of a symbol table entry would not do what they thought it would do. I was delighted to see that you had all realised that this was a static semantic feature, and was easily handled by a nice interface into the Table Handler. (I had had visions of seeing the most appalling hacks in this regard). And, to be quite honest, two or three had put in a check that I had not myself thought of - well done.

As stated in the free information, the amount of code required is actually quite small. Essentially what is needed is summed up as

A solution follows - the lines marked *** show where changes are needed. Many of these are "clones" - once you have seen one, you have really seen them all!

     PRODUCTIONS
       Topsy                        (. VAR
                                         FrameSize : CARDINAL;
                                         EnterLabel : CGEN.LABELS; .)
       =  "void" "main" "(" "void" ")"
***                                 (. FrameSize := 0; MaxFrame := 0;
                                       CGEN.OpenStackFrame(EnterLabel) .)
          Block<FrameSize>          (. (* reserve space for variables *)
***                                    CGEN.FixUp(EnterLabel, MaxFrame);
                                       CGEN.LeaveProgram .) .

***    Block<VAR FrameSize : CARDINAL>
***    =  "{"                       (. TABLE.OpenScope(FrameSize) .)
***       { Statement<FrameSize> }  (. TABLE.CloseScope(FrameSize) .)
          "}" .

       OneVar<VAR FrameSize : CARDINAL; Type : TABLE.TYPES>
                                    (. VAR
                                         Entry : TABLE.ENTRIES;
                                         ExpType  : TABLE.TYPES; .)
       =                            (. Entry.IDClass := Vars; Entry.Size := 1;
                                       Entry.Scalar := TRUE; Entry.Type := Type;
                                       Entry.Offset := FrameSize + 1 .)
          Ident<Entry.Name>
          [ ArraySize<Entry.Size>   (. Entry.Scalar := FALSE .)
          ]                         (. TABLE.Enter(Entry);
                                       INC(FrameSize, Entry.Size);
***                                    IF FrameSize > MaxFrame THEN
***                                      MaxFrame := FrameSize
                                       END .)
          [ "="                     (. IF NOT Entry.Scalar THEN SemError(210) END;
                                       CGEN.StackAddress(Entry.Offset) .)
            Expression              (. CGEN.Assign(FALSE) .)
          ] .

       Statement<VAR FrameSize : CARDINAL>
       =  SYNC (   Assignment
                 | ReadStatement
                 | WriteStatement
                 | ReturnStatement
                 | ";"
                 | "stackdump"      (. CGEN.Dump .)
                 | ConstDeclaration
***              | VarDeclarations<FrameSize>
***              | IfStatement<FrameSize>
***              | WhileStatement<FrameSize>
***              | RepeatStatement<FrameSize>
***              | ForStatement<FrameSize>
***              | Block<FrameSize> ) .

***    IfStatement<VAR FrameSize : CARDINAL>
                                    (. VAR
                                         TestLabel, OutLabel : CGEN.LABELS; .)
       =  "if" "(" Condition ")"    (. CGEN.JumpOnFalse(TestLabel, CGEN.undefined) .)
***       Statement<FrameSize>
          (   "else"                (. CGEN.Jump(OutLabel, CGEN.undefined);
                                       CGEN.BackPatch(TestLabel) .)
***           Statement<FrameSize>  (. CGEN.BackPatch(OutLabel) .)
            | (* no else part *)    (. CGEN.BackPatch(TestLabel) .)
          ) .

***    WhileStatement<VAR FrameSize : CARDINAL>
                                    (. VAR
                                         StartLoop, TestLabel, DummyLabel : CGEN.LABELS; .)
       =  "while" "("               (. CGEN.StoreLabel(StartLoop) .)
          Condition ")"             (. CGEN.JumpOnFalse(TestLabel, CGEN.undefined) .)
***       Statement<FrameSize>      (. CGEN.Jump(DummyLabel, StartLoop);
                                       CGEN.BackPatch(TestLabel) .) .

       RepeatStatement<VAR FrameSize : CARDINAL>
                                    (. VAR
                                         StartLoop, DummyLabel : CGEN.LABELS; .)
       =  "do"                      (. CGEN.StoreLabel(StartLoop) .)
***       { Statement<FrameSize> }
          "until" Condition ";"     (. CGEN.JumpOnFalse(DummyLabel, StartLoop) .) .

***    ForStatement<VAR FrameSize : CARDINAL>
                                    (. VAR
                                         StartLoop, TestLabel, BodyLabel,
                                         StartEpilogue, DummyLabel : CGEN.LABELS;
                                         ControlType, ExpType : TABLE.TYPES;
                                         Entry1, Entry2 : TABLE.ENTRIES; .)
       =  "for" "("
          Designator<CLASSSET{Vars}, Entry1, ControlType>
                                    (. IF NOT Entry1.Scalar OR NOT (ControlType IN ArithTypes)
                                         THEN SemError(225)
                                       END; .)
          "=" Expression<ExpType>   (. IF Compatible(ControlType, ExpType)
                                         THEN CGEN.Assign(ControlType = chars)
                                         ELSE SemError(218)
                                       END;
                                       CGEN.StoreLabel(StartLoop) .)
          ";" Condition             (. CGEN.JumpOnFalse(TestLabel, CGEN.undefined) .)
          [ ";"                     (. CGEN.Jump(BodyLabel, CGEN.undefined);
                                       CGEN.StoreLabel(StartEpilogue) .)
             Designator<CLASSSET{Vars}, Entry2, ControlType>
                                    (. IF FileIO.Compare(Entry1.Name, Entry2.Name) # 0
                                         THEN SemError(226)
                                       END; .)
             (   "=" Expression<ExpType>
                                    (. CGEN.Assign(Entry1.Type = chars) .)
               | "++"               (. CGEN.Increment(Entry1.Type = chars) .)
               | "--"               (. CGEN.Decrement(Entry1.Type = chars) .)
             )                      (. CGEN.Jump(DummyLabel, StartLoop);
                                       CGEN.BackPatch(BodyLabel);
                                       StartLoop := StartEpilogue .)
          ]
***       ")"                       (. TABLE.Protect(Entry1) .)
***       Statement<FrameSize>      (. CGEN.Jump(DummyLabel, StartLoop);
                                       CGEN.BackPatch(TestLabel);
***                                    TABLE.Unprotect(Entry1) .) .


***    Variable<VAR VarType : TABLE.TYPES>
                                    (. VAR
                                         Entry : TABLE.ENTRIES; .)
***    =  Designator<CLASSSET{Vars}, Entry, VarType>
***                                 (. IF NOT Entry.CanChange THEN SemError(229) END .) .

The table handler module needs the following modifications


     DEFINITION MODULE TABLE;
     (* Handle symbol table for Topsy level 1 - no procedures
        Extensions for November examination - 1996
        P.D. Terry, Rhodes University, 1996 *)

       IMPORT FileIO;

       CONST
         AlfaLength = 15 (* maximum length of identifiers *);
       TYPE
         ALFA      = ARRAY [0 .. AlfaLength - 1] OF CHAR;
         IDCLASSES = (Consts, Vars, Progs);
         TYPES     = (none, ints, chars, bools);
         ENTRIES   = RECORD
***                    Self : CARDINAL;
                       Name : ALFA;
                       Type : TYPES;
                       CASE IDClass : IDCLASSES OF
                            Consts :
                              Value : INTEGER;
                         |  Vars :
                              Size, Offset : CARDINAL;
***                           CanChange, Scalar : BOOLEAN
                         (* ELSE (* Progs *) nothing *)
                       END
                     END;
     (* rest as before *)

     IMPLEMENTATION MODULE TABLE;
     (* Handle symbol table for Topsy level 1 - no procedures
        Extensions for November examination - 1996
        P.D. Terry, Rhodes University, 1996 *)

       IMPORT FileIO, REPORT;

       CONST
         TableMax = 100;
       TYPE
         INDEX = CARDINAL [0 .. TableMax];
       VAR
***      Scope,                     (* Point to top of scope stack *)
         Top        : CARDINAL;     (* Point to last entry currently in table *)
         SymTable   : ARRAY INDEX OF ENTRIES;
***      ScopeTable : ARRAY [0 .. TableMax] OF
***                    RECORD
***                      Top,                 (* Record latest Top pointer *)
***                      FrameSize : CARDINAL (* Record latest FrameSize *)
***                    END;

       PROCEDURE Enter (VAR Entry : ENTRIES);
         VAR
           Look : INDEX;
         BEGIN
           Find(Entry.Name, Look);
***        IF Look > ScopeTable[Scope].Top (* check current scope for duplicates *)
             THEN REPORT.Error(201)
             ELSE
              IF Top = TableMax THEN (* Symbol Table overflow *)
                FileIO.WriteString(FileIO.con, "Symbol table full"); HALT
              END;
***           INC(Top); Entry.Self := Top;  Entry.CanChange := TRUE;
              SymTable[Top] := Entry;
            END;
         END Enter;

     (* These next are a bit of a kludge.  The Self pointer allows us to
        modify an Entry without the client apparently knowing how the table
        is set up *)

***    PROCEDURE Protect (Entry : ENTRIES);
***      BEGIN
***        SymTable[Entry.Self].CanChange := FALSE
***      END Protect;

***    PROCEDURE Unprotect (Entry : ENTRIES);
***      BEGIN
***        SymTable[Entry.Self].CanChange := TRUE
***      END Unprotect;

***    PROCEDURE OpenScope (FrameSize : CARDINAL);
***      BEGIN
***        INC(Scope);
***        IF Scope = TableMax THEN (* Symbol Table overflow *)
***          FileIO.WriteString(FileIO.con, "Scope overflow"); HALT
***        END;
***        ScopeTable[Scope].FrameSize := FrameSize;
***        ScopeTable[Scope].Top := Top;
***      END OpenScope;

***    PROCEDURE CloseScope (VAR FrameSize : CARDINAL);
***      BEGIN
***        FrameSize := ScopeTable[Scope].FrameSize;
***        Top  := ScopeTable[Scope].Top;
***        DEC(Scope)
***      END CloseScope;

       BEGIN
***      Top := 0; Scope := 0;
         SymTable[Top].Name := "";
         SymTable[Top].IDClass := Progs
       END TABLE.


SECTION C [ 30 marks ]

C1 Answer either this question or C2 but not both.

This question relates to the Topsy++ system as described in section B.

To their horror, if and when the development team tackle the challenges posed by the Topsy++ standards committee, they will also discover that the programmers responsible for the prototype implementation had shirked their responsibilities. They had almost completely omitted to deal with the fact that Topsy++ has three data types - integers, characters, and Boolean, and that these are required to obey various compatibility constraints. Discuss in some detail where these constraints need to be checked, and the form that the checks would take. You are not required to produce a fully detailed answer (although you are free to do so), but you should answer in sufficient detail to make it clear that you understand the problem and how to solve it.

C2 Answer either this question or C1 but not both.

Suppose that one wishes to introduce the regular procedure concept into an imperative programming language on the lines of that used in Modula-2 or Pascal, where procedures may be nested, declare their own local variables, call one another (or even recursively, themselves), communicate by means of parameters, and access variables that are in scope. What particular features are needed to provide compile-time and run- time support for these ideas? Discuss these features in some detail; it is not, however, necessary to provide detailed code for a compiler.

Probably the less said about Section C the better. Maybe you were getting tired (he says generously). The average was only 44%. A very few people had studied the last chapter or the last practical (well done those who had). The questions were set on very important aspects of the course which I regretfully feel simply have passed most people by. But then, going off on short vacation was much more important than anything else, wasn't it?