Computer Science 3 - 2002 - Test on Prac 12

You may use your notes, or refer to material on the website if you like. This test is designed to see whether you have gained a real insight into languages and code generation.

1. I really will learn it one day. But not today! What is your name (surname)? [1]

Pat (Terry)

2. After a lot of effort a student came up with the following solution to the problem of extending the IfStatement posed last week.

  IfStatement<int &framesize>
  =                                 (. CGEN_labels testlabel, outlabel; .)
     "if" "(" Condition ")"         (. CGen->jumponfalse(testlabel, CGen->undefined); .)
     Block<framesize>               (. CGen->jump(outlabel, CGen->undefined);
                                       CGen->backpatch(testlabel); .)
     [ "else" Block<framesize> ]    (. CGen->backpatch(outlabel); .) .

Although this seemed to work, it is not as good as it could be. Why not? [3]

If the "else" clause is absent it generates an unnecessary BRN instruction. For example, source like

if (a > 4) c = 5;

generates code like this (the addresses are just for illustration)

         12   ADR  -34
         14   VAL
         15   LIT  4
         17   GTR       ; if (a > 4)
         18   BZE  27   ;   we need this one
         20   ADR  -12
         22   LIT  5
         24   STO       ; c = 5
         25   BRN  27   ;   unnecessary - we are about to get there anyway
         27   ....

Better code is generated using the system found in the prac solution:

  IfStatement<int &framesize>
  =                             (. CGEN_labels testlabel, outlabel; .)
     "if" "(" Condition ")"     (. CGen->jumponfalse(testlabel, CGen->undefined); .)
     Block<framesize>
     (   "else"                 (. CGen->jump(outlabel, CGen->undefined);
                                   CGen->backpatch(testlabel); .)
         Block<framesize>       (. CGen->backpatch(outlabel); .)
       | /* no else part */     (. CGen->backpatch(testlabel); .)
     ) .

3. As you should know, that grammar for the IfStatement does not lead to the "dangling else" LL(1) problem, as would arise if we wrote a production like

IfStatement = "if" "(" Condition ")" Statement [ "else" Statement ] .

because of the enforced presence of braces { } around the component sub-statements of an IfStatement. But some people regard these braces as a nuisance - it means having to code "nested ifs" with lots of braces, e.g.

     if (something) {
       a = b;
     } else
     { if (somethingElse) {
       c = d;
     } else
     { if (somethingDifferent) {
       e = f:
     }
     }
     }

or

     if (something) { a = b; } else { if (somethingElse) { c = d; }
     else { if (somethingDifferent) { e = f: } } }

with many easily confused closing braces instead of the much simpler

     if (something)
       a = b;
     else if (somethingElse)
       c = d;
     else if (somethingDifferent)
       e = f;

What is your reaction to a suggestion that we use a production like [ 3 ]

IfStatement = "if" "(" Condition ")" Block [ "else" ( IfStatement | Block ) ] .

A lot of people missed the point here - which was to examine whether this scheme, which certainly cuts out at least some of the { } braces in source code, would lead to a "dangling else" problem again. In fact it does not. We could rewrite the production:

       IfStatement = "if" "(" Condition ")" Block Else .
       Else        = "else" ElseOptions | e .
       ElseOptions = IfStatement | Block .

Here ElseOptions clearly introduces no LL(1) problems as

       FIRST(IfStatement) = { "if" }
       FIRST(Block)       = { "{" }

Else is nullable, so we must check FIRST(Else) and FOLLOW(Else). But

       FIRST(Else)  = { "else" }
       FOLLOW(Else) = FOLLOW(IfStatement) = { "}" } + FIRST(Statement)

which gives us no problems at all!

4. In C++ the grammar for statements includes productions like

  Statement
  =  { number ":" }
     (  Block | Assignment | WhileStatement | IfStatement | GoToStatement | EmptyStatement ) .

  GoToStatement  = "goto" number ";" .
  EmptyStatement = ";" .
  IfStatement    = "if" "(" Condition ")" Statement .
  WhileStatement = "while" "(" Condition ")" Statement .
  Assignment     = Variable ( "=" Expression | "++" | "--" ) ";" .

The EmptyStatement allows one to write code like

a = b; ; ; ; c = d;

which causes no great harm. But it also allows one to write code like

   while (a < b);
     { a++; }

which is deceptive - one only executes the a++; statement at most once, and then only if a > b. If a < b the effect is to execute an indefinite loop.

Suppose one wanted to prevent this - not to allow an EmptyStatement as the body of a while loop, but allow them in other places. What simple actions could you add to the grammar to achieve this? [ 5 ]

There are several solutions possible, and some that look correct but are subtly wrong. Here is a correct one:

  Statement
  =  { number ":" }
     (  Block | Assignment | WhileStatement | IfStatement | GoToStatement | EmptyStatement ) .

  NonEmptyStatement
  =  { number ":" }
     (  Block | Assignment | WhileStatement | IfStatement | GoToStatement ) .

  GoToStatement  = "goto" number ";" .
  EmptyStatement = ";" .
  IfStatement    = "if" "(" Condition ")" Statement .
  WhileStatement = "while" "(" Condition ")" NonEmptyStatement .
  Assignment     = Variable ( "=" Expression | "++" | "--" ) ";" .

but here is one that is subtly wrong:

  Statement
  =  { number ":" }
     (  Block | Assignment | WhileStatement | IfStatement | GoToStatement | EmptyStatement ) .

  NonEmptyStatement
  =  Block | Assignment | WhileStatement | IfStatement | GoToStatement .

  WhileStatement = "while" "(" Condition ")" NonEmptyStatement .

as is one like this

  WhileStatement = "while" "(" Condition ")"
                   ( Block | Assignment | WhileStatement | IfStatement | GoToStatement ) .

There is another approach one can take - add an action into the parser productions:

  Statement<bool EmptyAllowed>
  =  { number ":" }
     (    Block | Assignment | WhileStatement | IfStatement | GoToStatement
        | EmptyStatement (. if (!EmptyAllowed) SemError(999); .)
     ) .

  GoToStatement  = "goto" number ";" .
  EmptyStatement = ";" .
  IfStatement    = "if" "(" Condition ")" Statement<true> .
  WhileStatement = "while" "(" Condition ")" Statement<false> .
  Assignment     = Variable ( "=" Expression | "++" | "--" ) ";" .

A trick one student thought of that I hadn't (which is always fun to see) was to write

  WhileStatement = "while" "(" Condition ")"
                   [ ";"                       (. SemError(999); .)
                   ]
                   Statement .

(non LL(1), although that does not in fact matter; you might like to puzzle out why this is so) except that this would still have allowed silly code like

while (a < b) 10: ; { a++; }

Of course, in Parva, which insists on the use of Blocks we don't seem to be able to make this mistake - or do we? Well, how about

while (a < b) { } ( a++; }

or even

while (a < b) { ; ; ; } ( a++; }

but it is probably harder to do that "by accident" than it is in the C++ style.

5. The grammar above allows one to label statements and branch in all sorts of directions to them - for example

      20:  a = a + c;
           c++;
           if (c < 10) goto 10;
           c = d;
      10:  a++;
           goto 20;

This style of programming is very much frowned upon, as it leads rapidly to "spaghetti code" that is hard for a reader to follow. But suppose we wanted to incorporate labels and goto statements into Parva. Without going into too much detail, suggest what we might have to do when labels were "declared", and when they were "used". (Hint: last week you made use of the template List class - do so again). [ 8 ]

It was gratifying to see how so many students had quite a good grasp of what would be needed. Points that many people missed were (a) the labels are simple numbers, so we do not have to store them in the symbol table with the other identifiers; (b) at the point where a label is used in a GoToStatement it may not yet have been defined - that is quite legal, though of course it must be defined somewhere, meaning that provision must be made for fixing or back patching, and in fact it is simpler to fix all GoToStatements at the end of the Block rather then distinguish between forward and backward references as we go; (c) we need two lists - one for the labels as they are declared and another for the labels as they are used; (d) the structures in the label lists need to record both the source form (a simple value of integer type) and the object address (conceptually of CGEN_label type) and (d) a new list is preferable for each Block, to limit the nonsensical transfers that might otherwise occur into the middle of loops from arbitrary points in the code.

Although it was not asked for, here are extracts from the grammar in the solution file showing how it can all be put together. In CGEN.H we introduce the declarations needed to set up and maintain the list

   struct CGEN_labelstruct {
     int label;             // source label number
     CGEN_labels codelabel; // its location in code

     friend int operator == (CGEN_labelstruct x, CGEN_labelstruct y)
       { return x.label == y.label; }

     friend int operator != (CGEN_labelstruct x, CGEN_labelstruct y)
       { return x.label != y.label; }
   };

   typedef List<CGEN_labelstruct> CGEN_LabelList;

   class CGEN {
     public:
       ...

       void label(CGEN_LabelList &L, int n);
       // Adds n to the list L of known user-defined labels

       void gotolabel(CGEN_LabelList &G, int n);
       // Generates a goto instruction, adding to the list G of goto references

       void fixgotolist(CGEN_LabelList &L, CGEN_LabelList &G);
       // Backpatches goto statements from reference list G using label list L

   };

and in PARVA.ATG we add the simple syntax and actions that we need

     Block<int &framesize>
     =  "{"                        (. CGEN_LabelList labellist, gotolist;
                                      Table->openscope(framesize); .)
        { Statement<framesize, labellist, gotolist> }
                                   (. if (Report->debugging) /* demonstration purposes */
                                        Table->printtable(stdout);
                                      Table->closescope(framesize) .)
        WEAK "}"                   (. CGen->fixgotolist(labellist, gotolist); .) .

     Statement<int &framesize, CGEN_LabelList &labellist, CGEN_LabelList &gotolist>
     =  SYNC (   Label<labellist>
               | GoToStatement<gotolist>
               | Block<framesize>
               ....
             ) .

     Label<CGEN_LabelList &labellist>
       =                           (. int n; .)
       Number<n>                   (. CGen->label(labellist, n); .)
       ":" .

     GoToStatement<CGEN_LabelList &gotolist>
       =                           (. int n; .)
       "goto" Number<n>            (. CGen->gotolabel(gotolist, n); .)
       ";" .

Finally, in CGEN.CPP we add the implementation of the list handling routines

   void CGEN::label(CGEN_LabelList &L, int n)
   { CGEN_labelstruct thisLabel;
     thisLabel.label = n; thisLabel.codelabel = codetop;
     if (L.isMember(thisLabel)) Report->error(240);  // label already defined
     else L.add(thisLabel);
   }

   void CGEN::gotolabel(CGEN_LabelList &G, int n)
   { CGEN_labelstruct thisLabel;
     thisLabel.label = n; thisLabel.codelabel = codetop;
     G.add(thisLabel);
     emit(int(STKMC_brn)); emit(undefined);
   }

   void CGEN::fixgotolist(CGEN_LabelList &L, CGEN_LabelList &G)
   { CGEN_labelstruct gotoLabel, toLabel;
     while (!G.isEmpty()) {
       G.remove(0, gotoLabel);
       int pos = L.position(gotoLabel);
       if (pos == -1) Report->error(241);            // label never defined
       else fixup(gotoLabel.codelabel, L[pos].codelabel);
     }
   }

However, if you were to try to write programs like that you would find that the GoToStatement is quite useless. It only becomes useful if you can write code like

if (something) goto 100;

and if we restrict the IfStatement to require a Block in each of the "arms", and each Block opens a new label list (or label scope, it comes to the same thing!) we will not be able to write code like

if (something) { goto 100; }

The solution here is to extend the grammar for IfStatement on the lines of that suggested in Question 3, allowing a GoToStatement as a special option after if and after else:

  IfStatement<int &framesize, CGEN_LabelList &gotolist>
  =                             (. CGEN_labels testlabel, outlabel; .)
     "if" "(" Condition ")"     (. CGen->jumponfalse(testlabel, CGen->undefined); .)
     (   Block<framesize>
       | GoToStatement<gotolist>
     )
     (   "else"                 (. CGen->jump(outlabel, CGen->undefined);
                                   CGen->backpatch(testlabel); .)
       (   Block<framesize>
         | GoToStatement<gotolist>
       )                        (. CGen->backpatch(outlabel); .)
       | /* no else part */     (. CGen->backpatch(testlabel); .)
     ) .

Alternatively one could use a "global" approach to labels, but this leads to complete nonsense if one can jump in and out of blocks or functions at random.


Home  © P.D. Terry