Computer Science 301 - 2006

Tutorial for week 26 - Code generation

(a) Many languages provide for a ForStatement in one or other form. In the practical you have been asked to add a simple Pascal-style for loop to Parva, to allow statements with concrete syntax defined by

       ForStatement = "for" Identifier "=" Expression ("to" | "downto") Expression Statement .

for example

       for i = 1 to 10 write(i);          // forwards  1 2 3 4 5 6 7 8 9 10
       for i = 10 downto 1 write(i);      // backwards 10 9 8 7 6 5 4 3 2 1
       for i = 10 to 5 write(i);          // no effect
       for b = false to true write(b);    // writes "false true"
       for i = i - 1 to i + 6 write(i);   // what does this do?

As discussed in class early last Friday morning, the semantics and implementation of a ForStatement is quite tricky. Here is one suggested way that overcomes some of the problems we discussed. If you were not at the class, tough. Revenge will be sweet in a few weeks time!

Many of these issues can be overcome by a fairly dramatic process. We arrange that the code generation for the generic loops

        for Variable = Expression1 to Expression2 Statement 
        for Variable = Expression1 downto Expression2 Statement 

must yield lower level code of the form

         Temp1 := Expression1                             Temp1 := Expression1
         Temp2 := Expression2                             Temp2 := Expression2
         IF Temp1 > Temp2 THEN GOTO EXIT                  IF Temp1 < Temp2 THEN GOTO EXIT
         Variable := Temp1;                               Variable := Temp1;
  BODY:  Statement                                 BODY:  Statement
         IF Variable = Temp2 THEN GOTO EXIT               IF Variable = Temp2 THEN GOTO EXIT
         Variable := Variable + 1                         Variable := Variable - 1
         GOTO BODY                                        GOTO BODY
  EXIT:                                            EXIT:

The simplest way of handling all these issues in the PVM system is to note that the obvious calls to the parsing routines

           Designator
           Expression
           Expression

will ensure that code to push the address of the control variable and the values of the temporary variables will have been generated by the time the auxiliary Statement is encountered. At this point code must be generated for the initial test, and if we avail ourselves of the ability to define extensions to the opcode set of the PVM we can generate a special opcode at this point that will leave these three values on the stack so that they can be used again later. Similarly, after generating the code for the Statement we can generate a second special opcode that can be responsible for the final test and possible increment of the control variable. One last complication is that once the for loop has completed execution we have to discard these three elements from the stack, which suggests the introduction of yet another special opcode.

Develop these ideas in detail.

The production for ForStatement can be attributed as follows:

  ForStatement<StackFrame frame>        (. boolean up = true;
                                           int expType;
                                           String name;
                                           loopExit = new Label(!known); .)
  =  "for" Ident<out name>              (. Entry control = Table.find(name);
                                           if (!control.declared) {
                                             SemError("undeclared identifier");
                                           if (isRef(control.type) || control.kind != Entry.Var)
                                             SemError("illegal control variable");
                                           CodeGen.loadAddress(control); .)
     "=" Expression<out expType>        (. if (!compatible(expType, control.type))
                                             SemError("incompatible with control variable"); .)
     ( "to" | "downto"                  (. up = false; .)
     )
     Expression<out expType>            (. if (!compatible(expType, control.type))
                                             SemError("incompatible with control variable");
                                           CodeGen.startForLoop(up, loopExit);
                                           Label startLoop = new Label(known); .)
     Statement<frame>                   (. CodeGen.endForLoop(up, startLoop);
                                           loopExit.here();
                                           CodeGen.pop3(); .)
  .

which is relatively straightforward. So is the code generation:

  public static void startForLoop(boolean up, Label destination) {
  // Generates prologue test for a for loop (either up or down)
    if (up) emit(PVM.sfu); else emit(PVM.sfd);
    emit(destination.address());
  }

  public static void endForLoop(boolean up, Label destination) {
  // Generates epilogue test and increment/decrement for a for loop (either up or down)
    if (up) emit(PVM.efu); else emit(PVM.efd);
    emit(destination.address());
  }

  public static void pop3() {
  // Generates code to discard top three elements from the stack
    emit(PVM.pop3);
  }

The nitty-gritty is contained in the extensions to the interpreter, which might require careful study:

     case PVM.sfu:           // start for loop "to"
       if (mem[cpu.sp + 1] > mem[cpu.sp]) cpu.pc = mem[cpu.pc];
       else {
         mem[mem[cpu.sp + 2]] = mem[cpu.sp + 1]; cpu.pc++;
       }
       break;
     case PVM.sfd:           // start for loop "downto"
       if (mem[cpu.sp + 1] < mem[cpu.sp]) cpu.pc = mem[cpu.pc];
       else {
         mem[mem[cpu.sp + 2]] = mem[cpu.sp + 1]; cpu.pc++;
       }
       break;
     case PVM.efu:           // end for loop "to"
       if (mem[mem[cpu.sp + 2]] == mem[cpu.sp]) cpu.pc++;
       else {
         mem[mem[cpu.sp + 2]]++; cpu.pc = mem[cpu.pc];
       }
       break;
     case PVM.efd:           // end for loop "downto"
       if (mem[mem[cpu.sp + 2]] == mem[cpu.sp]) cpu.pc++;
       else {
         mem[mem[cpu.sp + 2]]--; cpu.pc = mem[cpu.pc];
       }
       break;
     case PVM.pop3:          // discard 3 elements from top of stack
       cpu.sp += 3;
       break;

(b) The PVM has opcodes LDL N and STL N, as you will recall from an earlier practical. And our CodeGen class has methods for generating these. But the attributed parser does not do so - all code for loading and storing variables only makes use of the LDA, LDV and STO opcodes.

How would you change the parser routines to generate LDL and STL opcodes where it can?

This is something that must be done with great care. Various of the productions -Assignment, OneVar, Designator and Factor need alteration. The trick is to modify the Designator production so that it does not generate the LDA opcode immediately. But we need to distinguish between designators that correspond to "simple" variables that are to be manipulated with the LDL and STL opcodes, and array elements which will still require use of LDV and STO opcodes. So the DesType class is extended yet again:

The discussion here applies to the simple Parva compiler with the Pascal operator precedences only. Obviously the same sort of thing can be done for the version developed in the current practical for Java operator precedences.

    class DesType {
    // Objects of this type are associated with l-value and r-value designators
      public Entry entry;          // the identifier properties
      public int type;             // designator type (not always the entry type)
**    public boolean isSimple;     // true unless it is an indexed designator

      public DesType(Entry entry) {
        this.entry = entry;
        this.type = entry.type;
**      this.isSimple = true;
      }
    } // end DesType

The Designator production is now attributed as follows - note in particular where the code generation occurs:

    Designator<out DesType des>           (. string name;
                                             int indexType; .)
    =  Ident<out name>                    (. Entry entry = Table.find(name);
                                             if (!entry.declared)
                                               SemError("undeclared identifier");
                                             des = new DesType(entry); .)
       [  "["                             (. if (isRef(des.type)) des.type--;
                                             else SemError("unexpected subscript");
                                             if (entry.kind != Entry.Var)
                                               SemError("unexpected subscript");
**                                           des.isSimple = false;
**                                             CodeGen.loadValue(entry); .)
              Expression<out indexType>   (. if (!isArith(indexType)) SemError("invalid subscript type");
**                                             CodeGen.index(); .)
          "]"
       ] .

Within the Factor production, when a Designator is parsed one must either complete the array access by generating the LDV opcode, or generate the LDL opcode.

    Factor<out int type>                  (. int value = 0;
                                             type = Entry.noType;
                                             int size;
                                             DesType des;
                                             ConstRec con; .)
    =    Designator<out des>              (. type = des.type;
                                             switch (des.entry.kind) {
                                               case Entry.Var:
**                                               if (des.isSimple) CodeGen.loadValue(des.entry);
**                                               else CodeGen.dereference();
                                                 break;
                                               case Entry.Con:
                                                 CodeGen.loadConstant(des.entry.value);
                                                 break;
                                               default:
                                                 SemError("wrong kind of identifier");
                                                 break;
                                             } .)
       | Constant<out con> ... // as before  .

When variables are declared we can always make use of the STL code if they are initialized:

    OneVar<StackFrame frame, int type>    (. int expType; .)
    =                                     (. Entry var = new Entry(); .)
       Ident<out var.name>                (. var.kind = Entry.Var;
                                             var.type = type;
                                             var.offset = frame.size;
                                             frame.size++; .)
       ( AssignOp
         Expression<out expType>          (. if (!compatible(var.type, expType))
                                               SemError("incompatible types in assignment");
**                                           CodeGen.storeValue(var); .)
       |                                  (. if (!canChange)
                                               SemError("defining expression required"); .)
       )                                  (. Table.insert(var); .)
    .

The production for ReadElement will have to generate the LDA opcode if the element to be read is a simple variable:

    ReadElement                           (. string str;
                                             DesType des; .)
    =   StringConst<out str>              (. CodeGen.writeString(str); .)
      | Designator<out des>               (. if (des.entry.kind != Entry.Var)
                                               SemError("wrong kind of identifier");
**                                           if (des.isSimple) CodeGen.loadAddress(des.entry);
                                             switch (des.type) {
                                             ...  // as before

Similarly, the production for Assignment may have to generate the LDA opcode if the ++ or --operation is applied to simple variables, and to choose between generating the STL or STO opcodes for regular assignment statements:

    Assignment                            (. int expType;
                                             DesType des; .)
    =  Designator<out des>                (. if (des.entry.kind != Entry.Var)
                                               SemError("invalid assignment");
       (  AssignOp
          Expression<out expType>         (. if (!compatible(des.type, expType))
                                               SemError("incompatible types in assignment");
**                                           if (des.isSimple) CodeGen.storeValue(des.entry);
**                                           else CodeGen.assign(des.type); .)
**       | "++"                           (. if (des.isSimple) CodeGen.loadAddress(des.entry);
                                             if (!isArith(des.type))
                                               SemError("arithmetic type needed");
                                             CodeGen.increment(des.type); .)
**       | "--"                           (. if (des.isSimple) CodeGen.loadAddress(des.entry);
                                             if (!isArith(des.type))
                                               SemError("arithmetic type needed");
                                             CodeGen.decrement(des.type); .)
       )
       WEAK ";" .


Home  © P.D. Terry