Computer Science 3 - 2002 - Test on Prac 8

(This should take no longer than 25 minutes and is simply intended to probe whether you have understood the material in the prac. You may use your textbook if you think it will help. Answer on the question sheet.)

1. I have a memory like a sieve and cannot remember - remind me of your name (surname) [1]

Pat (Terry)

2. Write down the stack assembler code that corresponds to this (silly) program [5]

     PROGRAM Silly;
       VAR
         I, J, A[4], B[8];
       BEGIN
         READ(I);
         WHILE I < 3 DO BEGIN
           A[I] := B[I+4];
           I++;
         END;
       END.

There were some brave attempts, some disastrous attempts, many attempts that made silly mistakes, and some that tried (unsuccessfully I thought) to be clever from people who had obviously never learned about the IND opcode (the same, I suppose, that don't really come to pracs or lectures?). Here is what I was expecting, and a lot of people got this or something so close as made no difference.

   0 DSP  16
   2 ADR  -1
   4 INN     ; Read(I)
   5 ADR  -1
   7 VAL
   8 LIT   3
  10 LSS     ; WHILE I < 3 DO BEGIN
  11 BZE  45
  13 ADR  -3
  15 ADR  -1
  17 VAL
  18 LIT   5
  20 IND     ;  Address of A[I]
  21 ADR  -8
  23 ADR  -1
  25 VAL
  26 LIT   4
  28 ADD
  29 LIT   9
  31 IND     ;    Address of B[I+4]
  32 VAL     ;    Value of B[I+4]
  33 STO     ;  A[I] := B[I+4]
  34 ADR  -1
  36 ADR  -1
  38 VAL
  39 LIT   1
  41 ADD
  42 STO     ;   I++;
  43 BRN   5 ; END
  45 HLT

A few people used the opcodes that had been invented for the prac, leading to something like the solution on the left below.

If you persist in not using the safe, bounds checked IND opcode you can write code something like that on the right below. People who attempted this, as I recall, with one exception used ADD where they should have used SUB (remember that the stack grows "downward"). But this is not as clever as it might appear. Sure, there is less code and it runs faster. But probably one of the most common sources of errors in programs written in C or C++ is that attempts are made to access element of arrays that don't exist - C provides no checks on this (it cannot, because of the way the language is defined) and this leads to unspeakable anguish at times.

   0 DSP  16 ; Shorter version new opcodes        0 DSP  16 ; Clever (stupid?) version
   2 ADR  -1                                      2 ADR  -1 ; without bounds checks
   4 INN     ; Read(I)                            4 INN     ; Read(I)
   5 PSH  -1                                      5 PSH  -1
   7 LIT   3                                      7 LIT   3
   8 LSS     ; WHILE I < 3 DO BEGIN               8 LSS     ; WHILE I < 3 DO BEGIN
   9 BZE  35                                      9 BZE  30
  11 ADR  -3                                     11 ADR  -3 ;   Address of A[0]
  13 PSH  -1                                     13 PSH  -1 ;   Value of I
  15 LIT   5                                     15 SUB     ;   Address of A[I]
  17 IND     ;  Address of A[I]                  16 ADR  -8 ;   Address of B[0]
  18 ADR  -8                                     18 PSH  -1 ;     Value of I
  20 PSH  -1                                     20 LIT   4
  22 VAL                                         22 ADD     ;     Value of I + 4
  23 LIT   4                                     23 SUB     ;     Address of B[I+4]
  25 ADD                                         24 VAL     ;     Value of B[I+4]
  26 LIT   9                                     25 STO     ;   A[I] := B[I+4]
  28 IND     ;    Address of B[I+4]              26 ADR  -1 ;   Address of I
  29 VAL     ;    Value of B[I+4]                27 PPP     ;   I++
  30 STO     ;   A[I] := B[I+4]                  28 BRN   5 ; END
  31 ADR  -1                                     30 HLT
  33 PPP     ;   I++
  34 BRN   5 ; END
  35 HLT

3. In the header for the original stack machine emulator you will find the lines

   // machine instructions - order important
   enum STKMC_opcodes {
     STKMC_adr, STKMC_lit, STKMC_dsp, STKMC_brn, STKMC_bze, STKMC_prs, STKMC_add,
     STKMC_sub, STKMC_mul, STKMC_dvd, STKMC_eql, STKMC_neq, STKMC_lss, STKMC_geq,
     STKMC_gtr, STKMC_leq, STKMC_neg, STKMC_val, STKMC_sto, STKMC_ind, STKMC_stk,
     STKMC_hlt, STKMC_inn, STKMC_prn, STKMC_nln, STKMC_nop, STKMC_nul
   };

Why was the "order important"? [2]

Too few people got this right, and yet they must surely have known that the valid opcodes all come before "nul", otherwise the simple search in the assembler that associated mnemonics with opcodes will fail. Looking further we see that all the two-word operations are defined ahead of the one-word operations. Although this is not important in the rest of my code, it just might be useful to stick with that convention.

The most frequent "wrong" answer was that the order of the enumeration had to match the order of the labels in the switch/case statement. I am not sure where this misconception came from. The order of the labels in a switch/case statement is unimportant. What is important, of course, is that the labels must be unique - something like

              switch (exp) {
                case 1: something(); break;
                case 2: somethingelse(); break;
                case 1: somethingdifferent(); break;
              }

should not be allowed.

4. In the prac you were invited to note that sequences like

         ADR  -X          and   ADR  -Y
         VAL                    code
                                STO

occurred so frequently that it might be worth introducing special opcodes PSH -X and POP -Y instead. By now you will recognise something of the Terry perversity - maybe the machine is defined to have ADR, VAL and STO opcodes rather than simply having only PSH and POP opcodes for no other reason than that it makes for lots of fun late at night to add these opcodes to the machine. But maybe not? If you have PSH and POP, could you completely remove ADR, VAL and STO from the opcode list, or would you still need to have them available? Justify your answer. [3]

If every VAL was preceded by an ADR one could get away with using PSH instead of ADR + VAL, and if every other use of ADR was followed by a STO one could get away without ADR. But this is not the case. Consider the following code:

           Read(I); List[I] := List[J];

which, even if we use every PSH and POP we can, translates to

           ADR   -I
           INN
           ADR   -List
           PSH   -I
           LIT   Size
           IND
           ADR   -List
           PSH   -J
           LIT   Size
           IND
           VAL
           STO

5. The stack machine supports two branch instructions BZE and BRN, with emulation on the lines of

        case STKMC_brn:
          cpu.pc = mem[cpu.pc]; break;
        case STKMC_bze:
          cpu.sp++;
          if (mem[cpu.sp - 1] == 0) cpu.pc = mem[cpu.pc]; else cpu.pc++;
          break;

It does not need a BNZ opcode, as we can always choose/rearrange comparisons to get the effect we want. But suppose we wished to add BNZ - what would we have to add to the interpreter? [3]

A trivial change to the code for BZE is all that is needed to be added, plus of course the changes needed in the lists of opcodes and mnemonics (nearly everyone got this right - well done):

        case STKMC_bnz:
          cpu.sp++;
          if (inbounds(cpu.sp))
          { if (mem[cpu.sp - 1] == 1) cpu.pc = mem[cpu.pc]; else cpu.pc++; }
          break;

We would not have to rearrange comparisons if we had a logical NOT operation - the effect of

            LSS                  is equivalent to    GEQ
            BNZ  Somewhere                           NOT
                                                     BZE  Somewhere

How would we add NOT to the interpreter? [3]

The most common errors in submissions here were (a) attempts to push or pop something from the stack - one must not alter cpu.sp or (b) attempts that thought Boolean "not" was the same as "unary subtraction".

Once again, it was depressing to see the wide variety of very wrong answers. There are a couple of correct ones:

        case STKMC_bze:
          if (inbounds(cpu.sp)) mem[cpu.sp] = ! mem[cpu.sp];
          break;

or if you insist:

        case STKMC_bze:
          if (inbounds(cpu.sp))
            if (mem[cpu.sp] == 0) mem[cpu.sp] = 1; else mem[cpu.sp = 0;
          break;

or, for those who avoid the use of the C++ "not" operator and who are confident that true will always be represented by 1 and false by 0 (dangerous assumption if you insist on using that dangerous language!)

        case STKMC_bze:
          if (inbounds(cpu.sp)) mem[cpu.sp] = 1 - mem[cpu.sp];
          break;

6. The emulator you have been using has been for a machine that only supports "integer" arithmetic. Outline briefly what you would have to do to develop an emulator for a system that (instead) supported only "real" (floating-point) arithmetic. [3]

This was intended to see who had really gained a proper insight into how emulators work. Nobody had a really good answer, though several students observed astutely that the solution would depend on declaring a memory array to be one of reals (floating point) rather than one of integers. The problem with this is that the memory has to store not only data (on the stack), but also addresses, code and string literals, and these are not really best stored as real values.

There are two ways around this, commonly found in such interpreters. The one is to have two separate arrays - one for code/strings and the other for the stack of real values. The other is to declare the memory to be an array of what C programmers call unions and Pascal/Modula programmers call variant records, where the same location can be addressed in the simulator in one of two ways. Actually an array of even simpler structures would suffice - here is one possible set of declarations:

    struct WORDS {
      int intVal;
      float floatVal;
    };

    WORDS mem[STKMC_memsize];   // The memory array

and then operations like BRN and VAL get coded up like this

        case STKMC_brn:
          cpu.pc = mem[cpu.pc].intVal; break;

        case STKMC_val:
          if (inbounds(cpu.sp) && inbounds(mem[cpu.sp].intVal))
            mem[cpu.sp].floatVal = mem[mem[cpu.sp].intVal].floatVal;
          break;

But apart from a lot of changes like this the system is virtually the same (the I/O has to accommodate different calls to I/O primitives, of course).