Test for week 2

Computer Science 3 - 2017 - Test on Week 2 - Solutions

This should take no longer than 30 minutes and is intended to probe whether you have understood the PVM. Use your textbook - but not your notes or solutions - if you think it will help. Answer on the question sheet (4 sides).)

1. I have a memory like my friend Ερατοσθενes and his σιεϕ and cannot remember - remind me of your first name, (surname) and student number (in that format) [1]

Pat (Terry) 63T0884

2. In the prac you were invited to note that sequences like

         LDA   X          and   LDA   Y
         LDV                    <code>
                                STO

occurred so frequently that it might be worth introducing special opcodes LDL X and STL Y instead. By now you will recognise something of the Terry perversity - maybe the machine is defined to have LDA, LDV and STO opcodes rather than simply having only LDL and STL opcodes for no other reason than that it makes for lots of fun late at night to add these opcodes to the machine. But maybe not? If you have LDL and STL, could you completely remove LDA, LDV and STO from the opcode list, or would you still need to have them available? Justify your answer. [2]

If every LDV was preceded by an LDA one could get away with using LDL instead of LDA + LDV, and if every other use of LDA was followed by a STO one could get away without LDA. But this is not the case. Consider the following code:

         read(i); list[i] = list[j];

which, even if we use every LDL and STL we can, translates to

LDA i INPI ; read(i) LDL list LDL i INXA ; address of list[i] LDL list LDL j INXA ; address of list[j] LDV ; value of list[j] STO ; list[i = list[j]

PLEASE NOTE: You have to learn Pat Terry-speak in test and exam questions. When he says "justify" or "illustrate" he wants you to give a proper example. Don't just wave your arms, figuratively speaking, and say "LDA and STO would be useful in other situations". Give an example like I have just done, so that I won't think you are just guessing or bluffing!

3. (Character manipulations) Here is an algorithm that hopefully is familiar:

     void main() {
     // Read a sequence of digit characters and convert this to the corresponding integer
       int n = 0;
       char ch;
       read(ch);
       while (n < 214748364 && isDigit(ch)) {
         n = 10 * n + digitValue(ch);
         read(ch);
       }
       write(n);
     } // main

In the practical you (should have) introduced opcodes to achieve the effect of the character handling function upperCase() and the I/O functions read() and write().

(a) What changes would you have to make to the switch statement in the PVM emulator in either of the Assembler systems you have been using, to support additional opcodes isdig and digv that achieve the effect of character handling functions isDigit(ch) and digitValue(ch)? [6]

(Hint: the Char class has a method Char.IsDigit(c) which might prove to be useful.)

PLEASE NOTE: You have to learn Pat Terry-speak in test and exam questions. When he says "what changes would you have to make" to some code he means "make them quite clear by giving the new or extended code" like you can see in the solution below. Don't just say "I would add some new opcodes and numbers". Show me. Much easier to understand your answer, and usually the answer is shorter and more succinct (and if you don't know the word "succinct" look it up!).

       case PVM.isdig:          // isDigit               case PVM.isdig:          // isDigit
         tos = Pop();                                      mem[cpu.sp] = Char.IsDigit((char) mem[cpu.sp]) ? 1 : 0;
         Push(Char.IsDigit((char) tos) ? 1 : 0);           break;
         break;
       case PVM.digv:           // digitValue            case PVM.digv:           // digitValue
         tos = Pop();                                      if (Char.IsDigit((char) mem[cpu.sp]))
         if (Char.IsDigit((char) tos) Push(tos - '0');       mem[cpu.sp] -= '0';
         else ps = badVal;                                 else ps = badVal;
         break;                                            break;

Incidentally, several students had the "right idea" and wrote answers like the one below. But the mem array/stack contains integers, and so you can't treat its elements as Booleans or characters without having some conversion process explicit in your code - at least in well behaved languages like Java, C#, Pascal and Modula-2. Note, too, in the full solution, how we check first that the character represents a digit before we convert the character '6' to the integer value six (6,) or object if we are given a character like 'A'.

       case PVM.isdig:              // isDigit - incorrect, though the idea is the right one
         tos = Pop();               // okay, but tos is an integer, remember
         Push(Char.IsDigit(tos);    // Char.isDigit returns boolean and this can't be pushed directly
         break;

(b) Would you have to make any other changes to the PVM file and PVMAsm file to support these extensions? Explain your thinking! [2]

We simply add DIGV and ISDIG to the list of opcode/string pairs at the end of the PVM.cs file, and add lines giving the numeric equivalent to the list near the beginning. Since these are one-word opcodes, no other changes are needed (in contrast to the addition of codes like STL N and LDL N as you should have discovered in the prac).

(c) Translate the algorithm above into PVM code, making use of your new opcodes, and also of the STL and LDL opcodes discussed in question 2. You should also demonstrate a "short-circuit" approach to the expression controlling the while loop. (Hint: this can be done in under 28 lines.) [8]

Since most students have an aversion to comments in code, I have left them out... (Put them back!). Several submissions had not incorporated the "short-circuit" requirement (achieved by the two BZE instructions).

            0  DSP     2                                 0  DSP     2
            2  LDC_0                                     2  LDC_0
            3  STL_0                                     3  STL_0
            4  LDA_1                                     4  LDA_1
            5  INPC                                      5  INPC
            6  LDL_0                                     6  LDL_0
            7  LDC     214748364                         7  LDC     214748364
            9  CLT                                       9  CLT
           10  BZE     28                               10  BZE     26
           12  LDL_1                                    12  LDL_1
           13  ISDIG                                    13  ISDIG
           14  BZE     28                               14  BZE     26
           16  LDC     10                               16  LDC     10
           18  LDL_0                                    18  LDL_0
           19  MUL                                      19  MUL
           20  LDL_1                                    20  LDL_1
           21  DIG                                      21  DIG
           22  ADD                                      22  ADD
           23  STL_0                                    23  STL_0
           24  LDA_1                                    24  BRN     4
           25  INPC                                     26  LDL_0
           26  BRN     6                                27  PRNI
           28  LDL_0                                    28  HALT
           29  PRNI
           30  HALT

4. (Array handling) Translate the following program into PVM code, making use of the INC, LDL and STL opcodes studied in the prac. [10] (Hint: this can be done in under 24 lines.)

     void main () {
       int i;
       bool[] odd = new bool[1000];
       for (i = 0; i < 1000; i++) odd[i] = i % 2 != 0;
     }

Two solutions are given. The one on the left is a literal translation, that on the right is slightly optimized, taking advantage of the fact that a "mod 2" operation yields a 1 for odd numbers, which happens also to be the representation of true. Some submissions showed that their authors are uncomfortable with Boolean assignments. And some of you were unaware of the REM opcode. Oh well, hopefully you are after reading this

            0  DSP   2      int i; bool[] odd            0  DSP   2      int i; bool[] odd
            2  LDC   1000                                2  LDC   1000
            4  ANEW                                      4  ANEW
            5  STL   1      odd = new bool[1000]         5  STL   1      odd = new bool[1000]
            7  LDC   0                                   7  LDC   0
            9  STL   0      i = 0;                       9  STL   0      i = 0;
           11  LDL   0                                  11  LDL   0
           13  LDC   1000                               13  LDC   1000
           15  CLT                                      15  CLT
           16  BZE   37     while (i < 1000) {          16  BZE   34     while (i < 1000) {
           18  LDL   1                                  18  LDL   1
           20  LDL   0                                  20  LDL   0
           22  LDXA                                     22  LDXA
           23  LDL   0                                  23  LDL   0
           25  LDC   2                                  25  LDC   2
           27  REM                                      27  REM
           28  LDC   0                                  28  STO             odd[i] = i % 2 != 0;
           30  CNE                                      29  LDA   0
           31  STO             odd[i] = i % 2 != 0;     31  INC             i++;
           32  LDA   0                                  32  BRN   11     }
           34  INC             i++;                     34  HALT
           35  BRN   11     }
           37  HALT

5. The PVM currently supports the INPI, INPB and INPC operations that expect to find a destination address on the top of the stack, as you should know, favouring the translation of code like

          read(x);
          read(sentence[i]);

Suppose instead that the system were to be changed to support a style of programming like

          x = readInt() + readInt();  // read and add two integers
          sentence[i] = readChar();   // read and store a single character

What implications would this have for the PVM, and what code sequences would be needed to correspond to these fragments? [5]

One would need to modify the INPx instructions so that they did not store the input at an address lying in wait on the stack, but simply push the value read onto the stack:

       case PVM.inpi:          // integer input         case PVM.inpi:          // integer input
         Push(data.ReadInt());                            mem[--cpu.sp] = data.ReadInt();
         if (data.Error()) ps = badData;                  if (data.Error()) ps = badData;
         break;                                           break;
       case PVM.inpb:          // boolean input         case PVM.inpb:          // boolean input
         Push(data.ReadBool() ? 1 : 0);                   mem[--cpu.sp] = data.ReadBool() ? 1 : 0;
         if (data.Error()) ps = badData;                  if (data.Error()) ps = badData;
         break;                                           break;
       case PVM.inpc:          // character input       case PVM.inpc:          // character input
         Push(data.ReadChar());                           mem[--cpu.sp] = data.ReadChar();
         if (data.Error()) ps = badData;                  if (data.Error()) ps = badData;
         break;                                           break;

The statements above might then be translated simply to

               LDA  x       or       INPI                                LDL  sentence
               INPI                  INPI                                LDL  i
               INPI                  ADD                                 LDXA
               ADD                   STL   x                             INPC
               STO                                                       STOC

Task 10 of Prac 2 - Execution overheads - part two

In the prac kit you were supplied with a second translation SIEVE2.PVM of a cut down version of the same prime-counting program SIEVE.PAV as was used in Task 4, but this time using the extended opcode set developed in the last task. The kit also included the code that could be executed if the PVM were extended still further on the lines of the suggestions on page 44 of the textbook.

Running SIEVE1.PVM through both of the original and modified assemblers, and SIEVE2.PVM and SIEVE3.PVM through both of the modified assemblers gave the following timings for the same limit (4000) and number of iterations (100) on my machines, one a laptop running Windows XP and one a desktop running Windows 7-32.

   ┌────────────────────────────────┬────────────────────────┬────────────────────────┬────────────────────────┐
   │ Desktop Machine (Win 7-32)     │  Sieve1.pvm   (1.00)   │  Sieve2.pvm   (0.78)   │  Sieve3.pvm  (0.75)    │
   │                                │                        │                        │                        │
   │ ASM1  (Push/Pop)               │      0.73              │      0.57              │      0.55              │
   │ ASM2  (Inline)                 │      0.30     (0.41)   │      0.20     (0.36)   │      0.13    (0.24)    │
   │                                │                        │                        │                        │
   └────────────────────────────────┴────────────────────────┴────────────────────────┴────────────────────────┘

   ┌────────────────────────────────┬────────────────────────┬────────────────────────┬────────────────────────┐
   │ Laptop machine (XP-32)         │  Sieve1.pvm   (1.00)   │  Sieve2.pvm   (0.75)   │  Sieve3.pvm  (0.67)    │
   │                                │                        │                        │                        │
   │ ASM1  (Push/Pop)               │      1.14              │      0.85              │      0.76              │
   │ ASM2  (Inline)                 │      0.52     (0.46)   │      0.30     (0.35)   │      0.27    (0.35)    │
   │                                │                        │                        │                        │
   └────────────────────────────────┴────────────────────────┴────────────────────────┴────────────────────────┘

The Desktop times were about 65% of those on the slower Laptop. The Inline times were about 40% of the Push/Pop system with the original limited opcode set. The Inline times were about 30% of the Push/Pop system with the extended opcode set,

The reasons are not hard to find. The InLine emulator makes very few function calls within the fetch-execute cycle, whereas the Push/Pop one makes a very large number, each carrying an extra overhead. Similarly, the introduction of the LDL and STL codes allowed for fewer opcodes to be interpreted to achieve the desired result.

If one wishes to improve the performance of the interpreter further it might make sense to get some idea of which opcodes are executed most often. Clearly this will depend on the application, and so a mix of applications might need to be analysed. It is not difficult to add a profiling facility to the interpreter, and this has been done in yet another interpreter that you can find in the solution kit. Running this on the Sieve files yielded some interesting results. For a start, there were enormous numbers of steps executed - probably more than you might have thought.

        Original opcodes           Extended opcode set          Extended opcode set
                                   LDL and STL used             LDL, STL, LDL_x STL_x

      39 494 323 operations.     27 070 118 operations. (68%)    (same op count)

        LDA     10824405           LDL      8186502             LDL_2    3821200
        LDV      9386302           LDC      4148705             LDL_1    2582600
        LDC      4948605           BZE      2182801             BZE      2182801
        STO      3165703           CLE      1782901             LDC_0    1910701
        BZE      2182801           BRN      1727700             LDC      1782902
        CLE      1782901           LDXA     1727600             CLE      1782901
        ADD      1782701           STO      1327700             BRN      1727700
        BRN      1727700           STL      1038103             LDL_0    1727600
        LDXA     1727600           ADD       982801             LDXA     1727600
        CGT       982800           AND       982800             STO      1327700
        AND       982800           CGT       982800             ADD       982801
        HALT           1           LDA       799900             STL_2     982800
        ANEW           1           INC       799900             CGT       982800
        PRNS           1           LDV       399900             AND       982800
        PRNI           1           DSP            1             INC       799900
        DSP            1           PRNI           1             LDA_1     799800
                                   PRNS           1             LDC_1     454902
                                   ANEW           1             LDV       399900
                                   HALT           1             STL        55101
                                                                LDL        55001
                                                                LDC_2        200
                                                                STL_1        200
                                                                LDL_3        101
                                                                LDA_3        100
                                                                STL_3          1
                                                                STL_0          1
                                                                HALT           1
                                                                ANEW           1
                                                                PRNS           1
                                                                PRNI           1
                                                                DSP            1