Test for week 2

Computer Science 3 - 2015 - Test on Week 2 - Solutions

1. I have a memory like my friend Ερατοσθενes and his σιεϕ and cannot remember - remind me of your name (surname) and student number [1]

Pat (Terry) 63T0884 Surname was supposed to be parenthesized

2. In the prac you were invited to note that sequences like

         LDA   X          and   LDA   Y
         LDV                    <code>
                                STO

occurred so frequently that it might be worth introducing special opcodes LDL X and STL Y instead. By now you will recognise something of the Terry perversity - maybe the machine is defined to have LDA, LDV and STO opcodes rather than simply having only LDL and STL opcodes for no other reason than that it makes for lots of fun late at night to add these opcodes to the machine. But maybe not? If you have LDL and STL, could you completely remove LDA, LDV and STO from the opcode list, or would you still need to have them available? Justify your answer. [2]

If every LDV was preceded by an LDA one could get away with using LDL instead of LDA + LDV, and if every other use of LDA was followed by a STO one could get away without LDA. But this is not the case. Consider the following code, which illustrates simple input and array handling:

           read(i); list[i] = list[j];

This, even if we use every LDL and STL we can, translates to

LDA i INPI ; read(i) LDL list LDL i INXA ; address of list[i] LDL list LDL j INXA ; address of list[j] LDV ; value of list[j] STO ; list[i = list[j]

3. (a) What do you understand by "short circuit evaluation of Boolean expressions" [2]

Essentially we can sum this up in English by saying that, if for a Boolean infix expression of the general form

                        Expression  =  Term1  Operator  Term2

we can deduce the value of the complete Expression by evaluating Term1 only, then there would be no need to evaluate Term2; in code terms we should be able to use a conditional branch operation to bypass evaluating Term2 if and when this situation is encountered. This is very useful in some situations - consider, for example

if (b != 0) and ( a / b > 4) then .... // protection against divide by zero error if (n > limit) or (list[n] == key) then .. // protection agains index range error

In the case of the AND and OR infix operations we can be more explicit - if this behaviour is desired then we can write the following equivalences when we need to evaluate expressions

C = Term1 AND Term2 has a Boolean value determined by C = if !Term1 then false, else Term2 D = Term1 OR Term2 has a Boolean value determined by D = if Term1 then true, else Term2

In terms of translation into code this allows us to use tests for true/false and conditional branch instructions and, indeed, forbids us to use AND and OR - which require that both terms must be evaluated.
In the PVM a conditional branch instruction examines and discards the value on the top of the stack. So, for example, while code like

bool a, b; if (a and b) write("both")

is easily seen to lead to code like this (using symbolic names rather than 0, 1, 2 etc):

LDL a ; evaluate first term BZE skip ; branch if false LDL b ; bad luck, must evaluate the second term BZE skip ; branch if false PRNS "both" skip ....

some other examples requite a little more thought. For example, it would be tempting to translate

if (a or b) write("both") LDL a ; evaluate first term BNZ write ; branch if true LDL b ; bad luck, must evaluate the second term BZE skip ; branch if false write PRNS "both" skip ....

it would appear that we cannot do this without extending the PVM's opcode set (which, as you have now seen) is easy enough to do). Not so - how about

LDL a ; evaluate first term NOT ; tos replaced by !tos BZE write ; branch if !a is false (ie a was true) LDL b ; bad luck, must evaluate the second term BZE skip ; branch if false write PRNS "both" skip ....

or, as they say, everything is possible; the impossible just takes a little longer. Ah, but how about something like

bool c = a and b

where we have to store the value of the expression? In terms of our definition and our stack machine this has to be massaged a little - we have to throw away TOS when we do the conditional branch, but we have to push a value onto the stack before we can do the assignment. Here is a neat possibility (some submissions were very clumsy)

if !a then push false, else push b LDL a ; evaluate first term BZE push0 ; if a is false push 0, don't bother to look at b LDL b ; evaluate second term - its value is the one to store BRN store ; skip past push0 push0 LDC 0 ; push 0 store STL c ; hope that is clear?

And you might like to consider how we could code

bool c = a or b

and look at the solution to part (b) for a more complicated example.

(b) Translate the following program into PVM code, making use of the opcodes you have now encountered, and using short circuit evaluation, but not adding any other opcodes. [10] (Hint: this can be done in under 24 lines.)

     void Main () {
       int n;
       bool a, b, c, d;
       bool[] list = new bool[n];
       if (a && b) write("both");
       list[n] = a == b || c != d;
     }

Three solutions are given. The one on the left is the most compact, making full use of the opcode set. The central one uses STL and LDL in their simpler form. The one on the right uses the original opcodes and is more verbose. Some submissions showed that their authors are uncomfortable with Boolean assignments.

    0  DSP      6                                  0  DSP      6             0  DSP      6
    2  LDL_0                                       2  LDL      0             2  LDA      5
    3  ANEW                                        4  ANEW                   4  LDA      0
    4  STL      5       list = new bool[n]         5  STL      5             6  LDV
    6  LDL_1                                       7  LDL      1             7  ANEW
    7  BZE      14      if a false, jump           9  BZE      17            8  STO
    9  LDL_2                                      11  LDL      2             9  LDA      1
   10  BZE      14      if b false, jump          13  BZE      17           11  LDV
   12  PRNS     "both"                            15  PRNS     "both"       12  BZE      21
   14  LDL      5       push adr list[n]          17  LDL      5            14  LDA      2
   16  LDL_0                                      19  LDL      0            16  LDV
   17  LDXA                                       21  LDXA                  17  BZE      21
   18  LDL_1                                      22  LDL      1            19  PRNS     "both"
   19  LDL_2                                      24  LDL      2            21  LDA      5
   20  CNE              a != b? (invert)          26  CNE                   23  LDV
   21  BZE      29      if a == b, jump           27  BZE      35           24  LDA      0
   23  LDL_3                                      29  LDL      3            26  LDV
   24  LDL      4                                 31  LDL      4            27  LDXA
   26  CNE              c != d                    33  CNE                   28  LDA      1
   27  BRN      30                                34  BRN      38           30  LDV
   29  LDC_1            push true                 35  LDC      1            31  LDA      2
   30  STO              store on list[n]          38  STO                   33  LDV
   31  HALT                                       39  HALT                  34  CNE
                                                                            35  BZE      46
                                                                            37  LDA      3
                                                                            39  LDV
                                                                            40  LDA      4
                                                                            42  LDV
                                                                            43  CNE
                                                                            44  BRN      48
                                                                            46  LDC      1
                                                                            48  STO
                                                                            49  HALT

4. The following represents the C# translation of the encoding method that you (should have) translated to PVM code in this practical.

     public static void Main(string[] args) {
     // rot13 encryption of a text terminated with a period
     // P.D. Terry, Rhodes University, 2015

       char ch;
       do {
         ch = IO.ReadChar();
         ch = Char.ToLower(ch);
         if (Char.IsLetter(ch)) ch = (char) ('a' + (ch - 'a' + 13) % 26);
         IO.Write(ch);
       }
       while (ch != '.');
     } // Main

(a) How would you decode a text that had been supplied to you after being encrypted in this way? [2]

You could simply use the program "as is", since the algorithm is cyclic. However, you would lose any distinction there might have been between upper and lower case in the original. If this were really important, code like the following would be better for the encoding/decoding program:

     public static void Main(string[] args) {
     // rot13 encryption of a text terminated with a period
     // P.D. Terry, Rhodes University, 2015

       char ch;
       do {                         // Don't use a regular WHILE loop (why not?)
         ch = IO.ReadChar();
         if      (Char.IsUpper(ch)) ch = (char) ('A' + (ch - 'A' + 13) % 26);
         else if (Char.IsLower(ch)) ch = (char) ('a' + (ch - 'a' + 13) % 26);
         IO.Write(ch);
       }
       while (ch != '.');
     } // main

(b) You will note the presence of the (char) cast in this source code. Given that characters are simply represented by integer values at the machine level, and that C# permits you to add character literals to integers in any case (as you can also see exemplified here), what purpose (if any) does the (char) casting operation serve? Could it simply have been omitted? Discuss. [3]

The rules of C# demand that when "arithmetic" is done on "characters" the 16-bit character values are promoted to 32-bit integer values, and the resulting expression is thus normally of int type. The cast above serves two purposes - to demote the value of the expression at run-time from 32 to 16 bits (which may result in a loss of precision, though for examples like the one shown here this should not happen) and, at compile-time, to flag the expression as being of char type, and thus assignment compatible to a char designator like ch. This second consideration makes the cast obligatory in this example. At run time it may even demote the 32 bit rsult to 16 bits by replacing the result by result % 65536 (usually this won't really "do" much).

There were a few misconceptions. An assignment - even if it were acceptable - like

         char ch = 'a' + (ch - 'a' + 13) % 26;

could not change the type of the variable ch to int, even though the type of the expression

'a' + (ch - 'a' + 13) % 26

would be int. An expression as a whole has a "type" as we shall explore in more detail. In languages like C# simple variables have a declared type which stays with them! So the statement

IO.Write(ch);

does not need another cast. The IO.Write method is, indeed, overloaded, but the compiler system will "pick" the version that outputs a character if (as here) the argument (ch) is declared of the char type.
Curiously, Java will (very sloppily) allow one to code

char ch = 'a' + 3; // both terms are constant

but it will not allow

char ch1 = ch + 'a'; // one term is a variable

One rather wishes that the rules for "mixing" types were a bit more self-consistent. C# will not allow either of those statements - and Pascal and Modula-2 would never have let you "get away" with them either! Modula-2 was very fussy about how and when one could mix types.

(c) When you translated the algorithm to PVM code you may (or may not) have added an opcode to perform the casting operation, which seems, in a sense, to be superfluous. After reflection, would such an opcode be a good idea, and if so, can you suggest how the interpreter "case arm" would deal with it? [3]

Since the PVM treats all data values as 32 bit integers, it would appear that no opcode is really needed. However, since Parva uses an ASCII character subset, for which only values 0 .. 255 have any meaning, it would be safer to introduce a i2c opcode that could have an interpretation on the lines of

     case PVM.i2c:
       if (mem[cpu.sp] < 0 || mem[cpu.sp > maxChar) ps = badVal;  // else don't "do" anything!
       break;

or even

     case PVM.i2c:
       mem[cpu.sp] = Maths.Abs(mem[cpu.sp]) % 256; // extract lowest 8 bits - play safe
       break;

5. Here is another algorithm that should be familiar:

     void Main() {
     // Read a sequence of digit characters and convert
     // this to the corresponding integer
       int n = 0;
       char ch;
       read(ch);
       while (isDigit(ch)) {
         n = 10 * n + digitValue(ch);
         read(ch);
       }
       write(n);
     } // Main

In the practical you (should have) introduced opcodes to achieve the effect of character handling functions LowerCase() and IsLetter().

(a) What changes would you have to make to the switch statement in the PVM emulator in the Assembler system you have been using, to support additional opcodes that achieve the effect of character handling functions isDigit() and digitValue()? [3]

Questions like this are best answered by giving concise code, not verbose English - hint from the examiner! The exact spelling of library methods won't worry him in tests either.

A very common misconception was that the ASCII representation of the digit characters '0' ... '9' are the simple values 0 ... 9. They are not. '0' ... '9' are represented internally by integer values 48 ... 57.

          case PVM.dig:            // digitValue
            tos = Pop();
            if (Char.IsDigit((char) tos) Push(tos - '0');
            else ps = badVal;
            break;
          case PVM.isdig:          // isDigit
            tos = Pop();
            push(Char.IsDigit((char) tos) ? 1 : 0);
            break;

(b) Would you have to make any other changes to the PVM.cs file and PVMAsm file to support these extensions? Explain your thinking! [2]

We simply add DIG and ISDIG to the list of opcode/string pairs at the end of the PVM.cs file, and add lines giving the numeric equivalent to the list near the beginning. Since these are one-word opcodes, no other changes are needed (in contrast to the addition of codes like STL N and LDL N).

(c) Translate the algorithm above into PVM code, making use of your new opcodes, and also of the STL and LDL opcodes discussed in question 2. (Hint: this can be done in under 20 lines.) [6]

Since most students have an aversion to comments in code, I have left them out...

            0  DSP      2                        0  DSP      2
            2  LDC_0                             2  LDC_0
            3  STL_0                             3  STL_0
            4  LDA_1                             4  LDA_1
            5  INPC                              5  INPC
            6  LDL_1                             6  LDL_1
            7  ISDIG                             7  ISDIG
            8  BZE     22                        8  BZE     20
           10  LDC     10                       10  LDC     10
           12  LDL_0                            12  LDL_0
           13  MUL                              13  MUL
           14  LDL_1                            14  LDL_1
           15  DIG                              15  DIG
           16  ADD                              16  ADD
           17  STL_0                            17  STL_0
           18  LDA_1                            18  BRN      4
           19  INPC                             20  LDL_0
           20  BRN      6                       21  PRNI
           22  LDL_0                            22  HLT
           23  PRNI
           24  HLT