1. I have a memory like my friend Ερατοσθενes and his σιεϕ and cannot remember - remind me of your name (surname) and student number [1]
Pat (Terry) 63T0884 Surname was supposed to be parenthesized
2. In the prac you were invited to note that sequences like
LDA X and LDA Y LDV <code> STO
occurred so frequently that it might be worth introducing special opcodes LDL X and STL Y instead. By now you will recognise something of the Terry perversity - maybe the machine is defined to have LDA, LDV and STO opcodes rather than simply having only LDL and STL opcodes for no other reason than that it makes for lots of fun late at night to add these opcodes to the machine. But maybe not? If you have LDL and STL, could you completely remove LDA, LDV and STO from the opcode list, or would you still need to have them available? Justify your answer. [2]
If every LDV was preceded by an LDA one could get away with using LDL instead of LDA + LDV, and if every other use of LDA was followed by a STO one could get away without LDA. But this is not the case. Consider the following code, which illustrates simple input and array handling:
read(i); list[i] = list[j];
This, even if we use every LDL and STL we can, translates to
LDA i INPI ; read(i) LDL list LDL i INXA ; address of list[i] LDL list LDL j INXA ; address of list[j] LDV ; value of list[j] STO ; list[i = list[j]
3. (a) What do you understand by "short circuit evaluation of Boolean expressions" [2]
Essentially we can sum this up in English by saying that, if for a Boolean infix expression of the general form
Expression = Term1 Operator Term2
we can deduce the value of the complete Expression by evaluating Term1 only, then there would be no need to evaluate Term2; in code terms we should be able to use a conditional branch operation to bypass evaluating Term2 if and when this situation is encountered. This is very useful in some situations - consider, for example
if (b != 0) and ( a / b > 4) then .... // protection against divide by zero error if (n > limit) or (list[n] == key) then .. // protection agains index range error
In the case of the AND and OR infix operations we can be more explicit - if this behaviour is desired then we can write the following equivalences when we need to evaluate expressions
C = Term1 AND Term2 has a Boolean value determined by C = if !Term1 then false, else Term2 D = Term1 OR Term2 has a Boolean value determined by D = if Term1 then true, else Term2
In terms of translation into code this allows us to use tests for true/false and conditional branch instructions and, indeed, forbids us to use AND and OR - which require that both terms must be evaluated.
In the PVM a conditional branch instruction examines and discards the value on the top of the stack. So, for example, while code like
bool a, b; if (a and b) write("both")
is easily seen to lead to code like this (using symbolic names rather than 0, 1, 2 etc):
LDL a ; evaluate first term BZE skip ; branch if false LDL b ; bad luck, must evaluate the second term BZE skip ; branch if false PRNS "both" skip ....
some other examples requite a little more thought. For example, it would be tempting to translate
if (a or b) write("both") LDL a ; evaluate first term BNZ write ; branch if true LDL b ; bad luck, must evaluate the second term BZE skip ; branch if false write PRNS "both" skip ....
it would appear that we cannot do this without extending the PVM's opcode set (which, as you have now seen) is easy enough to do). Not so - how about
LDL a ; evaluate first term NOT ; tos replaced by !tos BZE write ; branch if !a is false (ie a was true) LDL b ; bad luck, must evaluate the second term BZE skip ; branch if false write PRNS "both" skip ....
or, as they say, everything is possible; the impossible just takes a little longer. Ah, but how about something like
bool c = a and b
where we have to store the value of the expression? In terms of our definition and our stack machine this has to be massaged a little - we have to throw away TOS when we do the conditional branch, but we have to push a value onto the stack before we can do the assignment. Here is a neat possibility (some submissions were very clumsy)
if !a then push false, else push b LDL a ; evaluate first term BZE push0 ; if a is false push 0, don't bother to look at b LDL b ; evaluate second term - its value is the one to store BRN store ; skip past push0 push0 LDC 0 ; push 0 store STL c ; hope that is clear?
And you might like to consider how we could code
bool c = a or b
and look at the solution to part (b) for a more complicated example.
(b) Translate the following program into PVM code, making use of the opcodes you have now encountered, and using short circuit evaluation, but not adding any other opcodes. [10] (Hint: this can be done in under 24 lines.)
void Main () { int n; bool a, b, c, d; bool[] list = new bool[n]; if (a && b) write("both"); list[n] = a == b || c != d; }
Three solutions are given. The one on the left is the most compact, making full use of the opcode set. The central one uses STL and LDL in their simpler form. The one on the right uses the original opcodes and is more verbose. Some submissions showed that their authors are uncomfortable with Boolean assignments.
0 DSP 6 0 DSP 6 0 DSP 6 2 LDL_0 2 LDL 0 2 LDA 5 3 ANEW 4 ANEW 4 LDA 0 4 STL 5 list = new bool[n] 5 STL 5 6 LDV 6 LDL_1 7 LDL 1 7 ANEW 7 BZE 14 if a false, jump 9 BZE 17 8 STO 9 LDL_2 11 LDL 2 9 LDA 1 10 BZE 14 if b false, jump 13 BZE 17 11 LDV 12 PRNS "both" 15 PRNS "both" 12 BZE 21 14 LDL 5 push adr list[n] 17 LDL 5 14 LDA 2 16 LDL_0 19 LDL 0 16 LDV 17 LDXA 21 LDXA 17 BZE 21 18 LDL_1 22 LDL 1 19 PRNS "both" 19 LDL_2 24 LDL 2 21 LDA 5 20 CNE a != b? (invert) 26 CNE 23 LDV 21 BZE 29 if a == b, jump 27 BZE 35 24 LDA 0 23 LDL_3 29 LDL 3 26 LDV 24 LDL 4 31 LDL 4 27 LDXA 26 CNE c != d 33 CNE 28 LDA 1 27 BRN 30 34 BRN 38 30 LDV 29 LDC_1 push true 35 LDC 1 31 LDA 2 30 STO store on list[n] 38 STO 33 LDV 31 HALT 39 HALT 34 CNE 35 BZE 46 37 LDA 3 39 LDV 40 LDA 4 42 LDV 43 CNE 44 BRN 48 46 LDC 1 48 STO 49 HALT
4. The following represents the C# translation of the encoding method that you (should have) translated to PVM code in this practical.
public static void Main(string[] args) { // rot13 encryption of a text terminated with a period // P.D. Terry, Rhodes University, 2015 char ch; do { ch = IO.ReadChar(); ch = Char.ToLower(ch); if (Char.IsLetter(ch)) ch = (char) ('a' + (ch - 'a' + 13) % 26); IO.Write(ch); } while (ch != '.'); } // Main
(a) How would you decode a text that had been supplied to you after being encrypted in this way? [2]
You could simply use the program "as is", since the algorithm is cyclic. However, you would lose any distinction there might have been between upper and lower case in the original. If this were really important, code like the following would be better for the encoding/decoding program:
public static void Main(string[] args) { // rot13 encryption of a text terminated with a period // P.D. Terry, Rhodes University, 2015 char ch; do { // Don't use a regular WHILE loop (why not?) ch = IO.ReadChar(); if (Char.IsUpper(ch)) ch = (char) ('A' + (ch - 'A' + 13) % 26); else if (Char.IsLower(ch)) ch = (char) ('a' + (ch - 'a' + 13) % 26); IO.Write(ch); } while (ch != '.'); } // main
(b) You will note the presence of the (char) cast in this source code. Given that characters are simply represented by integer values at the machine level, and that C# permits you to add character literals to integers in any case (as you can also see exemplified here), what purpose (if any) does the (char) casting operation serve? Could it simply have been omitted? Discuss. [3]
The rules of C# demand that when "arithmetic" is done on "characters" the 16-bit character values are promoted to 32-bit integer values, and the resulting expression is thus normally of int type. The cast above serves two purposes - to demote the value of the expression at run-time from 32 to 16 bits (which may result in a loss of precision, though for examples like the one shown here this should not happen) and, at compile-time, to flag the expression as being of char type, and thus assignment compatible to a char designator like ch. This second consideration makes the cast obligatory in this example. At run time it may even demote the 32 bit rsult to 16 bits by replacing the result by result % 65536 (usually this won't really "do" much).
There were a few misconceptions. An assignment - even if it were acceptable - like
char ch = 'a' + (ch - 'a' + 13) % 26;
could not change the type of the variable ch to int, even though the type of the expression
'a' + (ch - 'a' + 13) % 26
would be int. An expression as a whole has a "type" as we shall explore in more detail. In languages like C# simple variables have a declared type which stays with them! So the statement
IO.Write(ch);
does not need another cast. The IO.Write method is, indeed, overloaded, but the compiler system will "pick" the version that outputs a character if (as here) the argument (ch) is declared of the char type.
Curiously, Java will (very sloppily) allow one to code
char ch = 'a' + 3; // both terms are constant
but it will not allow
char ch1 = ch + 'a'; // one term is a variable
One rather wishes that the rules for "mixing" types were a bit more self-consistent. C# will not allow either of those statements - and Pascal and Modula-2 would never have let you "get away" with them either! Modula-2 was very fussy about how and when one could mix types.
(c) When you translated the algorithm to PVM code you may (or may not) have added an opcode to perform the casting operation, which seems, in a sense, to be superfluous. After reflection, would such an opcode be a good idea, and if so, can you suggest how the interpreter "case arm" would deal with it? [3]
Since the PVM treats all data values as 32 bit integers, it would appear that no opcode is really needed. However, since Parva uses an ASCII character subset, for which only values 0 .. 255 have any meaning, it would be safer to introduce a i2c opcode that could have an interpretation on the lines of
case PVM.i2c: if (mem[cpu.sp] < 0 || mem[cpu.sp > maxChar) ps = badVal; // else don't "do" anything! break;
or even
case PVM.i2c: mem[cpu.sp] = Maths.Abs(mem[cpu.sp]) % 256; // extract lowest 8 bits - play safe break;
5. Here is another algorithm that should be familiar:
void Main() { // Read a sequence of digit characters and convert // this to the corresponding integer int n = 0; char ch; read(ch); while (isDigit(ch)) { n = 10 * n + digitValue(ch); read(ch); } write(n); } // Main
In the practical you (should have) introduced opcodes to achieve the effect of character handling functions LowerCase() and IsLetter().
(a) What changes would you have to make to the switch statement in the PVM emulator in the Assembler system you have been using, to support additional opcodes that achieve the effect of character handling functions isDigit() and digitValue()? [3]
Questions like this are best answered by giving concise code, not verbose English - hint from the examiner! The exact spelling of library methods won't worry him in tests either.
A very common misconception was that the ASCII representation of the digit characters '0' ... '9' are the simple values 0 ... 9. They are not. '0' ... '9' are represented internally by integer values 48 ... 57.
case PVM.dig: // digitValue tos = Pop(); if (Char.IsDigit((char) tos) Push(tos - '0'); else ps = badVal; break; case PVM.isdig: // isDigit tos = Pop(); push(Char.IsDigit((char) tos) ? 1 : 0); break;
(b) Would you have to make any other changes to the PVM.cs file and PVMAsm file to support these extensions? Explain your thinking! [2]
We simply add DIG and ISDIG to the list of opcode/string pairs at the end of the PVM.cs file, and add lines giving the numeric equivalent to the list near the beginning. Since these are one-word opcodes, no other changes are needed (in contrast to the addition of codes like STL N and LDL N).
(c) Translate the algorithm above into PVM code, making use of your new opcodes, and also of the STL and LDL opcodes discussed in question 2. (Hint: this can be done in under 20 lines.) [6]
Since most students have an aversion to comments in code, I have left them out...
0 DSP 2 0 DSP 2 2 LDC_0 2 LDC_0 3 STL_0 3 STL_0 4 LDA_1 4 LDA_1 5 INPC 5 INPC 6 LDL_1 6 LDL_1 7 ISDIG 7 ISDIG 8 BZE 22 8 BZE 20 10 LDC 10 10 LDC 10 12 LDL_0 12 LDL_0 13 MUL 13 MUL 14 LDL_1 14 LDL_1 15 DIG 15 DIG 16 ADD 16 ADD 17 STL_0 17 STL_0 18 LDA_1 18 BRN 4 19 INPC 20 LDL_0 20 BRN 6 21 PRNI 22 LDL_0 22 HLT 23 PRNI 24 HLT