There were some very good solutions submitted, and some very energetic ones too - clearly a lot of students had put in many hours developing their code. This is very encouraging. Do learn to put your names into the introductory comments of programs that you write.
Full source for the solutions summarized here can be found in the ZIP file on the servers - PRAC20A.ZIP. These sources also include the extensions needed to handle character I/O and storage.
At the heart of this lay a search loop:
read("Search for (99 stops)? ", item); // first search item while (item != 99) { i = 1; while (list[i] != item) i = i + 1;
It's an awful program because the array subscript goes out of bounds if the item cannot be located.
In the better version of the program in the file SEARCH1.PAV was code that read
list[0] = item; // post as a sentinel i = 10; while (list[i] != item) i = i - 1; // must terminate!
This is much better because it uses a sentinel technique for the linear search. However, it can still be broken if you give it data that is out of range of the int type, give it non-numeric data, or supply 99 as one of the initial list of 10 numbers.
Task 5 was to hand-compile the mystery program into PVM code. Most people got a long way towards this.
The original Parva code was pretty awful, and I suspect many people initially had trouble understanding it. So, just to make the point that code with comments and nicely chosen variable names is worth writing, here is a better version.
void main () { // Prac 20 task 5 // Read a zero terminated sequence of numbers and report on the longest // subsequence of repeated numbers // P.D. Terry, Rhodes University, 2007 int nextInt, longestInt; read(nextInt); int longestCount = 0; while (nextInt != 0) { int currentInt = nextInt; int currentCount = 1; read(nextInt); while (nextInt == currentInt) { currentCount = currentCount + 1; read(nextInt); } if (currentCount > longestCount) { longestInt = currentInt; longestCount = currentCount; } } write("In the longest sequence ", longestInt, " appeared ", longestCount, " times"); }
Have a look at how I have commented this, using "high level" code, rather than detailed line by line commentary of the form "load address of X". Most of the submissions had "commentary" that was, frankly, almost useless. Try the following test for assembler code: Cover over the real code with a piece of paper and read only the comments. Does what you read make sense on its own? I maintain that it should. The easiest way to do this is by using a high level algorithmic notation.
0 DSP 5 ; v0 is nextInt, v1 is longestInt 48 ADD 2 LDA 0 ; v2 is longestCount 49 STO ; currentCount = currentCount + 1 4 INPI ; read(nextInt); 50 LDA 0 5 LDA 2 52 INPI ; read(nextInt); 7 LDC 0 53 BRN 32 ; } 9 STO ; longestCount = 0; 55 LDA 4 10 LDA 0 57 LDV 12 LDV 58 LDA 2 13 LDC 0 60 LDV 15 CNE 61 CGT 16 BZE 78 ; while (nextInt != 0) { 62 BZE 76 ; if (currentCount > longestCount) { 18 LDA 3 ; // v3 is currentInt 64 LDA 1 20 LDA 0 66 LDA 3 22 LDV 68 LDV 23 STO ; currentInt = nextInt; 69 STO ; longestInt = currentInt; 24 LDA 4 ; // v4 is currentCount 70 LDA 2 26 LDC 1 72 LDA 4 28 STO ; currentCount = 1; 74 LDV 29 LDA 0 75 STO ; longestCount = currentCount 31 INPI ; read(nextInt); 76 BRN 10 }} 32 LDA 0 78 PRNS "In the longest sequence " 34 LDV 80 LDA 1 35 LDA 3 82 LDV 37 LDV 83 PRNI ; write(longestInt); 38 CEQ 84 PRNS " appeared " 39 BZE 55 ; while (nextInt = currentInt) { 86 LDA 2 41 LDA 4 88 LDV 43 LDA 4 89 PRNI ; write(longestCount); 45 LDV 90 PRNS " times" 46 LDC 1 92 HALT ; System.Exit(0);
Checking for overflow in multiplication and division was not well done. You cannot multiply and then try to check overflow (it is too late by then) - you have to detect it in a more subtle way. Here is one way of doing it - note the check to prevent a division by zero. This does not use any precision greater than that of the simulated machine itself. Note that it is necessary to check for "division by zero" in the rem code as well!
case PVM.mul: // integer multiplication tos = pop(); sos = pop(); if (tos != 0 && Math.abs(sos) > maxInt / Math.abs(tos)) ps = badVal; else push(sos * tos); break; case PVM.div: // integer division (quotient) tos = pop(); if (tos == 0) ps = divZero; else push(pop() / tos); break; case PVM.rem: // integer division (remainder) tos = pop(); if (tos == 0) ps = divZero; else push(pop() % tos); break;
It is possible to use an intermediate long variable (but don't forget the casting operations or the abs function):
case PVM.mul: // integer multiplication tos = pop(); sos = pop(); long temp = (long) sos * (long) tos; if (Math.abs(temp) > maxInt) ps = badVal; else push(sos * tos); break;
This is straightforward, if a little tedious, and it is easy to leave some of the changes out and get a corrupted solution. The PVMAsm class requires modification in the switch statement that recognizes two-word opcodes:
case PVM.brn: // all require numeric address field ... case PVM.ldc: case PVM.ldl: // +++++++++++++++++ addition case PVM.stl: // +++++++++++++++++ addition codeLen = (codeLen + 1) % PVM.memSize; if (ch == '\n') // no field could be found error("Missing address", codeLen); else { // unpack it and store PVM.mem[codeLen] = src.readInt(); if (src.error()) error("Bad address", codeLen); } break;
The PVM class requires several additions. We must add to the enumeration of the machine opcodes:
public static final int // Machine opcodes ... ldl = 63, // +++++++++++++++++ additions stl = 64, lda_0 = 65, ...
We must add to the switch statement in the trace method (several submissions missed this):
static void trace(OutFile results, int pcNow) { switch (cpu.ir) { ... case PVM.ldl: // +++++++++++++++++ addition case PVM.stl: // +++++++++++++++++ addition } results.writeLine(); }
and we must provide case arms for all the new opcodes. A selection of these follows; the rest can be seen in the solution kit. Notice that for consistency all the "inBounds" checks should be performed on the new opcodes too (several submissions missed this).
case PVM.ldc_m1: // push constant -1 push(-1); break; case PVM.ldc_0: // push constant 0 push(0); break; case PVM.ldc_1: // push constant 1 push(1); break; ... case PVM.lda_0: // push local address 0 adr = cpu.fp - 1; if (inBounds(adr)) push(adr); break; case PVM.lda_1: // push local address 1 adr = cpu.fp - 2; if (inBounds(adr)) push(adr); break; ... case PVM.ldl: // push local value adr = cpu.fp - 1 - next(); if (inBounds(adr)) push(mem[adr]); break; case PVM.ldl_0: // push value of local variable 0 adr = cpu.fp - 1; if (inBounds(adr)) push(mem[adr]); break; case PVM.ldl_1: // push value of local variable 1 adr = cpu.fp - 2; if (inBounds(adr)) push(mem[adr]); break; ... case PVM.stl: // store local value adr = cpu.fp - 1 - next(); if (inBounds(adr)) mem[adr] = pop(); break; case PVM.stl_0: // pop to local variable 0 adr = cpu.fp - 1; if (inBounds(adr)) mem[adr] = pop(); break; case PVM.stl_1: // pop to local variable 1 adr = cpu.fp - 2; if (inBounds(adr)) mem[adr] = pop(); break;
We must add to the method that lists out the code (several submissions missed this). :
public static void listCode(String fileName, int codeLen) { ... case PVM.brn: case PVM.ldc: case PVM.ldl: // +++++++++++++++++ addition case PVM.stl: // +++++++++++++++++ addition i = (i + 1) % memSize; codeFile.write(mem[i]); break;
Finally we must add to the section that initializes the mnemonic lookup table:
public static void init() { ... mnemonics[PVM.ldl] = "LDL"; // +++++++++++++++++ additions mnemonics[PVM.stl] = "STL"; mnemonics[PVM.lda_0] = "LDA_0"; ...
To be able to deal with input and output of character data we need to add two new opcodes, modelled on the INPI and PRNI codes whose interpretation would be as below. These would require additions to the lists of opcodes in the assembler and interpreter similar to those just mentioned (full details can be found in the solution kit)
case PVM.inpc: // character input adr = pop(); if (inBounds(adr)) { mem[adr] = data.readChar(); if (data.error()) ps = badData; } break; case PVM.prnc: // character output if (tracing) results.write(padding); results.write((char) (Math.abs(pop()) % (maxChar + 1)), 1); if (tracing) results.writeLine(); break;
Note that the PRNC opcode outputs the character in a field width of 1, not 0 as many people tried. This has the effect that we can output characters without intervening spaces. Note also the way in which the value is forced "modulo 256" to become a valid ASCII value.
To build a really safe system there are further refinements we should make. It can be argued that we should not try to store a value outside ot the range 0 .. 255 into a character variable. This suggests that we should have a range of STO type instructions that check the value on the top of stack before assigning it. One of these - STOC to act as a variation on STO - would be interpreted as follows; we would need others to handle STLC, STLC_0 and so on.
case PVM.stoc: // character checked store tos = pop(); adr = pop(); if (inBounds(adr)) if (tos >= 0 && tos <= maxChar) mem[adr] = tos; else ps = badVal; break;
As an example of using the new input/output opcodes, here is the mystery program recoded for character data in considerably fewer operations. Many submissions only used some of the new opcodes, ignoring the STL ones, for example. Notice that we have had to hard-code 46 as the integer equivalent of character '.', of course.
0 DSP 5 ; v0 is nextChar, v1 is longestChar, 28 STL 4 ; currentCount = currentCount + 1; 2 LDA_0 ; v2 is longestCount 30 LDA_0 3 INPC ; read(nextChar) 31 INPC ; read(nextChar); 4 LDC_0 32 BRN 19 ; } 5 STL_2 ; longestCount = 0; 34 LDL 4 6 LDL_0 37 LDL_2 7 LDC 46 37 CGT 9 CNE 38 BZE 45 ; if (currentCount > longestCount) { 10 BZE 47 ; while (nextChar != '.') { 40 LDL_3 12 LDL_0 ; // v3 is currentChar 41 STL_1 longestChar = currentChar; 13 STL_3 ; currentInt = nextInt; 42 LDL 4 14 LDC_1 ; // v4 is currentCount 44 STL_2 ; longestCount = currentCount; 15 STL 4 ; currentCount = 1; 45 BRN 6 ; }} 17 LDA_0 47 PRNS "In the longest sequence " 18 INPC ; read(nextChar); 49 LDL_1 19 LDL_0 50 PRNC ; write(longestChar); 20 LDL_3 51 PRNS " appeared " 21 CEQ 53 LDL_2 22 BZE 34 ; while (nextChar == currentChar) { 54 PRNI ; write(longestCount); 24 LDL 4 55 PRNS " times" 26 LDC_1 57 HALT ; System.Exit(0); 27 ADD
Surprisingly, no. In the prac worksheet the suggestion was made that you study the original source to see that the original opcodes had been mapped onto the numbers 30 .. 62. This meant that you could map the new opcodes onto a set of numbers below 30, or above 62. In the prac solution kit you will find four versions of the interpreter in which this has been done.
The following table shows various timings obtained on the four systems for two encodings of the infamous Sieve of Eratosthenes, differing only in that one used the compact opcodes where possible. The behaviour is quite remarkable. The optimized opcode set resulted in the execution of about 30% fewer instructions over counts running into millions, and when the optimized opcodes were mapped onto "high" internal numbers the overall execution speed improved to about 87%. However, when mapped onto low numbers the code using the unoptimized opcode set took far longer to run, while that using the optimized opcode set slightly less time to run. Since the only difference in the source code of the interpreter was to be found in this numerical mapping, one is forced to conclude that the underlying implementation of the large switch statement plays a key role in the performance one can expect. Several submissions suggested that the differences could be explained away by the longer list of opcodes and the (relatively) slow lookup process that forms the basis of the opCode method in the PVM.java file (at least, that is what I think the authors were trying to say; some explanations were very badly expressed!). But this has nothing to do with it - that method is used by the assembly process when the source code is read in, and not at all by the interpretation/execution process when the program is "run".
In a really serious implementation of an interpreter it would be worth carrying out further experiments to determine the optimal mapping, based, for example, on benchmarks carried out on a variety of programs. (These timings were done fairly roughly on a stopwatch; one should really have run the simulations many times over and for higher numbers of iterations, but the effects show up readily enough.)
Few teams came up with any suggestions for how the interpreter could be improved still further. This can be done in various ways, for example by "inlining" the code that is currently executed by calls to the next, push and pop routines, and it was disappointing that nobody bothered to try this. Of course it means quite a lot of changes have to be made. The solution kits show this in detail.
1000 iterations, 1000 upper limit, times in seconds (Win XP, 3GHz machine) S1.PVM S2.PVM Opcode set Original Optimized High numbers 3.47 3.04 (88%) High numbers, checks removed 1.98 2.18 (110%) Low numbers 4.71 2.99 (63%) Low numbers, checks removed 3.32 2.00 (60%) Operations 92,442,039 64,889,031 (70%) 200 iterations, 4000 upper limit, times in seconds (Win XP, 3GHz machine) S1.PVM S2.PVM Opcode set Original Optimized High numbers 2.97 2.54 (86%) High numbers, checks removed 1.69 1.84 (109%) Low numbers 3.88 2.47 (64%) Low numbers, checks removed 2.78 1.67 (60%) Operations 77,121,239 54,092,231 (70%)
The "checks removed" figures were obtained using variations of the interpreter source in which all the checks that CPU.SP remained in bounds had been suppressed, as well as the calls to next, push and pop (their effect was achieved by "inlining" the equivalent code. One can see that an insistence on safety results in a considerable loss of run-time efficiency.
I ran the simulations again using C# implementations of the system - the source code is to all intents and purposes identical:
1000 iterations, 1000 upper limit, times in seconds (Win XP, 3GHz machine) S1.PVM S2.PVM Opcode set Original Optimized High numbers 2.85 2.06 (72%) High numbers, checks removed 2.03 0.95 (49%) Low numbers 2.80 1.97 (71%) Low numbers, checks removed 1.74 0.83 (48%) Operations 92,442,039 64,889,031 (70%) 200 iterations, 4000 upper limit, times in seconds (Win XP, 3GHz machine) S1.PVM S2.PVM Opcode set Original Optimized High numbers 2.35 1.79 (76%) High numbers, checks removed 1.70 0.81 (48%) Low numbers 2.37 1.67 (70%) Low numbers, checks removed 1.50 0.68 (45%) Operations 77,121,239 54,092,231 (70%)
Interestingly, the C# system is "faster" than the Java one, and there is less variation in timing between the "high" and "low" number mappings of the opcodes.
This example aimed to demonstrate the use of the Boolean opcodes. Here is a solution, also making use of the new opcodes (a solution using the original opcodes would have been acceptable, of course)
0 DSP 2 ; v0 is x, v1 is y 25 OR ; (X or Y) 2 PRNS " X Y (X.Y)\' X\'+Y\' (X+Y)\' X\'.Y\'\n\n" 26 NOT 4 LDC_0 27 PRNB ; write (!(X || Y)); 5 STL_0 ; X = false 28 LDL_0 6 LDC_0 ; repeat { 29 NOT ; (not X) 7 STL_1 ; Y = false 30 LDL_1 8 LDL_0 repeat { 31 NOT ; (not Y) 9 PRNB ; write (X); 32 AND 10 LDL_1 33 PRNB ; write (!X && !Y); 11 PRNB ; write (Y); 34 PRNS "\n" ; write("\n"); 12 LDL_0 36 LDL_1 13 LDL_1 37 NOT 14 AND ; (X and Y) 38 STL_1 ; Y = !Y; 15 NOT 39 LDL_1 16 PRNB ; write (! (X && Y)); 40 NOT 17 LDL_0 41 BZE 8 ; } until (!Y); 18 NOT ; (not X) 43 LDL_0 19 LDL_1 44 NOT 20 NOT ; (not Y) 45 STL_0 ; X = !X; 21 OR 46 LDL_0 22 PRNB ; write (!X || !Y); 47 NOT 23 LDL_0 48 BZE 6 ; } until (!X); 24 LDL_1 50 HALT ; System.exit(0);
A translation of SEARCH1.PAV using the new opcodes follows. One using the original opcodes appears in the solution kit.
0 DSP 3 ; v0 is item, v1 is i, v2 is list 42 LDL_2 2 LDC 11 43 LDL_1 4 ANEW 44 LDXA 5 STL_2 ; int[] list = new int[11] 45 LDV 6 LDC_1 46 LDL_0 7 STL_1 ; i = 1; 47 CNE ; while (list[i] != item) { 8 LDL_1 48 BZE 56 9 LDC 10 50 LDL_1 11 CLE ; while (i <= 10) { 51 LDC_1 12 BZE 24 52 SUB 14 LDL_2 53 STL_1 ; i = i = 1; 15 LDL_1 54 BRN 42 ; } 16 LDXA 56 LDL_1 17 INPI ; read(list[i]); 57 LDC_0 18 LDL_1 58 CEQ ; if (i == 0) 19 LDC_1 59 BZE 63 20 ADD 61 PRNS "Not found\n" 21 STL_1 ; i = i + 1; 63 LDL_1 22 BRN 8 ; } 64 LDC_0 24 PRNS "Search for (99 stops)? " 65 CNE ; if (i != 0) { 26 LDA_0 66 BZE 74 27 INPI ; read(item) 68 PRNS "Found in position" 28 LDL_0 70 LDL_1 29 LDC 99 71 PRNI ; write(i); 31 CNE ; while (item != 99) { 72 PRNS "\n" ; write("\n"); 32 BZE 78 74 LDA_0 ; } 34 LDL_2 75 INPI ; read(item); 35 LDC_0 76 BRN 28 ; } 36 LDXA 78 HALT 37 LDL_0 38 STO ; list[0] = item; 39 LDC 10 41 STL_1 ; i = 10;
As always, a shorter solution still could have been found. For example, the sequence LDC_0 LDXA at 35-36 could have been omitted (can you see why?).