Computer Science 3 - 2001 - Test on Prac 20

THURSDAY TEST

In the last week you have (or should have) developed a pretty-printer for
CS301 using Coco and either C++ or Pascal.  Describe the process of developing
and then using this best-selling piece of software by drawing me some T
diagrams.

This was very badly done, which was a great disappointment, since we have
presented numerous examples of T diagrams.  Many people don't seem to realise
that using Coco/R is a two stage process - CR produces C++ not EXE files.  The
C++ code that it produces still has to be compiled by the C++ compiler.  Come
on - you can't really tell me you haven't noticed!

Using Coco/R to develop the sources:

      ----------------------------          ----------------------------
      |        PRETTY.ATG        |          |        PRETTY.CPP        |
      |  CLN                CLN  |          |  CLN                 CLN |
      |                          |          |                          |
      ---------          ----------------------------          ---------
              |          |           CR.EXE         |          |
              |    ATG   |   ATG  -------->     C++ |     C++  |
              |          |                          |          |
              --------------------          --------------------
                                 |          |
                                 |   8086   |
                                 |          |
                                 ------------
                                 |          |
                                 |   8086   |
                                 |          |
                                 ------------

Using the C++ compiler to compile those sources:

      ----------------------------          ----------------------------
      |        Pretty.CPP        |          |         PRETTY.EXE       |
      |  CLN                CLN  |          |  CLN                 CLN |
      |                          |          |                          |
      ---------          ----------------------------          ---------
              |          |          BCC.EXE         |          |
              |    C++   |   C++  -------->    EXE  |    EXE   |
              |          |    (used as compiler)    |          |
              --------------------          --------------------
                                 |          |
                                 |   8086   |
                                 |          |
                                 ------------
                                 |          |
                                 |   8086   |
                                 |          |
                                 ------------

Using the executable to beautify programs:

      ----------------------------          ----------------------------
      |        Messy.CLN         |          |         Tidy.CLN         |
      |                          |          |                          |
      |                          |          |                          |
      ---------          ----------------------------          ---------
              |          |        PRETTY.EXE        |          |
              |    CLN   |   CLN  -------->    CLN  |    CLN   |
              |          |                          |          |
              --------------------          --------------------
                                 |          |
                                 |   8086   |
                                 |          |
                                 ------------
                                 |          |
                                 |   8086   |
                                 |          |
                                 ------------



Noise, noise, noise, how I hate noise! In Pascal and CS301 you have to write statements with keywords THEN and DO, as in if Expression then Statement; while Expression do Statement; but in C++ you don't need THEN and DO, but you do need parentheses; as in if (Expression) statement; while (Expression) statement; Couldn't we just scrap THEN and DO from CS301, and the parentheses from C++, and allow people to leave off the parentheses in C++, so that both then looked the same if Expression Statement; while Expression Statement; In other words, is that noise really necessary? In either language? In one but not the other? Strictly it is not necessary in either language. Expression is not "nullable", so arguments like "FIRST(Expression) and FIRST(Statement) have a non-empty intersection" (which some students put forward) are rather specious. However, the noise really does help to prevent really silly typographical errors from getting through unobserved, for that reason alone it would be useful. But (as they say in the Verimark advertisements) that's not all! For example, consider the following True C++: if (a) +b; In C++ statements can be expressions (terminated with a semicolon). The "statement" +b; does not have any real semantic meaning, but it is quite valid (it might do something like push the value of +b onto the working stack and then pop it off again, or store it in a machine register and then ignore it). If one omitted the parentheses one would get Nue C++: if a +b; which would be syntactically correct, but would imply that the "statement" was an empty statement (which in C++ is represented by a semicolon alone) - the same effect as if we had written Nue C++: if (a+b) /*empty*/ ; I cannot think of a Pascal/CS301/Modula example that would be pseudo-ambiguous if one omitted the parentheses, as in those languages (which distinguish clearly between expressions and statements) one cannot start a statement with an operator. Life is too short to waste it fighting those sort of "gotchas". It's bad enough that in C++ or Pascal one can write empty loops already: while (a > b) do /*empty*/ ;
My generosity knows no bounds - I have attached a copy of a grammar for the unextended CS301 language, which should be familiar. In spite of my efforts, I have the impression that you think CS301 is greatly inferior to C++. Someone complained, because in C++ you can declare and initialize variables at any point where it is convenient to do so, rather than putting all the declarations up front. No great problem. Suppose you wanted to be able to write a program in the form on the right rather than the one on the left: PROGRAM TestsAreSuchFun; PROGRAM AndTheyGetBetter; CONST BEGIN Tax = 10; CONST VAR Tax = 10; Item, Target, Total; VAR Item; BEGIN Read(Item); Read(Item); VAR Target := Item; Target := Item, Total; Total := Target + Tax; Total := Target + Tax; END. END. How would you alter the CS301 grammar to be able to do this? Does your grammar allow both versions of the program to be acceptable? (There is no need to write out all the grammar; simply write out those (few) productions that need alteration.) Most people got the basic idea - one essentially has to allow the declarations to move to the status of being statements. So I was looking for something like this: CS301 = "PROGRAM" identifier ";" StatementSequence "." . StatementSequence = Statement { ";" Statement } . Statement = [ CompoundStatement | Assignment | IfStatement | WhileStatement | ReadStatement | WriteStatement | ConstDeclarations | VarDeclarations ] with an extended OneVar to allow for initialization: OneVar = identifier [ UpperBound ] [ ":=" Expression ] . or, perhaps more realistically: OneVar = identifier [ UpperBound | ":=" Expression ] . since the first syntax would not really handle the semantics of a statement like VAR x[12] = 34; very well (does it declare an array and initialize all values to 34? Only initialize the 12th element to 34?). There is a subtlety that is easily missed - one has to change the syntax of the declarations slightly to get the semicolons into the correct places: CS301 = "PROGRAM" identifier ";" StatementSequence "." . StatementSequence = Statement { ";" Statement } . Statement = [ CompoundStatement | Assignment | IfStatement | WhileStatement | ReadStatement | WriteStatement | ConstDeclarations | VarDeclarations ] ConstDeclarations = "CONST" OneConst { ";" OneConst } . OneConst = identifier "=" number . VarDeclarations = "VAR" OneVar { "," OneVar } . OneVar = identifier [ UpperBound | ":=" Expression ] .
FRIDAY TEST In the last week you have (or should have) developed a pretty-printer for CS301-2 starting from a Cocol grammar for a pretty-printer for CS301-1. Suppose we wanted to develop a system that would convert CS301-1 programs to their equivalent CS301-2 versions. This would involve such conversions as are shown below CS301-1 version CS301-2 version PROGRAM Silly; PROGRAM Silly; BEGIN BEGIN IF A > B THEN C := D; IF A > B THEN WHILE D > 0 DO C := D; BEGIN END; D := D - 1; WHILE D > 0 DO Write(D) D := D - 1; END; Write(D) END. END; END. in particular, ensuring that IF and WHILE statements terminated with their own ENDs, and removing redundant BEGINs. Part of the pretty printer grammar is shown below. Ignore the problems associated with making the output "pretty" (in the sense of getting the indentation correct), but show what other changes would have to be made to get the conversion system to behave correctly. Block = { ( ConstDeclarations | VarDeclarations ) } CompoundStatement . CompoundStatement = "BEGIN" (. Append("BEGIN"); IndentNewLine(); .) Statement { WEAK ";" (. Append(";"); NewLine(); .) Statement } "END" (. ExdentNewLine(); Append("END"); .) . IfStatement = "IF" (. Append("IF "); .) Condition "THEN" (. IndentNewLine(); Append("THEN "); .) Statement (. Exdent(); .) . WhileStatement = "WHILE" (. Append("WHILE "); .) Condition "DO" (. Append(" DO "); .) Statement . Many people missed an important point - the system has to be able to recognise CS301-1 code but create CS301-2 code. This means that we cannot alter the fundamental grammar - what we do is alter the actions to change the places where BEGIN and END are output. It is really quite easy: For Block we insert the essential initial and final BEGIN and END brackets: Block = { ( ConstDeclarations | VarDeclarations ) } (. Append("BEGIN"); .) CompoundStatement (. Append("END""); .) . We downgrade Compound Statement: CompoundStatement = "BEGIN" /* do not append BEGIN */ Statement { WEAK ";" Statement } "END" /* do not append END */ and we modify Ifstatement and WhileStatement to force the trailing END in each case: IfStatement = "IF" (. Append("IF "); .) Condition "THEN" (. Append("THEN "); .) Statement (. Append("END"); .) . WhileStatement = "WHILE" (. Append("WHILE "); .) Condition "DO" (. Append(" DO "); .) Statement (. Append("END"); .) .
My generosity knows no bounds - I have attached a copy of a grammar for the unextended CS301 language, which should be familiar. As you should know, if you write a program like the following in CS301 it should be rejected as invalid PROGRAM Reject; VAR List1[10], List2[9], Item; BEGIN List1 := List2; Item := List1[12]; List2 := Item; END. How and where, if at all, do the three assignments above conflict with the grammar you have for CS301? If they do not conflict, how and where does the system detect the errors? The question related to the assignment statements List1 := List2; Item := List1[12]; List2 := Item; These are ALL syntactically correct; they do not conflict with the grammar. The first and last are "statically semantically" incorrect. This could be detected by the compiler, which would have to look up the properties of the identifiers in the symbol table, and perform tests on these which would fail. The middle assignment is "dynamically semantically" incorrect. The subscript value is out of range. A sophisticated compiler might pick this up at compile time, otherwise a cautious run time system might detect the error. Of course, if one were using a C++ approach, one would probably never learn till it was too late ...
A student, set the problem of extending the CS301 system to allow a program like the following PROGRAM TestsAreSuchFun; CONST Max = 10; Min = -10; VAR Item; BEGIN Read(Item); IF Item >= Min THEN IF Item <= Max THEN Write('in range'); END. hit on what he thought was the clever idea of changing part of the Cocol specification to read CHARACTERS digit = "+-0123456789" . TOKENS number = digit { digit } . PRODUCTIONS ConstDeclarations = "CONST" OneConst { OneConst } . OneConst = identifier "=" number ";" . His prac partner recoiled in horror. Why did she do this, and what did she propose as a better solution? To my delight, almost everyone got this correct. The "clever idea" leads to the not-so-clever ability to recognize garbage like "34+7--9" as a valid number. The fix is, of course: CHARACTERS digit = "0123456789" . TOKENS number = [ "+" | "-" ] digit { digit } . or, perhaps CHARACTERS digit = "0123456789" . TOKENS number = digit { digit } . PRODUCTIONS ConstDeclarations = "CONST" OneConst { OneConst } . OneConst = identifier "=" [ "+" | "-" ] number ";" .
COMPILER CS301 $XCN /* generate compiler, C++ */ IGNORE CASE IGNORE CHR(9) .. CHR(13) COMMENTS FROM "{" TO "}" CHARACTERS cr = CHR(13) . lf = CHR(10) . letter = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz" . digit = "0123456789" . instring = ANY - "'" - cr - lf - CHR(0) . TOKENS identifier = letter { letter | digit } . number = digit { digit } . string = "'" (instring | "''") { instring | "''" } "'" . PRODUCTIONS CS301 = "PROGRAM" identifier ";" Block "." . Block = { ConstDeclarations | VarDeclarations } CompoundStatement . ConstDeclarations = "CONST" OneConst { OneConst } . OneConst = identifier "=" number ";" . VarDeclarations = ( "INT" | "BOOL" ) OneVar { "," OneVar } ";" . OneVar = identifier [ UpperBound ] . UpperBound = "[" number "]" . CompoundStatement = "BEGIN" Statement { ";" Statement } "END" . Statement = [ CompoundStatement | Assignment | ReturnStatement | IfStatement | WhileStatement | ReadStatement | WriteStatement ] . Assignment = Variable ":=" Expression . Variable = Designator . Designator = identifier [ "[" Expression "]" ] . ReturnStatement = "RETURN" . IfStatement = "IF" Condition "THEN" Statement . WhileStatement = "WHILE" Condition "DO" Statement . Condition = Expression . ReadStatement = "READ" "(" Variable { "," Variable } ")" . WriteStatement = "WRITE" [ "(" WriteElement { "," WriteElement } ")" ] . WriteElement = string | Expression . Expression = AndExp { "OR" AndExp } . AndExp = RelExp { "AND" RelExp } . RelExp = AddExp [ RelOp AddExp ] . AddExp = MultExp { AddOp MultExp } . MultExp = UnaryExp { MulOp UnaryExp } . UnaryExp = Factor | UnaryOp UnaryExp . Factor = Designator | number | "TRUE" | "FALSE" | "(" Expression ")" . UnaryOp = "+" | "-" | "NOT" . AddOp = "+" | "-" . MulOp = "*" | "/" . RelOp = "=" | "%lt;>" | "%lt;" | "%lt;=" | ">" | ">=" . END CS301.