Computer Science 3 - 2001 - Test on Prac 19

(You may not use your textbooks this time.  There are questions on both sides
of the page.  Answer on the question sheet.)

THURSDAY TEST

Are these two Cocol specifications equivalent?  If not, give an example of
input which the one would accept that the other would reject, and explain your
reasoning.

  COMPILER Sample   (* one *)
    CHARACTERS
      letter = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz" .
    TOKENS
      ident = letter { letter } .
    PRODUCTIONS
      Sample = "BEGIN" ident ":=" ident "END" .
  END Sample.

  COMPILER Sample   (* two *)
    CHARACTERS
      letter = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz" .
    TOKENS
      Letter = letter .
    PRODUCTIONS
      Sample = "BEGIN" ident ":=" ident "END" .
      ident  = Letter { Letter } .
  END Sample.

They are not equivalent.  The second one would allow input like

    BEGIN G A P := ANameWith Spaces END

but the first would only allow input like

    BEGIN GAP := ANameWithNoSpaces END



The following Cocol description is of a set of letters in envelopes ready to take to the post office. COMPILER Mail $XCN /* Describe simple set of mailing envelopes */ IGNORE CHR(1) .. CHR(13) CHARACTERS eol = CHR(13) . digit = "0123456789" . inaddress = ANY - CHR(0) - '$:' - eol . sp = CHR(32) . TOKENS number = "postcode:" sp digit { digit } . info = inaddress { inaddress } . PRODUCTIONS Mail = { Envelope } EOF . Envelope = Stamp { Stamp } Person Address . Stamp = "$1" | "$2" | "$3" /* values of stamps implied */ . Address = Street Town [ PostCode ] . Person = info . Street = info . Town = info . PostCode = number . END Mail. What would you have to add to this grammar so that the parser system could tell you (a) the total value of the stamps on all the envelopes (b) the names of all people whose addresses did not contain postcodes? This application is so simple that we can get away with having a few global variables accessible to all routines in the parser object. COMPILER Mail $XCN /* Describe simple set of mailing envelopes */ /* This does it all with "global" variables for simplicity */ #include <stdio.h> int cost = 0; char name[1000]; IGNORE CHR(1) .. CHR(13) CHARACTERS eol = CHR(13) . digit = "0123456789" . inaddress = ANY - CHR(0) - '$:' - eol . sp = CHR(32). TOKENS number = "postcode:" sp digit { digit } . info = inaddress { inaddress } . PRODUCTIONS Mail = { Envelope } EOF (. printf("Total cost %d\n", cost); .) . Envelope = Stamp { Stamp } Person Address . Stamp = "$1" (. cost += 1; .) | "$2" (. cost += 2; .) | "$3" (. cost += 3; .) . Address = Street Town ( PostCode | (. printf("%s has no postcode\n", name); .) ) . Person = info (. LexString(name, 100); .) . Street = info . Town = info . PostCode = number . END Mail. Here is a solution that shows how we would pass parameters. In general it is better for parsing routines in (possibly highly recursive) parsers to communicate by parameter, rather than relying on global variables which might be abused by other routines at inconvenient times! COMPILER Mail $XCN /* Describe simple set of mailing envelopes */ /* This shows a solution that uses parameters */ #include <stdio.h> IGNORE CHR(1) .. CHR(13) CHARACTERS eol = CHR(13) . digit = "0123456789" . inaddress = ANY - CHR(0) - '$:' - eol . sp = CHR(32). TOKENS number = "postcode:" sp digit { digit } . info = inaddress { inaddress } . PRODUCTIONS Mail = (. int cost = 0; .) { Envelope<cost> } EOF (. printf("Total cost %d\n", cost); .) . Envelope<int &cost> = (. char name[100]; bool hasCode; int value; .) Stamp<value> (. cost += value; .) { Stamp<value> (. cost += value; .) } Person<name, sizeof(name)> Address<hasCode> (. if (!hasCode) printf("%s has no postcode\n", name); .) . Stamp<int &faceValue> = "$1" (. faceValue = 1; .) | "$2" (. faceValue = 2; .) | "$3" (. faceValue = 3; .) . Address<bool &hasCode> = Street Town (. hasCode = false; .) [ PostCode (. hasCode = true; .) ] . Person<char * name, int length> = info (. LexString(name, length - 1); .) . Street = info . Town = info . PostCode = number . END Mail.
FRIDAY TEST Would the following attempt to write a Cocol specification be acceptable to Coco/R? If not, why not? If so, give an example of the sort of input that would be acceptable to the program that Coco/R would generate. COMPILER Sample CHARACTERS letter = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz" . digit = "0123456789" . TOKENS ident = letter { letter | digit } digit . integer = digit { digit } . float = digit { digit } [ "." digit { digit } ] . PRODUCTIONS Sample = { ident | integer | digit } EOF. END Sample. It is unacceptable for at least two reasons: - the rules for integer and for float allow tokens that cannot be distinguished - a situation that would not arise if one insisted on the decimal point in a float, as for example float = digit { digit } "." digit { digit }. - Sample has attempted either to use the name of a character set (digit) as a token, or has intended it to be a new nonterminal (of the same name, allowable, but rather misleading) that has not been defined. There is another twist to this which one person saw, to my delight. The rule for ident is non-LL(1), and in fact ambiguous (digit can start and follow the deletable structure { letter | digit }. When setting and trying the question I had thought that Coco would object, but in fact it does not, and builds a perfectly good scanner. Being cruel, I left the rule as above just to see if anyone would spot the potential problem! The Scanner generator part of Coco/R uses a system that is different from the Parser generator part. However, one should not try to produce kinky TOKENS definitions like this - stick to the idea of using LL(1) conformant rules and you will find life is much easier.
You thought you had heard the last of those bagpipes, didn't you? Wrong again! The grammar below describes a medley event at a Highland Gathering: COMPILER Event $XCN /* Describe simple pipe band competition event */ IGNORE CASE IGNORE CHR(1) .. CHR(31) CHARACTERS letter = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz_". TOKENS NameOfBand = letter { letter } . PRODUCTIONS Event = Band { Band } . Band = NameOfBand Medley . Medley = OneTune { OneTune } "break" . OneTune = "March" | "SlowMarch" | "Jig" | "HornPipe" | "Reel" | "Strathspey" { "Strathspey" } "Reel" . END Event. This application is so simple that we can get away with having a few global variables accessible to all routines in the parser object. COMPILER Event $XCN /* Describe simple pipe band competition event */ /* This demonstrates the use of global variables only */ #include <stdio.h> #include <string.h> int maxTunes = 0, tunes = 0, bands = 0; char maxName[1000], bandName[1000]; IGNORE CASE IGNORE CHR(1) .. CHR(31) CHARACTERS letter = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz_". TOKENS NameOfBand = letter { letter } . PRODUCTIONS Event = Band { Band (. if (tunes > maxTunes) { maxTunes = tunes; strcpy(maxName, bandName); } .) } (. printf("%d bands played. " "Longest selection played by %s\n", bands, maxName); .) . Band = (. bands++; tunes = 0; .) NameOfBand (. LexString(bandName, 100); .) Medley . Medley = OneTune { OneTune } "break" . OneTune = ( "March" | "SlowMarch" | "Jig" | "Hornpipe" | "Reel" ) (. tunes++; .) | "Strathspey" { "Strathspey" (. tunes++; .) } "Reel" (. tunes += 2; .) . END Event. Here is a solution that shows how we would pass parameters. In general it is better for parsing routines in (possibly highly recursive) parsers to communicate by parameter, rather than relying on global variables which might be abused by other routines at inconvenient times! COMPILER Event $XCN /* Describe simple pipe band competition event */ /* This demonstrates the use of parameter passing */ #include <stdio.h> #include <string.h> IGNORE CASE IGNORE CHR(1) .. CHR(31) CHARACTERS letter = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz_". TOKENS NameOfBand = letter { letter } . PRODUCTIONS Event = (. int maxTunes = 0, tunes, bands; char maxName[1000], bandName[1000]; .) Band<bandName, tunes> (. bands = 1; .) { Band<bandName, tunes> (. bands++; if (tunes > maxTunes) { maxTunes = tunes; strcpy(maxName, bandName); } .) } (. printf("%d bands played. " "Longest selection played by %s\n", bands, maxName); .) . Band<char *bandName, int &tunes> = NameOfBand (. LexString(bandName, 1000); .) Medley<tunes> . Medley<int &tunes> = (. int count; tunes = 0; .) OneTune<count> (. tunes += count; .) { OneTune<count> (. tunes += count; .) } "break" . OneTune<int &count> = ( "Hornpipe" | "SlowMarch" | "March" | "Jig" | "Reel" ) (. count = 1; .) | "Strathspey" (. count = 1; .) { "Strathspey" (. count++; .) } "Reel" (. count++; .) . END Event.