(1) The grammar below attempts to describe your hard-working lifestyle over the last few weeks.
COMPILER Prac $CN /* Describe progress of a compiler practical */ CHARACTERS letter = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz" . TOKENS name = letter { letter } . IGNORE CHR(0) .. CHR(31) PRODUCTIONS Prac = Handouts { Task } "Submit" . Handouts = "PracSheet" { "HintSheet" } . Task = "Attempt" { ConsultTutor "Attempt" } . ConsultTutor = name . END Prac.
As you have complained, you sometimes have trouble finding tutors. As you may not know, we are going to have problem paying them, because they have recently told me that they want to be paid a fee for each occasion they are consulted. So what I need now is a list of tutors with a count of such occasions for each one. What must I add to the grammar to produce me such a list?
Typical output might be as follows (it does not have to be in alphabetic order):
Kevin 12 Ingrid 5 Nyasha 8 Pat 45
The first solution simply makes use of a static list structure, and avoids parameter passing:
import java.util.*; import Library.*; COMPILER Prac $CN /* Determine how often each tutor has been consulted during a practical using a static list P.D. Terry, Rhodes University, 2005 */ static class Tutor { public String name; public int count; public Tutor(String name, int count) { this.name = name; this.count = count; } public void writeLine() { IO.write(name, -14); IO.writeLine(count); } } // Tutor static ArrayList list = new ArrayList(); // global for simplicity here! static int position(String name) { // Finds position of entry with search key name in list, or -1 if not found int i = list.size() - 1; // index of last entry while (i >= 0 && // short-circuit protection !name.equals(((Tutor)list.get(i)).name)) // must cast before extracting field i--; // will reach -1 if no match return i; } CHARACTERS letter = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz" . TOKENS name = letter { letter } . IGNORE CHR(0) .. CHR(31) PRODUCTIONS Prac = Handouts { Task } "Submit" (. for (int i = 0; i < list.size(); i++) ((Tutor)list.get(i)).writeLine(); .) . Handouts = "PracSheet" { "HintSheet" } . Task = "Attempt" { ConsultTutor "Attempt" } . ConsultTutor = name (. int pos = position(token.val); if (pos >= 0) ((Tutor)(list.get(pos))).count++; else list.add(new Tutor(token.val, 1)); .) . END Prac.
The second solution declares the list structure within the method for the goal symbol, necessitating that this list is then passed to all sub parsers. This is really over kill for a problem of this sort. Since an "object" is being passed we really pass (by value) a reference to the list, which means that the sub parsers have access to the elements of the list and can manipulate them (without altering the reference itself):
COMPILER Prac1 $CN /* Determine how often each tutor has been consulted during a practical using paramterized methods P.D. Terry, Rhodes University, 2005 */ static class Tutor { ... } // Tutor static int position(ArrayList list, String name) { // Finds position of entry with search key name in list, or -1 if not found int i = list.size() - 1; // index of last entry while (i >= 0 && // short-circuit protection !name.equals(((Tutor)list.get(i)).name)) // must cast before extracting field i--; // will reach -1 if no match return i; } CHARACTERS letter = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz" . TOKENS name = letter { letter } . IGNORE CHR(0) .. CHR(31) PRODUCTIONS Prac1 (. ArrayList list = new ArrayList(); .) = Handouts { Task<list> } "Submit" (. for (int i = 0; i < list.size(); i++) ((Tutor)list.get(i)).writeLine(); .) . Handouts = "PracSheet" { "HintSheet" } . Task<ArrayList list> = "Attempt" { ConsultTutor<list> "Attempt" } . ConsultTutor<ArrayList list> = name (. int pos = position(list, token.val); if (pos >= 0) ((Tutor)(list.get(pos))).count++; else list.add(new Tutor(token.val, 1)); .) . END Prac1.
(2) The Cocol grammar below describes a sequence of Regular Expressions (written one to a line) using the conventions discussed during the course.
COMPILER RE $CN /* Regular expression grammar */ CHARACTERS control = CHR(0) .. CHR(31) . noquote1 = ANY - control - "'" . noquote2 = ANY - control - '"' . meta = "()*|[]-?+" . lf = CHR(10) . simple = ANY - control - "'" - '"' - meta . TOKENS atomic = simple . escaped = "'" noquote1 "'" | '"' noquote2 '"' . EOL = lf . IGNORE control - lf PRODUCTIONS RE = { Expression EOL } EOF . Expression = Term { "|" Term } . Term = Factor { Factor } . Factor = Element [ "*" | "?" | "+" ] . Element = Atom | Range | "(" Expression ")" . Range = "[" OneRange { OneRange } "]" . OneRange = Atom [ "-" Atom ] . Atom = atomic | escaped . END RE.
How would you add appropriate actions and attributes so that you could generate a program that would parse a sequence of regular expressions and report on the alphabets used in each one. Input like
a | b c d | ( x y z )* [a-g A-G] [x - z]? a? "'" z+
the output should be something like
Alphabet = a b c d x y z Alphabet = A B C D E F G a b c d e f g x y z Alphabet = ' a z
The first solution simply makes use of a static array of boolean flags, and avoids parameter passing:
import Library.*; COMPILER RE $CN /* Regular expression grammar - using a static array to determine alphabet P.D. Terry, Rhodes University, 2005 */ static boolean[] alphabet = new boolean[256]; CHARACTERS control = CHR(0) .. CHR(31) . noquote1 = ANY - control - "'" . noquote2 = ANY - control - '"' . meta = "()*|[]-?+" . lf = CHR(10) . simple = ANY - control - "'" - '"' - meta . TOKENS atomic = simple . escaped = "'" noquote1 "'" | '"' noquote2 '"' . EOL = lf . IGNORE control - lf PRODUCTIONS RE (. int i; .) = { (. for (i = 1; i < 256; i++) alphabet[i] = false; .) Expression EOL (. IO.write("Alphabet = "); for (i = 1; i < 256; i++) if (alphabet[i]) IO.write( (char) i + " "); IO.writeLine(); .) } EOF . Expression = Term { "|" Term } . Term = Factor { Factor } . Factor = Element [ "*" | "?" | "+" ] . Element (. char ch; .) = Atom<out ch> (. alphabet[ch] = true; .) | Range | "(" Expression ")" . Range = "[" OneRange { OneRange } "]" . OneRange (. char ch, ch1, ch2; .) = Atom<out ch1> (. alphabet[ch1] = true; .) [ "-" Atom<out ch2> (. if (ch2 < ch1) SemError("invalid range"); for (ch = ch1; ch <= ch2; ch++) alphabet[ch] = true; .) ] . Atom<out char ch> (. ch = 0; .) = ( atomic (. ch = token.val.charAt(0); .) | escaped (. ch = token.val.charAt(1); .) ) . END RE.
The second solution declares the data structure within the method for the goal symbol, necessitating that this list is then passed to all sub parsers. This is really over kill for a problem of this sort. Since an "object" is being passed we really pass (by value) a reference to the array, which means that the sub parsers have access to the elements of the list and can manipulate them (without altering the reference itself):
import Library.*; COMPILER RE1 $CN /* Regular expression grammar - using parameters to determine alphabet P.D. Terry, Rhodes University, 2005 */ ... PRODUCTIONS RE1 (. boolean[] alphabet = new boolean[256]; int i; .) = { (. for (i = 1; i < 256; i++) alphabet[i] = false; .) Expression<alphabet> EOL (. IO.write("Alphabet = "); for (i = 1; i < 256; i++) if (alphabet[i]) IO.write( (char) i + " "); IO.writeLine(); .) } EOF . Expression<boolean[] alphabet> = Term<alphabet> { "|" Term<alphabet> } . Term<boolean[] alphabet> = Factor<alphabet> { Factor<alphabet> } . Factor<boolean[] alphabet> = Element<alphabet> [ "*" | "?" | "+" ] . Element<boolean[] alphabet> (. char ch; .) = Atom<out ch> (. alphabet[ch] = true; .) | Range<alphabet> | "(" Expression<alphabet> ")" . Range<boolean[] alphabet> = "[" OneRange<alphabet> { OneRange<alphabet> } "]" . OneRange<boolean[] alphabet> (. char ch, ch1, ch2; .) = Atom<out ch1> (. alphabet[ch1] = true; .) [ "-" Atom<out ch2> (. if (ch2 < ch1) SemError("invalid range"); for (ch = ch1; ch <= ch2; ch++) alphabet[ch] = true; .) ] . Atom<out char ch> (. ch = 0; .) = ( atomic (. ch = token.val.charAt(0); .) | escaped (. ch = token.val.charAt(1); .) ) . END RE1.