Checking EBNF productions CSC 301 2013/10/10
The problem is to analyse a set of EBNF productions to check that all non-terminals are defined
exactly once, and that all non-terminals are "reachable".
The grammar describing a set of EBNF productions should be familiar:
  COMPILER EBNF $NC
  /* Simple checks on EBNF productions */

  CHARACTERS
    letter   = "ABCDEFGHIJKLMNOPQRSTUVWXYZ" +
               "abcdefghijklmnopqrstuvwxyz" .
    lowline  = "_" .
    digit    = "0123456789" .
    control  = CHR(0) .. CHR(31) .
    noquote1 = ANY - control - "'" .
    noquote2 = ANY - control - '"' .

  IGNORE CHR(9) .. CHR(13)
  COMMENTS FROM "(*" TO "*)" NESTED

  TOKENS
    nonterminal = letter { letter | lowline | digit } .
    terminal    = "'" noquote1 { noquote1 } "'"
                | '"' noquote2 { noquote2 } '"' .

  PRODUCTIONS
    EBNF
    = { Production } EOF .

    Production
    = SYNC nonterminal
      WEAK "=" Expression SYNC "." .

    Expression = Term { WEAK "|" Term } .

    Term = [ Factor { Factor } ] .

    Factor
    =   nonterminal
      | terminal
      | "[" Expression "]"
      | "(" Expression ")"
      | "{" Expression "}" .

  END EBNF.
We can solve the problem by building a symbol table as we "pass" over the productions, and then
making a second "pass" over the table, once all the productions have been read, to check the
semantic constraints. The table must record the names of the non-terminals, together with counts
of how many times each name has appeared on the left side of a production (should be exactly once)
and how many times it has appeared on the right side of any production (should be non-zero).
For example for a set of silly productions like
  Goal = One Two Three "stop" .
  One  = "Go" Two Two "relax" .
  Two  = "drink" { "drink" } "pay" .
  Thre = "payTheRent" One .          // error
the table would look like this:
  name    left  right
  Goal      1     0
  One       1     2
  Two       1     3
  Three     0     1
  Thre      1     0
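The tallying for these silly productions can be replayed by hand in a small stand-alone C# program. This is only a sketch: the names Tally and TallyDemo are hypothetical stand-ins, not part of the handout's code, and the sequence of Update calls is written out manually in the order a parser would meet the non-terminals, rather than being driven by a Coco/R-generated parser.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for a table entry: a name plus two counters.
class Tally {
  public string Name;
  public int Left, Right;
  public Tally(string name) { Name = name; }
}

static class TallyDemo {
  public const int LHS = 0, RHS = 1;
  static readonly List<Tally> list = new List<Tally>();

  // Linear search for the name; append a fresh entry if it is not found,
  // then bump the counter for the side on which the name appeared.
  public static void Update(string name, int where) {
    int i = 0;
    while (i < list.Count && name != list[i].Name) i++;
    if (i == list.Count) list.Add(new Tally(name));
    if (where == LHS) list[i].Left++; else list[i].Right++;
  }

  // Replays the updates for the four sample productions and returns
  // the resulting table as "name left right" rows.
  public static List<string> Run() {
    list.Clear();
    // Goal = One Two Three "stop" .
    Update("Goal", LHS); Update("One", RHS); Update("Two", RHS); Update("Three", RHS);
    // One = "Go" Two Two "relax" .
    Update("One", LHS); Update("Two", RHS); Update("Two", RHS);
    // Two = "drink" { "drink" } "pay" .
    Update("Two", LHS);
    // Thre = "payTheRent" One .
    Update("Thre", LHS); Update("One", RHS);

    var rows = new List<string>();
    foreach (Tally t in list)
      rows.Add(t.Name + " " + t.Left + " " + t.Right);
    return rows;
  }

  static void Main() {
    foreach (string row in Run()) Console.WriteLine(row);
  }
}
```

Entries come out in order of first appearance, so Goal is always the first row. Three ends with a left count of 0 (it is used but never defined), while the misspelt Thre ends with a right count of 0 (it is defined but never used).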
The attributed grammar would look like this:

  COMPILER EBNF $NC
  /* Simple checks on EBNF productions */

  CHARACTERS
    letter   = "ABCDEFGHIJKLMNOPQRSTUVWXYZ" +
               "abcdefghijklmnopqrstuvwxyz" .
    lowline  = "_" .
    digit    = "0123456789" .
    control  = CHR(0) .. CHR(31) .
    noquote1 = ANY - control - "'" .
    noquote2 = ANY - control - '"' .

  IGNORE CHR(9) .. CHR(13)
  COMMENTS FROM "(*" TO "*)" NESTED

  TOKENS
    nonterminal = letter { letter | lowline | digit } .
    terminal    = "'" noquote1 { noquote1 } "'"
                | '"' noquote2 { noquote2 } '"' .

  PRODUCTIONS
    EBNF
    = { Production } EOF           (. if (Successful())
                                        Table.TestProductions(); .) .

    Production
    = SYNC nonterminal             (. Table.Update(token.val, Table.LHS); .)
      WEAK "=" Expression SYNC "." .

    Expression = Term { WEAK "|" Term } .

    Term = [ Factor { Factor } ] .

    Factor
    =   nonterminal                (. Table.Update(token.val, Table.RHS); .)
      | terminal
      | "[" Expression "]"
      | "(" Expression ")"
      | "{" Expression "}" .

  END EBNF.
We need to construct a symbol table of entries, each of which looks like this:

  class Entry {
    public string name;
    public int left, right;
    public Entry(string name) {
      this.name = name; left = 0; right = 0;
    }
  } // Entry
The Table handler (and semantic checker) might look like this in a simple implementation:

  using System.Collections.Generic;   // for List<Entry>

  class Table {
    public const int
      LHS = 0, RHS = 1;
    static List<Entry> list = new List<Entry>();

    public static void Update(string name, int where) {
    // Updates a non-terminal entry in the table, according as to
    // where in a production it was found
      int i = 0;
      while (i < list.Count && !name.Equals(list[i].name))  // linear search
        i++;
      if (i == list.Count) list.Add(new Entry(name));       // new entry
      if (where == LHS)
        list[i].left++;
      else list[i].right++;
    } // Table.Update

    public static void TestProductions() {
    // Checks that each non-terminal has appeared exactly once on the
    // left side of a production, and at least once on the right side
    // of some production
      bool OK = true;                                       // optimistic
      int i;
      for (i = 0; i < list.Count; i++) {                    // loop runs from 0
        if (list[i].left == 0) {
          OK = false;
          IO.WriteLine(list[i].name + " never defined");
        }
        else if (list[i].left > 1) {
          OK = false;
          IO.WriteLine(list[i].name + " redefined");
        }
      }
      for (i = 1; i < list.Count; i++) {                    // loop runs from 1
        if (list[i].right == 0) {
          OK = false;
          IO.WriteLine(list[i].name + " cannot be reached");
        }
      }
      if (OK)
        IO.WriteLine("Seems to be reduced grammar");
      else
        IO.WriteLine("Cannot be reduced grammar");
    } // Table.TestProductions
  } // Table
Why does one loop run from zero and the second one only run from 1?
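One way to explore the question is to run the same two loops over the counts for the silly productions and see which names get reported. The sketch below is stand-alone and makes some assumptions: CheckDemo and Check are hypothetical names, the counts are typed in as literal arrays rather than built by a parser, and the messages are collected in a list instead of being written through IO.

```csharp
using System;
using System.Collections.Generic;

static class CheckDemo {
  // Applies the two checking loops to parallel arrays of names and
  // counts, returning the diagnostic messages in the order produced.
  public static List<string> Check(string[] names, int[] left, int[] right) {
    var messages = new List<string>();
    for (int i = 0; i < names.Length; i++) {      // loop runs from 0
      if (left[i] == 0) messages.Add(names[i] + " never defined");
      else if (left[i] > 1) messages.Add(names[i] + " redefined");
    }
    for (int i = 1; i < names.Length; i++) {      // loop runs from 1
      if (right[i] == 0) messages.Add(names[i] + " cannot be reached");
    }
    return messages;
  }

  static void Main() {
    // Counts for the silly productions, in order of first appearance.
    string[] names = { "Goal", "One", "Two", "Three", "Thre" };
    int[] left     = {    1,     1,     1,      0,       1   };
    int[] right    = {    0,     2,     3,      1,       0   };
    foreach (string m in Check(names, left, right)) Console.WriteLine(m);
  }
}
```

With these counts the program reports "Three never defined" and "Thre cannot be reached", and nothing else. Compare the rows for Goal and Thre, which both have a right count of 0, and consider what would be printed if the second loop also started at 0.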