Computer Science 301 - Translators


Tutorial 9 - Practice in attribute grammars - Solutions

1. Consider the by now rather hackneyed grammar for expressions which has been written out again below. How would you add attributes so that a parser, after recognising an Expression, would be able to determine whether that Expression was a "constant expression" (meaning that in principle it could be evaluated at compile time) or was a "variable expression" (meaning that it could not be evaluated until run time)? Assume that the Designator production can be attributed so that it returns the result of searching the symbol table to determine whether the associated identifier denotes a variable or a constant.

  Expression<out bool isConst>           (. bool alsoConst; .)
  = (   "+" Term<out isConst>
      | "-" Term<out isConst>
      | Term<out isConst>
    )
    { AddOp Term<out alsoConst>          (. isConst = isConst && alsoConst; .)
    } .

  Term<out bool isConst>                 (. bool alsoConst; .)
  = Factor<out isConst>
    { MulOp Factor<out alsoConst>        (. isConst = isConst && alsoConst; .)
    } .

  Factor<out bool isConst>
  =   Designator<out isConst>
    | number                             (. isConst = true; .)
    | "(" Expression<out isConst> ")"
    .

3. Consider the familiar grammar below for describing a set of EBNF productions. How could you add attributes to this grammar so that it could check a set of productions to determine whether any non-terminals had not been defined, and whether any non-terminals could not possibly be reachable?

Here is a C# version, including the code for a very simple table handler

  COMPILER EBNFCheck $CN
  /* Check a set of EBNF productions to make sure all apparent non-terminals are both defined
     (appear on the left of a production rule) and referenced (appear on the right of a rule)
     P.D. Terry, Rhodes University, 2016 */

    const bool defined = true;        // Name them, name them!
    const bool referenced = false;

  CHARACTERS
    letter   = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz" .
    digit    = "0123456789" .
    noquote1 = ANY - "'" .
    noquote2 = ANY - '"' .

  TOKENS
    nonterminal = letter {letter | digit | "_" } .
    terminal    = "'" noquote1 {noquote1} "'" | '"' noquote2 {noquote2} '"' .

  IGNORE  CHR(9) .. CHR(13)

  PRODUCTIONS
     EBNFCheck
     =                              (. Table.Clear(); .)
       { Production }
       EOF                          (. Table.CheckNonTerminals(); .)
     .

     Production
     =
       SYNC nonterminal             (. Table.Add(token.val, defined); .)
       WEAK "="
       Expression
       SYNC "." .

     Expression
     =
     Term
     { WEAK "|" Term
     } .



     Term
     =
     [ Factor
       { Factor
       }
     ] .

     Factor
     =   nonterminal                (. Table.Add(token.val, referenced); .)
       | terminal
       | "[" Expression "]"
       | "(" Expression ")"
       | "{" Expression "}" .

  END EBNFCheck.

  =======================  Table Handler stored in .\EBNFCheck\Table.cs

  // Table handler for EBNF checker
  // P.D. Terry, Rhodes University, 2016 (C# version)

  using Library;
  using System;
  using System.Collections.Generic;
  using System.Text;

  namespace EBNFCheck {

    class Entry {
      public bool defined, referenced;
      public string name;
      public Entry(string name) {
        this.defined = false;
        this.referenced = false;
        this.name   = name;
      }
    } // Entry

    class Table {

      static List<Entry> list = new List<Entry>();

      public static void Clear() {
      // Clear all entries in table
        list = new List<Entry>();
      } // Clear

      public static int Add(string name, bool defining) {
      // Search for, and possibly add a nonterminal "name" to the table, according as
      // to "where" in a production it was found, and return the position as the
      // index where the name was either added previously or added by this call
      // debug: IO.WriteLine("add " + name + " " + where);
        int position = 0;
        while (position < list.Count && !name.Equals(list[position].name)) position++;
        if (position >= list.Count) list.Add(new Entry(name));
        if (defining)
          list[position].defined = true;
        else
          list[position].referenced = true;
        return position;
      } // Add

      public static void CheckNonTerminals() {
      // Check to see whether any non-terminals are not defined or are not referenced
      // This could be made rather more sophisticated!
        IO.WriteLine("The following non-terminals were never defined");
        for (int i = 0; i < list.Count; i++)
          if (! list[i].defined) IO.WriteLine(list[i].name);
        IO.WriteLine("The following non-terminals were never referenced");
        for (int i = 1; i < list.Count; i++)  // no need to check goal symbol
          if (! list[i].referenced) IO.WriteLine(list[i].name);
      } // CheckNonTerminals

    } // Table

  } //namespace

3. You will remember meeting the staff in a previous practical (you can find this in Staff1.txt):

       R. J. Foss, BSc, MSc, PhD.
       Professor Philip Machanick, PhD.
       Professor P. D. Terry, MSc, PhD.
       ...

and, perhaps, coming up with something like the following grammar for parsing such a list (Staff.atg):

  COMPILER Staff $CN
  /* Describe a list of academic staff in a department
     P.D. Terry, Rhodes University, 2016 */

  CHARACTERS
    uLetter   = "ABCDEFGHIJKLMNOPQRSTUVWXYZ" .
    lLetter   = "abcdefghijklmnopqrstuvwxyz" .
    letter    = uLetter + lLetter .

  TOKENS
    name      = uLetter { letter | "'" uLetter | "-" uLetter } .
    initial   = uLetter "." .

  IGNORE CHR(0) .. CHR(31)

  PRODUCTIONS
    Staff        = { Person } EOF .
    Person       = [ Title ] FullName { "," Degree } SYNC "."  .
    FullName     = NameLast { NameLast } .
    NameLast     = { initial } name .
    Title        = "Professor" | "Prof" | "Dr" | "Mr" | "Mrs" | "Ms" | "Mx" | "Miss" .
    Degree       =   "BA" | "BSc" | "BCom" | "BMus" | "BEd" | "BSocSci"
                   | "BSc(Hons)" | "BCom(Hons)" | "MA" | "MSc" | "PhD" .
  END Staff.

The HR Division have asked for an extract list which simply list all the initials and surnames for each of the staff who has a PhD, namely be something like

       Dr R.J. Foss
       Dr P. Machanick
       Dr P.D. Terry

Oh, while you are about it, sometimes the degree qualifications mention where that degree was obtained.

       R. J. Foss, BSc (Natal), MSc(UNISA), PhD(Rhodes).
       Professor Philip Machanick, PhD.
       Professor P. D. Terry, MSc(Rhodes), PhD (Cantab).

How could you handle this sort of list without adding to the already long list of possible qualifications?

  using Library;

  COMPILER Staff $CN
  /* Describe a list of academic staff in a department, and extract those with PhD degrees
     P.D. Terry, Rhodes University, 2016 */

    static StringBuilder sb;
    static string lastName;
    static bool hasPhD;

  CHARACTERS
    uLetter   = "ABCDEFGHIJKLMNOPQRSTUVWXYZ" .
    lLetter   = "abcdefghijklmnopqrstuvwxyz" .
    letter    = uLetter + lLetter .
    control   = CHR(0) .. CHR(31) .
    varsity   = ANY - control - ")" .

  TOKENS
    name      = uLetter { letter | "'" uLetter | "-" uLetter } .
    initial   = uLetter "." .
    university= '(' { varsity } ')' .

  IGNORE control

  PRODUCTIONS
    Staff     = { Person } EOF .

    Person                                        (. sb = new StringBuilder();
                                                     hasPhd = false; .)
    = [ Title ] FullName { "," Degree } SYNC "."  (. string initials = sb.ToString();
                                                     initials = initials.Substring(0, initials.Length - 2);
                                                     if (hasPhD)
                                                       IO.WriteLine("Dr "  + initials + " " + lastName); .) .
    FullName  = NameLast { NameLast } .

    NameLast  = { initial                         (. sb.Append(token.val); .)
                } name                            (. sb.Append(token.val[0]);
                                                     sb.Append('.');
                                                     lastName = token.val; .) .

    Title
    =   "Professor" | "Prof" | "Dr"               (. hasPhD = true; .)
      | "Mr" | "Mx" | "Mrs" | "Ms" | "Miss" .

    Degree
    = (  "BA" | "BSc" | "BCom" | "BMus" | "BEd"
        | "BSocSci"   | "BSc(Hons)"
        | "BCom(Hons)" | "MA" | "MSc"
        | "PhD"                                   (. hasPhD = true; .)
      )
      [ university ]

  END Staff.

The solution above appends an initial corresponding to the surname (the very last name) to the string builder for the initials, and then removes it again before the final name is printed. This might strike you as a bit inelegant. Indeed it should! Terry Theorem 2: There is always a better way:

    Person                                        (. sb = new StringBuilder();
                                                     hasPhD = false; .)
    = [ Title ] FullName { "," Degree } SYNC "."  (. string initials = sb.ToString();
                                                     if (hasPhD)
                                                       IO.WriteLine("Dr "  + initials + " " + lastName); .) .

    FullName  = NameLast {                        (. sb.Append(lastName[0]);
                                                     sb.Append('.'); .)
                NameLast } .

    NameLast  = { initial                         (. sb.Append(token.val); .)
                } name                            (. lastName = token.val; .)  .

    Title
    =   "Professor" | "Prof" | "Dr"               (. hasPhD = true; .)
      | "Mr" | "Mx" | "Mrs" | "Ms" | "Miss" .

    Degree
    = (  "BA" | "BSc" | "BCom" | "BMus" | "BEd"
        | "BSocSci"   | "BSc(Hons)"
        | "BCom(Hons)" | "MA" | "MSc"
        | "PhD"                                   (. hasPhD = true; .)
      ) [ university ] .

Note the ability to introduce actions before a non-terminal is parsed, as shown in the FullName production. Cute? And of course one can do still better (Theorem 2 again):

    Person                                        (. sb = new StringBuilder();
                                                     hasPhD = false; .)
    = [ Title ] FullName { "," Degree } SYNC "."  (. if (hasPhD)
                                                       IO.WriteLine("Dr " + sb.ToString()); .) .

    FullName  = NameLast {                        (. sb.Append(lastName[0]);
                                                     sb.Append('.'); .)
                NameLast }                        (. sb.Append(" " + lastName); .) .

    NameLast  = { initial                         (. sb.Append(token.val); .)
                } name                            (. lastName = token.val; .) .

    Title
    =   "Professor" | "Prof" | "Dr"               (. hasPhD = true; .)
      | "Mr" | "Mx" | "Mrs" | "Ms" | "Miss" .

    Degree
    = (  "BA" | "BSc" | "BCom" | "BMus" | "BEd"
        | "BSocSci"   | "BSc(Hons)"
        | "BCom(Hons)" | "MA" | "MSc"
        | "PhD"                                   (. hasPhd = true; .)
      ) [ university ] .

There might be a tendency to write this sort of code:

  COMPILER Staff $CN
  /* When will some of you learn to put your names and a brief description at the start of your files? */

**  static StringBuilder sb = new StringBuilder();                                   // initialize

      ....

  PRODUCTIONS
      ...

    Person                                        (. hasPhD = false; .)
    = [ Title ] FullName { "," Degree } SYNC "."  (. if (hasPhD) {
                                                       IO.WriteLine("Dr " + sb.ToString()); .) .
**                                                     sb = new StringBuilder();  // re-initialize
                                                     }  .) .

which strikes me as odd - have you really been taught to end while loop bodies with "initialization" code?


Home  © P.D. Terry