Computer Science 3 - 2000


Test 1 on Translators - 26 September 2000 - Solutions

Please answer all questions 0 ... 7, in the spaces provided (four sides). Text books may not be used.

0. What are your name (surname!) and student number? [ 1 mark ]

1. Write a sentence or two to explain your understanding of the differences between the following pairs of concepts (keep the answers short and to the point; we do not need "essays"!) [ 3 * 6 = 18 marks ]

(a) A token and a character (see page 50)
(b) Nonterminals and terminals (see page 53)
(c) The front end and the back end of a translator (see page 8)
(d) A sentential form and a sentence (see pages 49, 54)
(e) Scanning and parsing (see pages 8, 10)
(f) Static semantics and dynamic semantics (see page 4)

2. Suppose we define an alphabet A = { a , b }. Write down a regular expression for each of the following languages: [ 2 * 3 = 6 marks ]

(a) L = { all strings with exactly one occurrence of a } b* a b*
(b) L = { all strings with at most one occurrence of a } b* (a | e) b* or b* (a | b) b*
(c) L = { all strings with at least one occurrence of a } (a | b)* a (a | b)*

3. Suppose a grammar is defined by G = { N , T , S , P } where

N = { A , B }
T = { x , y , z }
S = A
P = {
        A → BABB
        A → z
        B → x
        B → y }

What if anything is the difference between the language L1 = L(G) and the language L2 defined by the regular expression given below? [ 3 marks ]

( x | y )* z ( x | y )*

L1 and L2 both describe strings consisting of a sequence of x and y followed by a single z followed by a sequence of x and y. But L1 demands that there are always half as many x or y's before the z as there are after it. L2 does not have this constraint. For example, xyxxzyxyyyxyxyx is in either language, but xxxzyy is only in L2. More generally, a context-free grammar is capable of expressing requirements such as "brackets must match" which regular expressions often cannot do.

4. The emulator/interpreters you are using in the practical course incorporate the following code (or its Pascal equivalent):

      cpu.pc = initpc;   // initialize program counter
      do {
        if (unsigned(mem[cpu.pc]) > int(STKMC_nul)) ps = badop;
        else {
            cpu.ir = STKMC_opcodes(mem[cpu.pc]);            // fetch
            cpu.pc++;                                       // anticipate
            switch (cpu.ir) {                               // execute
              // various case options
            }
          }
      } while (ps == running);

In this code (and in certain of the "various case options") tests were made to ensure that the interpreter did not misbehave. However, if you study the code above carefully you may come to the conclusion that this testing is not sufficiently thorough. How could the emulator be made safer? (No, I don't need details of what might happen in the "case options", I need an improvement to the code that appears above.) [ 3 marks]

There is no checking that cpu.pc remains in bounds. We should need something on the lines of the following

      cpu.pc = initpc;   // initialize program counter
      do {
        if (incode(cpu.pc) {  // ----------------------------------
          if (unsigned(mem[cpu.pc]) > int(STKMC_nul)) ps = badop;
          else {
              cpu.ir = STKMC_opcodes(mem[cpu.pc]);            // fetch
              cpu.pc++;                                       // anticipate
              if (incode(cpu.pc) {  // ----------------------------------
                switch (cpu.ir) {                             // execute
                  // various case options
                }
            }
        }
      } while (ps == running);

where incode is a function that checks that 0 <= cpu.pc < CodeTop and sets ps == badmem if the test fails. This function could also be called at various places in the case options where cpu.pc is used as an index.

5. Draw T diagrams that represent the process that underlies the stack assembler/interpreter system that you are using in the current practical. Show how the various parts of the system have been developed, and how the assembly and execution of a simple program is achieved. [ 8 marks ]

In principle we need something like this:

  .--------------------------.          .--------------------------.
  |       StackAsm.CPP       |          |        StackAsm.EXE      |
  | STK    ---------->   P   |          | STK   ----------->   P   |
  |                          |          |                          |
  `-------.          .--------------------------.          .-------'
          |          |           CC.EXE         |          |
          |    C++   | C++    -------->  80x86  |  80x86   |
          |          |                          |          |
          `------------------.          .------------------'
                             |          |
         We write the        |   80x86  |      and compile it to
         source of the       |          |      the equivalent EXE
        assembler in         `----------'      file that we can
        C++ for which we     .----------.      then run on a PC
        have a compiler CC   |   80x86  |
                             `----------'

  .--------------------------.          .--------------------------.
  | Data    STKMC.CPP        |          | Data   STKMC.EXE         |
  |  P     ---------->  Res  |          |  P    ----------->  Res  |
  |                          |          |                          |
  `-------.          .--------------------------.          .-------'
          |          |           CC.EXE         |          |
          |    C++   | C++    -------->  80x86  |  80x86   |
          |          |                          |          |
          `------------------.          .------------------'
                             |          |
         We write the        |   80x86  |      and compile it to
         source of the       |          |      the equivalent EXE
        interpreter in       `----------'      file that we can
        C++ for which we     .----------.      then run on a PC
        have a compiler CC   |   80x86  |
                             `----------'

  .--------------------------.          .--------------------------.
  |           EG.STK         |          |          EG.P            |
  | Input  ----------> Output|          | Input -----------> Output|
  |                          |          |                          |
  `-------.          .--------------------------.          .--------------------------.
          |          |        StackAsm.EXE      |          | Input    StkMC.EXE       |
          |   STK    |  STK   -------->    P    |    P     |   P    -------->   Res   |
          |          |                          |          |                          |
          `------------------.          .--------------------------.          .-------'
                             |          |                          |          |
       We can now submit the |   80x86  |      and produce  the    |   80x86  |  to produce the
       code of an example    |          |    equivalent P code     |          |  results after
       written in STK        `----------'    version ready for the `----------'  interpretation
       to the asssembler     .----------.    interpreter           .----------.
                             |   80x86  |                          |   80x86  |
                             `----------'                          `----------'

In practice the assembler and interpreter programs have actually been combined into a single "seamless' program.

6. The following is one possible Cocol specification of a set of productions expressed in EBNF notation.

      COMPILER EBNF

      CHARACTERS
        letter   = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz" .
        lowline  = "_" .
        digit    = "0123456789" .
        noquote1 = ANY - "'" .
        noquote2 = ANY - '"' .

      IGNORE  CHR(9) .. CHR(13)

      TOKENS
        nonterminal   = letter { letter | lowline | digit } .
        terminal      = "'" noquote1 { noquote1 } "'" | '"' noquote2 { noquote2 } '"' .

      PRODUCTIONS
        EBNF          = ProductionSet EOF .
        ProductionSet = Production | ProductionSet Production .
        Production    = nonterminal "=" Expression "." .
        Expression    = Term | Expression "|" Term  .
        Term          = Factor | Term Factor .
        Factor        =   nonterminal
                        | terminal
                        | "[" Expression "]"
                        | "(" Expression ")"
                        | "{" Expression "}" .
      END EBNF.

(a) What do the letters EBNF stand for? [ 2 marks ]

Extended Backus Naur Form

(b) Do the productions in the PRODUCTIONS section display left, right, or no recursion? [ 2 marks ]

Left recursion

(c) Taking as a goal symbol the notion of a Production, draw the parse tree that derives the fourth production rule in the PRODUCTIONS section, namely [ 6 marks ]

Expression = Term | Expression "|" Term .

This is complicated slightly by the fact that the words used for non-terminals have to appear as terminals as well! Here we have tried to draw the distinction by using a different typeface (italics).

                 Production
                      |
          .-----------+---------------------.-----------------------.
          |           |                     |                       |
     nonterminal     "="               Expression                  "."
          |                                 |
          |                      .----------+-----.
          |                      |          |     |
      Expression             Expression    "|"   Term
                                 |                |
                                 |                |--------------------.
                                 |                |                    |
                                Term             Term                Factor
                                 |                |                    |
                                 |                |----------.         |
                              Factor              |          |         |
                                 |               Term     Factor  nonterminal
                                 |                |          |         |
                            nonterminal           |          |         |
                                 |              Factor    terminal   Term
                                 |                |          |
                               Term               |          |
                                             nonterminal    "|"
                                                  |
                                                  |
                                              Expression

7. (Bonus) In what applications do you find programs written in the language Canntaireachd? [ 2 marks ]

Canntaireachd (pronounced cann-ta-rick) is the language of so-called "vocables" (suggestive syllables) used by pipers to teach each other the classical music of the Highland bagpipe (a music form known as "Ceol Mor" or "big music", also called Piobaireachd).