Test 1

Computer Science 301 - 2001

Test 1 on Translators - 16 August 2001

Please answer all questions 0 ... 12, in the spaces provided (four sides). Text books may not be used.

Keep your answers very short and to the point. None of the answers requires more than a sentence or two.

0 What are your name (surname!) and student number? [ 1 mark ]

1 Under what circumstances is interpretation to be preferred over compilation? What disadvantages are there in interpretation as opposed to compilation? [ 4 marks ]

See text book, section 2.5. Essentially the important points are that interpreters are very useful when one wants to do very simple "on the fly" computations such as in spreadsheets or database queries, but interpreted systems are much slower than systems compiled to native code.

2 Draw T diagrams that represent the processes that underlie the stack assembler/interpreter system that you used in your most recent practical. [ 8 marks ]

      .--------------------------.          .--------------------------.
      |       StackAsm.CPP       |          |        StackAsm.EXE      |
      | STK    ---------->   P   |          | STK   ----------->   P   |
      |                          |          |                          |
      `-------.          .--------------------------.          .-------'
              |          |           CC.EXE         |          |
              |    C++   | C++    -------->  80x86  |  80x86   |
              |          |                          |          |
              `------------------.          .------------------'
                                 |          |
             We write the        |   80x86  |      and compile it to
             source of the       |          |      the equivalent EXE
            assembler in         `----------'      file that we can
            C++ for which we     .----------.      then run on a PC
            have a compiler CC   |   80x86  |
                                 `----------'

      .--------------------------.          .--------------------------.
      | Data    STKMC.CPP        |          | Data   STKMC.EXE         |
      |  P     ---------->  Res  |          |  P    ----------->  Res  |
      |                          |          |                          |
      `-------.          .--------------------------.          .-------'
              |          |           CC.EXE         |          |
              |    C++   | C++    -------->  80x86  |  80x86   |
              |          |                          |          |
              `------------------.          .------------------'
                                 |          |
             We write the        |   80x86  |      and compile it to
             source of the       |          |      the equivalent EXE
            interpreter in       `----------'      file that we can
            C++ for which we     .----------.      then run on a PC
            have a compiler CC   |   80x86  |
                                 `----------'

      .--------------------------.          .--------------------------.
      |           EG.STK         |          |          EG.P            |
      | Input  ----------> Output|          | Input -----------> Output|
      |                          |          |                          |
      `-------.          .--------------------------.          .--------------------------.
              |          |        StackAsm.EXE      |          | Input    StkMC.EXE       |
              |   STK    |  STK   -------->    P    |    P     |   P    -------->   Res   |
              |          |                          |          |                          |
              `------------------.          .--------------------------.          .-------'
                                 |          |                          |          |
           We can now submit the |   80x86  |      and produce  the    |   80x86  |  to produce the
           code of an example    |          |    equivalent P code     |          |  results after
           written in STK        `----------'    version ready for the `----------'  interpretation
           to the asssembler     .----------.    interpreter           .----------.
                                 |   80x86  |                          |   80x86  |
                                 `----------'                          `----------'

In practice the assembler and interpreter programs have actually been combined into a single "seamless" program.

3 You used the X2C system in the first practical of the course. What is the technical term used to describe this class of translator? [ 2 marks ]

High Level Translator

4 Consider the silly (equivalent) Pascal and Modula programs below along with their translations into C performed using the translators p2c and X2C respectively. (Even if you are not expert in Pascal or Modula the intent of these snippets of code should be pretty obvious.)

Why do you suppose the p2c translation has translated the FOR loop using an extra variable FORLIM when this surely could have been written more directly [ 2 marks ]

          for (I = 5; I <= J; I++)
            List[I-5] = I + 10;

The reason is that within the body of a for loop in C one might want to alter the "terminating condition", but this is not allowed in Pascal, so an extra intermediate variable is created and initialised to the original "final" value of the control variable. I was delighted to see how many students puzzled this out correctly!

Why do you suppose the X2C translation makes such a meal of the assignment statement List[I] := I + 10 where the p2c translation has used the more obvious List[I-5] = I + 10; ? [ 2 marks ]

The reason is that it is doing "range checking" to ensure that the value assigned to List[I-5] is correctly within the range 5 to 15. Those of you who have grown up in the unprotected world of C++ have never learnt how useful this sort of check can be when debugging programs, unfortunately.

Why do you suppose the X2C translation has introduced such long-winded variable names and the p2c one has not deemed it necessary to do so? [ 2 marks ]

X2C has done this (as explained in a previous prac solution, I think) so as to try to ensure that "name clashes" do not arise in the C++ code that is produced. Each name has been synthesized from the "simple" name and the "module name" - this may make for long identifiers but helps programmers and compilers identify whence they have come (quickly - if I gave you a piece of C++ code with a function call to a function called offsetof, which library does it come from?). p2c has not bothered - either because the designers hope the problem won't arise, or perhaps because it is traditional for Pascal programs to be contained completely in one source file, not assembled (linked) from a variety of modules.

   PROGRAM Silly;                                  MODULE Silly;
     TYPE                                            TYPE
       SUBSCRIPTS = 5 .. 15;                           SUBSCRIPTS = INTEGER [5 .. 15];
     VAR                                             VAR
       I, J : SUBSCRIPTS;                              I, J : SUBSCRIPTS;
       List : ARRAY [SUBSCRIPTS] OF SUBSCRIPTS;        List : ARRAY SUBSCRIPTS OF SUBSCRIPTS;
     BEGIN                                           BEGIN
       FOR I := 5 TO J DO List[I] := I + 10            FOR I := 5 TO J DO List[I] := I + 10 END
     END.                                            END Silly.


   typedef char SUBSCRIPTS;

   main(int argc, char *argv[])  /* translation by p2c */
   { SUBSCRIPTS I, J;
     SUBSCRIPTS List[11];
     SUBSCRIPTS FORLIM;

     PASCAL_MAIN(argc, argv);
     FORLIM = J;
     for (I = 5; I <= FORLIM; I++)
       List[I-5] = I + 10;
     exit(EXIT_SUCCESS);
   }

   #include "X2C.h"
   typedef X2C_INT16 Silly_SUBSCRIPTS;
   static Silly_SUBSCRIPTS Silly_I;
   static Silly_SUBSCRIPTS Silly_J;
   typedef Silly_SUBSCRIPTS Silly_ANONYM1[11];
   static Silly_ANONYM1 Silly_List;

   int main(int argc, char **argv) /* translation by X2C */
   { X2C_BEGIN(argc, argv);
     static Silly_SUBSCRIPTS Silly_V1;
     Silly_I = 5;
     Silly_V1 = Silly_J;
     if (Silly_I <= Silly_V1) {
       for (;;) {
         Silly_List[(X2C_INT16) Silly_I - 5] = (X2C_INT16) X2C_CHK((Silly_I + 10), 5, 15);
         if (Silly_I == Silly_V1) break;
         ++Silly_I;
       }
     }
     return 0;
   }

5 What does the term cross-compiler mean to you, and what is the technical term used to describe a compiler that is, in a sense, the "opposite" of a cross-compiler? [ 2 marks ]

A cross compiler generates code for a machine other than the one on which the cross compiler itself executes. The "opposite" is usually called a self-resident compiler.

6 Identify the four chief phases that make up the front end of a compiler, and indicate briefly the role that each of these plays in the translation process. [ 8 marks ]

This is all book work, see Section 2.3 in the text, on pages 8 - 10. The four phases in question are the character handler, lexical analyser (scanner), syntax analyser (parser) and static semantic analyser.

7 Write down two regular expressions; one C that defines a single consonant; the other one W that describes words like "facetious" and "abstemious" that have each of the vowels (a, e, i, o, u) appearing exactly once, and in the correct order. [ 4 marks ]

C = b | c | d | f | g | h | j | k | l | m | n | p | q | r | s | t | v | w | x | y | z
W = C^* a C^* e C^* i C^* o C^* u C^*

8 What differences, if any, are there between static semantics and dynamic semantics? [ 2 marks ]

Static semantics refers to those features of a language that enable the constraint analyser of a compiler to check programs before they are executed for such things as correctly scoped variables, correctly set up function calls and so on. Dynamic semantics are those features of a program that only assume meaning when the program is actually executed, such as the values actually assigned to variables as data is read and so on. (See sections 2.1 and 2.3)

9 A grammar G is described by a 4-tuple. Name and briefly describe the components of this 4-tuple. [ 4 marks ]

G = { N , T , S , P )

where

N = set of non-terminals or "syntactic classes" of the language -symbols used to help define the syntactic structure of a sentence without actually appearing in one.

T = set of terminals - symbols that actually appear in the sentences of the language.

S = starts or goal symbol; a member of N from which all sentences may ultimately be derived.

P = set of productions - rules that determine how one string may be transformed into another.

(See section 5.4, page 53)

10 In terms of your notation, define precisely (mathematically) what is meant by a language L defined by the grammar G. (No, don't write a mini-essay; write a line of Mathematics!) [3 marks]

L(G) = { w | w e T ^* ; S Þ^* w }

11 What is the difference between a sentential form and a sentence [ 2 marks ]

A sentential form for a language L(G) is the goal symbol, or any string that can be derived from it; this string can contain both terminals and non-terminals. A sentence is a string that contains only terminals. (See section 5.4)

12 What is canntaireachd and what is it used for? [ 1 mark ]

Canntaireachd (pronounced cann-ta-rick) is the language of so-called "vocables" (suggestive syllables) used by pipers to teach each other the classical music of the Highland bagpipe (a music form known as "Ceol Mor" or "big music", also called Piobaireachd).