Computer Science 301 - Translators


Tutorial 9 - Constraint analysis (switch statement in Parva) - Solutions

1. One version of the SwitchStatement has syntax that can be described by

         SwitchStatement = "switch" "(" Expression ")" "{"
                             { CaseLabelList Statement { Statement } }
                             [ "default" ":" { Statement } ]
                           "}" .
         CaseLabelList   = CaseLabel { CaseLabel } .
         CaseLabel       = "case" [ "+" | "-" ] Constant ":" .
         Constant        = IntConst | CharConst | "true" | "false" | "null" .

but, as usual, this has to be supplemented by semantic constraints. Among these are, fairly obviously, that no Constant may appear more than once as a CaseLabel within a single SwitchStatement and that the types of all the constants and the type of the selector Expression must be compatible. Although not illegal, a SwitchStatement without any internal clauses is of little use, and might elicit a warning to the user if it is detected.

How would the productions be attributed to allow such constraint checking to proceed?

  SwitchStatement<StackFrame frame>         (. int expType;
                                               ArrayList<Integer> labelList = new ArrayList<Integer>();
                                               boolean hasClauses = false; .)
  =  "switch" "("
       Expression<out expType>              (. if (isRef(expType) || expType == Entry.noType)
                                                 SemError("invalid selector type"); .)
     ")" "{"
        { CaseLabelList<expType, labelList>
          Statement<frame>                  (. hasClauses = true; .)
          { Statement<frame> }
        }
        [   "default" ":"
            { Statement<frame>              (. hasClauses = true; .)
            }
        ]
     "}"                                    (. if (!hasClauses) Warning("Empty switch statement"); .) .

  CaseLabelList<. int expType, ArrayList<Integer> labelList .>
  =  CaseLabel<expType, labelList>
     { CaseLabel<expType, labelList>
     } .

  CaseLabel<. int expType, ArrayList<Integer> labelList .>
                                            (. ConstRec con;
                                               int factor = 1;
                                               boolean signed = false; .)
  =  "case"
     [ ( "+" | "-"                          (. factor = -1; .)
       )                                    (. signed = true; .)
     ] Constant<out con> ":"                (. if (!compatible(con.type, expType)
                                                   || signed && con.type != Entry.intType)
                                                 SemError("invalid label type");
                                               int lab = factor * con.value;
                                               if (labelList.contains(lab))
                                                 SemError("duplicated case label");
                                               else labelList.add(lab); .)

2. In C, C++ and Java the statements that follow each CaseLabelList are not restricted in any way, although it is common to find that the last of them is a BreakStatement or ReturnStatement; if absent the semantics prescribe that control then falls through to the next option. This is expressly forbidden in C#, which requires that if the Statement sequence is not empty it must terminate with an unguarded explicit BreakStatement or ReturnStatement.

Can you develop the productions given above in such a way that this C# constraint is enforced syntactically, or find some other constraint analysis technique that will do so?

A context-free grammar will not be able to handle this constraint and remain LL(1). It might be tempting to try

         SwitchStatement = "switch" "(" Expression ")" "{"
                             { CaseLabelList Statement { Statement } "break" ";"
                             [ "default" ":" { Statement } "break" ";" ]
                           "}" .

but you should easily recognise that this would break "Rule 2" as the FIRST and FOLLOW sets for the deletable section { Statement } both include "break".

We might try to attribute the Statement parser to allow it to return a value signifying the kind of statement that has been just been parsed. Full details will not be given here, but you should be able to get the sense of this from the fragment below.

  SwitchStatement<out int statKind>         (. int expType, statementKind = emptyStat;
                                               ArrayList<Integer> labelList = new ArrayList<Integer>();
                                               boolean hasClauses = false; .)
  =  "switch" "("
       Expression<out expType>              (. if (isRef(expType) || expType == Entry.noType)
                                                 SemError("invalid selector type"); .)
     ")" "{"
        { CaseLabelList<expType, labelList>
          Statement<out statementKind, frame>
                                            (. hasClauses = true; .)
          { Statement<out statementKind> }  (. if (statementKind != breakStat &&
                                                   statementKind != returnStat)
                                                  SemError("break or return needed here"); .)


        }
        [   "default" ":"                   (. statementKind = emptyStat; .)
            { Statement<out StatementKind>  (. hasClauses = true; .)
            }                               (. if (statementKind != breakStat &&
                                                   statementKind != returnStat)
                                                  SemError("break or return needed here"); .)

        ]
     "}"                                    (. if (!hasClauses) Warning("Empty switch statement");
                                               statKind = switchStat; .) .

In practice in Java and C# compilers rather more sophisticated ways are found of doing these sorts of checks. But it is possible to fool them too. Here's another "language lawyer" example. The following C# code will produce the effects shown in the comments if you attempt to compile it:

  using Library;

  public class TryToBreakCompiler {

    public static void Main(string[] args) {
      int i = IO.ReadInt();
      switch (i) {
        case 5 : i = 9;
        // The C# compiler correctly flags an error - "cannot fall through"

        case 6 : if (i == 6) break; else break;
        // Although the clause does not end with an unguarded break/return the C# compiler
        // will accept the clause

        case 7 : if (i == 7) break; else if (i != 7) break;
        // Aha! we can fool the C# compiler with that one

        case 8 : if (i == 8) break; else break; i = 89;
        // Although the clause does not end with an unguarded break/return the C# compiler
        // is clever enough to realise that you cannot reach the final assignment

        case 9: if (i == 9) break; else if (i != 9) break; i = 89; break;
        // Aha! but we can fool the C# compiler with that one too

        default: i = 12;
        // The C# compiler correctly flags an error - "cannot fall through"
      } // switch

    } // Main

  }  // TryToBreakCompiler


Home  © P.D. Terry