Hand in this prac sheet on your next practical day, correctly packaged in a transparent folder and with your solutions. This prac sheet forms the "cover sheet". Since the practical will have been done on a group basis, please hand in one copy of the prac sheet for each member of the group. These will be returned to you in due course, signed by the marker, and you will be asked to sign to acknowledge that you have received your own copy.
Your surname: (PRINT CLEARLY)
Your prac day:
Your student number: (PRINT CLEARLY)
Your signature:
Your group partners:
Other people with whom you collaborated (you need not give tutors' names):
Mark awarded:
Tutor's signature:
Last year it became clear to me that many students lack confidence in C++ input, output and string handling, so this practical aims to provide a number of short exercises to acquire the confidence that we shall need later. One of the exercises is directly "compiler related", however. An appendix to this prac sheet summarizes those parts of the ctype, stdio and string libraries that you will find yourselves using.
I would greatly prefer you to do these exercises in C++ using the stdio library, and not in Pascal. As from next week you will have a true choice of language for your pracs. As before, joint submissions are required, but please resist the temptation simply to divide up the various tasks. Discuss them thoroughly with your prac partners and with the demonstrators.
You will need this prac sheet, the handout defining the Clang language, and the notes on string handling in C/C++, with which you may not be familiar.
Copies of this prac sheet and of the Clang report are also available on the web site for the course.
This week you are required to hand in, besides this cover sheet:
Keep the prac sheet and your solutions until the end of the semester. Check carefully that your mark has been entered into the Departmental Records.
You are referred to the rules for practical submission which are clearly stated on page 10 of our Departmental Handbook. However, for this course pracs must be posted in the "hand-in" box in the secretary's office for collection by Pat Terry.
A rule not stated there, but which should be obvious, is that you are not allowed to hand in another student's work as your own. Attempts to do this will result in (at best) a mark of zero and (at worst) severe disciplinary action and the loss of your DP. You are allowed - even encouraged - to work and study with other students, but if you do this you are asked to acknowledge that you have done so.
The source files misc.h and set.h are again included in the prac kit PRAC20.ZIP (which you can copy and unpack in the usual way; you will not actually need set.h). misc.h is defined as given below. The idea is that you simply #include misc.h in your programs and the system will then automagically include the headers for the rest of the C libraries that you need.
// Various common items

#ifndef MISC_H
#define MISC_H

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <stdarg.h>
#include <ctype.h>
#include <limits.h>

#define boolean int
#define bool    int
#define true    1
#define false   0
#define TRUE    1
#define FALSE   0
#define maxint  INT_MAX

#if __MSDOS__ || MSDOS || WIN32 || __WIN32__
#  define pathsep '\\'
#else
#  define pathsep '/'
#endif

static void appendextension (char *oldstr, char *ext, char *newstr)
// Changes filename in oldstr from PRIMARY.xxx to PRIMARY.ext in newstr
{ int i;
  char old[256];
  strcpy(old, oldstr);
  i = strlen(old);
  while ((i > 0) && (old[i-1] != '.') && (old[i-1] != pathsep)) i--;
  if ((i > 0) && (old[i-1] == '.')) old[i-1] = 0;
  if (ext[0] == '.') sprintf(newstr,"%s%s", old, ext);
    else sprintf(newstr, "%s.%s", old, ext);
}

#endif /* MISC_H */
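To get a feel for what appendextension does, here is a small stand-alone sketch. It repeats the routine from misc.h with two changes that are my assumptions, not part of the prac kit: the input parameters are declared const so that string literals can be passed without complaint, and pathsep is fixed at '/' rather than chosen by the preprocessor.

```cpp
#include <assert.h>
#include <stdio.h>
#include <string.h>

const char pathsep = '/';   // assumption: fixed here; misc.h selects it per platform

// Same logic as appendextension in misc.h, with const added to the inputs.
static void appendextension(const char *oldstr, const char *ext, char *newstr) {
  char old[256];
  assert(strlen(oldstr) < sizeof old);          // the copy below must fit
  strcpy(old, oldstr);
  int i = (int) strlen(old);
  // Walk back from the end until a '.' or a path separator is found
  while ((i > 0) && (old[i-1] != '.') && (old[i-1] != pathsep)) i--;
  if ((i > 0) && (old[i-1] == '.')) old[i-1] = 0;   // chop off the old extension
  if (ext[0] == '.') sprintf(newstr, "%s%s", old, ext);
  else sprintf(newstr, "%s.%s", old, ext);
}
```

With this sketch, appendextension("POE.TXT", "NEW", name) leaves "POE.NEW" in name. Note that newstr must point to storage that you have already set aside; the routine does not allocate any for you.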
Task 1 (source code listing must be submitted)
Write a very short program TASK1.CPP that simply reads a text file character by character from stdin and copies it exactly to stdout. Three text files are provided in the prac kit as sample data (you should NOT try to edit them):
POE.TXT      13331 bytes
TWAIN.TXT    13680 bytes
TABS.TXT       251 bytes
You should be able to run your program from a DOS prompt with a command like
TASK1 <POE.TXT >POE.NEW
and after the program has finished executing you should check that the copy is exactly the same as the original (do this for all three data files). You can do this by using the FC (file compare) command
FC POE.TXT POE.NEW
and follow this by looking at the directory entries
DIR POE.*
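The heart of a character-by-character copy might be sketched as follows. The function name copystream is hypothetical, and returning a character count is my own addition (it is handy for checking against the DIR listing). The point to notice is that the variable receiving each character is an int, not a char, so that EOF can be told apart from genuine data.

```cpp
#include <assert.h>
#include <stdio.h>

// Copies every character from in to out.
// Returns the number of characters copied (hypothetical convenience).
long copystream(FILE *in, FILE *out) {
  assert(in != NULL && out != NULL);
  long count = 0;
  int c;                                  // int, so that EOF is distinguishable
  while ((c = fgetc(in)) != EOF) {
    fputc(c, out);
    count++;
  }
  return count;
}
```

A main function for Task 1 would then need little more than a call such as copystream(stdin, stdout). If FC reports differences, or the byte counts disagree, it is worth checking whether your system translates line ends on streams opened in text mode.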
Task 2 (source code listing must be submitted)
You should recall that if you declare your main function in C++ to be
int main (int argc, char *argv[])
then the operating system will arrange for the number of command line arguments to be retrievable from argc, and the text of the arguments themselves from the array argv (the first of these is the name of the program executable).
Write a program TASK2.CPP that adds the numbers provided as command line arguments to give a very simple calculator. For example
TASK2 1 4 6 7 9
should display the result 27.
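As a hedged sketch of the idea, the summation can be written as a loop over argv that starts at index 1, since argv[0] holds the program name. The function name sumarguments is hypothetical, and atoi is used here on the assumption that every argument is a well-formed integer (atoi gives 0 for anything it cannot convert).

```cpp
#include <assert.h>
#include <stdlib.h>

// Adds the values of argv[1] .. argv[argc-1]; argv[0] is the program name.
int sumarguments(int argc, char *argv[]) {
  assert(argv != NULL);
  int total = 0;
  for (int i = 1; i < argc; i++)
    total += atoi(argv[i]);               // 0 if argv[i] is not a number
  return total;
}
```

A main function would pass its own argc and argv straight through and display the result with printf.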
Task 3 (source code listing must be submitted)
Rewrite the program in Task 1 so that it can still work as before, but if the name of the input file appears as a parameter on the command line, input will be taken from that file, and output written to a file with the same primary name, but the extension NEW. For example
TASK3 POE.TXT
should read POE.TXT and copy it to POE.NEW. (Alternatively, of course, the command
TASK3 <POE.TXT >POE.NEW
should achieve the same effect.)
Hints:
As before, check carefully that the copy of the file is identical to the original.
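One way to choose the input stream is sketched below, under the assumption that a single optional file name is the only argument. The name openinput is hypothetical. The important habit is to check the result of fopen before using it.

```cpp
#include <assert.h>
#include <stdio.h>

// Returns the stream to read from: the file named by argv[1] if one was given,
// otherwise stdin. Returns NULL if the named file cannot be opened.
FILE *openinput(int argc, char *argv[]) {
  assert(argv != NULL);
  if (argc < 2) return stdin;             // no file named: behave as in Task 1
  return fopen(argv[1], "r");             // may be NULL; the caller must check
}
```

The output side can follow the same pattern: use stdout when no file was named; otherwise build the new name with appendextension from misc.h and open it with mode "w". If either fopen fails, a message on stderr followed by exit(EXIT_FAILURE) is a reasonable response.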
Task 4 (source code listing must be submitted)
Modify the program in Task 3 so that it also counts the number of lines in the file as it copies them, and displays this count on the stderr output file (recollect that stderr is opened automagically to the screen).
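The counting can be folded into the copy loop, as in this sketch. The names copyandcount and reportlines are hypothetical, and the sketch assumes that a final line lacking a terminating newline should still be counted; you may prefer a different convention, provided you state it.

```cpp
#include <assert.h>
#include <stdio.h>

// Copies in to out and returns the number of lines seen.
// Assumption: an unterminated final line still counts as a line.
int copyandcount(FILE *in, FILE *out) {
  assert(in != NULL && out != NULL);
  int lines = 0;
  int c, last = '\n';                     // '\n' here means "at the start of a line"
  while ((c = fgetc(in)) != EOF) {
    fputc(c, out);
    if (c == '\n') lines++;
    last = c;
  }
  if (last != '\n') lines++;              // final line had no newline
  return lines;
}

// Writes the count to stderr, so it stays out of any redirected stdout.
void reportlines(int lines) {
  fprintf(stderr, "%d lines copied\n", lines);
}
```

Because the count goes to stderr, a command like TASK4 <POE.TXT >POE.NEW still shows the count on the screen while POE.NEW receives only the copied text.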
Task 5 (source code listing must be submitted)
At last we get onto a more challenging problem that is rather more compiler-oriented.
Firstly, some background. By now you will have seen the use of QEdit as a "development environment". QEdit can work in conjunction with various other programs so that when a hot key is pressed (actually CTRL-F9 or CTRL-F10 in our setup) then the combination of programs:
This file is assumed to have lines in it of the form
NameOfFileBeingEdited ( lineNo, colNo ) Some message
This makes for very easy development of user-friendly new compilers. The Clang compiler, for example, has been written to be the executable CLANG.EXE. A command of the form
CLANG bad.cln errors.lst
will invoke the compiler, and produce a file ERRORS.LST that might read
bad.cln ( 1 , 1) 'PROGRAM' expected
bad.cln ( 7 , 5) ';' expected
bad.cln ( 10 , 16) number expected
CLANG.EXE is automagically invoked by pressing <CTRL+F9> from within QEdit if the source text being edited resides in a file with a .CLN extension. The toy assembler you used last year is an executable ASM.EXE and is automagically invoked by QEdit if the source text being edited resides in a file with a .ASM extension, and so on.
Now we are not going to write a complete compiler this week (wait for it, we'll get there soon enough). But let's have a look at some key features of the things compilers do, by writing a program that
Get this program to work in conjunction with QEdit so that you can edit a Clang program and then after "compilation" step through the procedures and functions, and through the variable declarations. Here is a silly Clang program, to show the output we would wish to produce:
PROGRAM Silly;
  VAR
    first, Second,
    third;

  PROCEDURE Sum;
    BEGIN
      Write(first + second + third);
    END;

  FUNCTION Average (X, Y);
    BEGIN
      RETURN (X + Y) / 2;
    END;

  BEGIN
    READ (first, second);
    Sum;
    WRITE(Average(first, second))
  END.

FILE.CLN ( 6, 3 ) PROCEDURE
FILE.CLN ( 11, 3 ) FUNCTION
FILE.CLN ( 1, 11 ) Silly
FILE.CLN ( 3, 5 ) First
FILE.CLN ( 3, 12 ) Second
FILE.CLN ( 4, 5 ) Third
FILE.CLN ( 6, 13 ) Sum
FILE.CLN ( 11, 12 ) Average
FILE.CLN ( 11, 20 ) X
FILE.CLN ( 11, 23 ) Y
This problem probably looks impossibly difficult at first, second, third ... glance. But it isn't really. There are several ways in which it might be solved. Here is one possibility:
Regard a Clang source program simply as a sequence of concatenated strings - alternate words and non-words. The first string in the above program is the word "PROGRAM", the second is the string " " that separates "PROGRAM" from "Silly", the third is the word "Silly", the fourth is the string starting with ";" and ending with the space just before "VAR" and so on.
Given this insight, we can solve the problem by first developing a function that will obtain the "next" of these strings from the Clang source file each time it is called. The function might be defined to have a prototype like
int fgetnextstr (char *str, FILE *stream, int &line, int &column);
/* Reads next string from stream into str and returns the line and column
   in which it appeared in the stream. Strings are of two kinds; the return
   value of the function distinguishes them from one another:
     Returns 1 if the string is a valid identifier or keyword
               (consists of an initial letter followed by other letters and digits only)
     Returns 2 if the string does not start with a letter, and contains no letters
     Returns 0 if the stream is exhausted (no further string could be extracted) */
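One possible shape for such a scanner is sketched below. It is only a sketch under stated assumptions, not the required solution: the current position is kept in two file-level variables (curline, curcol), the caller's buffer is assumed to hold MAXSTR characters, over-long strings are simply split, and a tab counts as one column. The helper names peekchar and nextchar are hypothetical. The lookahead relies on ungetc, which is guaranteed to push back one character.

```cpp
#include <assert.h>
#include <ctype.h>
#include <stdio.h>

const int MAXSTR = 256;                   // assumed size of the caller's buffer
static int curline = 1, curcol = 1;       // assumed state: position of next unread char

// Looks at the next character without consuming it.
static int peekchar(FILE *stream) {
  int c = fgetc(stream);
  if (c != EOF) ungetc(c, stream);
  return c;
}

// Consumes one character and keeps the position counters up to date.
static int nextchar(FILE *stream) {
  int c = fgetc(stream);
  if (c == '\n') { curline++; curcol = 1; }
  else if (c != EOF) curcol++;
  return c;
}

int fgetnextstr(char *str, FILE *stream, int &line, int &column) {
  assert(str != NULL && stream != NULL);
  int n = 0;
  int c = peekchar(stream);
  str[0] = '\0';
  if (c == EOF) return 0;                 // stream exhausted
  line = curline; column = curcol;        // where this string starts
  if (isalpha(c)) {                       // a word: letter, then letters and digits
    while (n < MAXSTR - 1 && (c = peekchar(stream)) != EOF && isalnum(c))
      str[n++] = (char) nextchar(stream);
    str[n] = '\0';
    return 1;
  }
  // a non-word: everything up to (but not including) the next letter
  while (n < MAXSTR - 1 && (c = peekchar(stream)) != EOF && !isalpha(c))
    str[n++] = (char) nextchar(stream);
  str[n] = '\0';
  return 2;
}
```

Keeping the position in file-level variables means that this sketch can scan only one stream per run; that is enough for the task at hand, but it is a design limitation worth noticing.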
The complete program can then be developed around the idea of a loop which simply calls this scanner function repeatedly. Each time the "scanner" discovers an identifier or keyword, the corresponding string is compared (in a case insensitive way) with the entries in a table of words already known (this table is initialized to contain the Clang keywords in UPPERCASE). If a match with PROCEDURE or FUNCTION is found, the position of the string is recorded on the output file; otherwise if it is a word that has not been declared before the word is added to the table and its position in the source text is recorded. Once all the text has been read, the table can be scanned and all the identifiers and their first positions recorded on the output file.
This program has two features in common with a "real" compiler - it needs to be able to set up and interrogate a "symbol table", and it needs to be able to unpack ("scan") characters into "tokens". Keep the table handling simple - an array of strings will suffice, along with a simple linear search algorithm.
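A table of this simple kind might be sketched as follows. The names (table, lookup, enter, sameword) and the limits MAXWORDS and MAXLEN are all assumptions made for illustration. The comparison is written out by hand with toupper because stricmp is not available under that name on every system.

```cpp
#include <assert.h>
#include <ctype.h>
#include <string.h>

const int MAXWORDS = 500;                 // assumed capacity of the table
const int MAXLEN = 32;                    // assumed maximum word length, with terminator

static char table[MAXWORDS][MAXLEN];      // each entry owns its own storage
static int entries = 0;

// Case-insensitive equality test.
static bool sameword(const char *a, const char *b) {
  while (*a && *b) {
    if (toupper((unsigned char) *a) != toupper((unsigned char) *b)) return false;
    a++; b++;
  }
  return *a == *b;                        // equal only if both ended together
}

// Linear search; returns the index of word, or -1 if it is not in the table.
int lookup(const char *word) {
  for (int i = 0; i < entries; i++)
    if (sameword(table[i], word)) return i;
  return -1;
}

// Adds word if it is not yet present; returns its index either way.
int enter(const char *word) {
  int i = lookup(word);
  if (i >= 0) return i;
  assert(entries < MAXWORDS);             // table full: a real program should report this
  strncpy(table[entries], word, MAXLEN - 1);
  table[entries][MAXLEN - 1] = '\0';      // strncpy does not always terminate
  return entries++;
}
```

Words longer than MAXLEN - 1 characters are truncated here, so two long names that differ only near the end would be treated as the same word. For the position report you would also keep the line and column alongside each entry.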
Clang programs are "case insensitive", so that it does not really matter if the various words and identifiers appear in the original source in upper or lower case, or in a mixture of the two. Of course, the various spellings must all be regarded as equivalent by the tablehandler.
Comments tend to mess this sort of thing around, of course, because words in comments might get confused with variable and procedure names. For the purposes of this exercise simply assume that comments will never form part of any Clang program you are asked to analyse. (Good Heavens! Who ever expected to find a comment in a student program anyway!)
One last (first?) word of warning. In my experience student programs that attempt to handle strings in C++ are notoriously bug ridden, because students don't really understand how memory is allocated to strings. See if you can restore my confidence!
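A common source of such bugs is storing a pointer to a buffer that is later reused, rather than storing a copy of the characters. The sketch below, with the hypothetical name savecopy, shows one way to take a private copy; note the + 1 for the terminating nul character.

```cpp
#include <assert.h>
#include <stdlib.h>
#include <string.h>

// Returns a freshly allocated copy of s, or NULL if memory is exhausted.
// The caller owns the copy and should release it with free.
char *savecopy(const char *s) {
  assert(s != NULL);
  char *copy = (char *) malloc(strlen(s) + 1);   // + 1 for the terminating nul
  if (copy != NULL) strcpy(copy, s);
  return copy;
}
```

If a scanner reads every word into the same buffer, then keeping only the buffer's address in a table would leave every entry pointing at whichever word was read last. Taking a copy, or copying into an array element that has its own storage, avoids this.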
CTYPE.H

isalnum (c)    True if c is a letter or digit
isalpha (c)    True if c is a letter
isdigit (c)    True if c is a digit
iscntrl (c)    True if c is a delete character or ordinary control character
isascii (c)    True if c is a valid ASCII character
isprint (c)    True if c is a printable character
isgraph (c)    Like isprint except that the space character is excluded
islower (c)    True if c is a lowercase letter
isupper (c)    True if c is an uppercase letter
ispunct (c)    True if c is a punctuation character
isspace (c)    True if c is a space, tab, carriage return, newline, vertical tab, or form-feed
isxdigit (c)   True if c is a hexadecimal digit
toupper (c)    Converts c in the range [a-z] to characters [A-Z]
tolower (c)    Converts c in the range [A-Z] to characters [a-z]
toascii (c)    Converts c greater than 127 to the range 0-127 by clearing all but the lower 7 bits

STDIO.H

Input/output library for text and binary files (not all functions shown here, only the most common ones).

int fclose (FILE *stream);
/* Closes stream.
   If successful, returns 0. If unsuccessful, returns EOF. */

int feof (FILE *stream);
/* Returns nonzero if end-of-file has been reached on stream. */

int ferror (FILE *stream);
/* Returns nonzero if an error has occurred on stream. */

int fgetc (FILE *stream);
/* Reads character (or EOF) from stream.
   If successful, returns character. If unsuccessful, returns EOF. */

int fgetchar (void);
/* Reads a character (or EOF) from stdin.
   If successful, returns character. If unsuccessful, returns EOF. */

char *fgets (char *str, int n, FILE *stream);
/* Reads a string str of at most n characters from stream.
   Collects input from stream until a newline character (\n) is found
   or at most n-1 characters are read. (If read, \n is placed in the string.)
   If successful, returns a pointer to the nul-terminated string str.
   If unsuccessful, returns NULL. */

FILE *fopen (const char *filename, const char *mode);
/* Opens stream in required mode to external filename.
   If successful, returns pointer to the newly opened stream.
   If unsuccessful, returns NULL. */

int fprintf (FILE *stream, const char *format { , argument } );
/* Sends formatted output to stream.
   Uses the same format specifiers as printf, but fprintf sends output
   to the specified stream.
   If successful, returns the number of bytes output.
   If unsuccessful, returns EOF. */

int fputc (int c, FILE *stream);
/* Writes character c to stream.
   If successful, returns c. If unsuccessful, returns EOF. */

int fputchar (int c);
/* Writes character c to stdout.
   If successful, returns c. If unsuccessful, returns EOF. */

int fputs (const char *str, FILE *stream);
/* Writes string str to stream.
   If successful, returns last character written.
   If unsuccessful, returns EOF. */

int fscanf (FILE *stream, const char *format { , address } );
/* Performs formatted input from stream.
   Returns the number of input fields successfully scanned, converted,
   and stored; return value doesn't include unstored scanned fields.
   Processes input according to the format and places the results in the
   memory locations pointed to by the arguments. */

int getc (FILE *stream);
/* Reads character (or EOF) from stream.
   If successful, returns character. If unsuccessful, returns EOF. */

int getchar (void);
/* Reads character (or EOF) from stdin.
   If successful, returns the character read, after converting it to an int
   without sign extension. If unsuccessful, returns EOF. */

char *gets (char *str);
/* Reads string str from stdin.
   Collects input from stdin until a newline character (\n) is found.
   \n is not placed in the string.
   If successful, returns a pointer to the nul-terminated string str.
   If unsuccessful, returns NULL. */

int printf (const char *format { , argument } );
/* Formatted output to stdout.
   Processes a variable number of arguments according to the format,
   sending the output to stdout.
   If successful, returns the number of bytes output.
   If unsuccessful, returns EOF. */

int putc (int c, FILE *stream);
/* Writes character c to stream.
   If successful, returns the character c. If unsuccessful, returns EOF. */

int putchar (int c);
/* Writes character c on stdout.
   If successful, returns the character c. If unsuccessful, returns EOF. */

int puts (const char *str);
/* Writes string str to stdout (and appends a newline character).
   If successful, returns the last character written.
   If unsuccessful, returns EOF. */

void rewind (FILE *stream);
/* Repositions file pointer to stream's beginning. */

int scanf (const char *format { , argument } );
/* Performs formatted input from stdin.
   Returns the number of input fields successfully scanned, converted,
   and stored; return value does not include unstored scanned fields.
   Processes input according to the format and places the results in the
   memory locations pointed to by the arguments. */

int sprintf (char *buffer, const char *format { , argument } );
/* Performs formatted output to a string buffer.
   If successful, returns the number of bytes output.
   If unsuccessful, returns EOF. */

int sscanf (const char *buffer, const char *format { , address } );
/* Performs formatted input from a string buffer.
   Returns the number of input fields successfully scanned, converted,
   and stored; return value does not include unstored scanned fields.
   Processes input according to the format and places the results in the
   memory locations pointed to by the arguments.
   If sscanf attempts to read past end of buffer, the return value is EOF. */

int ungetc (int c, FILE *stream);
/* Pushes the character c back into input stream, so that the next call to
   getc (or to other stream input functions) for stream will return c again.
   If successful, returns c. If unsuccessful, returns EOF. */

Predefined streams automatically opened when the program is started.

stdin     Standard input device.
stdout    Standard output device.
stderr    Standard error output device.

Other predefined quantities

FILE      File control structure for streams.
NULL      Null pointer value.
EOF       value of character returned when end-of-file is encountered.

STRING.H

/* size_t is int or long, depending on the implementation. */

char *strcpy (char *dest, const char *src);
/* Copies string src to dest. Returns dest. */

char *strncpy (char *dest, const char *src, size_t n);
/* Copies at most n chars from src to dest.
   If n characters are copied, no null character is appended;
   the contents of the dest area is not a null-terminated string. */

size_t strlen (const char *str);
/* Returns length of str. */

int strcmp (const char *s1, const char *s2);
/* Compares one string to another, case significant
   Returns 0 if s1 = s2, < 0 if s1 < s2, > 0 if s1 > s2. */

int stricmp (const char *s1, const char *s2);   /* might be called strcasecmp */
/* Compares one string to another, ignoring case
   Returns 0 if s1 = s2, < 0 if s1 < s2, > 0 if s1 > s2. */

char *strstr (const char *str, const char *substr);
/* Returns pointer to first location of substr within str or NULL. */

char *strcat (char *s1, const char *s2);
/* Appends s2 to s1. */

char *strncat (char *s1, const char *s2, size_t n);
/* Appends not more than n chars from s2 to s1. */

STDLIB.H

int abs (int x);
/* Returns the absolute value of integer x. */

int atexit (atexit_t func);
/* Registers termination function.
   Returns 0 on success and nonzero on failure. */

double atof (const char *str);
/* Converts string str to a floating point number.
   Returns the converted value of str, or 0 if str cannot be converted. */

int atoi (const char *str);
/* Converts string str to integer.
   Returns the converted value of str, or 0 if str cannot be converted. */

void exit (int status);
/* Terminates program. Defined values for status are
     EXIT_SUCCESS   Normal program termination
     EXIT_FAILURE   Abnormal program termination
   Before terminating, buffered output is flushed, files are closed,
   and exit functions are called. */

void free (void *block);
/* Frees block previously allocated by a call to malloc. */

char *getenv (const char *name);
/* Gets string from environment.
   Returns a pointer to the value associated with name, or NULL if name
   is not defined in the environment. */

char *itoa (int value, char *str, int radix);
/* Converts an integer value to a string.
   Returns a pointer to the target string.
   For a decimal representation, use radix=10.
   For hexadecimal, use radix=16. */

char *ltoa (long value, char *str, int radix);
/* Converts a long value to a string.
   Returns a pointer to the target string.
   For a decimal representation, use radix=10.
   For hexadecimal, use radix=16. */

void *malloc (size_t size);
/* Allocates memory. size is in bytes.
   Returns a pointer to the newly allocated block, or NULL if insufficient
   space exists for the new block. If size == 0, it returns NULL. */

int rand (void);
/* Returns random number between 0 and RAND_MAX. */

int random (int num);
/* Returns a random integer between 0 and (num-1). */

void randomize (void);
/* Initializes the random number generator with a random value.
   It uses the time function, so you should include time.h when using
   this routine. */