Trained in cognitive psychology, Dr. Colvin first learned to program in 1972, in BASIC on a PDP-8. He later had the distinction of being the first Cornell University graduate student to purchase an Apple II with student loan money, and has been happily hacking microcomputers ever since. He has been programming professionally in C since 1983. He welcomes comments and queries at 680 Hartford, Boulder, CO 80303 (303) 499-7254.
In an ideal world, computer programmers would never make mistakes. But in the real world, programs can and do go wrong, and programmers must anticipate the exceptions to the normal flow of operation. When exceptions occur, programmers must handle them, either by correcting the cause of the exception, trying a different strategy to complete the program, or terminating the program gracefully.
Many computer languages, including PL/I, CLU, Ada, and Eiffel, provide syntactic support for exception handling. C does not. Instead, it provides several unrelated library facilities which can, with effort, be used for exception handling. I used the ANSI C specifications for these facilities to create a small collection of macros that integrate <errno.h>, <assert.h>, <signal.h>, and <setjmp.h> into a reasonably well-disciplined exception-handling strategy.
Five Basic Strategies
I have distinguished five basic strategies for handling exceptions in C programs: denial, perfection, paranoia, truth, and communication.
Denial. You can pretend you live in an ideal world and ignore the possibility of exceptions. If you are lucky, your program will work just fine. If you are somewhat lucky, the operating system will terminate your program before it goes too far astray. If you are unlucky, your users will have to terminate your program with a reset. If you are very unlucky, your program will destroy system or user data, and will not be run again by any moderately cautious user.
Perfection. In principle, you can prove that a logically correct program will never encounter an exception, assuming of course that your proof, compiler, and operating system are flawless, and that your hardware is immune to cosmic rays, power glitches, and head crashes. Careful reasoning about program correctness is essential to engineering reliable software, but nothing can guarantee perfection.
Paranoia. A programmer following this strategy takes nothing for granted and practices defensive programming to a fault. Every function checks all its arguments and makes sure it has adequate system resources before proceeding. Every function returns a value or sets a variable to indicate failures. Every invocation of every function is followed by a check for failure. Every computation is preceded by a test for valid operands and followed by a test for overflow.
Paranoia can be an effective strategy, and is preferred over denial. After all, just because you're paranoid doesn't mean they aren't out to get you. Defensive programming integrates well with the ANSI C standard library, in which most functions overload their return value with an error code (such as HUGE_VAL, EOF, or NULL), and set the global variable errno (defined in <errno.h>) to a value indicating the cause of the error.
The disadvantage to defensive programming is that applying it exclusively can easily double or triple the size, complexity, and execution time of your programs. Returning error codes can be particularly onerous in deeply nested loops and function calls, where it becomes tempting to branch unconditionally when failures are detected. Even more dangerous is the temptation to skip the error checks when you believe nothing could go wrong, which is usually a form of denial.
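As a minimal sketch of the defensive style, the function below wraps the standard strtol call. The name parse_long and its 0/-1 return convention are assumptions made for illustration. Note how much of the body is checking rather than computing.

```c
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical helper: parse a decimal long in the defensive style.
   Returns 0 on success and -1 on any failure. */
int parse_long(const char *text, long *result)
{
    char *end;
    long value;

    if (text == NULL || result == NULL)   /* check every argument */
        return -1;

    errno = 0;                 /* the library never clears errno itself */
    value = strtol(text, &end, 10);

    if (end == text || *end != '\0')      /* no digits, or trailing junk */
        return -1;
    if (errno == ERANGE)                  /* value did not fit in a long */
        return -1;

    *result = value;
    return 0;
}
```

Every caller must in turn test the return value of parse_long, which is exactly the overhead described above.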
Truth. Rather than pass errors up to its caller, a function can insist that certain conditions be true for it to proceed correctly. The <assert.h> header defines a single macro, assert(expression). If the expression is false and NDEBUG is undefined, the program will terminate, typically with a message containing the text of the expression. A program cannot proceed in the face of a violated assertion, but assertions can be turned off by defining NDEBUG. Thus assertions need not cause any runtime overhead in production programs, but testing and inspection are essential to show that a program is unlikely to violate any assertions.
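A short sketch of this strategy follows. The function mean is a hypothetical example: it states its preconditions as assertions instead of returning an error code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical example: the mean of n values.
   The caller must supply a valid array and a positive count. */
double mean(const double *values, size_t n)
{
    double sum = 0.0;
    size_t i;

    assert(values != NULL);   /* precondition */
    assert(n > 0);            /* precondition: avoids division by zero */

    for (i = 0; i < n; i++)
        sum += values[i];
    return sum / (double)n;
}
```

Compiling with NDEBUG defined removes both checks, so a caller that violates them in a production build gets undefined behavior rather than a diagnostic.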
Communication. Rather than return an error or force termination, an exception can communicate its occurrence to the program, which can then choose to ignore the exception, correct it, or terminate the program.
The ANSI C <signal.h> facility communicates exceptions. A few conditions, such as division by zero, memory access violations, and termination requests, give rise to signals, which are caught by a signal handler. The default handling for most signals is to terminate the program. The available signals are implementation defined and cannot be extended by a program, but the default handling can be extended.
The function signal(int sig, void (*handler)(int)) installs a replacement signal handler, which is a function to call when the specified signal sig occurs. The handler is restricted in what it may do. A portable signal handler can do only five things:
- assign values to volatile integers of type sig_atomic_t
- make successful calls to signal
- return, in which case the program proceeds as if no signal was raised
- terminate execution with exit or abort
- transfer control out of the signal handler with the longjmp function.
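The first three permitted actions can be sketched as follows. The names interrupted, on_interrupt, install_interrupt_flag, and was_interrupted are assumptions made for illustration. The handler only reinstalls itself, sets a flag, and returns.

```c
#include <signal.h>

/* Hypothetical flag: the only kind of object a portable handler may write. */
static volatile sig_atomic_t interrupted = 0;

static void on_interrupt(int sig)
{
    signal(sig, on_interrupt);  /* reinstall: some systems reset to SIG_DFL */
    interrupted = 1;            /* record the event and return */
}

/* Returns 0 on success, -1 if the handler could not be installed. */
int install_interrupt_flag(void)
{
    return signal(SIGINT, on_interrupt) == SIG_ERR ? -1 : 0;
}

/* Polled by the main program at a convenient point. */
int was_interrupted(void)
{
    return interrupted != 0;
}
```

After install_interrupt_flag() succeeds, a call to raise(SIGINT) runs the handler, and was_interrupted() then reports a non-zero value.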
To use longjmp, you must first declare a jmp_buf object and save an execution context in it with setjmp(jmp_buf). setjmp returns zero when called directly. A subsequent call to longjmp(jmp_buf,int) restores the saved execution context, so that setjmp appears to return a second time, now with the non-zero int passed to longjmp (a value of zero is replaced by 1). Thus a call to longjmp behaves much like a non-local goto statement, with all the same dangers.
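The non-local jump can be illustrated with a small sketch. The names recovery_point, check_depth, and run_checked are hypothetical. The failure is detected several calls deep, and control returns directly to the function that called setjmp.

```c
#include <setjmp.h>

static jmp_buf recovery_point;   /* hypothetical name for the saved context */

/* Recurse a few levels, then abandon the work if value is negative. */
static void check_depth(int depth, int value)
{
    if (depth == 0) {
        if (value < 0)
            longjmp(recovery_point, 1);   /* non-zero, so setjmp can tell */
        return;
    }
    check_depth(depth - 1, value);
}

/* Returns 0 if the work completed, 1 if it was abandoned via longjmp. */
int run_checked(int value)
{
    if (setjmp(recovery_point) != 0)
        return 1;                 /* arrived here from longjmp */
    check_depth(3, value);
    return 0;                     /* normal path: setjmp returned zero */
}
```

None of the intermediate calls to check_depth gets a chance to clean up, which is the goto-like danger noted above.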
Disciplined Exceptions In Eiffel
Although each of these strategies has its place, not one of them is wholly adequate. I found a more integrated approach in the disciplined exception mechanism of Eiffel. A typical Eiffel routine is composed of four basic clauses, written as
Routine(argument:type) IS REQUIRE ... DO ... ENSURE ... RESCUE ... END
The REQUIRE clause is a Boolean expression asserting the preconditions for a routine: those facts that must hold true before the routine can succeed. If a routine computes a mathematical function, its precondition describes the domain of its arguments.
The DO clause is the computational body of a routine. It can also contain explicit CHECK assertions.
The ENSURE clause is a Boolean expression asserting the postconditions for a routine: those facts that must hold true for the routine to have succeeded. If a routine computes a mathematical function, its postcondition describes the range of its results.
The RESCUE clause provides exception handling. If an exception occurs during the execution of a DO clause, then control transfers to the nearest RESCUE clause. The RESCUE clause may either RETRY its DO clause (after taking corrective action), or proceed to the end, in which case the next higher RESCUE clause is invoked. Exceptions in Eiffel include:
- calling a routine whose REQUIRE clause or ENSURE clause is false
- explicit assertions which are false
- arithmetic overflow.
The Eiffel exception mechanism is disciplined in the sense that any routine can only succeed or fail. This may seem obvious, but in Ada an exception handler can cause the failing routine to return as if it had succeeded. An Eiffel routine which has experienced an exception cannot succeed except via a successful RETRY, since it does not permit a normal return from the RESCUE clause. A terminated RESCUE clause causes an exception in its calling routine, which may attempt a RESCUE in turn, until the exception is either successfully handled or the program terminates. In contrast, the C assert facility makes no provisions for retry, and signal can be used with longjmp to allow retry, but with no restrictions on flow of control.
Disciplined Exceptions In ANSI C
The macros in EXCEPT.H (see Listing 1) provide a disciplined approach to exceptions in ANSI C. They allow for blocks of computation to be written as two clauses, comparable to the DO clause and RESCUE clause of Eiffel. For example, the function in Listing 2 provides a fail-safe wrapper for malloc on the Macintosh that handles the possibility of exhausting memory, and raises an exception if malloc fails.
In Listing 2, the leading and trailing assertions serve as REQUIRE and ENSURE clauses: they enforce the conditions that xalloc takes a non-zero argument and returns a non-zero pointer. The BEGIN_TRY macro begins the DO clause. The standard library function malloc attempts to allocate memory. It signals failure by returning a null pointer. On most implementations, including MPW C, it also sets errno to ENOMEM. Because the ANSI standard does not guarantee this, I use an assertion to enforce this assumption. The FAIL macro sets the X_Error global and transfers control to the FAIL_TRY block, which serves as the RESCUE clause. It first makes sure that the exception it is handling was caused by lack of memory, then it attempts to make more memory available with the Macintosh Toolbox CompactMem function. If CompactMem succeeds, then RETRY will transfer control back to BEGIN_TRY.
This example shows that functions using xalloc need never check for a null pointer. If all your functions use FAIL to report exceptions then you can eliminate most of the error checking code from your application, and centralize your exception handling in just a few places. In event-driven applications, I place most of my exception handling in the event loop, where the user can be informed that something went wrong and advised on what to do.
Implementation Details
The X_TRAP structure, which contains a pointer to another X_TRAP and a jmp_buf, manages the flow of control for exception handling. The pointer maintains a stack of X_TRAPs. The top-most trap on the stack contains the execution context for the current exception handler. Each BEGIN_TRY and END_TRY pushes and pops this stack, so it remains synchronized with the run-time stack. You should never use return, goto, or longjmp to leave a BEGIN_TRY block, as this will desynchronize the stacks. (You cannot enforce this restriction with the preprocessor.)
Five macros provide most of what is needed to write disciplined exception handlers, as seen in Listing 1.
- BEGIN_TRY declares an X_TRAP and pushes it on the stack. It then labels and invokes setjmp to save the execution context and clear the global variable X_Error.
- FAIL(XCEPTION) sets the X_Error global and calls X_TrapError (shown in Listing 4), which uses longjmp to transfer control to the context saved in the topmost X_TRAP, which in turn will cause the setjmp in BEGIN_TRY to return X_Error. If the X_TRAP stack is empty, the program terminates with a message.
- FAIL_TRY begins an optional block to be executed if setjmp returns non-zero (when caused to return by longjmp), or when X_ReturnSignal returns.
- RETRY is a statement macro used by FAIL_TRY blocks to attempt to recover from an exception. This macro makes sure it is really in the FAIL_TRY block by testing X_Error, then it branches back to BEGIN_TRY.
- END_TRY pops the X_TRAP stack, then FAILs again if X_Error is not clear. Consequently, unrecoverable errors in a FAIL_TRY block will cause the next higher FAIL_TRY block on the X_TRAP stack to be executed. Successful execution falls through END_TRY.
Five exception codes identify the source of an exception:
- X_ERRNO is for standard library calls that set errno.
- X_SIGNAL is for signals caught by a signal handler.
- X_ASSERT is for assertion failures.
- X_SYSTEM is for errors in operating system calls.
- X_USER is for all other exceptions.
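The trap stack and the five macros described above might be sketched as follows. This is not the EXCEPT.H of Listing 1. It is a reduced illustration under stated assumptions: the numeric values of the exception codes, the names X_TrapStack, x_trap, x_retry, attempts, flaky, and demo, and the wording of the termination message are all hypothetical, and signal handling and the XDEBUG assertion support are left out.

```c
#include <setjmp.h>
#include <stdio.h>
#include <stdlib.h>

/* Exception codes; the numeric values are assumed for this sketch. */
enum { X_ERRNO = 1, X_SIGNAL, X_ASSERT, X_SYSTEM, X_USER };

/* One trap per active try block, linked into a stack. */
typedef struct X_TRAP {
    struct X_TRAP *next;     /* next older trap, or NULL */
    jmp_buf context;         /* execution context saved by setjmp */
} X_TRAP;

static X_TRAP *X_TrapStack = NULL;   /* hypothetical name for the stack top */
static int X_Error = 0;              /* code of the pending exception */

/* Jump to the topmost trap, or terminate if no trap is active. */
static void X_TrapError(void)
{
    if (X_TrapStack == NULL) {
        fprintf(stderr, "unhandled exception %d\n", X_Error);
        exit(EXIT_FAILURE);
    }
    longjmp(X_TrapStack->context, X_Error);
}

#define FAIL(xception) (X_Error = (xception), X_TrapError())

/* Push a trap, mark the retry point, clear X_Error, save the context. */
#define BEGIN_TRY { X_TRAP x_trap;                      \
                    x_trap.next = X_TrapStack;          \
                    X_TrapStack = &x_trap;              \
                    x_retry:                            \
                    X_Error = 0;                        \
                    if (setjmp(x_trap.context) == 0) {

#define FAIL_TRY    } else {

/* Branch back to BEGIN_TRY, but only while an exception is pending. */
#define RETRY       do { if (X_Error) goto x_retry; } while (0)

/* Pop the trap; a still-pending exception is re-raised one level up. */
#define END_TRY     }                                   \
                    X_TrapStack = x_trap.next;          \
                    if (X_Error) FAIL(X_Error);         \
                  }

static int attempts = 0;   /* counts entries into flaky's try-clause */

/* Fails every time; retries once, then lets the exception propagate. */
static void flaky(void)
{
    BEGIN_TRY
        attempts++;
        FAIL(X_SYSTEM);
    FAIL_TRY
        if (attempts < 2)
            RETRY;
    END_TRY
}

/* Calls flaky, records the propagated code, then retries without it. */
int demo(void)
{
    /* volatile as a precaution: non-volatile locals changed after setjmp
       are indeterminate after a longjmp */
    volatile int seen = 0;

    BEGIN_TRY
        if (seen == 0)
            flaky();
    FAIL_TRY
        seen = X_Error;
        RETRY;
    END_TRY
    return seen;
}
```

In this sketch demo returns X_SYSTEM: flaky gives up after its second attempt, its END_TRY re-raises the exception, and the fail-clause in demo records the code before retrying. Because a C label has function scope, the sketch allows only one try block per function, and leaving a try-clause with return or goto would desynchronize the trap stack as described above.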
To trap errors in standard library functions that set errno, you can FAIL(X_ERRNO) when they fail. To handle false assertions as exceptions you can define the XDEBUG flag to replace the standard assert macro with a FAIL(X_ASSERT).
The X_HandleSignal and X_ReturnSignal functions handle signals as exceptions. You can install one of these functions as a handler for a given signal. When that signal occurs, the function sets the X_Signal global. The X_Error global will be set to X_Signal. Synchronous, internal signals can use X_HandleSignal to longjmp to the topmost exception handler. Other signals can return with X_ReturnSignal, which will cause the current FAIL_TRY block to execute when BEGIN_TRY is done.
To trap errors in operating system calls, you can call FAIL(X_SYSTEM). To trap any other exceptions you can call FAIL(X_USER).
Exception Handler Syntax
The macros in EXCEPT.H (Listing 1) provide an extension to the C syntax. New statement forms are added with the slightly simplified grammar shown in Listing 3.
A try-clause should not contain a jump statement or longjmp out of the clause, and a RETRY should only appear in a fail-clause. These restrictions could have been shown in Listing 3, but cannot be enforced by the preprocessor.
Experience With Exception Handling
The Workstation Group here at IHS is using a predecessor of the FAIL mechanism for exception handling in our memory management and user interface libraries, and in two large applications based on these libraries. The only difficulty has been interfacing with our database library, which uses overloaded return values and a global variable for error reporting. Consequently, we must test the results of every database call, and raise an exception if necessary. Exception handling was especially helpful during beta test, in which most program errors, even subtle pointer bugs, led to reproducible exceptions instead of the usual system crashes.
References
Object-Oriented Software Construction, Bertrand Meyer, Prentice Hall, 1988.
The C Programming Language, Brian W. Kernighan and Dennis M. Ritchie, Prentice Hall, 1988.
Standard C, P. J. Plauger and Jim Brodie, Microsoft Press, 1989.