
Visual C++ Exception-Handling Instrumentation


December 2002/Visual C++ Exception-Handling Instrumentation


Today, exceptions and exception handling find their way into almost every C++ program. Even if you personally do not use them, you can hardly ignore the multitude of third-party libraries (and of course the C++ Standard Library itself) that can propagate exceptions into your application. However, when exceptions meet real-life production code, many subtle problems arise.

One such problem is the absence of a convenient way to log and monitor exception activity. If your application crashed in front of your customer because of an unhandled exception, you would surely like to know which portion of your code is responsible for throwing it. Moreover, even if your top-level handler caught an exception and dealt with it somehow, it will benefit your profiling and other troubleshooting efforts to know where the exception came from. Ideally, you would be able to log the precise line of source code that throws any exception in your application. Standard C++ does not give you any means for doing that, apart from including source location information (usually in the form of __FILE__ and __LINE__ parameters) in the exception object itself. Such an approach is hardly sufficient. First, if there are parts of code you do not control (such as third-party libraries or the standard library), you cannot modify the exceptions they throw. Second, if the exception is never caught by a handler, the information stored in the thrown object will be lost. Because of such limitations, any solution to the problem must be a platform-specific one.
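For comparison, the portable approach of carrying __FILE__ and __LINE__ inside the exception object might look roughly like the following sketch. The class located_error, the macro THROW_LOCATED, and the helper sample_throw_records_location() are hypothetical names introduced only for illustration; they are not part of the article's listings.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Hypothetical exception type that remembers where it was thrown.
class located_error : public std::runtime_error {
public:
    located_error(const std::string& what, const char* file, int line)
        : std::runtime_error(what), file_(file), line_(line) {
        assert(file != nullptr);  // __FILE__ always yields a string literal
    }
    const char* file() const { return file_; }
    int line() const { return line_; }
private:
    const char* file_;  // points at the __FILE__ literal of the throw site
    int line_;
};

// Hypothetical convenience macro: captures the location of the throw site.
#define THROW_LOCATED(msg) throw located_error((msg), __FILE__, __LINE__)

// Throws once and reports whether the handler sees the recorded location.
inline bool sample_throw_records_location() {
    int expected_line = 0;
    try {
        expected_line = __LINE__ + 1;   // the macro on the next line expands there
        THROW_LOCATED("sample failure");
    } catch (const located_error& e) {
        return e.line() == expected_line &&
               std::string(e.what()) == "sample failure";
    }
    return false;
}
```

The sketch only helps for exceptions you throw yourself and that are actually caught, which is exactly the limitation described above.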

Another exception-handling problem is how to manage many different exception types that are used by various libraries. Suppose your application has its own nice exception-handling scheme but a third-party library you use "spoils" it by throwing its own exceptions. The usual and portable solution in such a case is to wrap any call to this library into try {} catch() {} blocks that will intercept the library's exceptions and throw the exceptions of your application. However, as you can imagine, such code repetition is error prone and needlessly complicates the code. Here again, a platform-specific solution that enables exception translation in one place could lead to much cleaner and easily maintainable code.
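The portable wrapping technique described above can be sketched as follows. All names here (LibraryError, AppError, library_parse, parse_translated, describe_parse) are hypothetical stand-ins for a real library and a real application; the point is only to show the try/catch block that every call site would have to repeat or route through.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Hypothetical exception thrown by a third-party library.
struct LibraryError {
    int code;
};

// Hypothetical application-wide exception type.
class AppError : public std::runtime_error {
public:
    explicit AppError(const std::string& what) : std::runtime_error(what) {
        assert(!what.empty());  // application errors are expected to carry a message
    }
};

// Hypothetical library entry point that fails on negative input.
inline int library_parse(int input) {
    if (input < 0) throw LibraryError{42};
    return input * 2;
}

// The wrapper: intercepts the library's exception and throws our own.
inline int parse_translated(int input) {
    try {
        return library_parse(input);
    } catch (const LibraryError& e) {
        throw AppError("library failed with code " + std::to_string(e.code));
    }
}

// Application code only ever has to deal with AppError.
inline std::string describe_parse(int input) {
    try {
        return std::to_string(parse_translated(input));
    } catch (const AppError& e) {
        return e.what();
    }
}
```

With many library entry points, a wrapper like parse_translated() has to be written for each of them, which is the repetition the text warns about.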

The last exception-handling problem I frequently encounter is a VC-specific one. As you probably know, the catch(...) clause in VC can potentially catch ordinary processor exceptions (such as access violations, divisions by zero, and so on) together with C++ exceptions. In some situations, such behavior is desirable, but in others, it could lead to very bad results. If your application experiences an unexpected processor exception on your client's machine, the result is typically the famous GPF dialog. The Dr. Watson debugger will usually pop up, and you will at least have its log specifying what happened. In many situations, this will enable you to diagnose and solve the problem. However, if by any chance the processor exception is trapped by the catch(...) clause somewhere up the calling stack, all the information about what happened will be lost. Even worse, the catch(...) handler is usually written with only C++ exceptions in mind, so the most likely scenario will be that your application will happily report some "unknown error," continue to run, and completely crash in some unrelated place. One possible solution to this problem could be to use the _set_se_translator() function to enable the translation of processor exceptions to C++ (see the MSDN library help for details). However, this approach has some deficiencies. For example, it would require your application to forego a "synchronous" exception-handling model in favor of an "asynchronous" one, increasing the executable size and hurting performance. This article will present a different solution that will work regardless of the exception-handling model in use.

In what follows, I will show you one possible approach to solving these problems in Visual C++ 6.0 and .NET. There are very few differences between the two versions regarding their exception-handling implementation. Therefore, unless specifically mentioned, the following text will apply to both versions and I will use the term VC to mean both of them.

If one wishes to know where the exception came from, the obvious thing to do would be to try to intercept the throw keyword itself. The following is typical code produced by the VC compiler for a throw 1; statement:

mov    dword ptr [ebp-4],1
push   offset __TI1H
lea    eax,[ebp-4]
push   eax
call   __CxxThrowException@8

If you are not familiar with x86 assembly, here is what is going on. First, the value 1 is placed into a temporary on the stack. Second, the function __CxxThrowException@8 is called with two parameters passed on the stack. The first parameter is a pointer to the aforementioned temporary and the second is a pointer to some undocumented structure, __TI1H. The name __CxxThrowException@8 reveals that this function is actually called _CxxThrowException(); it uses the __stdcall calling convention and C name-mangling rules (extern "C"). This function is clearly the one that actually throws the exception, and it is provided as part of the C++ run-time library. This means that if we were to reimplement this function in our application, it could be possible to intercept any exception ever thrown. The following naïve attempt to do so:

extern "C" void _CxxThrowException(void *,void *)
{
}

is greeted by this compiler error:

error C2733: second C linkage
    of overloaded function
    '_CxxThrowException' not allowed

The VC compiler apparently knows about the signature of _CxxThrowException() without any special header files. Our simple signature (void *, void *) is clearly incorrect and there is no header to give us the correct one. When I initially encountered this problem, I had only VC 6 at my disposal and, after a long time trying to guess the correct signature, I admitted defeat. The only way I could find to insert this function into my application was to create an assembler source file and implement _CxxThrowException() there (forwarding the call to the actual C++ implementation with another name). I will spare you the gory details because VC.NET came around, and life is much simpler now. The VC.NET improved debugger actually gives the exact signature of _CxxThrowException() as well as the precise declarations of the undocumented second parameter it uses. Using this information and the results of _CxxThrowException() disassembly, it is easy to create boilerplate code that simply does what the original VC code did (Listing 1). Before we modify this code to suit our needs, let's take a look at what it does.

As you probably know, the VC implementation of exception handling is based on Structured Exception Handling (SEH), which is a part of the Windows architecture and Win32 API. (If you are not familiar with SEH, see the References for an introduction.) Therefore, it is not surprising that _CxxThrowException() ends up calling the RaiseException() API. What is interesting are the parameters that are given to this API. The exception code is 0xE06D7363. This code identifies this SEH exception as the Microsoft C++ exception. (The last 3 bytes form the string "msc" in ASCII.) You cannot change this value since the code generated by VC for catching C++ exceptions checks it. The flags parameter for this exception is EXCEPTION_NONCONTINUABLE because C++ exceptions do not support continuation. The information about the object being thrown is passed in the exception arguments array. This array holds exactly three entries. The first one is the magic number 0x19930520. Whoever created VC exception-handling code probably did it on May 20, 1993. You cannot change this value either, since it is also checked by catching code. The second array member holds the pointer to the object being thrown and the third is the pointer to the _s__ThrowInfo structure that describes this object. You do not need to (and cannot) declare this structure since the compiler knows about it (just as it knows the declaration of _CxxThrowException()). Listing 2 contains its probable declaration together with other built-in structures it uses, wrapped in C++ comments to assist you in working with it. All these declarations were obtained by using the VC.NET debugger. For example, to see what _s__ThrowInfo looks like, just type (_s__ThrowInfo*)0 in the debugger's watch window and expand the result. In the comments, you will also find my attempts to reconstruct the meaning of the different structure members. Most of them are not relevant to this discussion and are of interest only if you wish to reimplement the catching of exceptions, too. The ones that interest us here are briefly described in Table 1.
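To make the numbers above concrete, here is a small portable sketch that models the exception code and the three-entry argument array without calling RaiseException() at all. The constants are the values quoted in the text; the helper names (exception_code_tag, make_throw_arguments, looks_like_cpp_exception) are hypothetical and do not come from the article's listings.

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <string>

// Values quoted in the text.
constexpr std::uint32_t  kCppExceptionCode  = 0xE06D7363u;
constexpr std::uintptr_t kCppExceptionMagic = 0x19930520u;

// Reads the three low bytes of the code as ASCII characters.
inline std::string exception_code_tag(std::uint32_t code) {
    std::string tag;
    for (int shift = 16; shift >= 0; shift -= 8)
        tag += static_cast<char>((code >> shift) & 0xFF);
    return tag;
}

// Hypothetical helper: lays out the argument array in the order the text
// describes (magic number, thrown object, throw-info pointer).
inline std::array<std::uintptr_t, 3> make_throw_arguments(const void* object,
                                                          const void* throw_info) {
    // Either both pointers are set, or both are null (the rethrow case).
    assert((object == nullptr) == (throw_info == nullptr));
    return {{ kCppExceptionMagic,
              reinterpret_cast<std::uintptr_t>(object),
              reinterpret_cast<std::uintptr_t>(throw_info) }};
}

// Models the two checks that, per the text, the catching code performs.
inline bool looks_like_cpp_exception(std::uint32_t code,
                                     const std::array<std::uintptr_t, 3>& args) {
    return code == kCppExceptionCode && args[0] == kCppExceptionMagic;
}
```

In a real replacement for _CxxThrowException(), an array laid out this way would be handed to RaiseException() together with the code and the EXCEPTION_NONCONTINUABLE flag.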

Armed with knowledge of these structures, we can achieve our first goal: to log the exception being thrown. The code that does this is given in Listing 3. To make the code as generic as possible, I introduced the hook function PreviewException(), which is called from _CxxThrowException() and is given all the interesting information. The trivial implementation of PreviewException() given here simply prints this information to standard output. You can reuse this code by writing your own version that logs to your project-specific repository. The _CxxThrowException() function begins with determining the address it was called from. This is processor-specific stuff, so if you write your code for Itanium you will need to modify this part. On x86 architecture, you can reliably determine the return address of a function only if its frame pointer was not optimized. To ensure this, #pragma optimize turns off the frame pointer optimization for _CxxThrowException(). The return address we obtain points to the next assembler instruction after:

call        __CxxThrowException@8

Since our goal is to determine the address that would correspond to the throw statement itself, we need to subtract the size of the call instruction from the return address. This size is always 5 bytes, so by subtracting 5 from the return address we can get the address of the call instruction. Using your favorite form of converting addresses to source-code lines (map files, John Robbins' excellent CrashFinder tool [see References], direct manipulation of pdb files, and so on), you can always obtain the source-code line that corresponds to this address. If everything is done correctly, this would be the line that contains the throw instruction in question.
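The arithmetic above, followed by an address-to-line lookup, can be sketched in portable code. The 5-byte call size is the figure given in the text; the LineEntry table is a hypothetical stand-in for whatever a map file or debugging tool would provide, and the function names are illustrative only.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Size of the call instruction, as stated in the text.
constexpr std::uintptr_t kCallInstructionSize = 5;

// Address of the call instruction that produced 'return_address'.
inline std::uintptr_t call_site_from_return_address(std::uintptr_t return_address) {
    assert(return_address >= kCallInstructionSize);
    return return_address - kCallInstructionSize;
}

// Hypothetical line-table entry: first code address generated for a source line.
struct LineEntry {
    std::uintptr_t address;
    int line;
};

// Finds the source line whose code range contains 'address'.
// 'table' must be sorted by address; returns -1 if the address precedes every entry.
inline int line_for_address(const std::vector<LineEntry>& table, std::uintptr_t address) {
    auto it = std::upper_bound(
        table.begin(), table.end(), address,
        [](std::uintptr_t value, const LineEntry& entry) { return value < entry.address; });
    if (it == table.begin()) return -1;
    return (it - 1)->line;  // last entry whose address is <= 'address'
}
```

Looking up the return address itself could report the line after the throw when the call is the last instruction generated for its line, which is why the subtraction comes first.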

The next step is to determine the calling function's stack frame. Though my simple example does not make any use of this parameter, it could be used to recreate the entire "calling stack" that led to the function that invoked throw. For sample code, look at the DebugHandler sample in Microsoft Platform SDK [3] and the StackWalk() API. (If you installed the Platform SDK samples, this one would probably be at %MSSdk%\Samples\WinBase\Debug\DebugHandler.) After that, I extract the type_info pointer for the first type of the thrown object. The first type in the type array will always be the exact type thrown, and this is what we are interested in. Note the very important check for NULL pointers. As you may know, there is one case in the C++ language when the exception object is not specified in the throw statement, namely when you rethrow an exception from a catch block. In such cases, both parameters to _CxxThrowException() will be NULL, so we need to take this possibility into account. Finally, PreviewException() is called with four parameters: the pointer to the thrown object, the address of the throw instruction, the pointer to the caller's frame, and the type_info pointer for the object type.
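A hook with the four parameters just listed might be shaped like the sketch below. This is not the article's Listing 3: the name preview_exception_sketch, the choice to return a string instead of printing, and the exact message format are all assumptions made for illustration. The part worth noting is the NULL check that treats a missing object or type as a rethrow.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <typeinfo>

// Hypothetical stand-in for the PreviewException() hook described in the text.
// Parameter order follows the text: thrown object, address of the throw,
// caller's frame, type_info of the thrown type.
inline std::string preview_exception_sketch(const void* object,
                                            const void* throw_address,
                                            const void* caller_frame,
                                            const std::type_info* type) {
    assert(throw_address != nullptr);  // every throw has a call site
    (void)caller_frame;                // unused here; could seed a stack walk

    std::ostringstream out;
    if (object == nullptr || type == nullptr) {
        // 'throw;' inside a catch block: nothing is known about the object.
        out << "rethrow at " << throw_address;
    } else {
        // type->name() is implementation-defined text, good enough for a log.
        out << "throw of " << type->name() << " at " << throw_address;
    }
    return out.str();
}
```

A production version would write to the project's log rather than build a string, and would receive the type_info pointer extracted from the throw-info structure rather than from typeid.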

As you can see, with just a few lines of code we have achieved a lot. No matter what third-party libraries you use and what C++ exception they throw, all the exceptions will be routed through our function and reported. However, this just scratches the surface of what we can do now. Let us attempt to "translate exceptions;" that is, replace one thrown exception with another. The clean and elegant way to do so is shown in Listing 4. Just for fun, this example replaces any std::exception with an int. Since the body of our boilerplate _CxxThrowException() from Listing 1 is the actual throw operation, we just wrap it inside a try{} catch() {} and perform the exception translation inside the catch block.
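The wrapping idea can be illustrated with a portable analog. The sketch below is not Listing 4 and does not replace _CxxThrowException(); instead, a hypothetical function throw_translated() takes a callable that performs the "original" throw, wraps it in try/catch, and replaces any std::exception with an int, as the article's example does. The helper caught_kind() exists only to show the effect.

```cpp
#include <cassert>
#include <exception>
#include <stdexcept>
#include <string>

// Hypothetical analog of the translating throw function: 'raise_original'
// stands in for the boilerplate body that performs the actual throw.
template <typename Raise>
[[noreturn]] void throw_translated(Raise raise_original) {
    try {
        raise_original();
    } catch (const std::exception&) {
        throw 1;  // any std::exception is replaced with an int
    }
    // Anything that is not a std::exception propagates unchanged.
    // Reaching this point means raise_original returned, which is a bug.
    assert(false && "raise_original must throw");
    std::terminate();
}

// Reports which kind of exception a caller ends up seeing.
inline std::string caught_kind(bool throw_std_exception) {
    try {
        if (throw_std_exception)
            throw_translated([] { throw std::runtime_error("original"); });
        else
            throw_translated([] { throw 2.5; });
    } catch (int) {
        return "int";
    } catch (double) {
        return "double";
    }
    return "none";
}
```

Because the translation lives in one function, no call site needs its own try/catch block, which is the advantage claimed for the approach.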

Another possibility is to replace the function used to raise the SEH exception inside _CxxThrowException(). If you are writing a non-Win32 application (such as an NT native application or a device driver; see References), you cannot call the RaiseException() API. In such a case, you could replace it with the exception-raising function that exists in your environment. For the case of native NT applications, this would be RtlRaiseException() and for the drivers, probably ExRaiseException() functions (both of these functions are undocumented). This means that for the first time it would be possible to use C++ exceptions (and thus, libraries that require them) in these "hostile" environments.

What you should never do with a custom _CxxThrowException() function is try to return to the caller. The VC compiler (not to mention the calling code itself) does not expect this function to return, and such an attempt will probably crash the application.

Another thing to be careful with is the DLL version of the C run-time library. If you use it, you must define _CxxThrowException() in each executable (EXE and DLL) module of your application. For example, if you have an executable and third-party DLL that share the same C run-time library (CRT), you will not be able to make the third-party DLL use your _CxxThrowException(). It will continue to use the version it was linked against, that is, the one provided in CRT DLL. There are multiple ways to overcome this problem. One solution could be a run-time patch of the CRT DLL code. The source code that accompanies this article contains a function DllPatchThrow() that attempts to do this.

Finally, let us turn to solving the catch(...) problem. You will notice that if we had the means of discovering what kind of exceptions were actually caught from inside the catch(...) handler, we wouldn't have a problem at all. C++ exceptions could be handled as usual and unexpected processor exceptions could be given special treatment (for example, logging and application termination). So the question is: How do we determine what we have caught inside the catch(...) handler? After examination of the code that performs actual exception handling, I have found that the information about the exception currently being handled is actually stored and accessible to your handler code. The reason for storing this information is simple. If you rethrow an exception from a catch block, the compiler generally does not know what the exception object is. Therefore, as I have mentioned before, it just calls _CxxThrowException() with NULL parameters. When such a "NULL exception" is processed by catching code, it realizes that it is a rethrow and fetches the information about the actual exception object from the place it was stored before.

When you are using the single-threaded version of CRT, a pointer to the EXCEPTION_RECORD structure describing the current exception is stored in a global variable called _pCurrentException. The declaration of this variable is:

extern struct EHExceptionRecord * _pCurrentException;

The EHExceptionRecord is just another fancy built-in structure that is basically a wrapper around Win32's EXCEPTION_RECORD. (A full description of the EXCEPTION_RECORD structure can be found in the MSDN library help.) For all practical purposes, you can safely cast this pointer to EXCEPTION_RECORD *.

When you use the multithreaded CRT, the story is a little more complicated. Since exceptions are per-thread entities, the CRT has to hold information about them in a thread-specific data structure. The CRT has a special structure allocated for every thread (including the main one) that holds all the per-thread information it needs. The structure is called _tiddata and it is actually documented in the run-time library sources provided with VC. (Look in the mtdll.h header. You need to install the C run-time library source code to have access to this file. It is not installed by default, so if you cannot find the file, run the VC installation again, and make sure this feature is selected.) The run-time library holds pointers to this structure in thread local storage (TLS). The _tiddata for the main thread is allocated at program startup. When you create additional threads via _beginthread() or _beginthreadex(), they receive their own copy, too. If a thread doesn't have a _tiddata structure (for example, when it is created by a call to the CreateThread() API), it will be allocated the first time it is needed. However, the structure will never be freed. This is why all the good books state that calling CreateThread() in C/C++ applications will usually result in a small memory leak. The _tiddata structure changed slightly between VC 6 and .NET. However, in both versions, the member called _curexception holds a pointer to the EXCEPTION_RECORD structure that describes the currently handled exception. To access the _tiddata structure you must use the _getptd() function defined by the CRT. This function's declaration is not provided in VC public headers so you will need to provide your own. Unfortunately, this function is not exported from the DLL version of the CRT. This limits the following discussion to the non-DLL multithreaded version of the CRT. Listing 5 provides all the necessary declarations and definitions. I merged the VC 6 and .NET versions of _tiddata but left Microsoft's original comments for its fields.

To hide the implementation details, I defined the function GetCurrentExceptionRecord(), which can be transparently used to obtain the EXCEPTION_RECORD pointer on both single and multithreaded CRTs. Armed with this function, we can easily determine what kind of exception we are processing inside the catch(...) handler. It is enough to look at the ExceptionCode field of the returned EXCEPTION_RECORD structure. If it is equal to 0xE06D7363, then this is a C++ exception. All other values mean that this is an ordinary SEH exception. See Listing 6 for sample code that distinguishes between the two. As an added bonus, we can use the EXCEPTION_RECORD structure to see what kind of C++ exception was thrown. The ExceptionInformation array will hold the same array we have seen _CxxThrowException() pass to the RaiseException() API. When handling an SEH exception, you can use the EXCEPTION_RECORD structure to obtain more information about where the exception came from (ExceptionAddress field).
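The decision made inside the catch(...) handler can be sketched in portable code as follows. This is not Listing 6: ExceptionRecordView is a hypothetical, trimmed-down mirror of the two EXCEPTION_RECORD fields mentioned above, and classify() and should_terminate() are illustrative names. In real code the record would come from GetCurrentExceptionRecord().

```cpp
#include <cassert>
#include <cstdint>

// The C++ exception code quoted in the text.
constexpr std::uint32_t kMsvcCppExceptionCode = 0xE06D7363u;

// Hypothetical mirror of the EXCEPTION_RECORD fields used here.
struct ExceptionRecordView {
    std::uint32_t ExceptionCode;
    const void*   ExceptionAddress;
};

enum class CaughtKind { CppException, StructuredException };

// Distinguishes a C++ exception from any other SEH exception by its code.
inline CaughtKind classify(const ExceptionRecordView* record) {
    assert(record != nullptr);  // assumed to be called from inside a handler
    return record->ExceptionCode == kMsvcCppExceptionCode
               ? CaughtKind::CppException
               : CaughtKind::StructuredException;
}

// One possible policy: handle C++ exceptions as usual, but log and
// terminate on processor exceptions such as access violations.
inline bool should_terminate(const ExceptionRecordView* record) {
    return classify(record) == CaughtKind::StructuredException;
}
```

For example, a record carrying the code 0xC0000005 (EXCEPTION_ACCESS_VIOLATION) would be classified as a structured exception, and its ExceptionAddress field could be logged before the application shuts down.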

As you can see, with just a few lines of code you can add some very useful things to your arsenal of exception-handling techniques. However, to do so you must sacrifice portability of your code. If you solely use Visual C++, I hope the material in this article is useful to you.

References

Pietrek, M. "A Crash Course on the Depths of Win32 Structured Exception Handling," Microsoft Systems Journal, January 1997. http://www.microsoft.com/msj/0197/exception/exception.htm.

Robbins, J. Debugging Applications, Microsoft Press, 2000, ISBN 0-7356-0886-5.

Russinovich, M. "Inside the Native API," http://www.sysinternals.com/ntw2k/info/ntdll.shtml.


Eugene Gershnik has been programming for the past 12 years, and specializes in C, C++, Windows development, networking, and security. He holds a B.S. in physics from the Technion, Israel Institute of Technology. Currently, he is working in the eTrust (network security) division of Computer Associates Inc. and can be reached at [email protected].

