Noam is chief technology officer at Vsoft. He can be contacted at [email protected].
During the development process, it is useful to know the context in which certain errors occur. For example, when error messages are displayed, it helps to know the tracks leading to the errors. Likewise, when applications crash, the postmortem data (available using Dr. Watson, for instance) in the stack can save the day.
But collecting call stack data only when the application crashes has a limitationsometimes the call stack is corrupted and only the top of the stack is visible. This is how Dr. Watson works. Luckily, there is an alternative: You can collect the call stack while it's being created. This is what profilers do. As you might expect, this method also has a deficiencycontinuously collecting and storing call stack data consumes resources.
When call stack information is available during the actual advancement of the program and not as a snapshot, the data can be used to identify patterns of calling functions. Normally, you would not want to see the very long trail of function calls leading to the current callexcept when this provides a solution to the bug.
During the development of a C++ application, we needed more data than standard debugging tools provided, including:
- Content of the call stack when error messages were logged.
- Content of the call stack when fatal exceptions were thrown.
- Function call trees showing the history of function callsup to the current time.
All these data were needed on a per-thread basis.
Of course, the first two requirements can be satisfied in part with the Win32 API StackWalk() function. However, we decided not to use this API call because it is too complex, not available on all platforms, and the results aren't detailed enough. (For more information on StackWalk(), see John Robbins's "Bugslayer" column Microsoft Systems Journal, February 1999; http://www.microsoft.com/msj/0299/bugslayer/bugslayer0299.htm).
Consequently, we ended up building a tool that collects the information in real time, then displays the results. To collect the data, we needed to keep every function entry/exit. To do this, we used the constructor/distractor of auto variable to do all the hooking work; see Listing One. This approach is easy to implement, platform independent, lets you easily decide which functions not to collect, and works correctly even when exceptions are thrown. On the downside, it requires that you manually insert code into the application under test.
An alternative function hooking relies on the compiler to create a hook function for the entry/exit of each function. Microsoft's Visual C++ 6 enables the call to _penter() for every function call (using the /GH command line). In Visual C++ 7, the /GH flag also generates the _pexit() function call. This approach requires no changes to the application source code and works for C sources as well. However, it does rely on compiler-specific implementations and necessitates the use of a symbol engine.
Every function entry/exit is reported to a singleton object. I implemented it as a COM (INPROC) server, though this isn't mandatory. During the first function entry report (IStackTrace::Add()), storage is allocated and a pointer to it is saved in the Thread Local Storage (TLS). Listing Two is the pseudocode for Add(). The main body of code is in StackTrace.dll and contains the COM object that registers the function calls. Use of this COM object is done with the aid of classes and macros defined in stackTrace.h (available electronically; see "Resource Center," page 5).
To use the stack trace, you need to:
1. Register the stackTrace.dll.
2. Set the registry flag to configure the stack tracer.
3. Add #include "stackTrace.h" to the .cpp file(s).
4. Add STACK_TRACE("function name") as the first line in every function to be monitored (Listing One). I wrote a macro for VisualStudio 6 that eliminates manual typing; Listing Three is sample code that uses the macro.
At any time during the run (that is, when an error is logged), the content of a stack trace can be dumped (for each thread, or for all threads). If dumping function call tree to file is enabled, a file for each thread is created. The file (which is in XML) shows the history of function calls. As Figure 1 shows, I used XML to enable XML tree viewers. This lets me easily collapse/expand large parts of the full call history. I then browsed the XML file using Frank Le's XmlTreeView (http://www.codeproject.com/soap/xmltreeview.asp). Figure 2 is a sample logger output with StackTrace data (note the parameter value changing in the hex dump attached for each function call).
Since C++ guarantees that destructors of automatic variables are called even if an exception is thrown, the registration of "exiting function" will not be skipped. Actually, it is so clean that no information on the exception is visible in the trace (unless you know to expect a certain function call order).
Performance Issues
Although this tool was developed for debuggingnot retail buildsI still kept a close eye on performance. Writing the function data to the TLS requires copying about 30 bytes and an STL::map::find() operation. Since each thread has its own data structure, no synchronization objects are used, thereby avoiding nasty congestions.
Writing function call history data to file is much slower (though most of the WriteFile() calls are done to the filesystem cache). To avoid slowing the application, a special worker thread in idle priority writes the function call trace to files. For each thread, a separate file is created, thus, the traces are not mixed. When IStackTrace::Add() is called, it checks if function trace is enabled. If so, it posts a message to the writer thread and returns immediately. The same is done in IStackTrace::Remove(). When the application terminates, the thread receives a TIME_TO_DIE message, closes the files, and terminates cleanly.
Hooking Entry and Exit
This version uses an explicit code to report the function name (as a string). An alternative, used by code profilers (such as Numega TrueTime), is to parse the binary code, and inject hooks to it. The best alternative is to use compiler supplied hooks. In this hook_penter() and _pexit()the function address is written to the registrar. A symbol engine can then convert the address to a well-formed function name. This requires compilation with debug symbols. The technique is thoroughly described by John Robbins in his article on creating the Smooth Working Set tool (MSDN Magazine, October and December 2000; http://msdn.microsoft.com/msdnmag/ issues/1000/bugslayer/bugslayer1000.asp). This technique seems to be the ideal solution, but since I use VC 6, the compiler supplies only the _penter(). I tried injecting my own _pexit() from it but failed. A full explanation is beyond the scope of this article.
Conclusion
Admittedly, I do not use stack traces every day. For those sneaky bugs, however, I need the stack trace to be reliable and supply as much data as possible. The function call history was added at a later stage. At first we were skeptical of its usabilityafter all, no one wants to browse through hundreds of thousands of lines. We learned quickly that it is quite easy to find the interesting portion and get to the usual "Hey, why is that call done here in the sequence?!" ritual.
Acknowledgment
Thanks to John Robbins for his help with the function hooks.
DDJ
Listing One
#define DECLARE_STACKTRACE StackTrace g_StackTracer; // helper class for making life easier for the users of the StackTrace // When entering a function, simply construct an object of this class class AutoStackTrackEntry { public: AutoStackTrackEntry(IN const char* message ,const char* _ebp) { g_StackTracer.Add(message,_ebp); } ~AutoStackTrackEntry() { g_StackTracer.Remove(); } }; // macro to make life even easier using AutoStackTrackEntry. // You should have #define DECLARE_STACKTRACE // before any use of the STACK_TRACE() macro #define STACK_TRACE(msg) \ unsigned int _ebp; \ {__asm mov _ebp,ebp } \ AutoStackTrackEntry _stackEntry(msg, reinterpret_cast<char*>(_ebp + 4))
Listing Two
Add( function_name, args_buffer) { tls_index = GetTlsForThread(GetCurrentThreadId()) ; if(NOT_FOUND == tls_index) { allocate structure for the call stack, put it in the TLS; insert the mapping 'tid -> tls' to the map. } StackTraceStruct *pStruct = data_block_from(tls_index); Update pstruct with function_name, args_buffer If(should write function trace) Notify worker thread with the current parameters. }
Listing Three
int Tester::g(int k) { STACK_TRACE("Tester::g"); ... }