Saving the Stack to the Log File
The task of writing all return addresses from the stack to the log file is done by the function LogStackFrames
(Listing Four). Each stack frame's return address is prefixed with the tag "$ADDRESS:"
. MapAddr.exe just looks for these entries and interprets them as program addresses.
Listing Four: LogStackFrames.
void LogStackFrames(PVOID FaultAddress, char *eNextBP) { char *p, *pBP; unsigned i, x, BpPassed; static int CurrentlyInTheStackDump = 0; ... BpPassed = (eNextBP != NULL); if(! eNextBP) { _asm mov eNextBP, eBp } else fprintf(sgLogFile, "\n Fault Occurred At $ADDRESS:%08LX\n", (int)FaultAddress); // prevent infinite loops for(i = 0; eNextBP && i < 100; i++) { pBP = eNextBP; // keep current BasePointer eNextBP = *(char **)pBP; // dereference next BP p = pBP + 8; // Write 20 Bytes of potential arguments fprintf(sgLogFile, " with "); for(x = 0; p < eNextBP && x < 20; p++, x++) fprintf(sgLogFile, "%02X ", *(unsigned char *)p); fprintf(sgLogFile, "\n\n"); if(i == 1 && ! BpPassed) fprintf(sgLogFile, "*************************************\n" " Fault Occurred Here:\n"); // Write the backjump address fprintf(sgLogFile, "*** %2d called from $ADDRESS:%08LX\n", i, *(char **)(pBP + 4)); if(*(char **)(pBP + 4) == NULL) break; } ... }
LogStackFrames
also prints the first 20 bytes of that area of the stack where arguments to the functions are stored. The determination of whether these are really arguments or local variables would need further inspection by the assembler codeor a look at the source code. According to the language specification, C and C++ functions put their arguments on the stack from right to left, whereby the address following the return address always holds the first argument of the function, if any. Interpreting these values requires a little bit of thinking, but usually this "manual" approach is sufficient.
Although Windows offers a function StackWalk(...)
to walk the stack, for simplicity, this is done using simple pointer dereferencing. If LogStackFrames
is called from an assert
(where BP
is unknown), one line of inline x86 assembler is used to get the value of the BP
register.
MapAddr.exe
The program MapAddr.exe is called from the command line with two argumentsa log file path and a map file path. It first reads the map file and builds a sorted list of all addresses in it. It then creates a copy of the log file (with the suffix ".out") where, for each stack address found by the tag "$ADDRESS:"
, it inserts a line with the appropriate function name. It also prints the offset of the return address from the function start (extracting the actual line within the function from the address would be a more difficult taskalthough not impossible). The result is the log file including function names, as in Figure 1. The function names are mangled with their arguments, as all sources for this article are C++. Therefore, the function
int FunctionC(int x, int y)
becomes
?FunctionC@@YAHHH@Z
in the map file of Visual C++ 6.0. If you use the Visual C++ source browser, you can use the goto
definition
command to jump to the function body while the cursor is positioned on this mangled name.
Analyzing Client Program Faults
By adding one function call to an application, you can now analyze a problem postmortem that occurred on any machine anywhere else. You now know in which function the bug occurred and from where it was called. This certainly is an improvement, but it may only be one piece of an overall strategy for analyzing program faults on client machines. There are a number of considerations:
- Asserts. To find hidden bugs early, use as many asserts as possible and leave them in the code in the shipping release versions. In the release version, an assert should enable postmortem debugging by calling the function
LogStackFrames
with both parametersNULL
(which will determineBP
by an assembler call). Clearly, an assert should always give a message as loudly as it can. "Loud" in this context may mean a modal dialogit can also mean creating an e-mail or any other kind of signal in the case of a server application. - Sometimes asserts that fail may not be errors at all, leading to bug reports that are created solely by an assert. However, that problem is outweighed by the advantage of finding serious bugs quickly and early. Some people argue that there are performance reasons for leaving asserts out in release versions. In most cases, this argument is moot. Performance problems almost always have only two solutionsa profiler and/or better design.
- Log files. Finding all the ways to the problem source often requires further information. Therefore, an application should always create a log file on startup and fill it with information about a program's basic conditions: the program and exact OS version, amount of memory, loaded documents, and so on. During execution, all commands should be logged together with their parameters. When a fault occurs, the current state (free memory, global application states, and the like) is of interest together with the stack frames. All this will lead to more success in locating the problem source.
- Although writing log files is good in general, it can sometimes create problems. I saw a server application writing gigabytes of log files over the course of a few days until the disk was full. Log file writing requires control mechanisms (such as ring logging) that prevent such problems. Besides that, it's always helpful if the number of log statements written can be modified by users.
- Automation. Using MapAddr.exe in a professional context, of course, implies that all map files of all versions must be archived and accessible for postmortem problem analysis. Also, in a professional context, you may write batch scripts (or extend MapAddr.exe), so that the program version is extracted from the log file and the appropriate map file is selected automatically.
Conclusion
No application is error free. The postmortem analysis approach I present here helps you identify and resolve hidden and unreproduceable errors with little impact on the program itself. The technique also reduces the problem of sometimes unclear and guess-based user reports. Using postmortem debugging together with logging and asserts, you can clearly see what happened.
Stefan is a software developer in Berlin specializing in cross-platform development, user interfaces, and reengineering. He can be contacted at [email protected].