Designing a Watchdog
Figures 1-5 show one way to design a simple virtual memory analyzer/watchdog component that helps your program stay alive, at least long enough to get some diagnostic information out, if not longer. The idea is to track virtual memory as the various components in your program reserve or commit it. For this to work, you need code that intercepts the system calls responsible for reserving, committing, and freeing virtual memory. On some operating systems, the choice of which calls to intercept is straightforward: On Linux, for example, virtual memory regions are created via calls to mmap() and released via calls to munmap(). On other operating systems, such as Windows, no documented API function is responsible for creating all of the virtual memory regions used by your process. However, there is an API function, VirtualAlloc(), which you can use to create some regions. If you debug into VirtualAlloc(), you'll reach an exported function that's called for most, if not all, of the regions your program creates. On current versions of Windows, including Vista and XP, this function is called NtAllocateVirtualMemory(). It is paired with NtFreeVirtualMemory(), which is invoked to release regions.
There are numerous ways to arrange function interception. A simple approach is to replace the first few bytes of the function to be intercepted with an instruction, such as a jump, that passes control to a routine that you'd like to invoke whenever that function is called. Your routine can then restore the first few bytes of the intercepted function, call it with its original parameters, intercept it again, and then do any processing that you have in mind. This simple approach can meet the interception needs of the virtual memory analyzer/watchdog concept in Figures 1-5. Listing One (available at www.ddj.com/code/) does all of this on x86 architecture systems running Windows. Note that the code reached via the jump instructions can be improved for multithreaded programs if you add some form of serialization mechanism, so that the target system calls are always intercepted when a new thread comes along. Some suggestions regarding the placement of synchronization calls are provided in Listing One, which also sets up a handler for out-of-memory exceptions. The routines called by this handler make use of the information tracked via the intercepted functions that create and release virtual memory regions.
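Listing One is not reproduced here, but the byte-patching step can be sketched in isolation. The example below builds the 5-byte x86 near jump (opcode 0xE9 followed by a 32-bit displacement measured from the end of the instruction) and applies it to a plain byte buffer while saving the original bytes. The function names are hypothetical. Patching live code would also require making the page writable first, for example with VirtualProtect() on Windows or mprotect() on Linux, which this sketch leaves out.

```c
#include <stdint.h>
#include <string.h>

#define JMP_REL32_SIZE 5   /* one opcode byte plus a 4-byte displacement */

/* Build the patch for a jump located at address `from` that lands on `to`.
   Returns 0 on success, or -1 if the distance does not fit in a signed
   32-bit displacement. The check is conservative: it treats addresses as
   64-bit and does not rely on 32-bit address wraparound. */
int build_jmp_patch(uint8_t patch[JMP_REL32_SIZE], uint64_t from, uint64_t to)
{
    /* The displacement is relative to the instruction after the jump. */
    int64_t rel = (int64_t)(to - (from + JMP_REL32_SIZE));
    if (rel < INT32_MIN || rel > INT32_MAX)
        return -1;

    uint32_t u = (uint32_t)(int32_t)rel;
    patch[0] = 0xE9;                    /* JMP rel32 */
    patch[1] = (uint8_t)(u & 0xFF);     /* little-endian displacement */
    patch[2] = (uint8_t)((u >> 8) & 0xFF);
    patch[3] = (uint8_t)((u >> 16) & 0xFF);
    patch[4] = (uint8_t)((u >> 24) & 0xFF);
    return 0;
}

/* Save the bytes currently at `site`, then overwrite them with `patch`.
   Calling it again with the saved bytes as `patch` restores the original. */
void apply_patch(uint8_t *site, const uint8_t patch[JMP_REL32_SIZE],
                 uint8_t saved[JMP_REL32_SIZE])
{
    uint8_t tmp[JMP_REL32_SIZE];
    memcpy(tmp, site, JMP_REL32_SIZE);
    memcpy(site, patch, JMP_REL32_SIZE);
    memcpy(saved, tmp, JMP_REL32_SIZE);
}
```

The restore, call, and re-intercept sequence described above then amounts to calling apply_patch() with the saved bytes, invoking the original function, and calling apply_patch() once more with the jump.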
You can arrange the interception of system calls to take place automatically when your virtual memory analyzer/watchdog module loads. This is accomplished (on Windows) in Listing One by doing the interception within the module's DllMain() call. That way, only one line of code is needed to load the module and kick off its virtual memory tracking mechanism on the fly. On Windows, the relevant line of code is a LoadLibrary() call; see Listing Two (also available at www.ddj.com/code/). After this call, the virtual memory allocation/deallocation calls in Listing Two are intercepted. Alternatively, you can dispense with the LoadLibrary() call altogether and link your virtual memory analyzer/watchdog module statically. If you statically link your watchdog to a component that loads at the beginning of each run, that causes virtual memory tracking to start early in the run, giving your watchdog more regions to choose from if virtual memory runs low.
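For readers on ELF platforms, a rough analogue of doing the work in DllMain() is a constructor function, which GCC and Clang run when the module is loaded. The sketch below assumes one of those compilers. The install_interceptors() routine is a hypothetical placeholder for the real patching code, not something taken from Listing One.

```c
/* Flag set once the (placeholder) interception has been installed. */
static int watchdog_active = 0;

/* Hypothetical stand-in for the code that patches the allocation calls. */
static void install_interceptors(void)
{
    watchdog_active = 1;
}

/* GCC/Clang extension: runs automatically when the module is loaded,
   before main() for a statically linked module. */
__attribute__((constructor))
static void watchdog_on_load(void)
{
    install_interceptors();
}

/* Lets the rest of the program check whether tracking has started. */
int watchdog_is_active(void)
{
    return watchdog_active;
}
```

As with DllMain(), the program needs no explicit setup call: loading or linking the module is enough to start tracking.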
Figures 1-3 describe virtual memory tracking routines that can be called from the intercepted functions. The effectiveness of these routines depends on what percentage of the actual unused virtual memory regions or pages have been tracked by the time excessive memory pressure brings your program to a halt. For that reason, the interception should be done at the lowest possible level, to catch as many regions as possible. It should also be set up as early as possible during the run. The regions can be tracked in a list that's ordered by the regions' base addresses. In Figures 1-3, some data items are associated with each tracked region. These data items include the call chain leading to the creation of the region, and a timestamp. The timestamp is used when the program runs out of memory, to pick a region that's been unused for a long time as a target for reuse. The call chain can serve to identify the component responsible for creating the region. Listing Three (available at www.ddj.com/code/) provides a simple routine for collecting a call chain on the Intel platform.
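The figures are not reproduced here, but the bookkeeping they describe can be sketched as a singly linked list kept in base-address order, with a timestamp and a call chain stored per region. The struct layout, field names, and MAX_CHAIN limit below are assumptions made for illustration. Victim selection simply picks the region with the oldest timestamp.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_CHAIN 8   /* hypothetical cap on recorded call chain depth */

typedef struct Region {
    uintptr_t base;            /* base address of the region */
    size_t size;               /* size of the region in bytes */
    uint64_t last_used;        /* timestamp; smaller means older */
    void *chain[MAX_CHAIN];    /* return addresses of the creating call chain */
    struct Region *next;
} Region;

/* Insert `r` so the list stays ordered by base address.
   Returns the (possibly new) head of the list. */
Region *region_insert(Region *head, Region *r)
{
    Region **link = &head;
    while (*link != NULL && (*link)->base < r->base)
        link = &(*link)->next;
    r->next = *link;
    *link = r;
    return head;
}

/* Return the region with the oldest timestamp, or NULL for an empty list.
   This is the candidate for reuse when memory runs out. */
Region *region_pick_victim(Region *head)
{
    Region *best = NULL;
    for (Region *r = head; r != NULL; r = r->next)
        if (best == NULL || r->last_used < best->last_used)
            best = r;
    return best;
}
```

Keeping the list ordered by base address makes it easy to find neighboring regions. The linear scan for the oldest timestamp is acceptable here because it only runs when the program is already out of memory.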