Code Reviews
Many software processes suggest frequent code reviews as a means of improving software quality. The complexity of parallel programming makes this task challenging. While they are not a replacement for using well-established parallel programming design patterns, code reviews can, in many cases, help catch bugs in the early stages of development.
One technique for these types of code reviews is to have each reviewer examine the code from the perspective of one of the threads in the system. During the review, each reviewer steps through the sequence of events as the actual thread would. Provide objects that represent the shared resources of the system, and have the individual reviewers (acting as threads) acquire and release these resources. This technique helps you visualize the interactions between the different threads in your system and, ideally, find bugs before they manifest themselves in code.
As a developer, when you get the urge to immediately jump into coding and disregard any preplanning or preparation, you should consider the following scenarios and ask yourself which situation you'd rather be in. Would you rather spend a few weeks of work up front to validate and verify the design and architecture of your application, or would you rather deal with having to redesign your product when you find it doesn't scale? Would you rather hold code reviews during development or deal with the stress of trying to solve mysterious, unpredictable showstopper bugs a week before your scheduled ship date? Good software engineering practices are the key to writing reliable software applications. Nothing is new, mysterious, or magical about writing multithreaded applications. The complexity of this class of applications means that developers must be conscious of these fundamental software engineering principles and be diligent in following them.
Extending Your Application: Using Trace Buffers
Two categories of bugs are often found in multithreaded applications: synchronization bugs and performance bugs. Synchronization bugs include race conditions and deadlocks that cause unexpected and incorrect behavior. Performance bugs arise from unnecessary thread overhead, such as thread-creation or context-switch costs, and from memory access patterns that are suboptimal for a given processor's memory hierarchy. The application returns the correct results, but often runs too slowly to be usable. This chapter focuses on debugging synchronization bugs that cause applications to fail.
To find the cause of these types of bugs, two pieces of information are needed:
- Which threads are accessing the shared resource at the time of the failure.
- When the access to the shared resource took place.
In many cases, finding and fixing synchronization bugs involves code inspection. A log or trace of the different threads in the application and the pattern in which they accessed the shared resources of the code helps narrow down the problematic code sections. One simple data structure that collects this information is the trace buffer.
A trace buffer is simply a mechanism for logging events that the developer is interested in monitoring. It uses an atomic counter to keep track of the current empty slot in the array of event records. The type of information stored for each event is largely up to the developer. A sample implementation of a trace buffer, using the Win32 threading APIs, is shown in Listing One. (In the interest of readability, Listing One uses the time() system call to record the system time. Because of this timer's coarse granularity, most applications should instead use a high-performance counter to track when events occurred.)
Listing One: Sample Implementation of a Trace Buffer.
 1  // Circular 1K Trace buffer
 2  #define TRACE_BUFFER_SIZE 1024
 3
 4  typedef struct traceBufferElement
 5  {
 6     DWORD threadId;
 7     time_t timestamp;
 8     const char *msg;
 9  } traceBufferElement;
10
11  static LONG m_TraceBufferIdx = -1;
12  static traceBufferElement traceBuffer[TRACE_BUFFER_SIZE];
13
14  void InitializeTraceBuffer()
15  {
16     m_TraceBufferIdx = -1;
17     /* initialize all entries to {0, 0, NULL} */
18     memset(traceBuffer, 0,
19        TRACE_BUFFER_SIZE *
20        sizeof(traceBufferElement));
21  }
22
23  void AddEntryToTraceBuffer(const char *msg)
24  {
25     LONG idx = 0;
26
27     // Get the index into the trace buffer that this
28     // thread should use
29     idx = InterlockedIncrement(&m_TraceBufferIdx) %
30        TRACE_BUFFER_SIZE;
31
32     // Enter the data into the Trace Buffer
33     traceBuffer[idx].threadId = GetCurrentThreadId();
34     traceBuffer[idx].timestamp = time(NULL);
35     traceBuffer[idx].msg = msg;
36  }
37
38  void PrintTraceBuffer()
39  {
40     int i;
41     printf("Thread ID Timestamp Msg\n");
42     printf("----------|----------|----------------------"
43            "-----------------\n");
44
45     // sort by timestamp before printing
46     SortTraceBufferByTimestamp();
47
48     for (i = 0; i < TRACE_BUFFER_SIZE; i++)
49     {
50        if (traceBuffer[i].timestamp == 0)
51        {
52           break;
53        }
54        printf("0x%8.8x|0x%8.8x| %s\n",
55           traceBuffer[i].threadId,
56           traceBuffer[i].timestamp,
57           traceBuffer[i].msg);
58     }
59  }
Listing One creates a trace buffer that can store 1,024 events, held in a circular buffer. As you'll see shortly, once the circular buffer is full, the atomic index wraps around and the oldest events are overwritten. This simplifies the implementation: it requires neither dynamically resizing the trace buffer nor storing the data to disk. In some instances those operations may be desirable, but in general a circular buffer should suffice.
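The wrap-around behavior comes entirely from the modulo applied to the incremented counter. The sketch below reproduces that index computation with C11 atomics; the names NextSlot and DEMO_BUFFER_SIZE are ours, and the buffer is shrunk to 8 entries so the wrap is easy to see.

```c
#include <stdatomic.h>

#define DEMO_BUFFER_SIZE 8        /* small size so wrap-around shows quickly */

static atomic_long demoIdx = -1;  /* same -1 starting value as the listing */

/* Returns the circular-buffer slot the calling thread should use.
 * atomic_fetch_add returns the value *before* the increment, so we
 * add 1 to mimic Win32's InterlockedIncrement, which returns the
 * value *after* the increment. */
long NextSlot(void)
{
    long newVal = atomic_fetch_add(&demoIdx, 1) + 1;
    return newVal % DEMO_BUFFER_SIZE;
}
```

The first call returns slot 0 (which is why the counter starts at -1), and the ninth call wraps back to slot 0, overwriting the oldest entry.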
Lines 1-13 define the data structures used in this implementation. The event descriptor traceBufferElement is defined in lines 4-9. It contains three fields: the ID of the thread that generated the event, a timestamp indicating when the event occurred, and a generic message string associated with the event. This structure could include any number of additional parameters, such as the name of the thread.
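For example, a richer event record might look like the following sketch; every field beyond the original three is a hypothetical addition, not part of Listing One.

```c
#include <time.h>

/* Hypothetical extended event record. The first three fields mirror
 * traceBufferElement from Listing One; the rest are illustrative
 * additions a developer might find useful when reading the log. */
typedef struct extendedTraceEvent
{
    unsigned long threadId;
    time_t        timestamp;
    const char   *msg;
    const char   *threadName;  /* human-readable name, e.g. "io-worker" */
    const char   *file;        /* __FILE__ captured at the logging site */
    int           line;        /* __LINE__ captured at the logging site */
} extendedTraceEvent;
```

Capturing __FILE__ and __LINE__ at the call site (typically via a logging macro) makes it much faster to map a suspicious log entry back to the code that produced it.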
The trace buffer in Listing One defines three operations. The first, InitializeTraceBuffer(), initializes the resources used by the trace buffer. The atomic counter is initialized to -1 on line 16; this initial value is needed because adding a new entry to the trace buffer requires first incrementing the counter (line 29), and the first entry should land in slot 0. Once the trace buffer is initialized, threads may call AddEntryToTraceBuffer() to log events as they occur. Finally, PrintTraceBuffer() dumps a listing of all the events that the trace buffer has logged to the screen. This function is especially useful when combined with a debugger that allows users to execute code at a breakpoint; both Microsoft Visual Studio and GDB support this capability. With a single command, the developer can see a log of all the recent events being monitored, without having to pick apart the data structure by hand in a command line or watch window.
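For example, with GDB one might dump the log at a breakpoint like this (the source location shown is, of course, hypothetical):

```
(gdb) break worker.c:57
(gdb) run
Breakpoint 1, ProcessItem () at worker.c:57
(gdb) call PrintTraceBuffer()
```

In Visual Studio, typing PrintTraceBuffer() into the Immediate window while stopped at a breakpoint accomplishes the same thing.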
Note that the implementation of the trace buffer in Listing One logs events in the order in which the calls reach the buffer. This does not guarantee that the log reflects the exact order in which the events themselves occurred. To illustrate this point, consider the two threads shown in Listing Two.
Listing Two: Two Threads Logging Events to a Trace Buffer.
unsigned __stdcall Thread1(void *)
{
   // ... thread initialization
   // write global data
   m_global = do_work();
   AddEntryToTraceBuffer(msg);
   // ... finish thread
}

unsigned __stdcall Thread2(void *)
{
   // ... thread initialization
   // read global data
   Thread_local_data = m_global;
   AddEntryToTraceBuffer(msg);
   // ... finish thread
}
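A portable sketch of this scenario, using POSIX threads and C11 atomics in place of the Win32 APIs, is shown below. All names here are ours, and the trace buffer is reduced to just the message log; the point is that each thread can be preempted between its access to the shared data and its call into the trace buffer, so the log order need not match the access order.

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

#define BUF_SIZE 16

static atomic_long g_idx    = -1;   /* trace-buffer index, as in Listing One */
static const char *g_msgs[BUF_SIZE];
static atomic_long m_global = 0;    /* the shared data both threads touch */

static void AddEntry(const char *msg)
{
    long slot = (atomic_fetch_add(&g_idx, 1) + 1) % BUF_SIZE;
    g_msgs[slot] = msg;
}

static void *Thread1(void *arg)
{
    (void)arg;
    atomic_store(&m_global, 42);          /* write global data...      */
    AddEntry("Thread1: wrote m_global");  /* ...then log the event     */
    return NULL;
}

static void *Thread2(void *arg)
{
    (void)arg;
    long local = atomic_load(&m_global);  /* read global data...       */
    (void)local;
    AddEntry("Thread2: read m_global");   /* ...then log the event     */
    return NULL;
}

/* Runs both threads to completion and returns the number of events
 * logged. The count is always 2, but the ORDER of the two entries is
 * nondeterministic: a thread may be preempted between touching
 * m_global and calling AddEntry. */
long RunBothThreads(void)
{
    pthread_t t1, t2;
    pthread_create(&t1, NULL, Thread1, NULL);
    pthread_create(&t2, NULL, Thread2, NULL);
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    return atomic_load(&g_idx) + 1;
}
```

Running this repeatedly can produce either ordering of the two messages, which is exactly the ambiguity the chapter warns about when interpreting a trace-buffer log.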