By now it should be clear what the problem is. A race condition exists between the two threads and the access to the trace buffer. Thread1
may write to the global data value and then start logging that write event in the trace buffer. Meanwhile, Thread2
may read that same global value after the write, but log this read event before the write event. Thus, the data in the buffer may not be an accurate reflection of the actual sequence of events as they occurred in the system.
One potential solution to this problem is to protect the operation that you want to log and the subsequent trace buffer access with a synchronization object. A thread, when logging the event, could request exclusive access to the trace buffer. Once the thread has completed logging the event, it would then unlock the trace buffer, allowing other threads to access the buffer. This is shown in Listing Three.
Listing Three: Incorrectly Synchronizing Access to the Trace Buffer.
// This is NOT RECOMMENDED unsigned __stdcall Thread1(void *) { // ... thread initialization // write global data LockTraceBuffer(); m_global = do_work(); AddEntryToTraceBuffer(msg); UnlockTraceBuffer(); // ... finish thread } unsigned __stdcall Thread2(void *) { // ... thread initialization // read global data LockTraceBuffer(); Thread_local_data = m_global; AddEntryToTraceBuffer(msg); UnlockTraceBuffer(); // ... finish thread }
There are a number of drawbacks to this technique. Using a synchronization primitive to protect access to a trace buffer may actually mask bugs in the code, defeating the purpose of using the trace buffer for debug. Assume that the bug the developer is tracking down is related to a missing lock around the read or write access in the thread. By locking access to the trace buffer, the developer is protecting a critical section of code that may be incorrectly unprotected. Generally speaking, when tracking down a race condition, the programmer should avoid synchronizing access to the trace buffer. If you synchronize access and your application works, it's a clue that there may be a problem in the synchronization mechanism between those threads.
The preferred method to overcoming this limitation is to log a message before and after the event occurs. This is demonstrated in Listing Four.
Listing Four: Preferred Method of Logging Messages with a Trace Buffer.
unsigned __stdcall Thread1(void *) { // ... thread initialization // write global data AddEntryToTraceBuffer(before_msg); m_global = do_work(); AddEntryToTraceBuffer(after_msg); // ... finish thread } unsigned __stdcall Thread2(void *) { // ... thread initialization // read global data AddEntryToTraceBuffer(before_msg2); Thread_local_data = m_global; AddEntryToTraceBuffer(after_msg2); // ... finish thread }
By logging a before and after message, a programmer can determine whether or not the events occurred as expected. If the before and after messages between the two threads occur in sequence, then the developer can safely assume that the event was ordered. If the before and after messages are interleaved, then the order of events is indeterminate; the events may have happened in either order.
A trace buffer can be used to gather useful data about the sequence of operations occurring in a multithreaded application. For other more difficult problems, more advanced threading debug tools may be required.