Automate Acquire/Release
The whole system has to play by the rules, meaning the rules you wrote into your program. In particular, acquire and release fencing rules have to apply at every point beyond the source code, because when your program does strange things, it doesn't matter whether it was your compiler that reordered your statements or your processor that reordered the instructions emitted by the compiler. At the processor level, the only way to avoid instruction reordering is to use acquire- and release-style fences that most processors support as explicit standalone instructions. But you don't want to be in the business of writing fences by hand in your source code. So what can be done to control reordering?
Make the compiler both obey the rules and emit the right processor fences for you in one fell swoop: Use abstractions that express critical sections; namely, locks and atomic objects. Consider our first example of lock-based code, and how a compiler might translate it to specific "load-acquire" and "store-release" instructions when compiling for an IA64 processor:
mut.lock();      // "acquire" mut => ld.acq mut.var
... read/write x ...
mut.unlock();    // "release" mut => st.rel mut.var
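For readers who want to try the lock-based version, here is a minimal sketch in standard C++ using std::mutex. The names add_one, run_lock_based, and hold are hypothetical and chosen only for illustration, and which fence instructions the compiler actually emits depends on the target processor.

```cpp
#include <cassert>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical names for illustration: mut guards the shared counter x.
std::mutex mut;
long x = 0;  // ordinary (non-atomic) shared variable, protected by mut

// Each call enters the critical section, updates x, and leaves it.
void add_one() {
    std::lock_guard<std::mutex> hold(mut);  // constructor calls mut.lock(): the "acquire"
    ++x;                                    // read/write x inside the critical section
}                                           // destructor calls mut.unlock(): the "release"

// Runs `threads` workers that each call add_one() `iterations` times,
// then returns the final value of x.
long run_lock_based(int threads, int iterations) {
    assert(threads >= 0 && iterations >= 0);  // precondition for this sketch
    {
        std::lock_guard<std::mutex> hold(mut);
        x = 0;  // reset under the lock so the function can be called repeatedly
    }
    std::vector<std::thread> workers;
    for (int t = 0; t < threads; ++t) {
        workers.emplace_back([iterations] {
            for (int i = 0; i < iterations; ++i) add_one();
        });
    }
    for (auto& w : workers) w.join();
    return x;  // safe to read here: join() synchronizes with each worker's completion
}
```

Calling run_lock_based(4, 1000) should return 4000, because every increment of x happens inside the critical section. The source contains no explicit fence; the lock and unlock calls carry the acquire and release semantics.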
But what about lock-free code? Similarly, for our other opening example, written in a lock-free style, we get:
while( !myTurn ) { }    // "acquire" => ld.acq myTurn
... read/write x ...
myTurn = false;         // "release" => st.rel myTurn
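The lock-free version can be sketched in standard C++ by making myTurn a std::atomic<bool>. This assumes the default (sequentially consistent) atomic operations, which are at least as strong as an acquire on each load and a release on each store. The function run_lock_free and the variable seen are hypothetical names added for illustration.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

std::atomic<bool> myTurn{false};  // atomic flag, as in the example above
int x = 0;                        // ordinary shared variable handed between threads

// The calling thread writes x and hands the turn to a worker,
// which reads x inside its critical section and gives the turn back.
int run_lock_free(int value) {
    int seen = 0;
    myTurn = false;
    std::thread worker([&seen] {
        while (!myTurn) { }  // atomic load: the "acquire"
        seen = x;            // read x inside the critical section
        myTurn = false;      // atomic store: the "release"
    });
    x = value;               // write x before handing over the turn
    myTurn = true;           // atomic store: a "release" that publishes x
    worker.join();
    assert(!myTurn);         // the worker gave up its turn before exiting
    return seen;
}
```

Calling run_lock_free(42) should return 42: the worker cannot leave its spin loop until it observes the store to myTurn, and by then the earlier write to x is guaranteed to be visible to it. If myTurn were a plain bool, the code would have a data race and no such guarantee.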
Both styles end up generating similar instructions because they express the same concept of a critical section. So avoid writing fences by hand; use these abstractions, and let the compiler write them for you.
Summary
Protect all mutable shared objects from races by putting code that accesses them into critical sections. The critical section is a fundamental concept that applies equally to all kinds of synchronization: Entering a critical section (taking a lock or reading from an atomic variable) is an acquire operation. Exiting a critical section (releasing a lock or writing to an atomic variable) is a release operation.
Next month, I'll look at practical examples of how, and how not, to use critical sections, including combining different styles.
Notes
[1] J. Manson, W. Pugh, and S. Adve. "JSR-133: Java Memory Model and Thread Specification" (Java Community Process, 2004).
[2] Note that I'm not specifying whether two consecutive critical sections could overlap, by allowing the "release" end of the first critical section to pass the "acquire" start of the second critical section. Some memory models allow this, whereas others have an additional requirement that acquire and release operations can't pass each other. This design choice doesn't affect the examples in this article.
[3] People regularly propose schemes that are more fine-grained and try to let the programmer specify which objects are actually significant and must respect the fence. These have not become very popular, at least not yet, primarily because doing this adds great complexity to the programming model in return for insufficient actual performance benefit in most use cases.