Maybe a Lock-Based Solution Plus Extreme And Perfect Discipline?
But let's keep pushing on the lock-based approaches for a moment. Let's assume that every piece of code in our universe is under our control and all of it always strictly plays fair and never gets the locking order wrong, not even by mistake. Quite an assumption, but let's assume.
Is that enough to be thread-safe? Not in general, for the simple reason that this approach only works if the only locks you ever acquire are acquired not only in the same order but at the same time, so that some code somewhere knows what all the locks in use at a given time will be and can acquire them at once. Put another way, locking operations can't safely nest. That's a problem, because it basically rules out safe locking in layered applications and libraries, which covers most of the world's significant software.
To make this more concrete, consider: What does SharedAdd have to do to calculate the new values? For example, in the "read a and b and set result to proper values" code, can it call (possibly virtual) member functions of a or b? Can it call helper functions? And does whatever it calls, and all the functions those things call, and so on, ever try to acquire any locks?
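To have something concrete to reason about, here is a minimal sketch of the setting, written in modern C++ for illustration; the Shared type and this particular SharedAdd signature are my assumptions, not a listing from earlier in the article:

#include <functional>
#include <mutex>

struct Shared {
    std::mutex m;
    int value;
};

// Assumes a and b are distinct objects. Acquires both locks in a single
// global order -- here, by address (std::less gives a well-defined
// ordering for unrelated pointers) -- so no two callers can take them
// in opposite orders.
int SharedAdd(Shared& a, Shared& b) {
    Shared* first  = std::less<Shared*>()(&a, &b) ? &a : &b;
    Shared* second = std::less<Shared*>()(&a, &b) ? &b : &a;
    std::lock_guard<std::mutex> l1(first->m);
    std::lock_guard<std::mutex> l2(second->m);
    // "read a and b and set result to proper values"
    return a.value + b.value;
}

The questions above are about the body of this function: anything it calls while both locks are held, directly or indirectly, is executing inside the critical section.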
If even one part of the code that gets executed inside the a and b locks causes us to call a function somewhere that could try to acquire a lock, we're open to deadlock again, even if every function everywhere in the code globally respected the discipline to acquire locks in order. Consider: What if during "read a and b and set result to proper values" we end up calling a function that tries to acquire a lock on an object c whose address is between a's and b's? That, too, is a recipe for deadlock: if we're doing that while some other thread already holds the lock for c, and that thread then tries to lock a or b (say, by calling SharedAdd(a,b)), we're toast. Kaput.
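Here is a hedged sketch of that interleaving, continuing the hypothetical Shared type from above and assuming c's address falls between a's and b's:

// A helper invoked from inside SharedAdd needs a third lock:
void Helper(Shared& c) {
    std::lock_guard<std::mutex> lc(c.m);  // taken while a.m and b.m are held
    // ...
}

// Thread 1: calls SharedAdd(a, b), holds a.m and b.m, then calls
//           Helper(c) and blocks waiting for c.m.
// Thread 2: already holds c.m (say, inside one of c's own member
//           functions), then calls SharedAdd(a, b) and blocks
//           waiting for a.m.
// Each thread now waits on a lock the other holds: deadlock, even
// though no single function, viewed in isolation, knowingly took
// locks out of address order.

The nesting is the killer: SharedAdd can't know c's lock is already held, and Helper can't know a's and b's are.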
This demonstrates the fundamental issue that lock-based synchronization isn't composable, meaning that you can't take two concurrency-correct lock-based pieces of code and glue them together into a bigger concurrency-correct piece of code. Building a large system using lock-based programming is like building a house using bricks that are made out of different kinds of volatile materials that are stable in isolation but not in all combinations, so that they often work well and stack nicely but occasionally they cause violent explosions when the wrong two bricks happen to touch.
That's why calling into any unknown code while holding a lock is a recipe for deadlock.
Let me share one data point that illustrates both that lock-based programming is hard and that very few people are actually doing much of it: Many shipping frameworks, even very popular ones by very savvy companies, do dangerous stuff that they're just getting away with because developers aren't writing much lock-based code. Here's an example pointed out to me by Chris Brumme [5]: Look in just about any framework, even ones designed for concurrency, including the Java libraries and the .NET Framework, and you will find code that calls a virtual function while holding a lock, which is a recipe for deadlock. (Think about it: Deadlock can happen anytime you hold a lock and try to acquire another one, unless you can rule out that another thread might try to acquire the locks in the other order. So if you're holding a lock and then call a virtual function that could or should be overridden by code outside your framework, which by definition means you're calling out into unknown and unknowable black-box code... one shudders.) And the folks who write those libraries are smart people. Locking is the status quo, but it's not good enough for widespread use.
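A minimal sketch of the pattern Brumme describes; the class and function names here are hypothetical, not taken from any real framework:

#include <mutex>

class Collection {
    std::mutex m;
public:
    virtual ~Collection() {}
    virtual void OnChanged() {}           // meant to be overridden by clients

    void Update() {
        std::lock_guard<std::mutex> hold(m);
        // ... modify internal state ...
        OnChanged();   // virtual call made while m is held: unknown
                       // client code now runs inside our lock and may
                       // try to acquire other locks
    }
};

// If a client's override of OnChanged() acquires some other lock L,
// and a thread that already holds L calls Update(), the two threads
// can each block on the lock the other holds.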
There are other lock-based solutions we haven't talked about. One is to use monitors, or objects that lock themselves so as to guarantee that all member functions are synchronized with respect to each other, so that no two threads can be executing member functions on the same object at the same time. Examples in the field include Java's Vector and the .NET Framework's SyncHashTable. Fundamentally, just the fact that monitors use locks shows that they're open to the same issues as all uses of locks, including deadlock if their implementations ever call into unknown code while holding the object's lock. And such internally locked objects have their own additional pitfalls, notably that they are limited and don't solve the specific problem we've been considering, which requires atomic use of both a and b at the same time (and locking each individual member function of a and b isn't enough to do that).
Conclusion
In this article, I've focused on only the one narrow question of how to write a lock-based program correctly, meaning that it works (avoids data corruption) and doesn't hang (avoids deadlock and livelock). That's the minimum requirement to write a program that runs at all. Along the way, I've noted reasons why many concurrent programs in the field today aren't actually correct, but are just getting away with lock-based programming flaws because concurrency isn't yet in routine widespread use.
I have virtually ignored the equally difficult question of how to write a lock-based program efficiently, one that actually scales decently when multiple processors are available and avoids all the common pitfalls, from priority inversion to serialization to convoying to aggressive locking, that can easily render it no better, and even worse, than a single-threaded program.
I've also ignored lock-free programming, which avoids nearly all the problems of lock-based programming but creates even more; experts readily admit that lock-free programming is much harder still than writing lock-based code.
Personally, I think that the most important sentence in [1] and [2] was the fourth and last conclusion: "4. Finally, programming languages and systems will increasingly be forced to deal well with concurrency." Before concurrency becomes accessible for routine use by mainstream developers, we will need significantly better, still-to-be-designed, higher-level language abstractions for concurrency than are available in any language on any platform. Lock-based programming is our status quo and it isn't enough; it is known to be not composable, hard to use, and full of latent races (where you forgot to lock) and deadlocks (where you lock the wrong way). Comparing the concurrency world to the programming language world, basics such as semaphores have been around almost as long as assembler and are at the same level as assembler; locks are like structured programming with goto; and we need an "OO" level of abstraction for concurrency.
The good news is that many people are working on this problem. As the next-generation programming models become available, they will be a key factor in opening the gates to the broad and strong adoption of concurrent programming that will let us make our programs simpler (by separating logical control flows); more responsive (by accomplishing more work during idle time such as waiting for disk and memory); and fully able to exploit the continuing exponential gains in processor throughput.
References
[1] Sutter, H. "The Free Lunch Is Over: A Fundamental Turn Toward Concurrency in Software," Dr. Dobb's Journal, March 2005. Available online at http://www.gotw.ca/publications/concurrency-ddj.htm.
[2] Sutter, H. "The Concurrency Revolution," C/C++ Users Journal, February 2005. This is an abbreviated version of [1].
[3] Alexandrescu, A. "Lock-Free Data Structures," C/C++ Users Journal, October 2004.
[4] Alexandrescu, A. and M. Michael. "Lock-Free Data Structures with Hazard Pointers," C/C++ Users Journal, December 2004.
[5] Brumme, C. Weblog. http://blogs.msdn.com/cbrumme.
Herb Sutter (http://www.gotw.ca/) chairs the ISO C++ Standards committee and is an architect in Microsoft's Developer Division. His most recent books are Exceptional C++ Style and C++ Coding Standards (Addison-Wesley).