Concurrent programming is hard. The key concern for application developers is data sharing. Design, analysis, coding, verification, testing, debugging, performance, and portability are all major issues, but the most significant of all is program correctness.
Basic approaches to sharing data across concurrently running tasks have traditionally been based either on locking of data elements, to provide single-task access to them, or else on transactions that incorporate such locks. (Here I use the term "task" to encompass different forms of asynchronous processing, such as threads, processes, or even executions on remote processors.)
An alternative to the use of locks is lock-free data sharing (sometimes loosely called "wait-free" data sharing). Here, the application prepares a set of updates that would change the state data from one externally consistent view to another consistent view; no inconsistent intermediate data is ever visible, so there is nothing for a race condition to act on. When the updates are complete, an attempt is made to install either all of them or none of them. If the attempt fails because other parallel tasks have changed the input assumptions, the application processing must be retried. Of course, nothing is free here: there is a trade-off between never having to wait and sometimes having to repeat the update processing.
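The following is a minimal sketch of that prepare-and-install pattern, using C++11's std::atomic; the names and the trivial State type are illustrative, not taken from the article:

```cpp
#include <atomic>

struct State { int value; };                 // stands in for the shared state

std::atomic<State*> g_state{new State{0}};   // hypothetical shared state pointer

void update(int delta) {
    State* expected = g_state.load();
    for (;;) {
        // Build a complete, consistent replacement off to the side;
        // nothing partially updated is ever visible to other tasks.
        State* desired = new State{expected->value + delta};

        // Attempt to install all of it or none of it.
        if (g_state.compare_exchange_weak(expected, desired)) {
            // Installed. (Safely reclaiming the old 'expected' object
            // needs hazard pointers, RCU, or similar -- omitted here.)
            break;
        }
        // Another task changed our input assumptions; discard this
        // attempt and redo the processing against the fresh state
        // (compare_exchange_weak has already reloaded 'expected').
        delete desired;
    }
}
```

Note that no task ever blocks waiting for another; the cost of contention is paid instead by repeating the update loop.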
Lock-free processing has many advantages but is fundamentally more difficult to implement. In this article, I outline these advantages and propose application-level approaches that avoid these difficulties. In fact, these higher-level approaches appear to be even easier to use than lock-based ones, both because they make the requirements for reasoning about correctness explicit and because they avoid complex interactions among competing tasks.
The discussion here addresses three levels of support:
- A description of the basic support environment necessary for lock-free algorithms.
- A description of a set of application-level abstractions for data sharing. These abstractions can also be supported with lock-based algorithms where the environment's support is insufficient, but they are far more powerful when implemented with lock-free algorithms.
- An outline of the library support needed to provide a bridge that implements the application-level abstractions using the basic support level.
In this article, I introduce three application-level abstractions for shared data:
- Avoidance is provided by a type of "smart" pointer abstraction that ensures single-task access to shared data.
- Synchronization is provided primarily by an extension of the compare-and-swap (CAS) idiom.
- Moving the problem is provided by a work-queuing abstraction, which in turn is built on simple send/receive abstractions (see the sketch after this list).
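To make the first and third abstractions concrete, here is a minimal sketch (names and details are illustrative, not from the article) of how exclusive ownership can be handed between tasks through a send/receive channel, so the data itself never needs a lock. The channel below uses a mutex internally only for brevity; the point of the abstraction is that the same interface can later be backed by a lock-free implementation.

```cpp
#include <condition_variable>
#include <memory>
#include <mutex>
#include <queue>

template <typename T>
class Channel {                        // hypothetical send/receive abstraction
public:
    void send(std::unique_ptr<T> item) {
        {
            std::lock_guard<std::mutex> lk(m_);
            q_.push(std::move(item));  // ownership moves into the channel
        }
        cv_.notify_one();
    }
    std::unique_ptr<T> receive() {     // blocks until work is available
        std::unique_lock<std::mutex> lk(m_);
        cv_.wait(lk, [this] { return !q_.empty(); });
        auto item = std::move(q_.front());
        q_.pop();
        return item;                   // the receiver now has sole access
    }
private:
    std::mutex m_;
    std::condition_variable cv_;
    std::queue<std::unique_ptr<T>> q_;
};
```

Because the std::unique_ptr is moved, the sending task can no longer touch the data after send() returns; that is exactly the single-task-access guarantee the "avoidance" abstraction calls for, and queuing the work items is what "moves the problem" to the receiving task.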
My goals here, in order of importance, are:
- Correctness. The key to correctness is the ability to analyze the points of contention and provide assurance that the code functions correctly. In addition to the usual requirements, "correct" here implies that only valid or consistent state data is operated on, that there are no race conditions or deadlocks, and that all possible time orderings of data updates produce valid results. While no set of abstractions can guarantee correct code, it is possible to provide abstractions that greatly facilitate reasoning about its validity.
- Usability. The usability intent here is to provide a set of abstractions that can be implemented in libraries by experts, then readily used by experienced programmers with some confidence of correctness.
- Performance. The primary performance intent is to provide abstractions that can be implemented with "lock-free" programming techniques. A secondary performance concern, only partially dealt with here, is the effective use of shared memory and cache resources.
- Completeness. The intent is that the abstractions presented here simply and adequately cover essentially all common application scenarios for sharing data.