Shared and Exclusive Use Data
The first application-level abstraction here is based on avoiding sharing conflicts by limiting application access to data that is guaranteed not to have update conflicts (sometimes known as "thread local" data). In particular, this avoids the complexities of implementing "thread safe" objects.
This approach is based on two kinds of "smart" or encapsulated pointers:
- Shared Use Pointers have a constant reference to read-only data instances, and hence there can be no update conflicts. An updater might install a new instance of the object, but the readers can continue with their previous but consistent version of the data.
- Exclusive Use Pointers reference data instances that are guaranteed to be accessible by only a single task at a time.
At a synchronization point, the updated instance can be installed as the new current version, and made available for shared use, provided the referenced base has not changed during the update processing. Otherwise, the application must retry its update of the referenced data.
Shared and Exclusive Use Pointers are proxy objects based on the usual extended pointer semantics.
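As a minimal interface sketch, the two proxy types might look somewhat like the following; the class and member names are assumptions by analogy with the usage shown later, not a defined API.

    // Illustrative proxy interfaces only; real implementations would be
    // tied to the Shared Resource Manager described in the next section.
    template <typename T>
    class SharedUsePtr          // constant reference to a read-only instance
    {
    public:
        const T& operator*( ) const  { return *object ; }
        const T* operator->( ) const { return object ; }
        // The destructor tells the owning Manager that this reference
        // has been released.
    private:
        const T* object ;
    } ;

    template <typename T>
    class XUsePtr               // private, updatable copy of the data
    {
    public:
        T& operator*( )  { return *copy ; }
        T* operator->( ) { return copy ; }
        bool install( ) ;            // attempt to make the copy the new current version
        bool isInstalled( ) const ;  // true once an install has succeeded
        // install and isInstalled would be defined in terms of the owning Manager.
    private:
        T* copy ;
    } ;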
The use of these pointers is an extension of an approach presented in Lock-Free Data Structures, by Andrei Alexandrescu, and Lock-Free Data Structures with Hazard Pointers, by Andrei Alexandrescu and Maged Michael.
The development of this concept here is in three stages:
- A single access to a single object.
- Single access to an object from a collection.
- Multiple accesses to different objects.
Single Access to a Single Object
Single access support is useful when consistent state data for an application is all maintained in a single object. An example might be an object that reflects the status of an external entity, which is subject to updates, but which always provides a consistent point-in-time view of the object.
A Shared Resource Manager and an application Object Accessor provide the basic support here.
The implementation here is supplied by a Shared Resource Manager, which owns all the shared data, maintains a Current Version pointer for each object, and tracks the multiple instances of objects returned to clients.
- For Shared Use Pointers, the Manager returns a reference to an instance of the then Current Version. An initial or default constructed instance is provided when no Current Version exists. The Manager tracks either the number of, or a list of, references to each shared instance, and only allows release of the storage when there are no longer any references to it. Maintaining a lock-free list is more complex than just using a counter, but it can allow recovery when the owner of a reference fails and does not release it.
- For Exclusive Use Pointers, the Manager creates a copy of the Current Version which can then be updated. To make the new version available to other tasks, the application attempts to Install the updated instance. If the Current Version has not changed, this succeeds; otherwise, the operation is flagged and the application is given a new instance of the then Current Version and must repeat the update processing.
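As a minimal sketch of this behaviour for a single object, the C++11 atomic shared_ptr free functions can stand in for the Manager's lock-free installation check, and shared_ptr reference counting can stand in for its tracking of shared instances; the class and member names below are illustrative assumptions.

    #include <memory>     // std::shared_ptr and its atomic free functions (C++11)
    #include <utility>

    template <typename T>
    class SharedResourceManager
    {
    public:
        explicit SharedResourceManager( T initial = T( ) )
            : current( std::make_shared<const T>( std::move( initial ) ) ) { }

        // Shared use: a counted reference to the then Current Version.
        std::shared_ptr<const T> getShared( ) const
        {
            return std::atomic_load( &current ) ;
        }

        // Exclusive use: record the base version and hand back a private copy.
        std::shared_ptr<T> getExclusive( std::shared_ptr<const T>& base ) const
        {
            base = std::atomic_load( &current ) ;
            return std::make_shared<T>( *base ) ;
        }

        // Install succeeds only if the Current Version is still 'base'.
        // On failure, 'base' is set to the new Current Version and the
        // caller must copy it again and repeat its update processing.
        bool install( std::shared_ptr<const T>& base, std::shared_ptr<const T> proposed )
        {
            return std::atomic_compare_exchange_strong( &current, &base, std::move( proposed ) ) ;
        }

    private:
        std::shared_ptr<const T> current ;   // the Current Version
    } ;

In C++20 the same idea maps naturally onto std::atomic<std::shared_ptr<T>>; either way, superseded versions stay alive through their reference counts until the last Shared Use reference to them is released.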
Because the Manager is responsible for constructing, copying, and freeing objects, it essentially requires an object lifetime controller similar to the STL notion of an Allocator, which allocates and de-allocates space and also constructs and destroys the allocated objects. This "Allocator" can be supplied as a policy to the Manager.
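A minimal sketch of such a lifetime-controller policy, modeled on the standard Allocator requirements; the policy name and its member functions are illustrative assumptions.

    #include <memory>

    // Lifetime-controller policy: manages storage and also constructs and
    // destroys the objects placed in it, in the spirit of an STL Allocator.
    template <typename T, typename Alloc = std::allocator<T>>
    struct ObjectLifetimePolicy
    {
        Alloc alloc ;

        T* createCopy( const T& source )        // e.g. for an Exclusive Use copy
        {
            T* space = std::allocator_traits<Alloc>::allocate( alloc, 1 ) ;
            std::allocator_traits<Alloc>::construct( alloc, space, source ) ;
            return space ;
        }

        void retire( T* object )                // e.g. when an instance is Retired
        {
            std::allocator_traits<Alloc>::destroy( alloc, object ) ;
            std::allocator_traits<Alloc>::deallocate( alloc, object, 1 ) ;
        }
    } ;

The Manager can then take such a policy as a template parameter, so that how instances are created and reclaimed is decided by the application.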
The Manager tracks and identifies referenced data instances as:
- Update candidates, until installed,
- Current, when successfully installed,
- In use, when there are current Shared Use references,
- Retired, when no longer referenced.
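These states could be represented by a simple per-instance tag; the enumerator names below simply mirror the list above.

    // Per-instance state as tracked by the Manager.
    enum class InstanceState
    {
        UpdateCandidate ,   // copied out for exclusive use, not yet installed
        Current ,           // successfully installed as the Current Version
        InUse ,             // still held by current Shared Use references
        Retired             // no longer referenced; storage may be released
    } ;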
All tasks using the same shared data need to reference the data through the same Manager. However, sharing of different data can be controlled by different Managers, with possibly different policies. At the application level, code and data declarations for use by a particular Manager need to be identified in design documents and typically packaged together in the code.
Typically an application would provide an Object Accessor which would be responsible for:
- Initially defining and accessing an appropriate Shared Resource Manager.
- Returning Shared or Exclusive Use Pointers supplied by the Manager.
- Possibly initializing the referenced object on first use.
- Supplying a destructor for the Pointers to release use of the referenced objects.
- Providing an Install interface for clients of an Exclusive Use Pointer to register an updated object.
Typical use of the Object Accessor by an updater might look somewhat like:
    // Create an Accessor for a Shared Resource Manager
    Library::XUsePtr<Data> dataPtr = accessor->getXUsePtr( ) ;   // initialize pointer
    while ( ! dataPtr.isInstalled( ) )
    {
        dataPtr->updateProcessing( ) ;   // Process data
        dataPtr.install( ) ;             // Attempt to install updates
    }
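A reader's use of the corresponding Shared Use Pointer might look somewhat like the following; the SharedUsePtr name and the getSharedPtr and readProcessing calls are assumed by analogy with the updater example.

    // Reader side: a consistent, read-only view of the then Current Version.
    Library::SharedUsePtr<Data> readPtr = accessor->getSharedPtr( ) ;
    readPtr->readProcessing( ) ;    // Work with a consistent snapshot;
                                    // a concurrent install does not disturb it.
    // Releasing readPtr (e.g. at end of scope) tells the Manager that this
    // reference is no longer in use.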
Single Access to a Collection of Objects
The previous section described shared access to a single object, which could be an entire Collection. Providing Shared and Exclusive Use Pointers to individual members of a Collection requires further refinement.
Shared Collections are based on a Map or Index with unique primary keys or index values that reference a Data Object through a Control Element. For each primary key, the Control Element tracks the Current Version, any shared instances, and any update candidates.
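A Control Element might be sketched as the per-key record below; the field names and the choice of standard smart pointers are illustrative assumptions.

    #include <memory>
    #include <vector>

    // Illustrative per-key record: what the Manager tracks for one primary key.
    template <typename Key, typename T>
    struct ControlElement
    {
        Key                                  primaryKey ;
        std::shared_ptr<const T>             currentVersion ;   // the installed instance
        std::vector<std::weak_ptr<const T>>  sharedInstances ;  // earlier versions still held
                                                                // by Shared Use references
        std::vector<std::shared_ptr<T>>      updateCandidates ; // copies handed out for
                                                                // exclusive use, not yet installed
    } ;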
Here the Shared Resource Manager is extended to manage either all the primary keys or just those that are being shared.
To manage the entire collection, the Manager can internally maintain a Map for the Data Objects. However, in many applications with large Collections, only a few members are ever shared at any one time. Here the applications can maintain local copies of the actual Collection, with the Shared Resource Manager maintaining only the instances currently being accessed. This is similar to a cache of current or recently accessed objects.
The local application collections can take many forms, with the principal requirement being that the shared elements are identified by a unique primary key.
The Object Accessor becomes a Collection Adapter. This Adapter can access local collections, and it can create and initialize an internal Map. When accessed, the objects are tracked by the Shared Resource Manager.
Shared Use Pointers essentially serve as iterators for the Collection Adapters. These iterators can be forward, bidirectional, or random access, as determined by the capabilities of the underlying Collection.
A Shared Use Pointer can be used to iterate through the local Collection to find interesting Objects. It can then be converted by the application to an Exclusive Use Pointer in order to gain write access to the object.
To add objects to the Collection, an Exclusive Use Pointer is first obtained for the new primary key.
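Use of a Collection Adapter by an updater might then look somewhat like the following; the adapter operations (find, makeExclusive, getXUsePtr) and the Record members (applyChanges, fillIn) are illustrative assumptions.

    // Find-then-update against a Collection Adapter.
    Library::SharedUsePtr<Record> found = adapter->find( interestingKey ) ;
    if ( found )    // assume the pointer tests false when the key is absent
    {
        // Upgrade to exclusive use of this member, then repeat the usual
        // copy / update / install loop until the install succeeds.
        Library::XUsePtr<Record> writer = adapter->makeExclusive( found ) ;
        while ( ! writer.isInstalled( ) )
        {
            writer->applyChanges( ) ;
            writer.install( ) ;
        }
    }

    // Adding a new member: obtain an Exclusive Use Pointer for the new
    // primary key, fill it in, and install it.
    Library::XUsePtr<Record> added = adapter->getXUsePtr( newKey ) ;
    while ( ! added.isInstalled( ) )
    {
        added->fillIn( ) ;
        added.install( ) ;
    }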
Federated Collections are useful here to provide more than one access path to the primary key of the object to be referenced. Federated Collections report additions, deletions and modifications to other Collections. Here the other collections could also be updated through lock-free techniques as described in the following sections.
Multiple Accesses
So far our discussion has related only to a task updating a single object at a time. Updating multiple objects at the same time requires the notion of a transaction manager with the property that, when a transaction is committed, either all of its updates take effect or none do.
Here, the client starts a transaction, synchronizes a set of Shared and Exclusive Use Pointers for the transaction, and uses the Exclusive Use Pointers to create and modify shared objects. At the end of the transaction, commit processing attempts to install the set of updates.
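Typical client use might then look somewhat like the following; the Transaction type and its begin, getXUsePtr, and commit operations, as well as the Account example object, are assumptions for illustration.

    // Hypothetical transactional use of the Shared Resource Manager.
    Library::Transaction txn = manager->begin( ) ;                 // start a transaction

    // Synchronize a consistent set of pointers for the transaction.
    Library::XUsePtr<Account> from = txn.getXUsePtr( fromKey ) ;
    Library::XUsePtr<Account> to   = txn.getXUsePtr( toKey ) ;

    from->withdraw( amount ) ;                                     // modify private copies
    to->deposit( amount ) ;

    if ( ! txn.commit( ) )     // either both updates are installed, or neither is
    {
        // The transaction was flagged as failed: obtain a fresh consistent
        // set of pointers and repeat the processing.
    }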
Synchronization processing occurs at two points:
- At the start of a transaction to obtain pointers to a set of consistent data objects.
- At the end of a transaction to install a new consistent set of updated objects.
Synchronization processing to obtain a set of consistent data objects occurs in two phases:
- A list is made of Current Version pointers for all the data objects requested.
- The list is then scanned to verify that all the entries still reference their Current Version. If not, the process is repeated until a consistent set is found.
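A minimal sketch of this two-phase acquisition, assuming each requested object exposes a Manager-style atomic Current Version pointer as in the earlier single-object sketch; the function name and parameter shape are illustrative.

    #include <cstddef>
    #include <memory>
    #include <vector>

    // Phase one: snapshot every requested Current Version pointer.
    // Phase two: re-scan; if anything changed, start over.
    template <typename T>
    std::vector<std::shared_ptr<const T>>
    snapshotConsistentSet( const std::vector<const std::shared_ptr<const T>*>& currents )
    {
        for ( ; ; )
        {
            std::vector<std::shared_ptr<const T>> snapshot ;
            snapshot.reserve( currents.size( ) ) ;
            for ( const std::shared_ptr<const T>* current : currents )   // phase one
                snapshot.push_back( std::atomic_load( current ) ) ;

            bool consistent = true ;
            for ( std::size_t i = 0 ; i < currents.size( ) ; ++i )       // phase two
                if ( std::atomic_load( currents[ i ] ) != snapshot[ i ] )
                {
                    consistent = false ;
                    break ;
                }

            if ( consistent )
                return snapshot ;
            // Some object changed between the two scans; repeat the process.
        }
    }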
Synchronization processing at commit, to install a set of consistent updated objects, occurs in four phases:
- A list is made with previous and proposed pointers for all data objects that have changes pending for the transaction. The list is ordered so that all transactions process common elements in the same order.
- An attempt is made to claim ownership of each element in the list. If there is an element conflict with another transaction, one of the transactions is flagged as failed. If a transaction can claim all elements in the list and is not flagged, then it succeeds.
- When the transaction has succeeded, proposed pointers are installed, either by the successful transaction itself, or by any other task that attempts to use the referenced objects.
- When the transaction completes, it returns a flag (or throws an exception) to the synchronization request.
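A sketch of the bookkeeping such a commit needs is shown below; the claim, install, and helping machinery itself is elided here and would follow the descriptor-based techniques described in the referenced articles. The type and field names are assumptions.

    #include <atomic>
    #include <cstddef>
    #include <memory>
    #include <vector>

    // Bookkeeping for the four-phase commit described above.
    template <typename T>
    struct TransactionDescriptor
    {
        enum class Status { Active , Succeeded , Failed } ;
        std::atomic<Status> status { Status::Active } ;

        struct Entry                               // one per object with pending changes
        {
            std::size_t              elementId ;   // entries are kept sorted on this, so
                                                   // all transactions claim common
                                                   // elements in the same order
            std::shared_ptr<const T> previous ;    // the version the updates were based on
            std::shared_ptr<const T> proposed ;    // the version to install on success
        } ;
        std::vector<Entry> entries ;
    } ;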
Choosing a transaction to fail can be based on criteria such as priority. Alternatively, it may be useful to always advance the oldest transaction. This latter approach improves worst-case responsiveness. More importantly, it ensures that the system always makes progress and does not start thrashing in a cycle of transactions that alternately contend for common resources and thereby keep failing each other.