Complex multi-threaded programs often end up as a tangle of interconnected objects and threads. If some parts need to be stopped (such as when a user disconnects), things become complicated. Even an orderly shutdown of the entire application becomes difficult: objects keep referencing other objects that no longer exist. That's why you sometimes see programs crash when they try to exit.
Java programmers are bound to chime in at this point and say: "Java programs don't crash!" Right. They just throw unhandled exceptions.
The issue usually lies in circular connections: two objects, each with its own thread, reference each other. If you try to destroy one, the other might still reference it and blow up. It's not always that symmetrical, but that's the gist of the problem.
But Java programmers will likely chime in again and say: "Garbage collection takes care of it!" For one thing, garbage collection means that there are no proper destructors. The finalize() methods may be delayed for an unknown length of time, until the garbage collector decides to do its sweep, which is not good for expensive resources like file descriptors. They still have to be freed manually. For another thing, one object's finalize() method might find another object already finalized, and might not like it. So even though things are simpler, careful consideration is still needed, and the approach I present in this article might come in useful.
Java has another useful concept -- weak references. A weak reference refers to an object but doesn't prevent this object from being garbage collected once all the normal strong references to it are gone (soft references, a related kind, are the ones that hold on until the system starts running short of memory). Once the object has been collected, the weak reference returns null. Copying a weak reference to a strong one works as an atomic act of getting a hold on the object. Deleting the strong reference releases the hold.
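To make the "copy weak to strong" step concrete, here is a minimal sketch using the standard java.lang.ref.WeakReference class. The class WeakRefDemo and the method lengthOrMinusOne are hypothetical names made up for this illustration; only WeakReference and its get() method come from the Java library.

```java
import java.lang.ref.WeakReference;

// Hypothetical helper, for illustration only.
final class WeakRefDemo {
    /** Returns the length of the referenced text, or -1 if the object is gone. */
    static int lengthOrMinusOne(WeakReference<StringBuilder> weak) {
        // Copying weak to strong: get() returns either a usable strong
        // reference or null, never a half-destroyed object.
        StringBuilder strong = weak.get();
        if (strong == null) {
            return -1; // the referent has been collected (or the reference cleared)
        }
        // While 'strong' is in scope the object cannot be collected.
        return strong.length();
        // Leaving the method drops 'strong', which releases the hold.
    }
}
```

When the collector actually clears a weak reference is not deterministic, so a quick experiment is easier with WeakReference.clear(), which empties the reference explicitly: after it, get() returns null and the helper above returns -1.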
Consider a small practical example of a system consisting of the following objects, each with its own thread:
- The worker object. It performs some operations on some data. The nature of operations is not important here. There may be multiple objects of this kind. For simplicity, let's consider only one but bear in mind there may be more. At some point this thread exits, and the object is destroyed.
- The RPC listener. This object and its thread listen for incoming TCP/IP connections and create a new thread for each of them.
- The RPC client objects. Each keeps the context for an RPC client that may issue calls to the worker object. These calls happen in the context of the RPC client thread (i.e., they aren't queued for the worker object's thread to process). This thread simply calls the methods of the worker object, which synchronize themselves in some way to keep their object consistent. The calls may be used to get the state of the worker or to change its behavior. The RPC client threads don't exit when the worker thread exits: a client may still have some response data buffered to send to the user, it may still be useful for making calls on other worker objects, or it may be able to create new worker objects. There might be a myriad of reasons.
The point here is that the worker object might be "pulled from under" the rest of the objects, and weak references seem to be a convenient way to handle that.
When the RPC listener gets created, it is given a weak reference to the worker object. As it creates the client threads, they are given copies of that weak reference (also weak). Whenever they get a call to process, they convert the weak reference to a strong one, thus getting a hold on the worker, check whether the reference is still valid, and proceed with the call. If the reference is no longer valid, they return an error code right away. After the call they release the strong reference, thus releasing the hold. When the worker object is about to be deleted, it first goes and invalidates all the weak references. If some references are being held, it has to wait for them to be released first. After that the worker object becomes inaccessible and may be safely destroyed.
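The protocol just described can be sketched roughly as follows. This is only an illustration under simplifying assumptions, not a definitive implementation: all the names (HoldableRef, grab, release, invalidate, Worker, RpcClient, ERR_NO_WORKER) are invented here, and instead of a separate weak reference object per client, every client shares one control object, so "invalidating all the weak references" becomes a single step. It deliberately doesn't use java.lang.ref.WeakReference, because the invalidation here is explicit rather than driven by the garbage collector.

```java
// Hypothetical control object shared by the owner and all clients.
final class HoldableRef<T> {
    private T target;       // null once invalidated
    private int holds = 0;  // number of grabs not yet released

    HoldableRef(T target) { this.target = target; }

    /** Weak to strong: returns the target and counts a hold, or null if invalidated. */
    synchronized T grab() {
        if (target == null) return null;
        holds++;
        return target;
    }

    /** Releases one hold taken by a successful grab(). */
    synchronized void release() {
        if (holds <= 0) throw new IllegalStateException("release without grab");
        if (--holds == 0) notifyAll(); // wake an owner waiting in invalidate()
    }

    /** Owner side: refuse new holds, then wait until the existing ones are released. */
    synchronized void invalidate() {
        target = null;
        boolean interrupted = false;
        while (holds > 0) {
            try { wait(); } catch (InterruptedException e) { interrupted = true; }
        }
        if (interrupted) Thread.currentThread().interrupt(); // preserve the flag
    }
}

// Stand-in for the worker; its methods synchronize themselves.
final class Worker {
    private int state = 0;
    synchronized int getState() { return state; }
    synchronized void setState(int s) { state = s; }
}

// One RPC client context; callGetState() runs in the client's own thread.
final class RpcClient {
    static final int ERR_NO_WORKER = -1;
    private final HoldableRef<Worker> ref;

    RpcClient(HoldableRef<Worker> ref) { this.ref = ref; }

    int callGetState() {
        Worker w = ref.grab();            // get a hold on the worker
        if (w == null) return ERR_NO_WORKER;
        try {
            return w.getState();          // short call made while holding
        } finally {
            ref.release();                // release the hold
        }
    }
}
```

In this sketch the worker's owner would call invalidate() before tearing the worker down. Once it returns, no client holds the worker and every later call gets ERR_NO_WORKER.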
This works only as long as the calls are reasonably short. If some call is unbounded in time (such as "wait until the worker enters a certain state"), the RPC client must hold the worker object in some other way and release the reference back to weak before waiting. Otherwise a deadlock would occur: the worker's destruction would wait for the hold to be released, while the call holding it waits for a state that the worker may never enter.
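The text leaves "some other way" open. One possible arrangement, shown purely as an assumption, is for the worker to hand out a separate completion object during a short call. The client then releases its hold and waits on that object instead of on the worker. In the sketch below, WaitableWorker, whenState and close are hypothetical names; CompletableFuture is the standard class from java.util.concurrent.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;

// Hypothetical worker that supports "wait until state X" without being held.
final class WaitableWorker {
    private int state = 0;
    private boolean closed = false;
    private final List<Waiter> waiters = new ArrayList<>();

    private static final class Waiter {
        final int wanted;
        final CompletableFuture<Boolean> done = new CompletableFuture<>();
        Waiter(int wanted) { this.wanted = wanted; }
    }

    /** Short call: registers interest and returns at once, never blocks. */
    synchronized CompletableFuture<Boolean> whenState(int wanted) {
        Waiter w = new Waiter(wanted);
        if (closed) w.done.complete(false);             // worker already gone
        else if (state == wanted) w.done.complete(true); // already in that state
        else waiters.add(w);
        return w.done;
    }

    /** Changes the state and wakes the waiters interested in it. */
    synchronized void setState(int s) {
        state = s;
        waiters.removeIf(w -> {
            if (w.wanted != s) return false;
            w.done.complete(true);
            return true;
        });
    }

    /** Owner side, before destruction: fail all the remaining waiters. */
    synchronized void close() {
        closed = true;
        for (Waiter w : waiters) w.done.complete(false);
        waiters.clear();
    }
}
```

A client would call whenState() while briefly holding the worker, release the hold, and only then block on the returned future with join(). A result of false tells it that the worker went away before reaching the state, so the worker's destruction never has to wait for an unbounded call.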