Tom is lead Java developer at Kizoom, which provides travel information to mobile devices. He can be contacted at [email protected].
Version 1.2 of the Java 2 SDK, Standard Edition introduced a class called "ThreadLocal" for concurrent programming. In this article, I examine how you can use thread-local variables to improve the performance of frequently used utility classes with instances that must be shared between multiple threads. Scenarios such as these are common on application servers that have many long-running execution threads.
There are two common approaches to making nonthreadsafe code safe for concurrent use: synchronization and segregated instances. Using synchronization means a single instance of the unsafe class may be used, but sacrifices "liveness" because only one thread can use the object at any one time. Segregated instances work by creating a new object for each client, which wastes more resources but does not sacrifice liveness. Which approach is better depends on the requirements of your application: do you want to sacrifice resources or liveness for thread safety? The thread-local approach offers a third way by creating an object per thread, so different threads cannot interfere with each other's objects. This approach maintains liveness, but not at the expense of resources, because no more instances are created than there are threads.
A Threadsafe Date Parser
To illustrate, consider a DateParser interface that parses a String representation of a date into a java.util.Date object, as in Listing One. The contract of this interface requires that multiple threads can safely share a single instance of the DateParser. Granted, a real date parser would support many formats but, for simplicity, our example understands only one.
A naive implementation of the DateParser interface uses a single instance of java.text.SimpleDateFormat to do its parsing; see UnsynchronizedDateParser in Listing Two. Although it is not mentioned in the Javadoc for SimpleDateFormat, multiple threads may not safely call SimpleDateFormat's parse method. Therefore, UnsynchronizedDateParser does not satisfy the DateParser interface contract because, under some circumstances, SimpleDateFormat (and therefore, UnsynchronizedDateParser) will throw a ParseException or an unchecked NumberFormatException when two threads call the parse method at the same time.
The most common way to fix the broken UnsynchronizedDateParser is via synchronization. If you make the parse method synchronized (see SynchronizedDateParser in Listing Three), then the implementation satisfies the contract because only one thread at a time calls the unsafe parse method on the underlying SimpleDateFormat. Another simple fix is to create a fresh instance of SimpleDateFormat each time the parse method of DateParser is called (as in NewInstanceDateParser in Listing Four).
Figure 1, a performance comparison of the two simple approaches to thread safety generated by an application called "PerformanceTester" (available electronically; see "Resource Center," page 5), reveals that using synchronization is considerably better. PerformanceTester exercises each DateParser implementation with several runs, each differing in the number of threads that access the DateParser instance simultaneously. For each run, the number of dates parsed was the same, so you can compare the effect of contention between threads. ControlDateParser is a no-op implementation of DateParser with a parse method that returns null. It is included to gauge the precision of the other results.
Figure 1 also includes a DateParser implementation that uses ThreadLocal. It performs better than the two simple approaches, particularly as the number of threads increases.
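The kind of measurement described above can be sketched as follows. This is a minimal, hypothetical harness and not the PerformanceTester application itself: the names TimingSketch, Parser, run, and synchronizedParser are assumptions made for illustration, and the code assumes a Java 5 or later runtime. It keeps the total number of parses fixed and varies only the number of threads sharing one parser.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical harness for illustration; not the article's PerformanceTester.
class TimingSketch {

    // Same shape as the DateParser interface in Listing One.
    interface Parser {
        Date parse(String text) throws ParseException;
    }

    // A synchronized parser in the spirit of Listing Three.
    static Parser synchronizedParser() {
        return new Parser() {
            private final SimpleDateFormat format = new SimpleDateFormat("dd/MM/yyyy");
            public synchronized Date parse(String text) throws ParseException {
                return format.parse(text);
            }
        };
    }

    // Splits totalParses evenly across threadCount threads, prints the elapsed
    // time, and returns how many parses completed without an exception.
    static int run(final Parser parser, int threadCount, int totalParses) {
        final int perThread = totalParses / threadCount;
        final AtomicInteger completed = new AtomicInteger();
        Thread[] threads = new Thread[threadCount];
        for (int i = 0; i < threadCount; i++) {
            threads[i] = new Thread(new Runnable() {
                public void run() {
                    for (int j = 0; j < perThread; j++) {
                        try {
                            parser.parse("25/12/2003");
                            completed.incrementAndGet();
                        } catch (Exception e) {
                            // An unsafe parser may fail here under contention.
                        }
                    }
                }
            });
        }
        long start = System.nanoTime();
        for (Thread t : threads) {
            t.start();
        }
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        long elapsedMs = (System.nanoTime() - start) / 1000000L;
        System.out.println(threadCount + " threads: " + elapsedMs + " ms");
        return completed.get();
    }

    public static void main(String[] args) {
        Parser parser = synchronizedParser();
        // Same total work for every run; only the thread count changes.
        for (int threads = 1; threads <= 64; threads *= 2) {
            run(parser, threads, 64000);
        }
    }
}
```

A single pass like this is only indicative. JIT warm-up and scheduling noise are the reason for averaging over several runs.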
Implementing a Date Parser with ThreadLocal
The ThreadLocal class was designed to make it simple for you to use thread-local variables in your programs. The work of creating a new variable for each new thread is taken care of, as are other issues that are surprisingly tricky to get right, such as the task of ensuring that thread-local variables are eligible for garbage collection when the thread terminates. All you have to do is initialize (and possibly update) the values of your thread-local variables. To do this, you subclass ThreadLocal and provide an implementation of the initialValue method, which is used to set the initial value of the thread-local variable for each thread. The value is accessed and updated using the get and set methods of ThreadLocal. Multiple threads access a single instance of the specialized ThreadLocal class, which manages the values for each thread and ensures that different threads see different values.
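The three methods just described (initialValue, get, and set) can be seen together in a small sketch. It assumes a Java 5 or later runtime so that generics can be used. The class PerThreadCounter and its methods are hypothetical names introduced only for this illustration.

```java
// Hypothetical example class; assumes Java 5+ for generics and autoboxing.
class PerThreadCounter {

    // One shared ThreadLocal instance; each thread sees its own Integer value.
    private static final ThreadLocal<Integer> COUNT = new ThreadLocal<Integer>() {
        @Override
        protected Integer initialValue() {
            return 0; // the first get() in each thread starts from zero
        }
    };

    // Increments the calling thread's counter and returns the new value.
    static int increment() {
        int next = COUNT.get() + 1;
        COUNT.set(next);
        return next;
    }

    // Calls increment() the given number of times on a fresh thread and
    // returns the last value that thread saw.
    static int incrementOnNewThread(final int times) {
        final int[] result = new int[1];
        Thread t = new Thread(new Runnable() {
            public void run() {
                for (int i = 0; i < times; i++) {
                    result[0] = increment();
                }
            }
        });
        t.start();
        try {
            t.join(); // join() makes the write to result[0] visible here
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return result[0];
    }
}
```

Calling incrementOnNewThread(5) returns 5 no matter how many times the calling thread has already invoked increment(), because the new thread starts from its own initial value.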
For the date-parser example, each thread should hold a separate instance of SimpleDateFormat. This is the approach taken by ThreadLocalDateParser in Listing Five. When the parse method of ThreadLocalDateParser is called, it calls the get method of the specialized ThreadLocal class (DATE_PARSER_THREAD_LOCAL). If the current thread has not called parse before, then calling get on the ThreadLocal object causes it to initialize itself by calling the overridden initialValue method. In this case, a new SimpleDateFormat is created and immediately returned by the get method. Subsequent calls to parse in the same thread will see the same SimpleDateFormat instance. However, a different thread will see a different SimpleDateFormat instance. The ThreadLocal mechanism ensures each thread has its own independent copy of a SimpleDateFormat variable; therefore, the implementation of ThreadLocalDateParser is safe and satisfies the interface contract.
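The claim that one thread always sees the same SimpleDateFormat while another thread sees a different one can be checked directly by comparing object identity. The sketch below assumes Java 5 or later. PerThreadFormatDemo, current, and fromNewThread are hypothetical names and are not part of Listing Five.

```java
import java.text.DateFormat;
import java.text.SimpleDateFormat;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical demo class; mirrors the structure of Listing Five with generics.
class PerThreadFormatDemo {

    private static final ThreadLocal<DateFormat> FORMAT = new ThreadLocal<DateFormat>() {
        @Override
        protected DateFormat initialValue() {
            return new SimpleDateFormat("dd/MM/yyyy");
        }
    };

    // Returns the calling thread's own DateFormat instance.
    static DateFormat current() {
        return FORMAT.get();
    }

    // Returns the DateFormat instance seen by a freshly started thread.
    static DateFormat fromNewThread() {
        final AtomicReference<DateFormat> holder = new AtomicReference<DateFormat>();
        Thread t = new Thread(new Runnable() {
            public void run() {
                holder.set(current());
            }
        });
        t.start();
        try {
            t.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return holder.get();
    }
}
```

Within one thread, current() == current() holds, while current() != fromNewThread() holds across threads.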
One way to understand ThreadLocal is to imagine that it contains a hash table mapping threads to values. When ThreadLocal's get method is called, it uses the current thread (retrieved using Thread.currentThread()) as the key to look up the value to return. If there is no such key in the hash table, then initialValue is called to populate the value. Not all implementations necessarily use this strategy for the ThreadLocal class; however, it is a useful model for conceptualizing ThreadLocal. Figure 2 is the class diagram of this conceptual model.
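The conceptual model can be written down in a few lines. The sketch below is only that model, not how any particular JVM implements java.lang.ThreadLocal. The class name ConceptualThreadLocal and the helper getOnNewThread are assumptions made for illustration, and the code assumes Java 5 or later.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicReference;

// Conceptual model only; real implementations may store values differently.
class ConceptualThreadLocal<T> {

    // The "hash table mapping threads to values" from the model.
    private final Map<Thread, T> values = new HashMap<Thread, T>();

    // Subclasses override this to supply each thread's starting value.
    protected T initialValue() {
        return null;
    }

    public synchronized T get() {
        Thread key = Thread.currentThread();
        if (!values.containsKey(key)) {
            values.put(key, initialValue()); // first access by this thread
        }
        return values.get(key);
    }

    public synchronized void set(T value) {
        values.put(Thread.currentThread(), value);
    }

    // Helper for experiments: reads the value seen by a freshly started thread.
    static <T> T getOnNewThread(final ConceptualThreadLocal<T> local) {
        final AtomicReference<T> holder = new AtomicReference<T>();
        Thread t = new Thread(new Runnable() {
            public void run() {
                holder.set(local.get());
            }
        });
        t.start();
        try {
            t.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return holder.get();
    }
}
```

This sketch holds strong references to every thread that has called get or set, so it never releases the values of terminated threads, and it synchronizes every access. Both are reasons to treat it as a mental model rather than a replacement for the real class.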
Although there are as many thread-local variables as there are threads in an application, there should typically only be a single instance of the ThreadLocal subclass. This is because the ThreadLocal instance is a container for thread-local variables. In ThreadLocalDateParser (Listing Five), the anonymous ThreadLocal subclass is a private static field, which ensures there is only a single instance for the lifetime of the application.
On the other hand, if multiple instances of ThreadLocal are created for the same thread-local variables, then the benefits of using ThreadLocal are diluted as more variables are created than are needed. Worse, there is a memory leak in older implementations of ThreadLocal whereby the ThreadLocal instance does not become eligible for garbage collection until all threads using it terminate, even if the program has no references to the ThreadLocal instance. Fortunately, there is no chance of experiencing memory leaks if you stick to creating a single ThreadLocal instance for each collection of thread-local variables. Put another way, ThreadLocal objects should be private static fields in the class that uses them.
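The cost of getting this wrong can be made visible by counting how many SimpleDateFormat objects are created under each arrangement. The following sketch assumes Java 5 or later. FieldPlacementDemo, its nested classes, and the CREATED counter are hypothetical names used only to contrast a static ThreadLocal field with an instance field.

```java
import java.text.SimpleDateFormat;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical demo; counts how many formats each arrangement creates.
class FieldPlacementDemo {

    static final AtomicInteger CREATED = new AtomicInteger();

    // Builds a ThreadLocal that records every SimpleDateFormat it creates.
    static ThreadLocal<SimpleDateFormat> newLocal() {
        return new ThreadLocal<SimpleDateFormat>() {
            @Override
            protected SimpleDateFormat initialValue() {
                CREATED.incrementAndGet();
                return new SimpleDateFormat("dd/MM/yyyy");
            }
        };
    }

    // Recommended: one ThreadLocal shared by every parser instance.
    static class StaticFieldParser {
        private static final ThreadLocal<SimpleDateFormat> LOCAL = newLocal();
        String pattern() {
            return LOCAL.get().toPattern();
        }
    }

    // Discouraged: each parser instance owns its own ThreadLocal.
    static class InstanceFieldParser {
        private final ThreadLocal<SimpleDateFormat> local = newLocal();
        String pattern() {
            return local.get().toPattern();
        }
    }

    // Creates the given number of parsers on the calling thread, uses each
    // once, and returns how many SimpleDateFormat objects were created.
    static int formatsCreated(boolean useStaticField, int parsers) {
        int before = CREATED.get();
        for (int i = 0; i < parsers; i++) {
            if (useStaticField) {
                new StaticFieldParser().pattern();
            } else {
                new InstanceFieldParser().pattern();
            }
        }
        return CREATED.get() - before;
    }
}
```

With the instance field, every new parser creates another SimpleDateFormat for the same thread. With the static field, a thread creates at most one, however many parser objects it uses.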
ThreadLocal Performance
Even though the general contract of the class did not change, Sun's implementations of java.lang.ThreadLocal changed radically between the release of 1.2 and 1.3, and provided a substantial increase in performance. There were further efficiency gains in the 1.4 release. To measure these improvements, I benchmarked the performance of SynchronizedDateParser, NewInstanceDateParser, and ThreadLocalDateParser for Versions 1.4.1_02, 1.3.1_07, and 1.2.2_014 (Figures 1, 3, and 4, respectively) of Sun's JVMs. The timings shown are mean values averaged over five test runs.
The first conclusion to draw from all the results is that creating a new object each time the parse method is called, as NewInstanceDateParser does, is very expensive. While the cost of creating new object instances has come down substantially for each major JVM release (for the example being discussed, 1.4 is almost twice as fast as 1.2), it is still significantly less efficient than using synchronization or ThreadLocal. There is also an extra cost, not measured here, in terms of memory usage and garbage collection for the many objects created.
ThreadLocalDateParser and SynchronizedDateParser show similar levels of performance across all JVMs and numbers of threads. However, two features stand out.
- Synchronization is marginally more efficient than the ThreadLocal implementation on the 1.2 JVM.
- ThreadLocalDateParser scales to larger numbers of threads more effectively than SynchronizedDateParser for 1.3 and 1.4 JVMs; the curve is much flatter.
In fact, over the range of the test (from 1 to 64 threads), the performance of both NewInstanceDateParser and SynchronizedDateParser for 1.3 and 1.4 JVMs falls by a factor of about two; whereas for ThreadLocalDateParser, the factor is about 1.25. So, while synchronization is good, the ThreadLocal approach is even better, especially for large numbers of threads.
Conclusion
It can be difficult to write efficient code that is safe for multithreaded access. Java's ThreadLocal class provides a powerful, easy-to-use solution, while avoiding the drawbacks of other approaches. Plus, ThreadLocal implementations are more efficient, particularly in later JVMs. If you are trying to improve the performance of frequently used classes that use nonthreadsafe resources that are expensive to create (such as XML parsers or connections to a database), try a ThreadLocal implementation.
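As one illustration of the XML parser case mentioned above, the same pattern can hold a javax.xml.parsers.DocumentBuilder per thread. This is a sketch under stated assumptions rather than a recommended production design: the class ThreadLocalXmlParser and its rootElementName method are hypothetical, and the code assumes Java 7 or later with the JDK's default JAXP parser.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical example; one DocumentBuilder is created per thread on demand.
class ThreadLocalXmlParser {

    private static final ThreadLocal<DocumentBuilder> BUILDER =
            new ThreadLocal<DocumentBuilder>() {
        @Override
        protected DocumentBuilder initialValue() {
            try {
                return DocumentBuilderFactory.newInstance().newDocumentBuilder();
            } catch (ParserConfigurationException e) {
                throw new IllegalStateException("No XML parser available", e);
            }
        }
    };

    // Parses the given XML text with the calling thread's own builder and
    // returns the name of the root element.
    static String rootElementName(String xml) {
        try {
            Document doc = BUILDER.get().parse(new InputSource(new StringReader(xml)));
            return doc.getDocumentElement().getNodeName();
        } catch (SAXException | IOException e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }
}
```

For example, rootElementName("<order><item/></order>") returns "order", and repeated calls from the same thread reuse one builder instead of creating a new one each time.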
DDJ
Listing One
import java.text.ParseException;
import java.util.Date;

public interface DateParser {
    public Date parse(String text) throws ParseException;
}
Listing Two
import java.text.DateFormat;
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;

public class UnsynchronizedDateParser implements DateParser {
    private final DateFormat dateFormat = new SimpleDateFormat("dd/MM/yyyy");

    public Date parse(String text) throws ParseException {
        return dateFormat.parse(text);
    }
}
Listing Three
import java.text.DateFormat;
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;

public class SynchronizedDateParser implements DateParser {
    private final DateFormat dateFormat = new SimpleDateFormat("dd/MM/yyyy");

    public synchronized Date parse(String text) throws ParseException {
        return dateFormat.parse(text);
    }
}
Listing Four
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;

public class NewInstanceDateParser implements DateParser {
    public Date parse(String text) throws ParseException {
        return new SimpleDateFormat("dd/MM/yyyy").parse(text);
    }
}
Listing Five
import java.text.DateFormat;
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;

public class ThreadLocalDateParser implements DateParser {
    private static final ThreadLocal DATE_PARSER_THREAD_LOCAL = new ThreadLocal() {
        protected Object initialValue() {
            return new SimpleDateFormat("dd/MM/yyyy");
        }
    };

    public Date parse(String text) throws ParseException {
        return ((DateFormat) DATE_PARSER_THREAD_LOCAL.get()).parse(text);
    }
}