You can't go far in the software industry these days without hearing about the inevitability of multicore processors. Herb Sutter may have put it best in his Dr. Dobb's Journal article when he said, "The free lunch is over" (www.ddj.com/dept/architect/184405990). It is telling that this warning was quoted by Intel's Tim Mattson and James Reinders in a web-based presentation explaining why multicore is a reality and is here to stay. Attempting to scale processor speed using traditional means such as increased clock frequency and deeper pipelines is running into physical limitations, putting an end to the "free lunch" in performance that developers and users alike have become accustomed to. Luckily, the answer is straightforward: Instead of attempting to make ever more complex single cores, processor makers are putting multiple simpler cores on the same die.
Multicore designs have been used in other processors for some time. Less constrained by traditional architectural designs, graphics processing units (GPUs) and the Cell Broadband Engine (Cell BE) processor (www.ddj.com/dept/64bit/197801624) from Sony, Toshiba, and IBM have demonstrated tremendous performance improvements by employing massively parallel approaches to processor architecture. These processors offer real opportunities for high-performance applications. But the multicore revolution isn't limited to these processors. With multicore designs being adopted by mainstream CPU vendors such as AMD and Intel, parallel programming is a necessity for all developers.
The Problem with Multicore
Although multicore clearly provides potential for high performance, software must explicitly take advantage of the multiple cores to fulfill that potential. Sadly, no compiler turns your serial C++ algorithms into perfectly parallelized programs that scale to an arbitrary number of cores. Traditional approaches such as multithreading force you to spend time worrying about thread management instead of designing scalable parallel algorithms, as the sketch below suggests. As the thread count grows along with the core count, tracking down bugs caused by deadlocks and race conditions significantly hampers developer productivity. At four or more cores, the complexity of threading becomes a serious problem.
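To see where the time goes, consider a minimal sketch (not taken from any particular library) of hand-rolled multithreading: summing an array by splitting it across hardware threads. It uses C++11's std::thread for brevity; code of this era would typically have used pthreads or Win32 threads, with even more boilerplate. Most of what follows is thread bookkeeping rather than algorithm.

#include <algorithm>
#include <iostream>
#include <mutex>
#include <numeric>
#include <thread>
#include <vector>

int main() {
    std::vector<double> data(1000000, 1.0);
    const unsigned num_threads =
        std::max(1u, std::thread::hardware_concurrency());

    double total = 0.0;
    std::mutex total_mutex;   // forgetting this lock introduces a race condition
    std::vector<std::thread> workers;

    const std::size_t chunk = data.size() / num_threads;
    for (unsigned t = 0; t < num_threads; ++t) {
        // Compute each thread's slice of the array by hand.
        const std::size_t begin = t * chunk;
        const std::size_t end =
            (t == num_threads - 1) ? data.size() : begin + chunk;
        workers.emplace_back([&data, &total, &total_mutex, begin, end] {
            const double local = std::accumulate(data.begin() + begin,
                                                 data.begin() + end, 0.0);
            std::lock_guard<std::mutex> lock(total_mutex);
            total += local;
        });
    }
    for (std::size_t i = 0; i < workers.size(); ++i)
        workers[i].join();    // forgetting a join terminates the program

    std::cout << "sum = " << total << "\n";
    return 0;
}

Even for this trivial reduction, the partitioning, locking, and joining dominate the code, and none of it adapts automatically if the work becomes irregular or the core count changes.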
So are there high-level approaches to parallel computing that map well to multicore architectures? After all, parallel computing is nothing new: Architectures such as the Thinking Machines Connection Machine CM-2 sported as many as 65,536 processors in the late '80s. However, past high-level approaches to parallel computing have often centered on new languages or nonstandard extensions to existing languages. That approach allows a great deal of flexibility in terms of syntax, but it forces developers down a heavy migration path, abandoning established tools and languages. The lack of good approaches to parallel programming that embrace the existing toolchain was one of the factors behind the sluggish adoption of parallel computing in the past.