Parallel

Itanium 2 Developer Days Diary

By Robin Drummond, April 27, 2006

The Intel Itanium microarchitecture is fundamentally different under the hood from that of other processors. It "thinks" differently.

Making Sense of Microarchitecture

The main difference between conventional processors and the Intel Itanium 2 microprocessors is Explicitly Parallel Instruction Computing (EPIC), which shifts the responsibility for maximizing parallelism from the processor to the compiler. In contrast to the Reduced Instruction Set Computing (RISC) and Complex Instruction Set Computing (CISC) models, the EPIC model has the compiler, which knows that there are multiple execution units, group instructions that can run in parallel into bundles. The processor then executes the bundled instructions in parallel without having to analyze dependencies at run time.
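
To make the grouping idea concrete, here is a toy sketch in Python of how a compiler might split a straight-line instruction sequence into groups of mutually independent instructions. The instruction format, the name group_independent, and the greedy rule are all illustrative assumptions rather than Intel's actual algorithm; a real compiler also reorders instructions and must fit them into the bundle formats the hardware defines.

```python
from typing import List, Set, Tuple

# Hypothetical instruction format: (destination register, source registers).
Instr = Tuple[str, Tuple[str, ...]]


def group_independent(program: List[Instr]) -> List[List[Instr]]:
    """Greedily split a straight-line program into groups whose
    instructions have no register hazards among themselves.
    Program order is preserved; this toy does no reordering."""
    groups: List[List[Instr]] = []
    current: List[Instr] = []
    written: Set[str] = set()  # registers written in the current group
    read: Set[str] = set()     # registers read in the current group

    for dest, srcs in program:
        # Conservative rule: any register hazard ends the current group.
        hazard = (
            any(s in written for s in srcs)  # reads a value just produced
            or dest in written               # overwrites a value just produced
            or dest in read                  # overwrites a value still being read
        )
        if hazard and current:
            groups.append(current)
            current, written, read = [], set(), set()
        current.append((dest, srcs))
        written.add(dest)
        read.update(srcs)

    if current:
        groups.append(current)
    return groups


program = [
    ("r1", ("r10", "r11")),  # r1 = r10 + r11
    ("r2", ("r12", "r13")),  # r2 = r12 + r13 (independent of r1)
    ("r3", ("r1", "r2")),    # r3 = r1 * r2   (needs both earlier results)
    ("r4", ("r14",)),        # r4 = load r14  (independent of r3)
]
groups = group_independent(program)
for number, group in enumerate(groups):
    print(number, [dest for dest, _ in group])
```

In this example the first two instructions land in one group and the last two in another, because the third instruction has to wait for r1 and r2. The point is only that the dependency analysis happens once, at compile time, instead of in hardware on every execution.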

The Leap to EPIC: Architecture Highlights

The compiler orchestrates predication, allowing instructions to be executed conditionally and reducing the performance hits caused by branch mispredicts in RISC-based systems.
The Intel Itanium 2 compiler recognizes that there are multiple execution units; the compiler groups instructions that can be performed in parallel, making them ready for execution without runtime analysis.
Instruction scheduling is done by the compiler rather than by the processor, so the compiler can produce code that takes full advantage of on-chip resources.
Intel Itanium 2 microarchitecture has 128 general-purpose registers and 128 floating-point registers, versus the 32 general-purpose and 32 floating-point registers found in most RISC-based systems.
Intel Itanium 2 processors use only the registers they need rather than the 8 registers that RISC-based systems take whether they need them or not.
Intel Itanium 2 microarchitecture has more units that execute instructions.
Two-way pipelines pre-load data ahead of possible over-writes, resulting in fewer flushes, fewer problems, and increased reliability and performance.
Software pipelining. Combining speculation, explicit parallelism, predicated execution, and rotating registers with looping branch instructions allows:
- Efficiently pipelined loops
- Smaller code
- Reduced latency
- Elimination of copied code for prologue or epilogue
- Increased parallelism
- More Level 1 cache memory
- Shorter wait times
- Greater I/O bandwidth

—R.D.

The Intel Itanium 2 microprocessor has other speed-enhancing features in addition to the EPIC paradigm; in most cases, the compiler exploits them automatically. For example, non-EPIC processors use branch prediction to speed up execution. Encountering a code branch, x86 chips don't wait around to find out which way to go. They "guess." Branch prediction algorithms are almost always right, but in the highly branched code relevant to data- and calculation-intensive computing, even a tiny percentage of wrong guesses can add up to big performance hits, because a wrong guess forces the processor to discard the work it started down the wrong path and restart from the branch.
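
A back-of-the-envelope calculation shows why a small miss rate matters. The sketch below is a simplified cost model, not a measurement: the function name average_branch_cost and the figures of 1 cycle for a correctly predicted branch, a 20-cycle penalty, and 95 percent accuracy are assumptions chosen only for illustration.

```python
def average_branch_cost(accuracy: float, penalty_cycles: float,
                        correct_cost: float = 1.0) -> float:
    """Expected cycles per branch: every branch pays correct_cost,
    and the mispredicted fraction additionally pays penalty_cycles."""
    if not 0.0 <= accuracy <= 1.0:
        raise ValueError("accuracy must be between 0 and 1")
    return correct_cost + (1.0 - accuracy) * penalty_cycles


# Assumed figures: 95% of guesses are right, a wrong guess costs 20 cycles.
cost = average_branch_cost(accuracy=0.95, penalty_cycles=20.0)
print(f"average cycles per branch: {cost:.2f}")
```

Under these assumed numbers the average branch costs about two cycles instead of one, so a predictor that is wrong only one time in twenty still roughly doubles the cost of branching.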

The Intel Itanium 2 processor does use prediction, but adds predication to avoid misprediction performance hits by running each possible variation of a branch in parallel and tossing the incorrect result. The microprocessor contains predicate bits that can be set to "true" or "false" and attached to a given predicated instruction; the instruction's result is kept only if its predicate is true. The compiler chooses which branches are suitable for predication and generates the code that sets the bits. All developers have to do is recompile for the Intel Itanium 2 processor to make use of predication.
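
The transformation can be modeled in a few lines of Python. This is a conceptual sketch only: the helper guarded_write and the two abs-diff functions are hypothetical names, and Python itself still branches internally, so the code illustrates the idea of guarded instructions rather than how the hardware or any particular compiler implements it.

```python
def branchy_abs_diff(a: int, b: int) -> int:
    """Conventional form: a branch picks one arm to execute."""
    if a > b:
        return a - b
    else:
        return b - a


def guarded_write(predicate: bool, old: int, new: int) -> int:
    """Model of a predicated instruction: the new value is kept
    only when the predicate is true; otherwise old is unchanged."""
    return new if predicate else old


def predicated_abs_diff(a: int, b: int) -> int:
    """Predicated form: one compare sets two complementary predicates,
    both arms are issued, and only the arm with a true predicate commits."""
    p_then = a > b       # predicate for the "then" arm
    p_else = not p_then  # complementary predicate for the "else" arm

    result = 0
    result = guarded_write(p_then, result, a - b)  # "then" arm
    result = guarded_write(p_else, result, b - a)  # "else" arm
    return result


print(predicated_abs_diff(7, 2), predicated_abs_diff(2, 7))
```

Both versions return the same answers. The difference is that the predicated form has no point at which the processor must guess a direction, at the price of issuing instructions from both arms.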

Optimal use of the Intel Itanium 2 microarchitecture's extensive onboard memory caches is also critical to maximizing performance. "The idea is to arrange program execution so that needed instructions and data are in L1 cache as much as possible," said HP's Dick Nicholson at the Developer Days conference sponsored by the Itanium Solutions Alliance. "In a best case/worst case comparison, a program whose data is always in Level 1 cache when needed will run much faster than a program whose data always has to be fetched from main memory."
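
A small simulation illustrates how much the order of memory accesses can matter. The sketch below models a toy direct-mapped cache; the function count_hits, the cache geometry (64 lines of 8 words), and the 128-by-128 array are assumptions picked for illustration and do not describe the Itanium 2's actual cache organization.

```python
from typing import Iterable, List, Optional


def count_hits(addresses: Iterable[int], num_lines: int = 64,
               line_size: int = 8) -> int:
    """Count hits in a toy direct-mapped cache.
    Addresses are word indices; each cache line holds line_size words."""
    tags: List[Optional[int]] = [None] * num_lines
    hits = 0
    for addr in addresses:
        block = addr // line_size   # which memory block the word lives in
        index = block % num_lines   # the single line that block may occupy
        if tags[index] == block:
            hits += 1
        else:
            tags[index] = block     # miss: fetch the block, evict the old one
    return hits


ROWS, COLS = 128, 128  # array stored row by row, one word per element

# Same elements, two visiting orders.
row_major = [r * COLS + c for r in range(ROWS) for c in range(COLS)]
col_major = [r * COLS + c for c in range(COLS) for r in range(ROWS)]

row_hits = count_hits(row_major)
col_hits = count_hits(col_major)
print(f"row-major hits:    {row_hits} of {len(row_major)}")
print(f"column-major hits: {col_hits} of {len(col_major)}")
```

In this toy configuration the row-by-row walk hits on seven of every eight accesses, while the column-by-column walk never hits, because each line is evicted before any neighboring word is touched. Real caches are larger and set-associative, so the gap is rarely this extreme, but the direction of the effect is the same one Nicholson describes.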


Figure 1: Intel Itanium 2 microarchitecture.
