Backends and Runtime Compilation
When this last line runs, the platform does a little more than execute the program object. First, it decides which "backend" should be responsible for executing the program. Backends form the connection between the RapidMind platform and a particular piece of target hardware. For example, RapidMind currently ships with backends for the Cell BE and for OpenGL-based GPUs, plus a fallback backend that generates and compiles C code on the fly. A multicore x86 backend will be released later this year, and other backends are in the works.
Once a suitable backend has been chosen (a process that is generally instantaneous), it is asked to execute the program under the given conditions. The first time this happens, the program object is generally compiled for that particular backend, much as in a JIT environment. Once a program has been compiled, it is not recompiled. Compilation can also be forced ahead of time with the compile() function.
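The choose-then-compile-once behavior described above can be sketched in plain C++. This is only an illustration of the caching idea, not the RapidMind API: Backend, choose_backend, LazyProgram, precompile, and the placeholder kernel (which just adds 1) are all hypothetical names invented for this sketch.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical stand-in for a hardware backend; not a RapidMind type.
struct Backend {
    std::string name;
    bool available;
};

// Stand-in for backend-specific compiled code.
using CompiledKernel = std::function<float(float)>;

// Pick the first available backend, in the caller's priority order.
const Backend& choose_backend(const std::vector<Backend>& backends) {
    assert(!backends.empty());
    for (const Backend& b : backends) {
        if (b.available) return b;
    }
    throw std::runtime_error("no backend available");
}

class LazyProgram {
public:
    // Compile on first use for a backend; later runs reuse the cached code.
    float run(const Backend& backend, float x) {
        return lookup_or_compile(backend)(x);
    }

    // Force compilation ahead of time (assumed analogue of an explicit compile step).
    void precompile(const Backend& backend) { lookup_or_compile(backend); }

    int compile_count() const { return compile_count_; }

private:
    CompiledKernel& lookup_or_compile(const Backend& backend) {
        auto it = cache_.find(backend.name);
        if (it == cache_.end()) {
            ++compile_count_;  // a real backend would generate code here
            CompiledKernel kernel = [](float x) { return x + 1.0f; };
            it = cache_.emplace(backend.name, kernel).first;
        }
        return it->second;
    }

    std::map<std::string, CompiledKernel> cache_;  // one entry per backend name
    int compile_count_ = 0;
};
```

Calling run() twice with the same backend triggers only one compilation in this sketch; calling precompile() first moves that cost out of the first run.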
This runtime compilation mechanism is powerful because the generated code is optimized for the exact conditions under which it runs. This is one reason why RapidMind-generated code can outperform plain C, or even hand-tuned assembly code, in many cases.
A More Interesting Example
The previous program was a trivial example. Program objects typically contain more statements, and include function calls, random-access array reads, control flow, and the like. The following program, which computes a Julia set fractal, is more complicated, but still fairly basic:
    Program julia2d = RM_BEGIN {
      In<Value2i> index;
      Out<Value1f> shade;
      std::complex<Value1f> z(index[0]/(Value1f)width - 0.5f,
                              index[1]/(Value1f)height - 0.5f);
      z *= scale;
      Value1ui i;
      RM_FOR (i = 0, i < max_iterations && std::norm(z) < 4.0f, i++) {
        z = z * z + c;
      } RM_ENDFOR;
      shade = i/(Value1f)max_iterations;
    } RM_END;
This program (available at www.ddj.com/code/) takes as its input the location of a pixel in an image, then computes a shade for that particular pixel by iterating a complex arithmetic expression and checking whether it diverges; see Figure 2. To perform the complex arithmetic, we use the standard C++ class std::complex with a RapidMind value type. Because RapidMind value types act like numeric types in C++, this works out of the box. It is often possible to convert a piece of existing C++ code to RapidMind by simply replacing basic C++ types with RapidMind types.
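For comparison, here is a minimal sketch of the same per-pixel computation in ordinary C++, using float in place of the RapidMind value types. The function name julia_shade and its explicit parameter list are assumptions made for this sketch; in the program above, width, height, scale, c, and max_iterations come from the enclosing scope.

```cpp
#include <cassert>
#include <complex>

// Scalar reference version of the julia2d loop for one pixel (px, py).
// Returns a shade in [0, 1]: the fraction of max_iterations completed
// before z escaped the radius-2 circle.
float julia_shade(int px, int py, int width, int height, float scale,
                  std::complex<float> c, unsigned max_iterations) {
    assert(width > 0 && height > 0 && max_iterations > 0);

    // Map the pixel to a point centered on the origin, then scale it.
    std::complex<float> z(px / static_cast<float>(width) - 0.5f,
                          py / static_cast<float>(height) - 0.5f);
    z *= scale;

    // std::norm returns the squared magnitude, so "< 4" means |z| < 2.
    unsigned i = 0;
    for (; i < max_iterations && std::norm(z) < 4.0f; ++i) {
        z = z * z + c;
    }
    return i / static_cast<float>(max_iterations);
}
```

Line for line, the body matches the RapidMind version except for the types and the loop macro, which is the kind of type-substitution port the paragraph above describes.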