Computation
The example computation I present here (complete source code available in the source code area) is based on the High-Performance Embedded Computing (HPEC) FIRBANK benchmark (http://www.ll.mit.edu/HPECChallenge/). The HPEC benchmarks are representative of the computations performed in signal-processing applications.
The benchmark filters the signals recorded by each of M sensors. Each sensor samples a digital data stream every T seconds. After N samples have been collected, a filtering operation removes unwanted frequencies from the input. For example, if the sensor in question is part of a sonar system, then only the frequencies generated by submarine propellers might be of interest. Other frequencies, like those generated by marine life, would be filtered out of the sensor data. The filter applied to each sensor may, in general, be different.
Typically, the M sensors are arranged in a straight line. Therefore, a signal arriving from a distant point reaches one of the sensors in the array first, then gradually shows up at the other sensors in the array. By comparing the observations of each sensor, you can determine the bearing of the signal. This subsequent processing would be performed after the filter operation I describe here.
Computationally, then, the input is an M×N matrix. The entry at position (i,t) in the matrix is the sample value recorded by sensor i at time t. The output is also an M×N matrix containing data in the same form as the original input but with the uninteresting frequencies removed.
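To make the layout concrete, here is a minimal sketch in plain Python of how such an input matrix could be assembled. The helper name sample_sensors, the two test signals, and the values chosen for N and T are assumptions made for illustration; they are not taken from the benchmark.

```python
import math

def sample_sensors(signals, n_samples, period):
    """Build an M x N matrix (list of rows) of sensor samples.

    Row i holds the samples of sensor i; column t holds the value
    recorded at time t * period seconds.
    """
    return [[signal(t * period) for t in range(n_samples)]
            for signal in signals]

# Two hypothetical sensors, each modeled as a function of time in seconds.
signals = [
    lambda s: math.sin(2 * math.pi * 50 * s),   # 50 Hz tone
    lambda s: math.cos(2 * math.pi * 120 * s),  # 120 Hz tone
]

# M = 2 sensors, N = 8 samples, T = 1 ms between samples.
data = sample_sensors(signals, n_samples=8, period=0.001)
```

Here data[i][t] plays the role of the entry at position (i,t) in the text.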
The filter is also an M×N matrix. The entry at position (i,j) in the matrix is the amount by which to amplify (or attenuate) the frequency j/(NT) hertz in the ith sensor's input data.
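One way to read this description is as three steps per sensor: transform row i of the input to the frequency domain, multiply frequency bin j by filter entry (i,j), and transform back. The sketch below follows that reading in plain Python. It uses a naive O(N²) discrete Fourier transform where real code would call an FFT library, and the names dft and filter_bank are hypothetical; treat it as an illustration under those assumptions rather than as the benchmark's implementation.

```python
import cmath
import math

def dft(x, inverse=False):
    """Naive discrete Fourier transform of a sequence (O(N^2))."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        acc = 0j
        for t in range(n):
            acc += x[t] * cmath.exp(sign * 2j * cmath.pi * k * t / n)
        # The inverse transform carries the 1/N normalization.
        out.append(acc / n if inverse else acc)
    return out

def filter_bank(inputs, filters):
    """Apply row i of `filters` (per-bin gains) to row i of `inputs`."""
    outputs = []
    for samples, gains in zip(inputs, filters):
        spectrum = dft(samples)
        shaped = [s * g for s, g in zip(spectrum, gains)]
        outputs.append(dft(shaped, inverse=True))
    return outputs

# One sensor (M = 1), N = 8 samples, T = 1/8 second, so bin j is j hertz.
N = 8
wanted = [math.cos(2 * math.pi * 1 * t / N) for t in range(N)]    # 1 Hz
unwanted = [math.cos(2 * math.pi * 3 * t / N) for t in range(N)]  # 3 Hz
inputs = [[w + u for w, u in zip(wanted, unwanted)]]

# Keep bin 1 and its mirror image, bin N - 1; zero every other bin.
# (For real-valued input, bin N - j is the complex conjugate of bin j.)
filters = [[1.0 if j in (1, N - 1) else 0.0 for j in range(N)]]

out = filter_bank(inputs, filters)
```

After the call, the real parts of out[0] match the 1 Hz component to within rounding error, and the imaginary parts are negligibly small.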
(The same computation is also used in situations other than the filtering scenario just described. It is called "pulse compression" if each of the rows is a separate set of samples arriving at a single sensor. It's also called "cross-range processing" when used as part of a synthetic-aperture radar computation.)