Algorithm Two (Light Pipe)
This communication model also involves two processes: Reader and Writer. The Writer writes data into shared memory, and the Reader reads consistent data from it. The shared memory is aligned on processor-word boundaries and is initially filled with zeros. The Writer process writes data to shared memory word by word, starting from word number 0. The data must contain only non-zero words (later I'll show how to relax this requirement). Before writing a non-zero word to some address, the Writer process reads the word at this address and checks that its value is zero. If the word is zero, the data can be written; otherwise the Writer process must wait. Suppose there are N words in the shared memory. After writing word number N-1, the Writer process wraps around and starts again at word number 0. Here is some pseudocode for the Writer process:
    int i = 0;
    while( true )
    {
        while( SharedMemory[i] != 0 )
        {
            DoSomethingUseful(); // Or just wait or spin
        }
        SharedMemory[i] = GetDataToPassToAnotherProcess();
        i = (i+1) mod N;
    }
The second process (Reader) reads words from shared memory, starting from word number 0. When it reads a zero-valued word, it waits and then rereads the same word until the value is non-zero, which signals that new data has arrived. After reading a non-zero word, the Reader process writes a zero-valued word in its place. Then it processes the non-zero word it has just read and moves to the next word. After reading word number N-1, the Reader process wraps around and starts again at word number 0. Here is some pseudocode for the Reader process:
    int i = 0;
    while( true )
    {
        word theWord;
        while( (theWord = SharedMemory[i]) == 0 )
        {
            DoSomethingUseful(); // Or just wait or spin
        }
        SharedMemory[i] = 0;
        ProcessDataReceivedFromTheFirstProcess(theWord);
        i = (i+1) mod N;
    }
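The two loops above can be tried out with a small C++ sketch. It is only an illustration under several assumptions: two threads stand in for the two processes, an array of `std::atomic` words stands in for the shared memory, and the names `LightPipe`, `writer`, `reader` and `run_demo` are hypothetical. The acquire/release ordering is a conservative choice made for this sketch; the algorithm as described relies only on word-sized accesses being atomic.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <thread>

using Word = std::uintptr_t;        // one processor word (assumption)
constexpr std::size_t N = 8;        // number of words in the buffer (arbitrary)

// Stand-in for the shared memory region: N words, all zero at start.
struct LightPipe {
    std::atomic<Word> slots[N];
    LightPipe() {
        for (auto& s : slots) s.store(0, std::memory_order_relaxed);
    }
};

// Writer: sends the non-zero values 1..count, one word per message.
void writer(LightPipe& pipe, Word count) {
    std::size_t i = 0;
    for (Word value = 1; value <= count; ++value) {
        // Wait until the Reader has confirmed this slot by zeroing it.
        while (pipe.slots[i].load(std::memory_order_acquire) != 0)
            std::this_thread::yield();
        pipe.slots[i].store(value, std::memory_order_release);
        i = (i + 1) % N;
    }
}

// Reader: receives count messages and returns their sum.
Word reader(LightPipe& pipe, Word count) {
    std::size_t i = 0;
    Word sum = 0;
    for (Word received = 0; received < count; ++received) {
        Word w;
        // Spin until a non-zero word signals that new data has arrived.
        while ((w = pipe.slots[i].load(std::memory_order_acquire)) == 0)
            std::this_thread::yield();
        pipe.slots[i].store(0, std::memory_order_release);  // confirmation
        assert(w == received + 1);   // messages arrive in sending order
        sum += w;
        i = (i + 1) % N;
    }
    return sum;
}

// Runs one Writer and one Reader thread and returns the Reader's sum.
Word run_demo(Word count) {
    LightPipe pipe;
    Word sum = 0;
    std::thread r([&] { sum = reader(pipe, count); });
    std::thread w([&] { writer(pipe, count); });
    w.join();
    r.join();
    return sum;
}
```

Calling `run_demo(1000)` pushes the values 1 to 1000 through the eight-word buffer, so the returned sum should equal 1000 * 1001 / 2. With real processes, the `LightPipe` object would have to live in a shared memory segment mapped by both sides, which this sketch does not show.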
In other words, the processes send each other messages that are one processor word long. The first process (Writer) sends non-zero information messages; the second process (Reader) reads them and writes zero-valued messages in their place to confirm that the non-zero messages have been read. Since aligned processor words are passed between the different processor cache levels and main memory atomically, a word is never observed half-written, so we can be sure that communication will be consistent.
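To make the message-and-confirmation view concrete, here is a deterministic, single-threaded trace of the handshake. It is a sketch only: the `Trace` type, its `trySend` and `tryReceive` steps and the two-word buffer are hypothetical names and sizes chosen for illustration, and the Writer and Reader steps are interleaved by hand instead of running concurrently.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

using Word = std::uintptr_t;
constexpr std::size_t kWords = 2;   // deliberately tiny so the buffer fills fast

struct Trace {
    Word memory[kWords] = {0, 0};   // shared memory starts zero-filled
    std::size_t writePos = 0;       // Writer's private index
    std::size_t readPos = 0;        // Reader's private index

    // One Writer step: fails while the slot still holds an unread message.
    bool trySend(Word value) {
        assert(value != 0);         // only non-zero words may be sent
        if (memory[writePos] != 0) return false;
        memory[writePos] = value;
        writePos = (writePos + 1) % kWords;
        return true;
    }

    // One Reader step: returns 0 when no new message has arrived.
    Word tryReceive() {
        Word w = memory[readPos];
        if (w == 0) return 0;
        memory[readPos] = 0;        // zero-valued confirmation message
        readPos = (readPos + 1) % kWords;
        return w;
    }
};

// Walks through one interleaving and reports whether every step
// behaved as the protocol description predicts.
bool trace_handshake() {
    Trace t;
    bool ok = true;
    ok = ok && t.tryReceive() == 0;    // nothing sent yet
    ok = ok && t.trySend(11);          // word 0 now holds 11
    ok = ok && t.trySend(22);          // word 1 now holds 22
    ok = ok && !t.trySend(33);         // word 0 unconfirmed: Writer must wait
    ok = ok && t.tryReceive() == 11;   // Reader takes word 0 and zeroes it
    ok = ok && t.trySend(33);          // confirmation seen, word 0 reused
    ok = ok && t.tryReceive() == 22;
    ok = ok && t.tryReceive() == 33;
    ok = ok && t.tryReceive() == 0;    // buffer drained again
    return ok;
}
```

The failed third `trySend` is the point of the trace: the Writer cannot overwrite a word until the Reader's zero has replaced it, so no message is lost even when the Writer runs ahead.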
Figures 1, 2 and 3 explain the algorithm and data migration between different processor caches.
If the processes (or threads) run on different processors, the messages are delivered asynchronously by the cache coherence hardware, without any software intervention. This can lead to higher performance.