A Practical Example: The Fire Web Server
When I first sat down to write this article, I recalled that my own learning experience was made somewhat difficult by the relative dearth of real-world examples. The same was true of my experience learning about some of the more esoteric Windows Sockets APIs. As these are both important Windows innovations, I decided to develop a simple, multithreaded web server that demonstrates the use of both. It is named in honor of my friend and colleague Ray Schraff, who often tells our customers that mankind has adopted the World Wide Web faster than any technology since the invention of fire. The Fire web server (available at www.ddj.com/code/) exploits I/O completion ports and the best features of Windows Sockets to deliver respectable performance in about 500 lines of C++ code.
The main() Event
All important initialization occurs in the main function, including the initialization of the Windows socket library, registration of an event handler to capture the user's request to stop the server via CTRL-C, and creation of the listener socket.
Next, a single I/O completion port is created, followed by the creation of a small pool of worker threads. Finally, a fixed number of Connection objects (each of which manages one socket) are created.
The Connection Class
The real meat of the program lies within the implementation of the Connection class. Its constructor creates a socket, associates it with the I/O completion port previously created in the main function, and finally, issues an asynchronous request to accept a client connection.
People familiar with the standard accept API may be confused by the fact that a client socket is created before the call to AcceptEx, so let me explain. AcceptEx requires that the client socket be created up front, but this minor annoyance has a payoff: a socket descriptor can be reused for a new connection via a special call to TransmitFile (one that passes the TF_DISCONNECT and TF_REUSE_SOCKET flags). This means that a server that deals with many short-lived connections can draw on a pool of allocated sockets without incurring the cost of creating new descriptors all the time.
The rest of the Connection class is a simple state machine; any given connection may be in any of four states:
- WAIT_ACCEPT. Waiting for AcceptEx to complete.
- WAIT_REQUEST. Waiting for the client request to be complete.
- WAIT_TRANSMIT. Waiting for the response to be sent.
- WAIT_RESET. Waiting for the client socket to be reset.
Here's how things get rolling: When the Connection objects are allocated in the main function, they all issue asynchronous accept calls on their sockets. This means that shortly after startup, all connection objects are in the WAIT_ACCEPT state, until a client actually connects and the operating system wakes one of the worker threads.
The worker thread that handles the completion takes advantage of the fact that the Connection class is derived from OVERLAPPED: it casts the OVERLAPPED pointer to a Connection pointer and, assuming the pointer checks out, calls the Connection's OnIoComplete function.
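The cast described above can be sketched in portable C++. This is a minimal illustration, not Fire's actual code: OverlappedStub is a hypothetical stand-in for the Win32 OVERLAPPED structure so the example compiles anywhere, and FromOverlapped, Dispatch, and the trimmed-down Connection members are names invented here.

```cpp
#include <cstddef>

// Hypothetical stand-in for the Win32 OVERLAPPED structure. On Windows you
// would include <windows.h> and derive from the real OVERLAPPED instead.
struct OverlappedStub {
    std::size_t internal = 0;
    std::size_t offset = 0;
};

// Trimmed-down Connection (illustrative only). Because it derives from the
// overlapped structure, a pointer to the base subobject identifies the
// whole Connection.
class Connection : public OverlappedStub {
public:
    explicit Connection(int id) : id_(id) {}

    // Records the completion; the real class would advance its state machine.
    void OnIoComplete(std::size_t bytes) {
        lastBytes_ = bytes;
        ++completions_;
    }

    int id() const { return id_; }
    int completions() const { return completions_; }
    std::size_t lastBytes() const { return lastBytes_; }

private:
    int id_;
    int completions_ = 0;
    std::size_t lastBytes_ = 0;
};

// Recover the Connection from the pointer a completion notification carries.
inline Connection* FromOverlapped(OverlappedStub* ov) {
    return static_cast<Connection*>(ov);
}

// What a worker might do with each notification: validate, then dispatch.
// Returns false when there is no connection to dispatch to.
inline bool Dispatch(OverlappedStub* ov, std::size_t bytes) {
    if (ov == nullptr) return false;
    FromOverlapped(ov)->OnIoComplete(bytes);
    return true;
}
```

The static_cast is only safe if every overlapped pointer posted to the port really is the base of a Connection, which is why the sketch funnels all dispatch through one function.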
OnIoComplete implements the Connection class's state machine, essentially transitioning from one waiting state to another by calling the appropriate CompleteXxx function. For example, when a new client connects, the CompleteAccept method is called to perform the necessary steps to prepare the socket for actual use.
Likewise, each CompleteXxx function's last move is to issue another asynchronous I/O request, whether to read more data from the client, transmit a response, or ask that the socket be reset and made ready to accept a new client.
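The state machine can be sketched as follows. This is a simplified, portable illustration under stated assumptions rather than Fire's implementation: no real sockets are involved, the Issue helper and the requestDone parameter are hypothetical, and the names CompleteRequest, CompleteTransmit, and CompleteReset are guesses that follow the CompleteXxx pattern. Each handler's final action is to record the next request it would issue.

```cpp
#include <string>
#include <vector>

// The four waiting states listed in the article.
enum class State { WAIT_ACCEPT, WAIT_REQUEST, WAIT_TRANSMIT, WAIT_RESET };

class Connection {
public:
    State state() const { return state_; }
    const std::vector<std::string>& issued() const { return issued_; }

    // Called when the operation pending in the current state completes.
    // requestDone (hypothetical) says whether the whole client request
    // has arrived; it only matters in WAIT_REQUEST.
    void OnIoComplete(bool requestDone = true) {
        switch (state_) {
            case State::WAIT_ACCEPT:   CompleteAccept();             break;
            case State::WAIT_REQUEST:  CompleteRequest(requestDone); break;
            case State::WAIT_TRANSMIT: CompleteTransmit();           break;
            case State::WAIT_RESET:    CompleteReset();              break;
        }
    }

private:
    // Stand-in for issuing the next asynchronous request. It is always the
    // last thing a handler does.
    void Issue(State next, const char* op) {
        state_ = next;
        issued_.push_back(op);
    }

    void CompleteAccept() { Issue(State::WAIT_REQUEST, "read"); }

    void CompleteRequest(bool requestDone) {
        if (requestDone) {
            Issue(State::WAIT_TRANSMIT, "transmit");
        } else {
            Issue(State::WAIT_REQUEST, "read");  // need more data
        }
    }

    void CompleteTransmit() { Issue(State::WAIT_RESET, "reset"); }
    void CompleteReset()    { Issue(State::WAIT_ACCEPT, "accept"); }

    State state_ = State::WAIT_ACCEPT;
    std::vector<std::string> issued_;
};
```

Driving one Connection through a full cycle returns it to WAIT_ACCEPT, ready for the next client.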
Several points merit mention when designing around I/O completion ports and asynchronous I/O. First, as already mentioned, because asynchronous I/O functions typically return immediately, you must ensure that any buffers passed to the calls remain valid at least until the completion event is handled. This implies heap allocation (or some other storage that outlives the issuing function), since buffers allocated on the stack in a function get junked on exit.
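The buffer-lifetime rule can be illustrated with a small portable sketch. It makes several assumptions: IssueRead is a hypothetical stand-in for an asynchronous read (it writes nothing when called and returns a callable that plays the part of the deferred completion), and the Connection here is a bare struct that simply owns its buffer.

```cpp
#include <array>
#include <cstddef>
#include <cstring>
#include <functional>
#include <memory>
#include <string>

// Illustrative Connection: the buffer is a member, so it lives exactly as
// long as the heap-allocated Connection does.
struct Connection {
    std::array<char, 64> buffer{};
};

// Hypothetical stand-in for an asynchronous read. Nothing is written at
// issue time; the returned callable models the system filling the buffer
// later and reporting how many bytes arrived.
inline std::function<std::size_t()> IssueRead(char* buf, std::size_t cap) {
    return [buf, cap]() -> std::size_t {
        const char data[] = "GET / HTTP/1.0";
        const std::size_t len = sizeof(data) - 1;
        const std::size_t n = len < cap ? len : cap;
        std::memcpy(buf, data, n);
        return n;
    };
}

inline std::string Demo() {
    // Heap allocation: the buffer outlives whichever function issued the read.
    auto conn = std::make_unique<Connection>();
    auto pending = IssueRead(conn->buffer.data(), conn->buffer.size());

    // In a real server the issuing function would return here, and the
    // completion would be handled later, possibly on another thread.
    const std::size_t n = pending();
    return std::string(conn->buffer.data(), n);
}
```

Had the buffer been a local array in the function that called IssueRead, the deferred write would land in memory that no longer belongs to it.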
Second, in a server application such as Fire, any thread could handle any connection at any time. As soon as an asynchronous file operation is issued on a descriptor, it is up to the operating system to pick a thread to run the completion routine. Put differently, there is no guaranteed affinity between the thread issuing an asynchronous I/O call and the thread receiving the completion notification. For this reason, you must design your data structures carefully in order to ensure that threads don't tromp all over each other when trying to handle a request.
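The lack of thread affinity can be simulated without any Windows APIs. The sketch below is an assumption-laden model, not Fire's code: CompletionQueue is a hypothetical stand-in for an I/O completion port (its mutex models the kernel's internal queueing and is not a lock on connection data), and Conn and RunPool are invented names. Each connection has at most one completion in flight, and a worker's final touch on a connection is to post it again, so no lock on the connection itself is needed.

```cpp
#include <atomic>
#include <condition_variable>
#include <cstddef>
#include <mutex>
#include <queue>
#include <thread>
#include <vector>

// Hypothetical per-connection record, deliberately unprotected by any lock.
struct Conn {
    int completions = 0;
};

// Portable stand-in for an I/O completion port.
class CompletionQueue {
public:
    void Post(Conn* c) {
        {
            std::lock_guard<std::mutex> lock(m_);
            q_.push(c);
        }
        cv_.notify_one();
    }

    // Blocks until a notification is available; any waiting worker may get it.
    Conn* Wait() {
        std::unique_lock<std::mutex> lock(m_);
        cv_.wait(lock, [this] { return !q_.empty(); });
        Conn* c = q_.front();
        q_.pop();
        return c;
    }

private:
    std::mutex m_;
    std::condition_variable cv_;
    std::queue<Conn*> q_;
};

// Handles `rounds` completions per connection and returns the final counts.
inline std::vector<int> RunPool(std::size_t workers, std::size_t connections,
                                int rounds) {
    std::vector<int> counts(connections, 0);
    if (workers == 0 || connections == 0 || rounds < 1) return counts;

    CompletionQueue port;
    std::vector<Conn> conns(connections);
    std::atomic<std::size_t> remaining(connections);

    std::vector<std::thread> pool;
    for (std::size_t i = 0; i < workers; ++i) {
        pool.emplace_back([&port, &remaining, workers, rounds] {
            while (Conn* c = port.Wait()) {  // nullptr means "stop"
                const bool more = ++c->completions < rounds;
                if (more) {
                    // Last touch: after this, another worker may own c.
                    port.Post(c);
                } else if (--remaining == 0) {
                    // All connections finished: release every worker.
                    for (std::size_t k = 0; k < workers; ++k) port.Post(nullptr);
                }
            }
        });
    }

    for (Conn& c : conns) port.Post(&c);  // the initial "accepts"
    for (std::thread& t : pool) t.join();

    for (std::size_t i = 0; i < connections; ++i) counts[i] = conns[i].completions;
    return counts;
}
```

Calling RunPool(4, 8, 5) runs eight connections through five completions each on four workers. Which worker handles which completion varies from run to run, yet every count comes out the same because no two workers ever hold the same connection at once.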
Last, you must be extremely careful to design your application to avoid races and the other classes of problems that arise when writing multithreaded programs. For example, when designing Fire, I spent a considerable portion of my development time convincing myself which states were necessary to consider. I was also careful to make sure that the various asynchronous I/O requests (read data, write data, reset the socket, and so on) were always the last operations performed in any of the completion handlers.
The reason is subtle, but clear: Were I to perform any other types of work in a handler function after issuing an I/O request, I would have a race condition; it would be possible for more than one thread to be operating on a connection object concurrently.
The benefit of all this careful planning, however, is that Fire does not require any mutual exclusion mechanism in its implementation.
The result should be that Fire scales well on multicore machines, since individual worker threads are never competing for resources.
Conclusion
I/O completion ports provide an elegant solution to the problem of writing scalable server applications that use multithreading and asynchronous I/O. While it is important to design such applications carefully to avoid certain types of problems such as race conditions or excessive resource contention, the benefits of doing so far outweigh the costs, especially considering the world of multicore, multiprocessor servers in which we now reside.
Acknowledgments
Thanks to Dave Cutler, Len Holgate, Paul Lloyd, and especially Dad for his detailed review.