Mark Bartosik is a software engineer who specializes in debugging. Mark can be contacted at mark.bartosik@bugbrowser.com.
There are a myriad of ways to connect C++ code over process or machine boundaries, none of which are truly native to C++ and most requiring some form of intermediate language. But often, IPC mechanisms such as SOAP, COM, or DCE RPC are overkill for the task at hand. Sure, they provide machine- and language-neutral interface definitions; some even provide a transport. However, this is also their weakness. If all your code is in C++, what use is language neutrality?
Worse still, in an effort to be infinitely flexible and self-describing, SOAP can result in significant performance penalties. I've seen web servers spend more time decoding SOAP messages than doing real work.
When an intermediate language such as IDL, WDL, or XML is used, it is typically compiled into a proxy stub. Your code interfaces with the proxy stub and the proxy stub is linked into your program. The proxy stub sits between your code and the code you want to call. The interface that the intermediate language forces you to use is typically unnatural to C++. For instance, have you ever wanted to pass an std::string to a COM object? COM does not understand what an std::string is; you must convert to/from a BSTR or _bstr_t. The complexity added to C++ code to be compatible with the IPC provider can be huge. Think about the work required to make an out-of-process call to this function:
vector<string> get_titles(const string & author)
Now think about the various potential IDL prototypes for this function, none of which are especially appealing. In this article, I'll show how C++ templates (and macros) can be used to automatically generate code that enables get_titles and other C++ functions to be called out-of-process. For example, calling get_titles can be as simple as:
vector<string> titles = proxy->get_titles(author);
while the implementation of get_titles is as simple as:
vector<string> get_titles(const string & author)
{
    // your code
}
The steps required to call a function out-of-process are roughly:
- Serialize [in] arguments from the caller into a buffer.
- Transport the buffer to the callee's process.
- Create temporary storage for the arguments in the callee's process.
- Reconstruct the [in] arguments in the callee's process.
- Dispatch the call to the correct target function passing both [in] and [out] arguments to it.
- Serialize the [out] arguments and any return value to a buffer.
- Transport the buffer back to the caller's process.
- Deserialize the buffer in the caller's process and update the caller's arguments.
The marshaling library I present here (available at http://www.cuj.com/code/) implements these steps.
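To make the eight steps concrete, here is a minimal sketch of one call to get_titles, with in-memory string streams standing in for the transport. It is not the library's code: the helper names (put_string, get_string, callee_dispatch, call_get_titles), the length-prefixed wire format, and the canned titles are all assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

// Assumed wire format for this sketch only: a string travels as
// "<length> <bytes>", and a vector as "<count> " followed by its strings.
void put_string(std::ostream& os, const std::string& s) {
    os << s.size() << ' ' << s;
}

std::string get_string(std::istream& is) {
    std::size_t n = 0;
    is >> n;
    is.get();                                   // skip the separator
    std::string s(n, '\0');
    if (n > 0) is.read(&s[0], static_cast<std::streamsize>(n));
    assert(!is.fail());                         // the buffer must not be truncated
    return s;
}

// The target function in the callee (canned data for illustration).
std::vector<std::string> get_titles(const std::string& author) {
    std::vector<std::string> titles;
    if (author == "Koenig") {
        titles.push_back("Accelerated C++");
        titles.push_back("C Traps and Pitfalls");
    }
    return titles;
}

// Callee side: steps 3 to 6.
void callee_dispatch(std::istream& request, std::ostream& reply) {
    std::string author = get_string(request);              // steps 3 and 4
    std::vector<std::string> titles = get_titles(author);  // step 5
    reply << titles.size() << ' ';                         // step 6
    for (std::size_t i = 0; i < titles.size(); ++i)
        put_string(reply, titles[i]);
}

// Caller side: steps 1, 2, 7, and 8.
std::vector<std::string> call_get_titles(const std::string& author) {
    std::stringstream request, reply;
    put_string(request, author);        // step 1: serialize the [in] argument
    callee_dispatch(request, reply);    // steps 2 and 7: the "transport" is a direct call
    std::size_t count = 0;              // step 8: deserialize the return value
    reply >> count;
    reply.get();
    std::vector<std::string> titles;
    for (std::size_t i = 0; i < count; ++i)
        titles.push_back(get_string(reply));
    return titles;
}
```

Calling call_get_titles("Koenig") returns the two canned titles after a full round trip through the buffers. Every line of this was written by hand for one function; the point of the library is to have the compiler generate the equivalent.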
The Implementation Challenges
The key implementation challenges that have to be overcome are:
- How to differentiate [in] from [out] parameters?
- How to dispatch to the correct function without requiring users to implement a large switch statement?
- How to serialize?
- How to deserialize?
- How to transport?
Function prototypes should express the intent of the programmer. This means clearly expressing which parameters are [in] and which are [out]. For example, Table 1 illustrates this for an object of type T. This is important where IPC is involved because there's a real cost to marshaling data between processes. This is especially true when the processes are on different machines linked via a WAN. Consider the function:
void get_name(string & given, string & family);
Both parameters are intended to be [out] parameters. However, with C++, only the return value can be truly [out]-only. Because C++ passes references on the stack, the cost difference between an [out] and an [in/out] parameter is zero. Thus, C++ does not need a true [out]-only parameter type. But the cost of needlessly copying data to remote processes could be high. Clearly, you need to give the compiler some way to distinguish [in/out] from [out]-only parameters:
void get_name(marshal::out<string &> given, marshal::out<string &> family);
For consistency, I provide three adapters: marshal::in, marshal::out, and marshal::in_out. Most of the decision logic is in a class called marshal::traits. As Table 2 shows, traits use template specialization to make decisions. The marshal::traits class is used to make several compile-time decisions about how to treat parameters and return values (my marshaling library requires Boost library headers, http://www.boost.org/). The most important decision is which direction to serialize each argument. Both processes must agree on this at compile time because no metainformation is sent over the wire (this is early binding). Other important decisions that marshal::traits is used to make include whether the data can be marshaled at all, whether the return type can be marshaled, and what is a suitable intermediate value type. For example, const char * is a reference type with no storage, so data is deserialized into an alternative value type with the same wire representation, which is selected by marshal::traits<const char*>::value_type. Listing 1 includes some examples of marshal::traits classes. The only permitted return types are value or handle types plus void:
T, void *, const void *, void
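The following is a minimal sketch of how template specialization can make such decisions at compile time. The namespace sketch and the names out, traits, direction, and can_return are hypothetical stand-ins, not the library's actual marshal:: classes, and the rules encoded here are only the ones stated in the text above.

```cpp
#include <cassert>
#include <string>

namespace sketch {   // hypothetical; the library's own names live in marshal::

enum direction { dir_in, dir_out, dir_in_out };

// Adapter that marks a reference parameter as [out]-only.
template <typename T> class out;               // only the reference form is defined
template <typename T> class out<T&> {
public:
    out(T& ref) : ref_(ref) {}                 // implicit, so callers pass plain lvalues
    T& get() const { return ref_; }
private:
    T& ref_;
};

// Direction of a parameter, chosen by specialization.
template <typename T> struct traits           { static const direction dir = dir_in; };
template <typename T> struct traits<const T&> { static const direction dir = dir_in; };
template <typename T> struct traits<T&>       { static const direction dir = dir_in_out; };
template <typename T> struct traits<out<T&> > { static const direction dir = dir_out; };

// Permitted return types: values, void, and opaque handles.
template <typename T> struct can_return              { static const bool value = true; };
template <typename T> struct can_return<T*>          { static const bool value = false; };
template <typename T> struct can_return<T&>          { static const bool value = false; };
template <>           struct can_return<void*>       { static const bool value = true; };
template <>           struct can_return<const void*> { static const bool value = true; };

// A function whose prototype now states its intent.
void get_name(out<std::string&> given, out<std::string&> family) {
    given.get() = "Andrew";
    family.get() = "Koenig";
}

}  // namespace sketch
```

Because each answer is a compile-time constant, a generated proxy can use static_assert(sketch::can_return<R>::value, "...") to reject an unsupported return type, and can skip the serialization code for an [out]-only parameter entirely rather than testing a flag at run time.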
Returning references or pointers to objects in another process is potentially unsafe. The objects certainly cannot be accessed directly. However, returning opaque handles (via void *) and value types is permitted. If you attempt to marshal this function, you get a compile-time error:
char *getenv(const char *varname );
However, this is allowed:
string getenv(const char *varname );
The first version of getenv is clearly a C function, whereas the second is a C++ function. This leads me to the next restriction: pointer and size pairs are often used by C functions for [out] parameters. For example:
char * fgets(char * str, int n, FILE * stream);
The first parameter is an [out] parameter, the second parameter specifies how much data can be written. The problem is that the marshaling information is split across two parameters. There is no support for this and I make no apologies for it because I'm only interested in marshaling C++ function calls and this is a C function. Similar C++ versions of this function that can be marshaled might look like:
bool fgets(file_t file_handle, string & str);
or
bool fgets(file_t file_handle, marshal::out<string &> str);
I decided to treat pointers like references that might be equal to NULL. So how can you marshal arrays? Table 3 lists my recommendations. I could have allowed fixed-length arrays to be used as parameters [1]; for example:
void send_array(int const (&my_array)[10] );
However, such functions are rare, and it would complicate the marshaling implementation. So std::vector remains the preferred way to transfer arrays, although the implementation can easily be extended to support other collections (as long as the size of the collection is available and it can be default initialized).
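The two requirements named above (a known size and default-initializable elements) are enough to write a generic collection serializer. The sketch below is an assumption about how such an extension could look, not the library's implementation; write_collection, read_collection, and round_trip are hypothetical names, and the whitespace-separated text encoding only suits element types such as int or double.

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <list>
#include <ostream>
#include <sstream>
#include <vector>

namespace sketch {   // hypothetical helpers, not part of the library

// Send the element count first, then each element.
template <typename Collection>
void write_collection(std::ostream& os, const Collection& items) {
    os << items.size();
    for (typename Collection::const_iterator it = items.begin();
         it != items.end(); ++it)
        os << ' ' << *it;
}

// The receiver needs the count up front and a default-initialized
// element to extract into.
template <typename Collection>
Collection read_collection(std::istream& is) {
    std::size_t count = 0;
    is >> count;
    Collection items;
    for (std::size_t i = 0; i < count; ++i) {
        typename Collection::value_type element =
            typename Collection::value_type();
        is >> element;
        items.push_back(element);
    }
    assert(!is.fail());
    return items;
}

// Convenience for trying it out: serialize, then deserialize.
template <typename Collection>
Collection round_trip(const Collection& items) {
    std::stringstream wire;
    write_collection(wire, items);
    return read_collection<Collection>(wire);
}

}  // namespace sketch
```

The same two templates work unchanged for std::vector<int> and std::list<double>, which is the sense in which support for other collections is a small extension.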
Serialization and Encoding
Deciding how to go about serializing the arguments was easy. Streams are the typical C++ idiom for serialization [2]. The standard formatted C++ stream operators are not always symmetrical; for example, the standard insertion operator for std::string writes the whole string, but the extraction operator stops at the first whitespace. Similarly, there are insertion operators for const char * and enum, but there are no corresponding extraction operators. The main requirement is that an object inserted into a stream using the operator<< can be fully reconstructed using the operator>>. If this is not possible, as with const char*, then an object of type T can be inserted using the operator<<, but an alternative type marshal::traits<T>::value_type with the same wire representation can later be extracted and then converted back to type T using the function marshal::traits<T>::to_arg. The format into which the data is serialized is not important; many different formats could be used. The amount of state information that these streams need to maintain is minimal. I provide the following streams:
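The asymmetry of the standard operators, and the value_type and to_arg workaround for const char *, can be illustrated with a small sketch. Everything here is hypothetical: text_stream, the "length:bytes" wire format, and sketch::traits only imitate the roles the article assigns to the library's stream types and marshal::traits.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>

namespace sketch {   // hypothetical names; the library's own live in marshal::

// A distinct stream type lets overload resolution pick symmetric
// operators instead of the standard formatted ones.
struct text_stream {
    std::stringstream buf;
};

// Insertion: length, separator, then the bytes, so whitespace survives.
text_stream& operator<<(text_stream& s, const std::string& v) {
    s.buf << v.size() << ':' << v;
    return s;
}

// const char * shares the wire representation of std::string.
text_stream& operator<<(text_stream& s, const char* v) {
    assert(v != 0);
    return s << std::string(v);
}

// Extraction mirrors insertion exactly.
text_stream& operator>>(text_stream& s, std::string& v) {
    std::size_t n = 0;
    s.buf >> n;
    s.buf.get();                                // skip ':'
    v.assign(n, '\0');
    if (n > 0) s.buf.read(&v[0], static_cast<std::streamsize>(n));
    return s;
}

// Stand-in for the role of marshal::traits<const char *>.
template <typename T> struct traits;
template <> struct traits<const char*> {
    typedef std::string value_type;             // has storage, same wire format
    static const char* to_arg(const value_type& v) { return v.c_str(); }
};

// Standard operators: extraction stops at the first whitespace.
std::string standard_round_trip(const std::string& text) {
    std::stringstream ss;
    ss << text;
    std::string result;
    ss >> result;
    return result;
}

// Symmetric operators: insert a const char *, extract the value_type,
// then convert back with to_arg.
std::string symmetric_round_trip(const char* text) {
    text_stream s;
    s << text;
    traits<const char*>::value_type storage;
    s >> storage;
    return std::string(traits<const char*>::to_arg(storage));
}

}  // namespace sketch
```

With the input "C Traps and Pitfalls", standard_round_trip yields only "C", while symmetric_round_trip reconstructs the whole string.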
marshal::raw_ostream, marshal::raw_istream, marshal::raw_iostream, marshal::lex_ostream, marshal::lex_istream, marshal::lex_iostream, marshal::null_ostream, marshal::null_istream
The real work is done in the stream operators; the actual stream class does little more than provide a type to select the appropriate insertion operators and a placeholder for the stream buffer. The lex_stream operators transfer data in text form, providing a simple, platform-neutral encoding scheme, although no specific character set is assumed. The raw_stream operators are built for speed and transfer the built-in types as a raw byte pattern (like memcpy does), but types such as const char * are run-length encoded. This is efficient but requires that the processes are running on machines and compiled with compilers that use the same bit layout for the basic types. For example, sizeof(int) must be the same on both platforms, and int must have the same representation (Big or Little Endian). If only strictly defined types such as __int32 are used in the interface functions, then only the bit layout needs to be the same. Alignment and calling conventions can be different without causing any problems. Some IPC mechanisms dictate a calling convention. In this case, the C++ compiler generates all the proxy stub code, so this restriction does not exist.
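A raw encoding of the built-in types can be sketched in a few lines. This is an illustration of the memcpy-style approach described above, not the library's raw_stream; raw_buffer, raw_put, and raw_get are hypothetical names, and the sketch deliberately ignores strings.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string>
#include <type_traits>

namespace sketch {   // hypothetical; not the library's raw_stream classes

struct raw_buffer {
    std::string bytes;       // the wire image
    std::size_t read_pos;    // next byte to extract
    raw_buffer() : read_pos(0) {}
};

// Copy the object's byte pattern straight into the buffer.
template <typename T>
void raw_put(raw_buffer& b, const T& value) {
    static_assert(std::is_arithmetic<T>::value,
                  "this sketch handles built-in arithmetic types only");
    char tmp[sizeof(T)];
    std::memcpy(tmp, &value, sizeof(T));
    b.bytes.append(tmp, sizeof(T));
}

// Copy the bytes back out. Both sides must agree on sizeof(T) and on
// byte order, which is exactly the restriction described in the text.
template <typename T>
T raw_get(raw_buffer& b) {
    static_assert(std::is_arithmetic<T>::value,
                  "this sketch handles built-in arithmetic types only");
    assert(b.read_pos + sizeof(T) <= b.bytes.size());
    T value;
    std::memcpy(&value, b.bytes.data() + b.read_pos, sizeof(T));
    b.read_pos += sizeof(T);
    return value;
}

}  // namespace sketch
```

Using a fixed-width type such as std::int32_t in the interface removes the sizeof mismatch, leaving only byte order to agree on. No alignment requirement is placed on the buffer because every access goes through memcpy.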
The null_ostream does nothing. This is the default type for the debug stream (more on that later). A do-nothing class is used rather than a NULL pointer so that the compiler can completely eliminate unused code. This is a recurring theme in the code. Most of the conditions are template arguments that select whether to instantiate code; there are relatively few runtime tests. Other potential streams include asn1_ [3], little_endian, and big_endian.
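The do-nothing stream idea is easy to demonstrate. The sketch below assumes a hypothetical sketch::null_ostream and a hypothetical traced_strlen function; the log format imitates the program output shown later in the article but is otherwise an assumption.

```cpp
#include <cassert>
#include <cstring>
#include <sstream>

namespace sketch {   // hypothetical; mirrors the idea, not the library's code

// A stream type whose insertion operator discards everything.
struct null_ostream {};

template <typename T>
null_ostream& operator<<(null_ostream& s, const T&) {
    return s;                                   // empty body: trivially inlined away
}

// The debug stream type is a template argument, so logging is selected
// at compile time rather than tested at run time.
template <typename DebugStream>
int traced_strlen(const char* text, DebugStream& debug) {
    assert(text != 0);
    int n = static_cast<int>(std::strlen(text));
    debug << "func: strlen [in](" << text << ") [retval]" << n << '\n';
    return n;
}

}  // namespace sketch
```

Instantiating traced_strlen with sketch::null_ostream leaves nothing for the optimizer to keep from the logging line, whereas a null pointer would need an if test on every call. Passing a std::ostringstream or std::clog instead turns the logging on without touching the function.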
Transport
My only real concern about a transport is that the implementation should be easy to swap. Again, C++ streams provide the answer. Streams typically come in two parts, the stream class and a buffer class. For example, an std::ofstream class uses an std::filebuf class to do the actual writing to the file. Typical marshaling scenarios include cross-process (same machine), across LAN, and across the Internet. I provide buffer classes that are able to use pipes (or HTTP/HTTPS) as transports, which covers each of the typical scenarios. If you write or derive your own buffer classes, you might like to add compression or encryption. If you write your own stream classes, you must remember to provide all the appropriate insertion and extraction operators.
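Writing a transport buffer means deriving from std::streambuf and overriding the virtual functions that move characters. The sketch below is a hypothetical in-memory loopback buffer, far simpler than a pipe or HTTP buffer, intended only to show where a real transport would plug in; loopback_buf and echo_through_transport are assumed names.

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <ostream>
#include <streambuf>
#include <string>

namespace sketch {   // hypothetical transport, for illustration only

// Bytes written are stored and can later be read back. A real buffer
// would write to a pipe or socket in overflow() and read in underflow().
class loopback_buf : public std::streambuf {
public:
    loopback_buf() : read_pos_(0), current_(0) {}
    std::size_t bytes_sent() const { return data_.size(); }

protected:
    // Called for every character written, because no put area is set up.
    int_type overflow(int_type ch) override {
        if (traits_type::eq_int_type(ch, traits_type::eof()))
            return traits_type::not_eof(ch);
        data_.push_back(traits_type::to_char_type(ch));
        return ch;
    }

    // Called when the get area is empty: expose one character at a time.
    int_type underflow() override {
        if (read_pos_ >= data_.size())
            return traits_type::eof();
        current_ = data_[read_pos_++];
        setg(&current_, &current_, &current_ + 1);
        return traits_type::to_int_type(current_);
    }

private:
    std::string data_;
    std::size_t read_pos_;
    char current_;
};

// Separate output and input streams share one transport buffer.
int echo_through_transport(int value) {
    loopback_buf pipe;
    std::ostream to_callee(&pipe);
    std::istream from_caller(&pipe);
    to_callee << value << ' ';
    int received = 0;
    from_caller >> received;
    assert(!from_caller.fail());
    return received;
}

}  // namespace sketch
```

Because the stream classes only see a std::streambuf pointer, swapping this buffer for one that compresses, encrypts, or sends over a network requires no change to the insertion and extraction operators.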
Typically, the same transport is used for both sending and receiving data. For example, a full-duplex pipe (or two half-duplex pipes) carries traffic in both directions, and HTTP requests result in HTTP responses. Even so, data flowing in one direction can be encoded differently from data flowing in the other; for example, requests might be sent "in-the-clear," whereas return results might be compressed, encrypted, or even transported over an alternative physical network. For maximum flexibility, you can mix-and-match different types of input and output streams and different types of transport buffers.
Dispatch
If you have done much network programming, you have probably used a message specification. A message specification typically defines a list of messages and the fields contained within those messages. One of those fields is usually the message ID. The receiver of the message first decodes the message ID, then performs a switch on the message ID:
switch (message_id)
{
    case MSG_ID_X: do_x(pmessage); break;
    case MSG_ID_Y: do_y(pmessage); break;
    // ... one case per message type
}
This is a primitive form of dispatch. Instead of a message ID, COM can use v-table indexes, function names, or dispids (dispatch IDs). Unless you are implementing COM's dispatch mechanism IDispatch::Invoke yourself, you will not have to write a switch statement; COM uses metainformation in type library files. I needed some form of dispatch ID that could be generated by the compiler and used to automatically look up a function pointer.
The C preprocessor is able to help out here: Listing 2 shows a simple interface definition. The macro expansion results in a class declaration; Listing 3 is an abbreviated version. In this case, strlen has a dispatch ID of 1, and get_titles a dispatch ID of 2. The dispatch ID of each function is calculated by taking __LINE__ and subtracting the value of line_begin. A function's dispatch ID is the number of lines, including blank lines, between the MARSHAL macro for that function and the BEGIN macro. When an instance of proxy_t is created, it populates an array that is indexed using the dispatch IDs and contains pointers to invoke functions. There are other techniques for generating the dispatch IDs, including using the initialization order of member variables, and using offsetof for each member variable. Each has its drawbacks. Using the preprocessor in this way means that the addition of vertical whitespace between the BEGIN and END macros alters the dispatch IDs, requiring recompilation of both the caller and the callee, so be warned. Using initialization order has a runtime overhead. offsetof is intended only for use with plain-old data types and can generate compiler warnings (for GCC 3.4, use -Wno-invalid-offsetof). The default is to use offsetof, but special versions of the BEGIN macro let the method be selected. Listing 4 shows a relatively portable way to calculate the dispatch IDs using offsetof; the marshaling code uses this principle.
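The __LINE__ technique can be sketched as follows. The macros SKETCH_BEGIN and SKETCH_MARSHAL, the invoke function signature, and the dispatcher class are all hypothetical simplifications; the library's real macros generate considerably more than an enumerator.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>

namespace sketch {   // hypothetical; the real macros generate much more

// Uniform signature for the generated "invoke" functions.
typedef std::string (*invoke_fn)(const std::string& serialized_args);

std::string invoke_strlen(const std::string& args) {
    std::ostringstream os;
    os << args.size();
    return os.str();
}

std::string invoke_get_titles(const std::string& args) {
    return args == "Koenig" ? "Accelerated C++" : "";   // canned data
}

// Each MARSHAL line gets an ID equal to its distance from the BEGIN line.
#define SKETCH_BEGIN()       enum { line_begin = __LINE__ };
#define SKETCH_MARSHAL(name) enum { id_##name = __LINE__ - line_begin };

struct interface_ids {
    SKETCH_BEGIN()
    SKETCH_MARSHAL(strlen)
    SKETCH_MARSHAL(get_titles)
};

// The table is populated once and then indexed by dispatch ID.
class dispatcher {
public:
    dispatcher() {
        for (std::size_t i = 0; i < table_size; ++i) table_[i] = 0;
        table_[interface_ids::id_strlen] = &invoke_strlen;
        table_[interface_ids::id_get_titles] = &invoke_get_titles;
    }
    std::string dispatch(std::size_t id, const std::string& args) const {
        assert(id < table_size && table_[id] != 0);
        return table_[id](args);
    }
private:
    static const std::size_t table_size = 8;
    invoke_fn table_[table_size];
};

}  // namespace sketch
```

Here id_strlen is 1 and id_get_titles is 2. Inserting a blank line between the two SKETCH_MARSHAL lines would change id_get_titles to 3, which is the fragility the text warns about.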
There are two arguments to the MARSHAL_ARGn_EX macro for when functions are overloaded. The second argument specifies the function type. The preferred macro MARSHAL_ARGn uses the typeof operator; this is not yet part of Standard C++ [5], and can only be used if your compiler supports it. Before the target function can be called, the values of the parameters must be deserialized. This is easy for value types such as int; for std::string, an overloaded operator>> can be used.
This does not work for reference types such as const char * because there is no suitable operator>>. An alternative type, selected by marshal::traits<const char *>::value_type, is converted to the parameter type using the function marshal::traits<const char *>::to_arg. Enumerated types also pose a problem. Enumerated types are convertible to int so they are easy to serialize using operator<<, but there is no corresponding operator>>. To deserialize an enumerated type, an integer must first be deserialized, then cast to the enumeration. Listing 5 shows a simplified version of the dispatch function and how it only reads the [in] parameters. The selection of which parameters to read from the input stream is done by read_arg. It uses the marshal::traits class to decide whether the parameter is [in] or [out], and only reads the [in] parameters from the stream.
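A simplified picture of read_arg, including the integer detour for enumerations, might look like the sketch below. The names out_param, color, paint, and dispatch_paint are hypothetical, and overloading is used here in place of the traits lookup the library performs.

```cpp
#include <cassert>
#include <istream>
#include <sstream>
#include <string>

namespace sketch {   // hypothetical stand-ins for the library's machinery

enum color { red, green, blue };

// Storage for an [out]-only parameter in the callee.
template <typename T>
struct out_param {
    T value;
    out_param() : value() {}
};

// [in] parameter: its value was sent, so extract it.
template <typename T>
void read_arg(std::istream& is, T& storage) {
    is >> storage;
}

// [out]-only parameter: nothing was sent, so read nothing.
template <typename T>
void read_arg(std::istream&, out_param<T>&) {}

// Enumerations have no operator>>: read an int, then cast.
void read_arg(std::istream& is, color& storage) {
    int raw = 0;
    is >> raw;
    storage = static_cast<color>(raw);
}

// The target function.
void paint(color c, int width, out_param<std::string>& description) {
    std::ostringstream os;
    os << "color " << c << " width " << width;
    description.value = os.str();
}

// Dispatch: create temporary storage, read only the [in] arguments,
// call the target, and hand back the [out] value.
std::string dispatch_paint(std::istream& request) {
    color c = red;
    int width = 0;
    out_param<std::string> description;
    read_arg(request, c);
    read_arg(request, width);
    read_arg(request, description);             // consumes no input
    assert(!request.fail());
    paint(c, width, description);
    return description.value;
}

}  // namespace sketch
```

A request containing "2 7" therefore carries only two values even though paint takes three parameters; the third costs nothing on the way in.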
After dispatching the deserialized arguments to the correct function, the code waits for either an exception to occur or for the function to return, so that the [out] arguments can be transported back to the caller. The marshal::traits class is used to decide which arguments to transport back to the caller. If an exception occurs, an exception of type std::runtime_error is thrown in the caller's context. If the function has a void return type and no [out] arguments, then MARSHAL_ARG_ASYNC can be used to declare that the calling code does not need to wait for the callee to return.
Debug Support
Since I write debugging tools, debugging support is important to me [4]. When debugging networked systems, I want to see what the inputs and outputs are to a particular process. Systems that have diagnostic logging on their network interfaces are much easier to diagnose than systems that do not. The code has support for three streams: output, input, and debug. If a debug stream is provided, then all traffic is logged to that stream. Configuring the debug stream to use std::clog provides a convenient way to demonstrate the code.
Putting It All Together
Using the automatically generated proxy is simple. Listing 2 gives an example of interface definition for the functions strlen and get_titles. Listings 6 and 7 illustrate how these functions are implemented and called.
In Listing 7, the call to proxy.bind configures the optional debug stream to std::clog. The full code for these listings is available at http://www.cuj.com/code/. You should run the program pipe_callee.exe from a console window, followed by pipe_caller.exe from another console window. The output of the calling program is:
func: strlen [in](hello) [retval]5
func: get_titles [in](Koenig) [retval]{Accelerated C++,C Traps and Pitfalls,Ruminations on C++,}
func: get_website [retval]www.bugbrowser.com
The code available online includes Windows and POSIX examples that bind to a named pipe. Additionally, there is code that binds to an HTTP (or HTTPS) transport buffer using Microsoft-specific APIs (WinHttp.dll and WinINet.dll) for the caller and ISAPI for the callee. There is also reference documentation.
Limitations and Future Directions
The key limitation with this code is that it only supports free-standing functions (not member functions) in the global namespace. With objects, additional consideration needs to be given to operators and copy constructors. To mitigate this, out-of-process interfaces are much less complex than in-process interfaces, and thus the need for objects is reduced. Pointer and size pairs are often used by C functions to return [out] parameters; this idiom is not supported and I recommend using std::string or std::vector instead. All marshaled types must be default constructible; alternatively, a suitable traits<T>::value_type must be provided.
The code has been built and tested with GCC 3.2.3, GCC 3.3.1 on Linux and Windows, and requires at least Visual C++ 7.1. Taking advantage of the typeof operator requires GCC 3.4 (under development at the time of writing). Note that support for typeof is optional and can be used to simplify interface specifications. Mileage with other compilers may vary. The code is still experimental, so I am interested in feedback. I would like to add configurable and chainable filters that operate on each function call; for example, uuencode, zip, encrypt, and so on. I hope to post any updates or additional transport buffers at http://www.cuj.com/ or http://www.bugbrowser.com/.
Acknowledgments
Thanks to Andrew Koenig and Jon Jagger for reviewing this.
References and Notes
- [1] Vandevoorde, David and Nicolai M. Josuttis. C++ Templates. Addison-Wesley, 2002.
- [2] Langer, Angelika and Klaus Kreft. Standard C++: IOStreams and Locales, Addison-Wesley, 2000.
- [3] For an explanation of ASN.1 see http://www.asn1.org/ or http://asn1.elibel.tm.fr.
- [4] See http://www.bugbrowser.com/.
- [5] Järvi, Jaakko et al. Decltype and auto, doc no N1478 03-0061, http://std.dkuug.dk/jtc1/sc22/wg21/docs/papers/2003/n1478.pdf.