You know the one about the syntactic sugar? It causes cancer of the semicolon [1]? Well, enough jokes for today. We have a lot of ground to cover today, so let's hit the road early.
This installment continues the implementation of discriminated unions in C++. Today we'll finish the discussion on alignment and proceed full-speed to writing some code for the actual implementation of Variant
.
Before that, let's refresh our memory on the progress we made in the previous installment of "Generic<Programming>" [2].
Highlights from the Previous Episode
After putting together a wish list for implementing discriminated unions, we discussed what would be the best storage model. After stumbling upon alignment issues, we figured out that a storage model that is at the same time properly aligned, efficient, and exception-friendly is:
union { unsigned char buffer_[neededSize]; Align dummy_; };
where neededSize
is the maximum size of the types in the union, and Align
is a POD [3] type that ensures proper alignment.
Depending on how well chosen Align
is, the discriminated union's storage can vary from optimal to unnecessarily large to defective.
Even with the best Align
, the implementation above is still not 100-percent portable for all types. In theory, someone could implement a compiler that respects the Standard but still does not work properly with discriminated unions. This is because the Standard does not guarantee that all user-defined types ultimately have the alignment of some POD type. Such a compiler, however, would be more of a figment of a wicked language lawyer's imagination, rather than a realistic language implementation.
Implementing an Alignment Calculator
That said, several effective alignment computation algorithms can be devised. You can see a couple implemented in the Boost libraries [4]. One that works well for computing the alignment of a type T
is:
- Start with a collection of all primitive types.
- Eliminate from that collection the types larger than
T
. - The resulting
Align
is a Cunion
containing each of the types in the collection computed in step 2.
The basic idea underlying this algorithm is that any user-defined type T
will ultimately contain only primitive types. Consequently, T
's alignment requirements will be the same as one of those types. Larger types have larger alignment requirements; hence it's reasonable to infer that an upper bound of T
's alignment is the alignment of its largest member.
This algorithm might "overalign" the result. For example, if T
is char[5]
and sizeof(int)
is four, T
might have an alignment of one, but Align
will have the alignment of an int
(most likely, four).
Erring on the large side of alignment is not that bad in most cases. (Erring on the small side is disastrous.) On the other hand, erring on the large side of size wastes space. Our algorithm's step 2 ensures that only types of size less than or equal to T
's size remain selected.
To implement the algorithm for computing Align
, recall from the previous installment of "Generic<Programming>" that we have two appropriate tools at our disposal: typelists and ConfigurableUnion
. Typelists allow us to manipulate collections of types, allowing us to perform steps 1 and 2. ConfigurableUnion
generates a C-style union
from a typelist, thus taking care of step 3.
It is worth noting that, in addition to all primitive types, the initial collection of types should include some simple struct
s as well. Each of those struct
s contains only one member of a primitive type. We generate these stub structures from a simple template:
template <typename U> struct Structify { U dummy_; };
The reason for introducing these little structures is simple. Some compilers put tougher alignment on a structure containing only one int
than on the int
alone. This way, the compiler writer ensures that all user-defined types have the same alignment requirements. In turn, this architectural decision makes things easy for various parts of the compiler.
We thus start with a typelist containing all primitive types and all "structified" types:
class Unknown; typedef cons< char, short int, int, long int, float, double, long double, char*, short int*, int*, long int*, float*, double*, long double*, void*, Unknown (*)(Unknown), Unknown* Unknown::*, Unknown (Unknown::*)(Unknown), Structify<char>, Structify<short int>, Structify<int>, Structify<long int>, Structify<float>, Structify<double>, Structify<long double>, Structify<char*>, Structify<short int*>, Structify<int*>, Structify<long int*>, Structify<float*>, Structify<double*>, Structify<long double*>, Structify<void*>, Structify<Unknown (*)(Unknown)>, Structify<Unknown* Unknown::*>, Structify<Unknown (Unknown::*)(Unknown)> >::type TypesOfAllAlignments;
Okay, so the primitive types are there; the structified types are there, too very predictable. But some lines of the typelist above look as if the copyeditor's cat did a little Teutonic dance on the keyboard. The incriminating lines are:
Unknown (*)(Unknown) Unknown* Unknown::* Unknown (Unknown::*)(Unknown)
Also, their structified versions appear at the bottom of the list.
Who are these guys, and where did they come from? To make them look more familiar, let's give them names:
Unknown (*T1)(Unknown); Unknown* Unknown::* T2; Unknown (Unknown::* T3)(Unknown);
Aha! If you precede each of the lines above with typedef
, they reveal themselves in all the tired glory of the C declaration syntax. T1
is a pointer to a function taking Unknown
and returning Unknown
; T2
is a pointer to member of an Unknown
, member of type Unknown*
; and T3
is a pointer to member function of Unknown
, taking an Unknown
as an argument and returning an Unknown
as a result.
The trick here is that the cunningly named Unknown
is indeed not defined. Therefore, the compiler will assume that Unknown
is defined elsewhere and that it has the worst alignment requirements possible. (Otherwise, the compiler can optimize Unknown
's layout. Optimization here plays against generality.)
Okay, here comes the fun part. Let's now eliminate from TypesOfAllAlignments
all types that are larger than a given size, by writing the following template:
template <class TList, size_t size> struct ComputeAlignBound;
First, we treat the limited case, as a nice recursive algorithm ought to:
template <size_t size> struct ComputeAlignBound<null_typelist, size> { typedef null_typelist Result; };
Then, we treat the general case:
template <class Head, class Tail, size_t size> struct ComputeAlignBound<typelist<Head, Tail>, size> { typedef typename ComputeAlignBound<Tail, size>::result TailResult; typedef typename select< // details below sizeof(Head) <= size, typelist<Head, TailResult>, TailResult>::result Result; };
First, ComputeAlignBound
yields the result of its recursive application to the tail of the typelist in TailResult
. Now, if the head's size is less than or equal to size
, the result is a typelist formed by Head
and TailResult
. Otherwise, the result is only TailResult
Head
is no longer included.
This type selection is performed by a little template entitled select
. To save column space, please allow me to refer you to [5]. It's a very useful little template, and if you are interested in generic programming using C++, you owe it to yourself to find out about it.
To put it all together, we need to tie up some loose ends. Let's recap what we have and what we need. We have the MaxSize
template, which calculates the maximum size of all types in a typelist. We have ConfigurableUnion
, which builds a union
out of a typelist. Finally, we have ComputeAlignBound
, which calculates a type that has the same or more stringent alignment requirements as any type in a typelist. We need a POD type that provides a properly aligned storage for any type in a typelist. Here's how to exploit what we have to obtain what we need:
template <typename TList> class AlignedPOD { enum { maxSize = MaxSize<TList>::result }; typedef ComputeAlignBound<TypesOfAllAlignments, maxSize>::Result AlignTypes; public: typedef ConfigurableUnion<AlignTypes> Result; };
You can play encapsulation games by putting ComputeAlignBound
, Unknown
, Structify
, and ConfigurableUnion
in some private namespace so that they are properly hidden.
Discriminated Unions Implementation: The Fake vtable Idiom
Let's get back to discriminated unions. At this point, half of this article's real estate has been spent on computing alignment. I hope that it was a well-invested effort; alignment is important in a host of low-level applications, as well as in the efficient implementation of higher-level abstractions.