In conclusion, it would be best if Variant
used in-situ allocation.
The search continues with the obvious idea of in-situ allocation: we could use an untyped buffer of characters, right?
template <class TList> class Variant { // compute the needed buffer size enum { neededSize = ... }; unsigned char buffer_[neededSize]; ... };
We actually can compute neededSize
quite easily. Just think of implementing a recursive algorithm that gets the largest element in a typelist: you compare the head with the largest element in the tail (which you compute recursively), and you choose the larger of the two.
template <class TList> struct MaxSize; template <> struct MaxSize<null_type> { enum { result = 0 }; }; template <class Head, class Tail> struct MaxSize< typelist<Head, Tail> > { private: enum { tailResult = size_t(MaxSize<Tail>::result) }; public: enum { result = sizeof(Head) > tailResult ? sizeof(Head) : size_t(tailResult) }; };
As you can see, MaxSize
recursively applies itself to compute the internal variable tailResult
, and then it chooses the larger number. The limit case (null typelist) just returns the smallest possible positive number — zero.
Given MaxSize
, our Variant
definition becomes:
template <class TList> class Variant { enum { neededSize = MaxSize<TList>::result }; unsigned char buffer_[neededSize]; ... };
If you heard an awful sound, that's our current storage implementation hitting the wall: it doesn't work. Worse, it might work sometimes — with some types, on some machines, on some compilers, and during certain phases of the moon. Get ready for tough times — we're having alignment issues.
Indeed, the problem with our approach is that the compiler doesn't know that we plan on storing "true" data inside buffer_
. For all it knows, the compiler must care about a buffer of characters.
It turns out that, in most current computer architectures, some data types must be stored at specific offsets in memory. For example, typically four-byte integers can be allocated only at addresses that are a multiple of four. That doesn't mean that all eight-byte data must be allocated at multiple-of-eight addresses — their alignment could be four as well. (No data, however, can require alignment greater than its own size, because that would break the array storage rules.)
The rationale for the notion of alignment is the physical arrangement of memory banks, the bus, and the processor. For example, certain memory banks are physically linked to the lowest-order eight bits of the memory bus, so by construction their content cannot directly reach the higher-order bits.
If you try to access data that's not properly aligned, one of several things might happen. You can get the data you wanted, only much slower, because essentially the processor simulates in software, by shuffling data around, what should be a hardware operation. (The x86 processors do this.) Many other processors, however, terminate your program immediately on grounds of an attempt to execute an illegal memory access.
The good news is that most of the time, you don't need to worry about alignment: the processor nicely takes care of it by allocating all data at properly aligned addresses. The bad news is that you do need to care exactly now, because buffer_[neededSize]
really means you're taking your life in your hands.
How can we guarantee that buffer_
is properly aligned? Well, that's exactly what union
does — it guarantees enough size for its largest member and alignment for its most restrictive member. This is what we need, so our storage should look something like this:
union { unsigned char buffer_[neededSize]; Align dummy_; };
We never use dummy_
; it's there just for alignment purposes. We now only need to define the Align
type properly. We'll investigate exactly how to do that in the next installment. Until then, happy coding and ideas galore!
Bibliography
[1] A. Alexandrescu. "Generic<Programming>: Typelists and Applications," C/C++ Users Journal C++ Experts Forum, February 2002, <www.cuj.com/experts/2002/alexandr.htm>.
[2] You can download Loki from <www.moderncppdesign.com>.
[3] A. Alexandrescu. Modern C++ Design (Addison-Wesley Longman, 2001).
[4] A. Alexandrescu. "An Implementation of Discriminated Unions in C++," The 2001 C++ Template Programming Workshop, held in conjunction with OOPSLA 2001, <www.oonumerics.org/tmpw01/alexandrescu.pdf>.
[5] Rumor has it that the next revised C++ Standard, fondly called C++0x by those who keep score at home, will support types with constructors as union members. Anyway, types with destructors can't be allowed, and that makes sense. Given that the union doesn't store a discriminator, how would the compiler know which destructor to call when going out of scope?
Andrei Alexandrescu is a Ph.D. student at University of Washington in Seattle, and author of the acclaimed book Modern C++ Design. He may be contacted at www.moderncppdesign.com. Andrei is also one of the featured instructors of The C++ Seminar