Now that we have storage for the objects held by the union, we need the discriminator the tag that Variant
stores and that tells what type the object inside really is. The discriminator must make it possible to perform certain type-specific operations on the contents of the Variant
, such as type identification and data manipulation in a type-safe manner.
We have many design options. The simplest is to store an integral discriminator:
template <class TList, class Align = AlignedPOD<TList>::Result> class Variant { union { char buffer_[size]; Align dummy_; }; int discriminator_; public: ... };
The disadvantage of this solution is that integers are not very "smart." To do different things depending on the discriminator, the switch
solution rears its head, and switch
means coupling. Perhaps we could use the int
as an index into some table, but then why not directly use pointers into that table (a solution that we'll investigate later).
The second solution is to use a proxy polymorphic object.
template <class TList, class Align = AlignedPOD<TList>::Result> class Variant { union { char buffer_[size]; Align dummy_; }; struct ImplBase { virtual type_info& TypeId() = 0; virtual Variant Clone(const Variant&) = 0; virtual void Destroy(Variant&) = 0; ... }; ImplBase* pImpl_; public: ... };
The idea here is to perform various operations on the same data by using a pointer to a polymorphic object. Then, depending on what actually lies in buffer_
, the pointer points to a different concrete ImplBase
, and the polymorphic object performs specialized operations on it.
This approach is indeed very clean, except for one thing: efficiency. When invoking a function such as Destroy
, the access is:
Variant
object.- Dereference
pImpl_
. - Dereference
pImpl_
's virtual function table, the so-called "vtable
" [6]. - Index into the appropriate function in the
vtable
.
Indexing with a constant into the vtable
and accessing a field in an object (which boils down to indexing as well) are not problematic. The problem is dereferencing pointers there are two levels of indirection between issuing the call and getting to the function. However, one level of indirection should be enough to dispatch a call to the correct type handler. What to do?
(Allow me a parenthetical remark. If you wonder whether I forgot the two laws of optimizations: (1) don't do it and (2) don't do it yet well, I still remember them. Why, then, is Variant
so keen on optimization? The reason is simple. This is not application code; this is a library artifact, and more so, it is one that emulates and rivals a language feature. The more streamlined Variant
is, the more you can rely on it for heavyweight duties; the more sloppily it is implemented, the more it becomes a toy that you wouldn't care to use in a real application.)
A good approach would be to simulate what the compiler does: fake a vtable
and ensure only one level of indirection. This leads us to the following code:
template <class TList, class Align = AlignedPOD<TList>::Result> class Variant { union { char buffer_[size]; Align dummy_; }; struct VTable { const std::type_info& (*typeId_)(); void (*destroy_)(const Variant&); void (*clone_)(const Variant&, Variant&); }; VTable* vptr_; public: ... };
The VTable
structure contains pointers to various functions, functions that access the Variant
object. (Be wary of the syntax for defining destroy_
and clone_
; they are pointers to functions stored as regular data members inside VTable
. The C declaration syntax strikes again.)
The fake vtable
idiom offers, at the cost of some implementation effort, only one level of indirection and great flexibility in manipulating the Variant
object. Let's now see how we can initialize and use the fake vtable
.
Initializing a Variant Object
Upon construction, Variant
needs to initialize its vptr_
member. In addition, it should properly initialize each pointer inside the VTable
. To this end, let's define a template class VTableImpl<T>
. That template defines static functions that match the types of the pointers to functions in VTable
.
... inside Variant ... template <class T> struct VTableImpl { static const std::type_info& TypeId() { return typeid(T); } static void Destroy(const Variant& var) { const T& data = *reinterpret_cast<const T*>(&var.buffer_[0]); data.~T(); } static void Clone(const Variant& src, Variant& dest) { new(&dest.buffer_[0]) T( *reinterpret_cast<const T*>(&src.buffer_[0])); dest.vptr_ = src.vptr_; } };
Notice a couple of interesting aspects in VTableImpl
's implementation:
- All functions are static and take the
Variant
object to manipulate by reference as their first argument. - Whenever it needs access to the actual object of type
T
,VTableImpl
obtains it by casting &buffer_[0]
toT*
. In other words, all functions inVTableImpl
assume that there's aT
seated insidebuffer_
.
It is easy to put the function pointers and the type stored in buffer_
in sync this is the job of the constructor:
... inside Variant ... template <class T> Variant(const T& val) { new(&buffer_[0]) T(val); static VTable vtbl = { &VTableImpl<T>::TypeId, &VTableImpl<T>::Destroy, &VTableImpl<T>::Clone, }; vptr_ = &vtbl; }
Yes, it's that easy. We create a copy of the incoming val
and comfortably seat it (by using the placement new
operator) inside buffer_
. (We worked hard to ensure that buffer_
is properly aligned to hold a T
.) Then, we create a static VTable
right on the spot and initialize it with pointers to the three corresponding static functions in VTableImpl
. All that is left to do is to initialize vptr_
with the address of the newly created vtable
, and that's all there is to it.
Stop that's not really all. Consider this:
typedef Variant<cons<int, double>::type> Number; string s("Hello, World!"); Number weird(s);
The code above compiles fine because it successfully instantiates Variant
's constructor with string
. However, the intent of Number
is obviously to accept only int
or double
upon construction. So we need to ensure that Variant
's templated constructor cannot be instantiated with any other type than those listed in Variant
's typelist.
Here two helper tools kick in: typelist manipulation, which makes it possible to compute at compile time the index of a type in a typelist; any typelist package contains such a facility. The other tool is compile time assertions a feature that allows you to render code non-compilable if a logical condition is not met. I won't bloat the article by repeating the whole story. Both Boost [7] and Loki [8] have typelist manipulation and compile-time assertions that vary only a little in syntax.