C/C++

Discriminated Unions (II)

By Andrei Alexandrescu, June 01, 2002

We continue working on our Variant design and implementation. This time we focus on storage - how data is stored and accessed. This article presents techniques for computing alignment of types with reasonable accuracy and portability. Then, once the storage proper is in place, we need polymorphic means to manipulate that storage - for which Variant uses the fake vtable

Now that we have storage for the objects held by the union, we need the discriminator — the tag that Variant stores and that tells what type the object inside really is. The discriminator must make it possible to perform certain type-specific operations on the contents of the Variant, such as type identification and data manipulation in a type-safe manner.

We have many design options. The simplest is to store an integral discriminator:

template <class TList, class Align = AlignedPOD<TList>::Result>
class Variant
{
  union
  {
    char buffer_[size];
    Align dummy_;
  };
  int discriminator_;
public:
  ...
};

The disadvantage of this solution is that integers are not very "smart." To do different things depending on the discriminator, the switch solution rears its head, and switch means coupling. Perhaps we could use the int as an index into some table, but then why not directly use pointers into that table (a solution that we'll investigate later).

The second solution is to use a proxy polymorphic object.

template <class TList, class Align = AlignedPOD<TList>::Result>
class Variant
{
  union
  {
    char buffer_[size];
  Align dummy_;
  };
  struct ImplBase
  {
    virtual type_info& TypeId() = 0;
    virtual Variant Clone(const Variant&) = 0;
    virtual void Destroy(Variant&) = 0;
    ...
  };
  ImplBase* pImpl_;
public:
  ...
};

The idea here is to perform various operations on the same data by using a pointer to a polymorphic object. Then, depending on what actually lies in buffer_, the pointer points to a different concrete ImplBase, and the polymorphic object performs specialized operations on it.

This approach is indeed very clean, except for one thing: efficiency. When invoking a function such as Destroy, the access is:

Variant object.
Dereference pImpl_.
Dereference pImpl_'s virtual function table, the so-called "vtable" [6].
Index into the appropriate function in the vtable.

Indexing with a constant into the vtable and accessing a field in an object (which boils down to indexing as well) are not problematic. The problem is dereferencing pointers — there are two levels of indirection between issuing the call and getting to the function. However, one level of indirection should be enough to dispatch a call to the correct type handler. What to do?

(Allow me a parenthetical remark. If you wonder whether I forgot the two laws of optimizations: (1) don't do it and (2) don't do it yet — well, I still remember them. Why, then, is Variant so keen on optimization? The reason is simple. This is not application code; this is a library artifact, and more so, it is one that emulates and rivals a language feature. The more streamlined Variant is, the more you can rely on it for heavyweight duties; the more sloppily it is implemented, the more it becomes a toy that you wouldn't care to use in a real application.)

A good approach would be to simulate what the compiler does: fake a vtable and ensure only one level of indirection. This leads us to the following code:

template <class TList, class Align = AlignedPOD<TList>::Result>
class Variant
{
  union
  {
    char buffer_[size];
    Align dummy_;
  };
  struct VTable
  {
    const std::type_info& (*typeId_)();
    void (*destroy_)(const Variant&);
    void (*clone_)(const Variant&, Variant&);
  };
  VTable* vptr_;
public:
  ...
};

The VTable structure contains pointers to various functions, functions that access the Variant object. (Be wary of the syntax for defining destroy_ and clone_; they are pointers to functions stored as regular data members inside VTable. The C declaration syntax strikes again.)

The fake vtable idiom offers, at the cost of some implementation effort, only one level of indirection and great flexibility in manipulating the Variant object. Let's now see how we can initialize and use the fake vtable.

Initializing a Variant Object

Upon construction, Variant needs to initialize its vptr_ member. In addition, it should properly initialize each pointer inside the VTable. To this end, let's define a template class VTableImpl<T>. That template defines static functions that match the types of the pointers to functions in VTable.

... inside Variant ...
template <class T>
struct VTableImpl
{
  static const std::type_info& TypeId()
  {
    return typeid(T);
  }

  static void Destroy(const Variant& var)
  {
  const T& data =
        *reinterpret_cast<const T*>(&var.buffer_[0]);
     data.~T();
  }

  static void Clone(const Variant& src, Variant& dest)
  {
    new(&dest.buffer_[0]) T(
      *reinterpret_cast<const T*>(&src.buffer_[0]));
    dest.vptr_ = src.vptr_;
  }
};

Notice a couple of interesting aspects in VTableImpl's implementation:

All functions are static and take the Variant object to manipulate by reference as their first argument.
Whenever it needs access to the actual object of type T, VTableImpl obtains it by casting &buffer_[0] to T*. In other words, all functions in VTableImpl assume that there's a T seated inside buffer_.

It is easy to put the function pointers and the type stored in buffer_ in sync — this is the job of the constructor:

... inside Variant ...
template <class T>
Variant(const T& val)
{
  new(&buffer_[0]) T(val);
  static VTable vtbl =
  {
    &VTableImpl<T>::TypeId,
    &VTableImpl<T>::Destroy,
    &VTableImpl<T>::Clone,
  };
  vptr_ = &vtbl;
}

Yes, it's that easy. We create a copy of the incoming val and comfortably seat it (by using the placement new operator) inside buffer_. (We worked hard to ensure that buffer_ is properly aligned to hold a T.) Then, we create a static VTable right on the spot and initialize it with pointers to the three corresponding static functions in VTableImpl. All that is left to do is to initialize vptr_ with the address of the newly created vtable, and that's all there is to it.

Stop — that's not really all. Consider this:

typedef Variant<cons<int, double>::type> Number;
string s("Hello, World!");
Number weird(s);

The code above compiles fine because it successfully instantiates Variant's constructor with string. However, the intent of Number is obviously to accept only int or double upon construction. So we need to ensure that Variant's templated constructor cannot be instantiated with any other type than those listed in Variant's typelist.

Here two helper tools kick in: typelist manipulation, which makes it possible to compute at compile time the index of a type in a typelist; any typelist package contains such a facility. The other tool is compile time assertions — a feature that allows you to render code non-compilable if a logical condition is not met. I won't bloat the article by repeating the whole story. Both Boost [7] and Loki [8] have typelist manipulation and compile-time assertions that vary only a little in syntax.

Previous 1 2 3 Next

More Insights

INFO-LINK


	To upload an avatar photo, first complete your Disqus profile. \| View the list of supported HTML tags you can use to style comments. \| Please read our commenting policy.

C/C++

Discriminated Unions (II)

Initializing a Variant Object

Related Reading

More Insights

Currently we allow the following HTML tags in comments:

Single tags

Matching tags

C/C++ Recent Articles

Most Popular

This month's Dr. Dobb's Journal

Upcoming Events

Featured Reports

Featured Whitepapers

Most Recent Premium Content

C/C++

Discriminated Unions (II)

Initializing a Variant Object

Related Reading

News

Commentary

Slideshow

Video

Most Popular

More Insights

White Papers

Reports

Webcasts

Currently we allow the following HTML tags in comments:

Single tags

Matching tags

C/C++ Recent Articles

Most Popular

This month's Dr. Dobb's Journal

Upcoming Events

Featured Reports

Featured Whitepapers

Most Recent Premium Content