An Improved Variant Type Based on Member Templates

C has generic pointers and varying length argument lists for flexibility. C++ has templates for even more flexibility, and better type safety in the bargain.


October 01, 2000
URL:http://drdobbs.com/an-improved-variant-type-based-on-member/184401293

October 2000/An Improved Variant Type Based on Member Templates


This article presents class variant_t, which encapsulates a mechanism to hold values of arbitrary types. If the types used to initialize variant_t variables have full copy semantics, this variant_t can be stored in an array, returned from a function, and contained in any standard collection. As an example use of the variant_t, I present a class ArgumentList. It is a container adaptor to a vector<variant_t> that provides an interface akin to a list of arguments to a function.

Developing a variant_t

A preliminary definition for variant_t could be the following struct:

struct variant_t
{  
   variant_t () : data ( 0 ) {}
   variant_t (const void * value)
   : data ( value ) {}
   const void* data;
};

It would be used as follows:

variant_t array [2];
int n0 = 3;
double f0 = 3.14;
array[0] = variant_t((const void*)&n0);
array[1] = variant_t((const void*)&f0);
int n1 = *(const int*)(array[0].data);
double f1 = 
   *(const double*)(array[1].data);

One of the problems with this design is that it forces the user to explicitly cast to and from the variant_t object. A recent addition to the language, member templates, permits a more flexible design.

Version 0

Listing 1 shows the definition of variant0_t. The constructor for variant0_t is a member template function:

template<typename T>
variant0_t(const T& v){...}

A member template function is a member function that is instantiated (generated by the compiler) for a particular type when a call using this type is first seen by the translator. In the case of variant0_t, the compiler will generate a constructor according to the particular type of the argument. For example, the expression variant0_t(3) will generate a constructor that takes a const int & as an argument and produce a call to that constructor. Similarly, variant0_t has a generic conversion operator:

template<typename T>
operator T () const {...}

The compiler will try to generate code to convert from variant0_t to any type by instantiating this conversion operator for the particular return type. For example, the expression int n = variant0_t(3) will call an operator int (instantiated by the compiler) to provide an int that can be used to initialize n.
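To make the instantiation rule concrete, here is a minimal sketch. The class probe_t and the two helper functions are hypothetical names, not part of the listings. The class stores no value and only records which instantiation of the member template constructor the compiler selected.

```cpp
#include <typeinfo>

// Hypothetical helper, not one of the article's listings: it only
// records which instantiation of the member templates was selected.
struct probe_t
{
  // Instantiated once per argument type T; remembers that type.
  template<typename T>
  probe_t(const T&) : ctor_type(&typeid(T)) {}

  // Instantiated once per requested result type T. There is no stored
  // value in this sketch, so it returns a value-initialized T.
  template<typename T>
  operator T() const { return T(); }

  const std::type_info* ctor_type;
};

// True if constructing from an int literal selected the int instantiation.
inline bool built_from_int()
{ return *probe_t(3).ctor_type == typeid(int); }

// True if constructing from a double literal selected the double one.
inline bool built_from_double()
{ return *probe_t(3.14).ctor_type == typeid(double); }
```

Writing int n = probe_t(3) would instantiate both probe_t(const int&) and operator int(), which is the same pair of instantiations described above for variant0_t.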

The following snippet shows how variant0_t can be used.

variant0_t array [2];
int n0 = 3;
double f0 = 3.14;
array[0] = n0;
array[1] = f0;
int n1 = array[0];
double f1 = array[1];

The conversion code uses reinterpret_cast<> to cast back the void pointer. I will add type checking to this cast later.

A problem with variant0_t is that it holds a pointer to the original value. If the original value goes out of scope, then variant0_t points to an invalid object. The solution is to hold a copy of the original value.
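The fact that variant0_t stores an address rather than a value can be observed without invoking undefined behavior. In this sketch, which repeats the definition from Listing 1 and adds a hypothetical function value_after_change(), the original variable is modified while it is still in scope and the change shows through the variant.

```cpp
#include <cstddef>

// Same definition as Listing 1, repeated so the sketch is self-contained.
struct variant0_t
{
  variant0_t() : data(NULL) {}
  template<typename T> variant0_t(const T& v) : data(&v) {}
  template<typename T> operator T() const
  { return *reinterpret_cast<const T*>(data); }
  const void* data;
};

// Returns the value seen through the variant after the original changed.
inline int value_after_change()
{
  int n = 3;
  variant0_t v(n);   // v.data holds the address of n; nothing is copied
  n = 7;             // modify the original
  int seen = v;      // reads through the stored pointer
  return seen;       // the new value of n, not the value at construction
}
```

If n had gone out of scope before the conversion, the same read would go through a dangling pointer.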

Version 1

Listing 2 shows variant1_t. It maintains a copy of the value it is initialized with in a separate memory block. Using variant1_t, the following operations are valid:

int* a0 = new int(3);
variant1_t v ( *a0 );
delete a0;
int a1 = v;  // a1 = 3

A problem with this new version is that it assumes that T can be safely copied as a bitwise copy operation.
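Whether a bitwise copy is safe for a given type can be checked at compile time. The sketch below assumes a C++11 compiler and uses std::is_trivially_copyable from <type_traits>, which postdates this article. The function bitwise_copy is a hypothetical helper, not part of variant1_t.

```cpp
#include <cstring>
#include <string>
#include <type_traits>

// Hypothetical helper: copies src into dst with memcpy, but refuses to
// compile for types whose copy constructor does real work.
template<typename T>
void bitwise_copy(T& dst, const T& src)
{
  static_assert(std::is_trivially_copyable<T>::value,
                "T must be trivially copyable to be copied with memcpy");
  std::memcpy(&dst, &src, sizeof(T));
}

// Types such as int and double qualify...
static_assert(std::is_trivially_copyable<int>::value,
              "int can be copied bitwise");
static_assert(std::is_trivially_copyable<double>::value,
              "double can be copied bitwise");
// ...but std::string manages memory of its own, so it does not.
static_assert(!std::is_trivially_copyable<std::string>::value,
              "std::string cannot be copied bitwise");
```

A variant1_t initialized with a std::string would copy the string's internal pointers without copying the characters they refer to, which is the problem the next version addresses.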

Version 2

Listing 3 shows variant2_t. It is a considerable improvement over the last variant, so I'll explain it step by step.

Consider the following template:

template<typename T>
struct Impl
{
   Impl (T v) : data (v) {}
   T data;
};

This template can be used to safely obtain and store a copy of a given variable of an arbitrary type. Since it uses the type's copy constructor, the copy is guaranteed to be appropriate, as long as the copy constructor is properly defined. The next snippet extends Impl<> in a way that makes it particularly useful for our purposes:

struct ImplBase
{
   virtual ~ImplBase() {}
};
   
template<typename T>
struct Impl : ImplBase
{
   Impl (T v) : data (v) {}
   T data;
};

The big difference here is that Impl<T> is a class derived from ImplBase, which has a virtual destructor. Every instantiation of Impl<T> is therefore a polymorphic type; that is, a pointer to an object of this type can be converted to a pointer to the base type, and vice versa. For instance, given a pointer to ImplBase, it is possible to safely cast it to a pointer to Impl<T> using dynamic_cast<>. A pointer to Impl<T> converts back to a pointer to ImplBase implicitly, since this is an upcast. This inheritance relationship enables arbitrary types to be stored in arrays and containers (via pointers), in a way that complements the variant_t.

Consider the following example:

ImplBase* array[2];
array[0] = new Impl<int>(3);
array[1] = new Impl<double>(3.14);
int n1 = (dynamic_cast<Impl<int>*>(array[0]))->data;
double f1 = (dynamic_cast<Impl<double>*>(array[1]))->data;
delete array[0];
delete array[1];

The above code implements an array of pointers to polymorphic objects that hold values of types int and double. The expression

new Impl<int>(3)

instantiates the type Impl<int>, which has the following important properties:

1) It contains a data member of type int.

2) This data member has been safely initialized as a copy of the int value 3. Both the data member and the original value are of exactly the same type, so this initialization is type safe.

3) It is polymorphic, which means that a pointer to it can safely be converted to and from ImplBase *.

4) ImplBase has a virtual destructor, so an Impl<int> can be safely deleted through an ImplBase *.

5) This type is uniquely bound to the type int. Different instances of Impl<> instantiated for different types will have similar properties, which will guarantee type-safe and copy-safe operations while maintaining polymorphic behavior.

The expression:

int n1 = (dynamic_cast<Impl<int>*>(array[0]))->data

is type-safe because the dynamic cast fails, yielding a null pointer, if array[0] does not point to an instance of Impl<int>. Therefore, if the dynamic cast succeeds, ->data is guaranteed to be of type int. Furthermore, because Impl<> is a polymorphic type, this dynamic cast is guaranteed to succeed when the requested type matches the type of the stored value.
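The result of a mismatched dynamic cast can be tested directly. This sketch repeats ImplBase and Impl<> from above and adds a hypothetical helper, holds<T>(), which reports whether a base pointer refers to an Impl<T>.

```cpp
#include <cstddef>

struct ImplBase
{
  virtual ~ImplBase() {}
};

template<typename T>
struct Impl : ImplBase
{
  Impl(T v) : data(v) {}
  T data;
};

// Hypothetical helper: true if p points to an Impl<T>. dynamic_cast
// on a pointer yields a null pointer when the dynamic type differs.
template<typename T>
bool holds(ImplBase* p)
{
  return dynamic_cast<Impl<T>*>(p) != NULL;
}
```

For a pointer obtained from new Impl<int>(3), holds<int>() returns true and holds<double>() returns false, so a caller can check the type before touching ->data.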

variant2_t (Listing 3) is the combination of variant1_t and Impl<>. An additional method in variant2_t, CastFromBase, encapsulates the casting operation shown above and enforces type checking by throwing an exception if the dynamic cast fails.

Dealing with Shared Representation

Since variant2_t holds a pointer to a value in a separate object, it creates a new problem.

Consider the following code:

variant_t foo() {return 3;}

The return statement is functionally equivalent to:

variant_t result ;
int _unnamed_int(3);
variant_t _unnamed_var_t(_unnamed_int); // temporary
result = _unnamed_var_t;

As you can see, a temporary variant_t object is constructed and copied into the result object (which is constructed on the caller side and passed as a hidden parameter so the return statement can assign to it). This temporary holds a pointer to an instance of Impl<int> through its ImplBase pointer. Because the class defines no copy operations of its own, the compiler-generated assignment copies only the pointer, so this same instance is stored in result. When the temporary goes out of scope, its destructor deletes the instance and result is left holding a pointer to an invalid object.

There are three basic solutions to this problem: "copy on copy," unique ownership, and reference counting.

By "copy on copy," I mean that every time the variant_t is initialized or copied, it makes a copy of the underlying value. I will not discuss this solution here, since it is relatively inefficient, but I provide an implementation on the CUJ website (www.cuj.com/code).

The unique ownership idiom mandates that only one instance of a (smart) pointer can point to a given object. std::auto_ptr<> is an example of this kind of smart pointer. Whenever a given instance of auto_ptr<> is copied, it passes a pointer to the owned object to the copy. In this manner, auto_ptr<> hands off ownership to the copy and relinquishes ownership for itself.
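The hand-off of ownership can be sketched as follows. Since std::auto_ptr<> was deprecated in C++11 and removed in C++17, this sketch swaps in std::unique_ptr<>, which follows the same unique ownership idiom but requires the transfer to be spelled out with std::move. The function name is hypothetical.

```cpp
#include <memory>
#include <utility>

// Transfers ownership from one smart pointer to another and reports
// whether the source gave up the object while the target received it.
inline bool source_is_empty_after_transfer()
{
  std::unique_ptr<int> first(new int(3));
  std::unique_ptr<int> second(std::move(first)); // ownership moves to second
  return first.get() == nullptr && *second == 3;
}
```

After the transfer only second owns the int, and first holds a null pointer.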

As an alternative, reference counting is the idiom for collaborative shared ownership. The reference counting idiom allows any number of smart pointers to own the same object without the risk of containing "dangling" pointers (i.e., pointers to deleted objects). Each smart pointer is guaranteed to contain a valid pointer because the jointly owned object can be destroyed only when all smart pointers to it have relinquished ownership.
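A minimal sketch of the idiom, with the count stored inside the shared object, might look like this. The names counted_t and handle_t are hypothetical, and the static member live exists only so the sketch can show when the object is destroyed.

```cpp
#include <cstddef>

// Hypothetical counted object: the count lives inside the object itself.
struct counted_t
{
  counted_t() : refs(0) { ++live; }
  ~counted_t() { --live; }
  void add_ref() { ++refs; }
  void release() { if (--refs == 0) delete this; }
  std::size_t refs;
  static int live;      // number of counted_t objects currently alive
};
int counted_t::live = 0;

// Hypothetical smart pointer that shares ownership of a counted_t.
class handle_t
{
public:
  handle_t() : p(new counted_t) { p->add_ref(); }
  handle_t(const handle_t& rhs) : p(rhs.p) { p->add_ref(); }
  handle_t& operator=(const handle_t& rhs)
  {
    rhs.p->add_ref();   // add first, so that self-assignment is safe
    p->release();
    p = rhs.p;
    return *this;
  }
  ~handle_t() { p->release(); }
  std::size_t count() const { return p->refs; }
private:
  counted_t* p;
};
```

Copying a handle_t raises the count, destroying one lowers it, and the counted_t is deleted only when the last handle releases it.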

There are two major drawbacks to reference counting:

1) The most straightforward implementations of the idiom require that additional data and operations be placed in the contained object. In this case, any class that will be reference counted must be slightly modified. (However, see [1] for an implementation that does not impose such requirements.)

2) Circular references can keep the count from ever reaching zero, in which case the objects involved are never destroyed.
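The second drawback can be reproduced in a few lines. This sketch assumes a C++11 compiler and uses std::shared_ptr<> as a stand-in for a hand-written reference counted pointer. The type node_t and the function make_cycle() are hypothetical.

```cpp
#include <memory>

// Hypothetical node that shares ownership of its neighbour.
struct node_t
{
  std::shared_ptr<node_t> next;
  static int destroyed;              // counts destructor calls
  ~node_t() { ++destroyed; }
};
int node_t::destroyed = 0;

// Builds a two-node cycle, drops the external owners, and returns a
// weak (non-owning) observer so the cycle can still be reached.
inline std::weak_ptr<node_t> make_cycle()
{
  std::shared_ptr<node_t> a(new node_t);
  std::shared_ptr<node_t> b(new node_t);
  a->next = b;
  b->next = a;                       // a and b now keep each other alive
  return std::weak_ptr<node_t>(a);   // the local owners a and b end here
}
```

After make_cycle() returns, no external owner remains, yet neither destructor has run because each node still holds a reference to the other. Resetting one of the next pointers through the observer breaks the cycle and lets both nodes be destroyed.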

Unique ownership is a flexible and safe approach to providing smart pointer behavior in general; that's why auto_ptr<> uses this scheme. However, this scheme is unsuitable for general collections (including arrays and the standard containers), because general collections make extensive use of copying, possibly leading to situations where unowned objects are referenced. For this reason, I've chosen to implement variant_t as a reference counted object.

The Final Version

Listing 4 shows the complete code for the final variant_t type. It is functionally equivalent to variant2_t, except for the reference counting over Impl<>.

This final implementation of variant_t also includes a member template, is_type(), with two overloaded versions. You can use is_type to test whether variant_t holds a value of some type T. The first version takes no argument and must be used as in:

v.is_type<MyClass>();

The second version takes a single argument and must be used as in:

v.is_type(MyObject);

where MyObject is of the type being tested for. I thought about adding a method of the form:

string type_name() const {return typeid(*data).name();}

but this method would return strings such as "variant_t::Impl<int>", revealing implementation details. Extracting "int" from the string above cannot be done portably because the exact string returned by type_info::name() is compiler specific. So I decided to leave this feature out.
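The type test itself does not depend on the compiler-specific name, only on comparing type_info objects. Here is a sketch of the idea as a free function, with ImplBase and Impl<> repeated so that it stands alone. The null check is an addition of mine that treats an empty pointer as holding no type.

```cpp
#include <typeinfo>

struct ImplBase
{
  virtual ~ImplBase() {}
};

template<typename T>
struct Impl : ImplBase
{
  Impl(T v) : data(v) {}
  T data;
};

// Because ImplBase is polymorphic, typeid(*p) yields the dynamic type
// of the object, which is then compared against Impl<T>.
template<typename T>
bool is_type(const ImplBase* p)
{
  return p != 0 && typeid(*p) == typeid(Impl<T>);
}
```

Note that the comparison is against Impl<T> and not T itself, since the object that typeid inspects is the wrapper rather than the wrapped value.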

Listing 5 shows a test program that demonstrates various uses of variant_t.

An Argument List Example

This example shows how to use a variant_t to create a variable argument list. I think of an argument list as a collection with the following properties:

I consider these properties to closely resemble the functioning of a formal parameter list of a function.

With a properly defined variant_t class, designing the ArgumentList class is just a matter of determining its interface and the container it will use. Based on the above properties, I present class ArgumentList, shown in Listing 6. It is implemented in terms of a vector of variant_t.

Elements can be pushed on the back of the list. Just as you can with a variant_t, you can push elements directly to the list without explicit casting. You can use operator[] or the at method to access individual elements in the list, but you need to know the positions (indexes) of the specific arguments you are accessing. You cannot query the list about which type is at a given position.
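The interface just described can be sketched in a self-contained form. This is not Listing 6: the class arg_list_t is a hypothetical stand-in, it assumes a C++11 compiler, and it swaps in std::shared_ptr<> for variant_t's hand-written reference counting to keep the sketch short.

```cpp
#include <cstddef>
#include <memory>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical stand-in for ArgumentList.
class arg_list_t
{
  struct base_t
  {
    virtual ~base_t() {}
  };
  template<typename T>
  struct holder_t : base_t
  {
    holder_t(const T& v) : data(v) {}
    T data;
  };
  std::vector<std::shared_ptr<base_t> > items;

public:
  // Stores a copy of value at the end of the list.
  template<typename T>
  void push_back(const T& value)
  {
    items.push_back(std::shared_ptr<base_t>(new holder_t<T>(value)));
  }

  // Returns the element at idx; throws std::invalid_argument if it is
  // not of type T, or std::out_of_range if idx is not a valid position.
  template<typename T>
  const T& at(std::size_t idx) const
  {
    const holder_t<T>* p =
      dynamic_cast<const holder_t<T>*>(items.at(idx).get());
    if (p == 0)
      throw std::invalid_argument("argument has a different type");
    return p->data;
  }

  std::size_t size() const { return items.size(); }
};
```

A caller pushes values of any copyable type and reads them back with at<T>(idx), naming the type expected at each position. Asking for the wrong type throws instead of returning garbage.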

Listing 7 shows a test program that demonstrates various uses of the ArgumentList class.

A Possible Extension

You will notice that this design of variant_t allows you to retrieve only a copy (or const reference) of the value. If you want to permit the user to modify the values held in a variant_t, you will need to add a copy-on-write mechanism. Beware, however: you must not add a method or operator that provides access to the value via reference or pointer to non-const. That will violate the basic assumption behind any copy-on-write mechanism, which is that only the class itself is allowed to modify the data.
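One way such a mechanism could look is sketched here, under the assumptions of a C++11 compiler and single-threaded use. The class cow_t is hypothetical, and std::shared_ptr<> again stands in for the reference counting of Listing 4. Read access returns a reference to const only, and every modification goes through set(), which detaches from the other owners first.

```cpp
#include <memory>

// Hypothetical copy-on-write holder. Single-threaded use is assumed.
template<typename T>
class cow_t
{
public:
  explicit cow_t(const T& v) : p(new T(v)) {}

  // Read access never exposes a reference or pointer to non-const.
  const T& get() const { return *p; }

  // Write access: if the value is shared, switch to a private copy
  // instead of changing what the other owners see.
  void set(const T& v)
  {
    if (p.use_count() > 1)
      p.reset(new T(v));   // shared: detach with a new private value
    else
      *p = v;              // sole owner: modify in place
  }

  // True if both objects currently refer to the same stored value.
  bool shares_with(const cow_t& other) const { return p == other.p; }

private:
  std::shared_ptr<T> p;
};
```

Copies share the stored value until one of them calls set(), at which point only the writer gets a new value and the others remain unchanged.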

Conclusion

The type variant_t is a special class with the ability to hold values of arbitrary types. variant_t itself is just a reference counted envelope for objects derived from variant_t::ImplBase. ImplBase forms a template-based polymorphic hierarchy, whose leaves are classes generated by the compiler according to the specific type of the value held by variant_t. These leaves are responsible for holding the copies of the values in a copy-safe and type-safe manner.

Reference

[1] Vladimir Batov. "Safe and Economical Reference Counting in C++," C/C++ Users Journal, June 2000.

Fernando Cacciola has been programming since 1984 and programming in C++ since 1990. He studied Biochemistry at John F. Kennedy University. For the past five years, he has been developing computational geometry algorithms.

Listing 1: Definition of class variant0_t

struct variant0_t
{
  variant0_t():data(NULL){}

  template<typename T> variant0_t(const T& v)
   : data ( &v ) {}

  template<typename T> operator T () const
  { return * reinterpret_cast<const T*>(data); }

  const void* data ;
} ;

Listing 2: Definition of class variant1_t

#include <cstdlib>   // malloc, free
#include <cstring>   // memcpy

// This variant copies the original
// value into its own data to
// preserve the value when the
// original variable goes out of scope.
struct variant1_t
{
   variant1_t():data(NULL){}
  ~variant1_t(){ free(data); }

  template<typename T> variant1_t ( T v )
    :data(malloc(sizeof(T)))
    { memcpy ( data , &v , sizeof(T)); }

  template<typename T> operator T () const
    { return * reinterpret_cast<T*>(data); }

  void* data ;
} ;

// usage:
variant1_t _int ( 2 ) ;
variant1_t _dbl ( 3.14 ) ;
cout << (int)   _int << endl ;
cout << (double)_dbl << endl ;

Listing 3: Definition of class variant2_t

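#include <cstddef>    // NULL
#include <iostream>   // cout, endl
#include <stdexcept>  // invalid_argument
#include <string>
#include <typeinfo>   // typeid
using namespace std ;

// This variant holds the copy of the original value in
// an object of a class specially designed to ensure proper copy
// and conversion.
class variant2_t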
{
  public :

   variant2_t() : data ( NULL ) {}
  ~variant2_t() { delete data; }

  template<typename T> variant2_t ( T v )
    :data(new Impl<T>(v))
    {}

  template<typename T> operator T () const
    { return CastFromBase<T>(data)->data; }

  private :

    struct ImplBase
    {
      virtual ~ImplBase() {}
    } ;
    template<typename T>
    struct Impl : ImplBase
    {
      Impl ( T v ) : data ( v ) {}
      T data ;
    } ;

    // Static, so that it can be called from the const
    // conversion operator above.
    template<typename T>
    static Impl<T>* CastFromBase(ImplBase* v)
    {
      Impl<T>* p = dynamic_cast<Impl<T>*>(v);
      if ( p == NULL )
       throw invalid_argument ( typeid(T).name() +
                                string(" is not a valid type")
                              ) ;
      return p ;
    }

    ImplBase* data ;
} ;

// usage:
variant2_t _int( 2 ) ;
variant2_t _dbl( 3.14 ) ;
variant2_t _str( string( "Hello"));
cout << (int)   _int << endl ;
cout << (double)_dbl << endl ;
string s = _str ;  // copy-initialization converts through operator string
cout << s << endl ;

Listing 4: The final variant_t definition

#ifndef VARIANT_H
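#define VARIANT_H

#include <cstddef>    // NULL, size_t
#include <stdexcept>  // std::invalid_argument
#include <string>
#include <typeinfo>   // typeid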

class variant_t
{
  public :

   variant_t() : data ( NULL ) {}
   variant_t( const variant_t & rhs )
     { if ( rhs.data != NULL )
         rhs.data->AddRef() ;
       data = rhs.data ;
     }
  ~variant_t()
     { if ( data != NULL )
         data->Release() ;
     }
   // NOTE: This code takes care of self-assignment.
   // DO NOT CHANGE THE ORDER of the statements.
   variant_t& operator = ( const variant_t& rhs )
     {
       if ( rhs.data != NULL )
         rhs.data->AddRef();
       if ( data != NULL )
         data->Release();
       data = rhs.data ;
       return * this ;
     }

  // This member template constructor allows you to
  // create a variant_t object with a value of any type.
  template<typename T> variant_t ( T v )
    : data ( new Impl<T>(v) )
    { data->AddRef() ; }

  // This generic conversion operator lets you retrieve
  // the value held. To avoid template specialization conflicts,
  // it returns an instance of type T, which will be a COPY
  // of the value contained.
  template<typename T> operator T () const
    { return CastFromBase<T>( data )->data ; }

  // This form returns a REFERENCE and not a COPY, which
  // will be significant in some cases.
  template<typename T> const T & get() const
    { return CastFromBase<T>( data )->data ; }

  // An empty variant_t holds no type, so the test is false.
  template<typename T> bool is_type() const
    { return data != NULL && typeid(*data)==typeid(Impl<T>); }

  // The argument only supplies the type T; the held object
  // is an Impl<T>, so the comparison must be against Impl<T>.
  template<typename T> bool is_type(T) const
    { return is_type<T>(); }

  private :

    struct ImplBase
    {
      ImplBase() : refs ( 0 ) {}
      virtual ~ImplBase() {}
      void AddRef () { refs ++ ; }
      void Release() { refs -- ;
                       if ( refs == 0 )
                         delete this ;
                     }
      size_t refs ;
    } ;

    template<typename T>
    struct Impl : ImplBase
    {
       Impl ( T v ) : data ( v ) {}
      ~Impl () {}
      T data ;
    } ;

    // The following method is static because it doesn't
    // operate on variant_t instances.
    template<typename T>
    static Impl<T>* CastFromBase ( ImplBase* v )
    {
      // This downcast will fail if T is other than the T used
      // with the constructor of variant_t.
      Impl<T>* p = dynamic_cast<Impl<T>*> ( v ) ;
      if ( p == NULL )
        throw std::invalid_argument
         ( typeid(T).name()+std::string(" is not a valid type"));
      return p ;
    }

    ImplBase* data ;
} ;

#endif

Listing 5: A test program that demonstrates various uses of variant_t

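#include<algorithm>
#include<iostream>
#include<iterator>
#include<sstream>
#include<stdexcept>
#include<string>
#include<vector>
// The next two lines are specific to Borland C++Builder.
#pragma hdrstop
#include <condefs.h>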

#include "variant.h"

using namespace std ;

// test0() shows the basic construction and use of variant_t with
// various types.
void test0()
{
  variant_t _int ( 2 ) ;
  variant_t _dbl ( 3.14 ) ;
  variant_t _str ( string ( "This is a string" ) ) ;
  // IMPORTANT NOTE: The above statement COULD NOT have been
  //  variant_t _str ( "This is a string" ) ;
  // The expression "This is a string" is passed as a
  //  const char *, which is just a pointer.
  // The copy of a pointer is just another pointer of the
  // same value, not a new memory block with a copy of
  // the contents of the original block.
  // The value copied and stored in _str would be the
  // pointer value and not the character array contents.

  cout << "BEGIN test0" << endl ;
  cout << (int)   _int  << endl ;
  cout << (double)_dbl  << endl ;
  cout << (string)_str  << endl ;
  cout << "END test0"   << endl << endl ;
}

// test1() shows how variant_t can be used as an efficient
// return values. The internal data is not copied but shared
// among the various variant_t objects involved.
variant_t test1_aux()
{
  return variant_t ( string ( "This is a string" ) ) ;
}
void test1()
{
  cout << "BEGIN test1"       << endl ;
  cout << (string)test1_aux() << endl ;
  cout << "END test1"         << endl << endl ;
}

// test2() shows an invalid conversion throwing
// invalid_argument.
void test2()
{
  cout << "BEGIN test2" << endl ;
  try
  {
    variant_t _dbl(3.14);
    char* n = _dbl ; (n);
  }
  catch ( invalid_argument& x )
  {
    cout << "exception invalid_argument: " << x.what() << endl ;
  }
  cout << "END test2" << endl << endl ;
}

// test3() shows an array of variant_t
void test3()
{
  cout << "BEGIN test3" << endl ;
  variant_t Array [3];
  Array[0]=2;  // int
  Array[1]=3.14; // double
  Array[2]=string("This is a string");
  cout << (int)   Array[0] << endl ;
  cout << (double)Array[1] << endl ;
  cout << (string)Array[2] << endl ;
  cout << "END test3" << endl << endl ;
}

// test4() shows a vector<variant_t>
string print ( variant_t const& aVar )
{
  ostringstream ss ;
  if ( aVar.is_type<int>() )
  {
    ss << "int: " << (int)aVar ;
  }
  else if ( aVar.is_type<double>() )
  {
    ss << "double: " << (double)aVar ;
  }
  else if ( aVar.is_type<string>() )
  {
    ss << "string: " << (string)aVar ;
  }
  return ss.str();
}

void test4()
{
  cout << "BEGIN test4" << endl ;
  vector<variant_t> Vector ;
  Vector.push_back ( 2 ) ;
  Vector.push_back ( 3.14 ) ;
  Vector.push_back ( string("This is a string") ) ;
  std::transform ( Vector.begin() ,
                   Vector.end  () ,
                   ostream_iterator<string> ( cout , "\n" ) ,
                   print
                 ) ;
  cout << "END test4" << endl << endl ;
}

int main()
{
  test0() ;
  test1() ;
  test2() ;
  test3() ;
  test4() ;

  return 0 ;
}

/*
OUTPUT:

BEGIN test0
2
3.14
This is a string
END test0

BEGIN test1
This is a string
END test1

BEGIN test2
exception invalid_argument: char * is not a valid type
END test2

BEGIN test3
2
3.14
This is a string
END test3

BEGIN test4
int: 2
double: 3.14
string: This is a string
END test4
*/

Listing 6: The class ArgumentList

#ifndef ARGLIST_H
#define ARGLIST_H

#include<vector>

#include"variant.h"

class ArgumentList
{
  public:

    // Ctor, Dtor, CopyCtor, and Assign omitted
    // because the defaults are appropriate.

    template<typename T> void push_back ( T Value )
      { _container.push_back ( variant_t ( Value ) ) ; }

    // NOTE: This returns a COPY of the value at pos idx.
    template<typename T> T operator [] ( size_t idx ) const
      { return _container [ idx ] ; }

    // This form returns a REFERENCE and not a COPY.
    template<typename T> const T & at ( size_t idx ) const
      { return _container [ idx ].get<T>() ; }

    bool   empty() const { return _container.empty(); }
    size_t size () const { return _container.size (); }

  private:
    std::vector<variant_t> _container ;
} ;

#endif


Listing 7: A test program that demonstrates various uses of the ArgumentList class

#include<iostream>
#include<stdexcept>
#include<string>
// The next two lines are specific to Borland C++Builder.
#pragma hdrstop
#include <condefs.h>

#include "arglist.h"

using namespace std ;

void show ( const ArgumentList & args )
{
  cout << "List contents:" << endl ;

  // Standard C++ cannot deduce the template argument of
  // operator[] from the left side of the assignments, so
  // args[0] will not compile; at<T>() names the type explicitly.
  int    i = args.at<int>   (0);
  double d = args.at<double>(1);
  string s = args.at<string>(2);

  cout << "int   : " << i << endl ;
  cout << "double: " << d << endl ;
  cout << "string: " << s << endl ;

  // Used when show() is called from test_recursion().
  if ( args.size() > 3 )
  {
    // The whole list was pushed as the last element.
    const ArgumentList & list = args.at<ArgumentList>(3);
    show ( list ) ;
  }
}
void wrong ( const ArgumentList & args )
{
  cout << "Trying to retrieve the wrong argument." << endl;
  cout << "This will throw invalid_argument" << endl ;
  string s = args.at<string>(0);
}

void test_recursion( ArgumentList & args )
{
  cout << endl << "Testing type-recursion" << endl ;

  cout << "Pushing the list itself as the fourth element." 
       << endl ;
  args.push_back ( args ) ;

  show ( args ) ;
}

void test()
{
  // These variables are allocated on the heap to show
  // how ArgumentList copies the values.
  int    * i = new int    ( 33 ) ;
  double * d = new double ( 3.14 ) ;
  string * s = new string ( "This is a string" ) ;

  // Creates a list with 3 elements.
  ArgumentList args ;
  args.push_back ( *i ) ;
  args.push_back ( *d ) ;
  args.push_back ( *s ) ;

  cout << "Testing ArgumentList." << endl
       << endl
       << "Creating a list with 3 elements:" << endl
       << "int   : " << *i << endl
       << "double: " << *d << endl
       << "string: " << *s << endl ;

  delete i ;
  delete d ;
  delete s ;

  show ( args ) ;

  // wrong() will try to pick up the string at the wrong position.
  try
  {
    wrong ( args ) ;
  }
  catch ( invalid_argument& x )
  {
    cout << "exception invalid_argument: " << x.what() << endl ;
  }

  // This shows that ArgumentList can be safely copied.
  cout << endl << "Copying the entire argument list... " << endl ;
  ArgumentList copy = args ;
  show ( copy ) ;

  // This shows that ArgumentList can even contain elements of 
  // type ArgumentList.
  test_recursion(copy) ;
}
int main()
{
  try
  {
    test() ;
  }
  catch ( invalid_argument& x )
  {
    cout << "unexpected exception invalid_argument: " 
         << x.what() << endl ;
  }
  catch ( ... )
  {
    cout << "Unexpected exception" << endl ;
  }
}

/*
OUTPUT:

Testing ArgumentList.

Creating a list with 3 elements:
int   : 33
double: 3.14
string: This is a string
List contents:
int   : 33
double: 3.14
string: This is a string
Trying to retrieve the wrong argument. 
This will throw invalid_argument
exception invalid_argument: string is not a valid type

Copying the entire argument list... 
List contents:
int   : 33
double: 3.14
string: This is a string

Testing type-recursion
Pushing the list itself as the fourth element.
List contents:
int   : 33
double: 3.14
string: This is a string
List contents:
int   : 33
double: 3.14
string: This is a string
*/
