The Problem
Sometimes it is useful to declare a variable with a type that is not known directly, but is known to be "the type of this expression." For example:
void f(LibClass *p) { typeof(p->value) value = p->value; /* ... */ }
where typeof(p->value) yields a type that is used to declare the local value. This is particularly useful in templates, where many of the types in use are not known ahead of time. (For example, suppose LibClass above had been a template type parameter.)
Some compilers implement a typeof keyword as an extension, but code that uses such an extension is not portable. This article describes a portable way to write a typeof operator.
Almost Enough
C++ has almost enough functionality to make typeof easy. Function templates can be used to extract a type from an expression and declare a typedef of that type:
template<class T> void f(T) { typedef T TheType; }
void g() { f(123); }   // TheType is "int"
Here the typedef TheType in the instantiated function template f<int> has the same type as the expression 123. But there is no way to export the type from the function, so by itself this is not useful for implementing typeof.
Similarly, overloading can be used to select a function that contains a typedef that matches the type:
void f(char)  { typedef char  TheType; }
void f(short) { typedef short TheType; }
void g() { f('x'); }   // TheType is "char"
but again there is no way to extract that type.
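Both dead ends can be seen in a small sketch. The names size_of_deduced, name_of, and check_trapped_types are ours, chosen only for illustration. Each function can compute a value from the type it sees, but neither gives the caller a way to name that type.

```cpp
#include <cassert>
#include <cstddef>

// Function template: T is deduced from the argument expression.
template<class T>
std::size_t size_of_deduced(T)
{
    typedef T TheType;          // names the deduced type, but only in here
    return sizeof(TheType);     // a value can leave the function, the type cannot
}

// Overloading: each overload knows its own parameter type.
inline const char *name_of(char)  { return "char"; }
inline const char *name_of(short) { return "short"; }

void check_trapped_types()
{
    assert(size_of_deduced(123) == sizeof(int));
    assert(name_of('x')[0] == 'c');
    assert(name_of(static_cast<short>(1))[0] == 's');
    // There is no syntax such as "size_of_deduced(123)::TheType" that
    // would let us declare a variable of the deduced type out here.
}
```

The results are run-time values, which cannot be used where a type is required.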
Class templates can be used to map a value or type to another value or type, as in:
// no definition needed
template<class T> struct Unsigned;
template<> struct Unsigned<char>  { typedef unsigned char  type; };
template<> struct Unsigned<short> { typedef unsigned short type; };
void f() {
    Unsigned<char>::type  a;   // unsigned char
    Unsigned<short>::type b;   // unsigned short
}
But you cannot use a class template to extract a type from an expression, as you can with function templates or overloading. (If the expression is a name with external linkage it is possible to implement typeof with class templates by using a template non-type parameter, but this is not very useful.)
Function templates and overloaded functions can also map an expression of one type into an expression of another type, as in:
unsigned char  Unsigned(char x)  { return x; }
unsigned short Unsigned(short x) { return x; }
void f() {
    short s = 0;
    s;             // the expression "s" has type "short"
    Unsigned(s);   // the expression "Unsigned(s)" has type "unsigned short"
}
This is also not sufficient, although it turns out to be part of the solution.
We need some way to map the type of an expression to something that can be used as a template argument. The only construct in the language that can do this is sizeof. It happens that sizeof is sufficient as the basis of a solution.
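A minimal sketch of that observation follows. The names probe and Tag and the sizes 1 and 2 are illustrative assumptions, not part of the final solution. The point is only that sizeof turns "the type of this expression" into an integral constant, and that an integral constant is a legal template argument.

```cpp
// Declared only; never defined or called, since sizeof does not
// evaluate its operand.
char (*probe(int))[1];
char (*probe(double))[2];

// Hypothetical class template that records its integer argument.
template<int N> struct Tag { enum { value = N }; };

// Overload resolution on the argument expression picks the return type;
// sizeof then reads the array length out of that type.
typedef Tag<sizeof(*probe(42))>  TagForInt;      // Tag<1>
typedef Tag<sizeof(*probe(4.2))> TagForDouble;   // Tag<2>
```

What remains is to map the integer back to a type, which is what template specialization is good at.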
The Solution
The trick is to map each type to a unique integer value using overloaded functions and sizeof. Then template specializations can map the integer value to the right type:
- The expression is passed to an overloaded function.
- The return type of the particular function selected is "pointer to array of char," with the array size being unique to the type.
- The sizeof operator extracts the array size as a constant. The function is never actually called and so does not need a definition.
- The constant is used to select a class template among a set of specializations; the selected class template contains a typedef with the right type.
Simple Example
The following example shows a simple typeof that handles expressions of type short, int, and long.
// No definition, only specializations
template<int N> struct select_type;
template<> struct select_type<1> { typedef short Type; };
template<> struct select_type<2> { typedef int   Type; };
template<> struct select_type<3> { typedef long  Type; };

typedef char CharArrayOf1[1];
typedef char CharArrayOf2[2];
typedef char CharArrayOf3[3];
typedef CharArrayOf1 *PtrCharArrayOf1;
typedef CharArrayOf2 *PtrCharArrayOf2;
typedef CharArrayOf3 *PtrCharArrayOf3;

PtrCharArrayOf1 select_array(short);   // No definitions needed
PtrCharArrayOf2 select_array(int);
PtrCharArrayOf3 select_array(long);

#define typeof(x) \
    select_type<sizeof(*select_array(x))>::Type
Consider what happens to typeof(123L):
- The macro expands to
select_type <sizeof(*select_array(123L))>::Type.
- The call select_array(123L) selects the following declaration from the set of overloaded function declarations:
PtrCharArrayOf3 select_array(long);
- The type of select_array(123L) is PtrCharArrayOf3, or "pointer to array of char length 3."
- The type of *select_array(123L) is "array of char length 3."
- The expression sizeof(*select_array(123L)) is an integral constant expression with the value 3. Since the operand of sizeof is not evaluated there is no need to have a definition for select_array.
- The template-id select_type<sizeof(*select_array(123L))> is effectively select_type<3>, which selects the specialization
template<> struct select_type<3> { typedef long Type; };
- The qualified-id select_type<sizeof(*select_array(123L))>::Type is the above typedef, which has type long.
- So typeof(123L) refers to a typedef of type long.
Note that all of these operations happen at compile time. The generated code will be the same as if the type were written as "long" instead of "typeof(123L)."
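The walk-through can be checked mechanically. In the sketch below the macro is spelled TYPEOF (our choice) so that it does not collide with compilers that already treat typeof as a keyword, and the checks use static_assert and std::is_same, which assume a C++11 or later compiler. Neither is needed by the technique itself.

```cpp
#include <type_traits>   // std::is_same, used only to check the results

template<int N> struct select_type;     // no definition, only specializations
template<> struct select_type<1> { typedef short Type; };
template<> struct select_type<2> { typedef int   Type; };
template<> struct select_type<3> { typedef long  Type; };

typedef char (*PtrCharArrayOf1)[1];     // pointer to array of 1 char
typedef char (*PtrCharArrayOf2)[2];
typedef char (*PtrCharArrayOf3)[3];

PtrCharArrayOf1 select_array(short);    // declarations only
PtrCharArrayOf2 select_array(int);
PtrCharArrayOf3 select_array(long);

#define TYPEOF(x) select_type<sizeof(*select_array(x))>::Type

// Hypothetical use: declare a local whose type follows an expression.
long twice(long n)
{
    TYPEOF(n * 2) result = n * 2;       // long * int has type long
    return result;
}

static_assert(std::is_same<TYPEOF(123L), long>::value, "123L is long");
static_assert(std::is_same<TYPEOF(123), int>::value, "123 is int");
static_assert(std::is_same<TYPEOF((short)1), short>::value, "cast yields short");
```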
Improvements
The typedefs in the above example are not really necessary and they pollute the global namespace. We can eliminate them at the risk of confusing human readers (and maybe a few compilers):
char (*select_array(short))[1];   // function returning pointer
                                  // to array of char length 1
char (*select_array(int  ))[2];   // length 2
char (*select_array(long ))[3];   // length 3
If we declare the parameters as references to const instead of using simple pass by value, it becomes possible to use expressions that cannot be passed by value; for example, objects of classes with private copy constructors. So in the general case, given type T, we want to write the select_array declaration for T as:
char (*select_array(const T &))[nnn];
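To illustrate why the reference matters, here is a sketch with a hypothetical non-copyable class named Handle. The macro is spelled TYPEOF here, our choice, to avoid clashing with compilers that reserve typeof. Everything else follows the declarations above.

```cpp
// Hypothetical class whose objects cannot be copied: the copy operations
// are declared private and never defined.
class Handle {
public:
    Handle() {}
private:
    Handle(const Handle &);
    Handle &operator=(const Handle &);
};

template<int N> struct select_type;
template<> struct select_type<1> { typedef Handle Type; };

// With a by-value parameter, select_array(h) would need the private copy
// constructor even inside sizeof, because overload resolution and access
// checking still apply to an unevaluated call. The reference avoids that.
char (*select_array(const Handle &))[1];

#define TYPEOF(x) select_type<sizeof(*select_array(x))>::Type

void use_handle(Handle &h)
{
    TYPEOF(h) another;      // declares a second, default-constructed Handle
    (void)another;          // silence unused-variable warnings
}
```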
Declaring the required specializations and functions can be handled much more easily with a macro:
#define REGISTER_TYPEOF(N,T) \
    template<> struct select_type<N> { typedef T Type; }; \
    char (*select_array(const T &))[N];
However, we run into trouble with some compound types. For example, if we expand:
REGISTER_TYPEOF( 3, void (*)() )
we get:
template<> struct select_type<3> { typedef void (*)() Type; };
char (*select_array(const void (*)() &))[3];
Both of these declarations are ill formed because combining C++ types is not as simple as substituting a type for a type name. We can make the macro work again by introducing another class template:
template<class T> struct WrapType { typedef T WT; };
This template takes advantage of the fact that when you instantiate a template with a compound type, e.g.
WrapType<void (*)()>
you create a simple name (the template parameter, in this case T) for the type. We can't get to T directly from outside the template but we can get to the typedef WT which has the same type. In this example,
WrapType<void (*)()> :: WT
has type void (*)(). So if we replace instances of T in the macro with WrapType<T>::WT we have the original type but in a form that we can use to build other types.
Our macro now looks like this:
#define REGISTER_TYPEOF(N,T) \
    template<> struct select_type<N> { \
        typedef WrapType<T>::WT Type; }; \
    char (*select_array(const WrapType<T>::WT &))[N];
Note that the WrapType template could have been written as a member template of select_type to avoid putting the name in the global namespace.
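A short sketch shows the revised macro accepting compound types. The ordinals, the function hello, and the spelling TYPEOF are our assumptions. The static_assert check assumes a C++11 or later compiler and is there only to confirm the result.

```cpp
#include <type_traits>   // std::is_same, used only for the check

template<int N> struct select_type;
template<class T> struct WrapType { typedef T WT; };

#define REGISTER_TYPEOF(N,T) \
    template<> struct select_type<N> { \
        typedef WrapType<T>::WT Type; }; \
    char (*select_array(const WrapType<T>::WT &))[N];

#define TYPEOF(x) select_type<sizeof(*select_array(x))>::Type

REGISTER_TYPEOF(1, int)
REGISTER_TYPEOF(2, void (*)())      // pointer to function
REGISTER_TYPEOF(3, int (*)[4])      // pointer to array of 4 int

void hello() {}                     // hypothetical function to point at

void call_through_typeof()
{
    TYPEOF(&hello) fp = &hello;     // fp is declared as void (*)()
    fp();
}

static_assert(std::is_same<TYPEOF(&hello), void (*)()>::value,
              "function pointer type is recovered");
```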
See Listing 1 for a more complete example using the above macro and several test types.
Limitations
This approach requires that each type that might be used in typeof be explicitly registered with the REGISTER_TYPEOF macro. This limits, or at least places an additional burden on, the use of this technique in general-purpose template libraries.
The need to assign a distinct ordinal to each type could be a nuisance. If all of the REGISTER_TYPEOF uses are in a single include file the macro could be simplified by using __LINE__ instead of N:
#define REGISTER_TYPEOF_ONE_FILE(T) \
    REGISTER_TYPEOF(__LINE__,T)
It does not matter that the ordinals may be large and/or discontiguous; the only restriction is that they be positive and distinct.
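As a sketch of the one-file variant, the registrations below take their ordinals from the lines they appear on. It assumes each registration sits on its own line, since two on one line would produce the same ordinal and a duplicate specialization. TYPEOF is our spelling, and the checks assume C++11 or later.

```cpp
#include <type_traits>   // std::is_same, used only for the checks

template<int N> struct select_type;
template<class T> struct WrapType { typedef T WT; };

#define REGISTER_TYPEOF(N,T) \
    template<> struct select_type<N> { \
        typedef WrapType<T>::WT Type; }; \
    char (*select_array(const WrapType<T>::WT &))[N];

// __LINE__ is expanded before substitution, so both uses of N inside
// REGISTER_TYPEOF receive the same line number.
#define REGISTER_TYPEOF_ONE_FILE(T) \
    REGISTER_TYPEOF(__LINE__,T)

#define TYPEOF(x) select_type<sizeof(*select_array(x))>::Type

// One registration per source line keeps the ordinals distinct.
REGISTER_TYPEOF_ONE_FILE(char)
REGISTER_TYPEOF_ONE_FILE(double)
REGISTER_TYPEOF_ONE_FILE(const char *)

static_assert(std::is_same<TYPEOF(1.5), double>::value, "double is recovered");
static_assert(std::is_same<TYPEOF("abc"), const char *>::value,
              "a string literal decays to const char *");
```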
Another limitation: if the type of the expression given to typeof has not been registered, the resulting error message may be somewhat cryptic; but it should at least refer to the name select_array which should give a clue to the problem.
Related Techniques
The map-through-sizeof approach may be used in other places. For example, function overloading can be used to distinguish whether one class is derived from another, and the result can be used to select a specialization of a class template using the above approach.
Using Andrei Alexandrescu's SUPERSUBCLASS mechanism [2], you can get a compile-time Boolean constant that says whether one class is derived from another. That value may be used to select one of two template specializations:
struct A { };
struct B : A { };   // B is derived from A
struct C { };       // C is not derived from A

template<bool derived> struct D;
template<> struct D<true> {
    template<class X, class Y> struct SelectedTemplate {
        int which() { return 1; }
    };
};
template<> struct D<false> {
    template<class X, class Y> struct SelectedTemplate {
        int which() { return 2; }
    };
};

template<class T, class U> int f(T t, U u) {
    // "typename" and "template" are required because the name is dependent
    typename D<SUPERSUBCLASS(T,U)>::template SelectedTemplate<T,U> d;
    // Just to show which one was selected
    return d.which();
}

void g(A a, B b, C c) {
    f(a,b);   // returns 1
    f(a,c);   // returns 2
}
If U is derived from T then the first SelectedTemplate is used, else the second one is used. The first one can rely on Y being derived from X; the second one cannot.
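The SUPERSUBCLASS macro itself is not reproduced in this article. The sketch below is only one plausible way to build such a test from overloading and sizeof. It is not Alexandrescu's definition, and the names DerivedTest and SUPERSUBCLASS_SKETCH are ours. It assumes public, unambiguous inheritance and complete class types, and like the original it also reports true when both types are the same. The static_assert checks assume C++11 or later.

```cpp
// Hypothetical stand-in for a derived-from test.
template<class Base, class Derived>
struct DerivedTest {
    // Selected when Derived* converts implicitly to Base*
    static char (*test(const volatile Base *))[2];
    // Selected otherwise
    static char (*test(...))[1];
    enum { value = (sizeof(*test(static_cast<Derived *>(0))) == 2) };
};
#define SUPERSUBCLASS_SKETCH(T,U) (DerivedTest<T,U>::value != 0)

struct A { };
struct B : A { };   // B is derived from A
struct C { };       // C is not derived from A

// A simplified selector: the Boolean constant picks a specialization.
template<bool derived> struct D;
template<> struct D<true>  { static int which() { return 1; } };
template<> struct D<false> { static int which() { return 2; } };

template<class T, class U> int f(const T &, const U &)
{
    return D<SUPERSUBCLASS_SKETCH(T,U)>::which();
}

static_assert(SUPERSUBCLASS_SKETCH(A,B), "B is derived from A");
static_assert(!SUPERSUBCLASS_SKETCH(A,C), "C is not derived from A");
```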
In some cases template specialization isn't needed at all. For example:
// Note: it is platform-dependent whether char is signed or not
char (*sign_fct(char              ))[2];
char (*sign_fct(signed char       ))[2];
char (*sign_fct(unsigned char     ))[1];
char (*sign_fct(short             ))[2];
char (*sign_fct(unsigned short    ))[1];
char (*sign_fct(int               ))[2];
char (*sign_fct(unsigned int      ))[1];
char (*sign_fct(long              ))[2];
char (*sign_fct(unsigned long     ))[1];
char (*sign_fct(long long         ))[2];
char (*sign_fct(unsigned long long))[1];

#define isSigned(x) (sizeof(*sign_fct(x)) == 2)
The value of isSigned(expression) will be true if and only if the type of expression is signed. Of course this could also be done as numeric_limits<typeof(expression)>::is_signed using the standard library class numeric_limits.
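A brief usage sketch, using only a subset of the overloads for brevity, compares isSigned with numeric_limits. The function agrees_with_numeric_limits is a hypothetical name. Note that the macro sees the type of the whole expression, after the usual arithmetic conversions have been applied.

```cpp
#include <limits>

char (*sign_fct(int          ))[2];
char (*sign_fct(unsigned int ))[1];
char (*sign_fct(long         ))[2];
char (*sign_fct(unsigned long))[1];

#define isSigned(x) (sizeof(*sign_fct(x)) == 2)

// Hypothetical check that the two approaches give the same answers.
inline bool agrees_with_numeric_limits()
{
    unsigned int u = 7u;
    long n = -3L;
    return isSigned(n)     == std::numeric_limits<long>::is_signed
        && isSigned(u)     == std::numeric_limits<unsigned int>::is_signed
        && isSigned(n + 1) == std::numeric_limits<long>::is_signed          // long + int is long
        && isSigned(u + 1) == std::numeric_limits<unsigned int>::is_signed; // unsigned + int is unsigned
}
```

Unlike numeric_limits, the macro needs no typeof at all, because the answer is carried directly by the array size.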
Earlier Work
Brian Parker posted an article in 1997 on comp.std.c++ [1] which uses the same basic technique. Andrei Alexandrescu has also explored this technique in a posting on comp.std.c++ and in an article on the CUJ website [2].
Conclusion
For applications that need a typeof operator the technique described above provides most of the power of a compiler extension while keeping the code portable. The general approach of using function overloading and sizeof to extract type information about an expression may be used in other places as well.
The implementation details of this technique can be somewhat obscure, but the actual uses (such as a reference to typeof) can be kept simple. Since all of these operations occur at compile time, there is no run-time cost for using them.
References
[1] Brian Parker. "Poor Man's typeof() Implementation," comp.std.c++, 10 November 1997.
[2] Andrei Alexandrescu. "On Mappings Between Types and Values," C++ Experts Forum, www.cuj.com/experts/1810/alexandr.html.
Bill Gibbons has been active in the C++ community since working on Apple's port of CFront 2.0 in 1988 and joining the C++ standards committee in 1990. He has done compiler, embedded, and numerics work for numerous companies including HP, Apple, Taligent, Palm, and Asymetrix. He is currently working at Pixo on Internet technologies for cellular phones.