In recent issues of CUJ, Thomas Becker explained some principles of template metaprogramming [1, 2]. In this article, well take up the metaprogramming thread and give you some insights into a powerful metaprogramming tool known as the expression template.
As Becker stated, you can use templates to calculate values at compile time, such as the factorial of n. Becker also described the principle of type traits and showed how to strip the reference property off of a type. Well get a little more ambitious in this article: well show you how to evaluate complex expressions at compile time, thereby reducing the run-time effort for expression evaluation to its bare minimum. Our goal is to demonstrate how to prepare an expression at compile time for repeated, highly efficient evaluation at run time.
Consider the integral:
You can approximate this integral by evaluating the expression x/(1+x) for n equidistant points in the interval [1.0,5.0]. A function that calculates integrals for arbitrary arithmetic expressions could look like the following, provided you manage to implement expression templates that can be evaluated repeatedly for different values:
template <class ExprT> double integrate (ExprT e, double from,double to,size_t n) { double sum=0, step=(to-from)/n; for (double i=from+step/2; i<to; i+=step) sum+=e.eval(i); return step*sum; }
ExprT in this example would somehow represent an expression such as x/(1+x).
Expression Templates for Arithmetic Expressions
An expression template for the solution described in the preceding section requires the compile-time evaluation of arithmetic expressions consisting of unary and binary arithmetic operators and variable and constant operands. Youll find a solution for this type of problem in the classic GOF patterns book [3]. The GOF book has a pattern for exactly this case the Interpreter pattern.
The Interpreter pattern provides a way to represent a language in the form of an abstract syntax tree and an interpreter that uses the syntax tree to interpret language constructs. Actually, the Interpreter pattern is a special case of the Composite pattern. The part-whole relationship of the Composite pattern corresponds to the relationship of an expression and its sub-expressions in the Interpreter pattern:
- The leaf is a terminal expression.
- The composite is a non-terminal expression.
- Evaluation of the components is interpretation of the syntax tree and its expressions.
In this case, the goal is to use the Interpreter patterns syntax tree to represent arithmetic expressions such as (a+1)* c or log(abs(x-N)). The syntax tree has two types of terminal nodes: numeric literals and numeric variables. The non-terminal nodes are unary or binary expressions consisting of one or two sub-expressions. The expressions have different semantics such as +, -, *, /, ++, --, exp, log, and sqrt.
Consider the concrete example of an expression, say (x+2)*3. The composite structure, that is, the syntax tree for this expression, is shown in Figure 1.
The classic object-oriented technique for implementing the Interpreter pattern, as it is suggested in the GOF book, involves the classes shown in Figure 2.
Listing 1 shows samples of the corresponding source code for an implementation of this technique. The base class for UnaryExpr is implemented in analogy to class BinaryExpr, and all the concrete unary and binary expressions follow the example of class Sum.
Listing 2 shows how to use the Interpreter pattern for evaluating the expression (x+2)*3. First an expression object expr is created that represents the expression (x+2)*3, and then the expression object is told to evaluate itself. Naturally, this is an utterly inefficient way to calculate the result of a primitive expression such as (x+2)*3. But hold on; we will now turn the object-oriented approach into a template-based solution.
A Compile-Time Version of Arithmetic Expression Evaluation
In the object-oriented implementation, the classes representing the terminal and non-terminal expressions have a common base class that defines the operation they have in common. This technique is well known: commonalities are expressed by means of a common base class. In template programming, commonalities are expressed in terms of naming conventions and name commonalities. A virtual function in the object-oriented approach becomes a plain, non-virtual function with a certain name. Classes no longer derive from a common base class. Instead, the classes are stand-alone classes that happen to have a member function with the same name and a compatible signature (in the preceding example, an evaluation function named eval()). The expression template solution eliminates all base classes.
Next, we express all the non-terminal expressions, such as Sum or Product, as classes generated from class templates UnaryExpr and BinaryExpr, both of which are parameterized on structural information. These class templates take the types of their sub-expression(s) as type template arguments. In addition, we parameterize the expression class templates on the type of the operation that they represent. That is, the actual operation (+, -, *, /, ++, --, abs, exp, log) will be provided as a function object, and its type is one of the template arguments of the expression class template.
We implement the terminal expressions as regular (non-template) classes, pretty much like they are implemented in the object-oriented approach.
Instead of using run-time recursion, we use compile-time recursion for the expression template: we replace the recursive invocation of the virtual evaluation function with recursive-template instantiation of the expression class templates. Remember the calculation of the factorial of n as demonstrated in [1]. This calculation is based on the fact that instantiation of templates is a recursive process. The compiler may find that for the instantiation of one template it may need an instantiation of another template. The compiler then instantiates that other template, which may require the instantiation of yet another template, and so forth. Many of the template metaprogramming techniques take advantage of the recursive nature of the template instantiation process and use it to perform recursive calculations. We will use this effect to replace the recursive calls of the evaluation function eval() with recursive instantiation of expression class templates.
Figure 3 shows the classes in the template-based solution. Listing 3 shows the source code for this implementation.
The class template for UnaryExpr is analogous to class BinaryExpr. For operations, we can use the pre-defined STL function object types (plus, minus, multiplies, divides, etc.), or we can define our own function object types as needed. A binary expression representing a sum, for instance, would be of type BinaryExpr< ExprT1, ExprT2, plus<double> >. Since this type name is rather unwieldy, we add a creator function for a more convenient solution. Listing 4 shows two examples of the creator functions. (For more information, see sidebar, Creator Functions).
Listing 5 shows how to use the template-based implementation of an interpreter for evaluating the expression (x+2)*3. An expression object expr represents the expression (x+2)*3, and then the expression object is told to evaluate itself. The type of the expression object is not visible in Listing 5, thanks to the creator function, but the type would be:
BinaryExpr< BinaryExpr < Variable,Literal, plus<double> >, Literal, multiplies<double> >
Evaluation
What have we gained by re-implementing the Interpreter pattern using templates instead of inheritance? Well, under the assumption that the compiler inlines all the creator functions and constructors and eval() functions (which is likely since all of them are trivial) the expression:
makeProd(makeSum(Variable(x),Literal(2)),Literal(3)).eval()
evaluates to nothing more than (x+2)*3.
Compare this to the effect of:
Product expr(new Sum(new Variable(x), new Literal(2)), new Literal(3)).eval()
(See Listing 2). The object-oriented approach in Listing 2 results in a number of allocations from the heap and subsequent constructions plus a number of invocations of the virtual eval() function. Most likely, none of the calls to eval() will be inlined because compilers typically do not inline functions that are invoked through a pointer.
In essence, the template-based solution is much faster at run time than the object-oriented implementation.
Further Elaboration of the Expression Template Solution
We will now tweak the expression templates a bit to turn them into something really useful. Our first improvement addresses readability. We want to make an expression such as:
makeProd(makeSum(Variable(x),Literal(2)),Literal(3)).eval()
more readable by making it look more like the expression that it represents, namely (x+2)*3. We can achieve this goal by means of operator overloading. We will make the expression look like eval((v+2)*3.0) with just a few minor modifications.
The first change is to rename the creator functions so that they are overloaded operators; that is, we rename makeSum into operator+, makeProd into operator*, and so forth. The effect is that the term:
makeProd(makeSum(Variable(x),Literal(2)),Literal(3))
turns into:
((Variable(x) + Literal(2)) * Literal(3))
Thats good, but not good enough. We would like to write the expression as ((x+2)*3). Hence, our goal is to eliminate the creation of a Variable and a Literal, which still clutter the expression.
We will start by asking what an expression such as x+2 means, now that we have renamed the creator function makeSum into operator+. (Listing 6 shows the implementation of operator+.)
x+2 is the same as operator+(x,2), which formerly was makeSum(x,2). For this reason x+2 is the result of creating a binary expression object that represents a sum and that was created with the double variable x and the int literal 2 as constructor arguments. More precisely, x+2 is an unnamed object created as BinaryExpr<double,int,plus<double>>(x,2). Note that the type of the object is not quite what we want. We need an object of type BinaryExpr<Variable,Literal,plus<double>>, but the automatic template argument deduction does not know that x is a Variable and 2 is a Literal in our context. The compiler deduces type double from argument x and type int from argument 2 because the compiler examines the types of the arguments passed to the function.
It turns out that we must nudge the compiler a little bit. If we pass an object of type Variable instead of the original variable x, then the automatic argument deduction would at least yield type BinaryExpr<Variable,int,plus<double>>, which is a little closer to the goal. (We will address the remaining int-to-Literal conversion in a minute.) For this reason, a minimal degree of cooperation from the users side is inevitable: users must wrap their variables into objects of type Variable to make the expression work, as shown in Listing 7.
By using a Variable object v instead of a plain numeric variable, we ensure that an expression such as v+2 evaluates to an unnamed object equivalent to BinaryExpr
<Variable,int,plus<double>>(v,2). Such a BinaryExpr object has two data members, which are of type Variable and int respectively. The evaluation function BinaryExpr<Variable,int,plus
<double>>::eval() would return the sum of the result of the evaluation
of its two data members. The crux is that the int data member does not
know how to evaluate itself; we need to turn the literal 2 into an object
of type Literal, which does know how to evaluate itself. How can we automatically
convert constants of any numerical type to objects of type Literal? In
order to solve the problem, well define expression traits. For more information
on traits, see the sidebar, Traits.
The traits technique solves the problem of converting numeric literals to objects of type Literal. For each expression type, we define expression traits that provide information about how sub-expressions should be stored inside the expression objects of which they are operands. All entities of numeric types shall be stored as objects of type Literal; all objects of type Variable shall be stored as they are, namely as Variables; and all non-terminal expression objects shall also be stored as they are. Listing 8 shows the definition of the expression traits.
The expression traits class defines a nested type expr_type, which represents the expression objects expression type. A general expression traits template defines the expression type for all expressions that are class types, such as BinaryExpr, UnaryExpr, or Variable. In addition, there are specializations of the general class template for all those types that are built-in numeric types such as short, int, long, float, double, etc. For all non-class expressions, the expression type is defined as type Literal.
Inside the definition of the BinaryExpr and UnaryExpr classes, we will use the expression traits to determine the data member type that will hold the sub-expression.
Thanks to the expression traits, an expression object of type BinaryExpr<Variable,int,plus<double>> will contain its two operands as objects of type Variable and Literal, as desired.
Now we have ensured that an expression such as ((v + 2) * 3).eval(), where v is a Variable that wraps a double variable x, will evaluate to (x+2)*3. Let us do some minor tweaking to make the expression even more readable. Most people find it weird to invoke a member function of something that looks like an expression. If we define a helper function, we can turn the expression ((v + 2) * 3).eval() into something like eval((v + 2) * 3), which looks more natural to most people but is otherwise equivalent. Listing 9 shows the helper function.
Figure 4 illustrates how the expression ((v + 2) * 3).eval(), where v is a Variable that wraps a double variable x, gradually unfolds at compile time to the expression (x+2)*3.
Repeated Evaluation of an Expression Object
What is the benefit of expression objects? Each expression object represents the syntactical decomposition of an arithmetic expression; its a syntax tree that knows how to interpret itself to produce a numeric value. Basically, weve set up machinery for evaluation of expressions something that is built into the language anyway. So, what is the point? A final small adjustment to the solution will reveal the point.
So far, the syntax tree interpretation is rather static. Each syntax tree is created and interpreted only once. A more dynamic usage model is possible, where a given syntax tree is evaluated repeatedly for different input values. Remember, we ultimately want to calculate integrals such as:
using an integration function such as the one below, which approximates the integral by evaluating an expression for a specified number of equidistant points in an interval:
template <class ExprT> double integrate (ExprT e,double from,double to,size_t n) { double sum=0, step=(to-from)/n; for (double i=from+step/2; i<to; i+=step) sum+=e.eval(i); return step*sum; }
We would like to use this function as shown below to approximate the integral of x/(1+x) from the example above:
Identity x; cout << integrate (x/(1.0+x),1.0,5.0,10) << endl;
We need an expression object that can be interpreted repeatedly, something that our expression templates do not allow yet. It just takes a minor modification to turn our static syntax tree interpretation into a repeatable interpretation. We just have to change all evaluation functions of all the expression class templates so that they accept an input argument, namely the value for which they evaluate themselves. The non-terminal expressions will pass on the argument to their sub-expressions. The Literal will accept the argument and ignore it; it will continue returning the constant value that it represents. The Variable will no longer return the value of a variable that it refers to but will return the value of its argument. For this reason we rename it to Identity. Listing 10 shows the modified classes.
If we add non-terminal expressions for incorporation of numeric functions (such as sqrt, sqr, exp, log, etc.), we can even calculate the Gauss distribution:
double sigma=2.0, mean=5.0; const double Pi = 3.141593; cout << integrate( 1.0/(sqrt(2*Pi)*sigma) * exp(sqr(x-mean)/(-2*sigma*sigma)), 2.0,10.0,100) << endl;
For implementation of non-terminal expressions for the numeric functions, such as sqrt, exp, and sqr in the example, we can use functions from the Standard C library. We would implement these non-terminal expressions as creator functions that create a unary expression, whose operation is one of the predefined C functions (see Listing 11).
With these additions in place, we now have a powerful, high-performance solution for evaluating arithmetic expressions. The techniques demonstrated in this article allow you to set up expression templates for logical expressions. By renaming the evaluation function from eval() to operator(), which is the function call operator, you can easily turn the expression objects into function objects, which you can then use in conjunction with STL algorithms. Below is an example of a logical expression that is used as a predicate for counting elements in a list:
list<int> l; Identity x; count_if(l.begin(),l.end(), x >= 0 && x <= 100 );
The example nicely illustrates the gain in readability that expression templates enable, at zero cost in terms of run-time performance. Provided that the expression templates are already in place, they are easy and convenient to use. All expression template libraries are built on principles similar to the ones discussed in this article.
References
Quite a number of expression template libraries are available for free download. Some of these libraries are mentioned in the list of references below. However, this list is by no means comprehensive or representative. If you are interested in further information on expression templates, we recommend one of the directories ([5, 6]).
[1] Thomas Becker. STL & Generic Programming: C++
Template Metprogramming, C/C++ Users Journal, August 2002.
[return to text]
[2] Thomas Becker. STL & Generic Programming: More
on C++ Template Metprogramming, C/C++ Users Journal, October 2002.
[return to text]
[3] Erich Gamma, Richard Helm, Ralph Johnson, and John Vlissides.
Design Patterns: Elements of Reusable Object-Oriented Software (Addison-Wesley,
1995).
[return to text]
[4] T. Veldhuizen. Expression Templates, C++
Report, June 1995, pp. 26-31. The article has been reprinted in the book
C++ Gems edited by Stanley Lippman (SIGS Books & Multimedia, 1996).
[return to text]
[5] Research Centre Jülich, <www.fz-juelich.de/zam/cxx/>.
An impressive directory of C++ resources such as books, articles, FAQs, other
C++ pages, compilers, libraries, etc. See in particular the links to other C++
libraries at <www.fz-juelich.de/zam/cxx/extmain.html#lib>.
[return to text]
[6] The Object-Oriented Numerics Page, <www.oonumerics.org/oon/#libraries>.
A directory of links to freely available numeric libraries.
[return to text]
[7] The Blitz Project, <http://oonumerics.org/ blitz/>. A C++ class library for scientific computing that uses template techniques to achieve high performance.
[8] PETE (Portable Expression Template Engine), <www.acl.lanl.gov/pete/>. A portable C++ framework to easily add powerful expression templates.
[9] POOMA (Parallel Object-Oriented Methods and Applications), <www.acl.lanl.gov/pooma/>. An object-oriented framework for applications in computational science requiring high-performance parallel computers.
[10] MET (Matrix Expression Templates), <http://met.sourceforge.net/>. A C++ matrix class library. C++ overloaded operators enable one to write the simplest code like u = m*v + w.
[11] MTL (Matrix Template Library), <www.osl.iu.edu/research/mtl/>. A high-performance generic component library that provides linear algebra functionality for a wide variety of matrix formats. Uses an STL-style approach. Provides generic algorithms corresponding to the mathematical operations that define linear algebra. Containers, adaptors, and iterators are used to represent and to manipulate matrices and vectors.
[12] Jeremy G. Siek and Andrew Lumsdaine. The Matrix Template Library: A Generic Programming Approach to High-Performance, <www.lsc.nd.edu/downloads/research/ mtl/papers/iscope_final.pdf>.
[13] FACT! (Functional Additions to C++ through Templates and Classes), <www.kfa-juelich.de/zam/FACT/start/index.html>. FACT! is a library that offers lambda expressions built on top of PETE. By using FACT!'s lambda syntax, the programmer can define generic functions for use with STL on the fly.
[14] FC++ (Functional Programming in C++), <www.cc.gatech.edu/~yannis/fc++/>. FC++ is a library for functional programming in C++, which you can use to define your own higher-order polymorphic functions.
[15] BLL (The Boost Lambda Library), <www.boost.org/libs/lambda/doc/index.html>. BLL is a C++ template library that implements lambda abstractions for C++. The primary motivation for BLL is to provide flexible and convenient means to define unnamed function objects for STL algorithms.
[16] Phoenix (A parser used by Spirit), <http://spirit.sourceforge.net/index.php?doc=docs/phoenix_v1_0/index.html>. Phoenix is a framework that opens up functional programming techniques such as Lambda (unnamed functions) and Currying (partial function evaluation), similar to FC++ and BLL.
Acknowledgements
Our thanks to Gary Powell, who brought FACT!, FC++, BLL, and Phoenix to our attention.
About the Authors
Angelika Langer works as an independent freelance trainer/consultant. Her main area of interest and expertise is object-oriented software development in C++ and Java. She can be reached at [email protected].
Klaus Kreft is a senior consultant at Siemens Business Services in Germany. He can be reached at [email protected].