Welcome
When CUJ's Editor-in-Chief asked me if I wanted to write a column on Generic Programming and the C++ STL (Standard Template Library) for CUJ, I felt quite pleased and honored. Mixed in with this pleasure was a good dose of fear at the prospect of having to write regularly for an audience as competent and critical as the CUJ readership. But then, the only sure way to find out if you are up to a task is to take it on. So here I am, and welcome everybody to this new column.
Target Audience and Prerequisites
Let me begin by saying a few words about who I am, for whom I will be writing, and how I think we can make this column a fruitful endeavor for everybody.
When I read a book or an article written by someone like Donald Knuth or Bjarne Stroustrup, I see the relationship between the author and myself as a teacher-student relationship. I do hope that this column will be read by junior software engineers or perhaps even students of computer science to whom I could well be a teacher. By and large, however, I regard the readers of this journal as my peers. Just like you, dear reader, I develop software for a living. While that includes a variety of different activities, it mainly consists of writing code. When I go to work at noon, I sit down at my terminal and write code. It's what I do best, it's what I like, and I am proud of what I do. I have no desire to do anything else for a living.
The only thing that perhaps distinguishes me from you is that I have had the good fortune of being able to use the STL, and generic programming in general, extensively in my work. Moreover, I have some experience as a teacher and a writer, and I believe that this will enable me to share with you some of the things that I have learned. I also hope that this will not remain a one-way street. I know that many of you have learned and understood things that have evaded me. If you email me on any subject that I treat here, I will try my best to pass your insights on to everybody else. I would much rather assist you in exchanging your ideas than just lecturing you. A word of caution, though: The fact that I am a software developer naturally implies that I am under the pressure of impossible deadlines and unfulfillable requests most of the time. So please do not be offended if an email ever remains unanswered. It will not be out of disrespect, but for sheer lack of time.
Next, let me say a few words about the audience to whom my column is directed. The only prerequisite for benefiting from this column is that you know C++. By "knowing C++," I mean that you have a basic understanding of every feature of the language. It is not necessary for you to have used every one of these features extensively. I know only too well that this would not be a realistic assumption for a general audience. Doesn't it always go like this: you study a programming language such as C++, or a library such as MFC, or an API such as Win32. You do your best to learn everything about it. Then you work with it every day for months or even years. Although you apply most of what you have learned, there is always the one feature or the one class or the one group of functions that you never get to use, and therefore never really become familiar with. Somehow it's always the other guy who gets to use templates, or CWinThread, or the socket interface.
C++ templates are in fact a good example to explain the kind of prerequisites that I will assume. As we will see shortly, templates are the most important part of C++ for the purpose of generic programming. I will not explain them to you from scratch. Rather, I will assume that you have read about them in a book or learned about them in a C++ class. However, I will not assume that you have used templates extensively in your daily work. I will devote an entire installment to bring you up to speed on those aspects of templates that you do not learn from books or classes, but only from falling flat on your face in everyday programming.
What This Column Is About
The main focus of this column is the C++ Standard Template Library, or STL for short. The STL is part of the C++ Standard Library. As the name would suggest, the STL provides a collection of class and function templates, primarily container classes and related items. However, the STL is much more than just a collection of utilities. It is a piece of software that is designed and written following the generic programming paradigm. To fully understand the STL and get the maximum benefit out of its use, you must view it in the context of generic programming. Conversely, understanding the STL will help you understand generic programming, which goes far beyond the STL as a powerful tool in software engineering. In fact, in this first column I will talk about generic programming in general, with no mention of the STL.
I will not assume any prior knowledge of the STL. My treatment will be entirely self-contained. However, this does not mean that I will write a comprehensive introduction to using the STL. There are several good books on the STL out there. I see no need to write another one. Besides, I was never much of a fan of books that come in installments. I did enjoy reading the preprints of Tom Wolfe's Bonfire of the Vanities in Rolling Stone magazine back in the eighties. But a book on computer programming is not worth much to me if I don't have the whole thing with index and table of contents in front of me. Therefore, if you want to use the STL safely and effectively, you will have to get a book if you do not have one yet. Having no affiliation whatsoever with the author or the publisher, I see no reason to hide from you that I prefer Matt Austern's book [1] over the rest of them by a wide margin. But then, books are a matter of taste, and I urge you to browse and find your personal favorite.
So if this column is not going to be a comprehensive treatment of the STL, what will it be instead? My purpose is threefold. First, I want to introduce you to the tools and possibilities that the STL provides, thereby motivating you to read more and explore further on your own. Second, I will point out common pitfalls in using the STL, of which there are many. Finally, I will try to look at each topic from a more general, conceptual point of view, thereby providing insights that often get lost in a more systematic treatment. In particular, I will emphasize how the STL is an example of generic programming. I cannot stress enough how important the generic programming viewpoint is for the use and understanding of the STL. In the long run, I will also write about topics in generic programming that go well beyond the STL.
In short, I hope to provide valuable insights and information to both the STL novice and the programmer who has already used the STL. You can read this column with no prior knowledge of the STL, but it will not provide you with the comprehensive and systematic treatment that a book affords.
Introduction to Generic Programming
With the personal stuff and the preliminaries out of the way, let us move right along to the main topic of today's column. The term "generic programming" has been around in computer science for quite a while, but until recently, it was rarely heard outside of academic circles. That's changing fast, and for good reasons. Therefore, this first column will give you a very general and basic explanation of what generic programming is.
First of all, let me assure you that generic programming is not some new technology or a new set of rules that you will have to learn. It is merely an attempt to look at certain well known programming techniques from a conceptual, somewhat abstract point of view. In this respect, generic programming is akin to object-oriented programming. Imagine for a moment, if you will, an experienced C programmer who has been living and programming on an island for the last twenty years. Now she is given a bare-bones manual on the C++ syntax. I am sure that it would not take long for her to figure out clever uses of classes and inheritance and the like. And yet, you will agree that a person does not even come close to reaping the full benefit of using C++ until she starts looking at object-oriented programming on a conceptual level. I am beginning to think that the idea of generic programming might be as far reaching and essential as object-oriented programming. Be that as it may, I believe that using programming techniques such as C++ templates without an understanding of generic programming is not unlike using classes without an understanding of object-oriented programming.
Since I believe in tracing things all the way to their roots and beginnings, let us start by looking up the word "generic" in the dictionary. It derives from Latin "genus" for "birth," "kind," "type." Of the several definitions given at www.m-w.com, one actually reads, "having no particularly distinctive quality or application." I have seen a lot of programming that was generic in this sense, but that is not the kind we are talking about here. The relevant definition in our context is "relating to or characteristic of a whole group or class: general."
This explanation from the dictionary should give you a fairly good intuition of what generic programming is. Unfortunately, there is no computer science dictionary that would provide us with an authoritative definition. In fact, if you browse the literature, you will find a number of definitions which, although not exactly contradicting each other, vary quite a bit in scope and in meaning. I do not think that there is much of a controversy going on here. It is just that software engineering as a discipline is so young that the terminology and the underlying concepts are still evolving. We should not be confused or upset by this. Instead, we should be thrilled about living in such exciting times, where we can be an active part of truly historic changes and developments.
A Definition of Generic Programming
As a former mathematician specializing in algebra and logic, I can barely restrain myself from hitting you with the full force of one of the abstract, academic definitions of generic programming. They are so beautiful. Instead, I will do the exact opposite and start you off with a very practical, hands-on, C++-oriented definition. We'll get the bigger picture in later columns as we learn about the STL.
For a solid, hands-on understanding, I like to refer to Bjarne Stroustrup's book [3], page 327, where, almost in passing, he defines generic programming as "programming using types as parameters." To fully understand this, let us take a step back and look at two ways of using parameters in traditional, non-generic programming.
Using Values as Parameters
The most frequent and most obvious use of parameters occurs when we write a function that takes one or more arguments, as in:
void CircleArea(double dblRadius) { return dblRadius * dblRadius * kPi; }
Here, dblRadius is a parameter in the sense that its value is unspecified. The value of dblRadius will be specified at run time, and it is quite likely to be different for different calls of the function. Therefore, we call dlbRadius a run-time parameter.
The above example in fact demonstrates a second way of using values as parameters. The function is obviously part of a set of utility functions. It would not have been a good idea for the programmer to make a decision concerning the desired accuracy of pi. Therefore, he defined a constant kPi in a suitable scope or namespace, like this:
const double kPi = 3.14159;
The variable kPi is a parameter in the sense that its value is unspecified when kPi occurs in the actual code. This time, however, the value is not specified at run time, but at compile time. What has been parameterized here is a block of code. If, for example, kPi is a private static class member, this block of code would be a particular class. The class could then be used in different contexts, accommodating various requirements for the accuracy of pi.
Using Types as Parameters
Now back to Bjarne Stroustrup's definition of generic programming: using types as parameters. It turns out that C++ allows us to use types as parameters in two different ways. These correspond exactly to the two ways in which we have just used values as parameters: types can be run-time parameters, and they can be compile-time parameters. Let us begin by looking at types as run-time parameters. For a simple example, we need two classes that are derived from a common base class and a function that takes a pointer to the base class as its argument:
class Base { public: virtual void Memfoo()=0; } class Derived1 : public Base { public: void Memfoo(); } class Derived2 : public Base { public: void Memfoo(); } void Foo(Base* pb) { pb->Memfoo(); }
When Foo gets called at run time, its argument can be the address of a Derived1 object or the address of a Derived2 object:
Derived1 der1; Derived2 der2; Base* ptrBase; if( /* some condition */ ) ptrBase = &der1; else ptrBase = &der2; Foo(ptrBase);
The call to the member function Memfoo will be dispatched correctly to Derived1::Memfoo in the first case and to Derived2::Memfoo in the second case. As you can see, the actual type of Foo's argument is specified at run time. Therefore, we are looking at an example of a run-time type parameter. Note that it is the existence of a virtual function that makes the two types Derived1 and Derived2 behave differently when referenced through a pointer to Base.
Now let us look at the second kind of parameter, the compile-time parameter. How can we use types as compile-time parameters in C++? This is exactly what templates do for you. The programmer who wrote the function CircleArea that we discussed earlier may well decide to also provide a function to square a double:
double Square(double x) { return x * x; }
Odds are someone will want the same function for floats, ints, and maybe even for some non-basic types. Rather than writing the same code over and over again with just the type of x changed, we may write:
template<typename T> T Square(T x) { return x * x; }
We may then call the function Square on variables of different types, as in
int i; float x; Square(i); Square(x);
The compiler will go back to our definition of Square, substitute the appropriate type (int or float) for the type parameter T, and generate the actual code for Square with the respective type.
Here, T is a type parameter because it is a type that remains unspecified when the function Square is written. It is a compile-time parameter because it is the compiler that substitutes the various types for T as needed.
Conclusion
Following Bjarne Stroustrup, I have defined "generic programming" as "using types as parameters." We saw that C++ allows us to use types as parameters in two different ways: as run-time parameters and as compile-time parameters. For types to be run-time parameters, they must be class types derived from a common base class, and they should have virtual functions. A pointer to base class can then, at run time, become a pointer to any one of the derived classes. Compile-time type parameters in C++, on the other hand, are synonymous with template parameters.
As I said before, this is a hands-on definition of generic programming. As we discuss the C++ Standard Template Library, we will see that there is a bigger picture behind all this.
Epilogue
I mentioned earlier that the terminology used in modern software engineering is still in the process of settling. Therefore, it would be prudent to say a few words about some common variations in the terminology and the definitions concerning generic programming. First of all, Chapter 6 of Czarnecki and Eisenecker's new book [2], besides being an excellent introduction to generic programming in general, gives a comprehensive overview of the terminology that various authors have used.
A recent topic in Herb Sutter's Guru of the Week [4] referred to the use of types as run-time parameters and compile-time parameters in C++ as "flavors of genericity." This is pretty much in keeping with Stroustrup's point of view. Austern [1], on the other hand, gives several variants of the more general, conceptual definition of generic programming in his book. This conceptual point of view is very important. I will come back to this in due time.
As you probably know, the use of types as run-time parameters, that is, referring to derived classes with virtual functions through pointers to a common base class, is commonly known as polymorphism. Most authors choose not to view this in the more general context of generic programming at all.
A very interesting discussion of run-time versus compile-time type parameters can be found in Section 13.6 of Stroustrup's book [3]. Here, Stroustrup points out that using types as parameters in these two different ways has also been referred to as run-time polymorphism versus compile-time polymorphism. It is very common in the C++ community to use the term "polymorphism" as a synonym for "run-time polymorphism." However, as the discussion of polymorphism in Czarnecki and Eisenecker's book [2] shows, it is quite enlightening to view polymorphism as the general idea of programming against an abstract interface.
As the name would suggest, the C++ Standard Template Library makes extensive use of templates. Inheritance and virtual functions, on the other hand, play virtually (no pun intended) no role in the STL. Therefore, my perception of generic programming in future installments of this column will be very much tilted towards templates, that is, towards the use of types as compile-time parameters.
References
[1] Matthew H. Austern. Generic Programming and the STL (Addison Wesley, 1998).
[2] Krzystof Czarnecki and Ulrich W. Eisenecker. Generative Programming (Addison-Wesley, 2000).
[3] Bjarne Stroustrup. The C++ Programming Language, Third Edition (Addison Wesley, 1997).
[4] Guru of the Week #68. <http://www.peerdirect.com/resources/gotw068.html>.
Thomas Becker works as a senior software engineer for Zephyr Associates, Inc. in Zephyr Cove, NV. He can be reached at [email protected].