Dr. Dobb's | Living By the Rules: Part II

Living By the Rules: Part II

Pete discusses the addition of almost all of TR1 to the new C++ Standard.

June 01, 2006
URL:http://drdobbs.com/cpp/living-by-the-rules-part-ii/188700816

Pete is a consultant specializing in library design and implementation. He has been a member of the C++ Standards Committee since its inception, and is Project Editor for the C++ Standard. He is writing a book on the newly approved Technical Report on C++ Library Extensions to be published by Addison-Wesley. Pete can be contacted at [email protected].

Last March, I spent a week and a half in Berlin, Germany. Despite missed airline connections, cancelled flights, and lost luggage, I enjoyed the trip. Its intellectual high point was the Deutsches Theater's presentation of a German translation of Othello. My high-school German is pretty rusty, but the elaborate staging and sound effects, combined with a rudimentary familiarity with the plot, made it fairly easy to follow the action. One of the interesting innovations was the halftime show, designed, I suspect, to lure the patrons, out drinking beer and wine in the lobby, back into the theater for the second act. Imagine, if you will, Desdemona and Bianca's slowly simmering conflict erupting into a duel, with each one armed with two swords...

Shortly after that, the C++ Standards Committee met. These meetings don't have the drama of Shakespeare, but they do have the complex plots. Fortunately, they don't usually have tragic endings. They're more like his comedies, where all the complexities get sorted out with explanations that are nearly plausible. This month, I look at how some of the plot lines have so far been resolved, with changes to the C++ Standard that you can look forward to seeing officially sometime in the future.

The biggest change was the addition of almost all of TR1 to the Standard [1]. That means that you'll get smart pointers, four new kinds of call wrappers [2], a bunch of type traits templates for those who enjoy template metaprogramming, random-number generators with well-defined properties and distributions, an array template that should be your replacement for fixed-size C-style arrays, hashed containers, regular expressions, and nearly full C99 library compatibility [3]. I'm not going to talk about that in this column. It's far too big. But we'll look at the rest of the changes, including a few from the meeting last October in Mont Tremblant, Canada.

Making Common Mistakes Legal

It's frustrating to have to add spaces between ">" symbols when you're wrapping up a declaration that involves nested templates. Many people's natural instinct is to jam them all together:

std::vector<std::pair<int, std::string>> vect;

Today, that's illegal: The two ">" symbols at the end are treated as a shift-right, and the declaration is ill-formed. In the next version of the Standard, that will be legal. As long as you're unwinding template IDs, a non-nested ">>" is treated as two ">"s and not as a single token. This means that you have to be a little careful if you really want to have a shift-right somewhere in your template usage, but that's far less common. For example, the working draft of the new Standard gives this example of code that will be invalid:


template <int i> class X { /* ... */ };
template <class Ty> class Y { /* ... */ };
Y<X<6>>1>> x4;

To make it valid, you have to put parentheses around the arithmetic expression:

Y<X<(6>>1)>> x4;
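Here is a minimal sketch that puts both cases side by side, assuming a compiler that implements the new rule. The static member value, the nested typedef type, and the helper shifted_argument are additions made for this sketch so that the result can be inspected; they are not part of the working draft's example.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

template <int i> struct X { static const int value = i; };
template <class Ty> struct Y { typedef Ty type; };

// Under the new rule both closing ">" symbols end template argument lists.
std::vector<std::pair<int, std::string>> vect;

// The parentheses turn ">>" back into a shift-right: 6 >> 1 is 3.
Y<X<(6 >> 1)>> x4;

// Hypothetical helper: reports the template argument that X received.
inline int shifted_argument()
{
    return Y<X<(6 >> 1)>>::type::value;
}

inline void nested_template_example()
{
    assert(shifted_argument() == 3);
    assert(vect.empty());
}
```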

Another common problem occurs when template code uses a type name inside a friend declaration. Technically, the type name has to be what the Standard calls an "elaborated type specifier." That makes this usage illegal:

template <class Ty> class no_children
{   // template to block inheritance from Ty
no_children() {}
friend class Ty;
};

class final : virtual no_children<final>
{
/* ... */
};

If it worked, this pattern would prevent deriving from the class final [4]. There's a problem, though: In the declaration friend class Ty, the name Ty is a template parameter, and a template parameter can't be used in the elaborated type specifier that a friend declaration requires. That's been changed, so that a broader range of syntactic elements can be used as the target of a friend declaration. Under the relaxed rule the declaration is written without the class keyword, as friend Ty, and with that spelling the pattern will be legal.
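The following is a sketch of the pattern under the relaxed rule, not a definitive implementation. It assumes the spelling friend Ty (no class keyword). The names sealed and derived are hypothetical, chosen for this sketch; the type trait is used only to observe that the derived class cannot construct the virtual base.

```cpp
#include <cassert>
#include <type_traits>

template <class Ty> class no_children
{   // template to block inheritance from Ty
    no_children() {}    // private: only friends can construct
    friend Ty;          // relaxed friend declaration, no class keyword
};

class sealed : virtual no_children<sealed>
{
    /* ... */
};

// Hypothetical attempt to derive. The most derived class must construct
// the virtual base, but derived is not a friend of no_children<sealed>,
// so its implicit default constructor is defined as deleted.
struct derived : sealed
{
};

static_assert(std::is_default_constructible<sealed>::value,
              "sealed itself can be created");
static_assert(!std::is_default_constructible<derived>::value,
              "a class derived from sealed cannot be created");

inline void friend_example()
{
    sealed ok;      // fine: sealed is a friend of its own base
    (void)ok;
    assert(!std::is_default_constructible<derived>::value);
}
```

Writing derived d; anywhere in this sketch would be rejected at compile time, which is the point of the pattern.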

A common beginner's mistake is writing something like this:


struct S
{
S() : i(0), j(0) {}
S(int j0)
    {
    S();  // mistake
    j = j0;
    }
int i, j;
};

The mistake is thinking that the line marked "mistake" applies the default constructor to the object being constructed, thus setting i and j to 0. Of course, we've all learned to patiently explain that what that line actually does is create an unnamed temporary object of type S, and immediately destroy it. That will still be a mistake, but there is a new syntax that allows a constructor to delegate part or all of its work to another constructor:


struct S
{
S() : i(0), j(0) {}
S(int j0) : S() // changed
    {
    j = j0;
    }
int i, j;
};

The line marked "changed" has a constructor as its initializer list. That tells the compiler to apply that constructor first, and then to execute the body of the constructor being defined [5].
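Delegation isn't limited to the default constructor. As a small sketch, assuming a compiler that supports the new syntax, here is a hypothetical point type (not from the working draft) in which two constructors forward to a third, and one of them does extra work in its body afterward.

```cpp
#include <cassert>

struct point
{
    point(int x0, int y0) : x(x0), y(y0) {}   // target constructor
    point() : point(0, 0) {}                  // delegates all of its work
    explicit point(int v) : point(v, v)       // delegates, then runs its body
    {
        ++y;
    }
    int x, y;
};

inline void delegation_example()
{
    point origin;
    assert(origin.x == 0 && origin.y == 0);

    point p(3);     // point(3, 3) runs first, then the body bumps y
    assert(p.x == 3 && p.y == 4);
}
```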

Handy Extensions

One of the first things you learn when you're using containers is to write a bunch of typedefs so that you can change the type of the container without rewriting the rest of the code:


typedef std::vector<int> values;
typedef values::iterator iter;
typedef values::const_iterator const_iter;

Now you can write a loop easily:

values data;
for (iter it = data.begin(); it != data.end(); ++it)
   { /* ... */ }

And you can change the container from, say, a vector to a deque by changing its typedef:

typedef std::deque<int> values;

The typedefs for iter and const_iter will change meaning appropriately, and the loop will compile correctly. But that's a lot of boilerplate to have to write every time you use a container type. Soon, you'll be able to write that code like this:

typedef std::vector<int> values;
values data;
for (auto it = data.begin();
        it != data.end(); ++it)
   { /* ... */ }

and when you change the container type, the compiler will simply change the type to match [6].
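To make that concrete, here is a minimal sketch, assuming a compiler that supports auto in this role. The function sum is a hypothetical helper written for this illustration; changing the single typedef to std::deque<int> should leave the loop untouched.

```cpp
#include <cassert>
#include <deque>
#include <vector>

typedef std::vector<int> values;    // or: typedef std::deque<int> values;

// Hypothetical helper: adds up the elements without naming the iterator type.
inline int sum(const values& data)
{
    int total = 0;
    for (auto it = data.begin(); it != data.end(); ++it)
        total += *it;   // here "it" is deduced as values::const_iterator
    return total;
}

inline void auto_example()
{
    values data;
    data.push_back(1);
    data.push_back(2);
    data.push_back(3);
    assert(sum(data) == 6);
}
```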

If you write code that uses dynamic libraries (DLLs under Windows, and shared libraries under UNIX), you've probably used a language extension that lets you talk about a template instantiation whose code lives somewhere else. For example, std::basic_string<char> should be instantiated in the dynamic library that has the rest of the Standard Library code, and not in the code that uses it. Unfortunately, if you just use the name of the template instantiation, the compiler doesn't know that you don't mean to put the code there:


std::basic_string<char> string; 
  // might generate code in 
  // executable

The trick that Standard Library implementers have been using relies on a compiler extension. The header <string> generally has a declaration for basic_string<char> that looks something like this:


extern template class basic_string<char>;
   // not instantiated here

Then somewhere in the library's implementation code, there's the actual instantiation:


template class basic_string<char>;
   // may be instantiated here

That is, putting the extern keyword in front of the template declaration tells the compiler that you're using a template instantiation, but you don't want to have it instantiated at that point. That extension is now part of the language, giving you better control over where templates are instantiated.
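Here is a sketch of the same arrangement for a user-defined template, assuming the standardized spelling extern template class. The template holder is hypothetical. In real code the declaration would sit in a header and the definition in exactly one source file; both appear in one listing here only to keep the sketch self-contained, which is permitted as long as the definition follows the declaration.

```cpp
#include <cassert>

template <class Ty> class holder    // hypothetical template
{
public:
    explicit holder(Ty v) : value(v) {}
    Ty get() const { return value; }
private:
    Ty value;
};

// Normally in the header: promises that holder<int> is instantiated
// somewhere else, so this translation unit need not generate it.
extern template class holder<int>;

// Normally in one source file of the library: the actual instantiation.
template class holder<int>;

inline void extern_template_example()
{
    holder<int> h(42);
    assert(h.get() == 42);
}
```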

New Algorithms

The four algorithms [7] min, max, min_element, and max_element are sometimes used in pairs, when you're concerned about both the minimum and the maximum values. They've been supplemented by two new algorithms, minmax and minmax_element, that determine both the minimum and the maximum value with a single call to the algorithm. When you're searching through a sequence this is obviously beneficial, because you only have to go through the sequence once instead of making two passes, one to find the minimum value and one to find the maximum value.
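A short sketch of both new algorithms follows, assuming a library that provides them. The helper extremes is hypothetical and requires a non-empty sequence, since it dereferences the returned iterators.

```cpp
#include <algorithm>
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical helper: smallest and largest element found in one pass.
// Precondition: data is not empty.
inline std::pair<int, int> extremes(const std::vector<int>& data)
{
    auto result = std::minmax_element(data.begin(), data.end());
    return std::make_pair(*result.first, *result.second);
}

inline void minmax_example()
{
    std::vector<int> data;
    data.push_back(3);
    data.push_back(1);
    data.push_back(4);
    data.push_back(1);
    data.push_back(5);
    assert(extremes(data) == std::make_pair(1, 5));

    // minmax on two values returns them ordered as (smaller, larger).
    std::pair<int, int> two = std::minmax(7, 2);
    assert(two.first == 2 && two.second == 7);
}
```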

Library Interface Improvements

Today, this isn't legal:


std::string file_name("test.txt");
std::ofstream file(file_name);

Instead, you have to go through this circumlocution:

std::ofstream file(file_name.c_str());

One of the library changes is to add std::string overloads for all of the C++ Standard Library functions that currently take only char*. You don't have to call c_str to get to the C++ Library functions.
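As a sketch of what this looks like in use, assuming a library with the new overloads, the hypothetical helper below writes one line to a file and reads it back, passing std::string objects straight to the stream constructors. It creates and then deletes a file in the current directory, so it assumes that directory is writable.

```cpp
#include <cassert>
#include <cstdio>
#include <fstream>
#include <string>

// Hypothetical helper: writes text to file_name, reads it back, cleans up.
inline std::string write_and_read(const std::string& file_name,
                                  const std::string& text)
{
    {
        std::ofstream out(file_name);   // no c_str() needed
        out << text << '\n';
    }                                   // out is closed here

    std::string line;
    {
        std::ifstream in(file_name);    // same for input streams
        std::getline(in, line);
    }

    // std::remove comes from the C library and still takes a const char*.
    std::remove(file_name.c_str());
    return line;
}

inline void string_overload_example()
{
    assert(write_and_read("string_overload_example.txt", "hello") == "hello");
}
```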

The other interface change is to containers. Under the current Standard, if you need a const_iterator object that points into a container, you sometimes have to write a cast:


vector<int> vec;
vec.push_back(1);
vec.push_back(2);
vec.begin();  // returns vector<int>::iterator
((const vector<int>&)vec).begin();  
             // returns vector<int>::const_iterator

With the new changes, there's a set of member functions that always returns const_iterator objects:

vec.cbegin(); // returns vector<int>::const_iterator
vec.cend();   // returns vector<int>::const_iterator
vec.crbegin();// returns vector<int>::const_reverse_iterator
vec.crend(); // returns vector<int>::const_reverse_iterator
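A minimal sketch of the new members in a loop, assuming a library that provides them. The function count_even is a hypothetical helper; the static_assert only confirms that cbegin returns a const_iterator even though vec is not const.

```cpp
#include <cassert>
#include <type_traits>
#include <vector>

// Hypothetical helper: counts even elements through read-only iterators.
inline int count_even(std::vector<int>& vec)
{
    static_assert(std::is_same<decltype(vec.cbegin()),
                               std::vector<int>::const_iterator>::value,
                  "cbegin returns const_iterator for a non-const vector");

    int count = 0;
    for (auto it = vec.cbegin(); it != vec.cend(); ++it)
        if (*it % 2 == 0)
            ++count;    // *it = 0; would not compile here
    return count;
}

inline void cbegin_example()
{
    std::vector<int> vec;
    vec.push_back(1);
    vec.push_back(2);
    vec.push_back(4);
    assert(count_even(vec) == 2);
}
```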

Floating-Point Improvements

The template numeric_limits<Ty> has a new member, max_digits10. It gives the minimum number of base 10 digits that you need to write if you want to be sure that numbers that differ by numeric_limits<Ty>::epsilon() will be distinguished. For example, if you only write out one digit, then values such as 1.1f and 1.2f will be written as 1., and when you read them back in, they'll have the same value. In this case, two digits are sufficient to distinguish these values. More generally, numeric_limits<Ty>::max_digits10 digits are sufficient to distinguish any two floating-point values that differ by at least epsilon [8].
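The round trip can be sketched as follows. The helpers to_text and from_text are hypothetical, and the sketch assumes a library whose stream conversions round correctly, which is what makes the written digits sufficient to recover the original value.

```cpp
#include <cassert>
#include <iomanip>
#include <limits>
#include <sstream>
#include <string>

// Hypothetical helper: writes value using the given number of digits.
inline std::string to_text(float value, int digits)
{
    std::ostringstream out;
    out << std::setprecision(digits) << value;
    return out.str();
}

// Hypothetical helper: reads a float back from its text form.
inline float from_text(const std::string& text)
{
    std::istringstream in(text);
    float value = 0.0f;
    in >> value;
    return value;
}

inline void max_digits10_example()
{
    const int digits = std::numeric_limits<float>::max_digits10;

    // One digit is too few: both values produce the same text.
    assert(to_text(1.1f, 1) == to_text(1.2f, 1));

    // max_digits10 digits keep the values distinct and recoverable.
    assert(to_text(1.1f, digits) != to_text(1.2f, digits));
    assert(from_text(to_text(1.1f, digits)) == 1.1f);
    assert(from_text(to_text(1.2f, digits)) == 1.2f);
}
```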

One problem you may have run into is that there's no easy way, after you've set several floating-point formatting flags, to clear them out so that you can write floating-point values with the default formatting. There's a new manipulator, defaultfloat, that does this for the various floatfield flags:

cout << scientific << 1.0 << defaultfloat << 1.0 << '\n';
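To check the effect without reading console output, here is a small sketch that writes into a string stream instead of cout. The helper names are hypothetical. The manipulator is expected to behave like clearing the floatfield flags by hand with unsetf(std::ios_base::floatfield).

```cpp
#include <cassert>
#include <ios>
#include <sstream>
#include <string>

// Hypothetical helper: text produced while the scientific flag is set.
inline std::string scientific_text(double value)
{
    std::ostringstream out;
    out << std::scientific << value;
    return out.str();
}

// Hypothetical helper: sets scientific, then resets with defaultfloat.
inline std::string reset_text(double value)
{
    std::ostringstream out;
    out << std::scientific;             // sticky formatting flag
    out << std::defaultfloat << value;  // back to default formatting
    return out.str();
}

inline void defaultfloat_example()
{
    // Scientific notation always carries an exponent.
    assert(scientific_text(1.0).find('e') != std::string::npos);
    // Default formatting writes 1.0 as plain "1".
    assert(reset_text(1.0) == "1");
}
```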

C99 Compatibility

One of the big language changes in C99 was the addition of the types long long int and unsigned long long int to the language. They've now been added to C++. This also requires some library additions to provide functions and format specifiers for these types. These library changes were made in TR1, and now, with TR1 incorporated into the C++ Standard, they'll be part of the Standard Library as well.
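As a brief sketch of the type, its literal suffix, and its format specifier, assuming a compiler and library with the C99 additions, the two hypothetical helpers below compute a value that does not fit in 32 bits and format it with %lld.

```cpp
#include <cassert>
#include <cstdio>
#include <limits>
#include <string>

// long long has at least 64 bits, so at least 63 value bits plus a sign.
static_assert(std::numeric_limits<long long>::digits >= 63,
              "long long holds at least 64 bits");

// Hypothetical helper: the LL suffix makes the constant a long long.
inline long long seconds_to_nanoseconds(long long seconds)
{
    return seconds * 1000000000LL;
}

// Hypothetical helper: %lld is the format specifier for long long.
inline std::string to_text(long long value)
{
    char buffer[32];
    std::snprintf(buffer, sizeof(buffer), "%lld", value);
    return buffer;
}

inline void long_long_example()
{
    assert(seconds_to_nanoseconds(9) == 9000000000LL);
    assert(to_text(seconds_to_nanoseconds(9)) == "9000000000");
}
```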

C99 also made some changes to the preprocessor, which will also become part of the C++ Standard. You can look forward to a new predefined macro, __STDC_HOSTED__, that expands to 1 for a hosted implementation or 0 otherwise. The _Pragma operator has been added, which provides an alternate way of writing pragmas. The benefit over using #pragma is that _Pragma can have a macro as its argument, and the macro will be expanded before being interpreted as a pragma. Syntactically, _Pragma takes a quoted string where #pragma takes a sequence of pp-tokens. The example from the Standard is:


#pragma listing on "..\listing.dir"
_Pragma( "listing on \"..\\listing.dir\"" )

And, for those who really like macro programming, macros can now have variable-length argument lists. Here's one of the examples from the Standard:


#define debug(...) fprintf(stderr, __VA_ARGS__)
debug("Flag");
debug("X = %d\n", x);

These work just the way you'd expect: __VA_ARGS__ is replaced by the arguments passed to the portion of the macro's argument list designated by the ellipsis [9].
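Here is a sketch of a macro that mixes an ordinary parameter with the ellipsis, assuming a preprocessor with the C99 additions. FORMAT_INTO and describe are hypothetical names; the macro expects a character array (not a pointer) as its first argument, because it applies sizeof to it.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// buffer is an ordinary parameter; everything after it is collected
// into __VA_ARGS__ and passed on unchanged.
#define FORMAT_INTO(buffer, ...) \
    std::snprintf(buffer, sizeof(buffer), __VA_ARGS__)

// Hypothetical helper: formats a value the way the debug example does.
inline std::string describe(int x)
{
    char text[32];
    FORMAT_INTO(text, "X = %d", x);
    return text;
}

inline void variadic_macro_example()
{
    assert(describe(42) == "X = 42");
}
```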

Conclusion

I've skipped over some of the details of most of these changes. I don't plan on addressing these changes systematically—you'll undoubtedly be seeing books about the new language Standard as it becomes more complete and more stable. In the meantime, I'll continue to highlight the changes as they're approved, which will be once every six months.

Notes

[1] The nearly final draft is available at www.open-std.org/jtc1/sc22/wg21/docs/papers/2005/n1836.pdf. The special math functions, including such things as elliptic integrals of the first kind and Hermite polynomials, were felt to be too complex and too narrowly focused for the Standard.
[2] A call wrapper is one form of what is known generically as a "function object." A call wrapper type has a function call operator that forwards to a function object held by the call wrapper. One example of a call wrapper is the object returned by std::mem_fun: It holds a pointer to member function. It has a function call operator that can be called with ordinary function call syntax, and forwards to the pointer to member function, treating its first argument as the object to apply the pointer to.
[3] Of course, to use these new library facilities you'll need a good reference book. My completely unbiased recommendation is my book, The C++ Standard Library Extensions: A Tutorial and Reference, which will be published this summer by Addison-Wesley.
[4] This example is attributed to Hyman Rosen in the paper that first proposed changing this rule.
[5] This example puts the default constructor in the initializer list, but you can use any valid constructor invocation. When you do this, you can't put anything else in the initializer list. You have to do all of the relevant initialization in the target constructor, and then the body of the constructor you're defining can do whatever else needs to be done.
[6] Unfortunately, this change will be interpreted as an endorsement of writing one-off loops instead of algorithms. In general, instead of creating named iterator objects, you should write template functions and call them with the temporary objects returned by begin() and end().
[7] Or eight, depending on whether you count the version that uses "<" as separate from the version that uses a user-supplied predicate.
[8] Epsilon is the difference between 1.0 and the smallest representable value greater than 1.0.
[9] You can also have ordinary macro arguments before the ellipsis.