Every algorithm takes at least one pair of iterators that specify a range of objects over which to do something. min_element finds the smallest value in the range, for example, while accumulate summarizes some information about the range as a whole, and partition separates all the elements of a range into those that do and do not satisfy some criterion. For algorithms to do their work, they must examine every object in the range(s) they are passed, and they do this in the way you'd expect: they loop from the beginning of the range(s) to the end. Some algorithms, such as find and find_if, may return before they complete the traversal, but even these algorithms internally contain a loop. After all, even find and find_if must look at every element of a range before they can conclude that what they are looking for is not present.
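To make the range-based calling convention concrete, here is a minimal sketch (mine, not the article's) of what calls to these three algorithms look like; the vector v, the helper function demo, and the predicate isEven are assumptions introduced purely for illustration:

    #include <algorithm>    // min_element, partition
    #include <numeric>      // accumulate
    #include <vector>
    using namespace std;

    bool isEven(int x) { return x % 2 == 0; }       // hypothetical predicate

    void demo(vector<int>& v)                       // hypothetical helper
    {
        vector<int>::iterator smallest =
            min_element(v.begin(), v.end());        // smallest value in the range

        int sum =
            accumulate(v.begin(), v.end(), 0);      // a summary of the range as a whole

        vector<int>::iterator evenBoundary =
            partition(v.begin(), v.end(), isEven);  // even values now precede odd ones;
                                                    // evenBoundary marks the split point
    }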
Internally, then, algorithms are loops. Furthermore, the breadth of STL algorithms means that many tasks you might naturally code as loops could also be written using algorithms. For example, if you have a Widget class that supports redrawing,

    class Widget {
    public:
        ...
        void redraw() const;
        ...
    };

and you'd like to redraw all the Widgets in a list, you could do it with a loop, like this,

    list<Widget> lw;
    ...
    for (list<Widget>::iterator i = lw.begin(); i != lw.end(); ++i) {
        i->redraw();
    }

but you could also do it with the for_each algorithm:

    for_each(lw.begin(), lw.end(), mem_fun_ref(&Widget::redraw));
For many C++ programmers, writing the loop is more natural than calling the algorithm, and reading the loop is more comfortable than making sense of mem_fun_ref and the taking of Widget::redraw's address. Yet this article argues that the algorithm call is preferable. In fact, this article argues that calling an algorithm is usually preferable to any hand-written loop. Why?
There are three reasons:
- Efficiency: Algorithms are often more efficient than the loops programmers produce.
- Correctness: Writing loops is more subject to errors than calling algorithms.
- Maintainability: Algorithm calls often yield code that is clearer and more straightforward than the corresponding explicit loops.
The remainder of this article lays out the case for algorithms.
From an efficiency perspective, algorithms can beat explicit loops in three ways, two major, one minor. The minor way involves the elimination of redundant computations. Look again at the loop we just saw:
    for (list<Widget>::iterator i = lw.begin(); i != lw.end(); ++i) {
        i->redraw();
    }
Pay attention to the loop termination test, i != lw.end(): each time around the loop, i is checked against lw.end(), which means that each time around the loop, the function list::end is invoked. But we don't need to call end more than once, because we're not modifying the list. A single call to end would suffice, and, if we look again at the algorithm invocation, we'll see that that's exactly how many times end is evaluated:
    // this call evaluates lw.end() exactly once
    for_each(lw.begin(), lw.end(), mem_fun_ref(&Widget::redraw));
To be fair, STL implementers understand that begin and end (and similar functions, such as size) are used frequently, so they're likely to design them for maximal efficiency. They'll almost certainly inline them and strive to code them so that most compilers will be able to avoid repeated computations by hoisting their results out of loops like the one above. Experience shows that implementers don't always succeed, however, and when they don't, the avoidance of repeated computations is enough to give the algorithm a performance edge over the hand-written loop.
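For comparison, here is one way to get the same guarantee from a hand-written loop. This sketch is mine, not the article's, and it assumes (as the loop body already does) that lw is not modified during the traversal:

    // evaluate lw.end() exactly once, by hand
    list<Widget>::iterator theEnd = lw.end();
    for (list<Widget>::iterator i = lw.begin(); i != theEnd; ++i) {
        i->redraw();
    }

This works, but the hoisting is now one more detail you must remember to handle yourself; the algorithm call handles it automatically.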
But that's the minor efficiency argument. The first major argument is that library implementers can take advantage of their knowledge of container implementations to optimize traversals in a way that no library user ever could. For example, the objects in a deque are typically stored (internally) in one or more fixed-size arrays. Pointer-based traversals of these arrays are faster than iterator-based traversals, but only library implementers can use pointer-based traversals, because only they know the size of the internal arrays and how to move from one array to the next. Some STLs contain algorithm implementations that take their deque's internal data structures into account, and such implementations have been known to clock in at more than 20 percent faster than the normal implementations of the algorithms.

The point is not that STL implementations are optimized for deques (or any other specific container type), but that implementers know more about their implementations than you do, and they can take advantage of this knowledge in algorithm implementations. If you shun algorithm calls in favor of your own loops, you forgo the opportunity to benefit from any implementation-specific optimizations they may have provided.
The second major efficiency argument is that all but the most trivial STL algorithms use computer science algorithms that are more sophisticated (sometimes much more sophisticated) than anything the average C++ programmer will be able to come up with. It's next to impossible to beat sort or its kin (e.g., stable_sort, nth_element, etc.); the search algorithms for sorted ranges (e.g., binary_search, lower_bound, etc.) are equally good; and even such mundane tasks as eliminating objects from vectors, deques, and arrays are more efficiently accomplished using the erase-remove idiom than the loops most programmers come up with.
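For reference, the erase-remove idiom looks roughly like this; the vector<int> v and the value 99 are assumptions made up for the illustration:

    vector<int> v;
    ...
    // remove shuffles the values to keep toward the front of the vector and
    // returns an iterator to the new logical end; erase then trims off
    // everything from that point on
    v.erase(remove(v.begin(), v.end(), 99), v.end());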
If the efficiency argument for algorithms doesn't persuade you, perhaps you're more amenable to a plea based on correctness. One of the trickier things about writing your own loops is making sure you use only iterators that (a) are valid and (b) point where you want them to. For example, suppose you have an array, and you'd like to take each array element, add 41 to it, then insert it into the front of a deque. Writing your own loop, you might come up with this:

    // C API: this function takes a pointer
    // to an array of at most arraySize
    // doubles and writes data to it. It
    // returns the number of doubles written.
    size_t fillArray(double *pArray, size_t arraySize);

    // create local array of max possible size
    double data[maxNumDoubles];

    // create deque, put data into it
    deque<double> d;
    ...

    // get array data from API
    size_t numDoubles = fillArray(data, maxNumDoubles);

    // for each i in data, insert data[i]+41
    // at the front of d; this code has a bug!
    for (size_t i = 0; i < numDoubles; ++i) {
        d.insert(d.begin(), data[i] + 41);
    }
This works, as long as you're happy with a result where the newly inserted elements are in the reverse order of the corresponding elements in data. Because each insertion location is d.begin(), the last element inserted will go at the front of the deque!

If that's not what you wanted (and admit it, it's not), you might think to fix it like this:
    // remember d's begin iterator
    deque<double>::iterator insertLocation = d.begin();

    // insert data[i]+41 at insertLocation, then
    // increment insertLocation; this code is also buggy!
    for (size_t i = 0; i < numDoubles; ++i) {
        d.insert(insertLocation++, data[i] + 41);
    }
This looks like a double win, because it not only increments the iterator specifying the insertion position, it also eliminates the need to call begin each time around the loop; that eliminates the minor efficiency hit we discussed earlier. Alas, this approach runs into a different problem: it yields undefined results. Each time deque::insert is called, it invalidates all iterators into the deque, and that includes insertLocation. After the first call to insert, insertLocation is invalidated, and subsequent loop iterations are allowed to head straight to looneyland.
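One way to repair the loop, sketched here as an aside rather than taken from the original text, is to lean on the iterator that deque::insert returns: it refers to the newly inserted element and is valid, so stepping past it re-establishes a correct insertion position for the next pass:

    // remember d's begin iterator
    deque<double>::iterator insertLocation = d.begin();

    for (size_t i = 0; i < numDoubles; ++i) {
        // insert returns a valid iterator to the newly inserted element...
        insertLocation = d.insert(insertLocation, data[i] + 41);

        // ...step past it so the next element goes in after this one,
        // preserving the order of the elements in data
        ++insertLocation;
    }

Getting this right by hand means remembering an iterator-invalidation rule on every pass through the loop, which is precisely the kind of detail that makes hand-written loops error-prone.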