One of C++'s strengths is its Standard Template Library. Because the authors of the STL did a good job of defining the foundation classes and functions that are decoupled from any particular problem, all serious C++ programmers sooner or later meet the classes defined in the STL [1] no matter what problem they're trying to solve. Of course, like any other library of its size, the STL contains some classes that are used more often than others. It is no surprise that the most frequently used (and first learned) are the classes that implement different kinds of sequences. A sequence is a data structure that provides a linear access to its elements. The name of the concept sequence says it clearly. Moreover, some elements, such as std::vector or std::deque, can also provide efficient random access to the data stored.
All the standard sequences have one thing in common they own the objects they store. This means that it is impossible to have the same object contained in two different sequences. Of course, it is possible to create a sequence of pointers (even some smart pointers can be used in standard containers, but it should not be std::auto_ptr) that can implement the concept of "one object in many containers." However, such a solution is error-prone and cannot be seamlessly used with the rest of the STL. Still, the concept can be encapsulated in a class so that it can be compatible with the rest of the STL and safely used in different contexts.
The View Concept
The generalization of this concept is some kind of access filter to existing objects, which I call a "view" because it can be used as a looking glass through which the objects can be looked at. Figure 1 is a view's internal structure. Of course, the collection of iterators can be created, for example, as std::vector<int*>, so it is not much of a discovery. For a view to be useful, it has to behave like the container keeping the actual objects, so that calling operator[] on a view or dereferencing the view's iterator would return the reference to the actual object and not to the contained pointer. This way, the view can provide the transparency, just like a looking glass.
What are the advantages of having transparent collections of iterators? With such a tool in hand, you can:
- Create one consistent view on the objects physically spread over different data structures.
- Create different views that behave like collections containing the same objects.
- Create a view for an existing data structure to provide access type impossible otherwise. For example, views can be used to efficiently provide random access to the objects stored in lists.
- Create different views, presenting the same objects in different order.
- Change the apparent order of objects in some collection without actually moving them. In the case of heavy objects (that is, objects that are difficult to copy), it can result in considerable speedup of operations such as sorting. In the case of objects that cannot be normally swapped (because they are const, for example), a view can be the only way to sort them.
To effectively implement the view concept, you must express Figure 1 in terms of known concepts and classes. Again, a view is itself a sequence: specifically, a sequence of pointers to objects of some type. For more flexibility, it is better to assume that the view is a sequence of iterators to some objects. In this article, I refer to the data structure used internally by the view object as the underlying sequence.
The STL provides many different kinds of sequences and they differ in their tradeoffs between access types and efficiency of sequence operations. For instance, std::vector beats std::list in some situations, but not others. To avoid imposing any particular internal implementation, the view class should let you choose the best underlying data structure.
Again, a view should have specially defined access methods and operators, so that it can behave transparently. The same applies to the view's iterators. If the view object is to be used together with the rest of the STL (with algorithms, for example), its iterators have to comply to the specific standard requirements. The problem is that it is not possible to predict what particular underlying sequence class will be used to implement the view (because it is up to you to decide).
Hopefully, C++ supports incomplete instantiation, so that the generic view's iterator class can provide operations required for the most flexible iterator concept (the random-access iterator). They will be effectively accessible only if the underlying sequence provides them. The same strategy applies to the view class itself depending on what particular sequence will be used to instantiate the view object, the sets of member methods should differ. For example, if the underlying structure is std::list, the push_front method could be provided, but not random-access operator[]. In the case of std::vector, it is the other way around.
Consequently, the generic view class and its iterators provide the most optimistic superset of the functionalities provided by standard sequences and their iterators. The final (instantiated) view object and its iterators provide the operations corresponding to those of the actual underlying sequence and its iterators.
Listing 1 is an excerpt from the view's iterator interface. The view's iterator is a thin wrapper around the iterator provided by the underlying sequence. The most crucial methods dereference operators are implemented so that they provide access not to the contents of the sequence (which stores the iterators or pointers to objects), but to the actual objects themselves. The additional dereference implemented in these methods provides the transparency of the view object. The view_iterator is generic, so that it can wrap around different iterator classes provided by the underlying sequences. Listing 2 presents the view class itself.
As presented in Figure 1, a view is a simple wrapper on the underlying sequence object. To support the transparency required by the view's concept, the begin(), end(), and similar methods return a special (double-dereferencing) iterator object described earlier. That way, operator[] works in the view class (of course, it is accessible only when the underlying sequence class provides this operator).
To provide access to the stored iterators themselves, the v_at(), v_begin(), and v_end() methods are provided so that it is possible to operate on the values stored in the view's underlying sequence, omitting the transparency provided by the usual accessors.
By default, a view object uses std::vector as its underlying sequence (second template parameter).
Example 1: Consistent View Over Nonconsistent Data
Consider the following integers:
int a[] = { 0, 1, 2, 3 }; const int i = 4; int j = 5;
Apparently, these objects are stored in different data structures; some of them can be automatic, some static or even global. To provide one consistent view on these integers, do this:
view<const int*> v; v.assign(&a[0], a + 4); v.push_back(&i); v.push_back(&j);
These instructions create a view object and populate it with pointers (of type const int*) to different int objects. Thanks to its transparency, the view can behave like a vector of integers from this point forward, and provide consistent access to all of them:
copy(v.begin(), v.end(), ostream_iterator<int> (cout, " "));
or
for (int i = 0; i < v.size(); ++i) cout << v[i] << ' ';
Example 2: Different Views On the Same Objects
Because views are physically only a transparent collection of pointers (iterators, in general) to actual data, they are easy to implement the idea of many collections containing the same objects:
int a[] = { 0, 1, 2, 3 }; view<int*> v1(&a[0], a + 3); view<int*> v2(&a[1], a + 4);
The v1 object is a view on the first three integers in an array, while the v2 object presents the last three integers. As a result, the integers with values 1 and 2 are pointed to by pointers stored in both views. Changing the element in one view influences the value of the object contained in the other one; for example:
v1[2] = 20;
changes the third integer in an array. Later, this new value can be read by any of these accessors: a[2], v1[2], v2[1] all of them denote the l-value of the same object.
Example 3: Additional Access Type
The STL provides different sequences for different needs. If there is a need to frequently insert objects in the middle of a sequence, std::list is a good choice. Similarly, when fast random access is required, std::vector is likely to be used. But what do you do when you need to frequently insert elements in the middle of a sequence during its population, and once the sequence is created, fast random access to its elements would be beneficial? No problem, use a list as a physical data structure and later create a vectorized view on all the elements:
list<int> l; // populate the list // ... // create the view on the list // with default vector as the // underlying structure view<list<int>::iterator> v(l.begin(), l.end());
From this point on, you have random access (read and write. If you only want to have read access in a view, use list<int>::const_iterator instead) to the elements in a list just use normal accessors, such as v[12], v.at(123), and so on. This strategy can prove useful when the objects contained in the list are heavy. A view object is relatively small (it contains only iterators) and its construction takes linear time with respect to the list's size. Once the view is created, you can use it for algorithms expecting random-access iterators, such as fast binary_search.
Similarly, if you have two std::vectors and want to merge them, create two views on them with std::list as an underlying sequence. After that, you can call the special list algorithm (merge) as a member function on the view object. Other special list algorithms are also provided (sort, reverse, unique, and so on).
Example 4: Different Views On the Same Data
Probably the most spectacular use of a view object is sorting elements of a sequence without actually moving them. Take, for example, the sequence of heavy objects (so that copying them is highly inefficient) or objects in a sequence that is impossible to sort anyway (i.e., when all we have is a const reference to the sequence object). Sorting elements of such a sequence can be performed with the help of a view. However, the view's transparency here, useful in previous situations, introduces a bit of difficulty. The problem is that to sort a sequence using a view, it is a view's underlying sequence that has to be sorted, but the actual physical objects should be taken for comparisons. Normally, the std::sort algorithm (and all other algorithms that use predicates) uses the same values for comparisons and for swaps. This strategy cannot work with a view. The instruction:
sort(v.begin(), v.end());
sorts the actual objects, which is what is to be avoided in the first place. Similarly,
sort(v.v_begin(), v.v_end());
sorts the iterators in the underlying sequence, but the sorting will be based on comparisons of the iterators' values, not the objects they point to. To effectively sort iterators (or pointers) in a view, based on the comparisons of objects that they point to, the special adaptor should be used with every predicate object:
const int a[] = { 5, 1, 4, 2, 3 }; view<const int*> v1(&a[0], a + 5); view<const int*> v2(v1); sort(v1.v_begin(), v1.v_end(), dereference<const int*, less<int> >()); sort(v2.v_begin(), v2.v_end(), dereference<const int*, greater<int> >());
These instructions sort the view v1, so that it presents the integers in the array in ascending order and the view v2 so that it presents the same integers in descending order. No object was copied, moved, or swapped. Here, the original sequence was a sequence of const objects and it would not be possible to sort them normally. In the case of a sequence of heavy objects, the view can provide a way to sort the elements really fast without copying the actual objects.
Listing 3 presents the dereference adaptor. Its only method binary operator() uses additional dereference adaptors to achieve the effect just described. This adaptor can be used in other algorithms with all binary predicates so that it is possible to create different views on the same data, sorted on different criteria (for example, one view sorted by name, other sorted by address, yet another sorted by phone number, and so on).
Compatibility Issues
At this writing, neither Microsoft Visual C++ 6.0 (SP5) nor 7.0 properly compile the examples presented here. The problem is that those compilers do not support template partial specializations, which is required for std::iterator_traits, when the view class is instantiated with pointers (such as int*). VC++ 7.0 is able to compile the code where "real" iterators are used, as in the third example. When solving actual problems, this should not be an issue because the view class is supposed to be a helper for use with already-created STL sequences, which can provide such real iterators. For comparison, g++ 2.96 has no problem with the code presented here.
Conclusion
The view concept can be a powerful tool in the C++ programmer's toolbox. The initial intent of this class was to provide a way to effectively sort objects in STL sequences, without actually moving them. However, the solution I present here is generic enough (and compatible with the rest of the STL) that it can be used in a broader range of situations. The four examples are probably the easiest and the simplest ones, but I believe the imagination of programmers will find many more uses for this concept. You can download the complete source code (with examples) from http://www.cuj.com/code/.
In fact, the whole concept of views can be extended to the form of quite a big library, like the one available at http://www.zib.de/weiser/vtl/. The view, as a concept, can be easily extended to associative views. A view_set could be a set of iterators to some data, view_map could be used as a structure mapping some keys to some objects without the need to copy them, and so on.
In closing, the view object stores iterators (pointers) to some data and as such is subject to all the limitations of the iterators themselves. For example, the view cannot be used when the actual objects go out of scope, were deleted, or when the iterators were invalidated for some reason. With these limitations in mind, a view object is probably a candidate for some transient, short-lived helper structure to facilitate an operation such as sorting. Using views as objects with a long lifetime requires extreme care; otherwise, you can easily find yourself walking on the edge of undefined behavior.
References
[1] Austern, Matthew H. Generic Programming and the STL: Using and Extending the C++ Standard Template Library, Addison-Wesley, 1998.
[2] Meyers, Scott. Effective STL: 50 Specific Ways to Improve Your Use of the Standard Template Library, Addison-Wesley, 2001.
[3] ISO/IEC 14882:1998(E), Programming Languages C++.
Maciej Sobczak is a Ph.D. student at the Institute of Computer Science, Warsaw University of Technology. You can contact him at http://www.msobczak.com/.