Allocators are one of the most mysterious parts of the C++ Standard library. Allocators are rarely used explicitly; the Standard doesn't make it clear when they should ever be used. Today's allocators are substantially different from those in the original STL proposal, and there were two other designs in between all of which relied on language features that, until recently, were available on few compilers. The Standard appears to make promises about allocator functionality with one hand and then take those promises away with the other.
This column will discuss what you can use allocators for and how you can define your own. I'm only going to discuss allocators as defined by the C++ Standard: bringing in pre-Standard designs, or workarounds for deficient compilers, would just add to the confusion.
When Not to Use Allocators
Allocators in the C++ Standard come in two pieces: a set of generic requirements, specified in 20.1.5 (Table 32), and the class std::allocator, specified in 20.4.1. We call a class an allocator if it conforms to the requirements of Table 32. The std::allocator class conforms to those requirements, so it is an allocator. It is the only predefined allocator class in the standard library.
Every C++ programmer already knows about dynamic memory allocation: you write new X to allocate memory and create a new object of type X, and you write delete p to destroy the object that p points to and return its memory. You might reasonably think that allocators have something to do with new and delete but they don't. (The Standard describes ::operator new as an "allocation function," but, confusingly, that's not the same as an allocator.)
The most important fact about allocators is that they were intended for one purpose only: encapsulating the low-level details of STL containers' memory management. You shouldn't invoke allocator member functions in your own code, unless you're writing an STL container yourself. You shouldn't try to use allocators to implement operator new[]; that's not what they're for. If you aren't sure whether you need to use allocators, then you don't.
An allocator is a class with member functions allocate and deallocate, the rough equivalents of malloc and free. It also has helper functions for manipulating the memory that it allocated and typedefs that describe how to refer to the memory names for pointer and reference types. If an STL container allocates all of its memory through a user-provided allocator (which the predefined STL containers all do; each of them has a template parameter that defaults to std::allocator), you can control its memory management by providing your own allocator.
This flexibility is limited: a container still decides for itself how much memory it's going to ask for and how the memory will be used. You get to control which low-level functions a container calls when it asks for more memory, but you can't use allocators to make a vector act like a deque. Sometimes, though, even this limited flexibility is useful. If you have a special fast_allocator that allocates and deallocates memory quickly, for example (perhaps by giving up thread safety, or by using a small local heap), you can make the standard list class use it by writing std::list<T, fast_allocator<T> > instead of plain std::list<T>.
If this seems esoteric to you, you're right. There is no reason to use allocators in normal code.
Defining an Allocator
This already shows you something about allocators: they're templates. Allocators, like containers, have value types, and an allocator's value type must match the value type of the container it's used with. This can sometimes get ugly: map's value type is fairly complicated, so a map with an explicit allocator involves expressions like std::map<K, V, fast_allocator<std::pair<const K, V> > >. (As usual, typedefs help.)
Let's start with a simple example. According to the C++ Standard, std::allocator is built on top of ::operator new. If you're using an automatic tool to trace memory usage, it's often more convenient to have something a bit simpler than std::allocator. We can use malloc instead of ::operator new, and we can leave out the complicated performance optimizations that you'll find in a good implementation of std::allocator. We'll call this simple allocator malloc_allocator.
Since the memory management in malloc_allocator is simple, we can focus on the boilerplate that's common to all STL allocators. First, some types: an allocator is a class template, and an instance of that template allocates memory specifically for objects of some type T. We provide a series of typedefs that describe how to refer to objects of that type: value_type for T itself, and others for the various flavors of pointers and references.
template <class T> class malloc_allocator { public: typedef T value_type; typedef value_type* pointer; typedef const value_type* const_pointer; typedef value_type& reference; typedef const value_type& const_reference; typedef std::size_t size_type; typedef std::ptrdiff_t difference_type; ... };
It's no accident that these types are so similar to those in an STL container: a container class usually gets those types directly from its allocator.
Why so many typedefs? You might think that pointer is superfluous: it's just value_type*. Most of the time that's true, but you might occasionally want to define an unconventional allocator where pointer is some pointer-like class, or where it's some nonstandard vendor-specific type like value_type __far*; allocators are a standard hook for nonstandard extensions. Unusual pointer types are also the reason for the address member function, which in malloc_allocator is just an alternate spelling for operator&:
template <class T> class malloc_allocator { public: pointer address(reference x) const { return &x; } const_pointer address(const_reference x) const { return &x; } ... };
Now we can get to the real work: allocate and deallocate. They're straightforward, but they don't look quite like malloc and free. We pass two arguments to allocate: the number of objects that we're allocating space for (max_size returns the largest request that might succeed), and, optionally, an address that can be used as a locality hint. A simple allocator like malloc_allocator makes no use of that hint, but an allocator designed for high performance might. The return value is a pointer to a block of memory that's large enough for n objects of type value_type and that has the correct alignment for that type.
We also pass two arguments to deallocate: a pointer, of course, but also an element count. A container has to keep track of sizes on its own; the size arguments to allocate and deallocate must match. Again, this extra argument exists for reasons of performance, and again, malloc_allocator doesn't use it.
template <class T> class malloc_allocator { public: pointer allocate(size_type n, const_pointer = 0) { void* p = std::malloc(n * sizeof(T)); if (!p) throw std::bad_alloc(); return static_cast<pointer>(p); } void deallocate(pointer p, size_type) { std::free(p); } size_type max_size() const { return static_cast<size_type>(-1) / sizeof(value_type); } ... };
The allocate and deallocate member functions deal with uninitialized memory; they don't construct or destroy objects. An expression like a.allocate(1) is more like malloc(sizeof(int)) than like new int. Before using the memory you get from allocate, you have to create some objects in that memory; before returning that memory with deallocate, you have to destroy those objects.
C++ provides a mechanism for creating an object at a specific memory location: placement new. If you write new(p) T(a, b) then you are invoking T's constructor to create a new object, just as if you had written new T(a, b) or T t(a, b). The difference is that when you write new(p) T(a, b) you're specifying the location where that object is constructed: the address where p points. (Naturally, p has to point to a large enough region of memory, and it has to point to raw memory; you can't construct two different objects at the same address.) You can also call an object's destructor, without releasing any memory, by writing p->~T().
These features are rarely used, because usually memory allocation and initialization go together: it's inconvenient and dangerous to work with pointers to uninitialized memory. One of the few places where you need such low-level techniques is if you're writing a container class, so allocators decouple allocation from initialization. The member function construct performs placement new, and the member function destroy invokes the destructor.
template <class T> class malloc_allocator { public: void construct(pointer p, const value_type& x) { new(p) value_type(x); } void destroy(pointer p) { p->~value_type(); } ... };
(Why do allocators have those member functions, when containers could use placement new directly? One reason is to hide the somewhat awkward syntax, and another is that if you're writing a more complicated allocator you might want construct and destroy to have some side effects beside object construction and destruction. An allocator might, for example, maintain a log of all currently active objects.)
None of these member functions is static, so the first thing a container has to do before using an allocator is create allocator objects and that means we should define some constructors. We don't need an assignment operator, though: once a container creates its allocator, the allocator isn't ever supposed to be changed. The allocator requirements in Table 32 don't include assignment. Just to be on the safe side, to make sure nobody uses an assignment operator accidentally, we'll disable the one that would otherwise be generated automatically.
template <class T> class malloc_allocator { public: malloc_allocator() {} malloc_allocator(const malloc_allocator&) {} ~malloc_allocator() {} private: void operator=(const malloc_allocator&); ... };
None of these constructors actually does anything, because this allocator doesn't have any member variables to initialize. For the same reason, any two malloc_allocator objects are interchangeable; if a1 and a2 are both of type malloc_allocator<int>, we can freely allocate memory through a1 and deallocate it through a2. We therefore define a comparison operator that says all malloc_allocator objects are equal:
template <class T> inline bool operator==(const malloc_allocator<T>&, const malloc_allocator<T>&) { return true; } template <class T> inline bool operator!=(const malloc_allocator<T>&, const malloc_allocator<T>&) { return false; }
Would you ever want to have an allocator where different objects weren't interchangeable? Certainly but simple and useful examples are hard to come by. One obvious possibility is memory pools. It's common for large C programs to allocate memory from several different places ("pools"), instead of directly doing everything through malloc. This has several benefits, one of which is that it only takes a single function call to reclaim all of the memory associated with a particular phase of the program. A program that uses memory pools might define utility functions like mempool_Alloc and mempool_Free, where mempool_Alloc(n, p) allocates n bytes from pool p. It's easy to write a mempool_allocator that fits into such a framework: each mempool_allocator object would have a member variable to specify which pool it's associated with, and mempool_allocator::allocate would invoke mempool_Alloc to get memory from the appropriate pool [1].
Finally, we get to the one tricky part of defining an allocator: mapping between different types. The problem is that an allocator class, like malloc_allocator<int>, is all built around a single value type: malloc_allocator<int>::pointer is int*, malloc_allocator<int>().allocate(1) returns enough memory for one int object, and so on. In general, however, a container class that uses malloc_allocator may have to deal with more than one type. A list class, for example, doesn't allocate ints; internally, it allocates list nodes. (We'll see more about that in the next section.) Somehow, when you create an std::list<int, malloc_allocator<int> >, the list has to turn the malloc_allocator<int> into a malloc_allocator for list nodes.
The mechanism for this is called rebinding, and it has two parts. First, given an allocator type A1 whose value type is X1, you must be able to write down an allocator type A2 that's just the same as A1 except that its value type is X2. Second, given an object a1 of type A1, you must be able to create an equivalent object a2 of type A2. Both of these parts use member templates, which is why allocators were unsupported or poorly supported on older compilers.
template <class T> class malloc_allocator { public: template <class U> malloc_allocator(const malloc_allocator<U>&) {} template <class U> struct rebind { typedef malloc_allocator<U> other; }; ... };
What this really means is that an allocator class can't ever just be a single class; it has to be a family of related classes, each with its own value type. An allocator class must always have a rebind member, because that's what makes it possible to go from one class in that family to another.
If you have an allocator class A1, the corresponding allocator class for a different value type is typename A1::template rebind<X2>::other [2]. And just as you can convert from one type to another, the templatized converting constructor lets you convert values: you can write malloc_allocator<int>(a) whether a is an object of type malloc_allocator<int>, or malloc_allocator<double>, or malloc_allocator<string>. As usual, malloc_allocator's converting constructor doesn't do anything because malloc_allocator has no member variables.
Incidentally, while most allocators have a single template parameter (the allocator's value type) that's not a requirement it just often happens to be convenient. The rebind mechanism would work just as well for allocators with multiple template parameters:
template <class T, int flags> class my_allocator { public: template <class U> struct rebind { typedef my_allocator<U, flags> other; }; ... };
Finally, one last detail: what do we do about void? Sometimes a container has to refer to void pointers (again, we'll see more about that in the next section), and the rebind mechanism almost gives us what we need, but not quite. It doesn't work, because we would need to write something like malloc_allocator<void>::pointer, and we've defined malloc_allocator in such a way that instantiating it for void would be illegal. It uses sizeof(T), and it refers to T&; neither is legal when T is void. The solution is as simple as the problem: specialize malloc_allocator for void, leaving out everything except the bare minimum that we need for referring to void pointers.
template<> class malloc_allocator<void> { typedef void value_type; typedef void* pointer; typedef const void* const_pointer; template <class U> struct rebind { typedef malloc_allocator<U> other; };
That's it! The complete source code for malloc_allocator is shown in Listing 1.
Using Allocators
The easiest way to use allocators, of course, is to pass them as arguments to container classes; write
std::vector<char, malloc_allocator<char> > V;
instead of plain std::vector<char>, or
typedef std::list<int, mempool_allocator<int> > List; List L(mempool_allocator<int>(p));
instead of plain std::list<int>.
But you can do more than that. The whole point of the STL is that it's extensible: just as you can write your own allocators, you can also write your own container classes. If you are careful, and if you write a container class that uses its allocator for all memory-related functionality, then somebody else will be able to plug in their own custom-written allocators.
Container classes like std::vector and std::list are complicated, and a lot of the complexity has nothing to do with memory management. Let's start with a simple example, so that we can focus just on the allocators. Consider a fixed-size array class, Array, where the number of elements is set in the constructor and can never change thereafter. (This is something like a simplified version of std::valarray.) We'll have two template parameters, the element type and an allocator type.
Containers, like allocators, start with nested types: value_type, reference, const_reference, size_type, difference_type, iterator, and const_iterator. In general, most of those types can be taken directly from the container's allocator thus illustrating why the container's value type and the allocator's must match.
The iterator types, of course, don't usually come from the allocator; usually an iterator is some kind of class, closely tied to the container's internal representation. The Array class is simpler than usual because it's natural to store all of our elements in a single contiguous memory block; we'll maintain a pointer to the beginning of the block and a pointer to the end. The iterators will just be pointers.
Before we go any further, we have to make a decision: how are we going to store the allocator? The constructor will take an allocator object as one of its arguments. We need to hold on to a copy of the allocator throughout the lifetime of the container, since we'll still need it in the destructor.
In some sense, there's no problem here: we could just declare a member variable of type Allocator and be done with it. That solution would be correct, but annoying. Ninety-nine percent of the time, after all, users don't want to bother thinking about allocators; they'll just write Array<int> and use the default and the default allocator is probably an empty class with no non-static member variables. The trouble is that a member variable of type Allocator will take up even when Allocator is an empty class. (This is required by the C++ Standard.) Our Array class will have three words of overhead, instead of two. Maybe an extra word of overhead isn't a big deal, but it's annoying to burden all users with that overhead for a feature that most of them will never use.
There are a number of solutions to this problem, some of which rely on traits classes and partial specialization. Probably the simplest solution is to use a (private) base class of type Allocator instead of a member variable. Compilers are allowed to optimize away empty base classes, and nowadays most compilers do.
We can finally write down a skeleton class definition:
template <class T, class Allocator = std::allocator<T> > class Array : private Allocator { public: typedef T value_type; typedef typename Allocator::reference reference; typedef typename Allocator::const_reference const_reference; typedef typename Allocator::size_type size_type; typedef typename Allocator::difference_type difference_type; typedef typename Allocator::pointer iterator; typedef typename Allocator::const_pointer const_iterator; typedef Allocator allocator_type; allocator_type get_allocator() const { return static_cast<const Allocator&>(*this); } iterator begin() { return first; } iterator end() { return last; } const_iterator begin() const { return first; } const_iterator end() const { return last; } Array(size_type n = 0, const T& x = T(), const Allocator& a = Allocator()); Array(const Array&); ~Array(); Array& operator=(const Array&); private: typename Allocator::pointer first; typename Allocator::pointer last; };
This doesn't yet have all of the boilerplate that we need if we are to satisfy the container requirements (see Table 65, in 23.1 of the C++ Standard, for the complete set of requirements), but most of that boilerplate has little to do with allocators. For our purposes, the most interesting member functions are the constructor, which allocates memory and creates objects, and the destructor, which destroys memory and frees memory.
The constructor initializes the allocator base class, obtains a block of memory that's large enough for n elements (if we were writing something like vector, we might obtain a larger block to allow room for growth), and then loops through the block creating copies of the initial value. The only tricky part is exception safety: if one of the elements' constructors throws an exception, we have to undo everything we've done.
template <class T, class Allocator> Array<T, Allocator>::Array(size_type n, const T& x, const Allocator& a) : Allocator(a), first(0), last(0) { if (n != 0) { first = Allocator::allocate(n); size_type i; try { for (i = 0; i < n; ++i) { Allocator::construct(first + i, x); } } catch(...) { for(size_type j = 0; j < i; ++j) { Allocator::destroy(first + j); } Allocator::deallocate(first, n); throw; } } }
(You might wonder why we're writing out these loops by hand; doesn't std::uninitialized_fill already do what we need? Almost, but not quite. We have to use the allocator's construct member function instead of plain placement new. Perhaps a future version of the C++ Standard will include a version of uninitialized_fill that takes an allocator argument and make such explicit loops unnecessary.)
The destructor is simpler, since we don't have to worry about exception safety: T's destructor is never supposed to throw.
template <class T, class Allocator> Array<T, Allocator>::~Array() { if (first != last) { for (iterator i = first; i < last; ++i) Allocator::destroy(i); Allocator::deallocate(first, last - first); } }
Our simple array class doesn't have to use rebinding or conversion, but that's only because Array<T, Allocator> never creates objects of any type other than T. Other types come in when you define more complicated data structures. Consider, for example, a linked list class whose value type is T. A linked list typically consists of nodes, where each node holds an object of type T and a pointer to the next node. So, as a first attempt, we might define a list node as follows:
template <class T> struct List_node { T val; List_node* next; };
The procedure for adding a new value to the list might look something like this:
- Using an allocator whose value type is List_node<T>, allocate memory for a new list node.
- Using an allocator whose value type is T, construct the new list element in the node's val slot.
- Link the node into place appropriately.
This procedure requires dealing with two different allocators, one of which is obtained from the other via rebinding. It's good enough for almost all applications, even ones that use allocators for quite sophisticated purposes. What it doesn't do is support allocators with unusual pointer types. It explicitly relies on being able to use an ordinary pointer of type List_node<T>*.
If you're extremely ambitious, and you want to support allocators with alternative pointer types, everything suddenly becomes much more complicated the pointer from one list node to another can no longer be of type List_node<T>*, or even of type void*, but must be of some type taken from the allocator. Writing this without circularity is nontrivial: it's illegal to instantiate an allocator with an incomplete type, so there's no way to talk about pointers to List_node until after List_node has been fully defined. We need a delicate chain of declarations.
template <class T, class Pointer> struct List_node { T val; Pointer next; }; template <class T, class Alloc> class List : private Alloc { private: typedef typename Alloc::template rebind<void>::other Void_alloc; typedef typename Void_alloc::pointer Voidptr; typedef typename List_node<T, Voidptr> Node; typedef typename Alloc::template rebind<Node>::other Node_alloc; typedef typename Node_alloc::pointer Nodeptr; typedef typename Alloc::template rebind<Voidptr>::other Voidptr_alloc; Nodeptr new_node(const T& x, Nodeptr next) { Alloc a = get_allocator(); Nodeptr p = Node_alloc(a).allocate(1); try { a.construct(p->val, x); } catch(...) { Node_alloc(a).deallocate(p, 1); throw; } Voidptr_alloc(a).construct(p->next, Voidptr(next)); return p; } ... };
And finally, in case you think that this is far too much effort for far too small a benefit, a reminder: just because you can write a container class that uses allocators doesn't mean that you have to, or that you should. Sometimes you might want to write a container class that relies on a specific allocation strategy, whether it's something as ambitious as a disk-based B-tree container or as simple as the block class that I describe in my book. Even if you do want to write a container class that uses allocators, you don't have to support alternate pointer types. You can write a container where you require that any user-supplied allocator uses ordinary pointers, and document that restriction. Not everything has to be fully general.
Future Directions
If you want to write a simple allocator like malloc_allocator, you should have no difficulty. (Provided that you're using a reasonably modern compiler, that is.) If you have more ambitious plans, however a memory pool allocator or an allocator with nonstandard pointer types for distributed computing the situation is less satisfactory.
If you want to use some alternative pointer-like type, what operations does it have to support? Must it have a special null value, and, if so, how is that value written? Can you use casts? How can you convert between pointer-like objects and ordinary pointers? Do you have to worry about pointer operations throwing exceptions? I made some assumptions in the last section; the C++ Standard doesn't say whether those assumptions are right or wrong. These details are left to individual standard library implementations, and it's even legal for an implementation to ignore alternative pointer types altogether. The C++ Standard also leaves a few unanswered questions about what happens when different instances of an allocator aren't interchangeable.
Fortunately, the situation isn't quite as dire as the words in the Standard (20.1.5, paragraphs 4-5) might make it seem. The Standard left some questions unanswered because, at the time it was written, the C++ standardization committee wasn't able to agree on the answers; the necessary experience with allocators did not exist. Everyone involved in writing this section of the Standard considered it to be a temporary patch, and the vagueness will definitely be removed in a future revision.
For the moment, it's best to stay away from alternative pointer types if you're concerned about portability, but, if you're willing to accept a few limitations, you can safely use allocators like mempool_allocator where the differences between individual objects is important. All major standard library implementations now support such allocators in some way, and the differences between implementations are minor.
Just as the containers take allocator types as template parameters, so the containers' constructors take allocator objects as arguments. A container makes a copy of that argument and uses the copy for all of its memory management; once it is initialized in the constructor, the container's allocator is never changed.
The only question is what happens when you perform an operation that requires two containers to cooperate on memory management. There are exactly two such operations in the standard library: swap (all containers) and std::list::splice. In principle, an implementation could handle them in several different ways:
If you just stay away from swap and splice whenever two containers might be using different allocators, you'll be safe. In practice, I haven't found this to be a serious restriction: you need tight discipline to use a feature like memory pools safely, and you probably won't want indiscriminate mixing between containers with different allocators.
Partly because of unfamiliarity and partly because of the unsatisfactory state of the C++ Standard's requirements, most uses of allocators today are simple. As the C++ community becomes more familiar with allocators, and as the Standard is clarified, we can expect more sophisticated uses to emerge.
[1] You can see an example of a pool allocator in the open source SGI Pro64TM compiler, http://oss.sgi.com/projects/Pro64/.
[2] Why the funny template keyword in that expression? It's an annoying little technicality; like typename, it helps the compiler resolve a parsing ambiguity. The problem is that when A is a template parameter, and the compiler sees an expression like A::B<T>, the compiler doesn't know anything about A's members. Should it assume that B<T> is a member template, or should it assume that B is an ordinary member variable and that < is just a less than sign? A human reader knows which way to interpret it, but the compiler doesn't. You need to put in template to force the first interpretation. Formally, the rule (in 14.2 of the C++ Standard) is: "When the name of a member template specialization appears after . or -> in a postfix-expression, or after nested-name-specifier in a qualified-id, and the postfix-expression or qualified-id explicitly depends on a template-parameter (14.6.2), the member template name must be prefixed by the keyword template. Otherwise the name is assumed to name a non-template."
Matt Austern is the author of Generic Programming and the STL and the chair of the C++ standardization committee's library working group. He works at AT&T Labs Research and can be contacted at [email protected].
Notes