C has good support for one-dimensional arrays (vectors), and with C++ and STL, things have only gotten better. Even without STL, whose vector class template is very flexible and efficient, C++ makes using one-dimensional arrays very easy; you can just allocate them directly on the stack or heap and access them like you expect to with operator []. Additionally, almost any application framework available provides many types of standard containers, including vectors. Unfortunately, support for multidimensional arrays is not as good, neither from C++ nor from libraries like STL.
Standard Solutions
If you need a three-dimensional array, you have a few choices. You can declare it this way:
int MyArray[10][20][30];
The above declaration has the following problems: the size is fixed at compile time, the array too easily decays into more elementary pointers (e.g., when passed to functions), and there is no bounds checking. You can overcome the first problem, in part, by fixing all but the first dimension and allocating the array on the heap, but for an N-dimensional array so defined, N-1 dimensions are fixed at compile time, which is not very useful.
You can do the following with STL:
vector<vector<vector<int> > > MyArray (10, vector<vector<int> > (20, vector<int>(30)) );
This array has a few problems, too, and they might not all be obvious. First, it allocates 1+10*20=201 vector objects, in addition to the 10*20*30=6000 integers. In a more general case, for a five-dimensional array of dimensions M*N*J*K*L, a total of 1+M*N*J*K vectors will be allocated. On my platform, a vector is 16 bytes in size, so a lot of space is wasted. But allocating this memory does more than take up space, it also takes time, as the same number of heap allocations will be performed. And the allocated array of integers will not end up in contiguous blocks of memory, but in a lot of small scattered memory chunks.
Second, you will not be able to use many STL algorithms (e.g., max_element) on the three-dimensional vector thus defined, and even using standard iterators to traverse the array is difficult.
Third, you won't be able to access vector's member functions to get information about the entire three-dimensional array, but only about its first dimension. An STL vector of vector of vector is really a 3-D structure where algorithms can work on each dimension, rather than a linear structure (container) that can be accessed through one-, two-, or three-subscript access methods.
Clearly this is not a nice solution, as the extra flexibility provided by the vector class is unnecessary here. std::vector was not intended to implement multidimensional vectors efficiently. You can compound it like above, but the result is suboptimal to say the least.
A Better Solution
Given these limitations, I created my own class template since I needed a two-dimensional, generic, and resizable array:
template <typename T> class Array2D {...};
When I needed a three-dimensional array, I created another class template:
template <typename T> class Array3D {...};
Then I looked at the two class templates and discovered they had a lot in common, so the obvious question came up: can't I create a single class template for an N-dimensional generic resizable array?
template <typename T, unsigned int N> class Array {...};
This turned out to be much easier than I expected, as this article will demonstrate. The class I designed allows me to write code like that found in Listing 1.
Building the Array
In creating such a class template, the allocation of the actual array elements is trivial. The class template needs a pointer to elements of type T, and it will allocate a single one-dimensional C array of them on the heap with a size equal to the product of the N dimensions' sizes. Along with this physical representation, you must provide access to this internal one-dimensional array as if it were an N-dimensional array. All you need are the necessary indexing operators (operator []) to access the individual elements. The first problem starts here: you cannot define operator [][][]... in class Array; you can only define a single operator []. The solution is easy: operator [] needs to return an object of some class upon which you can apply another operator [], and so on until you have specified all N array indexes, one for each dimension. Listing 2 shows a first attempt at this class template, which leaves one big question unanswered: what ReturnType must operator [] have?
SubArrays
An object of some type ReturnType must be returned from the array's operator [], to which another operator [] will be applied if necessary. Naturally, if you apply operator [] to an N-dimensional array, you should return an N-1 dimensional array. You don't want to copy any elements, though, so the operator should not actually return another N-1 dimensional array; instead, it should return a reference to an N-1 dimensional projection (or view) within the original N-dimensional array. Enter subarrays. I call a SubArray a part of an N-dimensional Array. A SubArray within an N-dimensional Array can have any number of dimensions between N-1 and 1. It is an M-dimensional array in its own right, but it doesn't own any elements; it refers to the elements in another Array. So a three-dimensional array of dimensions [10][20][30] has 10 two-dimensional subarrays of dimension 20*30=600 elements, and each of these subarrays contains 20 one-dimensional subarrays of 30 elements each. Note that a subarray can contain other subarrays, but each contained subarray has one less dimension than the containing one. Listing 3 shows what to return from the operator [] applied to the array, and what to return from the operator [] applied to each subarray.
Partial Specialization
I have just defined a mechanism that allows you to traverse an N-dimensional array, one dimension at a time, until a problem is discovered. The recursive definition of class template SubArray (in Listing 3) does not have the necessary base of recursion so that the instantiation process can terminate, providing a fully indexed Array element. Each SubArray<T,N> has an operator [] that returns a SubArray<T,N-1>, but you could continue applying operator [] ad infinitum without getting anywhere. Explicit specialization, commonly used in this scenario as the base of recursion, comes to the rescue. Explicit specialization allows you to provide a specific definition of a class template, for a particular set of template arguments [1]. You must specify that SubArray<T,1>, which is one-dimensional, has an operator [] that returns a reference to an actual array element of type T, not a SubArray<T,0>. This can be done with partial specialization, which is a kind of explicit specialization where not all template parameters are specialized, since type T remains generic (see Listing 4). The core of my solution is in Listings 3 and 4.
To summarize, class template Array represents an N-dimensional generic array. Its operator [] returns an N-1 dimensional SubArray within the array, and operator [] applied to this N-1 dimensional SubArray will return an N-2 dimensional SubArray within the same array, and so on until you reach SubArray<T,1>. SubArray<T,1> will point to a one-dimensional array of elements, and its operator [] will return a reference to an actual element. What's left is the implementation of a mechanism that each Array or SubArray will use to pass to its subarrays the correct part of the original Array and all the information needed to continue the indexing process.
Be warned that some current C++ compilers do not support partial specialization, although it is part of the C++ Standard. My compiler, Microsoft Visual C++ v6.0, does not support it, so I had to use a trick to define these classes. I nested the SubArray classes within Array and removed template parameter T from them so that I could use total specialization. This works well with Visual C++. You can find one version of my Visual C++ classes and other versions for Standard C++ on the CUJ website (www.cuj.com/code). So far, the Standard C++ versions have been tested with Comeau C/C++ 4.2.42 and GNU C++ 2.95.2, but I imagine that soon most compilers will handle them correctly.
Making It Work
With the above architecture in place, you now need a mechanism to create the necessary SubArrays. The first SubArray will be created by Array's operator [], and the subsequent subarrays (if any) will be created by each previous one. The last subarray, SubArray<T,1>, will return a reference to the actual element indexed. You must pass some information to each SubArray so that it can refer to the elements inside the original Array correctly and so that it can pass the necessary information on to the subarrays it will create, allowing them to continue the indexing process.
For this to happen, each SubArray<T,N> needs three pieces of information:
- It needs a pointer, of type T*, to the first element in the SubArray.
- It needs to know how many subarrays of type SubArray<T,N-1> it contains.
- It needs to know the size of each contained subarray of type SubArray<T,N-1>.
Given these three pieces of information, each SubArray<T,N>::operator [] can provide a pointer to the first element of each SubArray<T,N-1> it contains. Note that each subarray lies in a contiguous part of the one-dimensional C array of T elements stored in the Array's member m_pArrayElements. So a SubArray is just a way to see this contiguous area of memory as an M-dimensional array, and each SubArray's operator [] returns a smaller contiguous part within this same area.
A simple example will make it clearer. Suppose you have a three-dimensional Array of dimensions [10][20][30]. The single C array containing all the Array elements will be created in class Array<T,3> as follows:
m_pArrayElements = new T [6000]; // All the T elements lie here
To be able to access the individual elements according to the specified three dimensions, you need two C arrays containing the following information:
int m_NDimensions[3]={ 10, 20, 30 }; int m_SubArrayLen[3]={ 600, 30, 1 }; // 600=20*30
The first array is intuitive; it contains the number of elements for each array dimension. The second is slightly more complicated; it contains the number of elements in each subarray.
The above information can be read as follows: the three-dimensional array is made of 10 two-dimensional subarrays of 600 elements each, each of those is made of 20 one-dimensional subarrays of 30 elements each, and each of those is made of 30 individual elements. If I create these two C arrays in class Array<T,3>, I have all the information I need for each SubArray that will be created in the indexing process. All I need is a way to pass this information along to each SubArray, and each SubArray will need the next element in these two C arrays. Listing 5 shows the source code for the complete indexing process, while Listing 6 shows the implementation of the resize member function. resize builds up the two C arrays, m_NDimensions and m_SubArrayLen as described above, and allocates the C array with all the elements of type T, assigning it to m_pArrayElements. In Listing 5, note that array m_pNDimensions is passed from SubArray to SubArray, but it is used only for bounds checking and could be removed from SubArrays for efficiency. m_SubArrayLen[0], which is the size of the subarrays contained in the current SubArray, is multiplied by the array index to retrieve the pointer to the beginning of the next SubArray. Note also that each SubArray, by passing m_NDimensions+1 and m_SubArrayLen+1 to the subarrays it creates, is in fact passing the part of the relative C arrays that follows the part it needs for itself. m_NDimensions and m_SubArrayLen are pointers, so adding one just points to the next elements in the arrays. If this is not clear to you, the easiest thing to do is step through the code with a debugger. You can download the code from www.cuj.com/code.
Allocation
The resize member function is responsible for the Array class's only piece of tedious work. Everything else is done in tiny member functions. resize must calculate all the information the Array class requires (arrays m_NDimensions and m_SubArrayLen) and allocate the elements with operator new, before deleting the previous ones, if any. Note that resize can preserve the existing array elements by copying them onto the newly allocated array, where possible, in each position that maintains the same set of indexes. Since this might be a slow operation, you have to request it specifically by passing parameter PreserveElems as true (shown in Listing 6). resize also receives as input parameters a C array of N unsigned integers, representing the N dimensions of the array, and an initialization value for the newly created elements. resize does a bit of dirty work, but not much and not often; it is called only when you create the array and when you change its dimensions. resize is the only member function that invalidates current iterators.
Back to One-Dimensional Arrays
The class template for generic N-dimensional arrays can be used as follows:
Array<int,5> A5; // Five-dimensional array of ints! Array<float,1> A1; // One-dimensional array of floats?
True? Almost. The one-dimensional array above will not work correctly. Array<T,1>'s operator [] will return a SubArray<T,0>. However, there is no working specialization for SubArray<T,0>, and the return type went past the SubArray<T,1> specialization that does exist. Array<T,1>::operator [] somehow must return a reference to a single element of type T. I had two choices here; either I could provide a specialization for SubArray<T,0> and add to it two conversion operators that return a reference to an element of type T (one of them const) and an assignment operator, or I could provide a specialization for Array<T,1> itself. Providing a specialization for SubArray<T,0> is easier, since it only needs to implement three tiny operators, but providing a full specialization for Array<T,1>, with its memory management and all, would be better. The full specialization would allow operator [] to return a reference to an element of type T directly, and it would also allow me to change the interface of the class to suit the fact that the array is one-dimensional. For instance, I could change the resize member function to take the actual dimension, not a pointer to an array of one element containing the dimension. Besides, the conversion operators that would be needed if I were to provide a specialization for SubArray<T,0> are dangerous and would probably not always work as expected.
The real issue is, who needs this one-dimensional generic resizable array? Plenty of these classes exist, in STL or other application frameworks, so I think the best specialization for Array<T,1> is shown in Listing 7. Empty. Just to be sure, I also provided an empty specialization for Array<T,0>, since an Array<T,0> truly does not make sense.
Back to STL
Given any container, it is a good idea to make it integrate well with STL, since there are so many algorithms and so much other reusable code that works with STL iterators and containers. It is simple to add iterators, a few typedefs, and member functions ala STL to my Array and SubArray classes. Since the internal representation of the elements inside my Array class is just a standard C array of T elements, ordinary pointers will make good and fast iterators. Listing 8 shows what I added to my Array class. (I made similar additions to SubArray.) It is enough to make my classes integrate well with STL, though more features can still be added. This will also allow you to iterate through all the elements of an Array of N dimensions without nesting N for loops, with just one very fast loop using the iterator. You can use many STL algorithms on my Array and SubArray too. The iterators actually provide a mono-dimensional view of this N-dimensional Array.
Back to Array Dimensions
To set the dimensions of the Array, an Array constructor and a resize member function take an unsigned int [N] (i.e., a C array of N unsigned integers). Although this solution is type safe, it is not very nice, since you must define an external array and set its elements one by one if they are not constants. It would be nice to be able to pass the N dimensions directly to the Array constructor and resize member function without passing through an external C array. But how do you construct the unsigned int array inside the call to resize? And how do you guarantee at compile time that the array contains exactly N elements?
Using the same type of solution that I devised for SubArrays, I created class ArraySizes and class template ArraySize<unsigned int N> (shown in Listing 9). Class ArraySize<N> contains a C array of N unsigned integers, and ArraySize<N> objects can only be created starting from class ArraySizes. Class ArraySizes contains an unsigned int array of one element, and its constructor puts the first Array dimension inside it. Its operator () returns an ArraySize<2> element, which contains the previous Array dimension plus the one passed to operator (). ArraySize<2> has an operator () that returns an ArraySize<3> element, which contains the previous two Array dimensions plus the one now passed to operator (), and so on. This way I guarantee that type ArraySize<N> contains a type-safe C array that was filled with exactly N elements. Since ArraySize<N> has a conversion operator to convert it to an unsigned int (&)[N], I can resize my Arrays as shown in Listing 9. If I accidentally call operator () the wrong number of times, the code will not compile. Note that this solution works only for N>=2, but so does my Array class. Just like my Array class, this solution can easily be extended to work for N>=1. Note also that since Visual C++ v6.0 does not allow specifying parameters of type unsigned int (&)[N], I had to implement these classes differently, passing through a std::vector to maintain type safety.
Finishing Up
To finish my Array class, I added a copy constructor and an assignment operator, equality and inequality operators, a clear member function to delete the contained elements explicitly, and other details, such as a member function to retrieve the size of each Array dimension and a swap member function. Then I made the SubArray's constructor private, as subarrays should be created only by my classes. This was easy to do, by making class SubArray<T,N> a friend of Array<T,N+1> and SubArray<T,N+1>, as these are the only classes that can create an object of class SubArray<T,N>. Note that the SubArray's default copy constructor and assignment operator work fine and are public. See the source code for more details. As far as I can tell, my Array class is type safe, thread safe, and exception safe. For an excellent treatment of container exception safety see [2].
Efficiency
I think the Array class is quite efficient, especially with the help of a good optimizing compiler that, among other things, supports the return value optimization. I ran some simple tests indexing a cubic three-dimensional array of one million integers, with three nested loops that assigned a value to each element of the array, and my Array class ran about half as fast as a pure C array, when compiled with a GNU C++ compiler with full optimization turned on (O3). But if I traversed my Array with one single loop using the iterator (or with three nested loops as shown in Listing 10, holding onto the temporary subarrays traversed), my Array class is just as fast as pure C arrays.
There are also ways to speed up the array. You could, for instance, provide a specialization for SubArray<T,2> to return a T *, instead of a SubArray<T,1> so that the last operator [] would be a normal C pointer dereference, which should be a bit faster since the temporary SubArray<T,1> on which to call operator [] would not be created. However, this way eliminates the possibility of doing bounds checking on the last dimension, causing you to fall back on arrays decaying into pointers, which you are trying to avoid. You could also avoid passing m_NDimensions to SubArrays, as mentioned previously, but again you lose the bounds checking ability.
I think Andrei Alexandrescu provided the best way to speed up these classes. He suggested I derive SubArray<T,N> from SubArray<T,N-1> so that each SubArray<T,N>::operator [] can return a reference to a SubArray<T,N-1> by returning itself instead of returning a new SubArray<T,N-1> object. I did so, and I even pushed it one (dangerous) step further by including a single SubArray<T,N-1> object as member of class Array<T,N> and doing all the indexing by returning references to that single object. This avoids the creation of any SubArray during the indexing process. It works correctly in most situations, but is not thread-safe. Nevertheless, if you need maximum speed in well-behaved single-threaded applications and test your code very well, it may be an alternative. These two versions of the Array class are also available on this website; I think they are both interesting and efficient. The fastest version is roughly 40 percent slower than pure C arrays. In the source code section of this site, you can also find the small test program I used to test the efficiency of my classes, along with a set of results.
Back to Subarrays
My discussion of SubArrays is rather simple and intuitive, but the concept is not general enough. The main problem is that Array<T,N> and SubArray<T,N> are totally distinct types, even though they behave in a very similar way, at least as far as indexing is concerned. So if I have a function that takes a SubArray<int,3> as an argument, I cannot pass it an Array<int,3>, although they both represent three-dimensional arrays of ints. What's more, the implementation of the indexing operators, iterators, etc. is identical in Array<T,N> and SubArray<T,N> and is repeated twice. This is often an indication that a redesign is due.
So, I renamed class SubArray to RefArray, which more correctly indicates that the class is actually a reference to an N-dimensional array, not necessarily a subarray. Then I had two options. First, I could derive Array<T,N> from RefArray<T,N>, thus achieving the twofold advantage of being able to pass an Array to functions that expect a RefArray and reusing RefArray's implementation of the indexing, iterators, etc. Class Array would just add the interface for creating, resizing, deleting, assigning, and swapping an Array; the rest would be inherited from class RefArray. Note that to do this last modification, I would mainly have to remove code, so the result would be greatly simplified.
There are a few extra things to consider. Is it really true that an Array IS-A RefArray? Do I really want to add the necessary virtual destructor? Do I want the extra (protected?) data members of RefArray<T,N> in my Array<T,N>? One thing is sure, I don't want to add a destructor to RefArray for performance reasons (I tried it with 100 percent performance overhead), and not adding it opens the door to misuse. Besides, just as with STL containers, I don't want my Array to have any virtual functions. Let's consider the second alternative. I can simply add to Array<T,N> a member function (better than a conversion operator) to return a RefArray<T,N>. This way the code duplication would not be avoided, but the solution would be neater. You can find both solutions on this website. README.TXT describes all the versions provided.
Extensions
These classes can easily be extended. You could add a lower bound to each array dimension, whereas currently it is fixed to zero. You could use exceptions instead of assertions in bounds checking, if you are willing to put up with the overhead of the extra check in such a critical position, even in release builds. You can also add an allocator as a template parameter to my Array class template and some useful member function templates, as in other STL containers. If you want my Array to work with different SubArray classes, you could pass them as a template parameter to Array. If you need to do numerical calculations, you might want to add numerical operations that work on entire arrays. I leave these and other modifications for the interested reader, as my goal is to present a useful and extensible idea.
Alternatives
Bjarne Stroustrup pointed out another multidimensional Array class that uses a template argument to specify the number of dimensions. It is part of POOMA (Parallel Object-Oriented Methods and Applications), a collection of C++ template classes for writing parallel PDE (Partial Differential Equation) solvers using finite-difference and particle methods [3]. POOMA is a very powerful, big, and complex object-oriented framework for applications in computational science requiring high-performance parallel computers.
Besides being much, much smaller, simpler, and less ambitious, my Array class differs from POOMA's Array class in significant ways. First of all, it does not have the support for parallel computation, polymorphic indexing through engine objects, floating-point domains, or any other advanced numerical feature. POOMA's Array is dedicated to numerical applications and can hold only elements of intrinsic numeric types (like int and double) and fixed-size Vector, Tensor, and Matrix classes (also provided by POOMA). My Array is much more general; it can contain elements of any intrinsic type and any class that has a default constructor and an assignment operator.
Second, whereas I use a sequence of operator [] to index the N-dimensional Array, as with C arrays and std::vector, POOMA's Array uses operator (), passing N parameters to a single function. Since POOMA's Array has to implement these operator ()s individually (along with many other functions coded individually for each specific number of dimensions), it has a fixed maximum number of dimensions (currently seven). My Array does not have this limitation; it can have any number of dimensions. (Currently my Array has a lower limit, N>=2, but the limit can easily be removed, as explained above.)
Lastly, POOMA requires a license for its use (although it is free for government agencies and teaching institutions). My Array certainly cannot compete with POOMA's power for complex numerical applications and is not meant to, but it is a more general, much smaller, and simpler container. It can provide a simple, efficient, generic, and resizable N-dimensional Array to most applications, like std::vector does with mono-dimensional vectors.
Conclusions
The presented classes yield a generic resizable N-dimensional array, all in about 400 lines of source code. In the complete article on the website, I have discussed and provided various implementations of the Array class, each with its own advantages and disadvantages, hoping to inspire interest, feedback, and new alternatives. I think the exercise of building these classes is instructive, and the power and elegance of template programming and explicit (partial) specialization is evident. The ideas presented, like the recursive definition of class templates SubArray and ArraySize, are applicable to many different fields beside Arrays.
Acknowledgments
My sincere thanks to two exceptional reviewers, Andrei Alexandrescu and Carlo Pescio, who kindly provided me with many great ideas and some clever criticism, influencing this article in many ways with remarkable insight. Thanks also to Herb Sutter for honoring me by sending my article to Bjarne Stroustrup, and to Bjarne Stroustrup for his comments (and for C++).
References
[1] Bjarne Stroustrup. The C++ Programming Language, Third Edition (Addison-Wesley, 1997), section 13.5.
[2] Herb Sutter. Exceptional C++ (Addison-Wesley, 1999), items 8-17.
[3] POOMA websites: http://www.acl.lanl.gov/pooma/; http://acts.nersc.gov/pooma/; http://www.zib.de/benger/pooma/.
[4] Andrei Alexandrescu. "Adapting Automation Arrays to the Standard vector Interface," C/C++ Users Journal, April 1999.
[5] Carlo Pescio. "Binary Constants Using Template Metaprogramming," C/C++ Users Journal, February 1997.
Giovanni Bavestrelli is Technology and Research Manager for the World-Wide Unisys Publishing Solutions Program, headquartered in Unisys Italia S.p.A. in Milan, Italy. He has a degree in Electronic Engineering from the Politecnico di Milano and a passion for C++, Object Oriented Programming and Design, and Software Development in general. He can be reached at [email protected].