Tools

Performance Portable C++

By Jeff Keasler, May 07, 2008

Performance portability means that code can achieve good performance across a range of computer architectures while maintaining a single body of source code.

Class Partitioning

Before hiding the implementation of arrays behind a class API, consider how that extra layer of API affects your code. Having described the technique and its performance, I now turn to some guidelines for organizing array classes.

First of all, it's almost certain that whatever partitioning of data you choose at the beginning of a project will change by the time the project ends. With this in mind, partition your data so that refactoring is as painless as possible. There are two extremes of data organization:

1. Lump all your arrays into one huge class.
2. Separate your arrays into a very large number of small classes, each containing only a few arrays.

The first choice has the advantage of better readability and refactorability. If all of your data members are prefixed with the same short mnemonic, then it is almost like having a namespace for your array data. It is more refactorable because you can move data around within the class, and the class API won't change, so your code won't have to change.

The disadvantage of the first choice is that it is hard to instantiate multiple copies of the object without incurring a large fixed overhead in memory and construction time. For instance, if a class holds 800 arrays implemented as STL vectors, you pay the penalty of instantiating all of them, even if you are cloning the object just for a few of the vectors it contains. Construction and initialization can also be a problem, especially for subsets of arrays.
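To make the cloning cost concrete, here is a minimal sketch. The Domain class and its member names are hypothetical, and eight arrays stand in for the hundreds a real code might hold.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical "one huge class" layout; names are illustrative only.
struct Domain {
    explicit Domain(std::size_t n)
        : x(n), y(n), z(n), xd(n), yd(n), zd(n), pressure(n), energy(n) {}

    std::vector<double> x, y, z;      // coordinates
    std::vector<double> xd, yd, zd;   // velocities
    std::vector<double> pressure;
    std::vector<double> energy;
    // ... a real code might declare hundreds more arrays here
};

// Total number of doubles owned by a Domain.
std::size_t storedValues(const Domain& d) {
    return d.x.size() + d.y.size() + d.z.size() +
           d.xd.size() + d.yd.size() + d.zd.size() +
           d.pressure.size() + d.energy.size();
}

// Cloning the whole object to get scratch space for one array
// still copies every member vector.
std::size_t valuesCopiedByFullClone(const Domain& d) {
    Domain scratch = d;
    return storedValues(scratch);
}

// Copying only the array that is needed avoids the fixed overhead.
std::size_t valuesCopiedBySingleArray(const Domain& d) {
    std::vector<double> scratch = d.pressure;
    return scratch.size();
}
```

With 100 entries per array, the full clone copies all eight arrays while the single-array copy touches only one. The gap grows with every array added to the class.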

The second choice has the advantage of extreme flexibility. You can group just a few arrays that are often used together in a class, such as coordinate vectors or velocity vectors. This works well when you want to construct and destruct a lot of temporary objects throughout your run. Small classes are also easier to tune for performance portability.

The disadvantage of the second choice can be reduced readability, because almost every variable will effectively have a different namespace prefix. It's not always fun to write that kind of code, or to read it in equations. The readability issue can be mitigated by pulling data out into local temporaries before use, but that can cause code bloat, not to mention errors introduced by cut-and-paste code and typos. Another drawback of many small classes is that refactoring, say by moving an array from one class to another, will often require changes throughout your code.
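The trade-off might look like the following sketch. The Coordinates and Velocities classes are hypothetical, and local references are used here as one form of local temporary.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical small classes, each grouping a few related arrays.
struct Coordinates {
    explicit Coordinates(std::size_t n) : x(n), y(n), z(n) {}
    std::vector<double> x, y, z;
};

struct Velocities {
    explicit Velocities(std::size_t n) : x(n), y(n), z(n) {}
    std::vector<double> x, y, z;
};

// Every access carries its own object prefix, which clutters the equation.
void advanceVerbose(Coordinates& coord, const Velocities& vel, double dt) {
    for (std::size_t i = 0; i < coord.x.size(); ++i) {
        coord.x[i] += vel.x[i] * dt;
        coord.y[i] += vel.y[i] * dt;
        coord.z[i] += vel.z[i] * dt;
    }
}

// The same update with local references pulled out first. The equations
// read better, but the setup is boilerplate that invites copy-paste errors:
// binding zd to vel.y by mistake would compile without complaint.
void advanceWithLocals(Coordinates& coord, const Velocities& vel, double dt) {
    std::vector<double>& x = coord.x;
    std::vector<double>& y = coord.y;
    std::vector<double>& z = coord.z;
    const std::vector<double>& xd = vel.x;
    const std::vector<double>& yd = vel.y;
    const std::vector<double>& zd = vel.z;

    for (std::size_t i = 0; i < x.size(); ++i) {
        x[i] += xd[i] * dt;
        y[i] += yd[i] * dt;
        z[i] += zd[i] * dt;
    }
}
```

Both functions compute the same result. The second spends six lines on setup to make three lines of arithmetic easier to read.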

Now that I've covered the advantages and disadvantages of the two extremes, I look at a third choice, which is to group data having similar "topological" characteristics.

One indication that arrays are topologically similar is that they have the same length. For example, when working with physics on meshes, velocity components are often defined at the coordinates, so perhaps coordinates and velocities should be encapsulated in the same class.

In contrast, particle data may contain coordinates, but you wouldn't necessarily want to group particle coordinates with mesh coordinates, because particles serve a completely different function. Particles can move independently of mesh coordinates, and they will probably be created and destroyed more often.
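A minimal sketch of this distinction follows. NodeFields and ParticleFields are hypothetical names, and two dimensions are used to keep the example short.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical node-centered data: every array has one entry per mesh node,
// so coordinates and velocities share a class and a length.
struct NodeFields {
    explicit NodeFields(std::size_t numNodes)
        : x(numNodes), y(numNodes), xd(numNodes), yd(numNodes) {}

    std::size_t size() const { return x.size(); }

    std::vector<double> x, y;     // node coordinates
    std::vector<double> xd, yd;   // node velocities, defined at the same nodes
};

// Hypothetical particle data: it also has coordinates, but its length
// changes as particles come and go, so it lives in its own class.
struct ParticleFields {
    std::size_t size() const { return x.size(); }

    void add(double px, double py) {
        x.push_back(px);
        y.push_back(py);
    }

    // Remove particle i by overwriting it with the last particle.
    // Assumes i < size(); particle order is not preserved.
    void remove(std::size_t i) {
        x[i] = x.back();
        x.pop_back();
        y[i] = y.back();
        y.pop_back();
    }

    std::vector<double> x, y;     // particle coordinates
};
```

Keeping the two apart means particle creation and destruction never disturbs the fixed-length node arrays.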

The advantage of grouping arrays topologically is that they can often be nested in hierarchies. For example, one array class could contain array data common to all the nodes of a mesh, while another class could contain extra array data pertaining to a subset. An index set could be used to map array indices in the subset to corresponding indices in the larger mesh as a way to implement inheritance.
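One way to sketch the index-set idea is shown below. MeshNodes, BoundaryNodes, and the pressure array are hypothetical, chosen only to illustrate a subset that adds data while reading shared data from the full mesh.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical array class holding data common to all nodes of a mesh.
struct MeshNodes {
    explicit MeshNodes(std::size_t n) : x(n), y(n) {}
    std::vector<double> x, y;
};

// Hypothetical class holding extra data for a subset of nodes. The index
// set maps subset index i to the corresponding index in the full mesh, so
// the subset "inherits" coordinates without storing its own copy.
struct BoundaryNodes {
    // The caller must keep parentNodes alive for as long as this object is used.
    BoundaryNodes(const MeshNodes& parentNodes,
                  const std::vector<std::size_t>& indexSet)
        : parent(parentNodes), toParent(indexSet), pressure(indexSet.size()) {}

    std::size_t size() const { return toParent.size(); }

    // Inherited data, looked up through the index set.
    double x(std::size_t i) const { return parent.x[toParent[i]]; }
    double y(std::size_t i) const { return parent.y[toParent[i]]; }

    const MeshNodes& parent;
    std::vector<std::size_t> toParent;   // subset index -> mesh index
    std::vector<double> pressure;        // data that exists only on the subset
};
```

Because the subset reads through to the parent, a change to a mesh coordinate is visible from the subset immediately. The price is an extra indirection on each access.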

Topological grouping is a good guideline, but it's only a guideline. There may be performance-sensitive groups of arrays that should be split off into separate classes for easier cache tuning.

Once you have decided how to group your classes, there is one other important point. For classes whose underlying implementation is struct-like, make sure that variables likely to be used together sit in adjacent locations in the struct. This tends to increase your cache hits, and thus your performance. Resist the urge to order struct members strictly alphabetically, because that can scatter related members across cache lines and cause a large performance penalty.
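The effect of member ordering can be sketched with offsetof. The two structs below are hypothetical, and the example assumes that energy, pressure, and volume are the members used together in a hot loop.

```cpp
#include <cstddef>

// Hypothetical per-element struct with members in alphabetical order.
// The hot members (energy, pressure, volume) end up far apart.
struct ElemAlphabetical {
    double energy;
    double history[16];
    double mass;
    double pressure;
    double stress[6];
    double volume;
};

// The same members, with the hot ones placed next to each other.
struct ElemGrouped {
    double energy;
    double pressure;
    double volume;
    double mass;
    double stress[6];
    double history[16];
};

// Bytes from the start of energy to the end of volume in each layout.
const std::size_t kAlphabeticalSpan =
    offsetof(ElemAlphabetical, volume) + sizeof(double) -
    offsetof(ElemAlphabetical, energy);
const std::size_t kGroupedSpan =
    offsetof(ElemGrouped, volume) + sizeof(double) -
    offsetof(ElemGrouped, energy);

// Minimum number of cache lines a span covers, assuming it starts on a
// line boundary. The line size is a parameter because it varies by
// architecture; 64 bytes is a common value, not a universal one.
std::size_t cacheLinesTouched(std::size_t spanBytes, std::size_t lineBytes) {
    return (spanBytes + lineBytes - 1) / lineBytes;
}
```

Both structs hold the same members, but the hot members cover a much smaller span in the grouped layout, so a loop that touches only those three needs fewer cache lines per element.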
