STL Options
If the STL doesn't have a hash function that suits your needs, take a look at the Boost library's hash functions before crafting your own. The boost::hash library provides hashing functions for many primitives, but even more useful is its hash_combine
function. This function takes two hash values and calculates a new, unique hash from them. The Boost hashing library includes templates that apply hash_combine
recursively to calculate hash values for multiobject data structures such as arrays, pairs, and the STL containers. Starting from a user-specified seed value, the hash value is calculated by iterating over each individual object and combining the object's hash value with the previously calculated hash value (or the seed value if the particular object is the first one in the container). This is a useful concept, but again, the performance of hash containers and their associated hash functions are very heavily dependent on the details of the data. Whenever you have a choice between multiple hash functions, you should always benchmark each option before committing to a final decision.
Conclusion
Hash-based containers can provide a significant speed boost to your application under certain conditions. Still, they aren't appropriate in every situation, and the performance gain is greatly influenced by your choice of hash function, which in turn depends on the properties of the data you are hashing.
If you understand the details of your application and data, hash containers are a powerful tool to add to your performance toolbox.
Thomas works as a Project Engineer in Advertising.com's Research and Development department, where he leads software projects that implement newly developed algorithms. Thomas can be contacted at [email protected].