Should std::hash<T> work when T is std::pair<two simpler types also supported by std::hash>?

947 Views Asked by At

I was using an ordered set declared as so:

std::set<std::pair<const std::string, const myClass *> > myset;

After doing some analysis of the way I was using the set, I concluded that an unordered_set would be a smarter choice. But when I changed std::set to std::unordered_set, I got a vast spew of error messages from my compiler (g++ 4.8.1) complaining of an

invalid use of incomplete type struct std::hash<std::pair<const std::basic_string<char>, const myClass * > >

I figured out that std::hash didn't know how to deal with a type which was std::pair, despite the fact that the two types that made up the pair were each hashable. I think error for hash function of pair of ints contains relevant information about the C++11 standard that explains why things went awry. (There's no good explanation for the impenetrable wall of error text that g++ emits for this.)

It would seem to me that

std::hash<std::pair<T1, T2>> hasher(make_pair(x,y))
  = some_func(std::hash<T1>hasher(x), std::hash<T2>hasher(y) )

where some_func() could be as simple as XOR (or not; see Why is XOR the default way to combine hashes?)

Is there a good reason for the standard to not require std::hash to know how to construct a hash value for an object which is a pair of types that are each hashable?

1

There are 1 best solutions below

5
On

The reason is simple, it was not added to the standard. The same is true of hashing other structures like tuple.

Things tend to be added to the standard when they are good enough, not when they are perfect, as perfection is the enemy of the good. More specializations of std::hash are not things that will break code (that often), so adding new ones is relatively harmless.

In any case, to that end, we can write our own hash extenders. As an example:

namespace hashers {
  constexpr size_t hash_combine( size_t, size_t ); // steal from boost, or write your own
  constexpr size_t hash_combine( size_t a ) { return a; }
  constexpr size_t hash_combine() { return 0; }
  template<class...Sizes>
  constexpr size_t hash_combine( size_t a, size_t b, Sizes... sizes ) {
    return hash_combine( hash_combine(a,b), sizes... );
  }

  template<class T=void> struct hash;

  template<class A, class B>
  constexpr size_t custom_hash( std::pair<A,B> const& p ) {
    return hash_combine( hash<size_t>{}(2), hash<std::decay_t<A>>{}(p.first), hash<std::decay_t<B>>{}(p.second) );
  }
  template<class...Ts, size_t...Is>
  constexpr size_t custom_hash( std::index_sequence<Is...>, std::tuple<Ts...> const& p ) {
    return hash_combine( hash<size_t>{}(sizeof...(Ts)), hash<std::decay_t<Ts>>{}(std::get<Is>(p))... );
  }
  template<class...Ts>
  constexpr size_t custom_hash( std::tuple<Ts...> const& p ) {
    return custom_hash( std::index_sequence_for<Ts...>{}, p );
  }
  template<class T0, class C>
  constexpr size_t custom_hash_container( size_t n, C const& c) {
    size_t retval = hash<size_t>{}(n);
    for( auto&& x : c)
      retval = hash_combine( retval, hash<T>{}(x) );
    return retval;
  }
  template<class T0, class C>
  constexpr size_t custom_hash_container( C const& c) {
    return custom_hash_container( c.size(), c );
  }
  template<class T, class...Ts>
  size_t custom_hash( std::vector<T, Ts...> const& v ) {
    return custom_hash_container<T>(v);
  }
  template<class T, class...Ts>
  size_t custom_hash( std::basic_string<T, Ts...> const& v ) {
    return custom_hash_container<T>(v);
  }
  template<class T, size_t n>
  constexpr size_t custom_hash( std::array<T, n> const& v ) {
    return custom_hash_container<T>(n, v);
  }
  template<class T, size_t n>
  constexpr size_t custom_hash( T (const& v)[n] ) {
    return custom_hash_container<T>(n, v);
  }
  // etc -- list, deque, map, unordered map, whatever you want to support
  namespace details {
    template<class T, class=void>
    struct hash : std::hash<T> {};
    using hashers::custom_hash;
    template<class T>
    struct hash<T,decltype(void(
      custom_hash(declval<T const&>())
    )) {
      constexpr size_t operator()(T const& t)const {
        return custom_hash(t);
      }
    };
  }
  template<class T>
  struct hash : details::hash<T> {};
  template<>
  struct hash<void> {
    template<class T>
    constexpr size_t operator()(T const& t)const { return hash<T>{}(t); }
  }
}

and now hashers::hash<T> will recursively use either an ADL-looked up custom_hash function, or std::hash if that fails, to hash T and its components, and hashers::hash<> is a universal hasher that tries to hash anything passed to it.

Code may not compile as shown.

I chose to hash all containers and tuples as hash their length, followed by hashing the combination of their contents. As a side effect, array<int, 3> hashes the same as tuple<int,int,int>, and tuple<int,int> hashes the same as pair<int,int>, and std::vector<char>{'a','b','c', '\0'} hashes the same as "abc", which I think is a nice property. The empty array/tuple/vector/etc hashes like size_t(0).

You can extend the above system for your own types by simply overriding custom_hash in the namespace of the type in question, or specializing either std::hash<X> or hashers::hash<X> to do your custom hash (I would go with std::hash for the principle of least surprise myself). For advanced use, you can specialize hashers::details::hash<X,void> with SFINAE, but I'd say do it for custom_hash instead.