Open MPI logo

Open MPI Development Mailing List Archives

  |   Home   |   Support   |   FAQ   |   all Development mailing list

Subject: Re: [OMPI devel] RFC: improve the hash function used by opal_hash_table_t
From: Nathan Hjelm (hjelmn_at_[hidden])
Date: 2013-06-11 18:22:17


I think as long as we are updating this we might as well add a performant hash function. I see Murmur2 is 32-bit and has a public domain implementation: https://code.google.com/p/smhasher/source/browse/trunk/MurmurHash2.cpp

Since Murmur2 hashes in 4-byte blocks I expect it probably would perform better than either a nieve crc32 or Jenkins. Though a hardware accelerated crc32 (if available) would probably work great as well. Both are good suggestions that I will look into. I will also look into the string variant of Murmur though that would require another set of opal_hash_table functions for string keys (I was using the _ptr variety).

-Nathan

On Wed, Jun 12, 2013 at 12:10:42AM +0200, George Bosilca wrote:
> The one-at-the-time version computes on chars, if the performance of the hash function is a critical element in the equation then you will be better off avoiding its usage. I would suggest going with Murmur (http://en.wikipedia.org/wiki/MurmurHash) instead, which is faster and perform well in random distribution. Another interesting features is that there are specialized derivative for strings based key, a feature that might prove helpful with the MCA parameters and the MPI_T stuff.
>
> George.
>
>
> On Jun 11, 2013, at 23:32 , Nathan Hjelm <hjelmn_at_[hidden]> wrote:
>
> > What: Implement a better hash function in opal_hash_table_t. The function is a simple one-at-a-time Jenkin's hash (see http://en.wikipedia.org/wiki/Jenkins_hash_function) and has good collision rates and isn't overly complex or slow.
> >
> > Why: I am preparing an update to the MCA variable system (adding performance variables) which will include a hash-based lookup function (in preperation for the inevitable MPI_T_cvar/pvar/category lookup functions-- MPI 3.0 errata). The current hash function is not very good so now seems like a good time to update it.
> >
> > When: Will push this to trunk on Thursday if there are no objections.
> >
> > Patch below
> >
> > Index: opal/class/opal_hash_table.c
> > ===================================================================
> > --- opal/class/opal_hash_table.c (revision 28609)
> > +++ opal/class/opal_hash_table.c (working copy)
> > @@ -356,14 +356,20 @@
> > static inline uint32_t opal_hash_value(size_t mask, const void *key,
> > size_t keysize)
> > {
> > - size_t h, i;
> > - const unsigned char *p;
> > -
> > - h = 0;
> > - p = (const unsigned char *)key;
> > - for (i = 0; i < keysize; i++, p++)
> > - h = HASH_MULTIPLIER*h + *p;
> > - return (uint32_t)(h & mask);
> > + const unsigned char *p = (const unsigned char *) key;
> > + uint32_t h = 0, i;
> > +
> > + for (i = 0 ; i < keysize ; ++i, ++p) {
> > + h += *p;
> > + h += h << 10;
> > + h ^= h >> 6;
> > + }
> > +
> > + h += h << 3;
> > + h ^= h >> 11;
> > + h += h << 15;
> > +
> > + return h & (uint32_t) mask;
> > }
> >
> > int opal_hash_table_get_value_ptr(opal_hash_table_t* ht, const void* key,
> >
> >
> >
> >
> > -Nathan
> > _______________________________________________
> > devel mailing list
> > devel_at_[hidden]
> > http://www.open-mpi.org/mailman/listinfo.cgi/devel
>
>
> _______________________________________________
> devel mailing list
> devel_at_[hidden]
> http://www.open-mpi.org/mailman/listinfo.cgi/devel