Geoff Simmons authored
Universal hashing has a sounder theoretical basis; in particular, it does away with the dubious minimum hash table size, set by trial and error, below which a perfect hash may not be possible. For nearly all test data, universal hashing performs at least as well as the previous approach, and often better, especially for sets with longer strings: since the subject string is cast as an array of uint32_t, the hash is computed in fewer operations, four bytes at a time.

The only exception I've noticed is /usr/share/dict/words, which now shows more collisions than under the previous approach. But this is likely to become an issue only for sets much larger than is plausible for VCL use cases (on the order of 100,000 elements), and only if all of a set's elements are tested for matches about equally often, whereas real-world usage patterns tend to match a subset much more frequently.
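To make the "four bytes at a time" point concrete, here is a minimal sketch of one textbook multiply-shift universal hash over a string read as uint32_t words. The names (hash_init(), MAXWORDS), the parameter drawing via random(), the 256-byte key limit, and the length mixing are illustrative assumptions, not the code in this commit; the vmod may use a different member of a different universal family.

    #include <stdint.h>
    #include <stdlib.h>
    #include <string.h>

    /* Max key length in 32-bit words (an assumption of this sketch;
     * longer keys are truncated here). */
    #define MAXWORDS 64

    /* Random multipliers defining one member of the hash family. */
    static uint64_t a[MAXWORDS + 1];

    /* Draw new parameters; a real implementation would use a better
     * source of randomness than random(). */
    static void
    hash_init(void)
    {
            size_t i;

            for (i = 0; i <= MAXWORDS; i++)
                    a[i] = ((uint64_t)random() << 33)
                        | ((uint64_t)random() << 2) | 1;
    }

    /* Hash len bytes of s into "bits" bits (1 <= bits <= 32), reading
     * the string one uint32_t word at a time. memcpy() sidesteps
     * alignment concerns; code that guarantees aligned, padded subject
     * strings can cast to uint32_t directly, as the commit describes. */
    static uint32_t
    hash(const char *s, size_t len, unsigned bits)
    {
            /* Mix in the length so that keys differing only in
             * trailing content length hash independently. */
            uint64_t h = a[MAXWORDS] * (uint64_t)len;
            uint32_t w;
            size_t i, nwords = len / 4, rem = len % 4;

            for (i = 0; i < nwords && i < MAXWORDS; i++) {
                    memcpy(&w, s + 4 * i, sizeof w);
                    h += a[i] * w;      /* arithmetic is mod 2^64 */
            }
            if (rem > 0 && nwords < MAXWORDS) {
                    w = 0;
                    memcpy(&w, s + 4 * nwords, rem);
                    h += a[nwords] * w;
            }
            return ((uint32_t)(h >> (64 - bits)));  /* top bits index */
    }

Usage would be to call hash_init() once before building the table; if the chosen parameters happen to produce too many collisions for a given set, a new family member can simply be redrawn, which is exactly the option a fixed hash function does not offer.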