Commit 7d6d570e authored by Geoff Simmons

Ditch the use of Mersenne primes.

The math was wrong, and changing the hash function to correctly
compute mod Mersenne prime just made it slower, and didn't seem
to lower collision rates.

Hash table sizes are just the next higher power of 2. As I interpret
Thorup (2020), this is still strongly universal hashing.
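
As a hedged illustration of why masking with 2^p - 1 is not the same as reducing modulo the Mersenne prime 2^p - 1, the sketch below compares the two. It may or may not be the exact error the message refers to. The helper names mod_mersenne() and mask_bits() are hypothetical and do not appear in the commit.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: reduce x modulo the Mersenne number 2^p - 1
 * by repeatedly folding the high bits onto the low p bits.
 */
static uint32_t
mod_mersenne(uint64_t x, unsigned p)
{
	uint64_t m;

	assert(p >= 2 && p < 32);
	m = (UINT64_C(1) << p) - 1;
	/* 2^p is congruent to 1 (mod m), so the high part can be added in. */
	while (x > m)
		x = (x & m) + (x >> p);
	/* The loop may stop at x == m, which is congruent to 0. */
	return (x == m ? 0 : (uint32_t)x);
}

/* Hypothetical helper: plain bitmasking, which reduces modulo 2^p. */
static uint32_t
mask_bits(uint64_t x, unsigned p)
{
	assert(p >= 2 && p < 32);
	return ((uint32_t)(x & ((UINT64_C(1) << p) - 1)));
}
```

For example, with p = 3 (prime 7) and x = 9, mod_mersenne() yields 2 while mask_bits() yields 1. The folding loop is also the kind of extra work that makes a correct mod-Mersenne reduction slower than a single mask.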
parent 7a96693c
@@ -45,11 +45,6 @@
 #include "ph.h"
 #include "rnd.h"
-/* Exponents of the first 7 Mersenne primes */
-static unsigned lg_mersenne[] = {
-	2, 3, 5, 7, 13, 17, 19,
-};
 #define LEN(a) (sizeof(a) / sizeof(a[0]))
 #define LAST(a) (a[LEN(a)-1])
@@ -128,27 +123,11 @@ lg(unsigned n)
 	return (lg);
 }
-/*
- * For set sizes <= 2^19, the hash table size is 2^N, where N is the
- * exponent of the next highest Mersenne prime, so that the bitmasking is
- * equivalent to mod prime. That probably covers every realistic use case
- * for VCL. But should someone use a set larger than 2^19, we take the
- * next highest power of 2.
- */
 static uint32_t
 getsz(unsigned n)
 {
 	unsigned bits = lg(n);
-	for (unsigned i = 0; i < LEN(lg_mersenne); i++) {
-		if (bits < lg_mersenne[i]) {
-			bits = lg_mersenne[i];
-			if (i > 0 && n == (1U << lg_mersenne[i - 1]))
-				bits = lg_mersenne[i - 1];
-			break;
-		}
-	}
-	if (n > (1U << LAST(lg_mersenne)) && n != (unsigned)(1U << bits))
+	if (n != (unsigned)(1U << bits))
 		bits++;
 	assert(bits <= MAX_LG);
 	return (1U << bits);
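
For reference, here is a minimal standalone sketch of how the sizing behaves after this change. The body of lg() is not shown in the diff, so the version below is a hypothetical stand-in that assumes lg(n) is floor(log2(n)) for n > 0. The value of MAX_LG is likewise assumed.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_LG 31	/* assumed limit; the real value is defined elsewhere */

/* Hypothetical stand-in for lg(): floor(log2(n)) for n > 0. */
static unsigned
lg(unsigned n)
{
	unsigned bits = 0;

	assert(n > 0);
	while ((n >>= 1) != 0)
		bits++;
	return (bits);
}

/* Table size: n itself if it is a power of 2, else the next higher one. */
static uint32_t
getsz(unsigned n)
{
	unsigned bits = lg(n);

	if (n != (unsigned)(1U << bits))
		bits++;
	assert(bits <= MAX_LG);
	return (1U << bits);
}
```

Under these assumptions, getsz(5) gives 8 and getsz(8) gives 8. Sizes at or below 2^19 no longer get special treatment.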