Commit 4de40f3a authored by Geoff Simmons

Document bench_qp.

parent da41e697
@@ -9,6 +9,12 @@ not measure any overhead added by the VMOD or Varnish.
The directory also contains files with test data and inputs, some of
which are meant to simulate common use cases for the VMOD.
As documented in [CONTRIBUTING](../../../CONTRIBUTING.rst), the
benchmarks are included in builds when ``configure`` is invoked with
``--enable-benchmarks``. They are always built when ``make`` is
invoked in this directory, but require that the ``ph.o`` or ``qp.o``
object file is built first.
## `bench_ph` -- benchmark perfect hashing
`bench_ph` reads a set of strings from a file or stdin, runs exact
@@ -128,3 +134,68 @@ $ ./bench_ph -s -i inputs.txt -n 100 -c set.csv set.txt
# using the set as its own test inputs.
$ shuf -n 5000 /usr/share/dict/words | ./bench_ph
```
## `bench_qp` -- benchmark trie matches
Like `bench_ph`, `bench_qp` tests the QP implementation by reading a
set of strings from a file or stdin, running prefix matches and/or
exact matches against the strings, and reporting statistics about the
match operation:
```
bench_qp [-hos] [-c csvfile] [-d dumpfile] [-i inputfile] [-m m|p]
[-n iterations] [file]
```
The `-h`, `-s`, `-c`, `-d`, `-i`, `-n` options and the optional `file`
argument have the same meaning as described above for `bench_ph`. The
discussion above concerning `-s` and the effects of usage patterns,
locality and branch prediction applies here as well.
By default, `bench_qp` runs both prefix matches and exact matches
against the set, with `QP_Prefixes()` and `QP_Lookup()` respectively.
Note that the VMOD does not use `QP_Lookup()`, since exact matching
with perfect hashing is faster for all but some unusual data sets.
The `-m` option can be set to `p` to run only prefix matches, or `m`
to run only exact matches. So to test only what the VMOD uses, specify
`-m p`.
If `-o` is specified, then the `allow_overlaps` flag for `QP_Insert()`
is set to 0. In that case, a set in which a string is a prefix for
another string in the set is rejected. By default, overlaps are
allowed.
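To illustrate what counts as an overlap, here is a minimal sketch that checks a string set for the condition `-o` rejects. It is not how `QP_Insert()` detects overlaps; the names `cmp_str` and `has_overlap` are hypothetical, and the sketch assumes byte-wise `strcmp()` ordering. After sorting, any string that is a prefix of another string in the set is also a prefix of its immediate successor, so comparing neighbors is enough.

```c
#include <stdlib.h>
#include <string.h>

/* qsort comparator for an array of C strings (byte-wise strcmp order). */
int
cmp_str(const void *a, const void *b)
{
	return strcmp(*(const char * const *)a, *(const char * const *)b);
}

/*
 * Hypothetical helper, not part of the QP API.
 * Return 1 if some string in the set is a prefix of another string in
 * the set (duplicates and the empty string count as overlaps), else 0.
 * Sorts the array in place.
 */
int
has_overlap(const char **set, size_t n)
{
	if (n < 2)
		return 0;
	qsort(set, n, sizeof(*set), cmp_str);
	for (size_t i = 0; i + 1 < n; i++) {
		size_t len = strlen(set[i]);

		/* set[i] is a prefix of set[i+1] iff the first len bytes match. */
		if (strncmp(set[i], set[i + 1], len) == 0)
			return 1;
	}
	return 0;
}
```

For example, a set containing both `/foo` and `/foo/bar` has an overlap, so a run with `-o` would be expected to reject it, while `/foo`, `/bar`, `/baz` has none.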
The procedure of a benchmark is the same as described above for
`bench_ph`, except as follows:
* The string set is sorted before building the trie, using `qsort(3)`
as the VMOD does. The time for the sort is reported.
* The trie is built by iterating `QP_Insert()` over the sorted set.
* Stats are obtained from `QP_Stats()`, and an optional dump file is
written with the contents from `QP_Dump()`.
* The format of a CSV file is:
  * `type`: `prefix` for a prefix match, `match` for an exact match
  * `matches`: the number of matches found, which can be > 1 when
    there are common prefixes. 0 for non-matches.
  * `exact`: 1 if an exact match was found, 0 otherwise. Always 1
    on a benchmark for `QP_Lookup()`.
  * `t`: time for the match operation in ns (as for `bench_ph`)
* Benchmarks iterate the input for `QP_Prefixes()`, `QP_Lookup()`, or
both.
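The CSV output described in the list above can be post-processed with a few lines of C. The following is a sketch under stated assumptions: the fields are comma-separated and appear in the order `type`, `matches`, `exact`, `t`, and any line that does not parse (such as a header line, if there is one) is skipped. The names `qp_csv_row`, `parse_csv_row` and `mean_time_ns` are hypothetical and not part of the benchmark code.

```c
#include <stdio.h>
#include <string.h>

/* One CSV row; field order is an assumption based on the list above. */
struct qp_csv_row {
	char type[16];		/* "prefix" or "match" */
	unsigned matches;	/* number of matches found */
	unsigned exact;		/* 1 if an exact match was found */
	double t_ns;		/* time for the match operation in ns */
};

/* Parse one line. Return 1 on success, 0 for a header or malformed line. */
int
parse_csv_row(const char *line, struct qp_csv_row *row)
{
	return sscanf(line, "%15[^,],%u,%u,%lf", row->type, &row->matches,
	    &row->exact, &row->t_ns) == 4;
}

/* Mean time in ns over all rows of the given type, 0.0 if there are none. */
double
mean_time_ns(FILE *in, const char *type)
{
	char line[256];
	struct qp_csv_row row;
	double sum = 0.0;
	unsigned long n = 0;

	while (fgets(line, sizeof(line), in) != NULL) {
		if (!parse_csv_row(line, &row) || strcmp(row.type, type) != 0)
			continue;
		sum += row.t_ns;
		n++;
	}
	return n > 0 ? sum / n : 0.0;
}
```

Calling `mean_time_ns()` once with `"prefix"` and once with `"match"` on the same file (after `rewind()`) gives a quick comparison of `QP_Prefixes()` and `QP_Lookup()` timings from a single run.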
Example:
```
# Benchmark prefix matches for the set in url.txt, using inputs from
# urlpfx_input.txt, with default 1000 iterations and no shuffling.
./bench_qp -m p -i urlpfx_input.txt url.txt
```