Commit da41e697 authored by Geoff Simmons

Start documenting the benchmarks.

parent 3c676ffb

# Benchmarking the search implementations

This directory contains utilities for benchmarking the PH (perfect
hash) and QP (quadbit patricia trie) implementations, used by the VMOD
for the `.match()` and `.hasprefix()` methods, respectively. They are
meant to aid testing the search algorithms and implementations, and do
not measure any overhead added by the VMOD or Varnish.

The directory also contains files with test data and inputs, some of
which are meant to simulate common use cases for the VMOD.

## `bench_ph` -- benchmark perfect hashing

`bench_ph` reads a set of strings from a file or stdin, runs exact
matches against the strings, and reports statistics about the match
operations:
```
bench_ph [-hs] [-c csvfile] [-d dumpfile] [-i inputfile] [-n iterations] [file]
```
`bench_ph` reads the string set from `file`, or from stdin if no file
is specified. Each line of input forms a string in the set, excluding
the terminating newline.

The string set MAY NOT include the same string more than once, but
`bench_ph` does not check for duplicates, and will likely not
terminate if duplicates are present. (The VMOD runs `QP_Insert()`,
which rejects duplicate strings, before building the perfect hash.)
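
For example, a duplicate-free set can be prepared with standard shell
tools before it is handed to `bench_ph`; the file names here are only
placeholders:

```
# Remove duplicates from a candidate string set before benchmarking,
# so that bench_ph never sees a repeated string. sort -u also sorts
# the set, which is irrelevant for exact matching.
$ sort -u candidate-set.txt > set.txt
$ ./bench_ph set.txt
```
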
If `-i inputfile` is specified, then test inputs -- strings to be
matched against the set -- are read from `inputfile`, one string per
line (newlines excluded). If there is no `inputfile` then the strings
from the set are also used as test inputs, in which case every lookup
is a successful match. Lookup misses can only be tested by using an
input file.
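
As a sketch, a test input file that exercises both hits and misses
might be prepared by mixing the set with strings drawn from elsewhere
(file names are placeholders, and `/usr/share/dict/words` is only one
possible source of non-member strings):

```
# Mix the strings of the set with (mostly) unrelated words, shuffle
# the result, and use it as test input so that lookups include misses.
$ { cat set.txt; shuf -n 1000 /usr/share/dict/words; } | shuf > inputs.txt
$ ./bench_ph -i inputs.txt set.txt
```
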
`-n iterations` specifies the number of times each test input string
is matched against the set, default 1000. If `-n 0` is specified, then
`bench_ph` builds the set and reports statistics about it, and may
generate a dump file if requested, but does not run any matches.
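
For example, to inspect the generated structure without timing any
matches (`set.dump` is a placeholder name):

```
# Build the perfect hash for set.txt, report its statistics, and write
# the PH_Dump() output to set.dump, without running any matches.
$ ./bench_ph -n 0 -d set.dump set.txt
```
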
If `-s` is specified, the test inputs are shuffled before each
iteration. This may reveal effects of locality of reference and
branch prediction on the performance of matches.
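
For example, the effect can be estimated by running the same benchmark
twice, once without and once with `-s`, and comparing the reported
mean match times:

```
# Identical runs except for shuffling; compare the mean times reported
# by each.
$ ./bench_ph -i inputs.txt -n 100 set.txt
$ ./bench_ph -s -i inputs.txt -n 100 set.txt
```
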
Note that the benchmarks implement a somewhat unnatural usage pattern --
every test input is matched against the set exactly the same number of
times. Real-world usage commonly matches some strings more frequently
than others, which may be beneficial for locality and branch
prediction. Performance differences with and without `-s` show that
there can be an impact -- generally, shuffling tends to lengthen mean
match times for large string sets and/or for sets with long strings.
But it may be necessary to craft input data to simulate real usage
patterns, for example by including some strings more frequently in the
input.
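
One rough way to do this, sketched here with placeholder file names,
is to repeat a subset of the strings several times in the input file,
so that they are matched more often than the rest:

```
# The first 100 strings of set.txt occur eleven times each in the
# input, all other strings once, roughly simulating a workload in
# which a few strings dominate.
$ { cat set.txt; for i in $(seq 10); do head -100 set.txt; done; } | shuf > skewed.txt
$ ./bench_ph -i skewed.txt -n 100 set.txt
```
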
If `-d dumpfile` is specified, then the text dump of the perfect hash
structure produced by `PH_Dump()` is written to `dumpfile`.

If `-c csvfile` is specified, then `csvfile` is written with data
about each match operation in the benchmark. For consistency, the
format is the same as that used for `bench_qp` (see below). The CSV
file begins with a header line naming the columns:

- `type`: always `match`
- `matches`: 1 for a match, 0 for a miss
- `exact`: 1 for a match, 0 for a miss (for `bench_ph`, always the
  same as `matches`, since every match is exact)
- `t`: time for the match operation in nanoseconds
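
The CSV can then be post-processed with ordinary tools; for instance,
assuming the column order listed above, an `awk` one-liner can report
the mean match time separately for matches and misses:

```
# Mean time in nanoseconds, grouped by the matches column (1 = match,
# 0 = miss); the header line is skipped.
$ awk -F, 'NR>1 { s[$2]+=$4; n[$2]++ } END { for (m in n) print m, s[m]/n[m] }' set.csv
```
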
If `-h` is specified, `bench_ph` prints a usage message and exits.

A benchmark proceeds as follows, with timings obtained by calling
`clock_gettime(2)` just before and just after each operation to be
measured, using the monotonic clock:

* The string set is read from `file` or stdin, and test inputs are
  read if there is an `inputfile`.
* The perfect hash is generated by calling `PH_Init()` (with seeds
  from `/dev/urandom`) and `PH_Generate()`. The total time for
  `PH_Generate()` is reported, as well as the mean time per string in
  the set.
* If a `dumpfile` was specified, call `PH_Dump()` and write its
  contents to the file.
* Statistics obtained from `PH_Stats()` are printed, as well as the
  time to run `PH_Stats()`.
* Exit if `-n` is set to 0.
* Run the benchmark with the specified number of iterations (default
  1000).
* Report results:
  * The number of match operations executed.
  * The number of matches and misses.
  * The cumulative time for all match operations, and the mean time
    per operation.
  * Throughput, as the number of operations per second.
* Report stats from `getrusage(2)` for the complete run of `bench_ph`:
  * user and system time
  * numbers of voluntary and involuntary context switches as `vcsw`
    and `ivcsw`

Since the match operation is CPU- and memory-bandwidth-intensive, mean
match times may increase if `ivcsw` is high.
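
If `ivcsw` turns out to be high, it may help to pin the benchmark to a
single CPU; on Linux, for example, `taskset(1)` can be used (the CPU
number is an arbitrary choice):

```
# Pin bench_ph to CPU 2 to reduce involuntary context switches that
# could perturb the timings.
$ taskset -c 2 ./bench_ph -s -i inputs.txt -n 100 set.txt
```
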
Examples:
```
# Benchmark the set in set.txt, using inputs from inputs.txt, with 100
# iterations, shuffling inputs on each iteration, and recording
# results in set.csv.
$ ./bench_ph -s -i inputs.txt -n 100 -c set.csv set.txt

# Form a set from 5000 strings chosen randomly from the words list,
# using the set as its own test inputs.
$ shuf -n 5000 /usr/share/dict/words | ./bench_ph
```