uplex-varnish / libvmod-selector

Commit 0c698101, authored Sep 19, 2020 by Geoff Simmons

Start documenting the benchmarks.

parent af2ce0c5

Showing 1 changed file with 130 additions and 0 deletions

src/tests/bench/README.md (new file, 0 → 100644) @ 0c698101

# Benchmarking the search implementations

This directory contains utilities for benchmarking the PH (perfect
hash) and QP (quadbit patricia trie) implementations, used by the VMOD
for the `.match()` and `.hasprefix()` methods, respectively. They are
meant to aid testing the search algorithms and implementations, and do
not measure any overhead added by the VMOD or Varnish.

The directory also contains files with test data and inputs, some of
which are meant to simulate common use cases for the VMOD.

## `bench_ph` -- benchmark perfect hashing

`bench_ph` reads a set of strings from a file or stdin, runs exact
matches against the strings, and reports statistics about the match
operation:

```
bench_ph [-hs] [-c csvfile] [-d dumpfile] [-i inputfile] [-n iterations] [file]
```

`bench_ph` reads the string set from `file`, or from stdin if no file
is specified. Each line of input forms a string in the set, excluding
the terminating newline.

The string set MUST NOT include the same string more than once, but
`bench_ph` does not check for duplicates, and will likely not
terminate if duplicates are present. (The VMOD runs `QP_Insert()`,
which rejects duplicate strings, before building the perfect hash.)

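
Since `bench_ph` does not detect duplicates itself, it may be worth
checking a set file before running the benchmark. The following is a
minimal sketch using standard tools; the helper name `find_dups` and
the sample strings are made up for illustration and are not part of
this directory.

```shell
# Hypothetical helper: print every line that occurs more than once
# in the set file given as $1. LC_ALL=C makes the comparison bytewise.
find_dups() {
    LC_ALL=C sort -- "$1" | uniq -d
}

# Illustrative data only: "foo" appears twice.
set_file=$(mktemp)
printf '%s\n' foo bar baz foo > "$set_file"

dups=$(find_dups "$set_file")
if [ -n "$dups" ]; then
    printf 'duplicate strings found:\n%s\n' "$dups"
fi
```

If duplicates are reported, `LC_ALL=C sort -u` on the same file writes
a deduplicated copy to stdout, which can be piped to `bench_ph`.
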
If `-i inputfile` is specified, then test inputs -- strings to be
matched against the set -- are read from `inputfile`, one string per
line (newlines excluded). If there is no `inputfile` then the strings
from the set are also used as test inputs, in which case every lookup
is a successful match. Lookup misses can only be tested by using an
input file.

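
One way to obtain an input file with both hits and misses is to derive
the misses from the set itself. This is only a sketch: the helper
`make_inputs` and the `_miss` suffix are hypothetical choices, and any
derived string that happens to be in the set is filtered out so that
it is not counted as a miss.

```shell
# Hypothetical helper: write the set strings (hits) followed by
# derived strings that are not in the set (misses).
make_inputs() {
    cat -- "$1"
    # Append a suffix to each string, then drop any result that is
    # an exact member of the set (-F fixed strings, -x whole line).
    sed 's/$/_miss/' "$1" | grep -Fxv -f "$1" || true
}

# Illustrative data only.
set_file=$(mktemp)
printf '%s\n' alpha beta > "$set_file"
make_inputs "$set_file"
```

The output can be redirected to a file and passed with `-i`, with the
original set file given as the `file` argument.
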
`-n iterations` specifies the number of times each test input string
is matched against the set, default 1000. If `-n 0` is specified, then
`bench_ph` builds the set and reports statistics about it, and may
generate a dump file if requested, but does not run any matches.

If `-s` is specified, the test inputs are shuffled before each
iteration. This may reveal effects of locality of reference and
branch prediction on the performance of matches.

Note that the benchmarks may implement an unnatural usage pattern --
every test input is matched against the set exactly the same number of
times. Real-world usage commonly matches some strings more frequently
than others, which may be beneficial for locality and branch
prediction. Performance differences with and without `-s` show that
there can be an impact -- generally, mean match times tend to be
longer for large string sets and/or for sets with long strings. But it
may be necessary to craft input data to simulate real usage patterns,
for example by including some strings more frequently in the input.

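
A skewed input file of this kind can be generated with standard tools.
The sketch below assumes that the first few lines of a file are the
"hot" strings; the helper name `skew_inputs` and its parameters are
hypothetical.

```shell
# Hypothetical helper: print every line of file $1 once, except the
# first $2 lines, which are printed $3 times each.
skew_inputs() {
    awk -v hot="$2" -v reps="$3" '
        {
            n = (NR <= hot) ? reps : 1
            for (i = 0; i < n; i++) print
        }
    ' "$1"
}

# Illustrative data only: "a" is emitted three times, "b" and "c" once.
in_file=$(mktemp)
printf '%s\n' a b c > "$in_file"
skew_inputs "$in_file" 1 3
```

The repeated strings come out adjacent to one another, so the result
is probably most useful either with `-s` or after piping it through a
shuffling tool such as `shuf`.
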
If `-d dumpfile` is specified, then the text dump of the perfect hash
structure produced by `PH_Dump()` is written to `dumpfile`.

If `-c csvfile` is specified, then `csvfile` is written with data
about each match operation in the benchmark. For consistency, the
format is the same as that used for `bench_qp` (see below). The CSV
file has a header line with the column names, with the following
columns:

- `type`: always `match`
- `matches`: 1 for a match, 0 for a miss
- `exact`: 1 for a match, 0 for a miss
- `t`: time for the match operation in nanoseconds

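
The CSV output lends itself to quick post-processing. As a sketch,
the following computes the mean time separately for matches and
misses. It assumes a plain comma-separated file with unquoted header
names as listed above, and looks columns up by name so that their
order does not matter; the helper `csv_summary` and the sample rows
are made up.

```shell
# Hypothetical helper: per-outcome count and mean time (ns) from a
# CSV file with "matches" and "t" columns.
csv_summary() {
    awk -F, '
        NR == 1 { for (i = 1; i <= NF; i++) col[$i] = i; next }
        {
            k = ($(col["matches"]) == 1) ? "match" : "miss"
            n[k]++
            sum[k] += $(col["t"])
        }
        END {
            for (k in n)
                printf "%s n=%d mean_ns=%.1f\n", k, n[k], sum[k] / n[k]
        }
    ' "$1" | sort
}

# Illustrative data only, not real benchmark results.
csv_file=$(mktemp)
cat > "$csv_file" <<'EOF'
type,matches,exact,t
match,1,1,100
match,1,1,200
match,0,0,60
EOF
csv_summary "$csv_file"
# prints:
# match n=2 mean_ns=150.0
# miss n=1 mean_ns=60.0
```
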
If `-h` is specified, `bench_ph` prints a usage message and exits.

A benchmark proceeds as follows, with timings obtained by calling
`clock_gettime(2)` just before and just after each operation to be
measured, using the monotonic clock:

* The string set is read from `file` or stdin, and test inputs are
  read if there is an `inputfile`.
* The perfect hash is generated by calling `PH_Init()` (with seeds
  from `/dev/urandom`) and `PH_Generate()`. The total time for
  `PH_Generate()` is reported, as well as the mean time per string in
  the set.
* If a `dumpfile` was specified, call `PH_Dump()` and write its
  contents to the file.
* Statistics obtained from `PH_Stats()` are printed, as well as the
  time to run `PH_Stats()`.
* Exit if `-n` is set to 0.
* Run the benchmark with the specified number of iterations (default
  1000).
* Report results:
  * The number of match operations executed.
  * The number of matches and misses.
  * The cumulative time for all match operations, and the mean time
    per operation.
  * Throughput, as the number of operations per second.
* Report stats from `getrusage(2)` for the complete run of `bench_ph`:
  * user and system time
  * numbers of voluntary and involuntary context switches as `vcsw`
    and `ivcsw`

Since the match operation is CPU and memory bandwidth intensive, mean
times may be increased if `ivcsw` is high.

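
The reported figures are related by simple arithmetic. The sketch
below assumes that the number of operations is the number of test
inputs times the number of iterations, which follows from each input
being matched the same number of times. The helper `bench_summary`
and the numbers passed to it are hypothetical, not measured results.

```shell
# Hypothetical helper: derive operation count, mean time per
# operation and throughput from
#   $1 = number of test inputs
#   $2 = number of iterations
#   $3 = cumulative match time in nanoseconds
bench_summary() {
    awk -v inputs="$1" -v iters="$2" -v total_ns="$3" 'BEGIN {
        ops = inputs * iters
        printf "ops=%d mean_ns=%.2f ops_per_sec=%.0f\n",
            ops, total_ns / ops, ops / (total_ns / 1e9)
    }'
}

# Made-up numbers: 5000 inputs, 1000 iterations, 0.25 s in total.
bench_summary 5000 1000 250000000
# prints: ops=5000000 mean_ns=50.00 ops_per_sec=20000000
```
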
Examples:
```
# Benchmark the set in set.txt, using inputs from inputs.txt, with 100
# iterations, shuffling inputs on each iteration, and recording
# results in set.csv.
$ ./bench_ph -s -i inputs.txt -n 100 -c set.csv set.txt
# Form a set from 5000 strings chosen randomly from the words list,
# using the set as its own test inputs.
$ shuf -n 5000 /usr/share/dict/words | ./bench_ph
```