Commit 3b72a3bd authored by Nils Goroll

polish documentation

parent c1184875
Pipeline #24 skipped
@@ -8,18 +8,3 @@ DISTCHECK_CONFIGURE_FLAGS = \
EXTRA_DIST = README.rst LICENSE
doc_DATA = README.rst LICENSE
dist_man_MANS = vmod_shard.3
MAINTAINERCLEANFILES = $(dist_man_MANS)
vmod_shard.3: README.rst
%.1 %.2 %.3 %.4 %.5 %.6 %.7 %.8 %.9:
if HAVE_RST2MAN
${RST2MAN} $< $@
else
@echo "========================================"
@echo "You need rst2man installed to make dist"
@echo "========================================"
@false
endif
@@ -25,63 +25,68 @@ Director vmod to implement backend sharding with consistent hashing,
previously also known as the VSLP (Varnish StateLess Persistence)
director.
The basic concept behind this director is:
* Generate a load balancing key, which will be used to select the
backend. The key values should be as uniformly distributed as
possible. For all requests which need to hit the same backend
server, the same key must be generated. For strings, a hash
function can be used to generate the key.
* Select the preferred backend server using an implementation of
consistent hashing (cf. Karger et al, references below), which
ensures that the same backends are always chosen for every key (for
instance hash of incoming URL) in the same order (i.e. if the
preferred host is down, then alternative hosts are always chosen in
a fixed and deterministic, but seemingly random order).
* The consistent hashing circular data structure gets built from hash
values of "ident%d" (default ident being the backend name) for each
backend and for a running number from 1 to n (n is the number of
"replicas").
* For the load balancing key, find the smallest hash value in the
circle that is larger than the key (searching clockwise and wrapping
around as necessary).
* If the backend thus selected is down, choose alternative hosts by
continuing to search clockwise in the circle.
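The selection procedure described in the steps above can be sketched in
C. This is only an illustration of the technique, not the vmod's actual
code: the hash function (FNV-1a), the replica count and all names
(``struct point``, ``build_ring()``, ``pick_backend()``) are assumptions
made up for the example::

    #include <stdint.h>
    #include <stdio.h>
    #include <stdlib.h>

    #define REPLICAS 67     /* assumed number of replicas per backend */

    struct point {
        uint32_t hash;      /* position on the circle */
        int backend;        /* index of the backend owning this point */
    };

    /* FNV-1a, standing in for the vmod's real hash function */
    static uint32_t hash32(const char *s)
    {
        uint32_t h = 2166136261u;
        for (; *s != '\0'; s++) {
            h ^= (unsigned char)*s;
            h *= 16777619u;
        }
        return h;
    }

    static int cmp_point(const void *a, const void *b)
    {
        uint32_t ha = ((const struct point *)a)->hash;
        uint32_t hb = ((const struct point *)b)->hash;
        return (ha < hb) ? -1 : (ha > hb);
    }

    /* Build the circle from hashes of "ident%d" for each backend. */
    static struct point *build_ring(const char **idents, int n_backends,
        int *n_points)
    {
        struct point *ring = malloc(sizeof *ring * n_backends * REPLICAS);
        char buf[256];
        int i, r, p = 0;

        for (i = 0; i < n_backends; i++)
            for (r = 1; r <= REPLICAS; r++, p++) {
                snprintf(buf, sizeof buf, "%s%d", idents[i], r);
                ring[p].hash = hash32(buf);
                ring[p].backend = i;
            }
        qsort(ring, p, sizeof *ring, cmp_point);
        *n_points = p;
        return ring;
    }

    /*
     * For a key, find the smallest point on the circle larger than the
     * key (wrapping around), then continue clockwise past unhealthy
     * backends.
     */
    static int pick_backend(const struct point *ring, int n_points,
        uint32_t key, const int *healthy)
    {
        int i, start = 0;

        while (start < n_points && ring[start].hash <= key)
            start++;
        for (i = 0; i < n_points; i++) {
            const struct point *pt = &ring[(start + i) % n_points];
            if (healthy[pt->backend])
                return pt->backend;
        }
        return -1;          /* no healthy backend left */
    }

    int main(void)
    {
        const char *idents[] = { "backend1", "backend2", "backend3" };
        int healthy[] = { 1, 1, 1 };
        int n_points;
        struct point *ring = build_ring(idents, 3, &n_points);
        uint32_t key = hash32("/some/url");    /* key derived from a string */

        printf("key 0x%08x -> backend %d\n", key,
            pick_backend(ring, n_points, key, healthy));
        free(ring);
        return 0;
    }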
On consistent hashing see:
* http://www8.org/w8-papers/2a-webserver/caching/paper2.html
* http://www.audioscrobbler.net/development/ketama/
* svn://svn.audioscrobbler.net/misc/ketama
* http://en.wikipedia.org/wiki/Consistent_hashing
This technique allows creating shards of backend servers without
keeping any state, and, in particular, without the need to synchronize
state between nodes of a cluster of Varnish servers. Sharding by some
request property (for instance by URL) may help optimize cache
efficiency.
One particular application of sharding is to implement persistence of
backend requests, such that all requests sharing a certain criterion
(such as an IP address or session ID) get forwarded to the same
backend server.
Introduction
============
The shard director selects backends by a key, which can be provided
directly or derived from strings. For the same key, the shard director
will always return the same backend, unless the backend configuration
or health state changes. Conversely, for differing keys, the shard
director will likely choose different backends. In the default
configuration, unhealthy backends are not selected.
The shard director resembles the hash director, but its main advantage
is that, when the backend configuration or health states change, the
association of keys to backends remains as stable as possible.
In addition, the rampup and warmup features can help to further
improve user-perceived response times.
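The stability claim can be made concrete with a small experiment. The
snippet below is illustrative only and builds on ``struct point``,
``build_ring()`` and ``pick_backend()`` from the sketch in the concept
section above (replace that sketch's ``main()`` with this one, so it is
not self-contained): when a fourth backend is added, only the keys whose
points now belong to the new backend move to it, roughly a quarter of
them, while all other keys keep their previous backend::

    int main(void)
    {
        const char *three[] = { "backend1", "backend2", "backend3" };
        const char *four[] = { "backend1", "backend2", "backend3",
            "backend4" };
        int healthy[] = { 1, 1, 1, 1 };
        int np3, np4, moved = 0, i;
        struct point *r3 = build_ring(three, 3, &np3);
        struct point *r4 = build_ring(four, 4, &np4);

        /* compare the backend choice for a sample of evenly spread keys */
        for (i = 0; i < 10000; i++) {
            uint32_t key = (uint32_t)i * 2654435761u;
            if (pick_backend(r3, np3, key, healthy) !=
                pick_backend(r4, np4, key, healthy))
                moved++;
        }
        printf("%d of 10000 keys changed backend\n", moved);
        free(r3);
        free(r4);
        return 0;
    }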
Sharding
--------
This basic technique allows for numerous applications like optimizing
backend server cache efficiency, Varnish clustering or persisting
sessions to servers without keeping any state, and, in particular,
without the need to synchronize state between nodes of a cluster of
Varnish servers:
* Many applications use caches for data objects, so, in a cluster of
application servers, requesting similar objects from the same server
may help to optimize efficiency of such caches.
For example, sharding by URL or some `id` component of the URL has
been shown to drastically improve the efficiency of many content
management systems.
* As a special case of the previous example, in clusters of Varnish
servers without additional request distribution logic, each cache
will need to store all hot objects, so the effective cache size is
approximately the smallest cache size of any server in the cluster.
Sharding allows segregating objects within the cluster such that
each object is only cached on one of the servers (or on one primary
and one backup, on a primary for a long time and on others for a
short time, etc.). Effectively, this will lead to a cache size in the order of
the sum of all individual caches, with the potential to drastically
increase efficiency (scales by the number of servers).
* Another application is to implement persistence of backend requests,
such that all requests sharing a certain criterion (such as an IP
address or session ID) get forwarded to the same backend server.
When used with clusters of Varnish servers, the shard director will,
if otherwise configured equally, make the same decision on all
servers. In other words, requests sharing a common criterion used as
the shard key will be balanced onto the same backend server(s) no
matter which Varnish server handles the request.
The drawbacks are:
* the distribution of requests depends on the number of requests per
key and the uniformity of the distribution of key values. In short,
while this technique may lead to much better efficiency overall, it
may also lead to worse load balancing in specific cases.
* When a backend server becomes unavailable, every persistence
technique has to reselect a new backend server, but this technique
@@ -91,6 +96,7 @@ The drawbacks are:
a selected server for as long as possible (or dictated by a TTL)).
INSTALLATION
============
@@ -22,6 +22,15 @@ nodist_libvmod_shard_la_SOURCES = \
parse_vcc_enums.h \
parse_vcc_enums.c
dist_man_MANS = vmod_shard.3
vmod_shard.3: vmod_shard.man.rst
${RST2MAN} $< $@
vmod_shard.lo: vcc_if.c
vmod_shard.man.rst vcc_if.c: vcc_if.h
parse_vcc_enums.h: parse_vcc_enums.c
parse_vcc_enums.c: gen_enum_parse.pl