Documentation overhaul

parent 3370a4c4
......@@ -33,7 +33,8 @@ Makefile.in
/src/vcc_pesi_debug_if.[ch]
/src/vcc_pesi_if.[ch]
/src/vmod_*rst
/src/vmod_*debug*rst
/src/vmod_pesi.rst
/src/VSC_pesi.c
/src/VSC_pesi.h
......
......@@ -9,10 +9,6 @@ EXTRA_DIST = README.rst LICENSE CONTRIBUTING.rst INSTALL.rst
doc_DATA = README.rst LICENSE CONTRIBUTING.rst INSTALL.rst
README.rst: src/vdp_pesi.vcc
$(MAKE) $(AM_MAKEFLAGS) -C src vmod_pesi.man.rst
cp src/vmod_pesi.man.rst README.rst
coverage:
$(MAKE) $(AM_MAKEFLAGS) -C src coverage
......
..
.. NB: This file is machine generated, DO NOT EDIT!
..
.. Edit ./vdp_pesi.vcc and run make instead
..
==============================
Parallel ESI for Varnish-Cache
==============================

.. _Varnish-Cache: https://varnish-cache.org/

This project provides parallel ESI processing for `Varnish-Cache`_ as
a module (VMOD).

.. role:: ref(emphasis)

=========
vmod_pesi
=========

----------------------------------------------------
Varnish Delivery Processor for parallel ESI includes
----------------------------------------------------

:Manual section: 3
PROJECT RESOURCES
=================

* The primary repository is at https://code.uplex.de/uplex-varnish/libvdp-pesi

  This server does not accept user registrations, so please use ...

* the mirror at https://gitlab.com/uplex/varnish/libvdp-pesi for issues,
  merge requests and all other interactions.

SYNOPSIS
========

::

  import pesi;

  # Enable parallel ESI processing in vcl_deliver {}.
  VOID pesi.activate()

  # Set a boolean configuration parameter.
  VOID pesi.set(ENUM, BOOL)

  # Configure workspace pre-allocation for internal variable-sized
  # data structures.
  VOID pesi.workspace_prealloc(BYTES min_free, INT max_nodes)

  # Configure the memory pool used when pre-allocated structures
  # from the workspace are insufficient.
  VOID pesi.pool(INT min, INT max, DURATION max_age)

  # VDP version
  STRING pesi.version()

INTRODUCTION
============

.. _Standard ESI processing: https://varnish-cache.org/docs/trunk/users-guide/esi.html
.. _varnishd(1): https://varnish-cache.org/docs/trunk/reference/varnishd.html
.. _vcl(7): https://varnish-cache.org/docs/trunk/reference/vcl.html
.. _varnishadm(1): https://varnish-cache.org/docs/trunk/reference/varnishadm.html
.. _varnishstat(1): https://varnish-cache.org/docs/trunk/reference/varnishstat.html

`Standard ESI processing`_ in `Varnish-Cache`_ is sequential. In
short, it works like this:

1. Process the (sub)request.

2. For a cache miss or pass, fetch the requested object and parse it
   on the backend side, if ESI parsing is enabled. Store the object in
   a parsed, pre-segmented form.

3. Back on the client side, process the parsed, pre-segmented ESI
   object. For all includes, create a sub-request and start with it at
   step 1.

Simply put, this process is very efficient if step 2 does not need to
be done because the requested object is already in cache.

Conversely, the total time it takes to generate an ESI response is
roughly the sum of all fetch times.

This is where parallel ESI processing can help: in step 3, all the
sub-requests for any particular object are run in parallel, such that
the total time it takes to generate an ESI response at a particular
include level is reduced to the longest of the fetch times.

"At a particular include level" is important, because the optimization
only helps if there are many includes at a particular level: for
example, if object A includes object B, which includes object C, and
no object is cacheable, they still need to be fetched in order: the
request for B can only be started once A is available, and likewise
for B and C.

To summarize:

* Parallel ESI can *substantially* reduce response times for ESI if
  cacheable objects include many uncacheable objects. The maximum
  benefit, compared with standard, serial processing, is achieved in
  cases where all nodes of an ESI tree are cacheable and at least some
  leaves are not.

* If basically all objects are cacheable, parallel ESI only provides a
  relevant benefit on an empty cache or if cache TTLs are low, such
  that cache misses are likely.

Example
-------

Consider this ESI tree, where an object A includes B1 and B2, which,
in turn, include C1 to C3 and C4 to C6, respectively::

           A
        __/ \__
       /       \
      B1       B2
     / | \    / | \
    C1 C2 C3 C4 C5 C6

Let's assume that A, B1 and B2 are cacheable and already in cache, and
all C objects are uncacheable (passes). Let's also assume that C1 to
C6 take their number times 100ms to fetch from the backend - that is,
C1 takes 100ms, C2 200ms etc.

With `Standard ESI processing`_, the total response time will be
roughly 100ms + 200ms + ... + 600ms = 2100ms = 2.1s. If the response
is a web page, the top bit will load relatively fast, the next part
half as fast, the third part again 100ms slower, etc.

With parallel ESI, the total response time will be roughly 600ms =
0.6s. There will still be a delay for each fragment of the page, but
it will be 100ms for each part.
REQUIREMENTS
============

All versions of the VDP require strict ABI compatibility with Varnish:
the VDP must run against the same build of Varnish as the one it was
built against. This means that the "commit id" portion of the Varnish
version string (the SHA1 hash) must be the same at runtime as at build
time.
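
Since the check boils down to comparing the SHA1 commit id in the
version strings, a hedged way to inspect it is shown below (the exact
output format may vary between Varnish builds)::

  varnishd -V
  # e.g.: varnishd (varnish-7.1.0 revision <SHA1 commit id>)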
INSTALLATION
============
See `INSTALL.rst <INSTALL.rst>`_ in the source repository.
TL;DR: QUICK START
==================
The full documentation of this VMOD is in :ref:`vmod_pesi(3)`. If you
are reading this document online, it should be available as
`vmod_pesi.man.rst <src/vmod_pesi.man.rst>`_.

The full documentation is detailed on purpose. It aims to explain well
how this VMOD works and how optimizations can be tuned.
We welcome all users to read the documentation, but many users will
......@@ -89,744 +153,6 @@ differently from standard ESI. Understanding these differences, and how
to monitor and manage resource usage affected by pESI, is a main focus
of the detailed discussion that follows.
DESCRIPTION
===========
.. _standard ESI processing: https://varnish-cache.org/docs/trunk/users-guide/esi.html
VDP pesi is a Varnish Delivery Processor for parallel Edge Side
Includes (ESI). The VDP implements content composition in client
responses as specified by ``<esi>`` directives in the response body,
just as Varnish does with its `standard ESI processing`_. While
standard Varnish processes ESI subrequests serially, in the order in
which the ``<esi>`` directives appear in the response, the pesi VDP
executes the subrequests in parallel. This can lead to a significant
reduction in latency for the complete response, if Varnish has to wait
for backend fetches for more than one of the included requests.
Backend applications that use ESI includes for standard Varnish can be
expected to work without changes with the VDP, provided that they do
not depend on assumptions about the serialization of ESI subrequests.
Serial ESI requests are processed in a predictable order, one after
the other, but the pesi VDP executes them at roughly the same time. A
backend may conceivably receive a request forwarded for the second
include in a response before the first one. If the logic of ESI
composition in a standard Varnish deployment does not depend on the
serial order, then it will work the same way with VDP pesi.
Parallel ESI processing is enabled by invoking |pesi.activate()|_ in
``vcl_deliver {}``::

  import pesi;

  sub vcl_backend_response {
      set beresp.do_esi = true;
  }

  sub vcl_deliver {
      pesi.activate();
  }
Other functions provided by the VDP serve to set configuration
parameters (or return the VDP version string). If your deployment uses
the default configuration, then |pesi.activate()|_ in ``vcl_deliver``
may be the only modification to VCL that you need.
The invocation of |pesi.activate()|_ can of course be subject to
logic in VCL::

  sub vcl_deliver {
      # Use parallel ESI only if the request header X-PESI is present.
      if (req.http.X-PESI) {
          pesi.activate();
      }
  }
But see below for restrictions on the use of |pesi.activate()|_.
All of the computing resources used by the pesi VDP -- threads, storage,
workspace, locks, and so on -- can be configured, either with Varnish
runtime parameters or configuration settings made available by the
pesi VDP. And their usage can be monitored with Varnish statistics. So you
can limit resource usage, and use monitoring tools such as
`varnishstat(1)`_ to ensure efficient parallel ESI processing. For
details see `RESOURCE USAGE, CONFIGURATION AND MONITORING`_ below.
.. _pesi.activate():
VOID activate()
---------------
Enable parallel ESI processing for the client response.
``pesi.activate()`` MUST be called in ``vcl_deliver {}`` only. If it is
called in any other VCL subroutine, VCL failure is invoked (see
`ERRORS`_ below for details).
If ``pesi.activate()`` is called on *any* ESI level (any depth of include
nesting), then it MUST be called on *all* levels of the response. If
``pesi.activate()`` is invoked at some ESI levels but not others, then the
results are undefined, and will very likely lead to a Varnish panic.
It is also safe, for instance, to call ``pesi.activate()`` only if a
request header is present, as in the example shown above; since the
same request headers are set for every ESI subrequest, the result is
the same at every ESI level. But that should *not* be done if you have
logic that unsets the header at some ESI levels but not at
others. Under no circumstances should the invocation of ``pesi.activate()``
depend on the value of ``req.esi_level``, or on ``req.url`` (since
URLs are different at different ESI levels).
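
By contrast, a sketch of what must be avoided (hypothetical VCL,
making activation depend on ``req.esi_level``)::

  sub vcl_deliver {
      # WRONG: pesi runs at ESI level 0 but not below, which has
      # undefined results and will very likely panic Varnish.
      if (req.esi_level == 0) {
          pesi.activate();
      }
  }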
See |pesi.set()|_ below for a way to choose serial
ESI processing for all of the includes in the response at the current
ESI level. Even then, ``pesi.activate()`` must be called in ``vcl_deliver
{}`` in addition to ``pesi.set()``.
As with standard Varnish, ESI processing can be selectively disabled
for a client response, by setting ``resp.do_esi`` to ``false`` in VCL
since version 4.1, or setting ``req.esi`` to ``false`` in VCL 4.0 (see
`vcl(7)`_). The requirement remains: if ESI processing is enabled and
``pesi.activate()`` is called at any ESI level, then both must happen at
all levels.
``pesi.activate()`` has the effect of setting the VCL string variable
``resp.filters``, which is a whitespace-separated list of the names of
delivery processors to be applied to the client response (see
`vcl(7)`_). It configures the correct list of filters for the current
response, analogous to the default filter settings in Varnish when
sequential ESI is in use. These include the ``gunzip`` VDP for
uncompressed responses, and ``range`` for responses to range
requests. ``pesi.activate()`` checks the conditions for which the VDPs are
required, and arranges them in the correct order.
It is possible to manually set or change ``resp.filters`` to enable
parallel ESI, instead of calling ``pesi.activate()``, but this is
advised only for experts. If you do so, use the string ``pesi`` for
this VDP, and do *not* include ``esi``, Varnish's standard ESI VDP, in
the same list with ``pesi``. As with the ``pesi.activate()`` call -- if
``pesi`` appears in ``resp.filters`` for a response at *any* ESI
level, it MUST be in ``resp.filters`` at *all* ESI levels.
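
For those experts, a minimal sketch follows (the exact filter list
depends on the response; for instance ``gunzip`` or ``range`` may also
be required, which ``pesi.activate()`` would arrange automatically)::

  sub vcl_deliver {
      # Enable parallel ESI by hand; never list "esi" together
      # with "pesi".
      set resp.filters = "pesi";
  }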
Notice that all VCL code affecting ESI (such as setting
``resp.do_esi``), gzip (such as changes to
``req.http.Accept-Encoding``) or range processing (such as changes to
``req.http.Range``) must execute before this function is called in
order to have an effect.
Example::

  vcl 4.1;

  import pesi;

  sub vcl_recv {
      # Disable gzipped responses by removing Accept-Encoding.
      unset req.http.Accept-Encoding;
  }

  sub vcl_backend_response {
      set beresp.do_esi = true;
  }

  sub vcl_deliver {
      # If the request header X-Debug-ESI is present, then disable ESI
      # for the current response.
      if (req.http.X-Debug-ESI) {
          set resp.do_esi = false;
      }
      pesi.activate();
  }
.. _pesi.set():
VOID set(ENUM {serial, thread} parameter, [BOOL bool])
------------------------------------------------------
Set a configuration parameter for the VDP, which holds for the current
(sub)request, as documented below. The parameter to be set is
identified by the ENUM ``parameter``. Currently the parameters can
only be set with a boolean value in ``bool`` (but future versions of
this function may allow for setting other data types).
``pesi.set()`` MUST be called in ``vcl_deliver {}`` only; otherwise VCL
failure is invoked (see `ERRORS`_).
The parameters that can be set are currently ``serial`` and ``thread``:
``serial``
----------
Activates serial mode if ``bool`` is ``true``; default is ``false``.
In serial mode, the ESI subrequests processed for includes in the
current response body are processed in serial, in the current thread.
In other words, all ESI subrequests at the next level will be
processed without requesting threads from the thread pool (which
potentially starts new threads, if necessary). This setting only
affects include processing at the current ESI level, not nested
includes at the next level.
It is strongly recommended *not* to use serial mode at ESI level 0
(the top-level request received from a client), because the ESI level
0 thread can send available data to the client concurrently with other
parallel ESI threads.

Serial mode may sensibly be used to reduce overhead and the number of
threads required, without relevant drawbacks,

* at ESI level > 0 *and*

* when the VCL author knows that all objects included by the current
  request are cacheable, and thus are highly likely to lead to cache
  hits.
Example::

  # Activate serial mode at ESI level > 0, if we know that all includes
  # in the response at this level lead to cacheable responses.
  sub vcl_deliver {
      pesi.activate();
      if (req.esi_level > 0 && req.url ~ "^/all/cacheable/includes") {
          pesi.set(serial, true);
      }
  }
.. _thread:
``thread``
----------

Whether to always request a new thread for includes; the default is
``true``.

* ``false``

  Only use a new thread if one is immediately available; otherwise,
  process the include in the same thread.

* ``true``

  Request a new thread, potentially waiting for one to become
  available.

See `THREADS`_ for a detailed discussion.
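
For example (a sketch; whether this is beneficial depends on how the
thread pools are sized for the current load)::

  sub vcl_deliver {
      pesi.activate();
      # Do not wait for a pool thread: if none is immediately
      # available, process includes in the current thread.
      pesi.set(thread, false);
  }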
.. _pesi.workspace_prealloc():
VOID workspace_prealloc(BYTES min_free, INT max_nodes)
------------------------------------------------------
::

  VOID workspace_prealloc(BYTES min_free=4096, INT max_nodes=32)
Configure the maximum amount of workspace used for pesi internal data
structures.
The pesi VDP builds a structure, whose size is roughly proportional to
the size of the ESI tree -- the conceptual tree with the top-level
response at the root, and its includes and all of their nested
includes as branches. The nodes in this structure have a fixed size,
but the number of nodes used by the VDP varies with the size of the
ESI tree.
For each (sub)request, the VDP pre-allocates a constant number of such
nodes in client workspace, and initially uses the pre-allocation for
child nodes of that (sub)request. If more are needed, they are
obtained from a global memory pool as described below. The use of
pre-allocated nodes from workspace is preferred, since it never
requires new system memory allocations (workspaces themselves are
pre-allocated by Varnish), and because they are local to each request,
so locking is never required to access them (but is required for the
memory pool).
The pre-allocation only uses workspace available after ``vcl_deliver
{}`` returns, keeping at least ``min_free`` bytes free, if
possible. Thus, the number of nodes configured by ``max_nodes`` may
not actually be available, unless the ``workspace_client`` parameter
is set sufficiently high.
``pesi.workspace_prealloc()`` configures the pre-allocation. The default
values of its parameters are defaults used by the VDP; that is, the
configuration if ``pesi.workspace_prealloc()`` is never called.
The ``min_free`` parameter sets the minimum amount of space that the
pre-allocation will always leave free in client workspace; if the
targeted number of pre-allocated nodes would result in less free space
than ``min_free`` bytes in workspace, then fewer nodes are
allocated. This ensures that free workspace is always left over for
other VMODs, VCL usage, and so forth. Note that most of the operations
typically requiring workspace have already finished when VDP pesi
makes the pre-allocation, because it starts after `vcl_deliver
{}`. Thus, the reservation is mostly for other VDPs and VMODs using
`PRIV_TOP`. ``min_free`` defaults to 4 KiB.
If other VDPs or VMODs using `PRIV_TOP` report workspace overflows,
``min_free`` should be increased.
The ``max_nodes`` parameter sets the number of nodes to be allocated,
unless the limit imposed by ``min_free`` is exceeded; ``max_nodes``
defaults to 32. ``max_nodes`` MUST be >= 0; otherwise, VCL failure is
invoked (see `ERRORS`_). If ``max_nodes`` is set to 0, then no nodes
are pre-allocated; they are all taken from the memory pool described
below.
Ideally, ``max_nodes`` matches the number of includes any one ESI
object can have, plus the number of fragments before, after and in
between the includes. For all practical purposes, ``max_nodes`` should
be about twice the number of expected ESI includes. However, if the
number of ESI includes varies substantially across objects, it might
be better to use less memory and set ``max_nodes`` according to the
number of includes of a typical object, so that objects with more
includes use the memory pool.
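
As a worked example with hypothetical numbers: an object with up to 30
includes has up to 31 fragments around and between them, so roughly 61
nodes are needed; rounding up::

  sub vcl_init {
      # ~30 includes per object: 30 includes + 31 fragments = 61 nodes.
      pesi.workspace_prealloc(max_nodes=64);
  }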
When ``pesi.workspace_prealloc()`` is called, its configuration becomes
effective immediately for all new requests processed by the VDP. The
configuration remains valid for all instances of VCL, for as long as
the VDP remains loaded; that is, until the last instance of VCL using
the VDP is discarded.
``pesi.workspace_prealloc()`` can be called in ``vcl_init`` to set the
configuration at VCL load time. But you can also write VCL that calls
the function when a request is received by Varnish, for example using
a special URL for system administrators. This is similar to using the
``param.set`` command for `varnishadm(1)`_ to change a Varnish
parameter at runtime. Such a request should be protected, for example
with an ACL and/or Basic Authentication, so that it can be invoked
only by admins. Remember that as soon as such a request is processed
and ``pesi.workspace_prealloc()`` is executed, the changed configuration is
globally valid.
Examples::

  # Configure workspace pre-allocation at VCL load time.
  sub vcl_init {
      pesi.workspace_prealloc(min_free=8k, max_nodes=64);
  }

  # Change the configuration at runtime, when Varnish receives an
  # admin request.
  import pesi;
  import std;

  sub vcl_recv {
      if (req.url ~ "^/admin/pesi_ws") {
          # Reject the request with "403 Forbidden" unless the client
          # IP matches an ACL for admin requests.
          if (client.ip !~ admin_acl) {
              return (synth(403));
          }
          # Set min_free from a GET parameter, if present.
          if (req.url ~ "\bmin_free=\d+[kmgtp]?") {
              # Extract the BYTES parameter.
              set req.http.Tmp-Bytes
                  = regsub(req.url, "^.+\bmin_free=(\d+[kmgtp]?).*$", "\1");
              pesi.workspace_prealloc(std.bytes(req.http.Tmp-Bytes));
          }
          # Set max_nodes from a GET parameter.
          if (req.url ~ "\bmax_nodes=\d+") {
              # Extract the INT parameter.
              set req.http.Tmp-Nodes
                  = regsub(req.url, "^.+\bmax_nodes=(\d+).*$", "\1");
              pesi.workspace_prealloc(max_nodes=std.integer(req.http.Tmp-Nodes));
          }
          # Return status 204 to indicate success.
          return (synth(204));
      }
  }
.. _pesi.pool():
VOID pool(INT min=10, INT max=100, DURATION max_age=10)
-------------------------------------------------------
Configure the memory pool used by the VDP for internal variable-sized
data structures, when more is needed than is provided by the client
workspace pre-allocation described above. The objects in the memory
pool are the nodes used in structures whose size is proportional to
the size of the ESI tree, as discussed above.
The VDP uses the same mechanism that Varnish uses for its memory
pools, and the configuration values have the same meaning and defaults
as the Varnish runtime parameters ``pool_req``, ``pool_sess`` and
``pool_vbo`` (see `varnishd(1)`_). ``min`` and ``max`` control the
size of the pool -- the number of pre-allocated nodes available for
allocation requests. ``max_age`` is the maximum lifetime for nodes in
the pool -- when there are no pending allocation requests, nodes in
the pool that are older than ``max_age`` are freed, down to the limit
imposed by ``min``.
The values of the parameters MUST fulfill the following requirements,
otherwise VCL failure is invoked (see `ERRORS`_):
* ``min`` and ``max`` MUST be both > 0.
* ``max`` MUST be >= ``min``.
* ``max_age`` MUST be >= 0s (and <= one million seconds).
Note that ``max`` is a soft limit. The memory pool satisfies all
allocation requests, even if ``max`` is exceeded when nodes are
returned to the pool. But the pool size will then be reduced to
``max``, without waiting for ``max_age`` to expire.
As with |pesi.workspace_prealloc()|_: when ``pesi.pool()`` is called, the
changed configuration immediately becomes valid (although it may take
some time for the memory pool to adjust to the new values). It remains
valid for as long as the VDP is still loaded, unless ``pesi.pool()`` is
called again. ``pesi.pool()`` may be called in ``vcl_init`` to set a
configuration at VCL load time, but may also be called elsewhere in
VCL, for example to enable changing configurations at runtime using a
special "admin" request.
Examples::

  # Configure the memory pool at VCL load time.
  sub vcl_init {
      pesi.pool(min=50, max=500, max_age=30s);
  }

  # Change the configuration at runtime, when Varnish receives an
  # admin request.
  import pesi;
  import std;

  sub vcl_recv {
      if (req.url ~ "^/admin/pesi_pool") {
          # Protect the call with an ACL, as in the example above.
          if (client.ip !~ admin_acl) {
              return (synth(403));
          }
          # Set max_age from a GET parameter.
          if (req.url ~ "\bmax_age=\d+(\.\d+)?(ms|s|m|h|d|w|y)") {
              # Extract the DURATION parameter.
              set req.http.Tmp-Duration
                  = regsub(req.url,
                      "^.+\bmax_age=(\d+(?:\.\d+)?(?:ms|s|m|h|d|w|y)).*$",
                      "\1");
              pesi.pool(max_age=std.duration(req.http.Tmp-Duration));
          }
          # Set min from a GET parameter.
          if (req.url ~ "\bmin=\d+") {
              # Extract the INT parameter.
              set req.http.Tmp-Min = regsub(req.url, "^.+\bmin=(\d+).*$", "\1");
              pesi.pool(min=std.integer(req.http.Tmp-Min));
          }
          # Extract max from a GET parameter, the same way as for min,
          # not repeated here ...
          # Status 204 indicates success.
          return (synth(204));
      }
  }
.. _pesi.version():
STRING version()
----------------
Return the version string for this VDP.
Example::

  std.log("Using VDP pesi version: " + pesi.version());
ERRORS
======
As documented above, VCL failure is invoked under some of the error
conditions for functions provided by the VDP. VCL failure has the same
results as if ``return(fail)`` is called from a VCL subroutine:
* If the failure occurs in ``vcl_init``, then the VCL load fails with
an error message.
* If the failure occurs in any other subroutine besides ``vcl_synth``,
then a ``VCL_Error`` message is written to the log, and control is
directed immediately to ``vcl_synth``, with ``resp.status`` set to
503 and ``resp.reason`` set to ``"VCL failed"``.
* If the failure occurs in ``vcl_synth``, then ``vcl_synth`` is
aborted, and the response line "503 VCL failed" is sent.
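
As an illustration (hypothetical VCL), calling ``pesi.activate()``
outside of ``vcl_deliver {}`` triggers such a failure::

  sub vcl_recv {
      # VCL failure: pesi.activate() may only be called in vcl_deliver;
      # a VCL_Error is logged and the client receives "503 VCL failed".
      pesi.activate();
  }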
RESOURCE USAGE, CONFIGURATION AND MONITORING
============================================
.. _Transient storage allocator: https://varnish-cache.org/docs/trunk/users-guide/storage-backends.html#transient-storage
To understand the way computing resources are used by the VDP, and
thus how they can be configured and monitored, first note that
response bodies returned for ESI subrequests running in parallel may
have to be buffered. Consider a response body with two
``<esi:include>`` directives, both of which lead to parallel backend
fetches, and the second fetch is finished before the first one. The
second response body must be retained while Varnish waits for the
first one, since the contents of the top-level client response must be
delivered in correct order.
If the second response is added to the cache, then buffering is not
necessary, because it can be retrieved from the cache (this is also
true if it was a cache hit in the first place). But an uncacheable
response must be buffered, until its contents are delivered. The VDP
uses Varnish's `Transient storage allocator`_ for this
purpose. Transient storage only needs to be used while the VDP is
waiting to deliver response contents; space is returned as soon as the
contents have been sent. The amount of Transient storage needed
depends on the size of all uncacheable included responses being
processed at any one time.
The VDP runs ESI subrequests (for each ``<esi:include>`` directive at
every ESI level) in separate threads, unless instructed not to do so
due to the use of either ``pesi.set(serial, true)`` or ``pesi.set(thread,
false)``, as documented above. The threads are requested from the
thread pools managed by Varnish. This means that in most cases, for
well-configured thread pools, the overhead of starting new threads is
not incurred during request processing -- the VDP obtains a thread
that is immediately ready for use.
The VDP uses client workspace at the top-level request (ESI level 0)
for fixed-size internal metadata. It also uses client workspace to
pre-allocate a constant number of nodes in variable-sized structures,
as described in |pesi.workspace_prealloc()|_ above. Together these
make for a fixed-size demand on client workspace when
|pesi.activate()|_ is invoked. The amount of workspace needed varies
on different systems, and depends on the |pesi.workspace_prealloc()|_
setting, but broadly speaking, it can be expected to be less than 10
KiB.
As described for |pesi.pool()|_, the VDP uses a memory pool for nodes
in its internal reconstruction of the ESI tree, if more are needed
than are pre-allocated in workspace. The same mechanism is employed as
for Varnish's memory pools, so the same considerations apply to the
configuration and monitoring of the pool.
For each top-level ESI request using the VDP, two locks are employed;
one to synchronize access to common data structures, and another to
manage tasks being run in different threads. The VDP uses Varnish's
mechanisms for implementing locks, so they can be observed with
``LCK.*`` statistics.
To summarize, the VDP makes use of the following resources:

* Transient storage
* threads from Varnish's thread pools
* client workspace
* the memory pool created for this VDP
* locks
These resources are configured as follows:
.. _Storage Backend: https://varnish-cache.org/docs/trunk/reference/varnishd.html#storage-backend
.. _Storage backends: https://varnish-cache.org/docs/trunk/users-guide/storage-backends.html
.. _Varnish User's Guide: https://varnish-cache.org/docs/trunk/users-guide/index.html
* A maximum size for Transient storage can be set with the ``-s``
command-line option for varnishd, using the name ``Transient`` for
the storage backend (see `Storage Backend`_ in `varnishd(1)`_, and
`Storage backends`_ in the `Varnish User's Guide`_). If no storage
backend with the name ``Transient`` is specified, then Varnish uses
unlimited malloc storage for Transient. Set ``-sTransient`` to set
an upper bound.
  Example::

    varnishd -sTransient=malloc,500m
* Thread pools are configured with the varnishd parameters
``thread_pools``, ``thread_pool_min`` and ``thread_pool_max``, see
`varnishd(1)`_.
  Example::

    varnishd -p thread_pools=4 -p thread_pool_min=500 -p thread_pool_max=1000
* Client workspace is configured with the varnishd parameter
``workspace_client``, see `varnishd(1)`_. The VDP's use of client
workspace can be configured in part by using the
``workspace_prealloc()`` function described above.
  Example::

    varnishd -p workspace_client=128k

  See also the examples for ``pesi.workspace_prealloc()`` above.
* The VDP's memory pool is configured with the ``pool()`` function
described above.
Statistics counters that are relevant to the resource usage of the VDP
are:
.. _varnish-counters(7): https://varnish-cache.org/docs/trunk/reference/varnish-counters.html
.. _SMA: https://varnish-cache.org/docs/trunk/reference/varnish-counters.html#sma-malloc-stevedore-counters
.. _LCK: https://varnish-cache.org/docs/trunk/reference/varnish-counters.html#lck-lock-counters
.. _MEMPOOL: https://varnish-cache.org/docs/trunk/reference/varnish-counters.html#mempool-memory-pool-counters
* ``SMA.Transient.*`` for the use of Transient storage, see the `SMA`_
section in `varnish-counters(7)`_.
* ``MAIN.threads`` shows the current number of threads in all pools.
``MAIN.threads_limited`` shows the number of times threads were
requested from the pools, but the limit imposed by
``thread_pool_max`` was reached. See `varnish-counters(7)`_.
  You may also want to monitor ``MAIN.thread_queue_len``: the length
  of the queue of tasks (including sessions for new client
  connections) waiting for a thread. A persistently non-zero value is
  a sign that thread pools may be too small.
* ``MAIN.ws_client_overflow`` shows the number of times client
workspace was exhausted (see `varnish-counters(7)`_). Workspace
overflow will also cause ``pesi.activate()`` to invoke VCL failure
(see `ERRORS`_).
* The VDP adds custom counters ``LCK.pesi.buf.*`` and
``LCK.pesi.tasks.*``, so that its locks may be monitored; see the
`LCK`_ section in `varnish-counters(7)`_.
Varnish since version 6.2.0 has the ``lck`` flag for the varnishd
parameter ``debug``. When the flag is set, the
``LCK.pesi.*.dbg_busy`` counters are incremented when there is lock
contention, see `varnishd(1)`_.
  Example::

    varnishd -p debug=+lck
* The VDP also adds the ``MEMPOOL.pesi.*`` counters, to monitor the
memory pool described in the documentation for ``pool()`` above.
See the `MEMPOOL`_ section in `varnish-counters(7)`_.
If the mempool routinely shows a relevant number of `live` objects,
consider increasing ``max_nodes`` via |pesi.workspace_prealloc()|_,
keeping in mind that prealloc requires free workspace, so adjusting
``workspace_client`` might also be required.
* The VDP adds another counter ``PESI.no_thread``, which is
incremented when ``set(thread, false)`` has been set as described
above, and an ESI subrequest had to be processed in serial (in the
same thread as for the including request), because no thread was
available from the thread pools.
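
The pesi-specific counters can be inspected together with
`varnishstat(1)`_; a sketch (the glob patterns passed to ``-f`` assume
a reasonably recent varnishstat)::

  varnishstat -1 -f 'MEMPOOL.pesi.*' -f 'LCK.pesi.*' -f PESI.no_thread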
THREADS
=======
For parallel ESI to work as efficiently as possible, it traverses the
ESI tree *breadth first* by default, processing any ESI object
completely, with new threads scheduled for any includes encountered.
Once the top ESI object is processed, available data from a subtree
(an ESI object and anything below) can be sent to the client while
processing of the remaining tree continues. As soon as ESI object
processing is complete, the respective thread is returned to the
thread pool and becomes available for any other Varnish task (except
for the request at esi_level 0, which *has* to wait for completion of
the entire ESI request anyway, and sends data to the client in the
meantime).
With the `thread`_ setting at ``true`` (the default), this is what
happens. But a thread may not be immediately available if the thread
pool is not sufficiently sized for the current load, and thus the
include request may have to be queued.
With the `thread`_ setting at ``false``, if no new thread is
immediately available, include processing happens in the same thread,
as if ``serial`` mode had been activated. While this may sound like
the more sensible option at first, we did not make this the default
for the following reasons:
* Before completion of ESI processing, the subtree below it is not yet
available for delivery to the client because additional VDPs behind
pesi cannot be called from a different thread.
* While processing of the include may take an arbitrarily long time
(for example because it requires a lengthy backend fetch), we know
that the ESI object is fully available in the stevedore (and usually
in memory already) when we parse an include, because streaming is
not supported for ESI. So we know that completing the processing of
the current ESI object will be quick, while descending into a
subtree may take a long time.
* Except for ESI level 0, the current thread will become available as
soon as ESI processing has completed.
* The thread herder may breed new threads and other threads may
terminate, so queuing a thread momentarily is not a bad thing per
se.
In short, keeping the `thread`_ setting at the default ``true`` should
be the right option, but the alternative exists just in case.
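The fallback can be sketched as follows (assuming an otherwise default
configuration)::

  sub vcl_deliver {
      pesi.activate();
      # Fall back to processing an include in the current thread
      # whenever no thread is immediately available from the pools.
      pesi.set(thread, false);
  }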
LIMITATIONS
===========
As emphasized above, ``pesi.activate()`` must be called at all ESI
levels if it is called at any ESI level (and equivalently, if ``pesi``
is added by hand to ``resp.filters``, it must be present in
``resp.filters`` at all ESI levels). This is similar to the fact that
serial ESI processing in standard Varnish cannot be disabled in the
"middle" of an ESI tree. If ``resp.do_esi`` is set to ``false`` (in
VCL 4.1) after ESI processing has already begun, Varnish knows to
ignore it, and ESI processing continues. But the pesi VDP is unable to
check for this condition -- it can only operate at all if
``activate()`` has been called (or ``pesi`` is present in
``resp.filters``).
If VDP pesi has been activated at ESI level 0 but not at another
level, Varnish is likely to infer that standard serial ESI processing
should be invoked for the subrequest. The standard ESI VDP and the
pesi VDP are not compatible with one another, so that this situation
is very likely to lead to a Varnish panic. There is nothing we can do
to prevent that, other than urgently advise users to activate VDP pesi
at all ESI levels, or not at all.
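The safest pattern is therefore to activate unconditionally, which
guarantees activation at every ESI level. A minimal sketch::

  sub vcl_deliver {
      # Runs for the top-level request and for every ESI
      # subrequest alike, hence at all ESI levels.
      pesi.activate();
  }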
.. _vsl(7): https://varnish-cache.org/docs/trunk/reference/vsl.html
The size of the response body as reported by Varnish log records with
the ``ReqAcct`` tag (see `vsl(7)`_) may be slightly different for
different deliveries of the same ESI tree, even though the responses
as viewed by a client are identical. This has to do with the way
fragments in the response are transmitted on the wire to clients --
chunked encoding for HTTP/1, and sequences of DATA frames for
HTTP/2. The overhead for these transmission methods is included in the
accounting of ``ReqAcct``. The "chunking" of the response may differ
at different times, depending on the order of events, and on whether
or not we use (partial) sequential delivery (for example, when no
threads are available).
REQUIREMENTS
============
All versions of the VDP require strict ABI compatibility with Varnish,
meaning that it must run against the same build version of Varnish as
the version against which the VDP was built. This means that the
"commit id" portion of the Varnish version string (the SHA1 hash) must
be the same at runtime as at build time.
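The revision that a running varnishd was built from is part of its
version output, so the check can be done by hand (the exact wording of
the output differs between Varnish releases)::

  varnishd -V

The SHA1 hash in the ``revision`` part of the output must match the
revision of the Varnish source tree that the VDP was built against.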
INSTALLATION
============
See `INSTALL.rst <INSTALL.rst>`_ in the source repository.
SUPPORT
=======

.. _gitlab.com issues: https://gitlab.com/uplex/varnish/libvdp-pesi/-/issues

To report bugs, use `gitlab.com issues`_.

For enquiries about professional service and support, please contact
info@uplex.de\ .

CONTRIBUTING
============

.. _merge requests on gitlab.com: https://gitlab.com/uplex/varnish/libvdp-pesi/-/merge_requests

To contribute to the project, please use `merge requests on gitlab.com`_.

To support the project's development and maintenance, there are
several options:

.. _paypal: https://www.paypal.com/donate/?hosted_button_id=BTA6YE2H5VSXA
.. _github sponsor: https://github.com/sponsors/nigoroll

* Donate money through `paypal`_. If you wish to receive a commercial
  invoice, please add your details (address, email, any requirements
  on the invoice text) to the message sent with your donation.

* Become a `github sponsor`_.

* Contact info@uplex.de to receive a commercial invoice for SWIFT payment.

.. |pesi.activate()| replace:: ``pesi.activate()``
.. |pesi.set()| replace:: ``pesi.set()``
.. |pesi.workspace_prealloc()| replace:: ``pesi.workspace_prealloc()``
.. |pesi.pool()| replace:: ``pesi.pool()``

SEE ALSO
========

.. _Content composition with Edge Side Includes: https://varnish-cache.org/docs/trunk/users-guide/esi.html

* `varnishd(1)`_
* `vcl(7)`_
* `varnishstat(1)`_
* `varnish-counters(7)`_
* `varnishadm(1)`_
* `Content composition with Edge Side Includes`_ in the `Varnish User's Guide`_
COPYRIGHT
=========
.. _varnishstat(1): https://varnish-cache.org/docs/trunk/reference/varnishstat.html
TL;DR: QUICK START
==================
This documentation is deliberately detailed. It aims to explain
thoroughly how this VMOD works and how its optimizations can be tuned.
We welcome all users to read the documentation, but many users will
neither want to nor need to understand the details. Thus, here is what
you *really* need to know:
* See `INSTALL.rst <INSTALL.rst>`_ in the source repository for
installation instructions.
* To use pESI, add to the top of your VCL::
import pesi;
and to your ``sub vcl_deliver {}``, add::
pesi.activate();
This should be added *after* any modification of ``resp.do_esi``,
``req.http.Accept-Encoding``, ``req.http.Range`` or
``resp.filters``, if these exist.
To be safe, ``pesi.activate()`` can be called before any
``return(deliver)`` in ``sub vcl_deliver {}``.
* If you call ``pesi.activate()``, call it unconditionally and on all
ESI levels. Read this documentation for details.
It is possible that your current configuration of system resources,
such as thread pools, workspaces, memory allocation and so forth, will
suffice after this simple change, and will need no further
optimization.
But that is by no means ensured, since pESI uses system resources
differently from standard ESI. Understanding these differences, and how
to monitor and manage resource usage affected by pESI, is a main focus
of the detailed discussion that follows.
DESCRIPTION
===========
Parallel ESI processing is enabled by invoking |pesi.activate()|_ in
``vcl_deliver {}``::
import pesi;
sub vcl_backend_response {
set beresp.do_esi = true;
}
sub vcl_deliver {
pesi.activate();
}
ACKNOWLEDGEMENTS
================
.. _Otto GmbH & Co KG: https://www.otto.de/
Most of the development work on this VMOD in 2019 and 2020 has been
sponsored by `Otto GmbH & Co KG`_.
.. _BoardGameGeek: https://boardgamegeek.com/
The initial release to the public in 2021 has been supported by
`BoardGameGeek`_.
..
.. NB: This file is machine generated, DO NOT EDIT!
..
.. Edit ./vdp_pesi.vcc and run make instead
..
.. role:: ref(emphasis)
=========
vmod_pesi
=========
----------------------------------------------------
Varnish Delivery Processor for parallel ESI includes
----------------------------------------------------
:Manual section: 3
SYNOPSIS
========
::
import pesi;
# Enable parallel ESI processing in vcl_deliver {}.
VOID pesi.activate()
# Set a boolean configuration parameter.
VOID pesi.set(ENUM, BOOL)
# Configure workspace pre-allocation for internal variable-sized
# data structures.
VOID pesi.workspace_prealloc(BYTES min_free, INT max_nodes)
# Configure the memory pool used when pre-allocated structures
# from the workspace are insufficient.
VOID pesi.pool(INT min, INT max, DURATION max_age)
# VDP version
STRING pesi.version()
.. _varnishd(1): https://varnish-cache.org/docs/trunk/reference/varnishd.html
.. _vcl(7): https://varnish-cache.org/docs/trunk/reference/vcl.html
.. _varnishadm(1): https://varnish-cache.org/docs/trunk/reference/varnishadm.html
.. _varnishstat(1): https://varnish-cache.org/docs/trunk/reference/varnishstat.html
DESCRIPTION
===========
.. _standard ESI processing: https://varnish-cache.org/docs/trunk/users-guide/esi.html
VDP pesi is a Varnish Delivery Processor for parallel Edge Side
Includes (ESI). The VDP implements content composition in client
responses as specified by ``<esi>`` directives in the response body,
just as Varnish does with its `standard ESI processing`_. While
standard Varnish processes ESI subrequests serially, in the order in
which the ``<esi>`` directives appear in the response, the pesi VDP
executes the subrequests in parallel. This can lead to a significant
reduction in latency for the complete response, if Varnish has to wait
for backend fetches for more than one of the included requests.
Backend applications that use ESI includes for standard Varnish can be
expected to work without changes with the VDP, provided that they do
not depend on assumptions about the serialization of ESI subrequests.
Serial ESI requests are processed in a predictable order, one after
the other, but the pesi VDP executes them at roughly the same time. A
backend may conceivably receive a request forwarded for the second
include in a response before the first one. If the logic of ESI
composition in a standard Varnish deployment does not depend on the
serial order, then it will work the same way with VDP pesi.
Parallel ESI processing is enabled by invoking |pesi.activate()|_ in
``vcl_deliver {}``::
import pesi;
sub vcl_backend_response {
set beresp.do_esi = true;
}
sub vcl_deliver {
pesi.activate();
}
Other functions provided by the VDP serve to set configuration
parameters (or return the VDP version string). If your deployment uses
the default configuration, then |pesi.activate()|_ in ``vcl_deliver``
may be the only modification to VCL that you need.
The invocation of |pesi.activate()|_ can of course be subject to
logic in VCL::
sub vcl_deliver {
# Use parallel ESI only if the request header X-PESI is present.
if (req.http.X-PESI) {
pesi.activate();
}
}
But see below for restrictions on the use of |pesi.activate()|_.
All of the computing resources used by the pesi VDP -- threads, storage,
workspace, locks, and so on -- can be configured, either with Varnish
runtime parameters or configuration settings made available by the
pesi VDP. And their usage can be monitored with Varnish statistics. So you
can limit resource usage, and use monitoring tools such as
`varnishstat(1)`_ to ensure efficient parallel ESI processing. For
details see `RESOURCE USAGE, CONFIGURATION AND MONITORING`_ below.
.. _pesi.activate():
VOID activate()
---------------
Enable parallel ESI processing for the client response.
``pesi.activate()`` MUST be called in ``vcl_deliver {}`` only. If it is
called in any other VCL subroutine, VCL failure is invoked (see
`ERRORS`_ below for details).
If ``pesi.activate()`` is called on *any* ESI level (any depth of include
nesting), then it MUST be called on *all* levels of the response. If
``pesi.activate()`` is invoked at some ESI levels but not others, then the
results are undefined, and will very likely lead to a Varnish panic.
The simplest way to meet this requirement is to call
``pesi.activate()`` unconditionally. It is also safe, for instance, to
call it only if a request header is present, as in the example shown
above; since the
same request headers are set for every ESI subrequest, the result is
the same at every ESI level. But that should *not* be done if you have
logic that unsets the header at some ESI levels but not at
others. Under no circumstances should the invocation of ``pesi.activate()``
depend on the value of ``req.esi_level``, or on ``req.url`` (since
URLs are different at different ESI levels).
See |pesi.set()|_ below for a way to choose serial
ESI processing for all of the includes in the response at the current
ESI level. Even then, ``pesi.activate()`` must be called in ``vcl_deliver
{}`` in addition to ``pesi.set()``.
As with standard Varnish, ESI processing can be selectively disabled
for a client response, by setting ``resp.do_esi`` to ``false`` in VCL
since version 4.1, or setting ``req.esi`` to ``false`` in VCL 4.0 (see
`vcl(7)`_). The requirement remains: if ESI processing is enabled and
``pesi.activate()`` is called at any ESI level, then both must happen at
all levels.
``pesi.activate()`` has the effect of setting the VCL string variable
``resp.filters``, which is a whitespace-separated list of the names of
delivery processors to be applied to the client response (see
`vcl(7)`_). It configures the correct list of filters for the current
response, analogous to the default filter settings in Varnish when
sequential ESI is in use. These include the ``gunzip`` VDP for
uncompressed responses, and ``range`` for responses to range
requests. ``pesi.activate()`` checks the conditions for which the VDPs are
required, and arranges them in the correct order.
It is possible to manually set or change ``resp.filters`` to enable
parallel ESI, instead of calling ``pesi.activate()``, but that is
advised only for experts. If you do so, use the string ``pesi`` for this
VDP, and do *not* include ``esi``, for Varnish's standard ESI VDP, in
the same list with ``pesi``. As with the ``pesi.activate()`` call -- if
``pesi`` appears in ``resp.filters`` for a response at *any* ESI
level, it MUST be in ``resp.filters`` at *all* ESI levels.
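For illustration only -- a hand-set filter list might look like this,
assuming an uncompressed response to a non-range request (the required
list depends on the response, which is why |pesi.activate()|_ is
preferred)::

  sub vcl_deliver {
      # Expert use only: select the pesi VDP by hand. Never combine
      # "esi" and "pesi" in the same filter list.
      set resp.filters = "pesi";
  }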
Notice that all VCL code affecting ESI (such as setting
``resp.do_esi``), gzip (such as changes to
``req.http.Accept-Encoding``) or range processing (such as changes to
``req.http.Range``) must execute before this function is called to
have an effect.
Example::
vcl 4.1;
import pesi;
sub vcl_recv {
# Disable gzipped responses by removing Accept-Encoding.
unset req.http.Accept-Encoding;
}
sub vcl_backend_response {
set beresp.do_esi = true;
}
sub vcl_deliver {
# If the request header X-Debug-ESI is present, then disable ESI
# for the current response.
if (req.http.X-Debug-ESI) {
set resp.do_esi = false;
}
pesi.activate();
}
.. _pesi.set():
VOID set(ENUM {serial, thread} parameter, [BOOL bool])
------------------------------------------------------
Set a configuration parameter for the VDP, which holds for the current
(sub)request, as documented below. The parameter to be set is
identified by the ENUM ``parameter``. Currently the parameters can
only be set with a boolean value in ``bool`` (but future versions of
this function may allow for setting other data types).
``pesi.set()`` MUST be called in ``vcl_deliver {}`` only; otherwise VCL
failure is invoked (see `ERRORS`_).
The parameters that can be set are currently ``serial`` and ``thread``:
``serial``
----------
Activates serial mode if ``bool`` is ``true``; default is ``false``.
In serial mode, the ESI subrequests processed for includes in the
current response body are processed in serial, in the current thread.
In other words, all ESI subrequests at the next level will be
processed without requesting threads from the thread pool (which
potentially starts new threads, if necessary). This setting only
affects include processing at the current ESI level, not nested
includes at the next level.
It is strongly recommended to *not* use serial mode from ESI level 0
(the top level request received from a client), because the ESI level
0 thread can send available data to the client concurrently with other
parallel ESI threads.
Serial mode may sensibly be used to reduce overhead and the number of
threads required, without relevant drawbacks:

* at ESI level > 0, *and*
* when the VCL author knows that all objects included by the current
request are cacheable, and thus are highly likely to lead to cache
hits.
Example::
# Activate serial mode at ESI level > 0, if we know that all includes
# in the response at this level lead to cacheable responses.
sub vcl_deliver {
pesi.activate();
if (req.esi_level > 0 && req.url ~ "^/all/cacheable/includes") {
pesi.set(serial, true);
}
}
.. _thread:
``thread``
----------
Whether we always request a new thread for includes, default is
``true``.
* ``false``
Only use a new thread if immediately available, process the include
in the same thread otherwise.
* ``true``
Request a new thread, potentially waiting for one to become
available.
See `THREADS`_ for a detailed discussion.
.. _pesi.workspace_prealloc():
VOID workspace_prealloc(BYTES min_free, INT max_nodes)
------------------------------------------------------
::
VOID workspace_prealloc(BYTES min_free=4096, INT max_nodes=32)
Configure the maximum amount of workspace used for pesi internal data
structures.
The pesi VDP builds a structure, whose size is roughly proportional to
the size of the ESI tree -- the conceptual tree with the top-level
response at the root, and its includes and all of their nested
includes as branches. The nodes in this structure have a fixed size,
but the number of nodes used by the VDP varies with the size of the
ESI tree.
For each (sub)request, the VDP pre-allocates a constant number of such
nodes in client workspace, and initially uses the pre-allocation for
child nodes of that (sub)request. If more are needed, they are
obtained from a global memory pool as described below. The use of
pre-allocated nodes from workspace is preferred, since it never
requires new system memory allocations (workspaces themselves are
pre-allocated by Varnish), and because they are local to each request,
so locking is never required to access them (but is required for the
memory pool).
The pre-allocation only uses workspace available after ``vcl_deliver
{}`` returns, keeping at least ``min_free`` bytes free, if
possible. Thus, the number of nodes configured by ``max_nodes`` may
not actually be available, unless the ``workspace_client`` parameter
is set sufficiently high.
``pesi.workspace_prealloc()`` configures the pre-allocation. The default
values of its parameters are defaults used by the VDP; that is, the
configuration if ``pesi.workspace_prealloc()`` is never called.
The ``min_free`` parameter sets the minimum amount of space that the
pre-allocation will always leave free in client workspace; if the
targeted number of pre-allocated nodes would result in less free space
than ``min_free`` bytes in workspace, then fewer nodes are
allocated. This ensures that free workspace is always left over for
other VMODs, VCL usage, and so forth. Note that most of the operations
typically requiring workspace have already finished when VDP pesi
makes the pre-allocation, because it starts after ``vcl_deliver
{}``. Thus, the reservation is mostly for other VDPs and VMODs using
``PRIV_TOP``. ``min_free`` defaults to 4 KiB.

If other VDPs or VMODs using ``PRIV_TOP`` report workspace overflows,
``min_free`` should be increased.
The ``max_nodes`` parameter sets the number of nodes to be allocated,
unless the limit imposed by ``min_free`` is exceeded; ``max_nodes``
defaults to 32. ``max_nodes`` MUST be >= 0; otherwise, VCL failure is
invoked (see `ERRORS`_). If ``max_nodes`` is set to 0, then no nodes
are pre-allocated; they are all taken from the memory pool described
below.
Ideally, ``max_nodes`` matches the number of includes any one ESI
object can have plus the number of fragments before, after and
in between the includes. For all practical purposes, ``max_nodes``
should match twice the number of expected ESI includes. However, if
the number of ESI includes across objects varies substantially, it
might be better to use less memory and set ``max_nodes`` according to
the number of includes of a typical object, so that objects with
more includes use the memory pool.
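As a worked example with hypothetical numbers: an ESI object with
``n`` includes consists of at most ``n`` include nodes plus ``n + 1``
surrounding fragments, so about ``2n + 1`` nodes in total::

  # Hypothetical sizing: typical responses carry up to 15 includes,
  # so 2 * 15 + 1 = 31 nodes suffice; configure 32 to be safe.
  sub vcl_init {
      pesi.workspace_prealloc(min_free=4k, max_nodes=32);
  }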
When ``pesi.workspace_prealloc()`` is called, its configuration becomes
effective immediately for all new requests processed by the VDP. The
configuration remains valid for all instances of VCL, for as long as
the VDP remains loaded; that is, until the last instance of VCL using
the VDP is discarded.
``pesi.workspace_prealloc()`` can be called in ``vcl_init`` to set the
configuration at VCL load time. But you can also write VCL that calls
the function when a request is received by Varnish, for example using
a special URL for system administrators. This is similar to using the
``param.set`` command for `varnishadm(1)`_ to change a Varnish
parameter at runtime. Such a request should be protected, for example
with an ACL and/or Basic Authentication, so that it can be invoked
only by admins. Remember that as soon as such a request is processed
and ``pesi.workspace_prealloc()`` is executed, the changed configuration is
globally valid.
Examples::
# Configure workspace pre-allocation at VCL load time.
sub vcl_init {
pesi.workspace_prealloc(min_free=8k, max_nodes=64);
}
# Change the configuration at runtime, when Varnish receives an
# admin request.
import pesi;
import std;
sub vcl_recv {
if (req.url ~ "^/admin/pesi_ws") {
# Reject the request with "403 Forbidden" unless the client
# IP matches an ACL for admin requests.
if (client.ip !~ admin_acl) {
return (synth(403));
}
# Set min_free from a GET parameter, if present.
if (req.url ~ "\bmin_free=\d+[kmgtp]?") {
# Extract the BYTES parameter.
set req.http.Tmp-Bytes
= regsub(req.url, "^.+\bmin_free=(\d+[kmgtp]?).*$", "\1");
pesi.workspace_prealloc(std.bytes(req.http.Tmp-Bytes));
}
# Set max_nodes from a GET parameter.
if (req.url ~ "\bmax_nodes=\d+") {
# Extract the INT parameter.
set req.http.Tmp-Nodes
= regsub(req.url, "^.+\bmax_nodes=(\d+).*$", "\1");
pesi.workspace_prealloc(max_nodes=std.integer(req.http.Tmp-Nodes));
}
# Return status 204 to indicate success.
return (synth(204));
}
}
.. _pesi.pool():
VOID pool(INT min=10, INT max=100, DURATION max_age=10)
-------------------------------------------------------
Configure the memory pool used by the VDP for internal variable-sized
data structures, when more is needed than is provided by the client
workspace pre-allocation described above. The objects in the memory
pool are the nodes used in structures whose size is proportional to
the size of the ESI tree, as discussed above.
The VDP uses the same mechanism that Varnish uses for its memory
pools, and the configuration values have the same meaning and defaults
as the Varnish runtime parameters ``pool_req``, ``pool_sess`` and
``pool_vbo`` (see `varnishd(1)`_). ``min`` and ``max`` control the
size of the pool -- the number of pre-allocated nodes available for
allocation requests. ``max_age`` is the maximum lifetime for nodes in
the pool -- when there are no pending allocation requests, nodes in
the pool that are older than ``max_age`` are freed, down to the limit
imposed by ``min``.
The values of the parameters MUST fulfill the following requirements,
otherwise VCL failure is invoked (see `ERRORS`_):
* ``min`` and ``max`` MUST be both > 0.
* ``max`` MUST be >= ``min``.
* ``max_age`` MUST be >= 0s (and <= one million seconds).
Note that ``max`` is a soft limit. The memory pool satisfies all
allocation requests, even if ``max`` is exceeded when nodes are
returned to the pool. But the pool size will then be reduced to
``max``, without waiting for ``max_age`` to expire.
As with |pesi.workspace_prealloc()|_: when ``pesi.pool()`` is called, the
changed configuration immediately becomes valid (although it may take
some time for the memory pool to adjust to the new values). It remains
valid for as long as the VDP is still loaded, unless ``pesi.pool()`` is
called again. ``pesi.pool()`` may be called in ``vcl_init`` to set a
configuration at VCL load time, but may also be called elsewhere in
VCL, for example to enable changing configurations at runtime using a
special "admin" request.
Examples::
# Configure the memory pool at VCL load time.
sub vcl_init {
pesi.pool(min=50, max=500, max_age=30s);
}
# Change the configuration at runtime, when Varnish receives an
# admin request.
import pesi;
import std;
sub vcl_recv {
if (req.url ~ "^/admin/pesi_pool") {
# Protect the call with an ACL, as in the example above.
if (client.ip !~ admin_acl) {
return (synth(403));
}
# Set max_age from a GET parameter.
if (req.url ~ "\bmax_age=\d+(\.\d+)?(ms|s|m|h|d|w|y)") {
# Extract the DURATION parameter.
set req.http.Tmp-Duration
= regsub(req.url,
"^.+\bmax_age=(\d+(?:\.\d+)?(?:ms|s|m|h|d|w|y)).*$",
"\1");
pesi.pool(max_age=std.duration(req.http.Tmp-Duration));
}
# Set min from a GET parameter.
if (req.url ~ "\bmin=\d+") {
# Extract the INT parameter.
set req.http.Tmp-Min = regsub(req.url, "^.+\bmin=(\d+).*$", "\1");
pesi.pool(min=std.integer(req.http.Tmp-Min));
}
# Extract max from a GET parameter, the same way as for min,
# not repeated here ...
# Status 204 indicates success.
return (synth(204));
}
}
.. _pesi.version():
STRING version()
----------------
Return the version string for this VDP.
Example::
std.log("Using VDP pesi version: " + pesi.version());
ERRORS
======
As documented above, VCL failure is invoked under some of the error
conditions for functions provided by the VDP. VCL failure has the same
results as if ``return(fail)`` is called from a VCL subroutine:
* If the failure occurs in ``vcl_init``, then the VCL load fails with
an error message.
* If the failure occurs in any other subroutine besides ``vcl_synth``,
then a ``VCL_Error`` message is written to the log, and control is
directed immediately to ``vcl_synth``, with ``resp.status`` set to
503 and ``resp.reason`` set to ``"VCL failed"``.
* If the failure occurs in ``vcl_synth``, then ``vcl_synth`` is
aborted, and the response line "503 VCL failed" is sent.
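For instance, this deliberately wrong, hypothetical snippet triggers
the failure path, because |pesi.set()|_ may only be called in
``vcl_deliver {}``::

  sub vcl_recv {
      # WRONG: logs a VCL_Error and makes Varnish deliver a
      # synthetic "503 VCL failed" response.
      pesi.set(serial, true);
  }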
RESOURCE USAGE, CONFIGURATION AND MONITORING
============================================
.. _Transient storage allocator: https://varnish-cache.org/docs/trunk/users-guide/storage-backends.html#transient-storage
To understand the way computing resources are used by the VDP, and
thus how they can be configured and monitored, first note that
response bodies returned for ESI subrequests running in parallel may
have to be buffered. Consider a response body with two
``<esi:include>`` directives, both of which lead to parallel backend
fetches, and the second fetch is finished before the first one. The
second response body must be retained while Varnish waits for the
first one, since the contents of the top-level client response must be
delivered in correct order.
If the second response is added to the cache, then buffering is not
necessary, because it can be retrieved from the cache (this is also
true if it was a cache hit in the first place). But an uncacheable
response must be buffered, until its contents are delivered. The VDP
uses Varnish's `Transient storage allocator`_ for this
purpose. Transient storage only needs to be used while the VDP is
waiting to deliver response contents; space is returned as soon as the
contents have been sent. The amount of Transient storage needed
depends on the size of all uncacheable included responses being
processed at any one time.
The VDP runs ESI subrequests (for each ``<esi:include>`` directive at
every ESI level) in separate threads, unless instructed not to do so
due to the use of either ``pesi.set(serial, true)`` or ``pesi.set(thread,
false)``, as documented above. The threads are requested from the
thread pools managed by Varnish. This means that in most cases, for
well-configured thread pools, the overhead of starting new threads is
not incurred during request processing -- the VDP obtains a thread
that is immediately ready for use.
The VDP uses client workspace at the top-level request (ESI level 0)
for fixed-sized internal metadata. It also uses client workspace to
pre-allocate a constant number of nodes in variable-sized structures,
as described in |pesi.workspace_prealloc()|_ above. Together these
make for a fixed-sized demand on client workspace, when
|pesi.activate()|_ is invoked. The size of the space needed from
workspace varies on different systems, and depends on the
|pesi.workspace_prealloc()|_ settings, but broadly speaking, it can be
expected to be less than 10 KiB.
As described for |pesi.pool()|_, the VDP uses a memory pool for
nodes in its internal reconstruction of the ESI tree, if more are
needed than are pre-allocated in workspace. The same mechanism is
employed as for Varnish's memory pools, so the same considerations apply
to the configuration and monitoring of the pool.
For each top-level ESI request using the VDP, two locks are employed:
one to synchronize access to common data structures, and another to
manage tasks being run in different threads. The VDP uses Varnish's
mechanisms for implementing locks, so they can be observed with
``LCK.*`` statistics.
To summarize, the VDP makes use of the following resources:
* Transient storage
* threads from Varnish's thread pools
* client workspace
* the memory pool created for this VDP
* locks
These resources are configured as follows:
.. _Storage Backend: https://varnish-cache.org/docs/trunk/reference/varnishd.html#storage-backend
.. _Storage backends: https://varnish-cache.org/docs/trunk/users-guide/storage-backends.html
.. _Varnish User's Guide: https://varnish-cache.org/docs/trunk/users-guide/index.html
* A maximum size for Transient storage can be set with the ``-s``
command-line option for varnishd, using the name ``Transient`` for
the storage backend (see `Storage Backend`_ in `varnishd(1)`_, and
`Storage backends`_ in the `Varnish User's Guide`_). If no storage
backend with the name ``Transient`` is specified, then Varnish uses
unlimited malloc storage for Transient. Specify ``-sTransient`` with a
size limit to impose an upper bound.
Example::
varnishd -sTransient=malloc,500m
* Thread pools are configured with the varnishd parameters
``thread_pools``, ``thread_pool_min`` and ``thread_pool_max``, see
`varnishd(1)`_.
Example::
varnishd -p thread_pools=4 -p thread_pool_min=500 -p thread_pool_max=1000
* Client workspace is configured with the varnishd parameter
``workspace_client``, see `varnishd(1)`_. The VDP's use of client
workspace can be configured in part by using the
``workspace_prealloc()`` function described above.
Example::
varnishd -p workspace_client=128k
# See also the examples for pesi.workspace_prealloc() above.
* The VDP's memory pool is configured with the ``pool()`` function
described above.
Statistics counters that are relevant to the resource usage of the VDP
are:
.. _varnish-counters(7): https://varnish-cache.org/docs/trunk/reference/varnish-counters.html
.. _SMA: https://varnish-cache.org/docs/trunk/reference/varnish-counters.html#sma-malloc-stevedore-counters
.. _LCK: https://varnish-cache.org/docs/trunk/reference/varnish-counters.html#lck-lock-counters
.. _MEMPOOL: https://varnish-cache.org/docs/trunk/reference/varnish-counters.html#mempool-memory-pool-counters
* ``SMA.Transient.*`` for the use of Transient storage, see the `SMA`_
section in `varnish-counters(7)`_.
* ``MAIN.threads`` shows the current number of threads in all pools.
``MAIN.threads_limited`` shows the number of times threads were
requested from the pools, but the limit imposed by
``thread_pool_max`` was reached. See `varnish-counters(7)`_.
You may also want to monitor ``MAIN.thread_queue_len``, the length of
the queue of sessions that are waiting for a thread before Varnish can
accept new client connections. A persistently non-zero value is a
sign that the thread pools may be too small.
* ``MAIN.ws_client_overflow`` shows the number of times client
workspace was exhausted (see `varnish-counters(7)`_). Workspace
overflow will also cause ``pesi.activate()`` to invoke VCL failure
(see `ERRORS`_).
* The VDP adds custom counters ``LCK.pesi.buf.*`` and
``LCK.pesi.tasks.*``, so that its locks may be monitored; see the
`LCK`_ section in `varnish-counters(7)`_.
Varnish since version 6.2.0 has the ``lck`` flag for the varnishd
parameter ``debug``. When the flag is set, the
``LCK.pesi.*.dbg_busy`` counters are incremented when there is lock
contention, see `varnishd(1)`_.
Example::

  varnishd -p debug=+lck
* The VDP also adds the ``MEMPOOL.pesi.*`` counters, to monitor the
memory pool described in the documentation for ``pool()`` above.
See the `MEMPOOL`_ section in `varnish-counters(7)`_.
If the mempool routinely shows a relevant number of *live* objects,
consider increasing ``max_nodes`` via |pesi.workspace_prealloc()|_,
keeping in mind that prealloc requires free workspace, so adjusting
``workspace_client`` might also be required.
* The VDP adds another counter, ``PESI.no_thread``, which is
  incremented when ``set(thread, false)`` has been called as described
  above and an ESI subrequest had to be processed serially (in the
  same thread as the including request), because no thread was
  available from the thread pools.
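The counters above can be watched at runtime with ``varnishstat``,
which accepts glob patterns for counter names via ``-f``. As a sketch
(the exact counter names depend on the Varnish and VDP versions in
use)::

  # Watch Transient storage use and the pesi-specific counters.
  varnishstat -f SMA.Transient.* -f LCK.pesi.* -f MEMPOOL.pesi.* -f PESI.*

  # One-shot JSON output, e.g. for monitoring scripts.
  varnishstat -1 -j -f PESI.no_thread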
THREADS
=======
For parallel ESI to work as efficiently as possible, it traverses the
ESI tree *breadth first* by default, processing any ESI object
completely, with new threads scheduled for any includes encountered.
Once the top ESI object is processed, available data from a subtree
(an ESI object and anything below) can be sent to the client while
processing of the remaining tree continues. As soon as ESI object
processing is complete, the respective thread will be returned to the
thread pool and become available for any other varnish task (except
for the request for esi_level 0, which *has* to wait for completion of
the entire ESI request anyway and will send data to the client in the
meantime).
With the `thread`_ setting to ``true`` (the default), this is what
happens. But a thread may not be immediately available if the thread
pool is not sufficiently sized for the current load, and thus the
include request may have to be queued.
With the `thread`_ setting at ``false``, if no new thread is
immediately available, the include is processed in the same thread, as
if ``serial`` mode had been activated. While this may sound like the
more sensible option at first, we did not make this the default for
the following reasons:
* Before completion of ESI processing, the subtree below it is not yet
available for delivery to the client because additional VDPs behind
pesi cannot be called from a different thread.
* While processing of the include may take an arbitrarily long time
(for example because it requires a lengthy backend fetch), we know
that the ESI object is fully available in the stevedore (and usually
in memory already) when we parse an include, because streaming is
not supported for ESI. So we know that completing the processing of
the current ESI object will be quick, while descending into a
subtree may take a long time.
* Except for ESI level 0, the current thread will become available as
soon as ESI processing has completed.
* The thread herder may breed new threads and other threads may
terminate, so queuing a thread momentarily is not a bad thing per
se.
In short, keeping the `thread`_ setting at the default ``true`` should
be the right option, but the alternative exists just in case.
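For illustration, opting into the non-default behavior described above
could look like this in VCL (a sketch, using the ``thread`` parameter
shown in the synopsis)::

  vcl 4.1;

  import pesi;

  sub vcl_deliver {
      # Process an include in the current thread, as in serial
      # mode, whenever no new thread is immediately available.
      pesi.set(thread, false);
      pesi.activate();
  }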
LIMITATIONS
===========
As emphasized above, ``pesi.activate()`` must be called at all ESI
levels if it is called at any ESI level (and equivalently, if ``pesi``
is added by hand to ``resp.filters``, it must be present in
``resp.filters`` at all ESI levels). This is similar to the fact that
serial ESI processing in standard Varnish cannot be disabled in the
"middle" of an ESI tree. If ``resp.do_esi`` is set to ``false`` (in
VCL 4.1) after ESI processing has already begun, Varnish knows to
ignore it, and ESI processing continues. But the pesi VDP is unable to
check for this condition -- it can only operate at all if
``activate()`` has been called (or ``pesi`` is present in
``resp.filters``).
If VDP pesi has been activated at ESI level 0 but not at another
level, Varnish is likely to infer that standard serial ESI processing
should be invoked for the subrequest. The standard ESI VDP and the
pesi VDP are not compatible with one another, so this situation is
very likely to lead to a Varnish panic. There is nothing we can do
to prevent that, other than urgently advise users to activate VDP pesi
at all ESI levels, or not at all.
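A simple way to satisfy this requirement is to call ``activate()``
unconditionally in ``vcl_deliver``, which runs for the client request
and for every ESI subrequest; for example::

  vcl 4.1;

  import pesi;

  sub vcl_deliver {
      # vcl_deliver runs at every ESI level, so pesi is
      # activated for the entire ESI tree.
      pesi.activate();
  }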
.. _vsl(7): https://varnish-cache.org/docs/trunk/reference/vsl.html
The size of the response body as reported by Varnish log records with
the ``ReqAcct`` tag (see `vsl(7)`_) may be slightly different for
different deliveries of the same ESI tree, even though the responses
as viewed by a client are identical. This has to do with the way
fragments in the response are transmitted on the wire to clients --
chunked encoding for HTTP/1, and sequences of DATA frames for
HTTP/2. The overhead for these transmission methods is included in the
accounting of ``ReqAcct``. The "chunking" of the response may differ
at different times, depending on the order of events, and on whether
or not we use (partial) sequential delivery (for example, when no
threads are available).
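To observe these differences, the ``ReqAcct`` records of an ESI tree
can be inspected with ``varnishlog``, grouping by request so that the
records of the subrequests appear together with the top request::

  varnishlog -g request -i ReqAcct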
SEE ALSO
========
.. |pesi.activate()| replace:: ``pesi.activate()``
.. |pesi.set()| replace:: ``pesi.set()``
.. |pesi.workspace_prealloc()| replace:: ``pesi.workspace_prealloc()``
.. |pesi.pool()| replace:: ``pesi.pool()``
.. _Content composition with Edge Side Includes: https://varnish-cache.org/docs/trunk/users-guide/esi.html
* `varnishd(1)`_
* `vcl(7)`_
* `varnishstat(1)`_
* `varnish-counters(7)`_
* `varnishadm(1)`_
* `Content composition with Edge Side Includes`_ in the `Varnish User's Guide`_
COPYRIGHT
=========
::

  Copyright 2019 - 2021 UPLEX Nils Goroll Systemoptimierung
  All rights reserved

  Authors: Geoffrey Simmons <geoffrey.simmons@uplex.de>
           Nils Goroll <nils.goroll@uplex.de>

  Redistribution and use in source and binary forms, with or without
  modification, are permitted provided that the following conditions
  are met:
  1. Redistributions of source code must retain the above copyright
     notice, this list of conditions and the following disclaimer.
  2. Redistributions in binary form must reproduce the above copyright
     notice, this list of conditions and the following disclaimer in the
     documentation and/or other materials provided with the distribution.

  THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
  ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
  IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
  ARE DISCLAIMED. IN NO EVENT SHALL AUTHOR OR CONTRIBUTORS BE LIABLE
  FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
  DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
  OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
  HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
  LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
  OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
  SUCH DAMAGE.