Commit 3d888804 authored by Nils Goroll

update README.rst (auto-generated)

parent 16c102d5
@@ -23,7 +23,7 @@ SYNOPSIS
import pesi;
# Enable parallel ESI processing in vcl_deliver.
# Enable parallel ESI processing in vcl_deliver {}.
VOID pesi.activate()
# Set a boolean configuration parameter.
@@ -58,7 +58,7 @@ Includes (ESI). The VDP implements content composition in client
responses as specified by ``<esi>`` directives in the response body,
just as Varnish does with its `standard ESI processing`_. While
standard Varnish processes ESI subrequests serially, in the order in
which the ``<esi>`` directives appear in the response, the VDP
which the ``<esi>`` directives appear in the response, the pesi VDP
executes the subrequests in parallel. This can lead to a significant
reduction in latency for the complete response, if Varnish has to wait
for backend fetches for more than one of the included requests.
@@ -67,31 +67,31 @@ Backend applications that use ESI includes for standard Varnish can be
expected to work without changes with the VDP, provided that they do
not depend on assumptions about the serialization of ESI subrequests.
Serial ESI requests are processed in a predictable order, one after
the other, but the VDP executes them at roughly the same time. A
the other, but the pesi VDP executes them at roughly the same time. A
backend may conceivably receive a request forwarded for the second
include in a response before the first one. If the logic of ESI
composition in a standard Varnish deployment does not depend on the
serial order, then it will work the same way with VDP pesi.
Parallel ESI processing is enabled by invoking ``pesi.activate()`` in
``vcl_deliver``::
Parallel ESI processing is enabled by invoking |pesi.activate()|_ in
``vcl_deliver {}``::
import pesi;
sub vcl_backend_response {
set beresp.do_esi = true;
}
sub vcl_deliver {
pesi.activate();
}
Other functions provided by the VDP serve to set configuration
parameters (or return the VDP version string). If your deployment uses
the default configuration, then ``pesi.activate()`` in ``vcl_deliver``
the default configuration, then |pesi.activate()|_ in ``vcl_deliver``
may be the only modification to VCL that you need.
The invocation of ``pesi.activate()`` can of course be subject to
The invocation of |pesi.activate()|_ can of course be subject to
logic in VCL::
sub vcl_deliver {
@@ -101,70 +101,68 @@ logic in VCL::
}
}
But see below for restrictions on the use of ``pesi.activate()``.
But see below for restrictions on the use of |pesi.activate()|_.
All of the computing resources used by the VDP -- threads, storage,
All of the computing resources used by the pesi VDP -- threads, storage,
workspace, locks, and so on -- can be configured, either with Varnish
runtime parameters or configuration settings made available by the
VDP. And their usage can be monitored with Varnish statistics. So you
pesi VDP. And their usage can be monitored with Varnish statistics. So you
can limit resource usage, and use monitoring tools such as
`varnishstat(1)`_ to ensure efficient parallel ESI processing. For
details see `CONFIGURATION AND MONITORING`_ below.
.. _vmod_pesi.activate:
.. _pesi.activate():
VOID activate()
---------------
Enable parallel ESI processing for the client response.
``activate()`` MUST be called in ``vcl_deliver`` only. If it is called
in any other VCL subroutine, VCL failure is invoked (see `ERRORS`_
below for details).
``pesi.activate()`` MUST be called in ``vcl_deliver {}`` only. If it is
called in any other VCL subroutine, VCL failure is invoked (see
`ERRORS`_ below for details).
If ``activate()`` is called on *any* ESI level (any depth of include
If ``pesi.activate()`` is called on *any* ESI level (any depth of include
nesting), then it MUST be called on *all* levels of the response. If
``activate()`` is invoked at some ESI levels but not others, then the
``pesi.activate()`` is invoked at some ESI levels but not others, then the
results are undefined, and will very likely lead to a Varnish panic.
Typically it suffices to simply call ``activate()`` in
``vcl_deliver``, since the code in ``vcl_deliver`` is executed at
every ESI level. It is also safe, for instance, to call ``activate()``
only if a request header is present, as in the example shown above;
since the same request headers are set for every ESI subrequest, the
result is the same at every ESI level. But that should *not* be done
if you have logic that unsets the header at some ESI levels but not at
others. Under no circumstances should the invocation of ``activate()``
It is also safe, for instance, to call ``pesi.activate()`` only if a
request header is present, as in the example shown above; since the
same request headers are set for every ESI subrequest, the result is
the same at every ESI level. But that should *not* be done if you have
logic that unsets the header at some ESI levels but not at
others. Under no circumstances should the invocation of ``pesi.activate()``
depend on the value of ``req.esi_level``, or of ``req.url`` (since
URLs are different at different ESI levels).
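A safe gating condition uses only state that is identical at every ESI
level. A minimal sketch (the request header name is illustrative)::

    import pesi;

    sub vcl_deliver {
        # Safe: request headers are the same for every ESI subrequest.
        if (req.http.X-Parallel-ESI) {
            pesi.activate();
        }
        # Unsafe: do NOT gate the call on req.esi_level or req.url.
    }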
See the documentation of ``set()`` below for a way to choose serial
See |pesi.set()|_ below for a way to choose serial
ESI processing for all of the includes in the response at the current
ESI level. Even then, ``activate()`` must be called in ``vcl_deliver``
in addition to ``set()``.
ESI level. Even then, ``pesi.activate()`` must be called in ``vcl_deliver
{}`` in addition to ``pesi.set()``.
As with standard Varnish, ESI processing can be selectively disabled
for a client response, by setting ``resp.do_esi`` to ``false`` in VCL
since version 4.1, or setting ``req.esi`` to ``false`` in VCL 4.0 (see
`vcl(7)`_). The requirement remains: if ESI processing is enabled and
``activate()`` is called at any ESI level, then both must happen at
``pesi.activate()`` is called at any ESI level, then both must happen at
all levels.
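To disable ESI selectively while respecting the all-levels
requirement, gate both settings on the same condition. A sketch for
VCL since version 4.1, with an illustrative header name::

    sub vcl_deliver {
        if (req.http.X-Plain-Delivery) {
            # Disabled uniformly: the header is identical at every
            # ESI level, so no level calls pesi.activate().
            set resp.do_esi = false;
        } else {
            pesi.activate();
        }
    }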
``activate()`` has the effect of setting the VCL string variable
``pesi.activate()`` has the effect of setting the VCL string variable
``resp.filters``, which is a whitespace-separated list of the names of
delivery processors to be applied to the client response (see
`vcl(7)`_). It configures the correct list of filters for the current
response, analogous to the default filter settings in Varnish when
sequential ESI is in use. These include the ``gunzip`` VDP for
uncompressed responses, and ``range`` for responses to range
requests. ``activate()`` checks the conditions for which the VDPs are
requests. ``pesi.activate()`` checks the conditions for which the VDPs are
required, and arranges them in the correct order.
It is possible to manually set or change ``resp.filters`` to enable
parallel ESI, instead of calling ``activate()``, but that is only
parallel ESI, instead of calling ``pesi.activate()``, but that is only
advisable for experts. If you do so, use the string ``pesi`` for this
VDP, and do *not* include ``esi``, for Varnish's standard ESI VDP, in
the same list with ``pesi``. As with the ``activate()`` call -- if
the same list with ``pesi``. As with the ``pesi.activate()`` call -- if
``pesi`` appears in ``resp.filters`` for a response at *any* ESI
level, it MUST be in ``resp.filters`` at *all* ESI levels.
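For illustration only, a manual filter setting for the simplest case
-- an uncompressed response to a non-range request -- might look like
the sketch below. The exact filter list depends on the response, which
is why calling ``pesi.activate()`` is the safe way to compute it::

    sub vcl_deliver {
        # Expert use only: "pesi" replaces "esi"; never list both,
        # and use the same list at every ESI level.
        set resp.filters = "pesi";
    }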
@@ -198,7 +196,7 @@ Example::
pesi.activate()
}
.. _vmod_pesi.set:
.. _pesi.set():
VOID set(ENUM {serial, thread} parameter, [BOOL bool])
------------------------------------------------------
@@ -209,7 +207,7 @@ identified by the ENUM ``parameter``. Currently the parameters can
only be set with a boolean value in ``bool`` (but future versions of
this function may allow for setting other data types).
``set()`` MUST be called in ``vcl_deliver`` only; otherwise VCL
``pesi.set()`` MUST be called in ``vcl_deliver {}`` only; otherwise VCL
failure is invoked (see `ERRORS`_).
The parameters that can be set are currently ``serial`` and ``thread``:
@@ -233,15 +231,27 @@ It is strongly recommended to *not* use serial mode from ESI level 0
parallel ESI threads.
Serial mode may sensibly be used to reduce overhead and the number of
threads required without relevant drawbacks:
threads required without relevant drawbacks:
* at ESI level > 0
* at ESI level > 0 *and*
* when the VCL author knows that all objects included by the current
request are cacheable, and thus are highly likely to lead to cache
hits.
See `THREADS`_ below for a more detailed discussion.
Example::
# Activate serial mode at ESI level > 0, if we know that all includes
# in the response at this level lead to cacheable responses.
sub vcl_deliver {
pesi.activate();
if (req.esi_level > 0 && req.url == "/all/cacheable/includes") {
pesi.set(serial, true);
}
}
.. _thread:
``thread``
----------
@@ -259,66 +269,9 @@ Whether we always request a new thread for includes, default is
Request a new thread, potentially waiting for one to become
available.
XXX move the longer discussion to a document dedicated to the subjects
of tuning, efficiency etc
For parallel ESI to work as efficiently as possible, it should
traverse the ESI tree *breadth first*, processing any ESI object
completely, with new threads scheduled for any includes
encountered. Completing processing of an ESI object allows for data
from the subtree (the ESI object and anything below) to be sent to the
client concurrently. As soon as ESI object processing is complete, the
respective thread will be returned to the thread pool and become
available for any other varnish task (except for the request for
esi_level 0, which _has_ to wait for completion of the entire ESI
request anyway and will send data to the client in the meantime).
With this setting to ``true`` (the default), this is what happens
always, but a thread may not be immediately available if the thread
pool is not sufficiently sized for the current load and thus the
include request may have to be queued.
With this setting to ``false``, include processing happens in the same
thread as if ``serial`` mode had been activated, but only in the case
where there is no new thread available. While this may sound like the
more sensible option at first, we did not make this the default for
the following reasons:
* Before completion of the ESI processing, the subtree below it is not
yet available for delivery to the client because additional VDPs
behind pesi cannot be called from a different thread.
* While processing of the include may take an arbitrarily long time
(for example because it requires a lengthy backend fetch), we know
that the ESI object is fully available in the stevedore (and usually
in memory already) when we parse an include because streaming is not
supported for ESI. So we know that completing the processing of the
current ESI object will be quick, while descending into a subtree
may take a long time.
* Except for ESI level 0, the current thread will become available as
soon as ESI processing has completed.
* The thread herder may breed new threads and other threads may
terminate, so queuing a thread momentarily is not a bad thing per
se.
In short, keeping ``thread`` at the default ``true`` should be the
right option, the alternative exists just in case.
See `THREADS`_ for a detailed discussion.
Example::
# Activate serial mode at ESI level > 0, if we know that all includes
# in the response at this level lead to cacheable responses.
sub vcl_deliver {
pesi.activate();
if (req.esi_level > 0 && req.url == "/all/cacheable/includes") {
pesi.set(serial, true);
}
}
.. _vmod_pesi.workspace_prealloc:
.. _pesi.workspace_prealloc():
VOID workspace_prealloc(BYTES min_free, INT max_nodes)
------------------------------------------------------
@@ -327,40 +280,45 @@ VOID workspace_prealloc(BYTES min_free, INT max_nodes)
VOID workspace_prealloc(BYTES min_free=4096, INT max_nodes=32)
Configure workspace pre-allocation of objects in variable-sized
internal data structures.
For each request, the VDP builds such a structure, whose size is
roughly proportional to the size of the ESI tree -- the conceptual
tree with the top-level response at the root, and its includes and all
of their nested includes as branches. The nodes in this structure have
a fixed size, but the number of nodes used by the VDP varies with the
size of the ESI tree.
The VDP pre-allocates a constant number of such nodes in client
workspace, and initially takes nodes from the pre-allocation. If more
are needed for larger ESI trees, they are obtained from a global
memory pool as described below. The use of pre-allocated nodes from
workspace is preferred, since it never requires new system memory
allocations (workspaces themselves are pre-allocated by Varnish), and
because they are local to each request, so locking is never required
to access them (but is required for the memory pool).
Configure the amount of workspace used for pesi internal data
structures.
The pesi VDP builds a structure whose size is roughly proportional to
the size of the ESI tree -- the conceptual tree with the top-level
response at the root, and its includes and all of their nested
includes as branches. The nodes in this structure have a fixed size,
but the number of nodes used by the VDP varies with the size of the
ESI tree.
For each (sub)request, the VDP pre-allocates a constant number of such
nodes in client workspace, and initially uses the pre-allocation for
child nodes of that (sub)request. If more are needed, they are
obtained from a global memory pool as described below. The use of
pre-allocated nodes from workspace is preferred, since it never
requires new system memory allocations (workspaces themselves are
pre-allocated by Varnish), and because they are local to each request,
so locking is never required to access them (but is required for the
memory pool).
The pre-allocation contributes a fixed size to client workspace usage,
since the number of pre-allocated nodes is constant. So any adjustment
to Varnish's ``workspace_client`` parameter that may be necessary due
to the pre-allocation will be valid for all requests.
``workspace_prealloc()`` configures the pre-allocation. The default
``pesi.workspace_prealloc()`` configures the pre-allocation. The default
values of its parameters are defaults used by the VDP; that is, the
configuration if ``workspace_prealloc()`` is never called.
configuration if ``pesi.workspace_prealloc()`` is never called.
The ``min_free`` parameter sets the minimum amount of space that the
pre-allocation will always leave free in client workspace; if the
targeted number of pre-allocated nodes would result in less free space
than ``min_free`` bytes in workspace, then fewer nodes are
allocated. This ensures that free workspace is always left over for
other VMODs, VCL usage, and so forth. ``min_free`` defaults to 4 KiB.
other VMODs, VCL usage, and so forth. Note that most of the operations
typically requiring workspace have already finished when VDP pesi
makes the pre-allocation, because it starts after ``vcl_deliver
{}``. Thus, the reservation is mostly for other VDPs and VMODs using
``PRIV_TOP``. ``min_free`` defaults to 4 KiB.
The ``max_nodes`` parameter sets the number of nodes to be allocated,
unless the limit imposed by ``min_free`` is exceeded; ``max_nodes``
@@ -369,13 +327,13 @@ invoked (see `ERRORS`_). If ``max_nodes`` is set to 0, then no nodes
are pre-allocated; they are all taken from the memory pool described
below.
When ``workspace_prealloc()`` is called, its configuration becomes
When ``pesi.workspace_prealloc()`` is called, its configuration becomes
effective immediately for all new requests processed by the VDP. The
configuration remains valid for all instances of VCL, for as long as
the VDP remains loaded; that is, until the last instance of VCL using
the VDP is discarded.
``workspace_prealloc()`` can be called in ``vcl_init`` to set the
``pesi.workspace_prealloc()`` can be called in ``vcl_init`` to set the
configuration at VCL load time. But you can also write VCL that calls
the function when a request is received by Varnish, for example using
a special URL for system administrators. This is similar to using the
@@ -383,7 +341,7 @@ a special URL for system administrators. This is similar to using the
parameter at runtime. Such a request should be protected, for example
with an ACL and/or Basic Authentication, so that it can be invoked
only by admins. Remember that as soon as such a request is processed
and ``workspace_prealloc()`` is executed, the changed configuration is
and ``pesi.workspace_prealloc()`` is executed, the changed configuration is
globally valid.
Examples::
@@ -428,7 +386,7 @@ Examples::
}
}
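The admin-request pattern described above might be sketched as follows
(the URL, ACL name, and parameter values are all illustrative)::

    acl pesi_admin {
        "localhost";
    }

    sub vcl_recv {
        if (req.url == "/admin/pesi/prealloc") {
            if (client.ip !~ pesi_admin) {
                return (synth(403));
            }
            # Becomes globally valid for all new requests at once.
            pesi.workspace_prealloc(min_free=8KB, max_nodes=64);
            return (synth(200));
        }
    }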
.. _vmod_pesi.pool:
.. _pesi.pool():
VOID pool(INT min=10, INT max=100, DURATION max_age=10)
-------------------------------------------------------
@@ -463,11 +421,11 @@ allocation requests, even if ``max`` is exceeded when nodes are
returned to the pool. But the pool size will then be reduced to
``max``, without waiting for ``max_age`` to expire.
As with ``workspace_prealloc()``: when ``pool()`` is called, the
As with |pesi.workspace_prealloc()|_: when ``pesi.pool()`` is called, the
changed configuration immediately becomes valid (although it may take
some time for the memory pool to adjust to the new values). It remains
valid for as long as the VDP is still loaded, unless ``pool()`` is
called again. ``pool()`` may be called in ``vcl_init`` to set a
valid for as long as the VDP is still loaded, unless ``pesi.pool()`` is
called again. ``pesi.pool()`` may be called in ``vcl_init`` to set a
configuration at VCL load time, but may also be called elsewhere in
VCL, for example to enable changing configurations at runtime using a
special "admin" request.
@@ -517,7 +475,7 @@ Examples::
}
}
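A load-time configuration of the memory pool might look like this
sketch (the values are illustrative, not recommendations)::

    import pesi;

    sub vcl_init {
        # Keep at least 20 nodes pooled, trim the pool above 200,
        # and let surplus nodes expire after 30 seconds.
        pesi.pool(min=20, max=200, max_age=30s);
    }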
.. _vmod_pesi.version:
.. _pesi.version():
STRING version()
----------------
@@ -574,7 +532,7 @@ processed at any one time.
The VDP runs ESI subrequests (for each ``<esi:include>`` directive at
every ESI level) in separate threads, unless instructed not to do so
due to the use of either ``set(serial, true)`` or ``set(thread,
due to the use of either ``pesi.set(serial, true)`` or ``pesi.set(thread,
false)``, as documented above. The threads are requested from the
thread pools managed by Varnish. This means that in most cases, for
well-configured thread pools, the overhead of starting new threads is
@@ -584,18 +542,18 @@ that is immediately ready for use.
The VDP uses client workspace at the top-level request (ESI level 0)
for fixed-sized internal metadata. It also uses client workspace to
pre-allocate a constant number of nodes in variable-sized structures,
as described in the documentation of ``workspace_prealloc()`` above.
Together these make for a fixed-sized demand on client workspace, when
``activate()`` is invoked. The size of the space needed from workspace
varies on different systems, and depends on any
``workspace_prealloc()`` you may have set, but broadly speaking, it can
as described in |pesi.workspace_prealloc()|_ above. Together these
make for a fixed-sized demand on client workspace, when
|pesi.activate()|_ is invoked. The size of the space needed from
workspace varies on different systems, and depends on
the |pesi.workspace_prealloc()|_ setting, but broadly speaking, it can
be expected to be less than 10 KiB.
As described in the documentation for ``pool()`` above, the VDP uses a
memory pool for nodes in its internal reconstruction of the ESI tree,
if more are needed than are pre-allocated in workspace. The same
mechanism is employed as Varnish's memory pools, so the same
considerations apply to the configuration and monitoring of the pool.
As described for |pesi.pool()|_, the VDP uses a memory pool for
nodes in its internal reconstruction of the ESI tree, if more are
needed than are pre-allocated in workspace. The same mechanism is
employed as Varnish's memory pools, so the same considerations apply
to the configuration and monitoring of the pool.
For each top-level ESI request using the VDP, two locks are employed;
one to synchronize access to common data structures, and another to
@@ -713,6 +671,54 @@ Considerations about tuning the configuration and interpreting the
statistics are beyond the scope of this manual. For a deeper
discussion, see $EXTERNAL_DOCUMENT.
THREADS
=======
For parallel ESI to work as efficiently as possible, it should
traverse the ESI tree *breadth first*, processing any ESI object
completely, with new threads scheduled for any includes
encountered. Completing processing of an ESI object allows for data
from the subtree (the ESI object and anything below) to be sent to the
client concurrently. As soon as ESI object processing is complete, the
respective thread will be returned to the thread pool and become
available for any other Varnish task (except for the request at
ESI level 0, which *has* to wait for completion of the entire ESI
request anyway and will send data to the client in the meantime).
With the `thread`_ setting at ``true`` (the default), this is what
happens, but a thread may not be immediately available if the thread
pool is not sized sufficiently for the current load, and the include
request may then have to be queued.
With the `thread`_ setting at ``false``, include processing happens in
the same thread, as if ``serial`` mode had been activated, but only
when no new thread is immediately available. While this may sound like
the more sensible option at first, we did not make this the default
for the following reasons:
* Before completion of ESI processing, the subtree below it is not yet
available for delivery to the client because additional VDPs behind
pesi cannot be called from a different thread.
* While processing of the include may take an arbitrarily long time
(for example because it requires a lengthy backend fetch), we know
that the ESI object is fully available in the stevedore (and usually
in memory already) when we parse an include because streaming is not
supported for ESI. So we know that completing the processing of the
current ESI object will be quick, while descending into a subtree
may take a long time.
* Except for ESI level 0, the current thread will become available as
soon as ESI processing has completed.
* The thread herder may breed new threads and other threads may
terminate, so queuing a thread momentarily is not a bad thing per
se.
In short, keeping the `thread`_ setting at the default ``true`` should
be the right option; the alternative exists just in case.
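Should the fallback nevertheless be wanted, it is selected per
response, alongside the required ``pesi.activate()`` call. A sketch::

    sub vcl_deliver {
        pesi.activate();
        # Process includes in the current thread (serial-like) only
        # when no worker thread is immediately available.
        pesi.set(thread, false);
    }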
LIMITATIONS
===========
@@ -773,6 +779,11 @@ See `INSTALL.rst <INSTALL.rst>`_ in the source repository.
SEE ALSO
========
.. |pesi.activate()| replace:: ``pesi.activate()``
.. |pesi.set()| replace:: ``pesi.set()``
.. |pesi.workspace_prealloc()| replace:: ``pesi.workspace_prealloc()``
.. |pesi.pool()| replace:: ``pesi.pool()``
.. _Content composition with Edge Side Includes: https://varnish-cache.org/docs/trunk/users-guide/esi.html
* `varnishd(1)`_