Commit 3d888804 authored by Nils Goroll

update README.rst (auto-generated)

parent 16c102d5
@@ -23,7 +23,7 @@ SYNOPSIS
import pesi;
# Enable parallel ESI processing in vcl_deliver.
# Enable parallel ESI processing in vcl_deliver {}.
VOID pesi.activate()
# Set a boolean configuration parameter.
@@ -58,7 +58,7 @@ Includes (ESI). The VDP implements content composition in client
responses as specified by ``<esi>`` directives in the response body,
just as Varnish does with its `standard ESI processing`_. While
standard Varnish processes ESI subrequests serially, in the order in
which the ``<esi>`` directives appear in the response, the VDP
which the ``<esi>`` directives appear in the response, the pesi VDP
executes the subrequests in parallel. This can lead to a significant
reduction in latency for the complete response, if Varnish has to wait
for backend fetches for more than one of the included requests.
@@ -67,31 +67,31 @@ Backend applications that use ESI includes for standard Varnish can be
expected to work without changes with the VDP, provided that they do
not depend on assumptions about the serialization of ESI subrequests.
Serial ESI requests are processed in a predictable order, one after
the other, but the VDP executes them at roughly the same time. A
the other, but the pesi VDP executes them at roughly the same time. A
backend may conceivably receive a request forwarded for the second
include in a response before the first one. If the logic of ESI
composition in a standard Varnish deployment does not depend on the
serial order, then it will work the same way with VDP pesi.
Parallel ESI processing is enabled by invoking ``pesi.activate()`` in
``vcl_deliver``::
Parallel ESI processing is enabled by invoking |pesi.activate()|_ in
``vcl_deliver {}``::
import pesi;
sub vcl_backend_response {
set beresp.do_esi = true;
}
sub vcl_deliver {
pesi.activate();
}
Other functions provided by the VDP serve to set configuration
parameters (or return the VDP version string). If your deployment uses
the default configuration, then ``pesi.activate()`` in ``vcl_deliver``
the default configuration, then |pesi.activate()|_ in ``vcl_deliver``
may be the only modification to VCL that you need.
The invocation of ``pesi.activate()`` can of course be subject to
The invocation of |pesi.activate()|_ can of course be subject to
logic in VCL::
sub vcl_deliver {
@@ -101,70 +101,68 @@ logic in VCL::
}
}
But see below for restrictions on the use of ``pesi.activate()``.
But see below for restrictions on the use of |pesi.activate()|_.
All of the computing resources used by the VDP -- threads, storage,
All of the computing resources used by the pesi VDP -- threads, storage,
workspace, locks, and so on -- can be configured, either with Varnish
runtime parameters or configuration settings made available by the
VDP. And their usage can be monitored with Varnish statistics. So you
pesi VDP. And their usage can be monitored with Varnish statistics. So you
can limit resource usage, and use monitoring tools such as
`varnishstat(1)`_ to ensure efficient parallel ESI processing. For
details see `CONFIGURATION AND MONITORING`_ below.
.. _vmod_pesi.activate:
.. _pesi.activate():
VOID activate()
---------------
Enable parallel ESI processing for the client response.
``activate()`` MUST be called in ``vcl_deliver`` only. If it is called
in any other VCL subroutine, VCL failure is invoked (see `ERRORS`_
below for details).
``pesi.activate()`` MUST be called in ``vcl_deliver {}`` only. If it is
called in any other VCL subroutine, VCL failure is invoked (see
`ERRORS`_ below for details).
If ``activate()`` is called on *any* ESI level (any depth of include
If ``pesi.activate()`` is called on *any* ESI level (any depth of include
nesting), then it MUST be called on *all* levels of the response. If
``activate()`` is invoked at some ESI levels but not others, then the
``pesi.activate()`` is invoked at some ESI levels but not others, then the
results are undefined, and will very likely lead to a Varnish panic.
Typically it suffices to simply call ``activate()`` in
``vcl_deliver``, since the code in ``vcl_deliver`` is executed at
every ESI level. It is also safe, for instance, to call ``activate()``
only if a request header is present, as in the example shown above;
since the same request headers are set for every ESI subrequest, the
result is the same at every ESI level. But that should *not* be done
if you have logic that unsets the header at some ESI levels but not at
others. Under no circumstances should the invocation of ``activate()``
It is also safe, for instance, to call ``pesi.activate()`` only if a
request header is present, as in the example shown above; since the
same request headers are set for every ESI subrequest, the result is
the same at every ESI level. But that should *not* be done if you have
logic that unsets the header at some ESI levels but not at
others. Under no circumstances should the invocation of ``pesi.activate()``
depend on the value of ``req.esi_level``, or of ``req.url`` (since
URLs are different at different ESI levels).
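A safe gating condition uses only state that is identical at every ESI
level. A minimal sketch (the request header name is illustrative)::

    import pesi;

    sub vcl_deliver {
        # Safe: request headers are the same for every ESI subrequest.
        if (req.http.X-Parallel-ESI) {
            pesi.activate();
        }
        # Unsafe: do NOT gate the call on req.esi_level or req.url.
    }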
See the documentation of ``set()`` below for a way to choose serial
See |pesi.set()|_ below for a way to choose serial
ESI processing for all of the includes in the response at the current
ESI level. Even then, ``activate()`` must be called in ``vcl_deliver``
in addition to ``set()``.
ESI level. Even then, ``pesi.activate()`` must be called in ``vcl_deliver
{}`` in addition to ``pesi.set()``.
As with standard Varnish, ESI processing can be selectively disabled
for a client response, by setting ``resp.do_esi`` to ``false`` in VCL
since version 4.1, or setting ``req.esi`` to ``false`` in VCL 4.0 (see
`vcl(7)`_). The requirement remains: if ESI processing is enabled and
``activate()`` is called at any ESI level, then both must happen at
``pesi.activate()`` is called at any ESI level, then both must happen at
all levels.
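To disable ESI selectively while respecting the all-levels
requirement, gate both settings on the same condition. A sketch for
VCL since version 4.1, with an illustrative header name::

    sub vcl_deliver {
        if (req.http.X-Plain-Delivery) {
            # Disabled uniformly: the header is identical at every
            # ESI level, so no level calls pesi.activate().
            set resp.do_esi = false;
        } else {
            pesi.activate();
        }
    }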
``activate()`` has the effect of setting the VCL string variable
``pesi.activate()`` has the effect of setting the VCL string variable
``resp.filters``, which is a whitespace-separated list of the names of
delivery processors to be applied to the client response (see
`vcl(7)`_). It configures the correct list of filters for the current
response, analogous to the default filter settings in Varnish when
sequential ESI is in use. These include the ``gunzip`` VDP for
uncompressed responses, and ``range`` for responses to range
requests. ``activate()`` checks the conditions for which the VDPs are
requests. ``pesi.activate()`` checks the conditions for which the VDPs are
required, and arranges them in the correct order.
It is possible to manually set or change ``resp.filters`` to enable
parallel ESI, instead of calling ``activate()``, but that is only
parallel ESI, instead of calling ``pesi.activate()``, but that is only
advisable for experts. If you do so, use the string ``pesi`` for this
VDP, and do *not* include ``esi``, for Varnish's standard ESI VDP, in
the same list with ``pesi``. As with the ``activate()`` call -- if
the same list with ``pesi``. As with the ``pesi.activate()`` call -- if
``pesi`` appears in ``resp.filters`` for a response at *any* ESI
level, it MUST be in ``resp.filters`` at *all* ESI levels.
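For illustration only, a manual filter setting for the simplest case
-- an uncompressed response to a non-range request -- might look like
the sketch below. The exact filter list depends on the response, which
is why calling ``pesi.activate()`` is the safe way to compute it::

    sub vcl_deliver {
        # Expert use only: "pesi" replaces "esi"; never list both,
        # and use the same list at every ESI level.
        set resp.filters = "pesi";
    }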
@@ -198,7 +196,7 @@ Example::
pesi.activate()
}
.. _vmod_pesi.set:
.. _pesi.set():
VOID set(ENUM {serial, thread} parameter, [BOOL bool])
------------------------------------------------------
@@ -209,7 +207,7 @@ identified by the ENUM ``parameter``. Currently the parameters can
only be set with a boolean value in ``bool`` (but future versions of
this function may allow for setting other data types).
``set()`` MUST be called in ``vcl_deliver`` only; otherwise VCL
``pesi.set()`` MUST be called in ``vcl_deliver {}`` only; otherwise VCL
failure is invoked (see `ERRORS`_).
The parameters that can be set are currently ``serial`` and ``thread``:
@@ -233,15 +231,27 @@ It is strongly recommended to *not* use serial mode from ESI level 0
parallel ESI threads.
Serial mode may sensibly be used to reduce overhead and the number of
threads required without relevant drawbacks:
threads required without relevant drawbacks:
* at ESI level > 0
* at ESI level > 0 *and*
* when the VCL author knows that all objects included by the current
request are cacheable, and thus are highly likely to lead to cache
hits.
See `THREADS`_ below for a more detailed discussion.
Example::
# Activate serial mode at ESI level > 0, if we know that all includes
# in the response at this level lead to cacheable responses.
sub vcl_deliver {
pesi.activate();
if (req.esi_level > 0 && req.url == "/all/cacheable/includes") {
pesi.set(serial, true);
}
}
.. _thread:
``thread``
----------
@@ -259,66 +269,9 @@ Whether we always request a new thread for includes, default is
Request a new thread, potentially waiting for one to become
available.
XXX move the longer discussion to a document dedicated to the subjects
of tuning, efficiency etc
For parallel ESI to work as efficiently as possible, it should
traverse the ESI tree *breadth first*, processing any ESI object
completely, with new threads scheduled for any includes
encountered. Completing processing of an ESI object allows for data
from the subtree (the ESI object and anything below) to be sent to the
client concurrently. As soon as ESI object processing is complete, the
respective thread will be returned to the thread pool and become
available for any other varnish task (except for the request for
esi_level 0, which _has_ to wait for completion of the entire ESI
request anyway and will send data to the client in the meantime).
With this setting to ``true`` (the default), this is what happens
always, but a thread may not be immediately available if the thread
pool is not sufficiently sized for the current load and thus the
include request may have to be queued.
With this setting to ``false``, include processing happens in the same
thread as if ``serial`` mode had been activated, but only in the case
where there is no new thread available. While this may sound like the
more sensible option at first, we did not make this the default for
the following reasons:
* Before completion of the ESI processing, the subtree below it is not
yet available for delivery to the client because additional VDPs
behind pesi cannot be called from a different thread.
* While processing of the include may take an arbitrarily long time
(for example because it requires a lengthy backend fetch), we know
that the ESI object is fully available in the stevedore (and usually
in memory already) when we parse an include because streaming is not
supported for ESI. So we know that completing the processing of the
current ESI object will be quick, while descending into a subtree
may take a long time.
* Except for ESI level 0, the current thread will become available as
soon as ESI processing has completed.
* The thread herder may breed new threads and other threads may
terminate, so queuing a thread momentarily is not a bad thing per
se.
In short, keeping ``thread`` at the default ``true`` should be the
right option, the alternative exists just in case.
See `THREADS`_ for a detailed discussion.
Example::
# Activate serial mode at ESI level > 0, if we know that all includes
# in the response at this level lead to cacheable responses.
sub vcl_deliver {
pesi.activate();
if (req.esi_level > 0 && req.url == "/all/cacheable/includes") {
pesi.set(serial, true);
}
}
.. _vmod_pesi.workspace_prealloc:
.. _pesi.workspace_prealloc():
VOID workspace_prealloc(BYTES min_free, INT max_nodes)
------------------------------------------------------
@@ -327,40 +280,45 @@ VOID workspace_prealloc(BYTES min_free, INT max_nodes)
VOID workspace_prealloc(BYTES min_free=4096, INT max_nodes=32)
Configure workspace pre-allocation of objects in variable-sized
internal data structures.
For each request, the VDP builds such a structure, whose size is
roughly proportional to the size of the ESI tree -- the conceptual
tree with the top-level response at the root, and its includes and all
of their nested includes as branches. The nodes in this structure have
a fixed size, but the number of nodes used by the VDP varies with the
size of the ESI tree.
The VDP pre-allocates a constant number of such nodes in client
workspace, and initially takes nodes from the pre-allocation. If more
are needed for larger ESI trees, they are obtained from a global
memory pool as described below. The use of pre-allocated nodes from
workspace is preferred, since it never requires new system memory
allocations (workspaces themselves are pre-allocated by Varnish), and
because they are local to each request, so locking is never required
to access them (but is required for the memory pool).
Configure the amount of workspace used for pesi internal data
structures.
The pesi VDP builds a structure whose size is roughly proportional to
the size of the ESI tree -- the conceptual tree with the top-level
response at the root, and its includes and all of their nested
includes as branches. The nodes in this structure have a fixed size,
but the number of nodes used by the VDP varies with the size of the
ESI tree.
For each (sub)request, the VDP pre-allocates a constant number of such
nodes in client workspace, and initially uses the pre-allocation for
child nodes of that (sub)request. If more are needed, they are
obtained from a global memory pool as described below. The use of
pre-allocated nodes from workspace is preferred, since it never
requires new system memory allocations (workspaces themselves are
pre-allocated by Varnish), and because they are local to each request,
so locking is never required to access them (but is required for the
memory pool).
The pre-allocation contributes a fixed size to client workspace usage,
since the number of pre-allocated nodes is constant. So any adjustment
to Varnish's ``workspace_client`` parameter that may be necessary due
to the pre-allocation will be valid for all requests.
``workspace_prealloc()`` configures the pre-allocation. The default
``pesi.workspace_prealloc()`` configures the pre-allocation. The default
values of its parameters are defaults used by the VDP; that is, the
configuration if ``workspace_prealloc()`` is never called.
configuration if ``pesi.workspace_prealloc()`` is never called.
The ``min_free`` parameter sets the minimum amount of space that the
pre-allocation will always leave free in client workspace; if the
targeted number of pre-allocated nodes would result in less free space
than ``min_free`` bytes in workspace, then fewer nodes are
allocated. This ensures that free workspace is always left over for
other VMODs, VCL usage, and so forth. ``min_free`` defaults to 4 KiB.
other VMODs, VCL usage, and so forth. Note that most of the operations
typically requiring workspace have already finished when VDP pesi
makes the pre-allocation, because it starts after ``vcl_deliver
{}``. Thus, the reservation is mostly for other VDPs and VMODs using
``PRIV_TOP``. ``min_free`` defaults to 4 KiB.
The ``max_nodes`` parameter sets the number of nodes to be allocated,
unless the limit imposed by ``min_free`` is exceeded; ``max_nodes``
@@ -369,13 +327,13 @@ invoked (see `ERRORS`_). If ``max_nodes`` is set to 0, then no nodes
are pre-allocated; they are all taken from the memory pool described
below.
When ``workspace_prealloc()`` is called, its configuration becomes
When ``pesi.workspace_prealloc()`` is called, its configuration becomes
effective immediately for all new requests processed by the VDP. The
configuration remains valid for all instances of VCL, for as long as
the VDP remains loaded; that is, until the last instance of VCL using
the VDP is discarded.
``workspace_prealloc()`` can be called in ``vcl_init`` to set the
``pesi.workspace_prealloc()`` can be called in ``vcl_init`` to set the
configuration at VCL load time. But you can also write VCL that calls
the function when a request is received by Varnish, for example using
a special URL for system administrators. This is similar to using the
@@ -383,7 +341,7 @@ a special URL for system administrators. This is similar to using the
parameter at runtime. Such a request should be protected, for example
with an ACL and/or Basic Authentication, so that it can be invoked
only by admins. Remember that as soon as such a request is processed
and ``workspace_prealloc()`` is executed, the changed configuration is
and ``pesi.workspace_prealloc()`` is executed, the changed configuration is
globally valid.
Examples::
@@ -428,7 +386,7 @@ Examples::
}
}
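The admin-request pattern described above might be sketched as follows
(the URL, ACL name, and parameter values are all illustrative)::

    acl pesi_admin {
        "localhost";
    }

    sub vcl_recv {
        if (req.url == "/admin/pesi/prealloc") {
            if (client.ip !~ pesi_admin) {
                return (synth(403));
            }
            # Becomes globally valid for all new requests at once.
            pesi.workspace_prealloc(min_free=8KB, max_nodes=64);
            return (synth(200));
        }
    }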
.. _vmod_pesi.pool:
.. _pesi.pool():
VOID pool(INT min=10, INT max=100, DURATION max_age=10)
-------------------------------------------------------
@@ -463,11 +421,11 @@ allocation requests, even if ``max`` is exceeded when nodes are
returned to the pool. But the pool size will then be reduced to
``max``, without waiting for ``max_age`` to expire.
As with ``workspace_prealloc()``: when ``pool()`` is called, the
As with |pesi.workspace_prealloc()|_: when ``pesi.pool()`` is called, the
changed configuration immediately becomes valid (although it may take
some time for the memory pool to adjust to the new values). It remains
valid for as long as the VDP is still loaded, unless ``pool()`` is
called again. ``pool()`` may be called in ``vcl_init`` to set a
valid for as long as the VDP is still loaded, unless ``pesi.pool()`` is
called again. ``pesi.pool()`` may be called in ``vcl_init`` to set a
configuration at VCL load time, but may also be called elsewhere in
VCL, for example to enable changing configurations at runtime using a
special "admin" request.
@@ -517,7 +475,7 @@ Examples::
}
}
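A load-time configuration of the memory pool might look like this
sketch (the values are illustrative, not recommendations)::

    import pesi;

    sub vcl_init {
        # Keep at least 20 nodes pooled, trim the pool above 200,
        # and let surplus nodes expire after 30 seconds.
        pesi.pool(min=20, max=200, max_age=30s);
    }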
.. _vmod_pesi.version:
.. _pesi.version():
STRING version()
----------------
@@ -574,7 +532,7 @@ processed at any one time.
The VDP runs ESI subrequests (for each ``<esi:include>`` directive at
every ESI level) in separate threads, unless instructed not to do so
due to the use of either ``set(serial, true)`` or ``set(thread,
due to the use of either ``pesi.set(serial, true)`` or ``pesi.set(thread,
false)``, as documented above. The threads are requested from the
thread pools managed by Varnish. This means that in most cases, for
well-configured thread pools, the overhead of starting new threads is
@@ -584,18 +542,18 @@ that is immediately ready for use.
The VDP uses client workspace at the top-level request (ESI level 0)
for fixed-sized internal metadata. It also uses client workspace to
pre-allocate a constant number of nodes in variable-sized structures,
as described in the documentation of ``workspace_prealloc()`` above.
Together these make for a fixed-sized demand on client workspace, when
``activate()`` is invoked. The size of the space needed from workspace
varies on different systems, and depends on any
``workspace_prealloc()`` you may have set, but broadly speaking, it can
as described in |pesi.workspace_prealloc()|_ above. Together these
make for a fixed-sized demand on client workspace, when
|pesi.activate()|_ is invoked. The size of the space needed from
workspace varies on different systems, and depends on
the |pesi.workspace_prealloc()|_ setting, but broadly speaking, it can
be expected to be less than 10 KiB.
As described in the documentation for ``pool()`` above, the VDP uses a
memory pool for nodes in its internal reconstruction of the ESI tree,
if more are needed than are pre-allocated in workspace. The same
mechanism is employed as Varnish's memory pools, so the same
considerations apply to the configuration and monitoring of the pool.
As described for |pesi.pool()|_, the VDP uses a memory pool for
nodes in its internal reconstruction of the ESI tree, if more are
needed than are pre-allocated in workspace. The same mechanism is
employed as Varnish's memory pools, so the same considerations apply
to the configuration and monitoring of the pool.
For each top-level ESI request using the VDP, two locks are employed;
one to synchronize access to common data structures, and another to
@@ -713,6 +671,54 @@ Considerations about tuning the configuration and interpreting the
statistics are beyond the scope of this manual. For a deeper
discussion, see $EXTERNAL_DOCUMENT.
THREADS
=======
For parallel ESI to work as efficiently as possible, it should
traverse the ESI tree *breadth first*, processing any ESI object
completely, with new threads scheduled for any includes
encountered. Completing processing of an ESI object allows for data
from the subtree (the ESI object and anything below) to be sent to the
client concurrently. As soon as ESI object processing is complete, the
respective thread will be returned to the thread pool and become
available for any other Varnish task (except for the request at
ESI level 0, which *has* to wait for completion of the entire ESI
request anyway and will send data to the client in the meantime).
With the `thread`_ setting at ``true`` (the default), this is what
happens, but a thread may not be immediately available if the thread
pool is not sized sufficiently for the current load, and the include
request may then have to be queued.
With the `thread`_ setting at ``false``, include processing happens in
the same thread, as if ``serial`` mode had been activated, but only
when no new thread is immediately available. While this may sound like
the more sensible option at first, we did not make this the default
for the following reasons:
* Before completion of ESI processing, the subtree below it is not yet
available for delivery to the client because additional VDPs behind
pesi cannot be called from a different thread.
* While processing of the include may take an arbitrarily long time
(for example because it requires a lengthy backend fetch), we know
that the ESI object is fully available in the stevedore (and usually
in memory already) when we parse an include because streaming is not
supported for ESI. So we know that completing the processing of the
current ESI object will be quick, while descending into a subtree
may take a long time.
* Except for ESI level 0, the current thread will become available as
soon as ESI processing has completed.
* The thread herder may breed new threads and other threads may
terminate, so queuing a thread momentarily is not a bad thing per
se.
In short, keeping the `thread`_ setting at the default ``true`` should
be the right option; the alternative exists just in case.
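Should the fallback nevertheless be wanted, it is selected per
response, alongside the required ``pesi.activate()`` call. A sketch::

    sub vcl_deliver {
        pesi.activate();
        # Process includes in the current thread (serial-like) only
        # when no worker thread is immediately available.
        pesi.set(thread, false);
    }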
LIMITATIONS
===========
@@ -773,6 +779,11 @@ See `INSTALL.rst <INSTALL.rst>`_ in the source repository.
SEE ALSO
========
.. |pesi.activate()| replace:: ``pesi.activate()``
.. |pesi.set()| replace:: ``pesi.set()``
.. |pesi.workspace_prealloc()| replace:: ``pesi.workspace_prealloc()``
.. |pesi.pool()| replace:: ``pesi.pool()``
.. _Content composition with Edge Side Includes: https://varnish-cache.org/docs/trunk/users-guide/esi.html
* `varnishd(1)`_