Varnish Module for matching strings associated with backends, regexen and other strings
..
.. NB: This file is machine generated, DO NOT EDIT!
..
.. Edit ./vmod_selector.vcc and run make instead
..
.. role:: ref(emphasis)
=============
vmod_selector
=============
--------------------------------------------------------------------------------------------------
Varnish Module for matching fixed strings, and mapping strings to backends, regexen and other data
--------------------------------------------------------------------------------------------------
:Manual section: 3
SYNOPSIS
========
::
import selector;
# Set creation
new <obj> = selector.set([BOOL case_sensitive]
[, BOOL allow_overlaps])
VOID <obj>.add(STRING [, STRING string] [, REGEX regex]
[, BACKEND backend] [, INT integer] [, BOOL bool]
[, SUB sub])
VOID <obj>.create_stats()
# Matching
BOOL <obj>.match(STRING)
BOOL <obj>.hasprefix(STRING)
# Match properties
INT <obj>.nmatches()
BOOL <obj>.matched([INT n] [, STRING element] [, ENUM select])
INT <obj>.which([ENUM select] [, STRING element])
BOOL <obj>.check_call([INT n] [, STRING element] [, ENUM select])
# Retrieving objects by index, by string, or after match
STRING <obj>.element([INT n] [, ENUM select])
STRING <obj>.string([INT n] [, STRING element] [, ENUM select])
BACKEND <obj>.backend([INT n] [, STRING element] [, ENUM select])
INT <obj>.integer([INT n] [, STRING element] [, ENUM select])
BOOL <obj>.bool([INT n] [, STRING element] [, ENUM select])
BOOL <obj>.re_match(STRING [, INT n] [, STRING element]
[, ENUM select])
STRING <obj>.sub(STRING text, STRING rewrite [, BOOL all] [, INT n]
[, STRING element] [, ENUM select])
SUB <obj>.subroutine([INT n] [, STRING element] [, ENUM select])
# VMOD version
STRING selector.version()
DESCRIPTION
===========
.. _VMOD re2: https://code.uplex.de/uplex-varnish/libvmod-re2
Varnish Module (VMOD) for matching strings against sets of fixed
strings. A VMOD object may also function as an associative array,
mapping the matched string to one or more of a backend, another
string, an integer, a boolean, or a regular expression. The string may
also map to a subroutine that can be invoked with ``call``.
The VMOD is intended to support a variety of use cases that are
typical for VCL deployments, such as:
* Determining the backend based on the Host header or the prefix of
the URL.
* Rewriting the URL or a header.
* Generating redirect responses, based on a header or the URL.
* Permitting or rejecting request methods.
* Matching the Basic Authentication credentials in an Authorization
request header.
* Matching media types in the Content-Type header of a backend
response to determine if the content is compressible.
* Accessing data by string match, as in an associative array, or by
numeric index, as in a standard array.
* Dispatching subroutine calls based on string matches.
* Executing conditional logic that depends on features of the request
or response that can be determined by matching headers or URLs.
Operations such as these are commonly implemented in native VCL with
an ``if-elsif-elsif`` sequence of string comparisons or regex matches.
As the number of matches increases, such a sequence becomes cumbersome
and scales poorly -- the time needed to execute the sequence increases
with the number of matches to be performed.
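For comparison, the kind of native VCL sequence meant here might look
like the following sketch. The backend names are hypothetical and
assumed to be declared elsewhere; each additional prefix adds another
branch that may have to be evaluated::

  sub vcl_recv {
      # Each branch is tested in turn until one matches, so the cost
      # grows with the number of branches.
      if (req.url ~ "^/foo/") {
          set req.backend_hint = foo_backend;
      } elsif (req.url ~ "^/bar/") {
          set req.backend_hint = bar_backend;
      } elsif (req.url ~ "^/baz/") {
          set req.backend_hint = baz_backend;
      }
  }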
With the VMOD, the strings to be matched are declared in a tabular
form in ``vcl_init``, and the operation is executed in a few
lines. For example::
import selector;
# Assume that you have defined these subroutines to execute logic
# in vcl_recv for URLs beginning with /foo/, /bar/ or /baz/.
sub foo { # ...
}
sub bar { # ...
}
sub baz { # ...
}
sub vcl_init {
# Requests for URLs with these prefixes will be sent to the
# associated backend. In vcl_recv, the associated subroutine
# will be called.
new url_prefix = selector.set();
url_prefix.add("/foo/", backend=foo_backend, sub=foo);
url_prefix.add("/bar/", backend=bar_backend, sub=bar);
url_prefix.add("/baz/", backend=baz_backend, sub=baz);
# For requests with these Host headers, generate a redirect
# response, using the associated string to construct the
# Location header, and the integer to set the response code.
new redirect = selector.set();
redirect.add("www.foo.com", string="/foo", integer=301);
redirect.add("www.bar.com", string="/bar", integer=302);
redirect.add("www.baz.com", string="/baz", integer=303);
redirect.add("www.quux.com", string="/quux", integer=307);
# Requests for these URLs are rewritten by altering the
# query string, using the associated regex for a
# substitution operation, each of which removes a
# parameter.
new rewrite = selector.set();
rewrite.add("/alpha/beta", regex="(\?.*)\bfoo=[^&]+&?(.*)$");
rewrite.add("/delta/gamma", regex="(\?.*)\bbar=[^&]+&?(.*)$");
rewrite.add("/epsilon/zeta", regex="(\?.*)\bbaz=[^&]+&?(.*)$");
}
sub vcl_recv {
# .match() returns true if the Host header exactly matches
# one of the strings in the set.
if (redirect.match(req.http.Host)) {
# .string() returns the string added to the set above with
# the 'string' parameter, for the string that was
# matched. We use it to construct a Location header, which
# will be retrieved in vcl_synth below to construct the
# redirect response.
#
# .integer() returns the integer added to the set with the
# 'integer' parameter, for the string that was matched. We
# use it as the argument of synth() to set the response
# status (one of the redirect status codes).
set req.http.Location
= "http://other.com" + redirect.string() + req.url;
return (synth(redirect.integer()));
}
# If the URL matches the rewrite set, change the query string by
# applying a substitution using the associated regex (removing a
# query parameter).
if (rewrite.match(req.url)) {
set req.url = rewrite.sub(req.url, "\1\2");
}
# If the URL has a prefix in the url_prefix set, call the
# associated subroutine.
if (url_prefix.hasprefix(req.url)) {
call url_prefix.subroutine();
}
}
sub vcl_synth {
# We come here when Host matched the redirect set in vcl_recv
# above. Set the Location response header from the request header
# set in vcl_recv.
if (req.http.Location && resp.status >= 301 && resp.status <= 307) {
set resp.http.Location = req.http.Location;
return (deliver);
}
}
sub vcl_backend_fetch {
# The .hasprefix() method returns true if the URL has a prefix
# in the set.
if (url_prefix.hasprefix(bereq.url)) {
# .backend() returns the backend associated with the
# string in the set that was matched as a prefix.
set bereq.backend = url_prefix.backend();
}
}
Matches with the ``.match()`` and ``.hasprefix()`` methods scale well
as the number of strings in the set increases. Experience has shown
that both operations are predictable and fast for large sets of
strings.
When new strings are added to a set (with new ``.add()`` statements in
``vcl_init``), the VCL code that executes the various operations
(rewrites, backend assignment and so forth) can remain unchanged. So
the VMOD can contribute to better code maintainability.
Matches with ``.match()`` and ``.hasprefix()`` are fixed string
matches; characters such as wildcards and regex metacharacters are
matched literally, and have no special meaning. Regex operations
such as matching or substitution can be performed after set matches,
using the regex saved with the ``regex`` parameter. But if you need
to match against sets of patterns, consider using the set interface
of `VMOD re2`_, which provides techniques similar to the present VMOD.
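A brief sketch of this distinction, using a hypothetical set named
``paths`` and an illustrative URL layout: the dot in the added string
is compared literally by ``.hasprefix()``, and the saved regex is only
evaluated when ``.re_match()`` is called::

  sub vcl_init {
      new paths = selector.set();
      # The dot is matched literally: this prefix matches /v1.0/users
      # but not /v1x0/users.
      paths.add("/v1.0/", regex="^/v1\.0/users/[0-9]+$");
  }

  sub vcl_recv {
      # First the fixed prefix match, then the regex saved for the
      # matched element.
      if (paths.hasprefix(req.url) && paths.re_match(req.url)) {
          # ...
      }
  }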
The limited expressiveness of strings to be matched means that this
VMOD can implement fast algorithms. While regexen and a VMOD like re2
can be used to match fixed strings and prefixes, the matching
operations of VMOD selector are orders of magnitude faster. That in
turn contributes to scalability by consuming less CPU time for
matches. So if your use case allows matches against strings without
patterns, prefer the use of this VMOD.
Selecting matched elements of a set
-----------------------------------
The ``.match()`` operation is an exact, fixed string match, and hence
always matches exactly one string in the set if it succeeds. With
``.hasprefix()``, more than one string in the set may be matched, if
the set includes strings that are prefixes of other strings in the
same set::
sub vcl_init {
new myset = selector.set();
myset.add("/foo/"); # element 1
myset.add("/foo/bar/"); # element 2
myset.add("/foo/bar/baz/"); # element 3
}
sub vcl_recv {
# With .hasprefix(), a URL such as /foo/bar/baz/quux matches all
# 3 elements in the set.
if (myset.hasprefix(req.url)) {
# ...
}
}
Just calling ``.hasprefix()`` may be sufficient if all that matters is
whether a string has any prefix that appears in the set. But for some
uses it may be necessary to identify one matching element of the set;
this is done in particular for the methods that retrieve data
associated with a specific set element. For such cases, the method
parameters ``INT n``, ``STRING element`` and ``ENUM select`` are used
to choose a matched element.
As indicated in the example, elements of a set are implicitly numbered
in the order in which they were added to the set using the ``.add()``
method, starting from 1. In all of the following, the ``n``,
``element`` and ``select`` parameters for a method call are evaluated
as follows:
* If ``n`` >= 1, then the ``n``-th element of the set is chosen, and
the ``element`` and ``select`` parameters have no effect. A method
with ``n`` >= 1 can be called in any context, and does not depend on
prior match operations. This is essentially a lookup by index.
* If ``n`` is greater than the number of elements in the set, the
method invokes VCL failure (see `ERRORS`_).
* If ``n`` <= 0 and the ``element`` parameter is set, then the VMOD
searches for the string specified by ``element``, in the same way
that the ``.match()`` method is executed. This is in essence a
lookup in an associative array.
If ``element`` is set but the lookup fails, that is if there is no
such element in the set, then VCL failure is invoked, with the
string "no such element" in the ``VCL_Error`` log message.
If the lookup for the ``element`` succeeds, then the successful
match establishes a match context for subsequent code. That means
that the rules presently described can be applied again, as if
``.match()`` had returned ``true`` for the ``element`` (internally,
that is in fact what happens).
The internal match against ``element`` is case sensitive if and only
if the ``case_sensitive`` flag was ``true`` in the set constructor
(this is the default).
``n`` is 0 by default, so it can be left out of the method call when
``element`` is set.
* If ``n`` <= 0 and ``element`` is unset, then the ``select``
parameter is used to choose an element based on the most recent
``.match()`` or ``.hasprefix()`` call for the same set object in the
same task scope; that is, the most recent call in the same client or
backend context. Thus a method call in one of the ``vcl_backend_*``
subroutines refers back to the most recent ``.match()`` or
``.hasprefix()`` invocation in the same backend context.
By default, ``n`` is 0 and ``element`` is unset, so both of them can
be left out of the call to use ``select``.
* If ``n`` <= 0 and ``element`` is unset, and neither ``.match()``
nor ``.hasprefix()`` has been called for the same set object in the
same task scope, or if the most recent call resulted in a failed
match, then the method invokes VCL failure.
* When ``n`` <= 0 and ``element`` is unset after a successful
``.match()`` call, then for any value of ``select``, the element
chosen is the one that matched.
* When ``n`` <= 0 and ``element`` is unset after a successful
``.hasprefix()`` call, then the value of ``select`` determines the
element chosen, as follows:
* ``UNIQUE`` (default): if exactly one element of the set matched,
choose that element. The method invokes VCL failure in this case
if more than one element matched.
Since the defaults for ``n`` and ``select`` are 0 and ``UNIQUE``,
and ``element`` is unset by default, ``select=UNIQUE`` is in
effect if all three parameters are left out of the method call.
* ``EXACT``: if one of the elements in the set matched exactly (even
if other prefixes in the set matched as well), choose that
element. VCL failure is invoked if there was no exact match.
Thus if a prefix match for ``/foo/bar`` is run against a set
containing ``/foo`` and ``/foo/bar``, the latter element is chosen
with ``select=EXACT``.
* ``FIRST``: choose the first element in the set that matched
(in the order in which they were added with ``.add()``).
* ``LAST``: choose the last element in the set that matched.
* ``SHORTEST``: choose the shortest element in the set that matched.
* ``LONGEST``: choose the longest element in the set that matched.
So for sets of strings with common prefixes, a strategy for selecting
the matched element after a prefix match can be implemented by
ordering the strings added to the set, by choosing only an exact match
or the longest match, and so on::
# In this example, we set the backend for a fetch based on the most
# specific matching prefix of the URL, i.e. the longest prefix in
# the URL that appears in the set.
sub vcl_init {
new myset = selector.set();
myset.add("/foo/", backend=foo_backend);
myset.add("/foo/bar/", backend=bar_backend);
myset.add("/foo/bar/baz/", backend=baz_backend);
}
sub vcl_backend_fetch {
if (myset.hasprefix(bereq.url)) {
set bereq.backend = myset.backend(select=LONGEST);
}
}
# This sets baz_backend for /foo/bar/baz/quux
# bar_backend for /foo/bar/quux
# foo_backend for /foo/quux
To re-state the rules more informally:
* Use only one of ``n``, ``element`` or ``select`` to select a string
in the set.
* If ``n`` > 0, use ``n``. ``n`` = 0 by default.
* Otherwise if ``element`` is set, use ``element``. ``element`` is
unset by default.
* Otherwise use ``select``, default ``UNIQUE``.
* ``n`` is a lookup by numeric index, as implied by the order of
``.add()`` in ``vcl_init``.
* ``element`` is an associative array lookup by string.
* ``select`` refers back to the previous invocation of ``.match()`` or
``.hasprefix()``.
* The value of ``select`` is irrelevant (and can just as well be
left out) if the prior invocation was ``.match()``, or if it was
``.hasprefix()`` and exactly one string was found (which is always
the case if strings in the set have no common
prefixes). ``select`` is meant to pick an element when
``.hasprefix()`` finds more than one string.
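The three ways of choosing an element can be sketched side by side.
The set name ``pages`` and the request header names are hypothetical,
chosen only for illustration::

  sub vcl_init {
      new pages = selector.set();
      pages.add("/a/", string="first");      # element 1
      pages.add("/a/b/", string="second");   # element 2
  }

  sub vcl_recv {
      # Lookup by index; no prior match is needed. Yields "second".
      set req.http.X-By-Index = pages.string(n=2);

      # Lookup by string, as in an associative array. Yields "first".
      set req.http.X-By-Element = pages.string(element="/a/");

      # Selection after a prefix match. For a URL such as /a/b/c both
      # elements match, and select=LONGEST picks "/a/b/", so this
      # yields "second".
      if (pages.hasprefix(req.url)) {
          set req.http.X-By-Select = pages.string(select=LONGEST);
      }
  }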
.. _selector.set():
new xset = selector.set(BOOL case_sensitive, BOOL allow_overlaps)
-----------------------------------------------------------------
::
new xset = selector.set(
BOOL case_sensitive=1,
BOOL allow_overlaps=1
)
Create a set object.
When ``case_sensitive`` is ``false``, matches using the ``.match()``
and ``.hasprefix()`` methods are case-insensitive. By default,
``case_sensitive`` is ``true``.
When ``allow_overlaps`` is ``false``, the VCL load fails if any string
added to the set is a prefix of another string in the set. This can be
used to ensure that methods using the ``select=UNIQUE`` enum will
always succeed after ``.hasprefix()`` matches (and to fail fast if
the restriction is not met). By default, ``allow_overlaps`` is
``true``.
The initialization of a set is completed when ``vcl_init`` finishes,
or when the deprecated ``.compile()`` method is called. This prepares
the set for use with the strings added with the ``.add()`` method
described below. The VCL load fails if:
* The same string is added to the same set more than once (that string
is included in the error message).
* The set contains a string that is a prefix of another string in the
same set, but ``allow_overlaps`` was set to ``false`` in the
constructor.
Set initialization may also fail due to conditions such as out of
memory.
If no strings were added to the set before ``vcl_init`` finishes or
``.compile()`` is invoked, the VCL load will not fail, but all match
operations on the set will fail. In that case, a warning is emitted to
the log with the ``VCL_Error`` tag. Since that happens outside of any
request/response transaction, the error message can only be seen when
a tool like ``varnishlog(1)`` is used with raw grouping (``-g raw``).
Examples::
sub vcl_init {
# By default, matches are case-sensitive, and overlapping
# prefixes are permitted.
new myset = selector.set();
# ...
# For case-insensitive matching.
new caseless = selector.set(case_sensitive=false);
# ...
# Forbid overlapping prefixes.
new allunique = selector.set(allow_overlaps=false);
# ...
}
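As a further sketch of ``allow_overlaps``, the hypothetical set below
forbids overlapping prefixes, so that the default ``select=UNIQUE``
cannot fail after a successful ``.hasprefix()``. The backend names are
assumptions::

  sub vcl_init {
      new strict = selector.set(allow_overlaps=false);
      strict.add("/api/", backend=api_backend);
      strict.add("/static/", backend=static_backend);
      # Adding "/api/v2/" here as well would make the VCL load fail,
      # since "/api/" is a prefix of it.
  }

  sub vcl_backend_fetch {
      if (strict.hasprefix(bereq.url)) {
          # At most one prefix can match, so UNIQUE (the default) is
          # safe.
          set bereq.backend = strict.backend();
      }
  }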
.. _xset.add():
VOID xset.add(STRING, [STRING string], [REGEX regex], [BACKEND backend], [INT integer], [BOOL bool], [SUB sub])
---------------------------------------------------------------------------------------------------------------
::
VOID xset.add(
STRING,
[STRING string],
[REGEX regex],
[BACKEND backend],
[INT integer],
[BOOL bool],
[SUB sub]
)
Add the given string to the set. As indicated above, elements added to
the set are implicitly numbered in the order in which they are added
with ``.add()``, starting with 1.
If values are set for any of the following optional parameters, then
those values are associated with this element, and can be retrieved
with the method shown in the second column. The retrieval methods are
documented below.
==================== ===========================
``.add()`` parameter Retrieval methods
-------------------- ---------------------------
``string`` ``.string()``
``regex`` ``.re_match()``, ``.sub()``
``backend`` ``.backend()``
``integer`` ``.integer()``
``bool`` ``.bool()``
``sub`` ``.subroutine()``
==================== ===========================
A regular expression in the ``regex`` parameter is compiled at VCL
load time. If the compile fails, then the VCL load fails with an error
message. Regular expressions are evaluated exactly as native regexen
in VCL.
A VCL subroutine specified by the ``sub`` parameter MUST be defined
*prior* to the definition of ``vcl_init`` in which ``.add()`` is
invoked. The VCL compiler does not support forward definitions for
this purpose.
``.add()`` invokes VCL failure if it is called in any subroutine
besides ``vcl_init``. The VCL load fails if:
* The string to be added is NULL.
* A regular expression in the ``regex`` parameter fails to compile.
* A subroutine specified by the ``sub`` parameter was not defined
previously in the VCL source.
* The deprecated ``.compile()`` method has already been called.
Example::
sub my_quux_sub {
set req.http.Quux = "xyzzy";
}
sub vcl_init {
new myset = selector.set();
myset.add("www.foo.com");
myset.add("www.bar.com", string="/bar");
myset.add("www.baz.com", string="/baz", backend=baz_backend);
myset.add("www.quux.com", string="/quux", backend=quux_backend,
regex="^/quux/([^/]+)/", sub=my_quux_sub);
}
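To sketch how these associations are read back, the following assumes
the set ``myset`` from the example above. Only ``www.quux.com`` was
added with every parameter used here, so the code first checks that
this element was the one matched; retrieving a value that was not set
for an element invokes VCL failure::

  sub vcl_recv {
      if (myset.match(req.http.Host)
          && myset.matched(element="www.quux.com")) {
          # Values associated with the matched element.
          set req.backend_hint = myset.backend();
          if (myset.re_match(req.url)) {
              call myset.subroutine();
          }
      }
  }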
.. _xset.compile():
VOID xset.compile()
-------------------
**This method is deprecated**, and will be removed in a future
version. ``.compile()`` may be omitted, since compilation happens
automatically when ``vcl_init`` finishes.
``.compile()`` compiles the set. This is done after all of the strings
have been added.
``.compile()`` invokes VCL failure if it is called in any subroutine
besides ``vcl_init``. The VCL load may fail for the same reasons
described for set initialization above, or if ``.compile()`` is
invoked more than once.
.. _xset.create_stats():
VOID xset.create_stats()
------------------------
Create statistics counters for this object that are displayed by tools
such as ``varnishstat(1)``. See `STATISTICS`_ for details. It must be
called in ``vcl_init``. No statistics are created for a set object if
``.create_stats()`` is not invoked.
``.create_stats()`` invokes VCL failure if it is called in any VCL
subroutine besides ``vcl_init``.
Example::
sub vcl_init {
new myset = selector.set();
myset.add("foo");
myset.add("bar");
myset.add("baz");
myset.create_stats();
}
.. _xset.match():
BOOL xset.match(STRING)
-----------------------
Returns ``true`` if the given STRING exactly matches one of the
strings in the set. The match is case insensitive if and only if the
parameter ``case_sensitive`` was set to ``false`` in the set
constructor (matches are case sensitive by default).
``.match()`` invokes VCL failure if:
* No strings were added to the set.
* There is insufficient workspace for internal operations.
If the string to be matched is NULL, for example when an unset header
is specified, then ``.match()`` returns ``false``, and a warning is
emitted to the log with the ``Notice`` tag (see `LOGGING`_). This
is because a match against an unset header may or may not have been
intentional.
If you need to distinguish whether or not the header exists when using
``.match()``, you can evaluate the header in boolean context::
if (!myset.match(req.http.Foo)) {
# Either there is no such header in the client request, or
# the header does not match the set.
# ...
}
if (req.http.Foo && !myset.match(req.http.Foo)) {
# The header exists, but does not match the set.
# ...
}
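Because host names are not case sensitive, a set used with the Host
header is a natural candidate for ``case_sensitive=false``. A minimal
sketch, with a hypothetical set name::

  sub vcl_init {
      new hosts = selector.set(case_sensitive=false);
      hosts.add("www.example.com");
  }

  sub vcl_recv {
      # True for "www.example.com", "WWW.Example.COM" and so on, but
      # not for "www.example.com:8080", since the whole string must
      # match.
      if (hosts.match(req.http.Host)) {
          # ...
      }
  }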
.. _xset.hasprefix():
BOOL xset.hasprefix(STRING)
---------------------------
Returns ``true`` if the STRING to be matched has a prefix that is in
the set. The match is case insensitive if ``case_sensitive`` was set
to ``false`` in the constructor.
``.hasprefix()`` invokes VCL failure under the same conditions given
for ``.match()`` above. Like ``.match()``, ``.hasprefix()`` returns
``false`` if the string to be matched is NULL, for example if it is an
unset header, and a ``Notice`` message is emitted to the log (see
`LOGGING`_).
Example::
if (myset.hasprefix(req.url)) {
call do_if_prefix_matched;
}
.. _xset.nmatches():
INT xset.nmatches()
-------------------
Returns the number of elements that were matched by the most recent
successful invocation of ``.match()`` or ``.hasprefix()`` for the same
set object in the same task scope (that is, in the same client or
backend context).
``.nmatches()`` returns 0 after either of ``.match()`` or
``.hasprefix()`` returned ``false``, and it returns 1 after
``.match()`` returned ``true``. After a successful ``.hasprefix()``
call, it returns the number of strings in the set that are prefixes of
the string that was matched.
``.nmatches()`` invokes VCL failure if there was no prior invocation
of ``.match()`` or ``.hasprefix()`` in the same task scope.
Example::
# For a use case that requires a unique prefix match, use
# .nmatches() to ensure that there was exactly one match, and fail
# fast with VCL failure otherwise. std.log() requires "import std;".
if (myset.hasprefix(bereq.url)) {
if (myset.nmatches() != 1) {
std.log(bereq.url + " matched > 1 prefix in the set");
return (fail);
}
set bereq.backend = myset.backend(select=UNIQUE);
}
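The return values described above can also be sketched with a small
set of overlapping prefixes. The set name ``prefixes`` and the header
name are hypothetical::

  sub vcl_init {
      new prefixes = selector.set();
      prefixes.add("/foo/");
      prefixes.add("/foo/bar/");
  }

  sub vcl_recv {
      if (prefixes.hasprefix(req.url)) {
          # 2 for a URL such as /foo/bar/baz, 1 for /foo/quux.
          set req.http.X-Prefix-Count = prefixes.nmatches();
      }
  }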
.. _xset.matched():
BOOL xset.matched(INT n, STRING element, ENUM select)
-----------------------------------------------------
::
BOOL xset.matched(
INT n=0,
STRING element=0,
ENUM {UNIQUE, EXACT, FIRST, LAST, SHORTEST, LONGEST} select=UNIQUE
)
After a successful ``.match()`` or ``.hasprefix()`` call for the same
set object in the same task scope, return ``true`` if the element
indicated by the ``n``, ``element`` and ``select`` parameters was
matched, according to the rules described above.
For example if ``n`` > 0, ``.matched(n)`` returns ``true`` if and only
if the ``n``-th element matched. The numbering corresponds to the
order of ``.add()`` invocations in ``vcl_init`` (starting from 1). The
``select`` and ``element`` parameters are ignored in this case.
If ``n`` <= 0 and ``element`` is set, then ``.matched()`` returns
``true`` if and only if the string specified by ``element`` was
matched in the previous successful ``.match()`` or ``.hasprefix()``
call. If ``element`` is not in the set, then ``.matched()`` does not
invoke VCL failure (this is a deviation from the general rules for
``element``), but ``.matched()`` always returns ``false`` in that
case. Thus ``.matched()`` can always be used with ``element`` to safely
check if a string was previously matched, regardless of whether the
string is in the set.
``n`` defaults to 0, so the ``n`` parameter can be left out if
``element`` is set.
If ``n`` <= 0 and ``element`` is unset, the set element is determined
by the ``select`` enum. In that case, ``.matched()`` returns ``true``
if and only if the element indicated by the enum was matched by the
previous successful match operation. These distinctions are only
relevant if the previous operation was ``.hasprefix()``, and more than
one string was matched due to overlapping prefixes. ``.matched()``
returns ``true`` for all values of ``select`` if the previous
successful operation was ``.match()``.
``n`` defaults to 0 and ``element`` is unset by default, so the ``n``
and ``element`` parameters can be left out if the use of ``select`` is
intended.
If ``n`` <= 0, ``element`` is unset, and ``select`` is ``UNIQUE`` or
``EXACT``, then ``.matched()`` returns ``true`` if the enum's criteria
are met; otherwise it returns ``false``, and does not fail. This can
be used as a safeguard for the methods described below, which invoke
VCL failure if either of these two enums is specified, but its
criteria are not met.
The other enum values (``FIRST``, ``LAST``, ``SHORTEST`` and
``LONGEST``) are included for consistency with the other methods, but
they don't make a relevant distinction. If the prior invocation of
``.match()`` or ``.hasprefix()`` was successful (returned ``true``),
then ``.matched()`` returns ``true`` for each of these, since there is
always an element that meets the criteria.
``.matched()`` always returns ``false`` if the most recent
``.match()`` or ``.hasprefix()`` call returned ``false``.
``.matched()`` invokes VCL failure if:
* The ``n`` parameter is out of range -- greater than the number of
elements in the set.
* There was no prior invocation of ``.match()`` or ``.hasprefix()`` in
the same task scope.
Example::
if (hosts.match(req.http.Host)) {
if (hosts.matched(1)) {
call do_if_the_first_host_element_matched;
}
}
if (url_prefixes.hasprefix(req.url)) {
if (url_prefixes.matched(select=UNIQUE)) {
call do_if_a_unique_url_prefix_was_matched;
}
}
if (url_prefixes.hasprefix(bereq.url)) {
if (url_prefixes.matched(element="/foo/")) {
call do_if_foo_was_matched;
}
}
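The safeguard role of ``.matched()`` for ``EXACT`` might be sketched as
follows, assuming that a backend was added for every element of the
``url_prefixes`` set::

  if (url_prefixes.hasprefix(bereq.url)) {
      if (url_prefixes.matched(select=EXACT)) {
          # The URL is itself an element of the set, so select=EXACT
          # cannot invoke VCL failure here.
          set bereq.backend = url_prefixes.backend(select=EXACT);
      } else {
          # Otherwise fall back to the most specific prefix.
          set bereq.backend = url_prefixes.backend(select=LONGEST);
      }
  }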
.. _xset.which():
INT xset.which(ENUM select, STRING element)
-------------------------------------------
::
INT xset.which(
ENUM {UNIQUE, EXACT, FIRST, LAST, SHORTEST, LONGEST} select=UNIQUE,
STRING element=0
)
Return the index of the element indicated by ``element`` or
``select``. The numbering corresponds to the order of ``.add()`` calls
in ``vcl_init``, starting from 1.
If the ``element`` parameter is set, then return the numeric index for
that string in the set.
If ``element`` is unset, then the index is chosen with the ``select``
parameter, and refers to the previous ``.match()`` or ``.hasprefix()``
call for the same set object in the same task scope, according to the
rules given above. By default, ``select`` is ``UNIQUE``.
If ``element`` is unset, and the most recent ``.match()`` or
``.hasprefix()`` call returned ``false``, return 0.
``.which()`` invokes VCL failure if:
* The choice of ``element`` or ``select`` indicates failure, as
documented above; that is, if ``element`` is a string that is not in
the set, or ``select`` is ``UNIQUE`` or ``EXACT``, but there was no
unique or exact match, respectively.
* There was no prior invocation of ``.match()`` or ``.hasprefix()`` in
the same task scope.
Example::
if (myset.hasprefix(req.url)) {
if (myset.which(select=SHORTEST) > 1) {
call do_if_the_shortest_match_was_not_the_first_element;
}
}
if (myset.which(element=bereq.url) == 1) {
call do_if_the_url_was_the_first_element;
}
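The index can also be recorded, for example to help with debugging. In
this sketch the header name is hypothetical::

  if (myset.hasprefix(req.url)) {
      # Record which element was the longest matching prefix; the
      # value is the position of that element in the order of .add()
      # calls.
      set req.http.X-Matched-Index = myset.which(select=LONGEST);
  }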
.. _xset.element():
STRING xset.element(INT n, ENUM select)
---------------------------------------
::
STRING xset.element(
INT n=0,
ENUM {UNIQUE, EXACT, FIRST, LAST, SHORTEST, LONGEST} select=UNIQUE
)
Returns the element of the set indicated by the ``n`` and ``select``
parameters as described above. Thus if ``n`` >= 1, the ``n``-th
element of the set is returned; otherwise the matched element
indicated by ``select`` is returned after calling ``.match()`` or
``.hasprefix()``.
The string returned is the same as it was added to the set; even if a
prior match was case insensitive, and the matched string differs in
case, the string with the case as added to the set is returned.
``.element()`` invokes VCL failure if the rules for ``n`` and
``select`` indicate failure; that is:
* ``n`` is out of range (greater than the number of elements in the
set)
* ``n`` < 1 and ``select`` fails for ``UNIQUE`` or ``EXACT``
* ``n`` < 1 and there was no prior invocation of ``.match()`` or
``.hasprefix()``.
Example::
if (myset.hasprefix(req.url)) {
# Construct a redirect response for another host, using the
# matching prefix in the request URL as the new URL path.
set resp.http.Location = "http://other.com" + myset.element();
}
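Since the string is returned as it was added, ``.element()`` can be
used to normalize a value after a case-insensitive match. A sketch
with a hypothetical set name::

  sub vcl_init {
      new hosts = selector.set(case_sensitive=false);
      hosts.add("www.example.com");
  }

  sub vcl_recv {
      if (hosts.match(req.http.Host)) {
          # For a request with "Host: WWW.EXAMPLE.COM", this sets the
          # header to "www.example.com", as added to the set.
          set req.http.Host = hosts.element();
      }
  }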
.. _xset.backend():
BACKEND xset.backend(INT n, STRING element, ENUM select)
--------------------------------------------------------
::
BACKEND xset.backend(
INT n=0,
STRING element=0,
ENUM {UNIQUE, EXACT, FIRST, LAST, SHORTEST, LONGEST} select=UNIQUE
)
Returns the backend associated with the element of the set indicated
by ``n``, ``element`` and ``select``, according to the rules given
above; that is, it returns the backend that was set via the
``backend`` parameter in ``.add()``.
``.backend()`` invokes VCL failure if:
* The rules for ``n``, ``element`` and ``select`` indicate failure.
* No backend was set with the ``backend`` parameter in the ``.add()``
call corresponding to the selected element.
Example::
if (myset.hasprefix(bereq.url)) {
# Set the backend associated with the string in the set that
# forms the longest prefix of the URL
set bereq.backend = myset.backend(select=LONGEST);
}
.. _xset.string():
STRING xset.string(INT n, STRING element, ENUM select)
------------------------------------------------------
::
STRING xset.string(
INT n=0,
STRING element=0,
ENUM {UNIQUE, EXACT, FIRST, LAST, SHORTEST, LONGEST} select=UNIQUE
)
Returns the string set by the ``string`` parameter for the element of
the set indicated by ``n``, ``element`` and ``select``, according to
the rules given above.
``.string()`` invokes VCL failure if:
* The rules for ``n``, ``element`` and ``select`` indicate failure.
* No string was set with the ``string`` parameter in ``.add()``.
Example::
# Rewrite the URL if it matches one of the strings in the set.
if (myset.match(req.url)) {
set req.url = myset.string();
}
.. _xset.integer():
INT xset.integer(INT n, STRING element, ENUM select)
----------------------------------------------------
::
INT xset.integer(
INT n=0,
STRING element=0,
ENUM {UNIQUE, EXACT, FIRST, LAST, SHORTEST, LONGEST} select=UNIQUE
)
Returns the integer set by the ``integer`` parameter for the element
of the set indicated by ``n``, ``element`` and ``select``, according
to the rules given above.
``.integer()`` invokes VCL failure if:
* The rules for ``n``, ``element`` and ``select`` indicate failure.
* No integer was set with the ``integer`` parameter in ``.add()``.
Example::
# Send a synthetic response if the URL has a prefix in the set,
# using the response code set in .add().
if (myset.hasprefix(req.url)) {
# Check .nmatches() to ensure that select=UNIQUE can be used
# without risk of VCL failure.
if (myset.nmatches() == 1) {
return( synth(myset.integer(select=UNIQUE)) );
}
}
.. _xset.bool():
BOOL xset.bool(INT n, STRING element, ENUM select)
--------------------------------------------------
::
BOOL xset.bool(
INT n=0,
STRING element=0,
ENUM {UNIQUE, EXACT, FIRST, LAST, SHORTEST, LONGEST} select=UNIQUE
)
Returns the boolean value set by the ``bool`` parameter for the
element of the set indicated by ``n``, ``element`` and ``select``,
according to the rules given above.
``.bool()`` invokes VCL failure if:
* The rules for ``n``, ``element`` and ``select`` indicate failure.
* No boolean was set with the ``bool`` parameter in ``.add()``.
Example::
# Match domains to the Host header, and append "www." where
# necessary.
sub vcl_init {
new domains = selector.set();
domains.add("example.com", bool=true);
domains.add("www.example.net", bool=false);
domains.add("example.org", bool=true);
domains.add("www.example.edu", bool=false);
}
sub vcl_recv {
if (domains.match(req.http.Host)) {
if (domains.bool()) {
set req.http.Host = "www." + req.http.Host;
}
}
}
.. _xset.re_match():
BOOL xset.re_match(STRING subject, INT n, STRING element, ENUM select)
----------------------------------------------------------------------
::
BOOL xset.re_match(
STRING subject,
INT n=0,
STRING element=0,
ENUM {UNIQUE, EXACT, FIRST, LAST, SHORTEST, LONGEST} select=UNIQUE
)
Using the regular expression set by the ``regex`` parameter for the
element of the set indicated by ``n``, ``element`` and ``select``,
return the result of matching the regex against ``subject``. The regex
match is the same operation performed for the native VCL ``~``
operator, see vcl(7).
In other words, this method can be used to perform a second match with
the saved regular expression, after matching a fixed string against
the set.
The regex match is subject to the same conditions imposed for matching
in native VCL; in particular, it may be limited by the varnishd
parameters ``pcre_match_limit`` and ``pcre_match_limit_recursion``
(see varnishd(1)).
``.re_match()`` invokes VCL failure if:
* The rules for ``n``, ``element`` and ``select`` indicate failure.
* No regular expression was set with the ``regex`` parameter in
``.add()``.
The regex match may fail for any of the reasons that cause a native
match to fail. In that case, ``.re_match()`` returns ``false``, and a
log message with tag ``VCL_Error`` is emitted (as for native regex
match failures).
Example::
# If the Host header exactly matches a string in the set, perform a
# regex match against the URL.
if (myset.match(req.http.Host)) {
if (myset.re_match(req.url)) {
call do_if_the_URL_matches_the_regex_for_Host;
}
}
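The example above presupposes a set in which each string has a regex. A minimal sketch of such a set might look as follows; the set name ``hosts``, the host names, the patterns and the header ``X-Known-Path`` are hypothetical.

```vcl
vcl 4.1;

import selector;

# Placeholder backend so that the sketch is self-contained.
backend default none;

sub vcl_init {
    new hosts = selector.set();
    # Each Host value is saved with its own regex for the URL.
    hosts.add("www.example.com", regex="^/(images|css)/");
    hosts.add("api.example.com", regex="^/v[0-9]+/");
}

sub vcl_recv {
    # Match the fixed string first; only then apply the regex saved
    # for the matched element (&& does not evaluate its right side
    # when .match() returns false).
    if (hosts.match(req.http.Host) && hosts.re_match(req.url)) {
        set req.http.X-Known-Path = "true";
    }
}
```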
.. _xset.sub():
STRING xset.sub(STRING str, STRING sub, BOOL all, INT n, STRING element, ENUM select)
-------------------------------------------------------------------------------------
::
STRING xset.sub(
STRING str,
STRING sub,
BOOL all=0,
INT n=0,
STRING element=0,
ENUM {UNIQUE, EXACT, FIRST, LAST, SHORTEST, LONGEST} select=UNIQUE
)
Using the regular expression set by the ``regex`` parameter for the
element of the set indicated by ``n``, ``element`` and ``select``,
return the result of a substitution using ``str`` and ``sub``.
Note that the method name "sub" refers to string substitution. To
retrieve the subroutine set with the ``sub`` parameter in ``.add()``,
use the ``.subroutine()`` method documented below.
If ``all`` is ``false``, then return the result of replacing the first
portion of ``str`` that matches the regex with ``sub``. ``sub`` may
contain backreferences ``\0`` through ``\9``, to include captured
substrings from ``str`` in the substitution. This is the same
operation performed by the native VCL function
``regsub(str, regex, sub)`` (see vcl(7)). By default, ``all`` is false.
If ``all`` is ``true``, return the result of replacing each
non-overlapping portion of ``str`` that matches the regex with ``sub``
(possibly with backreferences). This is the same operation as native
VCL's ``regsuball(str, regex, sub)``.
``.sub()`` invokes VCL failure if:
* The rules for ``n``, ``element`` and ``select`` indicate failure.
* No regular expression was set with the ``regex`` parameter in
``.add()``.
The substitution may fail for any of the reasons that cause native
``regsub()`` or ``regsuball()`` to fail. In that case, ``.sub()``
returns ``str``, and a ``VCL_Error`` message is written to the log, as
for failures of native match substitution functions. As with the
native functions, ``str`` is returned if ``regex`` does not match
``str``.
Example::
# In this example we match the URL prefix, and if a match is found,
# rewrite the URL by exchanging path components as indicated.
sub vcl_init {
new rewrite = selector.set();
rewrite.add("/foo/", regex="^(/foo)/([^/]+)/([^/]+)/");
rewrite.add("/foo/bar/", regex="^(/foo/bar)/([^/]+)/([^/]+)/");
rewrite.add("/foo/bar/baz/", regex="^(/foo/bar/baz)/([^/]+)/([^/]+)/");
}
sub vcl_recv {
if (rewrite.hasprefix(req.url)) {
set req.url = rewrite.sub(req.url, "\1/\3/\2/", select=LAST);
}
}
# /foo/1/2/* is rewritten as /foo/2/1/*
# /foo/bar/1/2/* is rewritten as /foo/bar/2/1/*
# /foo/bar/baz/1/2/* is rewritten as /foo/bar/baz/2/1/*
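For the ``all=true`` case, a sketch along the same lines is shown here. It assumes a set with a single element, so that the default ``select=UNIQUE`` is safe after ``.hasprefix()``. The set name ``squeeze``, the prefix and the pattern are hypothetical.

```vcl
vcl 4.1;

import selector;

# Placeholder backend so that the sketch is self-contained.
backend default none;

sub vcl_init {
    new squeeze = selector.set();
    # The regex matches any run of two or more slashes.
    squeeze.add("/docs/", regex="//+");
}

sub vcl_recv {
    if (squeeze.hasprefix(req.url)) {
        # With all=true every non-overlapping match is replaced, as
        # with native regsuball().
        set req.url = squeeze.sub(req.url, "/", all=true);
    }
}

# /docs//a///b is rewritten as /docs/a/b
```

With ``all=false`` (the default), only the first run of slashes would be replaced.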
.. _xset.subroutine():
SUB xset.subroutine(INT n, STRING element, ENUM select)
-------------------------------------------------------
::
SUB xset.subroutine(
INT n=0,
STRING element=0,
ENUM {UNIQUE, EXACT, FIRST, LAST, SHORTEST, LONGEST} select=UNIQUE
)
Returns the subroutine set by the ``sub`` parameter for the element of
the set indicated by ``n``, ``element`` and ``select``, according to
the rules given above. The subroutine may be invoked with VCL
``call``.
**Note**: you must ensure that the subroutine may be invoked legally in
the context in which it is called. This means that:
* The subroutine may only refer to VCL elements that are legal in the
invocation context. For example, if the subroutine only refers to
headers in ``req.http.*``, then it may be called in ``vcl_recv``,
but not if it refers to any header in ``resp.http.*``. See
``vcl-var(7)`` for the specification of which VCL variables may be
used in which contexts.
* Recursive subroutine calls are not permitted in VCL. The subroutine
invocation may not appear anywhere in its own call stack.
For standard subroutine invocations with ``call``, the VCL compiler
checks these conditions and issues a compile-time error if either one
is violated. This is not possible with invocations using
``.subroutine()``; the error can only be determined at runtime. So it
is advisable to test the use of ``.subroutine()`` carefully before
using it in production. You can use the ``.check_call()`` method
described below to determine if the subroutine call is legal.
``.subroutine()`` invokes VCL failure if:
* The rules for ``n``, ``element`` and ``select`` indicate failure.
* No subroutine was set with the ``sub`` parameter in ``.add()``.
* The subroutine is invoked with ``call``, but the call is not legal
in the invocation context, for the reasons given above.
Example::
# Due to the use of resp.http.*, this subroutine may only be invoked
# in vcl_deliver or vcl_synth, as documented in vcl-var(7). Note
# that subroutine definitions must appear before vcl_init to be
# permitted for the sub parameter in .add().
sub resp_sub {
set resp.http.Call-Me = "but only in deliver or synth";
}
sub vcl_init {
new myset = selector.set();
myset.add("/foo", sub=resp_sub);
myset.add("/foo/bar", sub=some_other_sub);
# ...
}
sub vcl_deliver {
if (myset.hasprefix(req.url)) {
call myset.subroutine(select=LONGEST);
}
}
.. _xset.check_call():
BOOL xset.check_call(INT n, STRING element, ENUM select)
--------------------------------------------------------
::
BOOL xset.check_call(
INT n=0,
STRING element=0,
ENUM {UNIQUE, EXACT, FIRST, LAST, SHORTEST, LONGEST} select=UNIQUE
)
Returns ``true`` iff the subroutine returned by ``.subroutine()`` for
the element of the set indicated by ``n``, ``element`` and ``select``
may be invoked legally in the current context. The conditions for
legal invocation are documented for ``.subroutine()`` above.
``.check_call()`` never invokes VCL failure, but rather returns
``false`` under conditions for which the use of ``.subroutine()``
would invoke VCL failure, as described above. In that case, a message
is emitted to the Varnish log using the ``Notice`` tag (the same
message that would appear with the ``VCL_Error`` tag if the subroutine
were called).
Example::
sub vcl_deliver {
if (myset.hasprefix(req.url)) {
if (myset.check_call(select=LONGEST)) {
call myset.subroutine(select=LONGEST);
}
else {
call do_if_resp_sub_is_illegal;
}
}
}
.. _selector.version():
STRING version()
----------------
Return the version string for this VMOD.
Example::
std.log("Using VMOD selector version: " + selector.version());
STATISTICS
==========
When ``.create_stats()`` is invoked for a set object, statistics are
created that can be viewed with a tool like varnishstat(1).
The stats have the following naming schema::
SELECTOR.<vcl>.<object>.<stat>
... where ``<vcl>`` is the VCL instance name, ``<object>`` is the
object name, and ``<stat>`` is the statistic. So the ``elements`` stat
of the ``myset`` object in the VCL instance ``boot`` is named::
SELECTOR.boot.myset.elements
The statistics describe properties of the set, and their values are
constant, never changing during the lifetime of the VCL instance.
Statistics provided by the VMOD include:
* ``elements``: the number of elements in the set (added via
``.add()``)
* ``setsz``: the total size of the strings in the set -- the sum of
the lengths of all of the strings, including their terminating null
bytes
* ``minlen``: the length of the shortest string in the set
* ``maxlen``: the length of the longest string in the set
The remaining stats refer to properties of a set object's internal
data structures, and depend on the internal implementation. The
implementation may be changed in any new version of the VMOD, and
hence the stats may change. These are described further in an external
document (see `STATISTICS <STATISTICS.md>`_ in the source repository).
The stats for a VCL instance are removed from view when the instance is
set to the cold state, and become visible again when it is set to the warm
state. They are removed permanently when the VCL instance is discarded
(see varnish-cli(7)).
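As a rough illustration of how the stats are enabled, the sketch below creates them for a small set. It assumes that ``.create_stats()`` is called in ``vcl_init`` once all elements have been added; the element strings are hypothetical. The values in the trailing comment are worked out by hand from the stat descriptions in this section, not taken from a real varnishstat run.

```vcl
vcl 4.1;

import selector;

# Placeholder backend so that the sketch is self-contained.
backend default none;

sub vcl_init {
    new myset = selector.set();
    myset.add("/foo");
    myset.add("/foo/bar");
    # Assumption: stats are created after the last call to .add().
    myset.create_stats();
}

# Expected values if the VCL instance is named "boot":
#   SELECTOR.boot.myset.elements   2
#   SELECTOR.boot.myset.setsz      14  (4+1 bytes for "/foo", 8+1 for "/foo/bar")
#   SELECTOR.boot.myset.minlen     4   (length of "/foo")
#   SELECTOR.boot.myset.maxlen     8   (length of "/foo/bar")
```

The stats can then be listed with a field filter, for example ``varnishstat -1 -f 'SELECTOR.*'``.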
ERRORS
======
The method documentation above describes illegal uses for which VCL
failure is invoked. VCL failure has the same results as if
``return(fail)`` is called from a VCL subroutine:
* If the failure occurs in ``vcl_init``, then the VCL load fails with
an error message.
* If the failure occurs in any other subroutine besides ``vcl_synth``,
then a ``VCL_Error`` message is written to the log, and control is
directed immediately to ``vcl_synth``, with ``resp.status`` set to
503 and ``resp.reason`` set to ``"VCL failed"``.
* If the failure occurs in ``vcl_synth``, then ``vcl_synth`` is
aborted, and the response line "503 VCL failed" is sent.
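Since a failure outside of ``vcl_synth`` arrives there with a fixed status and reason, it can be recognized in VCL. The following is only a sketch; the header ``Retry-After`` and its value are an illustrative choice, and the same status and reason result from any VCL failure, not only from those invoked by this VMOD.

```vcl
vcl 4.1;

# Placeholder backend so that the sketch is self-contained.
backend default none;

sub vcl_synth {
    # VCL failure sets resp.status to 503 and resp.reason to
    # "VCL failed" before control reaches vcl_synth.
    if (resp.status == 503 && resp.reason == "VCL failed") {
        set resp.http.Retry-After = "5";
    }
}
```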
VCL failure is meant to "fail fast" on conditions that cannot be
correct, or when resource limitations such as workspace exhaustion
prevent further processing. Depending on your use case, you may be
able to use the VMOD's methods without additional checking and with no
risk of failure. For example, if it is known that none of the strings
in a set have common prefixes, then methods with ``select=UNIQUE`` can
be used safely after calling ``.hasprefix()``.
If you need to check against possible failure conditions:
* If ``.nmatches() == 1``, then ``select=UNIQUE`` can be used safely.
* The ``UNIQUE`` and ``EXACT`` conditions can also be checked with
``.matched(select=UNIQUE)`` and ``.matched(select=EXACT)``.
* The ``allow_overlaps`` flag can be set to ``false`` in the
constructor, to ensure that VCL load fails if a set unintentionally
has strings with common prefixes.
* In most cases, a method invokes VCL failure if the value of the
``element`` parameter is not in the set. But ``element`` can be used
safely with any string in ``.matched()`` to check if a string
matched previously -- ``.matched()`` returns ``false`` if the
``element`` is not in the set.
* The ``.check_call()`` method may be used to avoid VCL failure if a
subroutine call using ``.subroutine()`` would be illegal.
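Some of these checks can be combined as in the following sketch, which uses a set with overlapping prefixes. The set name ``codes``, the paths and the response codes are hypothetical.

```vcl
vcl 4.1;

import selector;

# Placeholder backend so that the sketch is self-contained.
backend default none;

sub vcl_init {
    new codes = selector.set();
    codes.add("/private/", integer=403);
    codes.add("/private/gone/", integer=410);
}

sub vcl_recv {
    if (codes.hasprefix(req.url)) {
        if (codes.matched(select=UNIQUE)) {
            # Exactly one prefix matched, so select=UNIQUE cannot
            # invoke VCL failure here.
            return (synth(codes.integer(select=UNIQUE)));
        }
        # More than one prefix matched; choose one explicitly.
        return (synth(codes.integer(select=LONGEST)));
    }
}
```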
See `LIMITATIONS`_ for considerations if you encounter conditions such
as workspace exhaustion.
LOGGING
=======
Both ``.match()`` and ``.hasprefix()`` return ``false`` when the
string to be matched is NULL, typically because an unset header was
specified. Such usage may be deliberate; you might intend VCL logic to
depend on whether a header either doesn't match or does not exist. But
it may be an error, for example due to misspelling the header name.
When the string to be matched is NULL, the VMOD emits a warning to the
log with the tag ``Notice``, in this format::
vmod_selector: <obj>.<method>(): subject string is NULL
... where ``<obj>`` is the object name and ``<method>`` is either
``match`` or ``hasprefix``.
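If the ``Notice`` message is unwanted because the header is legitimately optional, the match can be guarded by a test for the header, as in this sketch. The set name ``tenants`` and the headers ``X-Tenant`` and ``X-Tenant-Known`` are hypothetical.

```vcl
vcl 4.1;

import selector;

# Placeholder backend so that the sketch is self-contained.
backend default none;

sub vcl_init {
    new tenants = selector.set();
    tenants.add("alpha");
    tenants.add("beta");
}

sub vcl_recv {
    # Test for the header first, so that .match() is never called
    # with a NULL subject.
    if (req.http.X-Tenant) {
        if (tenants.match(req.http.X-Tenant)) {
            set req.http.X-Tenant-Known = "true";
        }
    }
}
```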
If ``.check_call()`` returns ``false``, indicating that the use of
``.subroutine()`` would be illegal in that context, then the VMOD
emits a log message using ``Notice`` in this format::
vmod_selector: <obj>.check_call(): <errmsg>
... where ``<obj>`` is the object name and ``<errmsg>`` is the message
that would have been logged with ``VCL_Error`` if the subroutine were
invoked.
As noted above, VCL failure during request/response transactions
(after successful VCL load) is logged with an error message using the
``VCL_Error`` tag. These messages begin with the prefix ``vmod
selector failure``.
REQUIREMENTS
============
The VMOD requires Varnish since version 6.6.0. See the source
repository for versions of the VMOD that are compatible with released
versions of Varnish.
INSTALLATION
============
See `INSTALL.rst <INSTALL.rst>`_ in the source repository.
LIMITATIONS
===========
The VMOD uses workspace for two purposes:
* Saving task-scoped data about a match with ``.match()`` and
``.hasprefix()``, for use by the methods that retrieve information
about the prior match. This data is stored separately for each
object for which a match is executed.
* A copy of the string to be matched for case insensitive matches (the
copy is set to all one case).
The default workspace sizes are usually more than large enough for
typical usages, but that depends on workspace consumption for other
purposes.
If you find that methods are failing with ``VCL_Error`` messages
indicating "out of space", consider increasing the varnishd parameters
``workspace_client`` and/or ``workspace_backend`` (see varnishd(1)).
Set objects and their internal structures are allocated from the heap,
and hence are only limited by available RAM.
The regex methods ``.re_match()`` and ``.sub()`` use the same internal
mechanisms as native VCL's ``~`` operator and the ``regsub/all()``
functions, and are subject to the same limitations. In particular,
they may be limited by the varnishd parameters ``pcre_match_limit``
and ``pcre_match_limit_recursion``, in which case they emit the same
``VCL_Error`` messages as the native operations. If necessary, adjust
these parameters as advised in varnishd(1).
SEE ALSO
========
* varnishd(1)
* vcl(7)
* vcl-var(7)
* varnishstat(1)
* varnishlog(1)
* varnish-cli(7)
* VMOD source repository: https://code.uplex.de/uplex-varnish/libvmod-selector
* Gitlab mirror: https://gitlab.com/uplex/varnish/libvmod-selector
* `VMOD re2`_: https://code.uplex.de/uplex-varnish/libvmod-re2
COPYRIGHT
=========
::
Copyright (c) 2018 UPLEX Nils Goroll Systemoptimierung
All rights reserved
Author: Geoffrey Simmons <geoffrey.simmons@uplex.de>
See LICENSE