#-
# This document is licensed under the same conditions as the
# libvmod-blobcode project. See LICENSE for details.
#
# Authors: Nils Goroll
#          Geoffrey Simmons
#

$Module blobcode 3 binary-to-text encodings and decodings for the VCL blob type

::

  sub vcl_init {
      # Create blob objects from encodings such as base64 or hex.
      new myblob   = blobcode.blob(BASE64, "Zm9vYmFy");
      new yourblob = blobcode.blob(encoded="666F6F", decoding=HEX);
  }

  sub vcl_deliver {
      # The .get() method retrieves the BLOB from an object.
      set resp.http.MyBlob-As-Hex
          = blobcode.encode(blob=myblob.get(), encoding=HEXLC);

      # The .encode() method efficiently retrieves an encoding.
      set resp.http.YourBlob-As-Base64 = yourblob.encode(BASE64);

      # decode() and encode() functions convert blobs to text and
      # vice versa at runtime.
      set resp.http.Base64-Encoded
          = blobcode.encode(BASE64,
                            blobcode.decode(HEX, req.http.Hex-Encoded));
  }

  sub vcl_recv {
      # transcode() converts from one encoding to another.
      set req.http.Hex-Encoded
          = blobcode.transcode(decoding=BASE64, encoding=HEXUC,
                               encoded="YmF6");

      # transcode() can replace other specific encoding/decoding
      # vmods - e.g. vmod_urlcode
      set req.url = blobcode.transcode(encoded=req.url, decoding=URL);
      set req.http.url_urlcoded
          = blobcode.transcode(encoded=req.url, encoding=URLLC);
  }

DESCRIPTION
===========

This Varnish module (VMOD) supports binary-to-text encodings and
decodings for the VCL data type BLOB, which may contain arbitrary data
of any length. Currently BLOBs may only be used as arguments of VMOD
functions, so this module is meant to facilitate the use of other
VMODs.

ENCODING SCHEMES
================

Encoding schemes are specified by ENUMs in the VMOD's constructor,
methods and functions. Decodings convert a (possibly concatenated)
string into a blob, while encodings convert a blob into a string.

ENUM values for a decoding can be one of:

* ``IDENTITY``
* ``BASE64``
* ``BASE64URL``
* ``BASE64URLNOPAD``
* ``HEX``
* ``URL``

An encoding can be one of:

* ``IDENTITY``
* ``BASE64``
* ``BASE64URL``
* ``BASE64URLNOPAD``
* ``HEXUC``
* ``HEXLC``
* ``URLUC``
* ``URLLC``

Empty strings are decoded into a "null blob" (of length 0), and
conversely a null blob is encoded as the empty string.

IDENTITY
--------

The simplest encoding converts between the BLOB and STRING data
types, leaving the contents byte-identical.

Note that a BLOB may contain a null byte at any position before its
end; if such a BLOB is encoded with IDENTITY, the resulting STRING
will have a null byte at that position. Since VCL strings, like C
strings, are represented with a terminating null byte, the string will
be truncated, appearing to contain less data than the original
blob. For example::

  # Decode from the hex encoding for "foo\0bar".
  # The header will be seen as "foo".
  set resp.http.Trunced-Foo1
      = blobcode.encode(IDENTITY, blobcode.decode(HEX, "666f6f00626172"));

Because IDENTITY is the default encoding and decoding, the above can
also be written as::

  # Decode from the hex encoding for "foo\0bar".
  # The header will be seen as "foo".
  set resp.http.Trunced-Foo2
      = blobcode.encode(blob=blobcode.decode(HEX, "666f6f00626172"));

BASE64*
-------

The base64 encoding schemes use 4 characters to encode 3 bytes. There
are no newlines or maximal line lengths -- whitespace is not
permitted.

The ``BASE64`` encoding uses the alphanumeric characters, ``+`` and
``/``; and encoded strings are padded with the ``=`` character so that
their length is always a multiple of four.

The ``BASE64URL`` encoding also uses the alphanumeric characters, but
``-`` and ``_`` instead of ``+`` and ``/``, so that an encoded string
can be used safely in a URL. This scheme also uses the padding
character ``=``.

The ``BASE64URLNOPAD`` encoding uses the same alphabet as
``BASE64URL``, but leaves out the padding.
Thus the length of an encoding with this scheme is not necessarily a
multiple of four.

HEX*
----

The ``HEX`` decoding converts a hex string, which may contain upper-
or lowercase characters for hex digits ``A`` through ``F``, into a
blob. The ``HEXUC`` or ``HEXLC`` encodings convert a blob into a hex
string with upper- and lowercase digits, respectively. A prefix such
as ``0x`` is not used for any of these schemes.

If a hex string to be decoded has an odd number of digits, it is
decoded as if a ``0`` were prepended to it; that is, the first digit
is interpreted as representing the least significant nibble of the
first byte. For example::

  # The concatenated string is "abcdef0", and is decoded as "0abcdef0".
  set resp.http.First = "abc";
  set resp.http.Second = "def0";
  set resp.http.Hex-Decoded
      = blobcode.encode(HEXLC, blobcode.decode(HEX,
                        resp.http.First + resp.http.Second));

URL*
----

The ``URL`` decoding replaces any ``%<2-hex-digits>`` substrings with
the binary value of the hexadecimal number after the % sign.

The ``URLLC`` and ``URLUC`` encodings implement "percent encoding" as
per RFC3986, the hexadecimal characters A-F being output in lower- and
uppercase, respectively.

$Function BLOB decode(ENUM {IDENTITY, BASE64, BASE64URL, BASE64URLNOPAD,
                            HEX, URL} decoding="IDENTITY",
                      STRING_LIST encoded)

Returns the BLOB derived from the string ``encoded`` according to the
scheme specified by ``decoding``.

``decoding`` defaults to IDENTITY.

Example::

  blobcode.decode(BASE64, "Zm9vYmFyYmF6");

  # same with named parameters
  blobcode.decode(encoded="Zm9vYmFyYmF6", decoding=BASE64);

  # convert string to blob
  blobcode.decode(encoded="foo");

$Function BLOB decode_n(INT n,
                        ENUM {IDENTITY, BASE64, BASE64URL, BASE64URLNOPAD,
                              HEX, URL} decoding="IDENTITY",
                        STRING_LIST encoded)

Same as ``decode()``, but only decode the first ``n`` characters of
the encoded string. If ``n`` is greater than the length of the string,
then return the same result as ``decode()``.
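
Example (a minimal sketch that uses only the functions documented
here; the header name ``First-Four`` is illustrative, not part of the
VMOD)::

  # Only the first 4 characters, "Zm9v", are decoded. Since "Zm9v" is
  # the base64 encoding of "foo", the header should be set to "666f6f".
  set resp.http.First-Four
      = blobcode.encode(HEXLC, blobcode.decode_n(4, BASE64, "Zm9vYmFy"));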

$Function STRING encode(ENUM {IDENTITY, BASE64, BASE64URL, BASE64URLNOPAD,
                              HEXUC, HEXLC, URLUC, URLLC} encoding="IDENTITY",
                        BLOB blob)

Returns a string representation of the BLOB ``blob`` as specified by
``encoding``.

``encoding`` defaults to IDENTITY.

Example::

  set resp.http.encode1
      = blobcode.encode(HEXLC, blobcode.decode(BASE64, "Zm9vYmFyYmF6"));

  # same with named parameters
  set resp.http.encode2
      = blobcode.encode(blob=blobcode.decode(encoded="Zm9vYmFyYmF6",
                                             decoding=BASE64),
                        encoding=HEXLC);

  # convert blob to string
  set resp.http.encode3
      = blobcode.encode(blob=blobcode.decode(encoded="foo"));

$Function STRING transcode(ENUM {IDENTITY, BASE64, BASE64URL, BASE64URLNOPAD,
                                 HEX, URL} decoding="IDENTITY",
                           ENUM {IDENTITY, BASE64, BASE64URL, BASE64URLNOPAD,
                                 HEXUC, HEXLC, URLUC, URLLC}
                                 encoding="IDENTITY",
                           STRING_LIST encoded)

Translates from one encoding to another, by first decoding the string
``encoded`` according to the scheme ``decoding``, and then returning
the encoding of the resulting blob according to the scheme
``encoding``.

``decoding`` and ``encoding`` default to IDENTITY.

Example::

  set resp.http.Hex2Base64-1
      = blobcode.transcode(HEX, BASE64, "666f6f");

  # same with named parameters
  set resp.http.Hex2Base64-2
      = blobcode.transcode(encoded="666f6f", encoding=BASE64, decoding=HEX);

  # replacement for urlcode.decode("foo%20bar")
  set resp.http.urldecoded
      = blobcode.transcode(encoded="foo%20bar", decoding=URL);

  # replacement for urlcode.encode("foo bar")
  set resp.http.urlencoded
      = blobcode.transcode(encoded="foo bar", encoding=URLLC);

$Function STRING transcode_n(INT n,
                             ENUM {IDENTITY, BASE64, BASE64URL,
                                   BASE64URLNOPAD, HEX, URL}
                                   decoding="IDENTITY",
                             ENUM {IDENTITY, BASE64, BASE64URL,
                                   BASE64URLNOPAD, HEXUC, HEXLC, URLUC,
                                   URLLC} encoding="IDENTITY",
                             STRING_LIST encoded)

Same as ``transcode()``, but only from the first ``n`` characters of
the encoded string.

$Function STRING version()

Returns the version string for this VMOD.

Example::

  std.log("Using VMOD blobcode version " + blobcode.version());

$Object blob(ENUM {IDENTITY, BASE64, BASE64URL, BASE64URLNOPAD, HEX,
                   URL} decoding="IDENTITY",
             STRING_LIST encoded)

Creates an object that contains the BLOB derived from the string
``encoded`` according to the scheme ``decoding``.

Example::

  new theblob1 = blobcode.blob(BASE64, "YmxvYg==");

  # same with named arguments
  new theblob2 = blobcode.blob(encoded="YmxvYg==", decoding=BASE64);

  # string as a blob
  new stringblob = blobcode.blob(encoded="bazz");

$Method BLOB .get()

Returns the BLOB created by the constructor.

Example::

  set resp.http.The-Blob1 = blobcode.encode(blob=theblob1.get());

  set resp.http.The-Blob2 = blobcode.encode(blob=theblob2.get());

  set resp.http.The-Stringblob = blobcode.encode(blob=stringblob.get());

$Method STRING .encode(ENUM {IDENTITY, BASE64, BASE64URL, BASE64URLNOPAD,
                             HEXUC, HEXLC, URLUC, URLLC} encoding="IDENTITY")

Returns an encoding of the BLOB created by the constructor, according
to the scheme ``encoding``.

Example::

  # blob as text
  set resp.http.The-Blob = theblob1.encode();

  # blob as base64
  set resp.http.The-Blob-b64 = theblob1.encode(BASE64);

For any ``blob`` object and encoding ``ENC``, encodings via the
``.encode()`` method and the ``encode()`` function are equal::

  # Always true:
  blobcode.encode(ENC, blob.get()) == blob.encode(ENC)

But the object method is more efficient -- the encoding is computed
once and cached (with allocation in heap memory), and the cached
encoding is retrieved on every subsequent call. The ``encode()``
function computes the encoding on every call, allocating space for the
string in Varnish workspaces.

So if the data in a BLOB are fixed at VCL initialization time, so that
its encodings will always be the same, it is better to create a
``blob`` object. The VMOD's functions should be used for data that are
not known until runtime.

ERRORS
======

The encoders and decoders may fail if there is insufficient space to
create the new blob or string.
Decoders may also fail if the encoded string has an illegal format
for the decoding scheme.

If the ``blob`` object constructor fails, then the VCL program will
fail to load, and the VCC compiler will emit an error message.

If any of the VMOD's methods or functions fail at runtime, then an
error message will be written to the Varnish log using the tag
``VCL_Error``. The encoders and decoders return ``NULL`` on failure;
this means that, for example, if the return value were to be assigned
to a header, then the header will not be set.

REQUIREMENTS
============

This version of the VMOD requires Varnish version 5.0.0. (See the
project repository for versions that are compatible with other
versions of Varnish.) Perl 5 is required for the build.

INSTALLATION
============

The VMOD is built against a Varnish installation, and the autotools
use ``pkg-config(1)`` to locate the necessary header files and other
resources. This sequence will install the VMOD::

  > ./autogen.sh      # for builds from the git repo
  > ./configure
  > make
  > make check        # to run unit tests in src/tests/*.vtc
  > make distcheck    # run check and prepare a distribution tarball
  > sudo make install

If you have installed Varnish in a non-standard directory, call
``autogen.sh`` and ``configure`` with the ``PKG_CONFIG_PATH``
environment variable pointing to the appropriate path. For example,
when varnishd configure was called with ``--prefix=$PREFIX``, use::

  > PKG_CONFIG_PATH=${PREFIX}/lib/pkgconfig
  > export PKG_CONFIG_PATH

By default, the vmod ``configure`` script installs the vmod in the
same directory as Varnish, determined via ``pkg-config(1)``. The vmod
installation directory can be overridden by passing the ``VMOD_DIR``
variable to ``configure``.

Other files such as this man-page are installed in the locations
determined by ``configure``, which inherits its default ``--prefix``
setting from Varnish.

For developers
--------------

The build specifies C99 conformance, all compiler warnings turned on,
and all warnings considered errors (compiler options ``-std=c99
-Werror -Wall``).

By default, ``CFLAGS`` is set to ``-g -O2``, so that symbols are
included in the shared library, and optimization is at level
``O2``. To change or disable these options, set ``CFLAGS`` explicitly
before calling ``make`` (it may be set to the empty string).

For development/debugging cycles, the ``configure`` option
``--enable-debugging`` is recommended (off by default). This will turn
off optimizations and function inlining, so that a debugger will step
through the code as expected.

LIMITATIONS
===========

The VMOD allocates memory in various ways for new blobs and
strings. The ``blob`` object and its methods allocate memory from the
heap, and hence they are only limited by available virtual memory.

The ``encode()``, ``decode()`` and ``transcode()`` functions allocate
Varnish workspace. If these functions are failing, as indicated by
"out of space" messages in the Varnish log (with the ``VCL_Error``
tag), then you will need to increase the varnishd parameters
``workspace_client`` and/or ``workspace_backend``.

The ``transcode()`` function also allocates space on the stack for a
temporary BLOB. If this function causes stack overflow, you may need
to increase the stack size for the varnishd process, for example with
``ulimit -s``.

By default, the VMOD is built with the stack protector enabled
(compile option ``-fstack-protector``), but it can be disabled with
the ``configure`` option ``--disable-stack-protector``.

AUTHORS
=======

* Geoffrey Simmons
* Nils Goroll

UPLEX Nils Goroll Systemoptimierung

This VMOD was originally adapted from, and is heavily indebted to, the
digest VMOD by Kristian Lyngstoel.

HISTORY
=======

* version 0.1: initial version
* version 1.0: compatible with Varnish versions since 4.1.2
* version 2.0: compatible with Varnish versions since 5.0.0

SEE ALSO
========

* varnishd(1)
* vcl(7)
* project repository: https://code.uplex.de/uplex-varnish/libvmod-blobcode
* VMOD digest: https://github.com/varnish/libvmod-digest

$Event event