libvmod-blobdigest

Varnish module (VMOD) for digests and hmacs with the VCL data type BLOB

vmod_blobdigest

digests, checksums and hmacs for the VCL blob type

Manual section: 3

SYNOPSIS

import blobdigest [from "path"] ;

new OBJECT = blobdigest.digest(ENUM hash [, BLOB init])
BOOL <obj>.update(BLOB)
BLOB <obj>.final()

BLOB blobdigest.hash(ENUM hash, BLOB msg)

new OBJECT = blobdigest.hmac(ENUM hash, BLOB key)
BLOB <obj>.hmac(BLOB msg)

BLOB blobdigest.hmacf(ENUM hash, BLOB key, BLOB msg)

DESCRIPTION

This Varnish Module (VMOD) generates message digests, keyed-hash message authentication codes (HMACs) and checksums using the VCL data type BLOB, which may contain arbitrary data of any length.

Currently (in Varnish versions through 5.0), BLOBs may only be used in VCL as arguments of VMOD functions, so this VMOD must be used in combination with other VMODs. For example, the blobcode VMOD (see SEE ALSO) may be used to convert BLOBs using binary-to-text encodings, to initialize data for this VMOD and to save its results. The advantage of using BLOBs is that operations on the data are separated from issues such as encoding.

digest object and hash function

The VMOD provides message digests by way of the digest object and the hash function.

The interface for the digest object follows the "init-update-final" idiom for cryptographic hashes. A digest context is initialized in the constructor. The .update() method is used to incrementally add data to be hashed -- calling .update(b1) and then .update(b2) for BLOBs b1 and b2 has the same effect as calling .update(b3), where b3 is the concatenation of b1 and b2. The .final() method finalizes the message digest and returns the result.
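The equivalence of incremental updates and a single update with the concatenated data can be checked outside of Varnish. The following is a minimal sketch in Python that uses the standard hashlib module, not this VMOD; the variable names are illustrative only.

```python
import hashlib

b1 = b"foo"
b2 = b"bar"

# init-update-final with two incremental updates
ctx = hashlib.sha256()       # init
ctx.update(b1)               # update with b1
ctx.update(b2)               # update with b2
incremental = ctx.digest()   # final

# A single update with the concatenation b3 = b1 + b2
b3 = b1 + b2
one_shot = hashlib.sha256(b3).digest()

same_digest = incremental == one_shot
print(same_digest)
```

The same property is what allows a common prefix to be hashed once and then extended with different suffixes.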

If an initial BLOB is provided in the digest constructor, and/or if the .update() method is called one or more times in vcl_init, then the resulting digest context is saved and re-used for all subsequent method invocations on that object. This can be used, for example, to efficiently compute digests for data that always have a common prefix.

If the .final() method is invoked in vcl_init, then the resulting digest is computed and saved, and is retrieved for any subsequent invocation of .final(). This way, a hash can be computed at initialization time and retrieved efficiently at runtime.

Otherwise, when .update() and .final() are invoked outside of vcl_init, their effects on the hash context and result have "task" scope, meaning that they are valid within the current client or backend transaction (client and backend transactions run in different threads in Varnish). For example, when .update() is called in any of the vcl_backend_* subroutines, it affects the digest result in any of the other subroutines in the same backend transaction. This means that if updates and finalizations are performed on the same object in both client and backend transactions, then the results are independent of one another, and need not be the same.

The digest context of an object begins with the state as initialized in vcl_init for each new client and backend transaction. So calls to .update() and .final() outside of vcl_init do not affect the state of the object in any other transaction.
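One way to picture these scoping rules is that every transaction works on its own copy of the context saved in vcl_init. The sketch below models that with Python's hashlib and its copy() method. It is an analogy for the behavior described above, not a description of how the VMOD is implemented, and the function name task_digest is hypothetical.

```python
import hashlib

# State as initialized "in vcl_init": a SHA256 context with the prefix "foo"
saved = hashlib.sha256(b"foo")

def task_digest(suffix: bytes) -> bytes:
    """Model one client or backend transaction."""
    ctx = saved.copy()    # each task starts from the saved state
    ctx.update(suffix)    # the update affects only this task's copy
    return ctx.digest()

client = task_digest(b"bar")    # digest of "foobar"
backend = task_digest(b"baz")   # digest of "foobaz"

# The saved context is unchanged by either task
saved_unchanged = saved.digest() == hashlib.sha256(b"foo").digest()
print(saved_unchanged)
```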

Here are some examples:

import blobdigest;
import blobcode;
import blob;

sub vcl_init {
    # Create a BLOB consisting of the string "foo"
    new foo = blobcode.blob(IDENTITY, "foo");

    # Create a SHA256 context for messages with the prefix "foo"
    new fooprefix = blobdigest.digest(SHA256, foo.get());

    # This has the same effect
    new fooprefix2 = blobdigest.digest(SHA256);
    if (!fooprefix2.update(foo.get())) {
       return(fail);
    }

    # Compute and save the SHA256 hash for "foo"
    new foohash = blobdigest.digest(SHA256, foo.get());
    if (!blob.same(foohash.final(), foo.get())) {
       # This syntax is a workaround to allow .final() to be called,
       # since VCL currently does not allow object methods to be
       # called as procedures.
    }
}

sub vcl_recv {
    # Use the fooprefix object to get the hash of "foobar"
    if (!fooprefix.update(blobcode.decode(IDENTITY, "bar"))) {
       call do_error;
    }
    set req.http.Foobar-Hash = blobcode.encode(BASE64,
                                               fooprefix.final());
}

sub vcl_backend_response {
    # Use the fooprefix object to get the hash of "foobaz".
    # The uses of the object in client and backend contexts have
    # no effect on each other, or on subsequent client or
    # backend transactions.
    if (!fooprefix.update(blobcode.decode(IDENTITY, "baz"))) {
       call do_error;
    }
    set beresp.http.Foobaz-Hash = blobcode.encode(BASE64,
                                                  fooprefix.final());
}

sub vcl_deliver {
    # Retrieve the SHA256 hash of "foo" computed in vcl_init
    set req.http.Foo-Hash = blobcode.encode(BASE64, foohash.final());
}

The hash() function computes the message digest for its argument and returns the result. It is functionally equivalent to using a digest object:

import blobdigest;
import blobcode;

sub vcl_init {
    # Create a SHA256 context
    new sha256 = blobdigest.digest(SHA256);
}

sub vcl_recv {
    # Get the SHA256 hash of "foo"
    set req.http.Foo-Hash-Functional
      = blobcode.encode(BASE64,
                        blobdigest.hash(SHA256,
                                        blobcode.decode(IDENTITY,
                                                        "foo")));

    # Same result using the object interface
    if (!sha256.update(blobcode.decode(IDENTITY, "foo"))) {
       call do_error;
    }
    set req.http.Foo-Hash-Object
      = blobcode.encode(BASE64, sha256.final());
}

The hash() function makes for a somewhat less verbose interface, and hence may be appropriate where no incremental updates are necessary, and performance is less critical. But use of the digest object is more efficient, because the hash context is created once at initialization time and then re-used. The hash() function calls initialization, update and finalization internally, so a new hash context is created on every invocation.

HASH ALGORITHMS

The hash enum in the signatures below can have one of the following values:

  • CRC32 (not for HMACs)
  • MD5
  • SHA1
  • SHA224
  • SHA256
  • SHA384
  • SHA512
  • SHA3_224
  • SHA3_256
  • SHA3_384
  • SHA3_512
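The length of the BLOB returned for each algorithm follows from the algorithm itself. As a rough guide, the sketch below prints the standard output sizes using Python's hashlib. The mapping from the VMOD's enum names to hashlib names is an assumption made for illustration, and the helper digest_size is hypothetical.

```python
import hashlib

# VMOD enum name -> assumed hashlib counterpart
ALGORITHMS = {
    "MD5": "md5",
    "SHA1": "sha1",
    "SHA224": "sha224",
    "SHA256": "sha256",
    "SHA384": "sha384",
    "SHA512": "sha512",
    "SHA3_224": "sha3_224",
    "SHA3_256": "sha3_256",
    "SHA3_384": "sha3_384",
    "SHA3_512": "sha3_512",
}

def digest_size(enum_name: str) -> int:
    """Return the output size in bytes for the given enum name."""
    if enum_name == "CRC32":
        return 4  # CRC32 is a 32-bit checksum, not a hashlib algorithm
    return hashlib.new(ALGORITHMS[enum_name]).digest_size

sizes = {name: digest_size(name) for name in ["CRC32", *ALGORITHMS]}
for name, size in sizes.items():
    print(f"{name}: {size} bytes")
```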

CONTENTS

  • digest(ENUM {CRC32,MD5,SHA1,SHA224,SHA256,SHA384,SHA512,SHA3_224,SHA3_256,SHA3_384,SHA3_512}, BLOB)
  • BLOB hash(ENUM {CRC32,MD5,SHA1,SHA224,SHA256,SHA384,SHA512,SHA3_224,SHA3_256,SHA3_384,SHA3_512}, BLOB)
  • hmac(ENUM {MD5,SHA1,SHA224,SHA256,SHA384,SHA512,SHA3_224,SHA3_256,SHA3_384,SHA3_512}, BLOB)
  • BLOB hmacf(ENUM {MD5,SHA1,SHA224,SHA256,SHA384,SHA512,SHA3_224,SHA3_256,SHA3_384,SHA3_512}, BLOB, BLOB)
  • STRING version()

digest

new OBJ = digest(ENUM {CRC32,MD5,SHA1,SHA224,SHA256,SHA384,SHA512,SHA3_224,SHA3_256,SHA3_384,SHA3_512} hash, BLOB init=0)

Initialize a message digest context for the algorithm hash, and optionally update it with init. If init is left out, then an empty initial context is created.

If an init BLOB is provided, then every message digest computed from this object is the digest of init followed by any BLOBs added by the .update() method.

Example:

import blobdigest;
import blobcode;

sub vcl_init {
    # Create an empty digest context for SHA3_256
    new sha3_256 = blobdigest.digest(SHA3_256);

    # Create a digest context for SHA512, and add "foo"
    # as a "prefix" for all other messages to be hashed.
    new foo = blobcode.blob(IDENTITY, "foo");
    new sha512 = blobdigest.digest(SHA512, foo.get());
}

digest.update

BOOL digest.update(BLOB)

Incrementally add the BLOB to the digest context of this object. Returns true if and only if the operation was successful.

As described above: if a digest object is updated in vcl_init, then the updated context is valid for all subsequent uses of the object. Otherwise, the updated context is valid only for the current task (client or backend transaction).

This method must not be called after .final() has been called for the same object, either in vcl_init or in the current task.

The method fails and returns false if the BLOB is NULL, or if it is called after .final(). If it fails in vcl_init, the VCL load will fail with an error message. If it fails in any other VCL subroutine, an error message is emitted to the Varnish log with the VCL_Error tag, and the message digest context is unchanged.

Example:

import blobdigest;
import blobcode;

sub vcl_init {
    # Create a digest context for SHA512, and add "foo"
    # as a "prefix" for all other messages to be hashed.
    new f = blobcode.blob(IDENTITY, "f");
    new oo = blobcode.blob(IDENTITY, "oo");
    new sha512 = blobdigest.digest(SHA512);
    if (!sha512.update(f.get())) {
       return(fail);
    }
    if (!sha512.update(oo.get())) {
       return(fail);
    }

    # Create an empty digest context for SHA3_256
    new sha3_256 = blobdigest.digest(SHA3_256);
}

sub vcl_recv {
    # Update the SHA3_256 digest in the current client transaction
    if (!sha3_256.update(blobcode.decode(IDENTITY, "bar"))) {
       call do_client_error;
    }
}

sub vcl_backend_fetch {
    # Update the SHA3_256 digest in the current backend transaction
    if (!sha3_256.update(blobcode.decode(IDENTITY, "baz"))) {
       call do_backend_error;
    }
}

digest.final

BLOB digest.final()

Finalize the message digest and return the result.

If .final() is called in vcl_init, then the message digest is computed and saved, and returned for all subsequent calls. If it is called in any other VCL subroutine, then the result is saved on the first call, and returned for all other invocations of .final() in the same task scope (that is, in the same client or backend transaction).

As with .update(), calling .final() outside of vcl_init only affects the state of the object in the current client or backend transaction.
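The rules for .update() and .final() within a single scope can be summarized in a toy model. The class below is a hypothetical Python sketch built on hashlib; it only mirrors the documented return values (false for a NULL BLOB or after finalization, and a saved result for repeated finalization) and says nothing about the VMOD's internals or its logging.

```python
import hashlib

class DigestModel:
    """Toy model of the documented rules; not the VMOD's implementation."""

    def __init__(self, name, init=b""):
        self._ctx = hashlib.new(name, init)
        self._result = None

    def update(self, blob):
        # Fails for a NULL (None) blob, or when called after final()
        if blob is None or self._result is not None:
            return False
        self._ctx.update(blob)
        return True

    def final(self):
        # The result is computed on the first call and saved for later calls
        if self._result is None:
            self._result = self._ctx.digest()
        return self._result

d = DigestModel("sha256", b"foo")
ok_before = d.update(b"bar")   # accepted before final()
first = d.final()
ok_after = d.update(b"baz")    # rejected after final()
second = d.final()             # the saved result, unchanged
print(ok_before, ok_after, first == second)
```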

Example:

import blobdigest;
import blobcode;
import blob;

sub vcl_init {
    # Compute and save the SHA512 hash for "foo"
    new foo = blobcode.blob(IDENTITY, "foo");
    new foohash = blobdigest.digest(SHA512, foo.get());
    if (!blob.same(foohash.final(), foo.get())) {
       # As above, this is a workaround to call final(),
       # which cannot be called as a procedure in VCL.
    }

    # Create an empty digest context for SHA3_256
    new sha3_256 = blobdigest.digest(SHA3_256);
}

sub vcl_recv {
    # Retrieve the hash for "foo" computed in vcl_init
    set req.http.Foo-Hash
      = blobcode.encode(BASE64, foohash.final());

    # Compute the base64-encoded SHA3_256 hash for "bar"
    if (!sha3_256.update(blobcode.decode(IDENTITY, "bar"))) {
       call do_client_error;
    }
    set req.http.Bar-Hash-Base64
      = blobcode.encode(BASE64, sha3_256.final());
}

sub vcl_backend_fetch {
    # Compute the base64-encoded SHA3_256 hash for "baz"
    if (!sha3_256.update(blobcode.decode(IDENTITY, "baz"))) {
       call do_backend_error;
    }
    set bereq.http.Baz-Hash-Base64
      = blobcode.encode(BASE64, sha3_256.final());
}

sub vcl_backend_response {
    # Retrieve the message digest computed in vcl_backend_fetch
    # and get its hex encoding
    set beresp.http.Baz-Hash-Hex
      = blobcode.encode(HEXLC, sha3_256.final());
    set beresp.http.Baz-Hash-Base64 = bereq.http.Baz-Hash-Base64;
}

sub vcl_deliver {
    # Retrieve the message digest computed in vcl_recv and get
    # its hex encoding
    set resp.http.Bar-Hash-Hex
      = blobcode.encode(HEXLC, sha3_256.final());
    set resp.http.Bar-Hash-Base64 = req.http.Bar-Hash-Base64;
    set resp.http.Foo-Hash = req.http.Foo-Hash;
}

# If there was a backend fetch (not a cache hit), then this VCL
# creates the following response headers:
#
# Foo-Hash:        base64-encoded SHA512 hash of "foo"
# Bar-Hash-Base64: base64-encoded SHA3_256 hash of "bar"
# Bar-Hash-Hex:    hex-encoded SHA3_256 hash of "bar"
# Baz-Hash-Base64: base64-encoded SHA3_256 hash of "baz"
# Baz-Hash-Hex:    hex-encoded SHA3_256 hash of "baz"

hash

BLOB hash(ENUM {CRC32,MD5,SHA1,SHA224,SHA256,SHA384,SHA512,SHA3_224,SHA3_256,SHA3_384,SHA3_512} hash, BLOB msg)

Returns the message digest for msg as specified by hash.

Example:

import blobdigest;
import blobcode;

# Decode the base64-encoded string, generate its SHA256 hash,
# and save the hex-encoded result in a header.
set req.http.SHA256
    = blobcode.encode(HEXUC,
                      blobdigest.hash(SHA256,
                                      blobcode.decode(BASE64, "Zm9v")));
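For comparison, the same decode-hash-encode pipeline can be reproduced outside of Varnish. This is a sketch with Python's standard library, assuming that HEXUC denotes uppercase hexadecimal; it is useful mainly for cross-checking a header value produced by VCL.

```python
import base64
import hashlib

msg = base64.b64decode("Zm9v")           # the bytes b"foo"
digest = hashlib.sha256(msg).digest()    # counterpart of hash(SHA256, msg)
header_value = digest.hex().upper()      # counterpart of encode(HEXUC, ...)
print(header_value)
```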

hmac

new OBJ = hmac(ENUM {MD5,SHA1,SHA224,SHA256,SHA384,SHA512,SHA3_224,SHA3_256,SHA3_384,SHA3_512} hash, BLOB key)

Creates an object that generates HMACs based on the digest algorithm hash and the given key.

Example:

import blobdigest;
import blobcode;

# Create a key from the base64-encoded string, and use
# the result to initialize an hmac object.
sub vcl_init {
        new key = blobcode.blob(BASE64, "a2V5");
        new hmac = blobdigest.hmac(SHA256, key.get());
}

hmac.hmac

BLOB hmac.hmac(BLOB msg)

Returns the HMAC for msg based on the key and hash algorithm provided in the constructor.

Example:

# Import a VMOD that tests BLOBs for equality
import blob;

# Check if request header HMAC matches the HMAC for
# the hex-encoded data in request header Msg.
# Respond with 401 Unauthorized if the HMAC doesn't check.
sub vcl_recv {
    if (!blob.equal(blobcode.decode(HEX, req.http.HMAC),
                    hmac.hmac(blobcode.decode(HEX, req.http.Msg)))) {
        return(synth(401));
    }
}
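A client or test harness that needs to produce or check such headers could follow the same steps. The sketch below uses Python's hmac module with the key from the constructor example ("a2V5" is the base64 encoding of "key"). The function name hmac_matches is hypothetical, and the constant-time comparison is a choice made in this sketch, not a statement about how blob.equal compares BLOBs.

```python
import base64
import hashlib
import hmac

key = base64.b64decode("a2V5")   # b"key"

def hmac_matches(msg_hex: str, hmac_hex: str) -> bool:
    """Return True if hmac_hex is the SHA256 HMAC of the hex-encoded msg."""
    try:
        msg = bytes.fromhex(msg_hex)
        claimed = bytes.fromhex(hmac_hex)
    except ValueError:
        return False  # malformed hex is treated as a failed check
    expected = hmac.new(key, msg, hashlib.sha256).digest()
    return hmac.compare_digest(expected, claimed)

msg_hex = b"hello".hex()
good = hmac.new(key, b"hello", hashlib.sha256).hexdigest()
print(hmac_matches(msg_hex, good))
print(hmac_matches(msg_hex, "00" * 32))
```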

hmacf

BLOB hmacf(ENUM {MD5,SHA1,SHA224,SHA256,SHA384,SHA512,SHA3_224,SHA3_256,SHA3_384,SHA3_512} hash, BLOB key, BLOB msg)

Returns the HMAC for msg as specified by hash and the key.

As with the digest object and hash function, the use of the hmac object is likely to be more efficient than the hmacf function, because the internal cryptographic state of the HMAC for a given key is pre-computed in the object constructor. So if the key is fixed and known at initialization time, then you should use the hmac object. The hmacf function should only be used if the key is not known at initialization time, or if, for example, it should be possible to change the key without a VCL reload.

Example:

import blobdigest;
import blobcode;

# Decode the base64-encoded string as a HMAC key, compute the
# SHA512 HMAC of the Msg header, and save the hex-encoded
# result in a header.
set req.http.HMAC
    = blobcode.encode(HEXUC,
                      blobdigest.hmacf(SHA512,
                                       blobcode.decode(BASE64, "Zm9v"),
                                       blobcode.decode(IDENTITY,
                                                       req.http.Msg)));
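The trade-off between a pre-keyed object and a one-shot function can be illustrated with Python's hmac module. This is an analogy only: the two helper functions are hypothetical names, and the sketch makes no claim about the relative cost of the two interfaces in the VMOD.

```python
import hashlib
import hmac

key = b"foo"

# "Object" style: the key is processed once, and the keyed state is
# copied for each message.
keyed = hmac.new(key, digestmod=hashlib.sha512)

def hmac_with_object(msg: bytes) -> bytes:
    h = keyed.copy()
    h.update(msg)
    return h.digest()

# "Function" style: key, algorithm and message are supplied on every call,
# so the key may differ from call to call.
def hmac_with_function(k: bytes, msg: bytes) -> bytes:
    return hmac.new(k, msg, hashlib.sha512).digest()

results_agree = hmac_with_object(b"message") == hmac_with_function(key, b"message")
print(results_agree)
```

Both styles yield the same HMAC for the same key and message; they differ only in when the key is supplied.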

version

STRING version()

Returns the version string for this VMOD.

Example:

std.log("Using VMOD blobdigest version " + blobdigest.version());

REQUIREMENTS

This version of the VMOD requires Varnish version 5.0.0. See branch 4.1 in the source repository for versions that are compatible with Varnish 4.1.

The gcc compiler and Perl 5 are required for the build. For the self-tests invoked by make check, the VMODs blobcode and blob must be installed (see SEE ALSO).

LIMITATIONS

For operations outside of vcl_init, the VMOD allocates memory for BLOBs and other internal structures in Varnish workspace. If its methods or functions fail, as indicated by "out of space" messages in the Varnish log (with the VCL_Error tag), then you will need to increase the varnishd parameters workspace_client and/or workspace_backend.

For operations invoked in vcl_init, the VMOD allocates heap memory, and hence is only limited by available RAM.

INSTALLATION

See INSTALL.rst in the source repository.

AUTHOR

UPLEX Nils Goroll Systemoptimierung

Cryptographic code is adapted from librhash by Aleksey Kravchenko, which is in the public domain (see LICENSE).

CONTRIBUTING

See CONTRIBUTING.rst in the source repository for notes on contributing source code and documentation, raising issues, and for developer guidelines.

SEE ALSO

COPYRIGHT

Copyright (c) 2016 UPLEX Nils Goroll Systemoptimierung
All rights reserved

Author: Geoffrey Simmons <geoffrey.simmons@uplex.de>

See LICENSE