==============
DCS Classifier
==============

--------------------------------------------
Varnish Device Classification Service Module
--------------------------------------------

:Manual section: 3
:Authors: Nils Goroll
:Date: 2017-11-20
:Version: 0.6

SYNOPSIS
========

Command line
------------

::

	<prefix>/bin/dcs

Varnish VMOD (Varnish 3 and later)
----------------------------------

VCL:

::

	import dcs [from "path"] ;

	# typical use
	sub vcl_recv {
		set req.http.x-nb-classified = dcs.type_name(dcs.classify());
		# - or-
		set req.http.X-DeviceClass   = dcs.type_class(dcs.classify());
		# ...
	}

Varnish 2 inline-C
------------------

Varnish start:

::

	varnishd "-pcc_command=\"exec gcc -I<prefix>/share ...\""

VCL:

::

	C{
	#include "dcs_varnish2.c"
	}C

	sub vcl_recv {
	    C{ dcs_varnish2_classify_hdrs(sp); }C
	    ...
	}


DESCRIPTION
===========

This Varnish module provides an efficient implementation of device
detection and classification using the downloadable version of the
Netbiscuits Device Classifier Service (DCS) database, or a
self-provided database. An example database is included.

Netbiscuits Device Classifier Service (DCS) database
----------------------------------------------------

The DCS database is not part of this module and needs to be obtained
from Netbiscuits. Please refer to
http://www.netbiscuits.com/device-detection/ as a starting point. With
sufficient privileges, a classifier token can be created on
https://my.netbiscuits.com/ under Account -> Token Management. See
http://kb.netbiscuits.com/dcs/dcs_ui_tokenmanagement.html for
instructions.

The classifier token is also referred to as DCS_KEY below.

Demo Database file
------------------

For demonstration purposes, we provide a simple database file with
some minimal and incomplete classification information in
`src/dcs_demo.db`. See :ref:using_the_demo_db for details.

Meta Classes
------------

Classification types from the database file can be associated with
meta-classes in the file `src/classes.conf`. Its format is

::

	[classname]
	Typename from the database

Note that the bundled tests need entries from the bundled
classes.conf.

During the build process, `gen_dcs_classifier.pl` emits warnings if
entries are missing from the classes configuration or if entries
remain unused. It may be advisable to update the configuration when
these warnings are seen.

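As a purely illustrative sketch of the format shown above, a
`classes.conf` entry assigning the type name "Mobile Phone" to the
meta-class "smartphone" could look as follows. The pairing is an
assumption made for this example; the actual type names depend on the
database in use.

::

	[smartphone]
	Mobile Phone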

PERFORMANCE
-----------

This module was developed to provide exceptional performance compared
to previous implementations without requiring any changes to the
structure of the database or introducing any changes to the
semantics.

All lookups are uncached and lookup complexity does not depend on the
position of the best match in the dcs database.

To achieve high performance, C code for a custom parser for all tokens
(substrings) from the DCS database is generated. The parser is run to
detect all tokens from the User-Agent, marking potential matches. Of
all matching entries, the one which comes first in the database is
returned.

Example benchmarks on a single i7-4600M core @2.9 GHz max suggest
that detection throughput exceeds 200,000 matches per second, which
corresponds to a latency on the order of 5us (5 microseconds).

INTERFACES
----------

The following use cases are supported:

* Use as a Varnish module (vmod) for Varnish 3 and later
* Use with Varnish 2 as inline-C
* Command line tools

PREREQUISITES
=============

The following tools are required on the system where this module is to
be built.

When building from the source repository:

* `libtool`
* `autoconf` 2.69 or later
* `automake`

For all builds:

* A working C99 compatible compiler. gcc 4 is tested, other gcc
  versions and clang should work

* A working build environment with all standard headers and tools
  (e.g. `make`)

* `perl`

* The following perl modules:

  * `Crypt::RC4`

  * `Digest::MD5`

  * `MIME::Base64`

To download the dcs DB (through the `DCS_ACCOUNT` parameter to `configure`):

* `curl`

To build the documentation:

* `rst2man`

To build the varnish module (VMOD):

* For Varnish 4 and later:

  A full installation of Varnish to be used with the VMOD.

* For Varnish 3:

  The source tree of the Varnish version the VMOD is to be used
  with. Varnish must have been configured and compiled in this source
  tree.

* python2 or higher

BUILDING
========

A minimal build will only provide the command line tools. It requires
the following steps:

1. Generate `configure` and automake files (Only when building from the
   source repository):

::

	sh autogen.sh

2. Run `configure` to use either an existing database file or to
   download one from the Netbiscuits website (standard `configure`
   arguments are also supported). In both cases the classifier token
   from Netbiscuits is required:

2a. Existing database file

::

	sh configure DCS_KEY=<key> DCS_DBFILE=<file>

2b. Download with a https://my.netbiscuits.com/ account

::

	sh configure DCS_KEY=<key> DCS_ACCOUNT=<account-name>

.. _using_the_demo_db:

2c. Use the bundled demo Database.

    We provide a very simple database for demonstration purposes.

::

   sh configure DCS_KEY=demo DCS_DBFILE=dcs_demo.db

2d. Optionally add dbfiles to prepend and append

    To override / amend entries from the main ``DCS_DBFILE``,
    additional files can be specified as ``DCS_DBFILE_PRE`` (to
    prepend / override) and ``DCS_DBFILE_POST`` (to append). These
    files should not be encrypted; otherwise, the same encryption key
    must be used.

::

   sh configure DCS_KEY=demo DCS_DBFILE=dcs_demo.db \
		DCS_DBFILE_PRE=/path/to/pre.db \
		DCS_DBFILE_POST=/path/to/post.db


3. Run `make`

4. Optionally run `make install` to install at the default location
   (normally `/usr/local`) or in the prefix specified by the `--prefix`
   argument to `configure`.

Building the varnish module (vmod) for Varnish 4 and higher
-----------------------------------------------------------

If varnish is installed in a standard directory, generic :ref:BUILDING
as explained above should also build the vmod.

If you have installed Varnish to a non-standard directory, call
``autogen.sh`` and ``configure`` with ``PKG_CONFIG_PATH`` pointing to
the appropriate path. For example, when varnishd configure was called
with ``--prefix=$PREFIX``, use

::

   PKG_CONFIG_PATH=${PREFIX}/lib/pkgconfig
   export PKG_CONFIG_PATH

When building the vmod, an additional

::

	make check

step is recommended to run the bundled `varnishtest` tests.

Building the varnish module (vmod) for Varnish 3
------------------------------------------------

To build the vmod for varnish 3, in addition to the `configure`
arguments given above, the `VARNISHSRC` argument must be used as in

::

	sh configure DCS_KEY=<key> DCS_ACCOUNT=<account-name> \
	   VARNISHSRC=/path/to/your/varnish/source/varnish-cache

Optionally, a custom vmod installation directory can be specified
using `VMOD_DIR=<dir>`


Building for use with Varnish 2
-------------------------------

To install all files required to use the DCS module with Varnish 2
inline-C, use the additional `configure` argument `--enable-varnish2`

With this argument, `make install` will create additional source files
in the `share` directory of the installation prefix.

Optimizing the DCS database
---------------------------

The optional `make fixup` before calling `make` will remove entries
from the DCS database which will never be hit and will reorder entries
which are likely to be unintentionally masked by previous entries.

Use `make clean` before `make fixup` if the code has already been built.


.. _detection_methodology:

DETECTION METHODOLOGY
=====================

The following applies to the `classify()` function of the Varnish Module and Varnish 2
inline-C. The `dcs` command line tool only implements the last step.

* If the `x-wap-profile` header is present, the User-Agent will be
  classified as a mobile phone.

* If the `X-OperaMini-Phone-UA` header is present, the string " opera/
  opera mini/  " gets appended to the `User-Agent` header for
  classification.

* The contents of the headers `X-OperaMini-Phone-UA`,
  `X-Device-User-Agent`, `X-Original-User-Agent` and `X-Goog-Source`
  are appended to the `User-Agent` header for classification.

* The enriched `User-Agent` string is passed to the DCS classifier and
  the matching dcs db entry is returned, or a special db entry named
  "unidentified".

VMOD USAGE - FUNCTIONS
======================

To import the vmod, use

::

	import dcs [from "path"] ;

.. _func_classify:

INT classify()
--------------

Runs the :ref:detection_methodology as described.

The return value is the index of the DCS DB entry.

This vmod function should be used as an argument to one of the
functions described below.

As each invocation runs the classification again,
it should only be used once per request.

Example:

::

	set req.http.x-nb-classified = dcs.type_name(dcs.classify());
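
When more than one property of the same entry is needed, one possible
approach is to keep the entry index in a request header so that
`classify()` is only called once. The following is a minimal sketch,
assuming that the `std` vmod shipped with Varnish is available; the
header name `x-dcs-entry` is hypothetical and not used by this module.

::

	import std;
	import dcs;

	sub vcl_recv {
		# run the classification once and keep the entry index
		# (x-dcs-entry is a hypothetical header name)
		set req.http.x-dcs-entry = dcs.classify();

		# std.integer() converts the stored string back to INT;
		# the fallback 0 is an arbitrary choice for this sketch
		set req.http.x-nb-classified =
		    dcs.type_name(std.integer(req.http.x-dcs-entry, 0));
		set req.http.X-DeviceClass =
		    dcs.type_class(std.integer(req.http.x-dcs-entry, 0));
	}

The helper header can be removed with `unset req.http.x-dcs-entry;`
once it is no longer needed.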

.. _func_entry_key:

STRING entry_key(INT)
---------------------

Returns the key of the dcs db entry whose index is given as the integer argument.

Example:

::

	set req.http.xx-entry-key = dcs.entry_key(dcs.classify());

Might set `xx-entry-key` to something like "android*opera mini/"

.. _func_type_id:

INT type_id(INT)
----------------

Returns the internal type id of the dcs db entry whose index is given as the integer argument.

Example:

::

	set req.http.xx-type-id = dcs.type_id(dcs.classify());

Might set `xx-type-id` to "11"


.. _func_type_name:

STRING type_name(INT)
---------------------

Returns the type name of the dcs db entry whose index is given as the integer argument.

Example:

::

	set req.http.x-nb-classified  = dcs.type_name(dcs.classify());

Might set `x-nb-classified` to "Mobile Phone"

.. _func_type_class:

STRING type_class(INT)
----------------------

Returns the meta class, as defined in `src/classes.conf`, of the dcs
db entry whose index is given as the integer argument.

Example:

::

	set req.http.X-DeviceClass    = dcs.type_class(dcs.classify());

Might set `X-DeviceClass` to "smartphone"
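
If the backend delivers different content per device class, the class
usually has to become part of the cache key. The following is a
minimal sketch for Varnish 4 and later, assuming that the `dcs` vmod
has been imported and that no other part of the VCL already adds the
header to the hash:

::

	sub vcl_recv {
		set req.http.X-DeviceClass = dcs.type_class(dcs.classify());
	}

	sub vcl_hash {
		# add the device class to the cache key; without a return
		# statement, the built-in vcl_hash still runs afterwards
		hash_data(req.http.X-DeviceClass);
	}

With this in place, each URL is cached once per class value rather
than once per distinct `User-Agent` string.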


COMMAND LINE USAGE
==================

dcs
---

This command line tool reads one User-Agent string per input line
until EOF is reached and outputs the respective classification in the
following format:

::

	--
	<input-line lowercase>
	entry id <entry-id> type <type_id> - <type_class> - <type_name>


VARNISH 2 USAGE
===============

To use this module with Varnish 2, the classifier code and some vmod
glue code need to be compiled as inline-C. To do so, the CC command
executed by VCC needs to be modified such that the dcs source code can
be found.

In the following section, `$PREFIX` needs to be replaced by the
installation prefix passed to `configure` using the `--prefix` command
or the default value `/usr/local`:

* Determine the `cc_command` of the running Varnish instance

::

	varnishadm -T ... param.show cc_command

* Add the following to the `cc_command` after the compiler name
  (usually "gcc"):

::

	-I$PREFIX/share

  and add the resulting `cc_command` as a parameter to the Varnish
  start script.

  Example: If `cc_command` is

::

	"exec gcc -std=gnu99  -pthread -fpic -shared -Wl,-x -o %o %s"

  then add the following to the Varnish start parameters:

::

	-pcc_command="exec gcc -I$PREFIX/share -std=gnu99  -pthread -fpic -shared -Wl,-x -o %o %s"

To change a running varnish instance without a restart, `varnishadm`
can be used.

Once the new `cc_command` is active, the following can be used in VCL:

::

	C{
	#include "dcs_varnish2.c"
	}C

	sub vcl_recv {
	    C{
		dcs_varnish2_classify_hdrs(sp);
	    }C

	    # ....
	}

This will make the following headers available in `vcl_recv` after the
call to `dcs_varnish2_classify_hdrs`:

* `req.http.x-nb-classified`
  same as :ref:func_type_name e.g. "Mobile Phone"

* `req.http.X-DeviceClass`
  same as :ref:func_type_class e.g. "desktop"


NOTES
=====

dynamic linking
---------------

The Varnish dcs vmod is always compiled to be self-contained; it does
not link dynamically to `libdcs.so`. This is done for two reasons:

1. To avoid the performance penalty of library calls

2. To avoid version conflicts between `libdcs.so` and the rest of
   `libvmod_dcs.so` which could happen when a `libdcs.so` generated
   from one DCS database is used with `libvmod_dcs.so` generated from
   another

cURL
----

`curl` uses environment variables like `http_proxy`. If they do not pass through
the Makefiles (as with some versions of `make`),
`curl` can be configured using `~/.curlrc`.


ACKNOWLEDGEMENTS
================

Development of this module was sponsored by Deutsche Telekom AG - Products & Innovation

HISTORY
=======

* Version 0.1: Initial version, mostly feature-complete

* Version 0.2: Rename: `x-variant` -> `X-DeviceClass`, `type_mtd` ->
  `type_class`, change return values. `dcs.type_class` will now return one of:

::

        new value       old value
        ---------       ---------
        desktop         dsk
        smartphone      mob
        tablet          tab

* Version 0.3: Class assignments can now be defined in `src/classes.conf`

* Version 0.4: Also support Varnish 4. Take memory from Varnish workspace
  instead of stack for Varnish 4.

* Version 0.5: Also support Varnish 6 (post 5.2)

BUGS
====

None known

SEE ALSO
========

* varnishd(1)
* vcl(7)

COPYRIGHT
=========

* Copyright 2014-2017 UPLEX - Nils Goroll Systemoptimierung

LICENSE
=======

.. include:: LICENSE