- 20 Feb, 2013 2 commits
-
-
Geoff Simmons authored
-
Geoffrey.Simmons@uplex.de authored
-
- 19 Feb, 2013 24 commits
-
-
Geoff Simmons authored
-
Geoff Simmons authored
-
Geoff Simmons authored
-
Geoff Simmons authored
-
Geoff Simmons authored
-
Geoff Simmons authored
-
Geoff Simmons authored
-
Geoff Simmons authored
-
Geoff Simmons authored
(trying to hint the compiler to use a constant bit mask in the hash)
-
Geoff Simmons authored
-
Geoff Simmons authored
-
Geoff Simmons authored
- reworked config params:
  - maxopen.scale can be a power of 2
  - maxdone, maxdata and qlen.goal not powers of 2
-
Geoff Simmons authored
-
Geoff Simmons authored
-
Geoff Simmons authored
-
Geoff Simmons authored
- now logging the varnish instance name
- some code cleanup
-
Geoff Simmons authored
-
Geoff Simmons authored
-
Geoff Simmons authored
- child process in child.c (including hashing code)
- common signal handlers in handler.c
- other code common to parent and child in trackrdrd.h & config.c
-
Geoff Simmons authored
static in trackrdrd.c (part of data.c and all of hash.c)
- replaced the global nworkers with WRK_Running(), since nworkers caused too many dependencies (esp. for unit tests)
-
Geoff Simmons authored
- Stop/NeedWorker now encapsulated by the SPMCQ interface
- spmcq_len not exposed by the SPMCQ interface
-
Nils Goroll authored
major changes
=============

hash/data table
---------------

The hash table is now only used for _OPEN records, and the actual data is stored in a data table. Upon submit, hash entries are cleared and data continues to live in the data table until it gets freed by a worker (or upon submit if it is a NODATA record).

This drastically reduces the hash table load and significantly increases worst-case performance. In particular, the hash table load is now independent of ActiveMQ backend performance (read: stalls).

Preliminary recommendations for table sizing:

* hash table: double max_sessions from varnish, e.g. maxopen.scale = 16 for 64K hash table entries to support >32K sessions (safely and efficiently)

* data table: max(req/s) * max(ActiveMQ stall time), e.g. to survive 8000 req/s with 60 seconds of ActiveMQ stall time, the data table should be >240K in size, so maxdone.scale = 19 (= 512K entries) should be on the safe side and also provide sufficient buffer for temporary load peaks

hash table performance
----------------------

Previously, the maximum number of probes to the hash table was set to the hash table size - which resulted in bad insert performance and even worse lookup performance. Now that the hash table holds only _OPEN records, we can remove this burden and limit the maximum number of probes to a sensible value (10 to start with, configurable as hash_max_probes).

As another consequence, since we don't require 100% capacity on the hash table, we don't need to run an exhaustive search upon insert. Thus, probing has been changed from linear probing to hashed probing (by h2()).
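A minimal sketch of the hashed probing idea under the new probe limit; the entry layout and the h1()/h2() functions here are placeholders for illustration, not the actual trackrdrd code:

    #include <stdio.h>
    #include <string.h>

    #define HASH_SIZE  (1U << 16)   /* e.g. maxopen.scale = 16 => 64K entries */
    #define MAX_PROBES 10           /* cf. hash_max_probes */

    struct entry {
        char xid[32];
        int  occupied;
    };

    static struct entry tbl[HASH_SIZE];

    /* placeholder hash functions for the sketch; trackrdrd has its own */
    static unsigned h1(const char *s) {
        unsigned h = 5381;
        while (*s) h = h * 33 + (unsigned char)*s++;
        return h;
    }

    static unsigned h2(const char *s) {
        unsigned h = 0;
        while (*s) h = h * 131 + (unsigned char)*s++;
        return h | 1;    /* an odd step stays coprime to the power-of-two size */
    }

    /* probe h1, h1+h2, h1+2*h2, ... (mod HASH_SIZE) instead of scanning
       linearly, and give up after MAX_PROBES slots instead of searching
       the whole table */
    static struct entry *hash_insert(const char *xid) {
        unsigned idx = h1(xid) & (HASH_SIZE - 1), step = h2(xid);
        int i;

        for (i = 0; i < MAX_PROBES; i++) {
            if (!tbl[idx].occupied) {
                tbl[idx].occupied = 1;
                strncpy(tbl[idx].xid, xid, sizeof(tbl[idx].xid) - 1);
                return &tbl[idx];
            }
            idx = (idx + step) & (HASH_SIZE - 1);
        }
        return NULL;    /* no free slot within MAX_PROBES: drop (and log) */
    }

    int main(void) {
        printf("%s\n", hash_insert("1234567890") ? "inserted" : "dropped");
        return 0;
    }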
only ever insert on ReqStart - and drop if we can't
---------------------------------------------------

Keeping up with the VSL is essential. Once we fall behind, we are in real trouble:

- If we miss ReqEnd, we will clobber our hash, with drastic effects:
  - hash lookups become inefficient
  - inserts become more likely to fail
  - before we had HASH_Exp (see below), the hash would become useless

- When the VSL writer overtakes our reader, we will see corrupt data and miss _many_ VCL Logs and ReqEnds (as many as can be found in the whole VSL), so, again, our hash and data arrays will get clobbered with incomplete data (which needs to be cleaned up by HASH_Exp).

The latter point is the most relevant: corrupt records are likely to trigger assertions. Thus, keeping up with the VSL needs to be our primary objective.

When the VSL overtakes, we will lose a massive amount of records anyway (and we won't even know how many). As long as we don't stop Varnish when we fall behind, we can't avoid losing records under certain circumstances anyway (for instance, when the backend stalls and the data table runs full), so we should rather drop early, in a controlled manner - and without a drastic performance penalty.

Under this doctrine, it does not make sense to insert records for VSL_Log or ReqEnd, so if an xid can't be found for these tags, the respective events will get dropped (and logged).

performance optimizations
=========================

spmcq reader/writer synchronization
-----------------------------------

Various measures have been implemented to reduce syscall and general function call overhead for reader/writer synchronization on the spmcq. Previously, the writer would issue a pthread_cond_signal to potentially wake up a reader, irrespective of whether or not a reader was actually blocking on the CV.

- Now, the number of waiting readers (workers) is modified inside a lock, but queried first from outside the lock, so if there are no readers waiting, the CV is not signalled.

- The number of running readers is (attempted to be) kept proportional to the queue length for queue lengths between 0 and 2^qlen_goal.scale, to further reduce the number of worker thread block/wakeup transitions under low to average load.
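A minimal sketch of that signalling pattern, with assumed names and without the surrounding queue logic (not the actual trackrdrd code):

    #include <pthread.h>

    static pthread_mutex_t waiter_lock = PTHREAD_MUTEX_INITIALIZER;
    static pthread_cond_t  waiter_cond = PTHREAD_COND_INITIALIZER;
    static int             nwaiters = 0;    /* readers blocked on the CV */

    /* reader (worker): register as a waiter inside the lock, then block */
    static void reader_wait(void) {
        pthread_mutex_lock(&waiter_lock);
        nwaiters++;
        pthread_cond_wait(&waiter_cond, &waiter_lock);
        nwaiters--;
        pthread_mutex_unlock(&waiter_lock);
    }

    /* writer (VSL reader thread): query the waiter count outside the lock
       first, so the common case (no reader blocked) costs neither a mutex
       operation nor a signal; only lock and signal when someone may be
       waiting */
    static void writer_notify(void) {
        if (nwaiters == 0)
            return;
        pthread_mutex_lock(&waiter_lock);
        if (nwaiters > 0)
            pthread_cond_signal(&waiter_cond);
        pthread_mutex_unlock(&waiter_lock);
    }

    int main(void) {
        (void)reader_wait;    /* a worker would call this in its run loop */
        writer_notify();      /* nothing waiting here, so no lock, no signal */
        return 0;
    }

In a real run loop the reader would re-check the queue while holding the lock before calling pthread_cond_wait(), so that an enqueue happening between the unlocked check and the wait is not missed.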
pthread_mutex / pthread_condvar attributes
------------------------------------------

Attributes are now being used to allow the O/S implementation to choose more efficient low-level synchronization primitives, because we know that we are using these only within one multi-threaded process.

data table freelist
-------------------

To allow for efficient allocation of new data table entries, a free list with local caches is maintained:

- The data writer (VSL reader thread) maintains its own freelist and serves requests from it without any synchronization overhead.

- Only when the data writer's own freelist is exhausted will it access the global freelist (under a lock). It will take the whole list at once and resume serving new records from its own cache.

- Workers also maintain their own freelist of entries to be returned to the global freelist, as long as
  - they are running
  - there are entries on the global list.
  Before a worker thread goes to block on the spmcq condvar, it returns all its freelist entries to the global freelist. Also, it will always check if the global list is empty and return any entries immediately if it is.

stability improvements
======================

record timeouts
---------------

Every hash entry gets added to the insert_list, ordered by insertion time. No more often than every x seconds (currently hard-coded to x=10, with the check only performed when a ReqStart is seen), the list is checked for records which have reached their ttl (configured by hash_ttl, default 120 seconds). These get submitted despite the fact that no ReqEnd has been seen, under the assumption that no ReqEnd is ever to be expected after a certain time has passed.

hash evacuation
---------------

If no free entry is found when probing all possible locations for an insert, the oldest record is evacuated from the hash and submitted to the backend if its live time has exceeded hash_mlt, under the assumption that it is better to submit records early (which are likely to carry useful log information already) than to throw away records. If this behavior is not desired, hash_mlt can be set to hash_ttl.

various code changes
====================

* statistics have been reorganized to separate out hash, data writer/VSL reader, and data reader/worker statistics (the latter partially shared with the writer)

* print the native thread ID for workers (to allow correlation with prstat/top output)

* workers have a new state when blocking on the spmcq CV: WRK_WAITING / "waiting" in monitor output

* because falling behind with VSL reading (the VSL writer overtaking our reader) is so bad, notices are logged whenever the new VSL data pointer is less than the previous one, in other words whenever the VSL ring buffer wraps. This is not the same as detecting the VSL writer overtaking (which would require varnishapi changes), but logging information and some statistics about VSL wraps can (and did) help track strange issues down to VSL overtaking.

config file changes
===================

* The _scale options maxopen.scale, maxdone.scale (new, see below) and maxdata.scale are now being used directly, rather than in addition to a base value of 10 as before. 10 is now the minimum value, and an EINVAL error will get thrown when lower values are used in the config file.

new config options
==================

See trackrdrd.h for documentation in comments:

* maxdone.scale

  Scale for records in _DONE states; determines the size of the data table (which is maxopen + maxdone) and of the spmcq.

* qlen_goal.scale

  Scale for the spmcq queue length goal. All worker threads will be used when the queue length corresponding to the scale is reached. For shorter queue lengths, the number of worker threads is scaled proportionally.

* hash_max_probes

  Maximum number of probes to the hash. Smaller values increase efficiency but reduce the capacity of the hash (more ReqStart records may get lost), and vice versa for higher values.

* hash_ttl

  Maximum time to live for records in the _OPEN state. Entries which are older than this ttl _may_ get expired from the trackrdrd state. This should be set to a value significantly longer than your maximum session lifetime in Varnish.

* hash_mlt

  Minimum lifetime for entries in HASH_OPEN before they can get evacuated. Entries are guaranteed to remain in trackrdrd for this duration. Once the mlt is reached, they _may_ get expired when trackrdrd needs space in the hash.
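For illustration, a sketch of how a *.scale option can be validated and turned into a table size under these rules; the function and variable names are assumed, not taken from config.c:

    #include <errno.h>
    #include <stdio.h>
    #include <stdlib.h>

    #define MIN_TABLE_SCALE 10    /* scales below 10 are rejected */

    /* read a *.scale value and compute the table size: the scale is used
       directly as the exponent (1 << scale), no longer added to a base
       of 10 */
    static int conf_scale(const char *value, unsigned *size)
    {
        char *end;
        long scale = strtol(value, &end, 10);

        if (*end != '\0' || scale < MIN_TABLE_SCALE || scale > 31)
            return EINVAL;
        *size = 1U << scale;
        return 0;
    }

    int main(void)
    {
        unsigned maxopen, maxdone;

        if (conf_scale("16", &maxopen) == 0)             /* maxopen.scale = 16 */
            printf("maxopen: %u entries\n", maxopen);    /* 65536 = 64K */
        if (conf_scale("19", &maxdone) == 0)             /* maxdone.scale = 19 */
            printf("maxdone: %u entries\n", maxdone);    /* 524288 = 512K */
        if (conf_scale("8", &maxdone) == EINVAL)
            printf("scale 8 rejected with EINVAL (minimum is 10)\n");
        return 0;
    }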
-
Geoff Simmons authored
-
Geoff Simmons authored
-
- 17 Dec, 2012 1 commit
-
-
Nils Goroll authored
-
- 07 Dec, 2012 1 commit
-
-
Nils Goroll authored
-
- 06 Dec, 2012 4 commits
-
-
Geoff Simmons authored
fixed cksum in regression test
-
Geoff Simmons authored
-
Geoff Simmons authored
init script start checks if trackrdrd is running, waits for stop
-
Geoff Simmons authored
(caught by slink)
-
- 05 Dec, 2012 1 commit
-
-
Geoff Simmons authored
-
- 04 Dec, 2012 3 commits
-
-
Geoff Simmons authored
-
Geoff Simmons authored
all of them have exited
-
Geoff Simmons authored
-
- 03 Dec, 2012 1 commit
-
-
Geoff Simmons authored
-
- 30 Nov, 2012 3 commits
-
-
Geoff Simmons authored
-
Geoff Simmons authored
-
Geoff Simmons authored
-