Document how to analyze lockups

Motivated by #41

@@ -49,6 +49,62 @@ For a typical installation, the commands would be::

  gdb --batch -ex "thr app all bt full" /usr/sbin/varnishd core

Lockups
-------

To analyze situations where Varnish-Cache hangs in connection with
SLASH/ (e.g. the hang only occurred after introducing SLASH/), we
need to investigate the state of the program at the time of the
hang. There are basically two options: either analyze the
still-running process with ``gdb``, or take a core dump and analyze
it later.

Get thread dump of a running process
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The ``gdb`` invocation is almost the same as in the example given
above, but you need the pid of the Varnish-Cache *worker* process. It
can usually be found by its process name, ``cache-main``, so the
snippet given here uses this simple method::

  cd $YOUR_VARNISH_WORKING_DIRECTORY
  pid=$(pgrep cache-main)
  gdb -p $pid --batch -ex "thr app all bt full"

For a typical installation, the commands would be::

  cd /var/lib/varnish/$(uname -n)
  pid=$(pgrep cache-main)
  gdb -p $pid --batch -ex "thr app all bt full"

If, for whatever reason, this method does not work, look for a
``varnishd`` process which is the *child* of another ``varnishd``
process and use the child's pid for the ``pid`` variable, for example
as sketched below.

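For illustration, one possible way to do this with ``pgrep``,
assuming there is only a single ``varnishd`` management / worker pair
running on the system, would be::

  # the oldest varnishd is the management (parent) process
  ppid=$(pgrep -o varnishd)
  # the cache worker process is its child
  pid=$(pgrep -P $ppid varnishd)
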
Getting a core dump
~~~~~~~~~~~~~~~~~~~

Getting the core dump itself is straightforward: determine the pid of
the cache worker process (see above), then run ``gcore`` on it::

  pid=$(pgrep cache-main)
  gcore -o slash_issue.core $pid

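Note that ``gcore`` appends the pid to the name given with ``-o``, so
the command above will create a file named ``slash_issue.core.<pid>``.
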
But to be able to make good use of the core dump, we also need the
varnish working directory, so please save that, too::

  cp -pr $YOUR_VARNISH_WORKING_DIRECTORY slash_issue.workdir

For a typical installation, this is::

  cp -pr /var/lib/varnish/$(uname -n) slash_issue.workdir

Then, to get the thread dump from this data, use the same command as
given under `Panics`_, but with the saved workdir instead of the
actual one.

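For illustration, a sketch of what this could look like, assuming the
core dump was saved in the directory above the copied workdir and
``<pid>`` is replaced with the actual pid that ``gcore`` appended to
the file name::

  cd slash_issue.workdir
  gdb --batch -ex "thr app all bt full" /usr/sbin/varnishd \
      ../slash_issue.core.<pid>
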
Other
-----