Commit 8dbb9ab2 authored by Nils Goroll

some advice on working with core files and stack overflows

as promised in #3161
parent dfcdd87c
@@ -81,6 +81,24 @@ The crash might be due to misconfiguration or a bug. If you suspect it
is a bug you can use the output in a bug report, see the "Trouble
Tickets" section in the Introduction chapter above.
Varnish is crashing - stack overflows
-------------------------------------
Bugs aside, the most likely cause of a crash is a stack overflow,
which is why we have added a heuristic that adds a note when a crash
looks like it was caused by one. In this case, the panic message
contains something like this::

    Signal 11 (Segmentation fault) received at 0x7f631f1b2f98 si_code 1
    THIS PROBABLY IS A STACK OVERFLOW - check thread_pool_stack parameter

As a first measure, please follow this advice and check whether crashes
still occur when you add 128k to the current value of the
``thread_pool_stack`` parameter and restart Varnish.

If Varnish stops crashing with a larger ``thread_pool_stack``
parameter, it is most likely not a bug.
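As a sketch of the advice above, the new value can be computed and
applied from a shell. This assumes that ``varnishadm`` can reach the
running instance and that the current value is written in kilobytes
with a ``k`` suffix. The helper name ``add_128k`` is made up for this
example, and ``48k`` below is only a sample value, not necessarily
your current setting::

    # Hypothetical helper: add 128k to a size given as "<number>k".
    add_128k() {
        kb=${1%k}                # strip the trailing "k"
        echo "$((kb + 128))k"    # print the enlarged size
    }

To use it, look up the current value with ``varnishadm param.show
thread_pool_stack`` and then run, for example, ``varnishadm param.set
thread_pool_stack "$(add_128k 48k)"``. To keep the setting across
restarts, pass it to varnishd as ``-p thread_pool_stack=<value>``.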
Varnish is crashing - segfaults
-------------------------------
@@ -93,12 +111,66 @@ debug a segfault the developers need you to provide a fair bit of
data.
* Make sure you have Varnish installed with debugging symbols.
* Make sure core dumps are allowed in the parent shell::
ulimit -c unlimited
* Check where your operating system writes core files and ensure that
  you actually get them. For example, on Linux, learn about
  ``/proc/sys/kernel/core_pattern`` from the ``core(5)`` manpage.
* Make sure core dumps are allowed in the parent shell from which
  varnishd is being started. In a shell, this would be::

      ulimit -c unlimited

  but if Varnish is started from an init script, that script would need
  to be adjusted, or in the case of systemd, ``LimitCORE=infinity`` set
  in the ``[Service]`` section of the unit file.
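For systemd, one way to apply the setting from the list above without
editing the packaged unit file is a drop-in. The following is a
minimal sketch: the helper name ``write_core_dropin``, the file name
``core.conf`` and the unit name ``varnish.service`` used in the note
below are assumptions, so adjust them to your installation::

    # Hypothetical helper: write a drop-in that lifts the core size limit.
    # $1 is the drop-in directory of the unit.
    write_core_dropin() {
        mkdir -p "$1" &&
        printf '[Service]\nLimitCORE=infinity\n' > "$1/core.conf"
    }

Assuming the unit is called ``varnish.service``, this could be called
as ``write_core_dropin /etc/systemd/system/varnish.service.d``,
followed by ``systemctl daemon-reload`` and a restart of the service.
On Linux, ``cat /proc/sys/kernel/core_pattern`` then shows where the
kernel will write the core file.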
Once you have the core, ``cd`` into Varnish's working directory (as
given by the ``-n`` parameter, whose default is
``$PREFIX/var/varnish/$HOSTNAME`` with ``$PREFIX`` being the
installation prefix, usually ``/usr/local``), open the core with
``gdb`` and issue the command ``bt`` to get a stack trace of the
thread that caused the segfault.
A basic debug session for varnish installed under ``/usr/local`` could look
like this::

    $ cd /usr/local/var/varnish/`uname -n`/
    $ gdb /usr/local/sbin/varnishd core
    GNU gdb (Debian 7.12-6) 7.12.0.20161007-git
    Copyright (C) 2016 Free Software Foundation, Inc.
    [...]
    Core was generated by `/usr/local/sbin/varnishd -a 127.0.0.1:8080 -b 127.0.0.1:8080'.
    Program terminated with signal SIGABRT, Aborted.
    #0 __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:51
    51 ../sysdeps/unix/sysv/linux/raise.c: No such file or directory.
    [Current thread is 1 (Thread 0x7f7749ea3700 (LWP 31258))]
    (gdb) bt
    #0 __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:51
    #1 0x00007f775132342a in __GI_abort () at abort.c:89
    #2 0x000000000045939f in pan_ic (func=0x7f77439fb811 "VCL", file=0x7f77439fb74c "", line=0,
        cond=0x7f7740098130 "PANIC: deliberately!", kind=VAS_VCL) at cache/cache_panic.c:839
    #3 0x0000000000518cb1 in VAS_Fail (func=0x7f77439fb811 "VCL", file=0x7f77439fb74c "", line=0,
        cond=0x7f7740098130 "PANIC: deliberately!", kind=VAS_VCL) at vas.c:51
    #4 0x00007f77439fa6e9 in vmod_panic (ctx=0x7f7749ea2068, str=0x7f7749ea2018) at vmod_vtc.c:109
    #5 0x00007f77449fa5b8 in VGC_function_vcl_recv (ctx=0x7f7749ea2068) at vgc.c:1957
    #6 0x0000000000491261 in vcl_call_method (wrk=0x7f7749ea2dd0, req=0x7f7740096020, bo=0x0,
        specific=0x0, method=2, func=0x7f77449fa550 <VGC_function_vcl_recv>) at cache/cache_vrt_vcl.c:462
    #7 0x0000000000493025 in VCL_recv_method (vcl=0x7f775083f340, wrk=0x7f7749ea2dd0, req=0x7f7740096020,
        bo=0x0, specific=0x0) at ../../include/tbl/vcl_returns.h:192
    #8 0x0000000000462979 in cnt_recv (wrk=0x7f7749ea2dd0, req=0x7f7740096020) at cache/cache_req_fsm.c:880
    #9 0x0000000000461553 in CNT_Request (req=0x7f7740096020) at ../../include/tbl/steps.h:36
    #10 0x00000000004a7fc6 in HTTP1_Session (wrk=0x7f7749ea2dd0, req=0x7f7740096020)
        at http1/cache_http1_fsm.c:417
    #11 0x00000000004a72c3 in http1_req (wrk=0x7f7749ea2dd0, arg=0x7f7740096020)
        at http1/cache_http1_fsm.c:86
    #12 0x0000000000496bb6 in Pool_Work_Thread (pp=0x7f774980e140, wrk=0x7f7749ea2dd0)
        at cache/cache_wrk.c:406
    #13 0x00000000004963e3 in WRK_Thread (qp=0x7f774980e140, stacksize=57344, thread_workspace=2048)
        at cache/cache_wrk.c:144
    #14 0x000000000049610b in pool_thread (priv=0x7f774880ec80) at cache/cache_wrk.c:439
    #15 0x00007f77516954a4 in start_thread (arg=0x7f7749ea3700) at pthread_create.c:456
    #16 0x00007f77513d7d0f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:97

Once you have the core, open it with ``gdb`` and issue the command ``bt``
to get a stack trace of the thread that caused the segfault.
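If an interactive session is not practical, the backtrace can also be
collected non-interactively with gdb's ``-batch`` and ``-ex`` options,
which is convenient for attaching the output to a bug report. A
minimal sketch follows; the function name ``core_backtrace`` is made
up for this example::

    # Hypothetical helper: print the backtrace of the crashing thread,
    # followed by the backtraces of all threads, from a core file.
    # $1 is the varnishd binary, $2 is the core file.
    core_backtrace() {
        [ -r "$1" ] && [ -r "$2" ] || {
            echo "usage: core_backtrace <varnishd> <core>" >&2
            return 1
        }
        gdb -batch -ex bt -ex 'thread apply all bt' "$1" "$2"
    }

From Varnish's working directory, this could be used as
``core_backtrace /usr/local/sbin/varnishd core > backtrace.txt``.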
Varnish gives me Guru meditation