1. 17 Jan, 2020 3 commits
    • TAKE_OBJ_NOTNULLification: cases with the first AN(objp) missing · 65cd2e05
      Nils Goroll authored
      in these cases, the initial AN(objp) was missing, so we would
      potentially dereference a NULL pointer without an explicit assertion
      failure.
      
      in VCL_Rel(), AN(*vcc) was likely just a typo resulting in a semantic
      noop (the check was a duplication of the NULL check in
      CHECK_OBJ_NOTNULL).
      
      ... more examples of why using the macro is a good idea
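
      For reference, the macro encapsulates roughly the following pattern (an
      approximation of the definition in include/miniobj.h, not the verbatim
      code; AN() and CHECK_OBJ_NOTNULL() are the usual varnish assert macros):

      /* take ownership of *objp, clear the source pointer, assert both */
      #define TAKE_OBJ_NOTNULL(obj, objp, magic)              \
              do {                                            \
                      AN(objp);                               \
                      (obj) = *(objp);                        \
                      *(objp) = NULL;                         \
                      CHECK_OBJ_NOTNULL((obj), (magic));      \
              } while (0)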
    • more TAKE_OBJ_NOTNULLification · 7ce983ec
      Nils Goroll authored
      command:
      
      spatch -I include/ -I bin/varnishd/ --dir . --in-place \
      	--sp-file tools/coccinelle/take_obj_notnull.cocci
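
      The effect of the semantic patch is roughly the following rewrite
      (illustrative C example; the variable names are made up):

      /* before: open-coded take-and-clear */
      AN(reqp);
      req = *reqp;
      *reqp = NULL;
      CHECK_OBJ_NOTNULL(req, REQ_MAGIC);

      /* after: one macro call with the same semantics */
      TAKE_OBJ_NOTNULL(req, reqp, REQ_MAGIC);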
    • Stabilize s10, again · 4fe88185
      Dridi Boukelmoune authored
      The previous stabilization turned out not to work reliably with FreeBSD
      on aarch64 hardware. It was still an improvement overall, but the test
      case was a bit cryptic, and when [idle_]send_timeout support landed in
      VCL it wasn't obvious how to add coverage.
      
      This attempt bites the bullet and defines one pair of client/logexpect
      instances per use case, and (I hope) clearly indicates why it does or
      does not do certain things.
      
      Since we now have two clients that aren't expected to complete before
      the test case itself, the server runs in dispatch mode instead of
      repeating its scenario. Using barriers in dispatch mode should raise a
      red flag for any reviewer, but in this case the barriers outside the
      server are properly serialized and systematically used by every single
      client, and as a result they should be safe.
      
      As usual, personal testing limited to x86_64 Linux and FreeBSD.
  2. 16 Jan, 2020 8 commits
    • s00010: try a slight variation · 55ef8e00
      Nils Goroll authored
      when we stall sending on the backend side, we _have_ to run into a
      timeout on the client side, don't we?
    • even more session overflow test tweaking · f5985938
      Nils Goroll authored
      and enable debug output for better failure analysis
    • even more s00010.vtc tweaking · 66b27ae1
      Nils Goroll authored
      Further lighten the load so that the test hopefully succeeds on the
      FreeBSD vtest machines.

      Also, when the stars align, we do not hit the busy object (streaming)
      and do not see chunked encoding.
    • more numbers fiddling · 3d4bb424
      Nils Goroll authored
      we need a way for vtc to reach into session setup, I guess
    • s00010.vtc: lower pressure from timeout handling · 5b90890f
      Nils Goroll authored
      The FreeBSD vtest boxes cannot complete the c2 response within the 20s
      extended send_timeout, apparently because of too much overhead from
      timeout handling.

      Try to fix this by applying less pressure.
    • 83b238ed
      Nils Goroll authored
    • panic handling can take a couple seconds · e733fce8
      Nils Goroll authored
    • bump default workspace_session · 0f3bb354
      Nils Goroll authored
      02bc0a68 requires 24 bytes more on
      64bit, which brought o00005.vtc just over the top on smartos 64bit
      machines (the vtc tests PROXY TLVs, which get stored on the session
      workspace).
      
      Increase the default workspace_session by 256 bytes (50%) to
      accommodate sane proxy TLV lengths by default.
      
      This is about 10x more than the minimum increase needed to accommodate
      the additional workspace usage. I went for 768 bytes (the new total)
      because

      - this gives us the next least common multiple (12k) with the common 4K
        page size, short of going all the way up to 1k

      - we were already quite tight with the default and could not accommodate
        common proxy TLV use (see for example #3131)
      
      As a ballpark reality check, I do not think that spending an additional
      256MB should be an issue for users supporting 1M sessions.
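
      For reference, the arithmetic behind these numbers (the 50% figure
      implies a previous default of 512 bytes):

      512\,\mathrm{B} + 256\,\mathrm{B} = 768\,\mathrm{B}, \qquad
      \operatorname{lcm}(768, 4096) = 12288\,\mathrm{B} = 12\,\mathrm{KiB}, \qquad
      10^{6} \times 256\,\mathrm{B} = 256\,\mathrm{MB}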
  3. 15 Jan, 2020 19 commits
    • Add vcl control over timeout_linger, send_timeout, idle_send_timeout · 02bc0a68
      Nils Goroll authored
      and test that they work for H1
      
      To @Dridi: I would appreciate it if you could have another look at
      s10.vtc. I *tried* to keep your good ideas intact, and hope you can
      live with these changes.
    • A request body now might have up to two references · 4e6e4eed
      Nils Goroll authored
      since d4b6228e the busyobj might also
      gain one, so depending on which deref happens first, there may be one
      or zero references left.
    • 7a1f7b2a
      Nils Goroll authored
    • simplify as suggested by @Dridi · 8dd0a73f
      Nils Goroll authored
      thank you!
    • Add information about vcl object instances to the panic output · 5df27a08
      Nils Goroll authored
      In the absence of a core dump, the panic output so far contains no
      information about vcl object instances, which would help, for example,
      to find out which object a priv belongs to when the instance address is
      used for per-instance priv state.
      
      To make this information available at the time of a panic, we add the
      following (a rough sketch follows the list):
      
      * A struct vrt_ii (for instance info); VCC fills in a static instance
        of it at compile time with pointers to the C global variables holding
        the object instance pointers
      
      * A pointer to this struct from the VCL_conf to make it available to
        the varnishd worker
      
      * Dumps of the instance info in the panic output
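
      A minimal sketch of the plumbing described above; the names below are
      illustrative, the real declarations live in include/vrt.h and in the
      VCC-generated C code:

      /* one entry per vcl object instance (illustrative field names) */
      struct instance_info {
              const char      *name;  /* the VCL name of the instance */
              const void      **p;    /* address of the C global instance pointer */
      };

      struct vrt_ii {
              unsigned                        magic;
              const struct instance_info      *p;     /* table, NULL-name terminated */
      };

      /*
       * VCC emits a static struct vrt_ii into the generated C source and
       * stores a pointer to it in the VCL_conf; on a panic the worker walks
       * the table and prints a name/address pair for each instance.
       */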
    • Add capability to send the authority TLV in the PROXY header · 99ec7796
      Geoff Simmons authored
      This gives the receiver of the PROXY header (usually the ssl-onloader)
      the opportunity to set the SNI (HostName field) from the TLV value, for
      the TLS handshake with the remote backend.
      
      From
      https://github.com/nigoroll/varnish-cache/commit/e0eb7d0a9c65cdc3c58978656b4c71f4ab8aabca
      edited by @nigoroll to split out the proxy header functionality.
      
      Add vmod_debug access to the proxy header formatting and test it
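
      A rough illustration of what emitting such a TLV involves
      (PP2_TYPE_AUTHORITY is 0x02 per the PROXY protocol v2 spec; the helper
      below is a sketch, not the actual VPX/vmod_debug code):

      #include <stdint.h>
      #include <string.h>

      #define PP2_TYPE_AUTHORITY      0x02

      /*
       * Append an authority (host name) TLV at the end of a PROXYv2 header
       * buffer; the header's overall length field must be grown accordingly.
       * Returns the number of bytes written, or 0 if it does not fit.
       */
      static size_t
      append_authority_tlv(uint8_t *buf, size_t space, const char *host)
      {
              size_t l = strlen(host);

              if (space < 3 + l)
                      return (0);
              buf[0] = PP2_TYPE_AUTHORITY;
              buf[1] = (uint8_t)(l >> 8);     /* TLV length, network order */
              buf[2] = (uint8_t)(l & 0xff);
              memcpy(buf + 3, host, l);
              return (3 + l);
      }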
    • format proxy header on the stack · a74315bc
      Nils Goroll authored
    • wrap VPX_Format_Proxy for VRT · de50fefc
      Nils Goroll authored
    • Add VPX_Format_Proxy · 461a22a4
      Nils Goroll authored
    • split out proxyv2 formatting · 519737ac
      Nils Goroll authored
    • split out proxyv1 formatting · 6a08beba
      Nils Goroll authored
      Note: this partially reverts cf14a0fd
      to prepare for bytes accounting in a later patch
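
      For context, a PROXYv1 header is a single CRLF-terminated ASCII line;
      a hedged sketch of such a formatter (not the actual VPX code):

      #include <stdio.h>

      /*
       * Produces e.g. "PROXY TCP4 192.0.2.1 198.51.100.2 56324 443\r\n".
       * proto is "TCP4" or "TCP6"; returns the snprintf() result, so a
       * value >= len means the buffer was too small.
       */
      static int
      format_proxy_v1(char *buf, size_t len, const char *proto,
          const char *client_ip, const char *server_ip,
          unsigned client_port, unsigned server_port)
      {
              return (snprintf(buf, len, "PROXY %s %s %s %u %u\r\n",
                  proto, client_ip, server_ip, client_port, server_port));
      }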
    • remove a now pointless vtc · 5fe2a46d
      Nils Goroll authored
      This test was there to detect a deadlock which does not exist any more.
      IMHO, the only sensible way to test for its absence now would be a load
      test, which is not what we want in vtc.
    • generalize the worker pool reserve to avoid deadlocks · 3bb8b84c
      Nils Goroll authored
      Previously, we used a minimum number of idle threads (the reserve) to
      ensure that we do not assign all threads to client requests, leaving
      none for backend requests.
      
      This was actually only a special case of the more general issue
      exposed by h2: Lower priority tasks depend on higher priority tasks
      (for h2, sessions need streams, which need requests, which may need
      backend requests).
      
      To solve this problem, we divide the reserve by the number of priority
      classes and schedule lower priority tasks only if there are enough
      idle threads to run higher priority tasks eventually.
      
      This change does not guarantee any upper limit on the amount of time
      it can take for a task to be scheduled (e.g. backend requests could be
      blocking on arbitrarily long timeouts), so the thread pool watchdog is
      still warranted. But this change should guarantee that we do make
      progress eventually.
      
      With the reserves, thread_pool_min needs to be no smaller than the
      number of priority classes (TASK_QUEUE__END). Ideally, we should have
      an even higher minimum (@Dridi rightly suggested to make it 2 *
      TASK_QUEUE__END), but that would prevent the very useful test
      t02011.vtc.
      
      For now, the value of TASK_QUEUE__END (5) is hardcoded as such for the
      parameter configuration and documentation because auto-generating it
      would require include/macro dances which I consider over the top for
      now. Instead, the respective places are marked and an assert is in
      place to ensure we do not start the worker with too small a number of
      worker threads. I decided against checks in the manager to avoid
      include pollution from the worker (cache.h) into the manager.
      
      Fixes #2418 for real
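
      A hedged sketch of the scheduling rule described above; the names and
      the exact arithmetic are illustrative, the real logic lives in the
      worker pool code:

      #define TASK_QUEUE__END 5       /* number of task priority classes */

      /*
       * A task of priority class prio (0 = highest) only takes an idle
       * thread if enough of the reserve remains for all higher-priority
       * classes; the highest class is never refused.
       */
      static int
      may_schedule(unsigned nidle, unsigned reserve, unsigned prio)
      {
              unsigned held_back = reserve * prio / TASK_QUEUE__END;

              return (nidle > held_back);
      }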
    • Remove varnishd -C coverage · 75cca3cd
      Dridi Boukelmoune authored
      This check is not deterministic because vmod_std may indeed be found in
      the default vmod_path defined at configure time.
    • Whitespace OCD · 1833d7dd
      Dridi Boukelmoune authored
    • Fail fetch retries when uncached request body has been released · f88b4795
      Martin Blix Grydeland authored
      Currently we allow fetch retries with body even after we have released the
      request that initiated the fetch, and the request body with it. The
      attached test case demonstrates this, where s2 on the retry attempt gets
      stuck waiting for 3 bytes of body data that is never sent.
      
      Fix this by keeping track of what the initial request body status was, and
      failing the retry attempt if the request was already released
      (BOS_REQ_DONE) and the request body was not cached.
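
      A simplified sketch of the check; the struct and most names are
      illustrative, BOS_REQ_DONE is the state referred to above:

      /* simplified bookkeeping, not the actual cache structs */
      enum bos_state { BOS_BUSY, BOS_REQ_DONE, BOS_FINISHED };

      struct fetch_sketch {
              enum bos_state  state;
              unsigned        req_body_cached;        /* recorded at fetch start */
      };

      /*
       * Once the client request has been released (BOS_REQ_DONE), an
       * uncached request body is gone and cannot be replayed, so a retry
       * must fail instead of leaving the backend waiting for body bytes
       * that will never arrive.
       */
      static int
      retry_allowed(const struct fetch_sketch *f)
      {
              if (f->state >= BOS_REQ_DONE && !f->req_body_cached)
                      return (0);
              return (1);
      }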
    • Fetch thread reference count and keep cached request bodies · d4b6228e
      Martin Blix Grydeland authored
      With this patch, fetch threads keep a reference to a completely cached
      request body for the entire duration of the fetch. This extends the
      retry window of backend requests with a request body beyond the
      BOS_REQ_DONE point.
      
      Patch by: Poul-Henning Kamp
    • Assert · eb14a0b6
      Dridi Boukelmoune authored
  4. 14 Jan, 2020 5 commits
    • avoid the STV_close() race for now · 4df4d2a4
      Nils Goroll authored
      See #3190
    • stop the expiry thread before closing stevedores · 34b687e6
      Nils Goroll authored
      This should fix the panic mentioned in
      309e807d
    • 4c7108b1
      Nils Goroll authored
    • try to narrow down a umem panic observed in vtest b00035.vtc · 309e807d
      Nils Goroll authored
      is it a race with _close?
      
      ***  v1   debug|Child (369) Panic at: Tue, 14 Jan 2020 12:06:12 GMT
      ***  v1   debug|Wrong turn at
      ../../../bin/varnishd/cache/cache_main.c:284:
      ***  v1   debug|Signal 11 (Segmentation Fault) received at b4 si_code 1
      ***  v1   debug|version = varnish-trunk revision
      b8b798a0, vrt api = 10.0
      ***  v1   debug|ident = -jsolaris,-sdefault,-sdefault,-hcritbit,ports
      ***  v1   debug|now = 2786648.903965 (mono), 1579003571.310573 (real)
      ***  v1   debug|Backtrace:
      ***  v1   debug|  80e1bd8: /tmp/vtest.o32_su12.4/varnish-cache/varnish-trunk/_build/bin/varnishd/varnishd'pan_backtrace+0x18 [0x80e1bd8]
      ***  v1   debug|  80e2147: /tmp/vtest.o32_su12.4/varnish-cache/varnish-trunk/_build/bin/varnishd/varnishd'pan_ic+0x2c7 [0x80e2147]
      ***  v1   debug|  81b9a6f: /tmp/vtest.o32_su12.4/varnish-cache/varnish-trunk/_build/bin/varnishd/varnishd'VAS_Fail+0x4f [0x81b9a6f]
      ***  v1   debug|  80d7fba: /tmp/vtest.o32_su12.4/varnish-cache/varnish-trunk/_build/bin/varnishd/varnishd'child_signal_handler+0x27a [0x80d7fba]
      ***  v1   debug|  fed92695: /lib/libc.so.1'__sighndlr+0x15 [0xfed92695]
      ***  v1   debug|  fed86c8b: /lib/libc.so.1'call_user_handler+0x298 [0xfed86c8b]
      ***  v1   debug|  fda8a93e: /lib/libumem.so.1'umem_cache_free
      ***  v1   debug|+0x23 [0xfda8a93e]
      ***  v1   debug|  817f3bc: /tmp/vtest.o32_su12.4/varnish-cache/varnish-trunk/_build/bin/varnishd/varnishd'smu_free+0x35c [0x817f3bc]
      ***  v1   debug|  817aa21: /tmp/vtest.o32_su12.4/varnish-cache/varnish-trunk/_build/bin/varnishd/varnishd'sml_stv_free+0x101 [0x817aa21]
      ***  v1   debug|  817b4eb: /tmp/vtest.o32_su12.4/varnish-cache/varnish-trunk/_build/bin/varnishd/varnishd'sml_slim+0x2cb [0x817b4eb]
      ***  v1   debug|thread = (cache-exp)
      ***  v1   debug|thr.req = 0 {
      ***  v1   debug|},
      ***  v1   debug|thr.busyobj = 0 {
      ***  v1   debug|},
      ***  v1   debug|vmods = {
      ***  v1   debug|},
      ***  v1   debug|
      ***  v1   debug|
      ***  v1   debug|Info: Child (369) said Child dies
      ***  v1   debug|Debug:
      ***  v1   debug| Child cleanup complete
      ***  v1   debug|
    • Revert "does the umem backend affect the amount of malloc NULL returns in vtest?" · b8b798a0
      Nils Goroll authored
      This reverts commit 8ea006ee.
      
      does not seem to make a difference; trying to narrow it down using
      other means (different platforms)
  5. 13 Jan, 2020 5 commits