1. 15 Jan, 2020 15 commits
    • Add information about vcl object instances to the panic output · 5df27a08
      Nils Goroll authored
      In the absence of a core dump, the panic output so far contains no
      information about vcl object instances, for example to find out which
      object a priv belongs to when the instance address is used for
      per-instance priv state.
      
      To make this information available at the time of a panic, we add the
      following:
      
      * A struct vrt_ii (for instance info); at VCC time, VCC fills in a
        static instance of it with pointers to the C global variables that
        hold the object instance pointers
      
      * A pointer to this struct from the VCL_conf to make it available to
        the varnishd worker
      
      * A dump of the instance info in the panic output (sketched below)
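
      As a rough illustration of the mechanism (the struct, field and "myobj"
      names below are hypothetical; the actual struct vrt_ii declaration
      lives in the varnish headers and may differ):

        #include <stdio.h>

        struct instance_info {          /* hypothetical stand-in for struct vrt_ii */
                const char      *name;  /* VCL name of the object instance */
                void            **ptr;  /* address of the VCC-emitted C global */
        };

        static void *vgc_myobj;         /* would be set by the object constructor */

        static const struct instance_info instance_info_tbl[] = {
                { "myobj", &vgc_myobj },
                { NULL, NULL }
        };

        /* At panic time, a table like this (reached via VCL_conf in the
         * real code) can be walked to print one line per instance. */
        static void
        panic_dump_instances(void)
        {
                const struct instance_info *ii;

                for (ii = instance_info_tbl; ii->name != NULL; ii++)
                        printf("instance \"%s\" = %p\n", ii->name, *ii->ptr);
        }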
    • Add capability to send the authority TLV in the PROXY header · 99ec7796
      Geoff Simmons authored
      This gives the receiver of the PROXY header (usually the ssl-onloader)
      the opportunity to set the SNI (HostName field) from the TLV value, for
      the TLS handshake with the remote backend.
      
      From
      https://github.com/nigoroll/varnish-cache/commit/e0eb7d0a9c65cdc3c58978656b4c71f4ab8aabca
      edited by @nigoroll to split out the proxy header functionality.
      
      Add vmod_debug access to the proxy header formatting and test it
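
      For reference, the PROXY v2 protocol encodes each TLV as one type byte,
      a two-byte length in network byte order, and the value, with
      PP2_TYPE_AUTHORITY defined as 0x02. A minimal sketch of appending such
      a TLV (the function name and buffer handling are illustrative, not the
      actual VPX code):

        #include <stdint.h>
        #include <string.h>

        #define PP2_TYPE_AUTHORITY 0x02 /* per the PROXY protocol v2 spec */

        /* Append an authority (host name) TLV to a PROXY v2 header buffer;
         * returns the number of bytes written, or 0 if it does not fit. */
        static size_t
        append_authority_tlv(uint8_t *buf, size_t space, const char *host)
        {
                size_t len = strlen(host);

                if (len > UINT16_MAX || space < 3 + len)
                        return (0);
                buf[0] = PP2_TYPE_AUTHORITY;
                buf[1] = (uint8_t)(len >> 8);   /* length, network byte order */
                buf[2] = (uint8_t)(len & 0xff);
                memcpy(buf + 3, host, len);
                return (3 + len);
        }

      The ssl-onloader receiving the header can then pick this value up as
      the SNI for its TLS handshake with the remote backend.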
    • format proxy header on the stack · a74315bc
      Nils Goroll authored
    • wrap VPX_Format_Proxy for VRT · de50fefc
      Nils Goroll authored
    • Add VPX_Format_Proxy · 461a22a4
      Nils Goroll authored
    • split out proxyv2 formatting · 519737ac
      Nils Goroll authored
    • split out proxyv1 formatting · 6a08beba
      Nils Goroll authored
      Note: this partially reverts cf14a0fd
      to prepare for bytes accounting in a later patch
    • remove a now pointless vtc · 5fe2a46d
      Nils Goroll authored
      This test was meant to detect a deadlock which no longer exists. IMHO,
      the only sensible way to test for its absence now is a load test, which
      is not what we want in vtc.
    • generalize the worker pool reserve to avoid deadlocks · 3bb8b84c
      Nils Goroll authored
      Previously, we used a minimum number of idle threads (the reserve) to
      ensure that we did not assign all threads to client requests, leaving
      no threads for backend requests.
      
      This was actually only a special case of the more general issue
      exposed by h2: Lower priority tasks depend on higher priority tasks
      (for h2, sessions need streams, which need requests, which may need
      backend requests).
      
      To solve this problem, we divide the reserve by the number of priority
      classes and schedule lower priority tasks only if there are enough
      idle threads to run higher priority tasks eventually.
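
      As a minimal sketch of that rule, assuming priority class 0 is the most
      important and TASK_QUEUE__END is the number of classes (the names and
      arithmetic here are illustrative, not the actual cache_wrk.c code):

        #define TASK_QUEUE__END 5       /* number of task priority classes */

        /* A task of priority class 'prio' (0 = highest) may only take an
         * idle thread if enough idle threads remain to cover the share of
         * the reserve held back for all higher-priority classes. */
        static int
        may_dispatch(unsigned n_idle, unsigned reserve, unsigned prio)
        {
                return (n_idle > reserve * prio / TASK_QUEUE__END);
        }

      With such a rule the highest-priority class can always use an idle
      thread, while the lowest-priority class only runs when most of the
      reserve is idle, so higher-priority tasks can keep making progress.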
      
      This change does not guarantee any upper limit on the amount of time
      it can take for a task to be scheduled (e.g. backend requests could be
      blocking on arbitrarily long timeouts), so the thread pool watchdog is
      still warranted. But this change should guarantee that we do make
      progress eventually.
      
      With the reserves, thread_pool_min needs to be no smaller than the
      number of priority classes (TASK_QUEUE__END). Ideally, we should have
      an even higher minimum (@Dridi rightly suggested to make it 2 *
      TASK_QUEUE__END), but that would prevent the very useful test
      t02011.vtc.
      
      For now, the value of TASK_QUEUE__END (5) is hardcoded as such for the
      parameter configuration and documentation because auto-generating it
      would require include/macro dances which I consider over the top for
      now. Instead, the respective places are marked and an assert is in
      place to ensure we do not start the worker with too small a number of
      worker threads. I decided against checks in the manager to avoid
      include pollution from the worker (cache.h) into the manager.
      
      Fixes #2418 for real
    • Remove varnishd -C coverage · 75cca3cd
      Dridi Boukelmoune authored
      This check is not deterministic because vmod_std may indeed be found in
      the default vmod_path defined at configure time.
    • Whitespace OCD · 1833d7dd
      Dridi Boukelmoune authored
    • Fail fetch retries when uncached request body has been released · f88b4795
      Martin Blix Grydeland authored
      Currently we allow fetch retries with a request body even after we have released the
      request that initiated the fetch, and the request body with it. The
      attached test case demonstrates this, where s2 on the retry attempt gets
      stuck waiting for 3 bytes of body data that is never sent.
      
      Fix this by keeping track of what the initial request body status was, and
      failing the retry attempt if the request was already released
      (BOS_REQ_DONE) and the request body was not cached.
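
      A minimal sketch of the guard this describes; the flag names are
      hypothetical, only BOS_REQ_DONE is taken from the message itself:

        /* Refuse a retry once the client request has been released
         * (BOS_REQ_DONE) if its body was never cached, because the body
         * can no longer be replayed to the backend. */
        static int
        retry_allowed(int req_released, int req_body_cached)
        {
                if (req_released && !req_body_cached)
                        return (0);     /* s2 would wait forever for body bytes */
                return (1);
        }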
    • Fetch thread reference count and keep cached request bodies · d4b6228e
      Martin Blix Grydeland authored
      With this patch, fetch threads keep a reference to completely cached
      request bodies for the entire duration of the fetch. This extends the
      retry window of backend requests with a request body beyond the
      BOS_REQ_DONE point.
      
      Patch by: Poul-Henning Kamp
    • Assert · eb14a0b6
      Dridi Boukelmoune authored
  2. 14 Jan, 2020 5 commits
    • avoid the STV_close() race for now · 4df4d2a4
      Nils Goroll authored
      See #3190
    • stop the expiry thread before closing stevedores · 34b687e6
      Nils Goroll authored
      This should fix the panic mentioned in
      309e807d
    • 4c7108b1
    • try to narrow down a umem panic observed in vtest b00035.vtc · 309e807d
      Nils Goroll authored
      is it a race with _close ?
      
      ***  v1   debug|Child (369) Panic at: Tue, 14 Jan 2020 12:06:12 GMT
      ***  v1   debug|Wrong turn at
      ../../../bin/varnishd/cache/cache_main.c:284:
      ***  v1   debug|Signal 11 (Segmentation Fault) received at b4 si_code 1
      ***  v1   debug|version = varnish-trunk revision
      b8b798a0, vrt api = 10.0
      ***  v1   debug|ident = -jsolaris,-sdefault,-sdefault,-hcritbit,ports
      ***  v1   debug|now = 2786648.903965 (mono), 1579003571.310573 (real)
      ***  v1   debug|Backtrace:
      ***  v1   debug|  80e1bd8: /tmp/vtest.o32_su12.4/varnish-cache/varnish-trunk/_build/bin/varnishd/varnishd'pan_backtrace+0x18 [0x80e1bd8]
      ***  v1   debug|  80e2147: /tmp/vtest.o32_su12.4/varnish-cache/varnish-trunk/_build/bin/varnishd/varnishd'pan_ic+0x2c7 [0x80e2147]
      ***  v1   debug|  81b9a6f: /tmp/vtest.o32_su12.4/varnish-cache/varnish-trunk/_build/bin/varnishd/varnishd'VAS_Fail+0x4f [0x81b9a6f]
      ***  v1   debug|  80d7fba: /tmp/vtest.o32_su12.4/varnish-cache/varnish-trunk/_build/bin/varnishd/varnishd'child_signal_handler+0x27a [0x80d7fba]
      ***  v1   debug|  fed92695: /lib/libc.so.1'__sighndlr+0x15 [0xfed92695]
      ***  v1   debug|  fed86c8b: /lib/libc.so.1'call_user_handler+0x298 [0xfed86c8b]
      ***  v1   debug|  fda8a93e: /lib/libumem.so.1'umem_cache_free
      ***  v1   debug|+0x23 [0xfda8a93e]
      ***  v1   debug|  817f3bc: /tmp/vtest.o32_su12.4/varnish-cache/varnish-trunk/_build/bin/varnishd/varnishd'smu_free+0x35c [0x817f3bc]
      ***  v1   debug|  817aa21: /tmp/vtest.o32_su12.4/varnish-cache/varnish-trunk/_build/bin/varnishd/varnishd'sml_stv_free+0x101 [0x817aa21]
      ***  v1   debug|  817b4eb: /tmp/vtest.o32_su12.4/varnish-cache/varnish-trunk/_build/bin/varnishd/varnishd'sml_slim+0x2cb [0x817b4eb]
      ***  v1   debug|thread = (cache-exp)
      ***  v1   debug|thr.req = 0 {
      ***  v1   debug|},
      ***  v1   debug|thr.busyobj = 0 {
      ***  v1   debug|},
      ***  v1   debug|vmods = {
      ***  v1   debug|},
      ***  v1   debug|
      ***  v1   debug|
      ***  v1   debug|Info: Child (369) said Child dies
      ***  v1   debug|Debug:
      ***  v1   debug| Child cleanup complete
      ***  v1   debug|
    • Revert "does the umem backend affect the amount of malloc NULL returns in vtest?" · b8b798a0
      Nils Goroll authored
      This reverts commit 8ea006ee.
      
      It does not seem to make a difference; trying to narrow this down
      using other means (different platforms).
  3. 13 Jan, 2020 7 commits
  4. 10 Jan, 2020 1 commit
  5. 09 Jan, 2020 3 commits
  6. 08 Jan, 2020 3 commits
    • Linux documents SO_SNDTIMEO in socket(7) · 9a7dc49b
      Dridi Boukelmoune authored
      Closes #3178
    • Stabilize s10 · bcba9649
      Dridi Boukelmoune authored
      Contrary to previous attempts, this one takes a different route that
      is much more reliable and faster.
      
      First, it sets things up so that we can predictably lock varnish when
      it's trying to send the first (and only) part of the body. Instead of
      assuming a delay that is sometimes not enough under load, we wait for
      the timeout to show up in the log.
      
      We can't put the barrier in l1 or l2 because logexpect spec evaluation
      is eager, in order to cope with the VSL API.
      
      Because we bypass the cache, we can afford to let c1 bail out before
      completing the transaction. This is necessary because otherwise the
      second c1 run would take forever on FreeBSD, which takes our request
      to limit the send buffer to 128 octets very seriously (on Linux we get
      around 4k).
      
      Because we use barriers, the send and receive buffers were bumped to
      256 to ensure c1 doesn't fail (on FreeBSD) before it reaches barrier
      statements.
    • Polish · 8f38a64f
      Dridi Boukelmoune authored
  7. 05 Jan, 2020 2 commits
  8. 03 Jan, 2020 2 commits
  9. 02 Jan, 2020 2 commits
    • Attempt at stabilizing s10 · df9f3489
      Dridi Boukelmoune authored
      This test has been relying on SLT_Debug records from day one. Now that
      we have SLT_Notice, we can perpetuate this information and at the same
      time grant ourselves the freedom to explain each case and which
      parameters may be used to try to improve the situation.
    • Attempt at stabilizing e19 · 895810cb
      Dridi Boukelmoune authored
      I'm no longer able to time it out under load.