1. 07 Nov, 2019 4 commits
    • take 2: fix assert_nexus for partially delivered / unpended subtrees · e493852e
      Nils Goroll authored
      The previous commit went in the right direction, but was not correct.
      
      For CHK_ORDER, we must only ever apply the order check to nodes with
      type != T_NEXUS and check any T_NEXUS node by the criteria originally
      passed.

      CHK_ORDER could be stricter still: after a subtree is completed, we
      could communicate the stricter CHK_PEND upwards, but I do not want to
      complicate things further for an assertion.
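
      As an illustration of the check described above, here is a minimal C
      sketch; the node, state and check names are hypothetical stand-ins and
      the real vdp_pesi structures differ:

        #include <assert.h>
        #include <stddef.h>

        /* hypothetical simplifications of the pesi node tree */
        enum node_type  { T_NEXUS, T_DATA };
        enum node_state { ST_PENDING, ST_UNPENDING, ST_DELIVERED, ST_CLOSED };
        enum node_check { CHK_ANY, CHK_ORDER, CHK_PEND };

        struct node {
            enum node_type   type;
            enum node_state  state;
            struct node     *child;   /* first child, T_NEXUS only */
            struct node     *next;    /* next sibling under the same nexus */
        };

        /*
         * CHK_ORDER sketch: delivery proceeds top to bottom, left to right,
         * so once an undelivered non-nexus child is seen, no later non-nexus
         * child may be delivered.  T_NEXUS children are never unpended
         * themselves, so they are recursed into with the criteria originally
         * passed rather than judged by their own state.
         */
        static void
        assert_nexus(const struct node *nexus, enum node_check check)
        {
            const struct node *n;
            int undelivered = 0;

            assert(nexus != NULL && nexus->type == T_NEXUS);
            for (n = nexus->child; n != NULL; n = n->next) {
                if (n->type == T_NEXUS) {
                    assert_nexus(n, check);   /* criteria originally passed */
                    continue;
                }
                if (n->state != ST_DELIVERED)
                    undelivered = 1;
                else
                    assert(undelivered == 0);
            }
        }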
    • fix assert_nexus for partially delivered / unpended subtrees · 5297394c
      Nils Goroll authored
      we do not unpend T_NEXUS nodes, so the following situation is perfectly
      legal:
      
      	T_NEXUS
      	ST_CLOSED
      
      T_NEXUS		T_DATA
      ST_CLOSED	ST_UNPENDING
      
      The purpose of the assertion is to check that, if a node below a nexus
      has not yet been unpended/delivered, the following nodes have not been
      either; but this does not hold for T_NEXUS nodes.
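
      To illustrate why such a state arises, an unpend pass in this spirit
      only ever moves data nodes out of the pending state and leaves nexus
      nodes untouched (reusing the hypothetical types from the sketch above;
      the real code differs):

        /*
         * Unpending sketch: only data nodes change state here; a T_NEXUS
         * child keeps whatever state it already has (e.g. ST_CLOSED), which
         * is why a closed nexus may legally sit to the left of a T_DATA
         * node that is still unpending.
         */
        static void
        unpend_children(struct node *nexus)
        {
            struct node *n;

            assert(nexus->type == T_NEXUS);
            for (n = nexus->child; n != NULL; n = n->next) {
                if (n->type == T_NEXUS)
                    continue;   /* we do not unpend T_NEXUS nodes */
                if (n->state == ST_PENDING)
                    n->state = ST_UNPENDING;
            }
        }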
    • no more delivery special casing for empty T_DATA nodes · 58a16811
      Nils Goroll authored
      We used to set empty T_DATA nodes to delivered as soon as we encountered
      them during unpending. While it seemed like a good idea not to spend
      additional work on nodes which do not contain anything, the tree
      integrity assertions as well as set_deliver() implicitly assume that
      
      - a delivered node cannot follow an undelivered node under a nexus and
      - once the last node under a nexus is delivered, all of the nexus is
        delivered
      
      Rather than sacrificing the sensible principle that a tree is always to
      be delivered top to bottom, left to right, weakening our assertions and
      possibly introducing new bugs when handling the resulting special cases,
      we go for the simple, clean option and do away with a special case which
      will not contribute much to overall performance anyway.
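
      A sketch of the second invariant, again with the hypothetical node
      structure from above: marking a node delivered may complete its parent
      nexus, which is exactly the propagation an early "empty node" shortcut
      would bypass.

        /*
         * When the last child of a nexus becomes delivered, the nexus itself
         * is delivered; the grandparent is elided in this sketch.  Setting
         * empty T_DATA nodes to delivered during unpending would sidestep
         * this bottom-up step and the left-to-right order checked above.
         */
        static void
        set_delivered(struct node *n, struct node *parent)
        {
            const struct node *c;

            n->state = ST_DELIVERED;
            if (parent == NULL)
                return;
            for (c = parent->child; c != NULL; c = c->next)
                if (c->state != ST_DELIVERED)
                    return;     /* siblings still outstanding */
            set_delivered(parent, NULL);
        }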
  2. 06 Nov, 2019 1 commit
  3. 05 Nov, 2019 10 commits
  4. 01 Nov, 2019 1 commit
  5. 31 Oct, 2019 4 commits
  6. 30 Oct, 2019 2 commits
  7. 29 Oct, 2019 5 commits
  8. 27 Oct, 2019 1 commit
  9. 25 Oct, 2019 2 commits
    • hopefully get the subreq task_fini dance right now · 90d21f80
      Nils Goroll authored
      Ref: eb805e8e
      
      we cannot signal task_fini after posting subreq.done, because then we
      race the assertion that all tasks are done when we are done delivering.
      
      But we also cannot do it the other way around, because then the
      assertion that subreqs are done when all tasks are finished does not
      hold.
      
      So the right option should be to do both under the tree lock.
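
      A minimal sketch of that ordering, with a plain pthread mutex standing
      in for the tree lock and invented field names:

        #include <pthread.h>

        /* hypothetical stand-ins for the real pesi bytes_tree bookkeeping */
        struct tree {
            pthread_mutex_t  mtx;      /* the tree lock */
            pthread_cond_t   cond;
            unsigned         n_tasks;  /* running subrequest tasks */
        };

        struct subreq {
            struct tree *tree;
            unsigned     done;
        };

        /*
         * Post subreq completion and retire its task while holding the tree
         * lock, so neither "all tasks done => all subreqs done" nor
         * "done delivering => all tasks done" can ever be observed
         * half-updated by another thread.
         */
        static void
        subreq_fini(struct subreq *sr)
        {
            struct tree *t = sr->tree;

            pthread_mutex_lock(&t->mtx);
            sr->done = 1;           /* the subreq result is complete ... */
            t->n_tasks--;           /* ... and its task is finished */
            pthread_cond_broadcast(&t->cond);
            pthread_mutex_unlock(&t->mtx);
        }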
    • move a racy assertion to a safe place · 1481bd02
      Nils Goroll authored
      assert_node is only safe under the bytes_tree lock, yet, for the error
      case, we called it while other threads could still be running.

      Move it until after we know that all other tasks are done.

      This also implies a second assert_node call for the retval == 0 case,
      which shouldn't do any harm.
      
      should fix:
      
      Panic at: Thu, 24 Oct 2019 14:21:27 GMT
      Assert error in assert_node(), node_assert.h line 103:
        Condition((node->nexus.owner) != 0) not true.
      version = varnish-6.2.1 revision
      9f8588e4ab785244e06c3446fe09bf9db5dd8753, vrt api = 9.0
      ident =
      Linux,3.10.0-1062.4.1.el7.x86_64,x86_64,-junix,-smalloc,-smalloc,-hcritbit,epoll
      now = 29366.422728 (mono), 1571926828.402512 (real)
      Backtrace:
        0x43cf3b: /usr/sbin/varnishd() [0x43cf3b]
        0x4a01c2: /usr/sbin/varnishd(VAS_Fail+0x42) [0x4a01c2]
        0x7f6bf06a033c:
      ./vmod_cache/_vmod_pesi.4d9e0603bac2a2e2b2627f7fe90ff1d55d4759545517c869a5571f16636e230e(+0xe33c)
      [0x7f6bf06a033c]
        0x4222b6: /usr/sbin/varnishd(VDP_close+0x66) [0x4222b6]
      ...
      
      (gdb) bt
       #0  0x00007f6db97d2337 in __GI_raise (sig=sig@entry=6) at
       ../nptl/sysdeps/unix/sysv/linux/raise.c:55
       #1  0x00007f6db97d3a28 in __GI_abort () at abort.c:90
       #2  0x000000000043d232 in pan_ic ()
       #3  0x00000000004a01c2 in VAS_Fail ()
       #4  0x00007f6bf06a033c in assert_node (check=CHK_ANY, node=<optimized out>) at node_assert.h:103
       #5  vdp_pesi_fini (req=0x7f6ba8f52020, priv=0x7f6ba8f57aa8) at vdp_pesi.c:782
       #6  0x00000000004222b6 in VDP_close ()
       #7  0x0000000000464c5e in V1D_Deliver ()
       #8  0x0000000000441eab in CNT_Request ()
       #9  0x00000000004665b3 in http1_req ()
       #10 0x000000000045c833 in WRK_Thread ()
       #11 0x000000000045ccf0 in pool_thread ()
       #12 0x00007f6db9b71e65 in start_thread (arg=0x7f6cdc5dc700) at
       pthread_create.c:307
       #13 0x00007f6db989a88d in clone () at
       ../sysdeps/unix/sysv/linux/x86_64/clone.S:111
       (gdb) f 4
       #4  0x00007f6bf06a033c in assert_node (check=CHK_ANY, node=<optimized out>) at node_assert.h:103
       103                     AN(node->nexus.owner);
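
      A sketch of the safer placement, reusing the hypothetical tree and
      assertion helper from the sketches above: the tree is only inspected
      once no other task can still be writing to it.

        /*
         * Error-path sketch: wait under the tree lock until all subrequest
         * tasks are finished before walking the tree with assert_node()-style
         * checks; while other tasks run, node ownership and state may change.
         */
        static void
        fini_check_nodes(struct tree *t, const struct node *root)
        {
            pthread_mutex_lock(&t->mtx);
            while (t->n_tasks > 0)
                pthread_cond_wait(&t->cond, &t->mtx);
            assert_nexus(root, CHK_ANY);    /* safe: no concurrent writers */
            pthread_mutex_unlock(&t->mtx);
        }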
  10. 23 Oct, 2019 2 commits
    • fix assertion on unexpected vdp order for gzip-in-gunzip case · a3940e7f
      Nils Goroll authored
      For the case of gzip included in a plain response, the esi_level == 1
      vdp order was:
      
      	pesi gunzip pesi_buf V2P(to_parent)
      
      Yet we had assertions in place that pesi_buf always immediately follows
      pesi.
      
      The reason was that, for esi_level > 0, we would not push pesi_buf from
      pesi init but rather from the transport, which was plain wrong: We
      should delay any additional vdps in order to buffer the least amount of
      data.
      
      Working on this, I also noted that, for the generic buffering case, our
      assertion that pesi_buf comes first might be too strict. Now, any VDPs
      before the buffer are closed at esi_level > 1.
      
      fixes this panic:
      
      Assert error in vped_close_vdp(), vdp_pesi.c line 1182:
        Condition(vdpe->vdp == vdp) not true.
      
      	...
      
      Backtrace:
        0x43cf3b: /usr/sbin/varnishd() [0x43cf3b]
        0x4a01c2: /usr/sbin/varnishd(VAS_Fail+0x42) [0x4a01c2]
        0x7f3719306b63:
      ./vmod_cache/_vmod_pesi.4d9e0603bac2a2e2b2627f7fe90ff1d55d4759545517c869a5571f16636e230e(+0x8b63)
      [0x7f3719306b63]
        0x7f371930a377:
      ./vmod_cache/_vmod_pesi.4d9e0603bac2a2e2b2627f7fe90ff1d55d4759545517c869a5571f16636e230e(+0xc377)
      [0x7f371930a377]
        0x441eab: /usr/sbin/varnishd(CNT_Request+0x11ab) [0x441eab]
        0x7f3719308043:
      ./vmod_cache/_vmod_pesi.4d9e0603bac2a2e2b2627f7fe90ff1d55d4759545517c869a5571f16636e230e(+0xa043)
      [0x7f3719308043]
        0x45c833: /usr/sbin/varnishd() [0x45c833]
        0x45ccf0: /usr/sbin/varnishd() [0x45ccf0]
        0x7f37e6d18e65: /lib64/libpthread.so.0(+0x7e65) [0x7f37e6d18e65]
        0x7f37e6a4188d: /lib64/libc.so.6(clone+0x6d) [0x7f37e6a4188d]
      thread = (cache-worker)
      pthread.attr = {
        guard = 4096,
        stack_bottom = 0x7f372c482000,
        stack_top = 0x7f372c502000,
        stack_size = 524288,
      }
      thr.req = 0x7f3601106020 {
        vxid = 25559391, transport = PESI_INCLUDE
        step = R_STP_TRANSMIT,
        req_body = R_BODY_NONE,
        restarts = 0, esi_level = 1,
      
      	...
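
      A sketch of the relaxed close logic; the list type and names below are
      placeholders, not the actual VDP entry structures:

        #include <stddef.h>
        #include <string.h>

        /* hypothetical, simplified view of a delivery processor (VDP) chain */
        struct vdpe_sketch {
            const char          *name;  /* e.g. "pesi", "gunzip", "pesi_buf" */
            struct vdpe_sketch  *next;
        };

        /*
         * Instead of asserting that the buffer VDP immediately follows pesi,
         * close every VDP pushed in front of the buffer (e.g. gunzip in the
         * gzip-in-plain case) and return the buffer entry itself.
         */
        static struct vdpe_sketch *
        close_until_buffer(struct vdpe_sketch *head, const char *buf_name,
            void (*close_cb)(struct vdpe_sketch *))
        {
            struct vdpe_sketch *e, *next;

            for (e = head; e != NULL; e = next) {
                next = e->next;         /* close_cb may release the entry */
                if (strcmp(e->name, buf_name) == 0)
                    return (e);
                close_cb(e);            /* VDP in front of the buffer */
            }
            return (NULL);
        }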
    • ensure nexus reqs from tree have the esi_level == 0 worker · 0db40674
      Nils Goroll authored
      We pass on reqs from esi subrequests to the top request for delivery.
      Doing so, we need to give them the top request's worker so that VDPs
      which require it are happy.
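
      In spirit (types and field names invented for this sketch), the handoff
      amounts to:

        struct worker_sketch;               /* stand-in for the worker */
        struct req_sketch {                 /* stand-in for the request */
            struct worker_sketch *wrk;
        };

        /* before handing a subrequest's req over for top-level delivery,
         * let it use the esi_level == 0 request's worker */
        static void
        adopt_top_worker(struct req_sketch *subreq,
            const struct req_sketch *topreq)
        {
            subreq->wrk = topreq->wrk;
        }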
  11. 18 Sep, 2019 1 commit
    • fix assertion failure when our vdp never ran · 348f3763
      Nils Goroll authored
      When VDP_DeliverObj() was not called, for example for a HEAD request or
      a return code which implies no response body, bytes_tree->npending == 0
      was not true.

      To avoid additional complications, we encode the fact that the root
      node, if used, is pending in the npending_private field, which is meant
      for this purpose but otherwise only accounts for nodes below it. That
      restriction is not implied anywhere, so this use case should be
      perfectly fine.
      
      Also add a test for HEAD requests on an actual ESI object.
      
      Note on possible alternatives: I do not think a solution at VDP init
      time is possible because, after the vmod gets pushed from VCL, the
      response status could still change, affecting whether or not a response
      body is to be sent (e.g. changed from 200 to 204 after the VDP is
      pushed). So our only chance is to handle the case where the first VDP
      call after _init is _fini.
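
      A sketch of the accounting idea with invented field names; the real
      bytes_tree bookkeeping is more involved:

        #include <assert.h>

        /* hypothetical excerpt of the pending-node bookkeeping */
        struct bytes_tree_sketch {
            unsigned npending;          /* pending nodes below the root */
            unsigned npending_private;  /* also carries the root node itself */
        };

        /*
         * Track the root node's pending state in npending_private rather
         * than npending, so that when no delivery callback runs between
         * _init and _fini (HEAD request, bodyless status), the
         * npending == 0 check in _fini still holds.
         */
        static void
        root_set_pending(struct bytes_tree_sketch *t)
        {
            t->npending_private++;
        }

        static void
        fini_check(const struct bytes_tree_sketch *t)
        {
            assert(t->npending == 0);
        }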
  12. 13 Sep, 2019 3 commits
  13. 28 Aug, 2019 4 commits