1. 22 Feb, 2018 2 commits
    • Support Unix domain sockets as -a listen addresses. · a2682c09
      Geoff Simmons authored
      Also adds the user, group and mode sub-args to -a, to set
      permissions on the path created by -a for UDS.
      
      Add the bogo_ip pseudo-VSA, representing IPv4 0.0.0.0:0, to be
      exposed in VCL for non-IP addresses.
      
      Also add the field listen_sock to struct sess: a pointer to the
      struct listen_sock that was created by the acceptor and lives in
      heritage.socks. This makes information like the endpoint name
      (named -a arg) and the UDS path available from an sp.
    • Sync · 8bf0dd53
      Federico G. Schwindt authored
  2. 21 Feb, 2018 7 commits
  3. 20 Feb, 2018 5 commits
  4. 19 Feb, 2018 10 commits
    • Nils Goroll's avatar
      01379caf
    • Fix issue #1799 for keep · c12a3e5e
      Pål Hermunn Johansen authored
      This fixes the long-standing #1799 for "keep" objects, and this commit
      message suggests a way of working around #1799 in the remaining
      cases. The following is a (long) explanation of how grace and keep
      work at the moment, how this relates to #1799, and how this commit
      changes things.
      
      1. How does it work now, before this commit?
      
      Objects in cache can outlive their TTL, and the typical reason for
      this is grace. Objects in cache can also linger because of obj.keep or
      in the (rare but observed) case where the expiry thread has not yet
      evicted an object. Grace and keep are here to minimize backend load,
      but #1799 shows that we are not successful in doing this in some
      important cases.
      
      Whenever sub vcl_recv has ended with return (lookup) (which is the
      default action), we arrive at HSH_Lookup, where varnish sometimes only
      finds an expired object (that matches Vary logic, is not banned,
      etc). When this happens, we will initiate a background fetch (by
      adding a "busy object") if and only if there is no busy object on the
      oh already. Then the expired object is returned with HSH_EXP or
      HSH_EXPBUSY, depending on whether a busy object was inserted.
      
      2. What makes us run into #1799?
      
      When we have gotten an expired object, we generally hope that it is in
      grace, and that sub vcl_hit will return(deliver). However, if grace
      has expired, then the default action (ie the action from builtin.vcl)
      is return (miss). It is also possible that the user vcl, for some
      reason, decides that the stale object should not be delivered, and
      does return (miss) explicitly. In these cases it is common that the
      current request is not the one to insert a busy object, and then we
      run into the issue with a message "vcl_hit{} returns miss without busy
      object. Doing pass.".
      
      Note that normally, if a resource is very popular and has a positive
      grace, it is unlikely that #1799 will happen. Then a new version will
      always be available before the grace has run out, and everybody gets
      the latest fetched version with no #1799 problems.
      
      However, if a resource is very popular (like a manifest file in a live
      streaming setup) and has 0s grace, and the expiry thread lags a little
      bit behind, then vcl_hit can get an expired object even when obj.keep
      is zero. In these circumstances we can get a surge of requests to the
      backend, and this is especially bad on a very busy server.
      
      Another real world example is where grace is initially set high (48h
      or similar) and vcl_hit considers the health of the backend, and, if
      the backend is healthy, explicitly does a return(miss) to ensure that the
      client gets a fresh object. This has been a recommended use of
      vcl_hit, but, because of #1799, can cause considerable load on the
      backend.
      
      Similarly, we can get #1799 if we use "keep" to facilitate IMS
      requests to the backend, and we have a stale object for which several
      requests arrive before the first completes.
      
      3. How do we fix this?
      
      The main idea is to teach varnish to consider grace during lookup.
      
      To be specific, the following changes with this commit: If an expired
      object is found, the ttl+grace has expired and there already is an
      ongoing request for the object (ie. there exists a busy object), then
      the request is put on the waiting list instead of simply returning the
      object ("without a busy object") to vcl_hit. This choice is made
      because we anticipate that vcl_hit will do (the default) return (miss)
      and that it is better to wait for the ongoing request than to initiate
      a new one with "pass" behavior.
      
      The result is that when the ongoing request finishes, we will either
      be able to go to vcl_hit, start a new request (can happen if there was
      a Vary mismatch) by inserting a new "busy object", or we lose the race
      and have to go back to the waiting list (typically unlikely).
      
      When grace is in effect we go to vcl_hit even when we did not insert a
      busy object, anticipating that vcl_hit will return (deliver).
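
      The decision described above, for the case where lookup only finds an
      expired object, can be sketched as a small C function. This is a
      simplified model and not the code in HSH_Lookup: the function name,
      the SKETCH_* outcome names, the parameters and the exact comparison
      operator are all assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical outcome names for this sketch only. */
enum lookup_outcome {
	SKETCH_EXPBUSY,	/* expired object returned; we inserted a busy object */
	SKETCH_EXP,	/* expired object returned; another request is fetching */
	SKETCH_WAIT	/* request is parked on the waiting list */
};

/*
 * Model of the choice made when lookup only finds an expired object.
 * t_origin, ttl and grace are in seconds; busy_exists says whether a
 * busy object is already present on the object head.
 */
static enum lookup_outcome
expired_lookup(double now, double t_origin, double ttl, double grace,
    bool busy_exists)
{
	assert(grace >= 0.0);

	/* Nobody is fetching yet: insert a busy object and go to vcl_hit. */
	if (!busy_exists)
		return (SKETCH_EXPBUSY);

	/* ttl+grace has expired and a fetch is ongoing: wait for it. */
	if (now > t_origin + ttl + grace)
		return (SKETCH_WAIT);

	/* Still in grace: go to vcl_hit, expecting return (deliver). */
	return (SKETCH_EXP);
}
```

      With ttl=60 and grace=10, a request arriving at t=65 while a fetch
      is ongoing still goes to vcl_hit, whereas one arriving at t=100
      is put on the waiting list.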
      
      This will fix the cases where the user does not explicitly do a
      return(miss) in vcl_hit for objects where ttl+grace has not
      expired. However, since this is not an uncommon practice, we also have
      to change our recommendation on how to use grace and keep. The new
      recommendation will be:
      
      * Set grace to the "normal value" for a working varnish+backend.
      
      * Set keep to a high value if the backend is not 100% reliable and you
        want to use stale objects as a fallback.
      
      * Do not explicitly return(miss) in sub vcl_hit{}. The exception is
        when this can only happen occasionally and you are really sure that
        this is the right thing to do.
      
      * In vcl_hit, check if the backend is sick, and then explicitly
        return(deliver) when appropriate (ie you want a stale object
        delivered instead of an error message).
      
      A test case is included.
    • Add changes for 5.2.1 · 892928db
      Federico G. Schwindt authored
      Fixes #2562.
    • Accurate byte counters · 51176640
      Martin Blix Grydeland authored
      There was a regression from Varnish 4.0 to 4.1, where the response
      bytes were counted as the number of bytes fed to the outgoing write
      vector, rather than the bytes that were actually handed off to the OS's
      socket buffer. In many cases this caused the complete object size to
      be counted as transmitted bytes, even though the client hung up the
      connection early.
      
      This patch changes the counters to show the amount of bytes sent as
      reported from the write() system calls rather than the bytes we planned
      and prepared to send. The counters will include any protocol overhead (ie
      chunked encoding in HTTP/1 and the frame headers in HTTP/2).
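
      The difference between the two ways of counting can be illustrated
      with a minimal sketch. It is not the Varnish implementation: it uses
      plain write() on a single buffer for brevity, and struct tx_count and
      send_counted() are hypothetical names introduced here.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical counters for this sketch only. */
struct tx_count {
	size_t planned;	/* bytes we prepared to send */
	size_t sent;	/* bytes write() reported as accepted */
};

/*
 * Write len bytes from buf to fd. "planned" grows by the full length up
 * front, while "sent" grows only by what each write() call returns, so
 * the two differ when the peer goes away before everything is written.
 * Returns 0 on success and -1 on a write error.
 */
static int
send_counted(int fd, const void *buf, size_t len, struct tx_count *c)
{
	const char *p = buf;
	ssize_t n;

	assert(c != NULL);
	c->planned += len;
	while (len > 0) {
		n = write(fd, p, len);
		if (n < 0) {
			if (errno == EINTR)
				continue;	/* interrupted, retry */
			return (-1);
		}
		c->sent += (size_t)n;
		p += n;
		len -= (size_t)n;
	}
	return (0);
}
```

      When a write fails, planned has already been increased by the full
      length while sent reflects only what the OS accepted, which is the
      figure the new counters are meant to report.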
      
      As before, ESI subrequests report in their log transactions the number
      of bytes they (and any subrequests below them) contributed to the total
      body bytes produced.
      
      Some test cases have been adjusted to account for the new counter behaviour.
      
      Fixes: 2558
    • Dag Haavi Finstad's avatar
      8f4671af
    • Flexelinting · 2a354762
      Poul-Henning Kamp authored
    • Flexelinting · acb7c012
      Poul-Henning Kamp authored
    • Finally fix #2495 · 5cc47eaa
      Poul-Henning Kamp authored
    • Poul-Henning Kamp's avatar
    • Merge pull request #2569 from xcir/feature/vsc_nuke_limited · dc6c6520
      Poul-Henning Kamp authored
      Add n_lru_limited counter
  5. 18 Feb, 2018 1 commit
    • Introduce ttl_now and the new way of calculating TTLs in VCL · 33143e05
      Pål Hermunn Johansen authored
      A new function, ttl_now(VRT_CTX), defines what "now" is when ttl
      and age are calculated in various VCL subs. To sum up,
      
      * Before a backend fetch on the client side (vcl_recv, vcl_hit,
        vcl_miss) we use t_req from the request. This is the significant
        change in this commit, and it fixes the bug demonstrated by r02555.vtc.
      * On the backend side, most notably vcl_backend_response, we keep
        the old "now" by simply using ctx->now.
      * In vcl_deliver we use ctx->now, as before.
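
      The three rules above can be summarised in a short sketch. It only
      models the selection of "now" and is not the actual ttl_now(): the
      enum, struct sketch_ctx and sketch_ttl_now() are hypothetical
      stand-ins for the real VRT context and request structures.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical identifiers for the VCL subs discussed above. */
enum vcl_sub {
	SUB_RECV,
	SUB_HIT,
	SUB_MISS,
	SUB_BACKEND_RESPONSE,
	SUB_DELIVER
};

/* Hypothetical, reduced context: just enough to pick a timestamp. */
struct sketch_ctx {
	enum vcl_sub	sub;
	double		now;	/* stands in for ctx->now */
	double		t_req;	/* stands in for the request's t_req */
};

/* Pick the timestamp that ttl and age are calculated against. */
static double
sketch_ttl_now(const struct sketch_ctx *ctx)
{
	assert(ctx != NULL);
	switch (ctx->sub) {
	case SUB_RECV:
	case SUB_HIT:
	case SUB_MISS:
		/* Client side, before a backend fetch: use t_req. */
		return (ctx->t_req);
	default:
		/* Backend side and vcl_deliver: keep ctx->now. */
		return (ctx->now);
	}
}
```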
      
      It was necessary to make all purges use t_req as their base time.
      Then, to not break c00041.vtc it was necessary to change from ">="
      to ">" in HSH_Lookup.
      
      All VMODs that currently use HSH_purge must change to using
      VRT_purge.
  6. 17 Feb, 2018 3 commits
  7. 16 Feb, 2018 9 commits
  8. 15 Feb, 2018 3 commits