- 02 Jan, 2019 1 commit
-
-
Pål Hermunn Johansen authored
This adds error handling for STV_NewObject(.., TRANSIENT) in VRB_Cache, which would fail when the transient storage is full. This is a back port of 6045eaaa. Fixes: #2831. Conflicts: bin/varnishd/cache/cache_req_body.c, bin/varnishtest/tests/r02831.vtc
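The fix is the classic allocation-failure pattern: check the allocator's return value and propagate the error instead of assuming transient storage always succeeds. A generic sketch under assumed names (`transient_alloc` and `cache_req_body` are illustrative stand-ins, not the varnishd API):

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a transient storage allocator: returns
 * NULL when the (simulated) transient storage is full. */
static void *
transient_alloc(size_t want, size_t *space_left)
{
	if (want > *space_left)
		return (NULL);		/* transient storage full */
	*space_left -= want;
	return (malloc(want));
}

/* Caller-side pattern the fix introduces: handle allocation failure
 * instead of assuming the transient stevedore never runs out. */
static int
cache_req_body(size_t len, size_t *space_left)
{
	void *buf = transient_alloc(len, space_left);

	if (buf == NULL)
		return (-1);		/* propagate the failure upward */
	/* ... copy the request body into buf ... */
	free(buf);
	return (0);
}
```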
-
- 05 Dec, 2018 1 commit
-
-
Pål Hermunn Johansen authored
Fixes: #2753 Conflicts: bin/varnishtest/tests/b00064.vtc
-
- 05 Oct, 2018 2 commits
-
-
Federico G. Schwindt authored
I've been torturing varnish with this change for some time and was not able to reproduce the problem. Should fix #2719.
-
Poul-Henning Kamp authored
Fixes #2684 Reported by: ernestojpg@github
-
- 26 Sep, 2018 2 commits
-
-
Nils Goroll authored
This is not a semantic change: rather than indirectly checking via the return value, we can also check the reason for keeping a reference (or rather, for not keeping one).
-
Poul-Henning Kamp authored
-
- 18 Sep, 2018 1 commit
-
-
Federico G. Schwindt authored
Fixes #2661. Conflicts: bin/varnishd/mgt/mgt_main.c
-
- 11 Sep, 2018 1 commit
-
-
Nils Goroll authored
This is a back port of 4a370dc4. Conflicts: bin/varnishd/cache/cache_req_fsm.c
-
- 10 Sep, 2018 2 commits
-
-
Pål Hermunn Johansen authored
-
Pål Hermunn Johansen authored
This is a back port of 7494d6ad, whose main point is to clearly recommend using req.grace for the most common use case: using a different grace time when the backend is healthy. To simplify things, the vcl-grace.rst file is simply copied from master. It should be accurate for the 4.1 branch as well.
-
- 07 Sep, 2018 1 commit
-
-
Pål Hermunn Johansen authored
This is a back port of ff38535a.

The req.grace variable can be set in vcl_recv to cap the grace of objects in the cache, in the same way as in 3.0.x.

The "keep" behavior changes with this patch. We now always go to vcl_miss when the expired object is out of grace, or we go to the waiting list. The result is that it is no longer possible to deliver a "keep" object in vcl_hit. Note that when we get to vcl_miss, we will still have the 304 candidate, but without the detour via vcl_hit.

This commit changes VCL, but only slightly, so we aim to back port this to earlier versions of Varnish Cache.

Refs: #1799 and #2519. Conflicts: bin/varnishd/cache/cache_hash.c, bin/varnishd/cache/cache_req_fsm.c, bin/varnishd/cache/cache_varnishd.h, bin/varnishd/cache/cache_vrt_var.c, doc/sphinx/reference/vcl_var.rst
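The capping semantics described above can be modeled as taking the smaller of the object's grace and req.grace when the latter is set. A sketch of the intended semantics only (the helper name and the "negative means unset" convention are assumptions for illustration, not the varnishd implementation):

```c
#include <assert.h>

/* Illustrative model: req.grace, when set in vcl_recv, caps the
 * grace an object may use during lookup; the object's own grace
 * applies otherwise. A negative req_grace stands in for "unset". */
static double
effective_grace(double obj_grace, double req_grace)
{
	if (req_grace >= 0.0 && req_grace < obj_grace)
		return (req_grace);	/* capped by req.grace */
	return (obj_grace);
}
```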
-
- 05 Jun, 2018 1 commit
-
-
Dag Haavi Finstad authored
Fixes: #2700
-
- 01 Jun, 2018 1 commit
-
-
Nils Goroll authored
HSH_Unbusy() calls BAN_NewObjCore() without holding the objhead lock, so the ban lurker may race and grab the ban mtx just after the new oc has been inserted but before the busy flag is cleared. While it would be correct to call BAN_NewObjCore() with the objhead mtx held, doing so would increase the pressure on the combined ban & objhead mtx. If the ban lurker encounters a busy object, we know that an unbusy must be in progress, and it is wiser to back off in favor of it. Fixes #2681
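The back-off amounts to a simple check in the lurker's walk. A miniature sketch (the flag name mirrors varnishd's OC_F_BUSY, but this is illustrative code, not the actual lurker):

```c
#include <assert.h>

/* Illustrative flag and struct, not the varnishd definitions. */
#define OC_F_BUSY	(1 << 1)

struct objcore {
	unsigned	flags;
};

/* A set busy flag means an unbusy operation is in progress on this
 * object; rather than racing it for the ban mtx, the lurker skips
 * the object and revisits it on a later pass. */
static int
lurker_should_skip(const struct objcore *oc)
{
	return ((oc->flags & OC_F_BUSY) != 0);
}
```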
-
- 25 Apr, 2018 2 commits
-
-
Pål Hermunn Johansen authored
-
Nils Goroll authored
... so log it under the Debug tag. FetchErrors should be actual errors which can be addressed. In this case, nothing is wrong in any way: the fact that we abort a fetch when we don't need the body is a varnish-internal optimization (which makes sense, but comes at the cost of closing a connection). Merges #2450. Conflicts: bin/varnishd/cache/cache_fetch.c
-
- 24 Apr, 2018 2 commits
-
-
Pål Hermunn Johansen authored
-
Martin Blix Grydeland authored
If the stevedore failed the object creation, we would leak the temporary VSB holding the computed vary string. This patch frees it. Problem exists in 4.1 and later.
-
- 19 Apr, 2018 1 commit
-
-
Pål Hermunn Johansen authored
What can possibly go wrong?
-
- 12 Apr, 2018 1 commit
-
-
Pål Hermunn Johansen authored
Back port two VCC test cases from the master branch. The latter is from 4b48f886 by Nils Goroll <nils.goroll@uplex.de>
-
- 11 Apr, 2018 1 commit
-
-
Nils Goroll authored
For i < 0, rlen could underflow. We are safe because of the check for i < 0 further down, so this change is just a minor cleanup. Fixes #2444
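The underflow in question is standard unsigned wraparound: folding a negative count into a `size_t` wraps to a huge value instead of shrinking the sum. A minimal illustration of the bug class and the guard (illustrative helper, not the actual varnishd code):

```c
#include <assert.h>
#include <stddef.h>

/* Guarded accumulation: never fold a negative read count into an
 * unsigned running length. */
static size_t
safe_add(size_t rlen, long i)
{
	if (i < 0)		/* the cleanup: check before adding */
		return (rlen);
	return (rlen + (size_t)i);
}
```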
-
- 25 Feb, 2018 1 commit
-
-
Pål Hermunn Johansen authored
-
- 24 Feb, 2018 1 commit
-
-
Shohei Tanaka (@xcir) authored
This is a back port of the commits submitted in #2569 and merged in dc6c6520.
-
- 23 Feb, 2018 3 commits
-
-
Poul-Henning Kamp authored
Fixes: #2582
-
Federico G. Schwindt authored
MacOS will return EINVAL under e.g. setsockopt if the connection was reset.
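The workaround boils down to classifying EINVAL, on macOS only, alongside the usual connection-gone errno values. A sketch of that classification (an assumed helper for illustration, not varnishd code):

```c
#include <assert.h>
#include <errno.h>

/* On macOS, calls such as setsockopt can fail with EINVAL when the
 * peer has already reset the connection, so treat EINVAL there like
 * the usual connection-gone errors. */
static int
is_conn_gone(int err)
{
#if defined(__APPLE__)
	if (err == EINVAL)
		return (1);
#endif
	return (err == ECONNRESET || err == EPIPE || err == ENOTCONN);
}
```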
-
Dridi Boukelmoune authored
Some user agents, like Safari, may "probe" specific resources such as media before fetching the full resource, usually asking for the first 2 or 11 bytes, probably to peek at magic numbers to figure out early whether a potentially large resource may not be supported (read: video). If the user agent also advertises gzip support, and the transaction is known beforehand to not be cacheable, varnishd will forward the Range header to the backend:

Accept-Encoding: gzip (when http_gzip_support is on)
Range: bytes=0-1

If the response happens to be both encoded and partial, the gunzip test cannot be performed. Otherwise we systematically end up with a broken transaction, closed prematurely:

FetchError b tGunzip failed
Gzip b u F - 2 0 0 0 0

Refs #2530. Refs #2554
-
- 22 Feb, 2018 2 commits
-
-
Dag Haavi Finstad authored
The non-fatal flag ("non_fatal" in master) was omitted when this was first backported.
-
Dag Haavi Finstad authored
This is a backport of 5cc47eaa. With this commit hsh_rush has been split into hsh_rush and hsh_rush_clean. The former needs to be called while holding the OH lock, and the latter needs to be called without holding the lock. The reason for this added complexity is that we can't hold the lock while calling HSH_DerefObjHead. Fixes: #2495
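The split follows a common two-phase pattern: collect the work while holding the lock, then drop the lock before the cleanup that must not hold it. A generic sketch (names and counters are illustrative; in varnishd the lock-free phase is the one calling HSH_DerefObjHead):

```c
#include <assert.h>
#include <pthread.h>

static pthread_mutex_t oh_mtx = PTHREAD_MUTEX_INITIALIZER;
static int nwaiting = 3;
static int nwoken, ncleaned;

/* Phase 1: must be called with oh_mtx held. */
static int
rush_locked(void)
{
	int n = nwaiting;	/* pick waiters to wake while protected */

	nwaiting = 0;
	nwoken += n;
	return (n);
}

/* Phase 2: must be called with oh_mtx NOT held. */
static void
rush_clean(int n)
{
	ncleaned += n;		/* e.g. drop references collected above */
}

static void
rush(void)
{
	int n;

	pthread_mutex_lock(&oh_mtx);
	n = rush_locked();
	pthread_mutex_unlock(&oh_mtx);
	rush_clean(n);		/* lock released before the cleanup */
}
```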
-
- 21 Feb, 2018 1 commit
-
-
Pål Hermunn Johansen authored
This fixes the long-standing #1799 for "keep" objects, and this commit message suggests a way of working around #1799 in the remaining cases. The following is a (long) explanation of how grace and keep work at the moment, how this relates to #1799, and how this commit changes things.

1. How does it work now, before this commit?

Objects in cache can outlive their TTL, and the typical reason for this is grace. Objects in cache can also linger because of obj.keep, or in the (rare but observed) case where the expiry thread has not yet evicted an object. Grace and keep are here to minimize backend load, but #1799 shows that we do not succeed in doing this in some important cases.

Whenever sub vcl_recv has ended with return (lookup) (which is the default action), we arrive at HSH_Lookup, where varnish sometimes only finds an expired object (one that matches the Vary logic, is not banned, etc.). When this happens, we will initiate a background fetch (by adding a "busy object") if and only if there is no busy object on the oh already. Then the expired object is returned with HSH_EXP or HSH_EXPBUSY, depending on whether a busy object was inserted.

2. What makes us run into #1799?

When we have gotten an expired object, we generally hope that it is in grace, and that sub vcl_hit will return (deliver). However, if grace has expired, then the default action (i.e. the action from builtin.vcl) is return (miss). It is also possible that the user VCL, for some reason, decides that the stale object should not be delivered, and does return (miss) explicitly. In these cases it is common that the current request is not the one that inserted a busy object, and then we run into the issue with the message "vcl_hit{} returns miss without busy object. Doing pass.".

Note that normally, if a resource is very popular and has a positive grace, it is unlikely that #1799 will happen. A new version will then always be available before the grace has run out, and everybody gets the latest fetched version with no #1799 problems. However, if a resource is very popular (like a manifest file in a live streaming setup) and has 0s grace, and the expiry thread lags a little bit behind, then vcl_hit can get an expired object even when obj.keep is zero. In these circumstances we can get a surge of requests to the backend, and this is especially bad on a very busy server.

Another real-world example is where grace is initially set high (48h or similar) and vcl_hit considers the health of the backend and, if the backend is healthy, explicitly does a return (miss) to ensure that the client gets a fresh object. This has been a recommended use of vcl_hit but, because of #1799, can cause considerable load on the backend. Similarly, we can get #1799 if we use "keep" to facilitate IMS requests to the backend, and we have a stale object for which several requests arrive before the first completes.

3. How do we fix this?

The main idea is to teach varnish to consider grace during lookup. To be specific, the following changes with this commit:

If an expired object is found, ttl+grace has expired, and there already is an ongoing request for the object (i.e. there exists a busy object), then the request is put on the waiting list instead of simply returning the object ("without a busy object") to vcl_hit. This choice is made because we anticipate that vcl_hit will do (the default) return (miss), and it is better to wait for the ongoing request than to initiate a new one with "pass" behavior. The result is that when the ongoing request finishes, we will either be able to go to vcl_hit, start a new request (can happen if there was a Vary mismatch) by inserting a new "busy object", or lose the race and have to go back to the waiting list (typically unlikely).

When grace is in effect, we go to vcl_hit even when we did not insert a busy object, anticipating that vcl_hit will return (deliver). This fixes the cases where the user does not explicitly do a return (miss) in vcl_hit for objects where ttl+grace has not expired. However, since this is not an uncommon practice, we also have to change our recommendation on how to use grace and keep. The new recommendation is:

* Set grace to the "normal value" for a working varnish+backend.
* Set keep to a high value if the backend is not 100% reliable and you want to use stale objects as a fallback.
* Do not explicitly return (miss) in sub vcl_hit{}. The exception is when this can only happen now and then and you are really sure that it is the right thing to do.
* In vcl_hit, check if the backend is sick, and then explicitly return (deliver) when appropriate (i.e. when you want a stale object delivered instead of an error message).

A test case is included.
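The new lookup behavior for an expired object can be summarized as a three-way decision. A heavily simplified model, for illustration only (the names and the reduction to three inputs are assumptions, not the actual HSH_Lookup code):

```c
#include <assert.h>

enum action { GO_VCL_HIT, WAITINGLIST, START_FETCH };

/* Simplified decision for a request that found only an expired
 * object during lookup. */
static enum action
lookup_expired(double age_past_ttl, double grace, int busy_present)
{
	if (age_past_ttl <= grace)
		return (GO_VCL_HIT);	/* in grace: anticipate deliver */
	if (busy_present)
		return (WAITINGLIST);	/* out of grace, fetch in flight */
	return (START_FETCH);		/* out of grace, no fetch yet */
}
```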
-
- 20 Feb, 2018 1 commit
-
-
Pål Hermunn Johansen authored
This is a back port of 33143e05 in master, and for this reason it is a little strange. The strangeness is due to the fact that obj.ttl is not available in vcl_deliver here, but it is in master. This commit could have been much simpler without ttl_now, but the function is taken to 4.1 regardless. The reason is that introducing obj.ttl in vcl_deliver is straightforward, and if someone is to do that in the future, the code in ttl_now(VRT_CTX) makes sure that obj.ttl will behave as in master, also in vcl_deliver. If obj.ttl is introduced in vcl_deliver, then the two test cases s00008.vtc and s00009.vtc should also be brought in, to make sure that obj.ttl works as expected.

The following is the text from the commit in master:

A new function, ttl_now(VRT_CTX), defines what "now" is when ttl and age are calculated in various VCL subs. To sum up:

* Before a backend fetch on the client side (vcl_recv, vcl_hit, vcl_miss) we use t_req from the request. This is the significance of this commit, and fixes the bug demonstrated by r02555.vtc.
* On the backend side, most notably vcl_backend_response, we keep the old "now" by simply using ctx->now.
* In vcl_deliver we use ctx->now, as before.

It was necessary to make all purges use t_req as their base time. Then, to not break c00041.vtc, it was necessary to change from ">=" to ">" in HSH_Lookup. All VMODs that currently use HSH_purge must change to using VRT_purge.

Conflicts: bin/varnishd/cache/cache_hash.c, bin/varnishd/cache/cache_objhead.h, bin/varnishd/cache/cache_req_fsm.c, bin/varnishd/cache/cache_vrt.c, bin/varnishd/cache/cache_vrt_var.c
-
- 14 Feb, 2018 1 commit
-
-
Martin Blix Grydeland authored
Since the last_lru tracks epoch time, it needs the double precision floating point type to accurately track the time. This is simply the test case from 2261dcfd.
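The precision argument is concrete: at Unix-epoch magnitude (~1.5e9 seconds), a single-precision float's 24-bit mantissa can only resolve steps of roughly 128 seconds, so sub-second LRU timestamps collapse. A small demonstration (variable names are illustrative):

```c
#include <assert.h>

static float  last_lru_f;
static double last_lru_d;

/* Record an LRU touch time in both precisions. */
static void
touch(double now)
{
	last_lru_f = (float)now;	/* fractional second discarded */
	last_lru_d = now;		/* preserved exactly */
}
```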
-
- 18 Dec, 2017 2 commits
-
-
Pål Hermunn Johansen authored
-
Martin Blix Grydeland authored
Also add asserts for the references held in req->objcore and req->stale_oc. The test case for #1807 catches this bug after adding the asserts. Fixes: #2502
-
- 14 Dec, 2017 2 commits
-
-
Federico G. Schwindt authored
Addresses #2456 in a different way.
-
Martin Blix Grydeland authored
I'm guessing this is due to rounding. All test cases involving the file stevedore have a minimum 10m file; it was silly to attempt a smaller one in this test. Fixes: #2496
-
- 30 Nov, 2017 1 commit
-
-
Pål Hermunn Johansen authored
The counter cache_hit_grace counts the number of grace hits. To be precise, it counts the number of times lookup returns an expired object, but vcl_hit is called and decides to return(deliver). Every time cache_hit_grace is incremented, cache_hit is also incremented (so this commit does not change the cache_hit counter). This is a back port of 1d62f5da.
-
- 27 Nov, 2017 5 commits
-
-
Dag Haavi Finstad authored
Fixes: #1772
-
Dag Haavi Finstad authored
With VBT_Close now being capable of dealing with STOLEN connections, we no longer need to VBT_Wait for them prior to close.
-
Dag Haavi Finstad authored
The change from shutdown(.., SHUT_WR) to shutdown(.., SHUT_RDWR) is required to make it trigger a waiter event.
-
Dag Haavi Finstad authored
This test case should not rely on first_byte_timeout/between_bytes_timeout.
-
Dag Haavi Finstad authored
The second time around, we force a fresh connection. The VCL user may choose to do 'return (retry);' in vcl_backend_error{} if further attempts are deemed warranted. Fixes: #2135
-