- 07 Feb, 2024 40 commits
-
-
Nils Goroll authored
to make clear that we understand exactly what is happening.
-
Nils Goroll authored
For streaming busy objects, we basically rely on the varnish-cache ObjExtend() / ObjWaitExtend() API to never read past the object: In fellow_stream_f(), we always wait for more data (or the end of the object) before returning, such that fellow_cache_obj_iter(), which iterates over segments, should never touch a segment past the final FCS_BUSY segment. Yet it did, by means of the read-ahead and the peek-ahead used to determine whether or not OBJ_ITER_END should be signaled. We fix this issue by reading/peeking ahead only for segments with a state beyond FCS_BUSY. There is now also extensive test infrastructure to specifically test concurrent access to busy objects. To keep layers separate, fellow_cache_test uses a lightweight signal/wait implementation analogous to the ObjExtend() / ObjWaitExtend() Varnish-Cache interface. An earlier version of t_busyobj() ran on my dev laptop for 3.5 hours without crashing, while without the fixes it ran into assertion failures within seconds. Fixes #35 and #36 (I hope)
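A lightweight signal/wait pair of the kind described above could look roughly like the following. This is only a sketch under assumptions: the names busy_sig, busy_sig_init(), busy_sig_extend() and busy_sig_wait() are hypothetical and are not taken from fellow_cache_test or Varnish-Cache. It only illustrates the idea that a reader blocks until the writer has published more data or declared the object complete.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical names: not the fellow_cache_test or Varnish-Cache API. */
struct busy_sig {
	pthread_mutex_t	mtx;
	pthread_cond_t	cond;
	size_t		avail;	/* bytes published by the writer so far */
	int		done;	/* non-zero once the object is complete */
};

static void
busy_sig_init(struct busy_sig *bs)
{
	int r;

	r = pthread_mutex_init(&bs->mtx, NULL);
	assert(r == 0);
	r = pthread_cond_init(&bs->cond, NULL);
	assert(r == 0);
	(void)r;
	bs->avail = 0;
	bs->done = 0;
}

/* Writer side: publish len more bytes, roughly what ObjExtend() signals. */
static void
busy_sig_extend(struct busy_sig *bs, size_t len, int final)
{
	pthread_mutex_lock(&bs->mtx);
	bs->avail += len;
	if (final)
		bs->done = 1;
	pthread_cond_broadcast(&bs->cond);
	pthread_mutex_unlock(&bs->mtx);
}

/*
 * Reader side: block until more than `seen` bytes are available or the
 * object is complete, roughly what ObjWaitExtend() provides.
 * Returns the number of bytes the reader may touch.
 */
static size_t
busy_sig_wait(struct busy_sig *bs, size_t seen, int *done)
{
	size_t avail;

	pthread_mutex_lock(&bs->mtx);
	while (bs->avail <= seen && !bs->done)
		pthread_cond_wait(&bs->cond, &bs->mtx);
	avail = bs->avail;
	*done = bs->done;
	pthread_mutex_unlock(&bs->mtx);
	return (avail);
}
```

With such a pair, a reader that never looks beyond the value returned by the wait function cannot touch data the writer has not published yet, which is the invariant the read-ahead and peek-ahead had violated.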
-
Nils Goroll authored
-
Nils Goroll authored
-
Nils Goroll authored
... to make it easier to follow the code in fellow_cache_test motivated by #35
-
Nils Goroll authored
-
Nils Goroll authored
... such that the total reserve is no less than 2MB. This is required for stable operation of LRU when the log is full. Ref #28
-
Nils Goroll authored
-
Nils Goroll authored
Should be irrelevant in practice, because we would not flush a single block during startup.
-
Nils Goroll authored
When some blocks were already allocated, we would fail to use all of the log region, that is, the newly added assertion if (n > 0) AZ(logreg->free_n); would fail. This left some blocks of the log region unused, but was insignificant otherwise.
-
Nils Goroll authored
Unfortunately, this was present even in the initial public release 58ec40f9. This issue should have had no production impact, but it made hunting down bugs unnecessarily hard.
-
Nils Goroll authored
When we work on the last segment, the remaining length is zero, but we still have a current pointer and length. This was a particularly annoying glitch because I wrote almost the same code for varnish-cache with the equivalent assertion in the right place :( Sorry. Ref https://github.com/varnishcache/varnish-cache/pull/4013/commits/8ec77190d91603c8f0dead0cee013e3c9ca8fa78#diff-f79cfeda8456789ae873270aefa58e8f1e94213ee16d32ea96b8db8a7013ebf8R790 Closes #34
-
Nils Goroll authored
it is planned to replace the "inuse" tri-state and might turn out helpful for debugging.
-
Nils Goroll authored
-
Nils Goroll authored
https://github.com/varnishcache/varnish-cache/pull/4013 fixes two issues in Varnish-Cache, which are relevant for SLASH/fellow and of which the first is the root cause of #33. This commit works around these issues until the fix gets merged: Because of the wrong use of the .objtrimstore API function by varnish-cache, we remove it from our obj_methods and exploit the fact that varnish-cache always sets the OA_LEN attribute when the object is complete: We move the trimstore function there, effectively calling it at the right time only. The inefficient memory allocation fixed in the second commit of VC#4013 is particularly relevant for fellow, because it causes the allocation code to assume that the object might grow up to the maximum possible size, which causes a substantial over-allocation. We work around this issue for the case that a 304 copy is made from fellow to fellow by using private thread-local storage to emulate basically the same function as the #4013 fix. Closes #33 Ref https://github.com/varnishcache/varnish-cache/pull/4013
-
Nils Goroll authored
Ref #33 Ref https://github.com/varnishcache/varnish-cache/pull/4013
-
Nils Goroll authored
-
Nils Goroll authored
-
Nils Goroll authored
-
Nils Goroll authored
These would have made analyzing #33 much easier. :|
-
Nils Goroll authored
motivated by #32
-
Nils Goroll authored
-
Nils Goroll authored
-
Nils Goroll authored
-
Nils Goroll authored
-
Nils Goroll authored
Spotted by Thomas Gleixner <tglx@linutronix.de>, THANK YOU. forkrun() never properly handled the case that a child exited before the timeout expired, because we had failed to block the signal and hence never received a SIGCHLD. This was overlooked because this functionality was never relevant (it only delayed test execution) and because we did not explicitly test it. Related to #31
-
Nils Goroll authored
Should fix #32
-
Nils Goroll authored
See #31
-
Nils Goroll authored
It seems that the recent Debian updates on my machine brought some change in timing/scheduling which makes flock() fail when the lock holder is being killed by the timeout code in forkrun(). For future reference: logs/20231026_apt_history.txt
-
Nils Goroll authored
-
Nils Goroll authored
-
Nils Goroll authored
-
Nils Goroll authored
-
Nils Goroll authored
Making a full copy of the logbuffer just to access four members was not justified. The original idea was to re-use logbuffer_fini, but, effectively, only buddy_return1_ptr_page() was called.
-
Nils Goroll authored
-
Nils Goroll authored
In particular with uint8_t, we risk writes being non-atomic and overwriting neighboring members.
-
Nils Goroll authored
-
Nils Goroll authored
-
Nils Goroll authored
Ref #28
-
Nils Goroll authored
Ref #28
-