Commits · fe001da43ecf2a9b33aad4bd2a8854ee377aa9ce · uplex-varnish / slash

07 Feb, 2024 40 commits

prep dskreqs when adding blocks · fe001da4
Nils Goroll authored Oct 26, 2023

Get dskreqs based on logbuffer size · 6a7eb003
Nils Goroll authored Oct 26, 2023

Refactor out logbuffer_prep_dskreqs · a98560b3
Nils Goroll authored Oct 26, 2023

When rewriting, flush the new log earlier · 787f2279
Nils Goroll authored Oct 26, 2023
```
... such that LRU, which is operating on the temporary log, can
make room.

Ref #28
```
Start getting the disk log block reserve earlier · 274d76fe
Nils Goroll authored Oct 26, 2023
```
Ref #28
```
Raise dsk alloc priority of logbuffer flushes with frees · b9d5b64a
Nils Goroll authored Oct 25, 2023
```
Hopefully, this also contributes to a solution for #28
```
Return the log's reserved disk blocks also for recycle · 5ebad71a
Nils Goroll authored Oct 25, 2023
```
Otherwise it looks like a rewrite would leak log blocks.
```
refactor out logbuffer_fini_dskreqs · accb3f03
Nils Goroll authored Oct 25, 2023

Sort prios table · fc0d299a
Nils Goroll authored Oct 25, 2023

Rename allocation priorities, Raise logblk (dsk) priority by one · 4abc6ad5
Nils Goroll authored Oct 25, 2023
```
it is more important than objects

Should also contribute to a fix for #28
```

Allocate additional log blocks early · 2d497b1a

Nils Goroll authored Oct 24, 2023

This, hopefully, is part of a possible solution to the nasty issue #28:

When we do not have a sufficiently large pre-allocated log (log region)
as determined by objsize_hint in relation to the storage size, we need
to dynamically allocate disk blocks while we flush the log.

When the log flush includes object deletions (in particular when
triggered from the disk LRU), we run into a typical deadlock: To
complete the transaction to free space, we need the space...

This commit is part of an attempt to make this work by allocating
space early on: When we only have 20% of the log region left, we start
to reserve more blocks for the log.
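
The threshold described above can be sketched as follows. This is only an illustration under stated assumptions: `struct logregion_sketch` and `logregion_wants_reserve()` are hypothetical names, and the real bookkeeping in fellow_log.c may look quite different.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical bookkeeping for a pre-allocated log region. */
struct logregion_sketch {
	size_t total_n;	/* number of blocks in the log region */
	size_t free_n;	/* number of blocks not yet used */
};

/*
 * Return true once 20% or less of the log region is left, that is,
 * when it is time to start reserving additional disk blocks.
 */
static bool
logregion_wants_reserve(const struct logregion_sketch *lr)
{
	assert(lr != NULL);
	assert(lr->free_n <= lr->total_n);
	/* free_n / total_n <= 1/5, written without division */
	return (lr->free_n * 5 <= lr->total_n);
}
```

Checking the threshold on every block taken from the region means the reservation starts well before the flush that would otherwise need the space.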

The problem can, for example, be reproduced with an objsize_hint of 1MB
and an actual object size on the order of 32KB.

Ref #28


Fix wrong assertion hitting when all discard methods fail · 11d86e62

Nils Goroll authored Nov 09, 2023

Manually tested with this modification:

diff --git a/src/fellow_log.c b/src/fellow_log.c
index 6075d81..45da269 100644
--- a/src/fellow_log.c
+++ b/src/fellow_log.c
@@ -1696,6 +1696,9 @@ fellow_io_regions_discard(struct fellow_fd *ffd, void *ioctx,
                r = fallocate(ffd->fd,
                    FALLOC_FL_KEEP_SIZE | FALLOC_FL_PUNCH_HOLE,
                    (off_t)todo->offset, (off_t)todo->len);
+               // XXX TEST
+               r = 1;
+               errno = EOPNOTSUPP;
                if (r == 0) {
                        if ((ffd->cap & FFD_CAN_FALLOCATE_PUNCH_URING) == 0) {
                                ffd->diag("fellow: fallocate punch"

Fixes #38
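
The scenario exercised by the test modification above, where every discard method reports failure, can be sketched generically. The names `discard_f`, `discard_unsupported()`, `discard_noop_ok()` and `discard_try_all()` are hypothetical and not taken from fellow_log.c; the point is only that "all methods failed" is a regular return value the caller must handle, not an assertion failure.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical discard method: 0 on success, non-zero with errno set. */
typedef int discard_f(size_t offset, size_t len);

/* Simulates a method the file system does not support. */
static int
discard_unsupported(size_t offset, size_t len)
{
	(void)offset;
	(void)len;
	errno = EOPNOTSUPP;
	return (1);
}

/* Simulates a method which works. */
static int
discard_noop_ok(size_t offset, size_t len)
{
	(void)offset;
	(void)len;
	return (0);
}

/*
 * Try each method in turn. Returns the index of the first method
 * which succeeded, or -1 if all of them failed.
 */
static int
discard_try_all(discard_f * const *methods, size_t n,
    size_t offset, size_t len)
{
	size_t i;

	assert(methods != NULL || n == 0);
	for (i = 0; i < n; i++) {
		assert(methods[i] != NULL);
		if (methods[i](offset, len) == 0)
			return ((int)i);
	}
	return (-1);
}
```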


fellow_stream_f(): Improve comment and assertion · e1017adc
Nils Goroll authored Nov 07, 2023
```
to make clear that we understand exactly what is happening.
```

Fix races for streaming busy objects · aca69dac

Nils Goroll authored Nov 07, 2023

For streaming busy objects, we basically rely on the varnish-cache
ObjExtend() / ObjWaitExtend() API to never read past the object: In
fellow_stream_f(), we always wait for more data (or the end of the
object) before returning, such that fellow_cache_obj_iter(), which
iterates over segments, should never touch a segment past the final
FCS_BUSY segment.

Yet - it did, by means of the read-ahead and the peek-ahead to determine
whether or not OBJ_ITER_END should be signaled.

We fix this issue by reading/peeking ahead only for segments with a
state beyond FCS_BUSY.
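
A minimal sketch of this rule, assuming an ordered state enum in which "beyond busy" means a numerically greater value. The enum `seg_state_sketch` and the function `readahead_count()` are hypothetical; only the idea of stopping at the first segment that is not past the busy state comes from the description above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, ordered segment states; not the real FCS_* enum. */
enum seg_state_sketch {
	SEG_INIT = 0,
	SEG_BUSY,	/* still being written by the fetch */
	SEG_DONE	/* complete, safe to read */
};

/*
 * Count how many segments after index cur may be read (or peeked)
 * ahead, up to max_ahead. Stops at the first segment whose state is
 * not beyond SEG_BUSY, so a busy segment is never touched.
 */
static size_t
readahead_count(const enum seg_state_sketch *segs, size_t n,
    size_t cur, size_t max_ahead)
{
	size_t i, cnt = 0;

	assert(segs != NULL);
	assert(cur < n);
	for (i = cur + 1; i < n && cnt < max_ahead; i++) {
		if (segs[i] <= SEG_BUSY)
			break;
		cnt++;
	}
	return (cnt);
}
```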

There is now also extensive test infrastructure to specifically test
concurrent access to busy objects. To keep layers separate,
fellow_cache_test uses a lightweight signal/wait implementation
analogous to the ObjExtend() / ObjWaitExtend() Varnish-Cache
interface.

An earlier version of t_busyobj() had run on my dev laptop for 3.5
hours without crashing, while without the fixes it had run into
assertion failures within seconds.

Fixes #35 and #36 (I hope)


Extend b62.vtc by cache reload · 83bc6afe
Nils Goroll authored Nov 06, 2023

Mark a question to revisit later · ab644362
Nils Goroll authored Nov 06, 2023


Add DBG() to fcsc_next() · 8fc211fe

Nils Goroll authored Nov 06, 2023

... to make it easier to follow the code in fellow_cache_test

motivated by #35


Reorganize offsets in log info · 39700637
Nils Goroll authored Nov 03, 2023


Introduce a dynamic minimum to dsk_reserve_chunks ... · fefc08da

Nils Goroll authored Nov 03, 2023

... such that the total reserve is no less than 2MB.

This is required for stable operation of LRU when the log is full.

Ref #28
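
The arithmetic behind such a dynamic minimum can be sketched as below. It assumes that the reserve consists of equally sized chunks; `reserve_chunks_effective()`, its parameters and `MIN_RESERVE_BYTES` are hypothetical names used for illustration only.

```c
#include <assert.h>
#include <stddef.h>

/* The total reserve should be no less than 2MB. */
#define MIN_RESERVE_BYTES ((size_t)2 << 20)

/*
 * Return the number of reserve chunks to actually use: the configured
 * value, raised if necessary such that chunks * chunk_bytes >= 2MB.
 */
static size_t
reserve_chunks_effective(size_t configured, size_t chunk_bytes)
{
	size_t min_chunks;

	assert(chunk_bytes > 0);
	/* round up: a partial chunk still needs a whole chunk */
	min_chunks = (MIN_RESERVE_BYTES + chunk_bytes - 1) / chunk_bytes;
	return (configured > min_chunks ? configured : min_chunks);
}
```

For example, with 64KB chunks the minimum is 32 chunks, so a configured value of 4 would be raised to 32, while a configured value of 100 would be kept.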


Add buddy_next_ptr_* · 6fc6bd78
Nils Goroll authored Oct 28, 2023

Fix single active logblock allocation for logregion-only case · 0b45d073
Nils Goroll authored Oct 26, 2023
```
Should be irrelevant in practice, because we would not flush
a single block during startup.
```

Fix nit in logblocks_alloc_from_logregion() with already allocated blocks · 78f2dcc4

Nils Goroll authored Oct 26, 2023

When some blocks were already allocated, we would fail to
use all of the log region, that is, the newly added assertion

	if (n > 0) AZ(logreg->free_n);

would fail.

This left some blocks of the logregion unused, but was insignificant
otherwise.


Fix stupid glitch rendering logbuffer capabilities useless · 8b6e81f7

Nils Goroll authored Oct 26, 2023

Unfortunately, this was present even in the initial public
release 58ec40f9

This issue should have had no production impact, but it made hunting
down bugs unnecessarily hard.


Move assertion to the right place · 108714e7

Nils Goroll authored Nov 03, 2023

When we work on the last segment, the remaining length is zero,
but we still have a current pointer and length.

This was a particularly annoying glitch because I wrote almost
the same code for varnish-cache with the equivalent assertion in
the right place :(

Sorry

Ref https://github.com/varnishcache/varnish-cache/pull/4013/commits/8ec77190d91603c8f0dead0cee013e3c9ca8fa78#diff-f79cfeda8456789ae873270aefa58e8f1e94213ee16d32ea96b8db8a7013ebf8R790
Closes #34


Introduce a flush finish state · dac7e1da

Nils Goroll authored Nov 02, 2023

It is planned to replace the "inuse" tri-state and might turn
out to be helpful for debugging.


Polish: use seq_inc() macro · 27841cb3
Nils Goroll authored Nov 01, 2023


Workaround for Varnish-Cache VC#4013: Wrong trim use, inefficient copy · 8409356f

Nils Goroll authored Oct 31, 2023

https://github.com/varnishcache/varnish-cache/pull/4013 fixes two
issues in Varnish-Cache, which are relevant for SLASH/fellow and of
which the first is the root cause of #33.

This commit works around these issues until the fix gets merged:

Because of the wrong use of the .objtrimstore API function by
varnish-cache, we remove it from our obj_methods and exploit the fact
that varnish-cache always sets the OA_LEN attribute when the object is
complete: We move the trimstore function there, effectively calling it
at the right time only.

The inefficient memory allocation fixed in the second commit of
VC#4013 is particularly relevant for fellow, because it causes the
allocation code to assume that the object might grow up to the maximum
possible size, which causes a substantial over-allocation. We work
around this issue for the case that a 304 copy is made from fellow to
fellow by using private thread-local storage to emulate basically the
same function as the #4013 fix.

Closes #33
Ref https://github.com/varnishcache/varnish-cache/pull/4013


Assert no duplicate trimming · ce719295
Nils Goroll authored Oct 31, 2023
```
Ref #33
Ref https://github.com/varnishcache/varnish-cache/pull/4013
```
Add PTOK() macro from varnish-cache · 5fc0b708
Nils Goroll authored Feb 07, 2024

Modify b62.vtc to trigger #33 · 5b92665c
Nils Goroll authored Oct 31, 2023

Minor polish · 7dc2a23a
Nils Goroll authored Oct 31, 2023

Tighten assertions in fellow_busy_body_seg_next · 9797906c
Nils Goroll authored Oct 31, 2023
```
These would have made analyzing #33 much easier. :|
```
Add assertions · 3850cded
Nils Goroll authored Oct 30, 2023
```
motivated by #32
```
Polish RST · 25662138
Nils Goroll authored Oct 30, 2023

Add b62.vtc · f84afcdd
Nils Goroll authored Oct 30, 2023

Start a document about helpful debugging information · 2bcc41c3
Nils Goroll authored Oct 30, 2023

Add a variation of varnish-cache c62.vtc · 76180991
Nils Goroll authored Oct 30, 2023


In forkrun(), fix SIGCHLD waiting and test it · e1b3e40f

Nils Goroll authored Oct 29, 2023

Spotted by Thomas Gleixner <tglx@linutronix.de>, THANK YOU

forkrun() never properly handled the case that a child exited before
the timeout expired, because we had failed to block the signal and
hence never received a SIGCHLD. This was overlooked because this
functionality was never relevant (it only delayed test execution) and
because we did not explicitly test it.

Related to #31


Do not call the stream function again after it has failed · dc565b27
Nils Goroll authored Oct 27, 2023
```
Should fix #32
```
Get a weird problem out of the way for now · 2b9f729f
Nils Goroll authored Oct 26, 2023
```
See #31
```