fellow_cache: wait for any I/O before delete and fix assertion

seen during internal testing:

 #13 0x00007efc94253e32 in __GI___assert_fail (assertion=0x7efc935ef349 "FCO_REFCNT(fco) <= 2",
    file=0x7efc935ee10f "fellow_cache.c", line=6106,
    function=0x7efc935ef2d4 "void fellow_cache_obj_delete(struct fellow_cache *, struct fellow_cache_obj *, const uint8_t *)") at ./assert/assert.c:101
 #14 0x00007efc935c7f66 in fellow_cache_obj_delete (fc=0x7efc93a41300, fco=fco@entry=0x7efc4778a000,
    hash=hash@entry=0x7efc2ea04270 "\016\351S~\a\346\353҄B\256x\346Mx\375P\211Hz\377U\337\030ol\207Y\276䯒")
    at fellow_cache.c:6106

reason: ongoing I/O on segments:

(gdb) p fco->fdo_fcs.refcnt
$19 = 3
(gdb) p fco->fcsl->lsegs
$20 = 3
(gdb) set $i = 0
(gdb) p fco->fcsl->segs[$i++]->state
$21 = FCS_INCORE
(gdb) p fco->fcsl->segs[$i++]->state
$22 = FCS_READING
(gdb) p fco->fcsl->segs[$i++]->state
$23 = FCS_READING

so:

- we cannot make assumptions about the number of references
- we need to wait for any I/O, not just writes and seglist reads
parent abe6d8ad
@@ -6097,13 +6097,9 @@ fellow_cache_obj_delete(struct fellow_cache *fc,
 	assert(n <= FCO_MAX_REGIONS);
 	AZ(pthread_mutex_lock(&fco->mtx));
-	/* we must not free the object's disk space while it is still writing */
-	while (FCO_STATE(fco) == FCO_WRITING)
-		fellow_cache_seg_wait_locked(FCO_FCS(fco));
-	/* now the only other reference can be held by
-	 * fellow_cache_seglists_load()
+	/* we must not free the object's disk space while it still
+	 * has ongoing I/O
 	 */
-	assert(FCO_REFCNT(fco) <= 2);
 	while (FCO_REFCNT(fco) > 1)
 		AZ(pthread_cond_wait(&fco->cond, &fco->mtx));
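
For illustration only, the wait pattern from this change can be sketched as a standalone C program. Everything here is an assumption made for the example: `struct demo_obj`, `demo_obj_deref()`, `demo_obj_delete()` and `demo_run()` are hypothetical stand-ins and not part of fellow_cache, and the local `AZ()` macro only mimics the one used in the diff. The sketch shows a deleter that waits on a condition variable until it holds the last reference, without asserting any upper bound on the reference count.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* abort on non-zero; a local stand-in in the spirit of AZ() in the diff */
#define AZ(x) do { if ((x) != 0) abort(); } while (0)

#define DEMO_MAX_IO 8

/* hypothetical stand-in, NOT the real struct fellow_cache_obj */
struct demo_obj {
	pthread_mutex_t	mtx;
	pthread_cond_t	cond;
	unsigned	refcnt;	/* 1 == only the deleter's reference */
};

/* drop one I/O reference and wake up a waiting deleter */
static void
demo_obj_deref(struct demo_obj *o)
{
	AZ(pthread_mutex_lock(&o->mtx));
	assert(o->refcnt > 1);
	o->refcnt--;
	AZ(pthread_cond_broadcast(&o->cond));
	AZ(pthread_mutex_unlock(&o->mtx));
}

/* simulated segment read: completes, then releases its reference */
static void *
demo_io_thread(void *priv)
{
	demo_obj_deref(priv);
	return (NULL);
}

/* wait for all I/O; return the refcount seen at the point of deletion */
static unsigned
demo_obj_delete(struct demo_obj *o)
{
	unsigned r;

	AZ(pthread_mutex_lock(&o->mtx));
	/* no upper bound is asserted: any number of I/Os may be pending */
	while (o->refcnt > 1)
		AZ(pthread_cond_wait(&o->cond, &o->mtx));
	r = o->refcnt;
	AZ(pthread_mutex_unlock(&o->mtx));
	return (r);
}

/* start nio simulated I/Os, each holding a reference, then delete */
static unsigned
demo_run(unsigned nio)
{
	struct demo_obj o;
	pthread_t thr[DEMO_MAX_IO];
	unsigned i, r;

	assert(nio <= DEMO_MAX_IO);
	AZ(pthread_mutex_init(&o.mtx, NULL));
	AZ(pthread_cond_init(&o.cond, NULL));
	o.refcnt = 1 + nio;	/* references are taken before I/O starts */
	for (i = 0; i < nio; i++)
		AZ(pthread_create(&thr[i], NULL, demo_io_thread, &o));
	r = demo_obj_delete(&o);
	for (i = 0; i < nio; i++)
		AZ(pthread_join(thr[i], NULL));
	AZ(pthread_cond_destroy(&o.cond));
	AZ(pthread_mutex_destroy(&o.mtx));
	return (r);
}
```

Calling `demo_run(2)` roughly mirrors the gdb session above: the object starts with a reference count of 3 while two reads are in flight, and deletion proceeds only once the count is back to 1.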