Release the fco mtx during "unbusy" log submission

The test case from the next commit exposed a deadlock: the
only object in the test would consume all memory and could not
get LRU'd, because fellow_cache_async_write_complete() held
the fco mtx during log submission.
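
The locking pattern behind this change can be sketched in isolation. What follows is a minimal, hypothetical illustration and not the fellow code: `struct obj`, `slow_submit()` and `write_complete()` are made-up names, and only the pthread calls are real. It shows a per-object mutex being dropped around a slow call and the state being re-checked once the mutex is held again.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical stand-ins for the log states seen in the diff. */
enum logstate { WANTLOG, INLOG };

struct obj {
	pthread_mutex_t mtx;
	enum logstate logstate;
	int submitted;
};

/*
 * Stand-in for a slow submission that may have to wait for memory.
 * It must be called WITHOUT o->mtx held, so that other threads
 * (for example an eviction path) can take the mutex meanwhile.
 */
static void
slow_submit(struct obj *o)
{
	/* trylock only succeeds if the caller really dropped the mutex */
	int r = pthread_mutex_trylock(&o->mtx);
	assert(r == 0);
	o->submitted = 1;
	r = pthread_mutex_unlock(&o->mtx);
	assert(r == 0);
}

/* Called with o->mtx held; returns with o->mtx held. */
static void
write_complete(struct obj *o)
{
	if (o->logstate != WANTLOG)
		return;
	pthread_mutex_unlock(&o->mtx);	/* release during the slow part */
	slow_submit(o);
	pthread_mutex_lock(&o->mtx);	/* reacquire before touching state */
	/* nothing may have changed the state while we were unlocked */
	assert(o->logstate == WANTLOG);
	o->logstate = INLOG;
}
```

The assertion after reacquiring the mutex is only valid if some other mechanism keeps the state from changing while the mutex is released; in this commit the new comment names the wait for FCO_WRITING as that mechanism.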
parent 2c302579
@@ -2812,12 +2812,27 @@ fellow_cache_async_write_complete(struct fellow_cache *fc,
switch (fco->logstate) {
case FCOL_WANTLOG:
-/* XXX could probably this outside the lock
- * by deleting the obj again if the logstate
- * was changed to FCOL_TOOLATE
+/* fellow_cache_obj_delete() can not race
+ * us because of wait for FCO_WRITING.
+ *
+ * unlock during busy_log_submit because
+ * of LRU and waiting allocs
+ *
+ * sfe_oc_event DOES race us and we may
+ * lose events between the time we write
+ * the log here and call stvfe_oc_log_submitted()
+ *
+ * the whole event thing is racy anyway, not sure
+ * how relevant...
*/
fellow_cache_lru_chgbatch_apply(lcb);
AZ(pthread_mutex_unlock(&fco->mtx));
fellow_busy_log_submit(fbo);
stvfe_oc_log_submitted(fco->oc);
AZ(pthread_mutex_lock(&fco->mtx));
assert(fco->logstate == FCOL_WANTLOG);
fco->logstate = FCOL_INLOG;
break;
case FCOL_NOLOG:
@@ -4275,7 +4290,7 @@ fdr_compar(const void *aa, const void *bb)
}
#endif
-/* under fco mtx */
+/* NOT fco mtx */
static void
fellow_busy_log_submit(const struct fellow_busy *fbo)
{
@@ -4823,9 +4838,13 @@ fellow_cache_obj_delete(struct fellow_cache *fc,
switch (fco->logstate) {
case FCOL_DUNNO:
-case FCOL_WANTLOG:
fco->logstate = FCOL_TOOLATE;
break;
+case FCOL_WANTLOG:
+// SYNC WITH fellow_cache_async_write_complete()
+// see comment there
+WRONG("fellow_cache_obj_delete FCOL_WANTLOG - can't race");
+break;
case FCOL_NOLOG:
break;
case FCOL_TOOLATE: