Do not short-cut the mutex protecting node->subreq.done

A follow-up issue has been reported in #13:

Assert error in Lck_Delete(), cache/cache_lck.c line 309:
  Condition((pthread_mutex_destroy(&ilck->mtx)) == 0) not true.

triggered from Lck_Delete(&bytes_tree->nodes_lock) at the bottom of
bytes_tree_fini().

Assuming everything else is working correctly, the only scenario I
can see at the moment is that we observe node->subreq.done == 1
before Lck_Unlock() has returned in vped_task(). In that case, we
could advance to destroying the lock while the other thread still
holds it.

The other use case of the shared lock is in fini_final(), where
we already go through an explicit lock/unlock.

Hopefully fixes #13 for real

(backport of 067c16e0 to 6.6)
parent 3b041dfe
@@ -624,13 +624,12 @@ fini_subreq(const struct vdp_ctx *vdx, struct node *node)
 	assert(node->type == T_SUBREQ);
-	if (node->subreq.done == 0) {
-		CAST_OBJ_NOTNULL(subreq, node->subreq.req, REQ_MAGIC);
-		CAST_OBJ_NOTNULL(pesi, subreq->transport_priv, PESI_MAGIC);
-		CHECK_OBJ_NOTNULL(pesi->pesi_tree, PESI_TREE_MAGIC);
-		CHECK_OBJ_NOTNULL(pesi->pesi_tree->tree, BYTES_TREE_MAGIC);
-		subreq_wait_done(node, pesi->pesi_tree->tree);
-	}
+	CAST_OBJ_NOTNULL(subreq, node->subreq.req, REQ_MAGIC);
+	CAST_OBJ_NOTNULL(pesi, subreq->transport_priv, PESI_MAGIC);
+	CHECK_OBJ_NOTNULL(pesi->pesi_tree, PESI_TREE_MAGIC);
+	CHECK_OBJ_NOTNULL(pesi->pesi_tree->tree, BYTES_TREE_MAGIC);
+	subreq_wait_done(node, pesi->pesi_tree->tree);
 	AN(node->subreq.done);
 	if (node->subreq.req == NULL) {
@@ -735,8 +734,7 @@ push_subreq(struct req *req, struct bytes_tree *tree,
 	assert(node->type == T_SUBREQ);
 	(void) next;
-	if (node->subreq.done == 0)
-		subreq_wait_done(node, tree);
+	subreq_wait_done(node, tree);
 	AN(node->subreq.done);
 	subreq = subreq_fixup(node, req);