Commit 473897cd authored by Dridi Boukelmoune

vre: Don't count on the capture of the 0th group

Using groups[0].e to print the suffix of the subject string after a
regsub operation turns out to be unreliable. On Debian buster, ASAN
reveals uninitialized memory: the remains of its 0xbe fill pattern end
up in a pointer computation and later trigger a complaint about an
invalid pointer:

    runtime error: pointer index expression with base 0x6310000a0816
    overflowed to 0xbebf21cebec8c6d4

With a simple subtraction we can confirm the offset added to the base
address:

    0xbebf21cebec8c6d4 - 0x6310000a0816 = 0xbebebebebebebebe
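
The subtraction above can be double-checked with a few lines of C. This is
only a sketch: the helper names `offset_added` and `is_fill_pattern` are
hypothetical and not part of Varnish, and the 0xbe fill byte is taken from
the description above.

```c
#include <stdint.h>

/* Hypothetical helper: recover the offset that was added to a base address. */
uint64_t
offset_added(uint64_t overflowed, uint64_t base)
{
	return (overflowed - base);
}

/* Hypothetical helper: return 1 if every byte of v equals the fill byte. */
int
is_fill_pattern(uint64_t v, uint8_t fill)
{
	/* Multiplying by 0x0101... replicates the byte into all 8 positions. */
	return (v == (uint64_t)fill * UINT64_C(0x0101010101010101));
}
```

Feeding the two addresses from the ASAN report to `offset_added()` and the
result to `is_fill_pattern()` with 0xbe shows that the offset is made of
nothing but fill bytes, which is what points at an ovector entry that was
never written.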

To work around the possibility of an uninitialized ovector depending on
the pcre2 version, we initialize all offsets to PCRE2_UNSET and when we
encounter that value we capture a safe empty token.
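
The workaround described above could look roughly like the following when
isolated from pcre2. This is a simplified sketch, not the actual Varnish
code: `SKETCH_UNSET` stands in for PCRE2_UNSET (which pcre2.h defines as
`~(PCRE2_SIZE)0`), and `struct sketch_group`, `sketch_reset()` and
`sketch_capture()` are hypothetical names.

```c
#include <stddef.h>

/* Stand-in for PCRE2_UNSET, assumed here to be the all-ones size value. */
#define SKETCH_UNSET (~(size_t)0)

/* Hypothetical stand-in for the begin/end pair filled by vre_capture(). */
struct sketch_group {
	const char *b;
	const char *e;
};

/* Pre-fill every offset so slots the matcher never writes are recognizable. */
void
sketch_reset(size_t *ovector, size_t pairs)
{
	size_t i;

	for (i = 0; i < 2 * pairs; i++)
		ovector[i] = SKETCH_UNSET;
}

/* Turn offset pairs into pointers; unset pairs become a safe empty token. */
void
sketch_capture(const char *subject, const size_t *ovector, size_t pairs,
    struct sketch_group *groups)
{
	size_t g;

	for (g = 0; g < pairs; g++) {
		if (ovector[2 * g] == SKETCH_UNSET) {
			groups[g].b = groups[g].e = "";
		} else {
			groups[g].b = subject + ovector[2 * g];
			groups[g].e = subject + ovector[2 * g + 1];
		}
	}
}
```

With this arrangement a caller never adds a garbage offset to the subject
pointer, but an unset group no longer points into the subject string either,
so its `e` pointer cannot be used to locate the rest of the subject.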

This means that at the end of VRE_sub() we can no longer count on the
capture of the 0th group, which may now be the empty token, so we go
back to using the offset.
parent dab48c1b
@@ -188,7 +188,7 @@ vre_capture(const vre_t *code, const char *subject, size_t length,
 {
 	pcre2_match_data *data;
 	pcre2_code *re;
-	PCRE2_SIZE *ovector;
+	PCRE2_SIZE *ovector, b, e;
 	size_t nov, g;
 	int matches;
@@ -202,6 +202,11 @@ vre_capture(const vre_t *code, const char *subject, size_t length,
 		AN(data);
 	}
+	ovector = pcre2_get_ovector_pointer(data);
+	nov = 2 * pcre2_get_ovector_count(data);
+	for (g = 0; g < nov; g++)
+		ovector[g] = PCRE2_UNSET;
 	matches = pcre2_match(re, (PCRE2_SPTR)subject, length, offset,
 	    options, data, code->re_ctx);
@@ -213,8 +218,14 @@ vre_capture(const vre_t *code, const char *subject, size_t length,
 		if (nov > *count)
 			nov = *count;
 		for (g = 0; g < nov; g++) {
-			groups->b = subject + ovector[2 * g];
-			groups->e = subject + ovector[2 * g + 1];
+			b = ovector[2 * g];
+			e = ovector[2 * g + 1];
+			if (b == PCRE2_UNSET) {
+				groups->b = groups->e = "";
+			} else {
+				groups->b = subject + b;
+				groups->e = subject + e;
+			}
 			groups++;
 		}
 		*count = nov;
@@ -334,7 +345,7 @@ VRE_sub(const vre_t *code, const char *subject, const char *replacement,
 	}
 	/* Copy suffix to match */
-	VSB_cat(vsb, groups[0].e);
+	VSB_cat(vsb, subject + offset);
 	return (1);
 }