Nils Goroll / libvmod-re2
Commit 06610ae9, authored Jun 04, 2017 by Geoff Simmons
Add the set.matched() and .nmatches methods, with a bit of refactoring.
Parent: 0324e8cf

Showing 7 changed files with 548 additions and 136 deletions:

README.rst            +117  -9
configure.ac          +0    -27
src/tests/set.vtc     +208  -79
src/vmod_re2.c        +103  -1
src/vmod_re2.vcc      +103  -9
src/vre2/vre2set.cpp  +14   -9
src/vre2/vre2set.h    +3    -2
README.rst
...
...
@@ -48,6 +48,8 @@ import re2 [from "path"] ;
   VOID <obj>.add(STRING)
   VOID <obj>.compile()
   BOOL <obj>.match(STRING)
+  INT <obj>.nmatches()
+  BOOL <obj>.matched(INT)

 DESCRIPTION
 ===========
...
...
@@ -116,14 +118,35 @@ example::
   sub vcl_init {
       new myset = re2.set();
-      myset.add("foo");
-      myset.add("bar");
-      myset.add("baz");
+      myset.add("foo");  # Pattern 1
+      myset.add("bar");  # Pattern 2
+      myset.add("baz");  # Pattern 3
       myset.compile();
   }

-``myset.match(<string>)`` can now be used to match a string against the
-pattern ``foo|bar|baz``.
+``myset.match(<string>)`` can now be used to match a string against
+the pattern ``foo|bar|baz``. When a match is successful, the matcher
+has determined all of the patterns that matched. These can then be
+retrieved with the method ``.nmatches()`` for the number of matched
+patterns, and with ``.matched(n)``, which returns ``true`` if the
+``nth`` pattern matched, where the patterns are numbered in the order
+in which they were added::
+
+  if (myset.match("foobar")) {
+      std.log("Matched " + myset.nmatches() + " patterns");
+      if (myset.matched(1)) {
+          # Pattern /foo/ matched
+          call do_foo;
+      }
+      if (myset.matched(2)) {
+          # Pattern /bar/ matched
+          call do_bar;
+      }
+      if (myset.matched(3)) {
+          # Pattern /baz/ matched
+          call do_baz;
+      }
+  }

 regex options
 -------------
...
...
@@ -717,6 +740,9 @@ VCL load will fail with an error message.
 In other words, add all patterns to the set in ``vcl_init``, and
 finally call ``.compile()`` when you're done.

+When the ``.matched(INT)`` method is called after a successful match,
+the numbering corresponds to the order in which patterns were added.
+
 Example::

   sub vcl_init {
...
...
@@ -742,10 +768,10 @@ set.compile
 Compile the compound pattern represented by the set -- an alternation
 of all patterns added by ``.add()``.

-``.compile()`` may fail if the ``max_mem`` setting is not large enough
-for the composed pattern. In that case, the VCL load will fail with an
-error message (then consider a larger value for ``max_mem`` in the set
-constructor).
+``.compile()`` fails if no patterns were added to the set. It may also
+fail if the ``max_mem`` setting is not large enough for the composed
+pattern. In that case, the VCL load will fail with an error message
+(then consider a larger value for ``max_mem`` in the set
+constructor).

 ``.compile()`` MUST be called in ``vcl_init``, and MAY NOT be called
 more than once for a set object. If it is called in any other
...
...
@@ -768,6 +794,11 @@ Returns ``true`` if the given string matches the compound pattern
 represented by the set, i.e. if it matches any of the patterns that
 were added to the set.

+The matcher identifies all of the patterns that were added to the set
+and match the given string. These can be determined after a successful
+match using the ``.matched(INT)`` and ``.nmatches()`` methods
+described below.
+
 ``.match()`` MUST be called after ``.compile()``; otherwise the match
 always fails.
...
...
@@ -777,6 +808,83 @@ Example::
       call do_when_a_host_matched;
   }

+.. _func_set.matched:
+
+set.matched
+-----------
+
+::
+
+  BOOL set.matched(INT)
+
+Returns ``true`` after a successful match if the ``nth`` pattern that
+was added to the set is among the patterns that matched, ``false``
+otherwise. The numbering of the patterns corresponds to the order in
+which patterns were added in ``vcl_init``, counting from 1.
+
+The method refers back to the most recent invocation of ``.match()``
+for the same object in the same client or backend context. It always
+returns ``false``, for every value of the parameter, if it is called
+after an unsuccessful match (``.match()`` returned ``false``).
+
+``.matched()`` fails and returns ``false`` if:
+
+* The ``.match()`` method was not called for this object in the same
+  client or backend scope.
+
+* The integer parameter is out of range; that is, if it is less than 1
+  or greater than the number of patterns added to the set.
+
+On failure, the method writes an error message to the log with the tag
+``VCL_Error``; if it fails during ``vcl_init``, then the VCL load
+fails with the error message. In any other VCL subroutine, the method
+returns ``false`` on failure and processing continues; since ``false``
+is a legitimate return value, you should consider monitoring the log
+for the error messages.
+
+Example::
+
+  if (hostmatcher.match(req.http.Host)) {
+      if (hostmatcher.matched(1)) {
+          call do_domain1;
+      }
+      if (hostmatcher.matched(2)) {
+          call do_domain2;
+      }
+      if (hostmatcher.matched(3)) {
+          call do_domain3;
+      }
+  }
+
+.. _func_set.nmatches:
+
+set.nmatches
+------------
+
+::
+
+  INT set.nmatches()
+
+Returns the number of patterns that were matched by the most recent
+invocation of ``.match()`` for the same object in the same client or
+backend context. The method always returns 0 after an unsuccessful
+match (``.match()`` returned ``false``).
+
+If ``.match()`` was not called for this object in the same client or
+backend scope, ``.nmatches()`` fails and returns 0, writing an error
+message with ``VCL_Error`` to the log. If this happens in
+``vcl_init``, the VCL load fails with the error message. As with
+``.matched()``, ``.nmatches()`` returns a legitimate value and VCL
+processing continues when it fails in any other subroutine, so you
+should monitor the log for the error messages.
+
+Example::
+
+  if (myset.match(req.url)) {
+      std.log("URL matched " + myset.nmatches()
+              + " patterns from the set");
+  }
+
 .. _func_version:

 version
...
...
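Note on the README changes: the semantics documented for ``.match()``, ``.nmatches()`` and ``.matched(n)`` can be modelled outside Varnish. The following is a minimal sketch only. It uses ``std::regex`` as a stand-in for RE2, and the class ``PatternSet`` and its members are hypothetical names chosen for illustration, not part of the VMOD or of RE2. It shows the 1-based numbering by insertion order and the reset of match state on every call to ``match()``.

```cpp
#include <cassert>
#include <cstddef>
#include <regex>
#include <string>
#include <vector>

// Hypothetical model of a pattern set; std::regex stands in for RE2 here.
class PatternSet {
public:
    // Patterns are numbered from 1 in the order in which they are added.
    void add(const std::string& pattern) { patterns_.emplace_back(pattern); }

    // Unanchored match against every pattern; records which ones matched.
    bool match(const std::string& subject) {
        matched_.clear();  // state always refers to the most recent match
        for (std::size_t i = 0; i < patterns_.size(); ++i)
            if (std::regex_search(subject, patterns_[i]))
                matched_.push_back(i);
        assert(matched_.size() <= patterns_.size());
        return !matched_.empty();
    }

    // Number of patterns matched by the most recent match(); 0 on failure.
    std::size_t nmatches() const { return matched_.size(); }

    // True if the nth pattern (counting from 1) matched most recently.
    bool matched(int n) const {
        if (n < 1 || static_cast<std::size_t>(n) > patterns_.size())
            return false;  // out of range
        for (std::size_t idx : matched_)
            if (idx == static_cast<std::size_t>(n - 1))
                return true;
        return false;
    }

private:
    std::vector<std::regex> patterns_;
    std::vector<std::size_t> matched_;  // 0-based indices of matched patterns
};
```

With the patterns ``foo``, ``bar`` and ``baz`` added in that order, matching ``"foobar"`` in this model reports two matches, with patterns 1 and 2 matched and pattern 3 not. Unlike the VMOD, the sketch does not log an error for an out-of-range ``n``; it only returns ``false``.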
configure.ac
...
...
@@ -85,33 +85,6 @@ AC_PATH_PROG([VARNISHD], [varnishd], [],
 PKG_CHECK_MODULES([RE2], [re2])

-# RE2 versions up to 2016-03-01 require a pointer to vector<int> in
-# Set::Match(), to identify the regex that was matched. Since commit
-# df7a2dc in re2, the pointer may be NULL, if we just want to know
-# whether there was a match. This check tests for that feature.
-# Note: the test may cause a core dump if it fails.
-AC_LANG_PUSH(C++)
-SAVE_CXXFLAGS="$CXXFLAGS"
-SAVE_LDFLAGS="$LDFLAGS"
-CXXFLAGS+=" -std=c++11"
-LDFLAGS+=" -lre2"
-AC_RUN_IFELSE(
-[AC_LANG_SOURCE([[
-#include "re2/set.h"
-main() {
-re2::RE2::Set s(re2::RE2::DefaultOptions, re2::RE2::UNANCHORED);
-s.Add("", NULL);
-s.Compile();
-s.Match("", NULL);
-}
-]])],
-[AC_DEFINE([HAVE_SET_MATCH_NULL_VECTOR], [1],
-[Define to 1 if RE2 Set::Match() permits a NULL vector])]
-)
-CXXFLAGS="$SAVE_CXXFLAGS"
-LDFLAGS="$SAVE_LDFLAGS"
-AC_LANG_POP()
-
 # --enable-stack-protector
 AC_ARG_ENABLE(stack-protector,
 AS_HELP_STRING([--enable-stack-protector],[enable stack protector (default is YES)]),
...
...
src/tests/set.vtc
This diff is collapsed.
src/vmod_re2.c
...
...
@@ -68,6 +68,7 @@ struct vmod_re2_set {
 	vre2set *set;
 	char *vcl_name;
 	unsigned compiled;
+	int npatterns;
 };

 typedef struct task_match_t {
...
...
@@ -79,6 +80,13 @@ typedef struct task_match_t {
 	unsigned never_capture;
 } task_match_t;

+struct task_set_match {
+	unsigned magic;
+#define TASK_SET_MATCH_MAGIC 0x7a24a90b
+	int *matches;
+	size_t nmatches;
+};
+
 static char c;
 static const void *match_failed = (void *) &c;
...
...
@@ -106,6 +114,7 @@ errmsg(VRT_CTX, const char *fmt, ...)
 	AN(ctx->msg);
 	va_start(args, fmt);
 	VSB_vprintf(ctx->msg, fmt, args);
+	VSB_putc(ctx->msg, '\n');
 	va_end(args);
 	VRT_handling(ctx, VCL_RET_FAIL);
 }
...
...
@@ -570,6 +579,7 @@ vmod_set__init(VRT_CTX, struct vmod_re2_set **setp, const char *vcl_name,
 		return;
 	}
 	set->vcl_name = strdup(vcl_name);
+	AZ(set->npatterns);
 }

 VCL_VOID
...
...
@@ -615,6 +625,7 @@ vmod_set_add(VRT_CTX, struct vmod_re2_set *set, VCL_STRING pattern)
 		     set->vcl_name, pattern, pattern, err);
 		return;
 	}
+	set->npatterns++;
 }

 #undef ERR_PREFIX
...
...
@@ -633,6 +644,10 @@ vmod_set_compile(VRT_CTX, struct vmod_re2_set *set)
 		     "vcl_init", set->vcl_name);
 		return;
 	}
+	if (set->npatterns == 0) {
+		VERR(ctx, ERR_PREFIX "no patterns were added", set->vcl_name);
+		return;
+	}
 	if (set->compiled) {
 		VERR(ctx, ERR_PREFIX "%s has already been compiled",
...
...
@@ -655,6 +670,10 @@ VCL_BOOL
 vmod_set_match(VRT_CTX, struct vmod_re2_set *set, VCL_STRING subject)
 {
 	int match = 0;
+	struct vmod_priv *priv;
+	struct task_set_match *task;
+	char *buf;
+	size_t buflen;
 	const char *err;

 	CHECK_OBJ_NOTNULL(ctx, VRT_CTX_MAGIC);
...
...
@@ -667,15 +686,98 @@ vmod_set_match(VRT_CTX, struct vmod_re2_set *set, VCL_STRING subject)
 		     subject, set->vcl_name);
 		return 0;
 	}
-	if ((err = vre2set_match(set->set, subject, &match)) != NULL) {
+	priv = VRT_priv_task(ctx, set);
+	AN(priv);
+	if (priv->priv == NULL) {
+		if ((priv->priv = WS_Alloc(ctx->ws, sizeof(*task))) == NULL) {
+			VERRNOMEM(ctx, ERR_PREFIX "allocating match data",
+				  set->vcl_name, subject);
+			return 0;
+		}
+		priv->len = sizeof(*task);
+		priv->free = NULL;
+		task = priv->priv;
+		task->magic = TASK_SET_MATCH_MAGIC;
+	}
+	else {
+		WS_Contains(ctx->ws, priv->priv, sizeof(*task));
+		CAST_OBJ(task, priv->priv, TASK_SET_MATCH_MAGIC);
+	}
+
+	buf = WS_Snapshot(ctx->ws);
+	buflen = WS_Reserve(ctx->ws, 0);
+	if ((err = vre2set_match(set->set, subject, &match, buf, buflen,
+				 &task->nmatches)) != NULL) {
 		VERR(ctx, ERR_PREFIX "%s", set->vcl_name, subject, err);
+		WS_Release(ctx->ws, 0);
 		return 0;
 	}
+	if (match) {
+		task->matches = (int *)buf;
+		WS_Release(ctx->ws, task->nmatches * sizeof(int));
+	}
+	else
+		WS_Release(ctx->ws, 0);
 	return match;
 }

 #undef ERR_PREFIX

+VCL_BOOL
+vmod_set_matched(VRT_CTX, struct vmod_re2_set *set, VCL_INT n)
+{
+	struct vmod_priv *priv;
+	struct task_set_match *task;
+
+	CHECK_OBJ_NOTNULL(ctx, VRT_CTX_MAGIC);
+	CHECK_OBJ_NOTNULL(set, VMOD_RE2_SET_MAGIC);
+
+	if (n < 1 || n > set->npatterns) {
+		VERR(ctx, "n=%d out of range in %s.matched() (%d patterns)", n,
+		     set->vcl_name, set->npatterns);
+		return 0;
+	}
+
+	priv = VRT_priv_task(ctx, set);
+	AN(priv);
+	if (priv->priv == NULL) {
+		VERR(ctx, "%s.matched(%d) called without prior match",
+		     set->vcl_name, n);
+		return 0;
+	}
+
+	WS_Contains(ctx->ws, priv->priv, sizeof(*task));
+	CAST_OBJ(task, priv->priv, TASK_SET_MATCH_MAGIC);
+	if (task->nmatches == 0)
+		return 0;
+	WS_Contains(ctx->ws, task->matches, task->nmatches * sizeof(int));
+	n--;
+	for (unsigned i = 0; i < task->nmatches; i++)
+		if (task->matches[i] == n)
+			return 1;
+	return 0;
+}
+
+VCL_INT
+vmod_set_nmatches(VRT_CTX, struct vmod_re2_set *set)
+{
+	struct vmod_priv *priv;
+	struct task_set_match *task;
+
+	CHECK_OBJ_NOTNULL(ctx, VRT_CTX_MAGIC);
+	CHECK_OBJ_NOTNULL(set, VMOD_RE2_SET_MAGIC);
+
+	priv = VRT_priv_task(ctx, set);
+	AN(priv);
+	if (priv->priv == NULL) {
+		VERR(ctx, "%s.nmatches() called without prior match",
+		     set->vcl_name);
+		return 0;
+	}
+
+	WS_Contains(ctx->ws, priv->priv, sizeof(*task));
+	CAST_OBJ(task, priv->priv, TASK_SET_MATCH_MAGIC);
+	return task->nmatches;
+}
+
 /* Regex function interface */

 #define ERR_PREFIX "re2.match(pattern=\"%.40s\", text=\"%.40s\"): "
...
...
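Note on `vmod_set_matched()`: the lookup in the hunk above boils down to a range check, a check for prior match state, a conversion from the 1-based VCL index to the 0-based index stored by the matcher, and a linear scan. The sketch below restates that logic in self-contained C++ without any Varnish types. `TaskSetMatch`, `Lookup` and `set_matched` are hypothetical names for illustration. One deliberate difference is labeled in the code: the sketch returns a separate `Error` value, whereas the VMOD returns `false` and writes to the log.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the per-task record kept by the VMOD.
struct TaskSetMatch {
    std::vector<int> matches;  // 0-based indices of the matched patterns
};

// Assumption for illustration: errors get their own value here. The VMOD
// itself returns false (0) in the error cases and logs a VCL_Error.
enum class Lookup { NotMatched, Matched, Error };

// task == nullptr models ".matched() called without prior match".
Lookup set_matched(const TaskSetMatch* task, int npatterns, long n) {
    if (n < 1 || n > npatterns)
        return Lookup::Error;  // n out of range
    if (task == nullptr)
        return Lookup::Error;  // no match state in this task
    const int idx = static_cast<int>(n - 1);  // 1-based to 0-based
    for (int m : task->matches)
        if (m == idx)
            return Lookup::Matched;
    return Lookup::NotMatched;
}
```

Keeping the error cases distinct in the sketch makes visible the ambiguity the README warns about: in VCL, a `false` from `.matched()` can mean either "pattern did not match" or "call failed", and only the log tells them apart.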
src/vmod_re2.vcc
...
...
@@ -33,6 +33,8 @@ $Module re2 3 Varnish Module for access to the Google RE2 regular expression eng
   VOID <obj>.add(STRING)
   VOID <obj>.compile()
   BOOL <obj>.match(STRING)
+  INT <obj>.nmatches()
+  BOOL <obj>.matched(INT)

 DESCRIPTION
 ===========
...
...
@@ -101,14 +103,35 @@ example::
   sub vcl_init {
       new myset = re2.set();
-      myset.add("foo");
-      myset.add("bar");
-      myset.add("baz");
+      myset.add("foo");  # Pattern 1
+      myset.add("bar");  # Pattern 2
+      myset.add("baz");  # Pattern 3
       myset.compile();
   }

-``myset.match(<string>)`` can now be used to match a string against the
-pattern ``foo|bar|baz``.
+``myset.match(<string>)`` can now be used to match a string against
+the pattern ``foo|bar|baz``. When a match is successful, the matcher
+has determined all of the patterns that matched. These can then be
+retrieved with the method ``.nmatches()`` for the number of matched
+patterns, and with ``.matched(n)``, which returns ``true`` if the
+``nth`` pattern matched, where the patterns are numbered in the order
+in which they were added::
+
+  if (myset.match("foobar")) {
+      std.log("Matched " + myset.nmatches() + " patterns");
+      if (myset.matched(1)) {
+          # Pattern /foo/ matched
+          call do_foo;
+      }
+      if (myset.matched(2)) {
+          # Pattern /bar/ matched
+          call do_bar;
+      }
+      if (myset.matched(3)) {
+          # Pattern /baz/ matched
+          call do_baz;
+      }
+  }

 regex options
 -------------
...
...
@@ -620,6 +643,9 @@ VCL load will fail with an error message.
 In other words, add all patterns to the set in ``vcl_init``, and
 finally call ``.compile()`` when you're done.

+When the ``.matched(INT)`` method is called after a successful match,
+the numbering corresponds to the order in which patterns were added.
+
 Example::

   sub vcl_init {
...
...
@@ -638,10 +664,10 @@ $Method VOID .compile()
 Compile the compound pattern represented by the set -- an alternation
 of all patterns added by ``.add()``.

-``.compile()`` may fail if the ``max_mem`` setting is not large enough
-for the composed pattern. In that case, the VCL load will fail with an
-error message (then consider a larger value for ``max_mem`` in the set
-constructor).
+``.compile()`` fails if no patterns were added to the set. It may also
+fail if the ``max_mem`` setting is not large enough for the composed
+pattern. In that case, the VCL load will fail with an error message
+(then consider a larger value for ``max_mem`` in the set
+constructor).

 ``.compile()`` MUST be called in ``vcl_init``, and MAY NOT be called
 more than once for a set object. If it is called in any other
...
...
@@ -657,6 +683,11 @@ Returns ``true`` if the given string matches the compound pattern
 represented by the set, i.e. if it matches any of the patterns that
 were added to the set.

+The matcher identifies all of the patterns that were added to the set
+and match the given string. These can be determined after a successful
+match using the ``.matched(INT)`` and ``.nmatches()`` methods
+described below.
+
 ``.match()`` MUST be called after ``.compile()``; otherwise the match
 always fails.
...
...
@@ -666,6 +697,69 @@ Example::
       call do_when_a_host_matched;
   }

+$Method BOOL .matched(INT)
+
+Returns ``true`` after a successful match if the ``nth`` pattern that
+was added to the set is among the patterns that matched, ``false``
+otherwise. The numbering of the patterns corresponds to the order in
+which patterns were added in ``vcl_init``, counting from 1.
+
+The method refers back to the most recent invocation of ``.match()``
+for the same object in the same client or backend context. It always
+returns ``false``, for every value of the parameter, if it is called
+after an unsuccessful match (``.match()`` returned ``false``).
+
+``.matched()`` fails and returns ``false`` if:
+
+* The ``.match()`` method was not called for this object in the same
+  client or backend scope.
+
+* The integer parameter is out of range; that is, if it is less than 1
+  or greater than the number of patterns added to the set.
+
+On failure, the method writes an error message to the log with the tag
+``VCL_Error``; if it fails during ``vcl_init``, then the VCL load
+fails with the error message. In any other VCL subroutine, the method
+returns ``false`` on failure and processing continues; since ``false``
+is a legitimate return value, you should consider monitoring the log
+for the error messages.
+
+Example::
+
+  if (hostmatcher.match(req.http.Host)) {
+      if (hostmatcher.matched(1)) {
+          call do_domain1;
+      }
+      if (hostmatcher.matched(2)) {
+          call do_domain2;
+      }
+      if (hostmatcher.matched(3)) {
+          call do_domain3;
+      }
+  }
+
+$Method INT .nmatches()
+
+Returns the number of patterns that were matched by the most recent
+invocation of ``.match()`` for the same object in the same client or
+backend context. The method always returns 0 after an unsuccessful
+match (``.match()`` returned ``false``).
+
+If ``.match()`` was not called for this object in the same client or
+backend scope, ``.nmatches()`` fails and returns 0, writing an error
+message with ``VCL_Error`` to the log. If this happens in
+``vcl_init``, the VCL load fails with the error message. As with
+``.matched()``, ``.nmatches()`` returns a legitimate value and VCL
+processing continues when it fails in any other subroutine, so you
+should monitor the log for the error messages.
+
+Example::
+
+  if (myset.match(req.url)) {
+      std.log("URL matched " + myset.nmatches()
+              + " patterns from the set");
+  }
+
 $Function STRING version()

 Return the version string for this VMOD.
...
...
src/vre2/vre2set.cpp
...
...
@@ -69,14 +69,9 @@ vre2set::compile() const
 }

 inline bool
-vre2set::match(const char *subject) const
+vre2set::match(const char *subject, vector<int>* m) const
 {
-#ifdef HAVE_SET_MATCH_NULL_VECTOR
-	return set_->Match(subject, NULL);
-#else
-	vector<int> v;
-	return set_->Match(subject, &v);
-#endif
+	return set_->Match(subject, m);
 }

 const char *
...
...
@@ -151,10 +146,20 @@ vre2set_compile(vre2set *set)
 }

 const char *
-vre2set_match(vre2set *set, const char * const subject, int * const match)
+vre2set_match(vre2set *set, const char * const subject, int * const match,
+	      void *buf, const size_t buflen, size_t * const nmatches)
 {
 	try {
-		*match = set->match(subject);
+		vector<int> m;
+
+		*nmatches = 0;
+		*match = set->match(subject, &m);
+		if (*match) {
+			if (m.size() * sizeof(int) > buflen)
+				return "insufficient space to copy match data";
+			*nmatches = m.size();
+			memcpy(buf, m.data(), *nmatches * sizeof(int));
+		}
 		return NULL;
 	}
 	CATCHALL
...
...
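Note on `vre2set_match()`: the new wrapper copies the matched indices out of a `vector<int>` into a buffer supplied by the caller, and reports an error string instead of writing past the buffer. The sketch below isolates that step so it can be compiled without RE2 or Varnish. `copy_matches` is a hypothetical helper name; the error text is the one used in the hunk above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Copies match indices into a caller-provided buffer of buflen bytes.
// Returns nullptr on success, or a static error string if the buffer is
// too small. *nmatches is set to 0 unless indices were actually copied.
const char* copy_matches(const std::vector<int>& m, void* buf,
                         std::size_t buflen, std::size_t* nmatches) {
    assert(nmatches != nullptr);
    *nmatches = 0;
    if (m.empty())
        return nullptr;  // nothing matched, nothing to copy
    if (m.size() * sizeof(int) > buflen)
        return "insufficient space to copy match data";
    assert(buf != nullptr);
    *nmatches = m.size();
    std::memcpy(buf, m.data(), *nmatches * sizeof(int));
    return nullptr;
}
```

Checking the size before copying is what lets the caller in `vmod_re2.c` hand over whatever workspace it could reserve and then release only the bytes actually used.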
src/vre2/vre2set.h
...
...
@@ -46,7 +46,7 @@ public:
 	virtual ~vre2set();
 	int add(const char *pattern, string *error) const;
 	bool compile() const;
-	bool match(const char *subject) const;
+	bool match(const char *subject, std::vector<int>* m) const;
 };
 #else
 typedef struct vre2set vre2set;
...
...
@@ -72,7 +72,8 @@ extern "C" {
 	const char *vre2set_add(vre2set *set, const char *pattern);
 	const char *vre2set_compile(vre2set *set);
 	const char *vre2set_match(vre2set *set, const char *subject,
-				  int * const match);
+				  int * const match, void *buf,
+				  const size_t buflen, size_t * const nmatches);
 #ifdef __cplusplus
 }
...
...