Hi,
Someone pointed out to me recently that since delimiters are not a
requirement of PCRE itself, and thus should not be considered part of the
regular expression, there is really no need to escape them inside the
pattern passed to preg_match and the related preg functions.
I propose removing this requirement by looking for the closing delimiter
from the end of the regex string: skip backwards over any modifiers, and
the first character before them is the closing delimiter. Currently the
implementation finds the closing delimiter by iterating through the
entire regex string from start to end and stopping at the first unescaped
occurrence (which is why we have to escape the delimiter inside the
expression).
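To make the difference concrete, here is a rough sketch of the two scans in C. It is not the actual ext/pcre code; both function names are made up for illustration. It assumes the opening delimiter is the first character of the string, that the closing delimiter is the same character (no bracket-style pairs), and that modifiers are plain ASCII letters.

```c
#include <ctype.h>
#include <string.h>

/* Hypothetical helper: forward scan, roughly what happens today.
 * Returns the index of the first unescaped delimiter after position 0,
 * or -1 if there is none. */
static long closing_delim_forward(const char *regex)
{
    size_t len = strlen(regex);
    if (len < 2) return -1;
    char delim = regex[0];              /* assumption: delimiter is char 0 */
    for (size_t i = 1; i < len; i++) {
        if (regex[i] == '\\' && i + 1 < len) {
            i++;                        /* skip the escaped character */
            continue;
        }
        if (regex[i] == delim) return (long)i;
    }
    return -1;
}

/* Hypothetical helper: backward scan, as proposed.
 * Skips trailing modifier letters, then expects the delimiter.
 * Returns its index, or -1 if the string does not end in
 * delimiter + modifiers. */
static long closing_delim_backward(const char *regex)
{
    size_t len = strlen(regex);
    if (len < 2) return -1;
    char delim = regex[0];              /* assumption: delimiter is char 0 */
    size_t i = len;
    /* assumption: modifiers are ASCII letters only */
    while (i > 1 && isalpha((unsigned char)regex[i - 1])) i--;
    if (i > 1 && regex[i - 1] == delim) return (long)(i - 1);
    return -1;
}
```

For the string /a/b/i the forward scan stops at the second slash, leaving "b/i" to be parsed as modifiers, while the backward scan stops at the last slash, so the pattern is a/b with modifier i. For a pattern that already escapes its delimiter, such as /a\/b/i, both scans land on the same closing delimiter.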
The change introduces no BC breakage as far as I can tell and offers a
number of advantages.
- Regular expressions supplied to preg_match functions become portable
  across PCRE implementations, since escaping the delimiter inside of the
  regular expression really isn't required (i.e. it may work from PHP to
  Perl, but may not work from Perl to PHP). This makes things a lot easier
  to port between PHP and other PCRE implementations.
- The regular expression is easier to read by a human if we remove the
  requirement to escape delimiters.
- It becomes clearer that the delimiters are not a part of the regular
  expression, both in the implementation and from userspace code.
It's not a big change, so I propose merging it into 5.5, or into 5.4 and
then up into 5.5.
Thoughts, objections, etc?