I have moved the POSIX regex dependant functions to ext/ereg/ extension.
Now only places using the POSIX regex functions (ext/ereg/ excluded) are
ext/standard/browscap.c and ext/pgsql/pgsql.c.
So what to do with these 2 places using the POSIX stuff? Convert them to
use PCRE functions or enable PCRE to be build with the POSIX compat
functions?
ext/ereg/ is going to go to PECL anyway..
--Jani
Hi,
On Mon, 16 Jul 2007 15:47:36 +0300, in php.internals
jani.taskinen@sci.fi (Jani Taskinen) wrote:
Now only places using the POSIX regex functions (ext/ereg/ excluded) are
ext/standard/browscap.c and ext/pgsql/pgsql.c.So what to do with these 2 places using the POSIX stuff? Convert them to
use PCRE functions or enable PCRE to be build with the POSIX compat
functions?
I would suggest just rewriting them to PCRE functions.
According to 5.2.3 pgsql.c has nine explicit expressions. browscap.c
builds regexes and has just one regex call.
browscap.c does nothing special with regard to regular expressions
(just converting glob-like tokes such as ? * . to . .* . ). The
expressions in pgsql.c seem pretty straight forward to me as well.
It might be enough just to add delimiters to the expressions (and
properly escape if the delimiter is present in the expression). I
don't think it would take a long time to do.
--
- Peter Brodersen
I have moved the POSIX regex dependant functions to ext/ereg/
extension.Now only places using the POSIX regex functions (ext/ereg/ excluded)
are
ext/standard/browscap.c and ext/pgsql/pgsql.c.
I took a brief look at the pgsql.c stuff, and it looks like it's all
fairly straight-forward to alter to PCRE instead of POSIX, and it's
all localized to this function:
http://lxr.php.net/ident?i=php_pgsql_convert_match
Am I under-estimating the problem?
Or is it actually possible that I could fix this and contribute
something useful for once?
Is anybody else already "on it"?
Cuz I'm gonna go download CVS and see if I can't submit a patch...
[be afraid, be very afraid]
--
Some people have a "gift" link here.
Know what I want?
I want you to buy a CD from some indie artist.
http://cdbaby.com/browse/from/lynch
Yeah, I get a buck. So?
Richard Lynch kirjoitti:
I took a brief look at the pgsql.c stuff, and it looks like it's all
fairly straight-forward to alter to PCRE instead of POSIX, and it's
all localized to this function:
http://lxr.php.net/ident?i=php_pgsql_convert_matchAm I under-estimating the problem?
Propably not.
Is anybody else already "on it"?
AFAIK, no. Feel free. I hope you have pgsql in use and you can test it too. :)
--Jani
Richard Lynch kirjoitti:
I took a brief look at the pgsql.c stuff, and it looks like it's all
fairly straight-forward to alter to PCRE instead of POSIX, and it's
all localized to this function:
http://lxr.php.net/ident?i=php_pgsql_convert_matchAm I under-estimating the problem?
Propably not.
Is anybody else already "on it"?
AFAIK, no. Feel free. I hope you have pgsql in use and you can test it
too. :)
Yes and yes.
Errr, I guess I'd better make that "Yes and I'll try" :-)
I use PostgreSQL a lot, actually.
--
Some people have a "gift" link here.
Know what I want?
I want you to buy a CD from some indie artist.
http://cdbaby.com/browse/from/lynch
Yeah, I get a buck. So?
Now only places using the POSIX regex functions (ext/ereg/ excluded)
are
ext/standard/browscap.c and ext/pgsql/pgsql.c.
As you may know, I'm working on converting ext/pgsql/pgsql.c to use
PCRE instead of POSIX regex.
It's actually going fairly well, believe it or not, though I have a
ton of debug printf's in my C code at the moment, to be sure I'm
hitting all the lines I want to hit to test everything.
[Yeah, I know, there are way more fancy tools available this decade,
but I'm old.]
At any rate, I've hit a bit of a snag, and either I'm being stupid, or
there's been a bad regex pattern in there for awhile now...
Does not this line:
http://lxr.php.net/source/php-src/ext/pgsql/pgsql.c#5021
have an "extra" closing paren at the end of the pattern?
[I am making my patch against PHP 5; it just hasn't changed since 4]
My PCRE patch is telling me it's broken there:
Warning: pg_convert(): Compilation failed: unmatched parentheses at
offset 47 in /home/lynch/pg_pcre.php on line 28
[Seems weird how I get a PHP error out of my C code, but there it is...]
My eyes and counting up/down on my fingers like I learned in Lisp
class in college does.
The Regex Coach says it does.
Does POSIX regex not generate some kind of error on an extra paren at
the end?
Or am I missing something particularly arcane or abstract here?
If it's actually broken:
My PCRE patch will just not have that extra closing paren, so don't
rush a patch through just for this, unless you really want to.
I expect to wrap up in a couple days, unless something totally
unexpected crops up.
--
Some people have a "gift" link here.
Know what I want?
I want you to buy a CD from some indie artist.
http://cdbaby.com/browse/from/lynch
Yeah, I get a buck. So?
Now only places using the POSIX regex functions (ext/ereg/ excluded)
are
ext/standard/browscap.c and ext/pgsql/pgsql.c.
For your review, my first patch (!) along with a php test case, of
course, in a URL/directory structure that should be familiar:
http://l-i-e.com/php5/ext/pgsql/
:-)
The commit comment should probably have something not unlike this:
Use PCRE instead of POSIX regex
Remove stray closing parenthesis in PG_TIME pattern
Real Hackers can snag the patch and play with it and hit 'delete' now.
Regarding the test case...
The existing pg_convert test case only tested 3 conversions and there
are/were 9 PCRE/POSIX-regex non-trivial conversions.
I didn't really want to mess with adding a bunch of columns to the
existing test table, potentially messing up a bunch of other test
scripts, so just created/dropped my own table to hit all 9 PCREs I
hacked.
There are many other conversions, actually, but they mostly consist of
no-op or typecasting an int to a string with no actual change, or
adding apostrophes around a value to make it DB-ready, and I didn't
touch those anyway, so they should be no less broken than they were
before.
I am, of course, 100% open to critiques, comments, or derogatory
remarks. :-)
PS
The function was and probably should remain experimental in the docs,
I guess...
Though I am pretty sure I did excise one bug with that stray paren. :-)
--
Some people have a "gift" link here.
Know what I want?
I want you to buy a CD from some indie artist.
http://cdbaby.com/browse/from/lynch
Yeah, I get a buck. So?
Now only places using the POSIX regex functions (ext/ereg/ excluded)
are
ext/standard/browscap.c and ext/pgsql/pgsql.c.For your review, my first patch (!) along with a php test case, of
course, in a URL/directory structure that should be familiar:http://l-i-e.com/php5/ext/pgsql/
:-)
The commit comment should probably have something not unlike this:
Use PCRE instead of POSIX regex
Remove stray closing parenthesis in PG_TIME pattern
It's been a week and nobody has commented on this.
Should somebody commit it now?...
Or grant me commit karma to ext/pgsql
CVS username is 'lynch'
And, just to be sure, since it only changes internal workings and not
documented features, it should go into 5.x, right?...
Or is requiring PCRE instead of POSIX considered not BC for 5.x series?
I'll check PHP 6 pgsql and see if it's been Unicode-ified beyond
recognition for this patch, or if it applies cleanly there as well.
PS
I'll change the test case to do the insert with the converted data as
a further check that it worked, instead of a rather bogus test insert
of hand-coded data that it does now.
--
Some people have a "gift" link here.
Know what I want?
I want you to buy a CD from some indie artist.
http://cdbaby.com/browse/from/lynch
Yeah, I get a buck. So?