Hi, Internals
I wrote an RFC that proposes dropping support for mbregex.
https://wiki.php.net/rfc/eol-oniguruma
I wrote this as one idea.
What do you think?
Regards
Yuya
--
Yuya Hamada (tekimen)
On Fri, Aug 22, 2025 at 9:55, youkidearitai youkidearitai@gmail.com wrote:
> Hi, Internals
> I wrote an RFC that proposes dropping support for mbregex.
> https://wiki.php.net/rfc/eol-oniguruma
> I wrote this as one idea.
> What do you think?
> Regards
> Yuya
> --
> Yuya Hamada (tekimen)
Hello, internals
I have improved this RFC.
https://wiki.php.net/rfc/eol-oniguruma
I added more information about maintenance versions.
What do you think about Oniguruma's maintenance having ended?
Please take a look and feel free to comment.
Regards
Yuya
--
Yuya Hamada (tekimen)
> I have improved this RFC.
> https://wiki.php.net/rfc/eol-oniguruma
> I added more information about maintenance versions.
> What do you think about Oniguruma's maintenance having ended?
> Please take a look and feel free to comment.
First, thank you for caring about this! I agree that we need a long
term solution for this issue. As I understand it, Oniguruma's greatest
advantage over PCRE2 is that it supports other character encodings than
Unicode and ANSI, so deprecating mbregex might be a problem for some users.
Still, the alternative would likely be to bundle liboniguruma, and I
don't think that would be a good idea. So deprecating mbregex as of PHP
8.6.0 seems prudent; if there are lots of objections, we could
still reconsider.
Now I wonder how much trouble it would be to separate mbregex from
ext-mbstring. If that can be done with a reasonable amount of work,
that would likely be the best course of action (in addition to
deprecating mbregex). We could then move the extension to PECL/PIE, and
let users deal with it (I'm not happy with what happened to ext-imap, but
it's still better than relying on an unmaintained library from a bundled
extension).
Christoph
> > I have improved this RFC. https://wiki.php.net/rfc/eol-oniguruma
> > I added more information about maintenance versions. What do you think
> > about Oniguruma's maintenance having ended? Please take a look and
> > feel free to comment.
> First, thank you for caring about this! I agree that we need a long
> term solution for this issue. As I understand it, Oniguruma's
> greatest advantage over PCRE2 is that it supports other character
> encodings than Unicode and ANSI, so deprecating mbregex might be a
> problem for some users.
Yes, but I think Yuya mentioned somewhere else (I can't find it now) in
an earlier discussion, that many of these users have now also moved to UTF-8.
It would also be possible to rewrite these uses from using mbregex to
UConverter::convert+pcre.
Incidentally, ICU also has a regular expression engine, but of course
that'll operate on UTF-16, and we'd have to create a full new
implementation for that:
> Still, the alternative would likely be to bundle liboniguruma, and I
> don't think that would be a good idea. So deprecating mbregex as of
> PHP 8.6.0 seems prudent; if there are lots of objections, we
> could still reconsider.
I agree with that.
> Now I wonder how much trouble it would be to separate mbregex from
> ext-mbstring. If that can be done with a reasonable amount of work,
> that would likely be the best course of action (in addition to
> deprecating mbregex). We could then move the extension to PECL/PIE,
> and let users deal with it (I'm not happy with what happened to ext-imap,
> but it's still better than relying on an unmaintained library from a
> bundled extension).
Seeing code like this in mbstring.c:
#ifdef HAVE_MBREGEX
PHP_MINIT(mb_regex) (INIT_FUNC_ARGS_PASSTHRU);
#endif
And:
php_mbregex.h:PHP_MINIT_FUNCTION(mb_regex);
php_mbregex.h:PHP_MSHUTDOWN_FUNCTION(mb_regex);
php_mbregex.h:PHP_RINIT_FUNCTION(mb_regex);
php_mbregex.h:PHP_RSHUTDOWN_FUNCTION(mb_regex);
php_mbregex.h:PHP_MINFO_FUNCTION(mb_regex);
makes it feel like it already sort of operates as a sub-extension, and
it wouldn't be too much work. But it will still be work. Is it worth
it?
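For anyone not familiar with the pattern, here is a minimal, self-contained C sketch of what the snippets above amount to: a parent module that forwards its startup and shutdown hooks to an optional component behind a compile-time flag. This is only an illustration, not the php-src code. It does not use the Zend macros, and `parent_startup`, `parent_shutdown`, `regex_startup`, `regex_shutdown`, and `regex_started` are hypothetical names; `HAVE_MBREGEX` is defined by hand here instead of by the build system.

```c
/* Defined by hand for this sketch; in a real build the configure step decides. */
#define HAVE_MBREGEX 1

/* Tracks whether the optional component is currently initialised. */
static int regex_started = 0;

#ifdef HAVE_MBREGEX
/* Hypothetical stand-ins for the component's own startup/shutdown hooks. */
static int regex_startup(void)  { regex_started = 1; return 0; }
static int regex_shutdown(void) { regex_started = 0; return 0; }
#endif

/* Parent startup hook: forwards to the component only if it was compiled in. */
static int parent_startup(void)
{
#ifdef HAVE_MBREGEX
    if (regex_startup() != 0) {
        return -1; /* propagate the component's failure */
    }
#endif
    return 0;
}

/* Parent shutdown hook: mirrors startup so the component is torn down too. */
static int parent_shutdown(void)
{
#ifdef HAVE_MBREGEX
    if (regex_shutdown() != 0) {
        return -1;
    }
#endif
    return 0;
}
```

Calling `parent_startup()` sets `regex_started` to 1 and `parent_shutdown()` resets it; removing the `#define` compiles the component out without touching the parent hooks. Because the coupling is limited to a few guarded calls like these, moving the component into its own module would mostly mean giving it its own entry points instead of borrowing the parent's.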
cheers,
Derick
https://derickrethans.nl | https://xdebug.org | https://dram.io
Author of Xdebug. Like it? Consider supporting me: https://xdebug.org/support
mastodon: @derickr@phpc.social @xdebug@phpc.social