Newsgroups: php.internals
Date: Tue, 26 Aug 2025 11:15:14 +0100 (BST)
From: derick@php.net (Derick Rethans)
To: "Christoph M. Becker"
cc: youkidearitai, php internals
Subject: Re: [PHP-DEV][DISCUSSION] Deprecate mbregex in PHP 8.6 and maintenance version
In-Reply-To: <8d1d90e2-b2db-421a-babe-f915dc06b76d@gmx.de>

On Mon, 25 Aug 2025, Christoph M. Becker wrote:

> On 25.08.2025 at 09:26, youkidearitai wrote:
>
> > I improvement this RFC. https://wiki.php.net/rfc/eol-oniguruma
> >
> > Added more information about maintenance versions. What do you think
> > about Oniguruma maintenance ended. Please watch and feel free to
> > comment.
>
> First, thank you for caring about this! I agree that we need a long
> term solution for this issue. As I understand it, Oniguruma's
> greatest advantage over PCRE2 is that it supports other character
> encodings than Unicode and ANSI, so deprecating mbregex might be a
> problem for some users.

Yes, but I think Yuya mentioned somewhere else (I can't find it now), in
an earlier discussion, that many of these users have now also moved to
UTF-8.

It would also be possible to rewrite these uses from mbregex to
UConverter::convert + pcre.
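
To make that last point concrete, here is a minimal sketch of what such a rewrite might look like. It assumes ext-intl is loaded, the pattern is written in UTF-8, and PHP 8.0 or later; legacy_preg_match() is a made-up helper name for illustration, not an existing or proposed function:

```php
<?php
// Hypothetical helper (not part of PHP or of the RFC): run a UTF-8 PCRE
// pattern against a subject stored in a legacy encoding such as Shift_JIS.
function legacy_preg_match(string $utf8Pattern, string $subject, string $encoding): array|false
{
    // UConverter's constructor takes the destination encoding first.
    $toUtf8   = new UConverter('UTF-8', $encoding);
    $fromUtf8 = new UConverter($encoding, 'UTF-8');

    $utf8Subject = $toUtf8->convert($subject);
    if ($utf8Subject === false) {
        return false; // the subject could not be converted
    }

    // The /u modifier makes PCRE treat pattern and subject as UTF-8.
    if (preg_match($utf8Pattern, $utf8Subject, $matches) !== 1) {
        return false; // no match, or a PCRE error
    }

    // Hand the matches back in the caller's original encoding.
    return array_map(fn (string $m) => $fromUtf8->convert($m), $matches);
}
```

A caller that previously did mb_regex_encoding('SJIS') followed by mb_ereg() would call legacy_preg_match('/\p{Han}+/u', $sjisString, 'Shift_JIS') instead. Note that this only covers simple matching: byte offsets refer to the UTF-8 copy, and Oniguruma-specific pattern syntax would still need to be ported by hand.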

Incidentally, ICU also has a regular expression engine, but of course
that operates on UTF-16, and we'd have to create a full new
implementation for that:
- https://unicode-org.github.io/icu-docs/apidoc/released/icu4c/uregex_8h.html

> Still, the alternative would likely be to bundle liboniguruma, and I
> don't think that would be a good idea. So deprecating mbregex as of
> PHP 8.6.0 seems prudent; if there would be lots of objections, we
> could still reconsider.

I agree with that.

> Now I wonder how much trouble it would be to separate mbregex from
> ext-mbstring. If that can be done with a reasonable amount of work,
> that would likely be the best course of action (in addition to
> deprecating mbregex). We could than move the extension to PECL/PIE,
> and let users deal with it (I'm not happy what happened to ext-imap,
> but it's still better than relying on an unmaintained library from a
> bundled extension).

Seeing code like this in mbstring.c:

    #ifdef HAVE_MBREGEX
    PHP_MINIT(mb_regex) (INIT_FUNC_ARGS_PASSTHRU);
    #endif

And:

    php_mbregex.h:PHP_MINIT_FUNCTION(mb_regex);
    php_mbregex.h:PHP_MSHUTDOWN_FUNCTION(mb_regex);
    php_mbregex.h:PHP_RINIT_FUNCTION(mb_regex);
    php_mbregex.h:PHP_RSHUTDOWN_FUNCTION(mb_regex);
    php_mbregex.h:PHP_MINFO_FUNCTION(mb_regex);

makes it feel that it already sort-of operates as a sub-extension, and
it wouldn't be *too* much work. But it will still be work. Is it worth
it?

cheers,
Derick

-- 
https://derickrethans.nl | https://xdebug.org | https://dram.io

Author of Xdebug. Like it? Consider supporting me: https://xdebug.org/support

mastodon: @derickr@phpc.social @xdebug@phpc.social