Newsgroups: php.internals
From: php-internals_nospam@adviesenzo.nl (Juliette Reinders Folmer)
To: youkidearitai, php internals
Date: Tue, 24 Mar 2026 12:46:41 +0100
Message-ID: <69C279A1.5040405@adviesenzo.nl>
Subject: Re: [PHP-DEV][RFC][UNDER DISCUSSION] Oniguruma maintenance end and future of mbregex(End of mbregex)

On 23-3-2026 1:34, youkidearitai wrote:
> Hi, Internals
>
> I decide deprecate mbregex in 8.6 and drop in 9.0.
> So I would like to go to Under Discussion phase.
> https://wiki.php.net/rfc/eol-oniguruma
> https://github.com/php/php-src/pull/21490

Thank you for writing this RFC. I don't have a strong opinion either way. I fully understand that maintaining the Oniguruma library, now that it has been abandoned by the original project, is a huge and unenviable task.

Having said that, I am very curious what Ruby will be using going forward and if PHP could adopt a similar solution.
I also wonder whether there are other "blessed" forks of the Oniguruma library to which PHP could switch.
I believe this should be investigated and the results added to the RFC. Depending on the findings, they could either strengthen the case for the current proposal or lead to the proposal being adjusted.

Secondly, I believe the RFC would benefit from a more detailed section about what PHP devs can do to mitigate the deprecation.
For example, if the only expected text encoding is UTF-8, people can use `preg_*()` functions with the `u` modifier instead of the `mb_ereg*()` functions.
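To illustrate that route, here is a minimal sketch. It assumes the input is valid UTF-8 and that the patterns only use syntax which Oniguruma and PCRE share. The sample strings and variable names are made up for illustration. Note that `preg_*()` patterns need delimiters, and that no `mb_regex_encoding()` call is needed because the `u` modifier fixes the encoding to UTF-8.

```php
<?php
// Sketch only: valid UTF-8 input and portable pattern syntax are assumed.

$text = 'りんご、みかん、ぶどう';

// Before: $parts = mb_split('、', $text);
$parts = preg_split('/、/u', $text);

// Before: $joined = mb_ereg_replace('、', ', ', $text);
$joined = preg_replace('/、/u', ', ', $text);

// Before: $found = mb_ereg('(\d+)円', $price, $regs);
// preg_match() returns 1, 0 or false, so compare against 1 explicitly.
$price = '価格は1200円です';
$found = preg_match('/(\d+)円/u', $price, $matches) === 1;

// Before: mb_eregi('php', $subject) becomes the `i` and `u` modifiers together.
$caseInsensitive = preg_match('/php/iu', 'I like PHP') === 1;

print_r($parts);
echo $joined, PHP_EOL;
echo $found ? $matches[1] : 'no match', PHP_EOL;
```

Patterns that rely on Oniguruma-specific syntax will need a manual review; a mechanical search-and-replace is only safe for simple cases like these.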

I also think it is important to mention that the Symfony Mbstring[1] polyfill package does **NOT** polyfill the MB regex functionality, so it cannot be used as a replacement/alternative.

With this in mind, I also believe the impact analysis in the RFC should be expanded as the MbString extension is widely used.

To support this, I've created a branch in the PHPCompatibility package [2] specifically for this deprecation and I have run the relevant checks over the Packagist Top 4000 (as of yesterday).

I've posted the ruleset I used and the full results as a gist [3].
https://gist.github.com/jrfnl/bd0f66f1c185930427db4f093babf214

Summary of findings:

PHP CODE SNIFFER VIOLATION SOURCE SUMMARY
-------------------------------------------------------------------------------------------
SOURCE                                                                                COUNT
-------------------------------------------------------------------------------------------
PHPCompatibility.FunctionUse.RemovedFunctions.mb_splitDeprecated                      30
PHPCompatibility.FunctionUse.RemovedFunctions.mb_regex_encodingDeprecated             25
PHPCompatibility.FunctionUse.RemovedFunctions.mb_eregi_replaceDeprecated              20
PHPCompatibility.FunctionUse.RemovedFunctions.mb_ereg_replaceDeprecated               18
PHPCompatibility.FunctionUse.RemovedFunctions.mb_ereg_matchDeprecated                 13
PHPCompatibility.FunctionUse.RemovedFunctions.mb_ereg_search_initDeprecated           10
PHPCompatibility.FunctionUse.RemovedFunctions.mb_ereg_search_regsDeprecated           9
PHPCompatibility.FunctionUse.RemovedFunctions.mb_ereg_replace_callbackDeprecated      6
PHPCompatibility.FunctionUse.RemovedFunctions.mb_ereg_search_getregsDeprecated        5
PHPCompatibility.FunctionUse.RemovedFunctions.mb_eregDeprecated                       4
PHPCompatibility.FunctionUse.RemovedFunctions.mb_ereg_search_setposDeprecated         4
PHPCompatibility.FunctionUse.RemovedFunctions.mb_eregiDeprecated                      2
PHPCompatibility.FunctionUse.RemovedFunctions.mb_ereg_search_posDeprecated            1
-------------------------------------------------------------------------------------------
A TOTAL OF 147 SNIFF VIOLATIONS WERE FOUND IN 13 SOURCES
-------------------------------------------------------------------------------------------

So, 147 occurrences in the Packagist top 4000 in total.

While this is lower than I would have expected, it should be remembered that most distributed packages default to or require UTF-8, and that code handling non-UTF-8 encodings - and therefore needing the MB regex functionality - is mostly found in proprietary packages.

The PIE extension would help those packages.

Another potential alternative for those packages would be to convert all their data and code to a UTF-8 base, which will be a humongous project for most (and that deserves a mention in the RFC).
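Short of a full conversion, a smaller step would be to convert at the boundary only. The sketch below assumes the legacy encoding is known up front (Shift_JIS here, purely as an example) and that the mbstring extension itself remains available for `mb_convert_encoding()`. `legacy_preg_replace()` is a hypothetical helper name, not an existing function.

```php
<?php
// Hypothetical helper: convert legacy-encoded text to UTF-8, run the regex
// with preg_replace() and the `u` modifier, then convert the result back.
// The pattern and the replacement must themselves be UTF-8.
function legacy_preg_replace(
    string $pattern,
    string $replacement,
    string $subject,
    string $encoding
): string {
    $utf8 = mb_convert_encoding($subject, 'UTF-8', $encoding);

    $result = preg_replace($pattern, $replacement, $utf8);
    if ($result === null) {
        throw new RuntimeException('preg_replace failed: ' . preg_last_error_msg());
    }

    return mb_convert_encoding($result, $encoding, 'UTF-8');
}

// Example: strip whitespace from a Shift_JIS encoded string.
$sjis = mb_convert_encoding('東京都 渋谷区', 'SJIS', 'UTF-8');
$out  = legacy_preg_replace('/\s+/u', '', $sjis, 'SJIS');

// Convert back to UTF-8 only to display the result.
echo mb_convert_encoding($out, 'UTF-8', 'SJIS'), PHP_EOL; // prints 東京都渋谷区
```

This costs two conversions per call, so it is a stopgap for isolated call sites rather than a substitute for the full migration.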

Hope this helps.

Smile,
Juliette


1: https://symfony.com/packages/polyfill-mbstring
2: https://github.com/PHPCompatibility/PHPCompatibility/commit/47ba8b691f82d13dcfe496549c1110d250e18a8c
3: https://gist.github.com/jrfnl/bd0f66f1c185930427db4f093babf214