Date: Thu, 26 Mar 2026 01:03:32 +0900
From: youkidearitai@gmail.com (youkidearitai)
To: php internals
Subject: Re: [PHP-DEV][RFC][UNDER DISCUSSION] Oniguruma maintenance end and future of mbregex(End of mbregex)
References: <69C279A1.5040405@adviesenzo.nl>
In-Reply-To: <69C279A1.5040405@adviesenzo.nl>

On Tue, 24 Mar 2026 at 20:46, Juliette Reinders Folmer wrote:
>
> On 23-3-2026 1:34, youkidearitai wrote:
>
> Hi, Internals
>
> I have decided to deprecate mbregex in 8.6 and drop it in 9.0.
> So I would like to move to the Under Discussion phase.
> https://wiki.php.net/rfc/eol-oniguruma
> https://github.com/php/php-src/pull/21490
>
>
> Thank you for writing this RFC. I don't have a strong opinion either way. I fully understand that maintaining the Oniguruma library, while it was abandoned by the original project is a huge and unenviable task.
>
> Having said that, I am very curious what Ruby will be using going forward and if PHP could adopt a similar solution.
> I also wonder if there are no other "blessed" forks of the Oniguruma library to which PHP could switch.
> I believe this should be investigated and the results of this investigation should be added to the RFC to (potentially) strengthen the case for the current proposal, or, depending on the findings, it could be that the current proposal could be adjusted based on what this investigation throws up.
>
> Secondly, I believe the RFC would benefit from a more detailed section about what PHP devs can do to mitigate the deprecation.
> For example, if the only expected text encoding is UTF-8, people can use `preg_*()` functions with the `u` modifier instead of the `mb_ereg*()` functions.
>
> I also think it is important to mention that the Symfony Mbstring[1] polyfill package does **NOT** polyfill the MB regex functionality, so cannot be used as a replacement/alternative.
>
> With this in mind, I also believe the impact analysis in the RFC should be expanded as the MbString extension is widely used.
>
> To support this, I've created a branch in the PHPCompatibility package [2] specifically for this deprecation and I have run the relevant checks over the Packagist Top 4000 (as of yesterday).
>
> I've posted the used ruleset and the full results as a gist.
> https://gist.github.com/jrfnl/bd0f66f1c185930427db4f093babf214
>
> Summary of findings:
>
> PHP CODE SNIFFER VIOLATION SOURCE SUMMARY
> ---------------------------------------------------------------------------------------
> SOURCE                                                                            COUNT
> ---------------------------------------------------------------------------------------
> PHPCompatibility.FunctionUse.RemovedFunctions.mb_splitDeprecated                  30
> PHPCompatibility.FunctionUse.RemovedFunctions.mb_regex_encodingDeprecated         25
> PHPCompatibility.FunctionUse.RemovedFunctions.mb_eregi_replaceDeprecated          20
> PHPCompatibility.FunctionUse.RemovedFunctions.mb_ereg_replaceDeprecated           18
> PHPCompatibility.FunctionUse.RemovedFunctions.mb_ereg_matchDeprecated             13
> PHPCompatibility.FunctionUse.RemovedFunctions.mb_ereg_search_initDeprecated       10
> PHPCompatibility.FunctionUse.RemovedFunctions.mb_ereg_search_regsDeprecated       9
> PHPCompatibility.FunctionUse.RemovedFunctions.mb_ereg_replace_callbackDeprecated  6
> PHPCompatibility.FunctionUse.RemovedFunctions.mb_ereg_search_getregsDeprecated    5
> PHPCompatibility.FunctionUse.RemovedFunctions.mb_eregDeprecated                   4
> PHPCompatibility.FunctionUse.RemovedFunctions.mb_ereg_search_setposDeprecated     4
> PHPCompatibility.FunctionUse.RemovedFunctions.mb_eregiDeprecated                  2
> PHPCompatibility.FunctionUse.RemovedFunctions.mb_ereg_search_posDeprecated        1
> ---------------------------------------------------------------------------------------
> A TOTAL OF 147 SNIFF VIOLATIONS WERE FOUND IN 13 SOURCES
> ---------------------------------------------------------------------------------------
>
> So, 147 occurrences in the Packagist top 4000 in total.
>
> While this is lower than I would have expected, it should be remembered that most distributed packages will default to/require UTF-8 encoding and that code handling non-UTF-8 encodings - and therefore needing the Mb regex functionality - is mostly found in proprietary packages.
>
> The PIE extension would help those packages.
>
> Another potential alternative for those packages would be to convert all their data and code to a UTF-8 base, which will be a humongous project for most (and that deserves a mention in the RFC).
>
> Hope this helps.
>
> Smile,
> Juliette
>
>
> 1: https://symfony.com/packages/polyfill-mbstring
> 2: https://github.com/PHPCompatibility/PHPCompatibility/commit/47ba8b691f82d13dcfe496549c1110d250e18a8c
> 3: https://gist.github.com/jrfnl/bd0f66f1c185930427db4f093babf214

Hi, Juliette

Thank you very much for your gist.
I looked at it, and it seems that those packages depend on mbregex (Oniguruma).

> Having said that, I am very curious what Ruby will be using going forward and if PHP could adopt a similar solution.
> I also wonder if there are no other "blessed" forks of the Oniguruma library to which PHP could switch.
> I believe this should be investigated and the results of this investigation should be added to the RFC to (potentially) strengthen the case for the current proposal, or, depending on the findings, it could be that the current proposal could be adjusted based on what this investigation throws up.

Indeed, Ruby uses Onigmo (https://github.com/ruby/ruby/blob/master/regexec.c), which is a fork of Oniguruma.
There are differences between Onigmo and Oniguruma.

I have added your feedback to the RFC and quoted the results from your gist.
Please let me know if there are any problems.

Thank you again.

Regards
Yuya

-- 
---------------------------
Yuya Hamada (tekimen)
- https://tekitoh-memdhoi.info
- https://github.com/youkidearitai
-----------------------------