Newsgroups: php.internals
Date: Sat, 9 Jul 2022 13:15:05 +0200
Message-ID: <4a473f00-e68f-54de-fc70-f6f94885464a@gmx.de>
Subject: Re: Character range syntax ".." for character masks
From: cmbecker69@gmx.de ("Christoph M. Becker")
To: mickmackusa, internals@lists.php.net

On 09.07.2022 at 01:55, mickmackusa wrote:

> I've discovered that several native string functions offer a character mask
> as a parameter.
>
> I've laid out my observations at
> https://stackoverflow.com/q/72865138/2943403
>
> In a nutshell, not all character masks offer ranges via "double dot"
> syntax. Or should I refer to ".." as the "string spread operator" to avoid
> naming conflict with "..." -- the better known "spread operator" (array
> spread operator)?
>
> Rowan/@IMSoP informed me that the current division between the haves and
> the have-nots appears to be based on the source language from which PHP
> pulled. Essentially, if from C, the double dot does not represent a range.
> https://chat.stackoverflow.com/transcript/11?m=54864842#54864842
>
> Character ranges are not yet supported for:
> - strcspn()
> - strpbrk()
> - strspn()
>
> Before I fire off an RFC, I would like to know:
>
> 1. Are there any reasonable objections to consistently implementing
> character range expressions for all character masks?

In my opinion, this notation is somewhat confusing; trim($str, "a..z")
and trim($str, "a.z") look pretty similar, but have completely different
meanings. I'd rather have some general way to construct such ranges; the
slightly contrived implode(range()) is already available, though.

Besides, adding support for such character ranges to other functions now
constitutes a (probably minor) BC break.

> 2. Are there any native functions that I did not mention in my Stack Overflow
> answer?

It is impossible to list all "native" functions, at least if you mean
internal functions, because these may be defined by extensions. And
these extensions would need to explicitly implement support for such
character ranges.

> 3. Is it true that only single-byte characters can be used in all
> scenarios? If so, must it remain that way?

I think it needs to remain that way, since the functions already
accepting character ranges actually work on byte strings.

> 4. Is there already an official or widely-used term that I should be using
> for the two-dot operator?

I'd call them character ranges; the implementation is called
php_charmask().

> I should also mention that I initially considered requesting that all
> character mask parameters be named $mask (instead of $separators, $token,
> or $characters), but I later resigned myself to the fact that changing to a
> name that describes the texture of the string would remove the more
> vital/intuitive purpose of the string. I suppose the best that can be done
> to inform developers is to explicitly mention in the documentation when
> character range expressions are implemented and demonstrate their usage in
> an example (not just as a user comment at the bottom; this isn't In-N-Out
> Burger -- put your offerings on the frickin' menu!).

I agree that the documentation needs to be improved. While trim()
mentions the character range support in one sentence, addcslashes()
dedicates several paragraphs of detailed explanation.

--
Christoph M. Becker
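
For illustration, here is a minimal sketch of the points raised in the
replies above: the "a..z" versus "a.z" confusion, the implode(range())
alternative, the BC concern for strspn(), and the byte-wise nature of
character masks. The variable names and sample strings are made up for
this example, and the strspn() result reflects the behaviour of PHP
versions current at the time of this thread, where strspn() has no range
support.

```php
<?php
// Sample input; names and values are illustrative only.
$input = "abcHELLOxyz";

// ".." between two bytes is a range: every byte from 'a' to 'z' is stripped.
$rangeTrim = trim($input, "a..z");                          // "HELLO"

// A single dot is a literal: only 'a', '.' and 'z' are stripped.
$literalTrim = trim($input, "a.z");                         // "bcHELLOxy"

// The explicit alternative builds the same mask without any range syntax.
$explicitTrim = trim($input, implode('', range('a', 'z'))); // "HELLO"

// strspn() takes the mask literally, so it is the set {'a', '.', 'b'} and
// the whole subject matches. If ".." were read as a range, the mask would
// be {'a', 'b'} and the result would be 1: hence the BC break.
$spanToday = strspn("a.b", "a..b");                         // 3

// Masks are byte-wise. The mask is the UTF-8 encoding of U+00E4 (C3 A4) and
// the subject is U+00E0 (C3 A0): the shared lead byte C3 is stripped,
// leaving a lone continuation byte.
$byteTrim = trim("\xC3\xA0", "\xC3\xA4");                   // "\xA0"
```

The last case is why extending ranges to multi-byte characters is not a
simple change for functions that already operate on byte strings.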