Newsgroups: php.internals
Xref: news.php.net php.internals:118231
From: mickmackusa@gmail.com (mickmackusa)
To: Kirill Nesmeyanov
Cc: internals
Date: Sat, 9 Jul 2022 14:02:21 +1000
Subject: Re: Character range syntax ".." for character masks
In-Reply-To: <1657331615.886786918@f311.i.mail.ru>
References: <1657331615.886786918@f311.i.mail.ru>

On Saturday, July 9, 2022, Kirill Nesmeyanov wrote:

> Note that the "..." operator is unary, so there is no syntax conflict when
> using two floats:
> ```
> echo 0...1; // 00.1
> ```
>
> However, in the case of the ".." operator, it is assumed to be a binary
> operator, so problems with grammar ambiguity may arise:
> ```
> echo 0 ..1; // 00.1
> echo 0.. 1; // 01
> ```
>
> * Note: The syntax you suggest is widely used in at least Ruby (
> https://ruby-doc.org/core-2.5.1/Range.html ) and CoffeeScript.
> * Note: There are also the `trim`, `ltrim` and `rtrim` functions
>
> > Saturday, 9 July 2022, 2:56 +03:00 from mickmackusa:
> >
> > I've discovered that several native string functions offer a character
> > mask as a parameter.
> >
> > I've laid out my observations at
> > https://stackoverflow.com/q/72865138/2943403
> >
> > In a nutshell, not all character masks offer ranges via "double dot"
> > syntax. Or should I refer to ".." as the "string spread operator" to avoid
> > a naming conflict with "..." -- the better known "spread operator" (array
> > spread operator)?
> >
> > Rowan/@IMSoP informed me that the current division between the haves and
> > the have-nots appears to be based on the source language from which PHP
> > pulled. Essentially, if from C, the double dot does not represent a range.
> > https://chat.stackoverflow.com/transcript/11?m=54864842#54864842
> >
> > Character ranges are not yet supported for:
> > - strcspn()
> > - strpbrk()
> > - strspn()
> >
> > Before I fire off an RFC, I would like to know:
> >
> > 1. Are there any reasonable objections to consistently implementing
> > character range expressions for all character masks?
> > 2. Are there any native functions that I did not mention in my Stack
> > Overflow answer?
> > 3. Is it true that only single-byte characters can be used in all
> > scenarios? If so, must it remain that way?
> > 4. Is there already an official or widely-used term that I should be using
> > for the two-dot operator?
> >
> > I should also mention that I initially considered requesting that all
> > character mask parameters be named $mask (instead of $separators, $token,
> > or $characters), but I later resigned myself to the fact that changing to
> > a name that describes the texture of the string would remove the more
> > vital/intuitive purpose of the string. I suppose the best that can be done
> > to inform developers is to explicitly mention in the documentation when
> > character range expressions are implemented and demonstrate their usage in
> > an example (not just as a user comment at the bottom; this isn't In-N-Out
> > Burger -- put your offerings on the frickin' menu!).
> >
> > mickmackusa
>
> --
> Kirill Nesmeyanov

Thanks for your reply, Kirill, but I am in no way trying to introduce a new,
general-use operator for all encountered strings. I am purely focused on
having the operator consistently implemented for all character masks.

The language construct `echo` does not have a specified character mask
parameter.

mickmackusa
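
For readers following the thread, the inconsistency under discussion can be sketched in a few lines of PHP. This is only an illustration, not part of any proposal: `expandCharMask()` is a hypothetical userland helper invented here, and it only roughly approximates how range-aware functions such as `trim()` interpret "..". It works on single bytes and leaves a descending range untouched instead of raising a warning.

```php
<?php

// trim() understands ".." as a range, so "0..9" means every digit.
var_dump(trim("42 apples 7", "0..9"));          // " apples "

// strspn() treats the mask literally: only '0', '.' and '9' are allowed,
// so the leading '4' already fails to match.
var_dump(strspn("42 apples", "0..9"));          // 0
var_dump(strspn("42 apples", "0123456789"));    // 2

/**
 * Hypothetical helper (not a PHP built-in): expand "a..z" style ranges
 * into an explicit list of single-byte characters.
 */
function expandCharMask(string $mask): string
{
    $expanded = preg_replace_callback(
        '/(.)\.\.(.)/s',
        static function (array $m): string {
            $from = ord($m[1]);
            $to   = ord($m[2]);
            if ($from > $to) {
                return $m[0]; // assumption: leave descending ranges as-is
            }
            return implode('', array_map('chr', range($from, $to)));
        },
        $mask
    );

    return $expanded ?? $mask; // fall back to the raw mask on regex failure
}

// With the mask expanded by hand, strspn() behaves like a range-aware mask.
var_dump(strspn("42 apples", expandCharMask("0..9"))); // 2
```

The same workaround would apply to `strcspn()` and `strpbrk()`, the other two functions listed above as lacking range support.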