Newsgroups: php.internals
Date: Sat, 9 Jul 2022 09:55:48 +1000
From: mickmackusa@gmail.com (mickmackusa)
To: internals@lists.php.net
Subject: Character range syntax ".." for character masks

I've discovered that several native string functions accept a character mask as a parameter. I've laid out my observations at https://stackoverflow.com/q/72865138/2943403

In a nutshell, not all character masks support ranges via the "double dot" syntax. Or should I refer to ".." as the "string spread operator" to avoid a naming conflict with "..." -- the better-known "spread operator" (array spread operator)?

Rowan/@IMSoP informed me that the current division between the haves and the have-nots appears to be based on the source language from which PHP pulled each function. Essentially, if the function came from C, the double dot does not represent a range.
https://chat.stackoverflow.com/transcript/11?m=54864842#54864842

Character ranges are not yet supported for:
- strcspn()
- strpbrk()
- strspn()

Before I fire off an RFC, I would like to know:

1. Are there any reasonable objections to consistently implementing character range expressions for all character masks?
2. Are there any native functions that I did not mention in my Stack Overflow answer?
3. Is it true that only single-byte characters can be used in all scenarios? If so, must it remain that way?
4. Is there already an official or widely used term that I should be using for the two-dot operator?

I should also mention that I initially considered requesting that all character mask parameters be named $mask (instead of $separators, $token, or $characters), but I later resigned myself to the fact that switching to a name that describes the texture of the string would discard the more vital/intuitive description of the string's purpose.

I suppose the best that can be done to inform developers is to explicitly mention in the documentation when character range expressions are implemented and to demonstrate their usage in an example (not just as a user comment at the bottom; this isn't In-N-Out Burger -- put your offerings on the frickin' menu!).

mickmackusa
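
To illustrate the inconsistency described above, here is a minimal sketch comparing trim(), which accepts ".." ranges in its mask, with strspn() and strpbrk(), which read the same mask literally. The helper expand_mask() is hypothetical (it is not part of PHP) and only approximates range handling: it works on single bytes and leaves a reversed range such as "c..a" untouched instead of raising a warning.

```php
<?php
// trim() understands ".." in its mask: "a..c0..9" means a, b, c and 0 through 9.
$trimmed = trim("abcHello123", "a..c0..9");   // "Hello"

// strspn() reads the same mask literally, as the characters "a", "." and "c".
$literalSpan = strspn("abc123", "a..c");      // 1, because "b" is not in the mask

// Hypothetical userland helper (an assumption for illustration, not a PHP API):
// expand every single-byte "x..y" range into the full list of bytes from x to y.
function expand_mask(string $mask): string
{
    return preg_replace_callback(
        '/(.)\.\.(.)/s',
        static function (array $m): string {
            $from = ord($m[1]);
            $to = ord($m[2]);
            if ($from > $to) {
                return $m[0]; // reversed range: keep the text as written
            }
            return implode('', array_map('chr', range($from, $to)));
        },
        $mask
    );
}

// With the mask expanded by hand, the C-derived functions behave like a range.
$rangeSpan = strspn("abc123", expand_mask("a..c"));   // 3
$firstHit  = strpbrk("xyzb", expand_mask("a..c"));    // "b"
```

Without the helper, strpbrk("xyzb", "a..c") returns false, since none of "a", "." or "c" occurs in the subject string.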