Newsgroups: php.internals
Xref: news.php.net php.internals:118308
Date: Fri, 29 Jul 2022 10:58:35 +0200
Subject: Re: Character range syntax ".." for character masks
From: guilliam.xavier@gmail.com (Guilliam Xavier)
To: mickmackusa
Cc: "internals@lists.php.net"

On Fri, Jul 29, 2022 at 7:15 AM mickmackusa wrote:

> On Monday, July 25, 2022, Guilliam Xavier wrote:
>
>> On Sat, Jul 9, 2022 at 1:56 AM mickmackusa wrote:
>>
>>> I've discovered that several native string functions offer a character
>>> mask as a parameter.
>>>
>>> I've laid out my observations at
>>> https://stackoverflow.com/q/72865138/2943403
>>
>> Out of curiosity, why do you say that strtr() is "not a good candidate
>> because character order matters" (although you give a reasonable example)?
>> Maybe you have some counter-example?
>>
>> Regards,
>>
>> --
>> Guilliam Xavier
>
> I prefer to keep my scope very tight when posting on Stack Overflow.
>
> My focus was purely on enabling character range syntax for native
> functions with character mask parameters. My understanding of character
> masks in PHP requires single-byte characters and no meaning to character
> order.
>
> When strtr() is fed two strings, they cannot be considered "character
> masks" because the character orders matter.
>
> If extending character range syntax to parameters which are not character
> masks, I might support the feature for strtr(), but ensuring that the two
> strings are balanced will be made more difficult with ranged syntax.
> strtr() will silently condone imbalanced strings. https://3v4l.org/PY15F

Thanks for the clarifications. You're right that the internal `php_charmask` converts a character list (possibly containing one or more ranges) into a 256-char *mask*, thus "losing" any original order. So strtr() actually couldn't use the same implementation (even without ranges), and a counter-example is `strtr('adobe', 'abcde', 'ebcda')` (`strtr('adobe', 'a..e', 'e..a')` would trigger a Warning "Invalid '..'-range, '..'-range needs to be incrementing").

I had seen a parallel with the Unix `tr` command, which *does* support [incrementing] ranges. For example, both `echo adobe | tr abcde ABCDE` and `echo adobe | tr a-e A-E` give "ADoBE", while `echo adobe | tr abcde edcba` gives "eboda" but `echo adobe | tr a-e e-a` errors with "range-endpoints of 'e-a' are in reverse collating sequence order". Its implementation indeed doesn't use character masks, though (https://github.com/coreutils/coreutils/blob/master/src/tr.c): `echo abracadabra | tr a-f x` gives "xxrxxxxxxrx", not "xbrxcxdxbrx". It also supports more things, like POSIX character classes...
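To make the order point concrete, here is a minimal PHP sketch. It assumes plain single-byte strings. The `expand_ranges()` helper is hypothetical (it is not part of PHP and is not how `php_charmask` works); it only illustrates what an order-preserving range expansion for strtr() might look like. Unlike `php_charmask`, this sketch also accepts decrementing ranges.

```php
<?php
// A character mask is a set, so the order of the list is irrelevant:
// both calls strip the bytes a, b and c from each end.
$byRange = trim('abcxyzcba', 'a..c'); // "xyz"
$byList  = trim('abcxyzcba', 'cba');  // "xyz"

// strtr() with two strings pairs bytes by position, so order matters.
$swapEnds = strtr('adobe', 'abcde', 'ebcda'); // "edoba"
$reversed = strtr('adobe', 'abcde', 'edcba'); // "eboda"

// Unbalanced strings are silently truncated to the shorter length,
// so only a => x applies here.
$truncated = strtr('abracadabra', 'abcdef', 'x'); // "xbrxcxdxbrx"

// Hypothetical helper, NOT part of PHP: expands each "x..y" into the
// ordered run of bytes from x to y (ascending or descending).
function expand_ranges(string $list): string
{
    return preg_replace_callback(
        '/(.)\.\.(.)/s',
        static fn (array $m): string =>
            implode('', array_map('chr', range(ord($m[1]), ord($m[2])))),
        $list
    );
}

// Same as strtr('adobe', 'abcde', 'edcba').
$viaRanges = strtr('adobe', expand_ranges('a..e'), expand_ranges('e..a')); // "eboda"

echo implode("\n", [$byRange, $byList, $swapEnds, $reversed, $truncated, $viaRanges]), "\n";
```

With such an expansion the balance problem mentioned above remains: two ranges of different lengths would still be truncated silently by strtr().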
PS: I find the `strtr(string $string, array $replace_pairs)` form generally superior to the `strtr(string $string, string $from, string $to)` one anyway ;)

Regards,

-- 
Guilliam Xavier