Newsgroups: php.internals
Xref: news.php.net php.internals:116211
From: nikita.ppv@gmail.com (Nikita Popov)
Date: Mon, 4 Oct 2021 12:08:08 +0200
To: Tim Starling
Cc: PHP internals
Subject: Re: [PHP-DEV] [RFC] Locale-independent case conversion

On Thu, Sep 23, 2021 at 8:32 AM Tim Starling wrote:

> Please consider my RFC for locale-independent case conversion.
>
> https://wiki.php.net/rfc/strtolower-ascii
> https://github.com/php/php-src/pull/7506
>
> The RFC and associated PR ended up going some way beyond the original
> scope, because for consistency, it's best if everything has the same
> concept of case folding. I saw this as an opportunity to clean up a
> common kind of locale-dependence in PHP which was previously inconsistent.
>
> So not only will strtolower() and strtoupper() become
> locale-independent, converting only ASCII, but also stristr, stripos,
> strripos, lcfirst, ucfirst, ucwords, str_ireplace, the array sorting
> functions with SORT_FLAG_CASE, and array_change_key_case.
>
> Also, I changed a number of internal functions to use ASCII case
> folding, giving rise to a range of effects in callers throughout the
> core tree. The effects are all documented in the RFC.
>
> I am proposing that locale-sensitive case conversion be provided with
> the new names ctype_tolower() and ctype_toupper(). Those names might
> seem odd at first glance, but they are wrappers for functions in
> ctype.h and work in a very similar way to the rest of the ctype extension.
>

Hi Tim,

Thanks for creating this proposal, it looks great! I think this is a very
beneficial change, and the number of incorrect locale-dependent calls we
had in php-src alone further convinced me of this: we're generally aware
of the problem, and we still made this mistake. Many times.

The only open question I have is regarding the ctype_* functions. One might
argue that these functions should be locale-independent as well. Certainly,
whenever I have used ctype_digit() I only intended it to match [0-9]. It
seems like some people try to use ctype_alpha() in a locale-sensitive way
(https://stackoverflow.com/questions/19929965/php-setlocale-not-working-for-ctype-alpha-check)
and then fail because it doesn't support UTF-8.

Regards,
Nikita

PS: Regarding escapeshellarg(), are you aware of the array command support
for proc_open() that was added in PHP 7.4? That does away with the need to
escape arguments.
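
For illustration, here is a rough sketch of what the array form looks like.
The argument value is made up, and using PHP_BINARY as the child program is
only an assumption to keep the example self-contained (it presumes the
script runs under the CLI); any executable would do.

```php
<?php
// Hypothetical untrusted value containing shell metacharacters.
$arg = "it's; echo injected";

// Capture only the child's stdout.
$descriptors = [1 => ['pipe', 'w']];

// Array form (PHP >= 7.4): the program is started directly, without a
// shell, so each array element is passed as one argument and no
// escapeshellarg() call is needed.
// Assumption: PHP_BINARY points at a CLI php binary.
$command = [PHP_BINARY, '-r', 'echo $argv[1];', '--', $arg];

$proc = proc_open($command, $descriptors, $pipes);
if (!is_resource($proc)) {
    throw new RuntimeException('proc_open() failed');
}

// Read everything the child printed, then clean up.
$output = stream_get_contents($pipes[1]);
fclose($pipes[1]);
$exit = proc_close($proc);
```

With the string form of proc_open() the same value would have to go through
escapeshellarg() first, which is where the locale question comes in.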