Newsgroups: php.internals
Xref: news.php.net php.internals:116141
To: PHP internals
Date: Thu, 23 Sep 2021 16:31:48 +1000
Subject: [RFC] Locale-independent case conversion
From: tstarling@wikimedia.org (Tim Starling)

Please consider my RFC for locale-independent case conversion.

https://wiki.php.net/rfc/strtolower-ascii
https://github.com/php/php-src/pull/7506

The RFC and associated PR ended up going some way beyond the original scope, because for consistency, it's best if everything has the same concept of case folding. I saw this as an opportunity to clean up a common kind of locale-dependence in PHP which was previously inconsistent.

So not only will strtolower() and strtoupper() become locale-independent, converting only ASCII, but so will stristr(), stripos(), strripos(), lcfirst(), ucfirst(), ucwords(), str_ireplace(), the array sorting functions with SORT_FLAG_CASE, and array_change_key_case(). Also, I changed a number of internal functions to use ASCII case folding, giving rise to a range of effects in callers throughout the core tree. The effects are all documented in the RFC.

I am proposing that locale-sensitive case conversion be provided with the new names ctype_tolower() and ctype_toupper(). Those names might seem odd at first glance, but they are wrappers for functions in ctype.h and work in a very similar way to the rest of the ctype extension.

-- Tim Starling
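
To illustrate the distinction drawn above between ASCII case folding and the locale-sensitive functions in ctype.h, here is a minimal sketch in C. It is not taken from the PR; the helper names ascii_tolower_byte(), ascii_strtolower() and locale_strtolower() are hypothetical and chosen only for this example.

```c
#include <ctype.h>

/* Hypothetical helper: fold one byte to lower case, ASCII only.
 * Bytes outside 'A'..'Z' (including every byte >= 0x80) pass through
 * unchanged, whatever the current locale is. */
static unsigned char ascii_tolower_byte(unsigned char c)
{
    return (c >= 'A' && c <= 'Z') ? (unsigned char)(c + ('a' - 'A')) : c;
}

/* Hypothetical helper: lower-case a NUL-terminated buffer in place,
 * touching only ASCII letters. */
static void ascii_strtolower(char *s)
{
    for (; *s != '\0'; s++) {
        *s = (char)ascii_tolower_byte((unsigned char)*s);
    }
}

/* For contrast: the same loop built on tolower() from <ctype.h>.
 * Its result depends on the LC_CTYPE locale selected via setlocale(). */
static void locale_strtolower(char *s)
{
    for (; *s != '\0'; s++) {
        *s = (char)tolower((unsigned char)*s);
    }
}
```

In the default "C" locale both loops give the same result for ASCII input. They can diverge once setlocale() selects a locale whose single-byte encoding defines case mappings for bytes above 0x7F, which is the kind of locale-dependence the RFC sets out to remove from the functions listed above.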