Newsgroups: php.internals
Xref: news.php.net php.internals:116072
Date: Fri, 17 Sep 2021 15:23:56 +0200
From: nikita.ppv@gmail.com (Nikita Popov)
To: Tim Starling
Cc: Kamil Tekiela, "internals@lists.php.net"
Subject: Re: [PHP-DEV] Make strtolower/strtoupper just do ASCII

On Fri, Sep 17, 2021 at 12:07 PM Tim Starling wrote:

> On 17/9/21 7:15 pm, Kamil Tekiela wrote:
> > +1 from me. I wasn't even aware that these functions are
> > locale-dependent until recently. I see an added benefit that we could
> > add them to the optimizer once they are no longer locale-dependent.
> > What would happen to users who really need the locale-dependent
> > functions? Do we offer some workarounds?
>
> We could add a global mode, although that would prevent constant
> propagation, if that's what you mean by adding them to the optimizer.
> Or we could add variant functions like locale_strtolower() and
> locale_strtoupper(). But I think I would want to hear from someone who
> uses locale-dependence so I can understand what their needs are. I
> guess the RFC will sort that out.
>

I would expect that in nearly all cases the replacement would be one of
these:

1. If you were using a UTF-8 locale (which you likely are), just keep
   using strtolower(). Without having checked all the details here, I
   think strtolower() under UTF-8 locales already effectively behaves
   like ASCII lowercase, because it skips continuation bytes.

2. If you were using some other charset, then using mb_strtolower() with
   that charset should work. So if you were using de_DE.ISO8859-1, then
   using mb_strtolower() with "ISO8859-1" encoding would be the
   replacement.

As a matter of general policy, it is unlikely that we will accept an
option (whether that be an ini option or something else) to control this
behavior. We can make the change or not make it, but not both ;)

Regards,
Nikita
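
The two replacement cases above can be sketched in PHP as follows. This is only an illustration under stated assumptions: ascii_strtolower() is a hypothetical helper standing in for the proposed ASCII-only behavior (it is not a PHP built-in and not the actual implementation), the sample strings are made up, and the second case assumes the mbstring extension is loaded. The canonical mbstring name 'ISO-8859-1' is used for the encoding.

```php
<?php
// Hypothetical helper (not a PHP built-in): lowercases only the ASCII
// letters A-Z and leaves every other byte unchanged.
function ascii_strtolower(string $s): string
{
    return strtr($s, 'ABCDEFGHIJKLMNOPQRSTUVWXYZ', 'abcdefghijklmnopqrstuvwxyz');
}

// Case 1: UTF-8 data. The two bytes of "Ä" (0xC3 0x84) are not ASCII
// letters, so only "PFEL" is lowercased and the string stays valid UTF-8.
$utf8 = "\xC3\x84PFEL";               // "ÄPFEL" encoded as UTF-8
$lowerUtf8 = ascii_strtolower($utf8);

// Case 2: single-byte charset. Pass the charset explicitly to
// mb_strtolower() instead of relying on the process locale.
$latin1 = "\xC4PFEL";                 // "ÄPFEL" encoded as ISO-8859-1
$lowerLatin1 = mb_strtolower($latin1, 'ISO-8859-1');

// Print the raw bytes as hex so the result is terminal-independent.
echo bin2hex($lowerUtf8), "\n";
echo bin2hex($lowerLatin1), "\n";
```

In the first case the non-ASCII letter is left untouched, while in the second the explicit encoding lets mb_strtolower() map 0xC4 to 0xE4 without depending on setlocale().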