Newsgroups: php.internals
From: tstarling@wikimedia.org (Tim Starling)
To: Paul Crovella
Cc: Côme Chilliet, internals@lists.php.net
Subject: Re: [PHP-DEV] [VOTE] Locale-independent case conversion
Date: Fri, 26 Nov 2021 09:26:07 +1100
Message-ID: <3aff8da1-5180-face-0d85-48a396488163@wikimedia.org>
References: <757fcf17-4d8b-0eee-8226-e88705d92795@wikimedia.org> <5769524.lOV4Wx5bFT@come-prox15amd>

On 25/11/21 11:34 pm, Paul Crovella wrote:
> On Thu, Nov 25, 2021 at 3:14 AM Tim Starling wrote:
>> On 25/11/21 7:57 pm, Côme Chilliet wrote:
>>
>>> To reuse the example from the RFC, if I want to convert a UTF string
>>> to uppercase using Turkish rules and get dotted capital I, what
>>> should I use?
>> For case-insensitive comparison you can use Collator. But for display
>> you just have to do it yourself. For the Turkish Wikipedia and other
>> Turkic language websites we are currently using str_replace().
> Any particular reason not to use transliterators? https://3v4l.org/I038T

Thanks, I missed that.

You would need to do your own mapping from language code to
transliterator name, since it only has converters for az/tr, el, lt
and "Any", with no fallbacks. For example if you did
Transliterator::create("en-Upper")->transliterate('a') you would get a
fatal error.

Presumably if I submitted a PR adding wrappers for u_strToUpper()
etc., it would not be rejected on the basis that we already have
transliterators.

--
Tim Starling
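
The mapping from language code to transliterator name described in the message could look roughly like the sketch below. The helper names upperTransliteratorId() and toUpperForLanguage() are hypothetical, and the "<lang>-Upper" / "Any-Upper" IDs are assumed from the pattern used in the message ("en-Upper") and the languages it lists (az/tr, el, lt, "Any"). It assumes the intl extension is loaded, and it checks for a null return from Transliterator::create() rather than chaining the call.

```php
<?php
// Hypothetical helper: pick a transliterator ID for a language code,
// falling back to "Any-Upper" when no language-specific converter exists.
function upperTransliteratorId(string $languageCode): string
{
    // Languages the message names as having dedicated converters (assumed IDs).
    $special = ['az', 'tr', 'el', 'lt'];

    // Keep only the primary subtag, e.g. "tr-TR" or "tr_TR" becomes "tr".
    $primary = strtolower(preg_split('/[-_]/', $languageCode)[0]);

    return in_array($primary, $special, true) ? $primary . '-Upper' : 'Any-Upper';
}

// Hypothetical wrapper: uppercase $text using the converter chosen above.
// Requires the intl extension; the class is only resolved when this is called.
function toUpperForLanguage(string $text, string $languageCode): string
{
    $id = upperTransliteratorId($languageCode);

    // create() returns null for an unknown ID, so check before calling a method on it.
    $transliterator = Transliterator::create($id);
    if ($transliterator === null) {
        throw new RuntimeException("No transliterator available for ID: $id");
    }

    $result = $transliterator->transliterate($text);
    if ($result === false) {
        throw new RuntimeException("Transliteration failed for ID: $id");
    }

    return $result;
}
```

With this in place, toUpperForLanguage($title, 'tr') would select the Turkish converter, while toUpperForLanguage($title, 'en') would fall back to the generic one instead of failing.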