List-Id: internals.lists.php.net
References: <2a626b9f-292b-4fc2-a023-0b0db9a64ede@app.fastmail.com> <20AFBEA9-C3DA-4F6A-BD0A-112F86064316@php.net>
In-Reply-To: <20AFBEA9-C3DA-4F6A-BD0A-112F86064316@php.net>
Date: Sun, 1 Jun 2025 11:55:18 +0900
Subject: Re: [PHP-DEV] Adding in a case-insensitive version of str_contains
To: internals@lists.php.net, Derick Rethans
From: youkidearitai@gmail.com (youkidearitai)

On Sun, 1 Jun 2025 at 1:07, Derick Rethans wrote:
>
> On 31 May 2025 12:36:52 BST, youkidearitai wrote:
> >On Sat, 31 May 2025 at 19:41, Nikita Popov wrote:
> >>
> >> On Thu, May 29, 2025, at 23:00, Kamil Tekiela wrote:
> >>
> >> As I understand, it was a conscious decision not to add this function
> >> when str_contains was created. The reason is that case sensitivity is
> >> locale-dependent, and for such use cases, mbstring extension is better
> >> [1] & [2]. Do you think that locale is a concern here, and if not,
> >> why? Would it be a good idea to add mb_str_icontains instead?
> >>
> >> If you're going to propose an RFC for this, it would be a good idea to
> >> explain what the real life use case for it is. While str_contains is
> >> very useful for checking the existence of a byte-string within another
> >> byte-string, a case-sensitive check doesn't seem to have much use.
> >>
> >> [1]: https://stackoverflow.com/a/63121809/1839439
> >> [2]: https://wiki.php.net/rfc/str_contains#case-insensitivity_and_multibyte_strings
> >>
> >>
> >> To make it a bit more explicit: The proposed str_icontains function does not support UTF-8. It would only be case-insensitive on ASCII characters. Do we really want to add new functions that do not properly handle UTF-8?
> >>
> >> I think that thanks to https://wiki.php.net/rfc/strtolower-ascii (which removed C locale support from this family of functions), there actually is a pretty viable way forward to make the non-mbstring case-insensitive string functions useful again: Make them work on UTF-8. (In the sense of using Unicode case folding and case mapping on UTF-8, while still returning code unit offsets. This would make them superior to both the current stri* functions, and the mb_stri* functions.)
> >>
> >
> >I agree that it's important to think about it in UTF-8.
> >
> >I think about UTF-8 support case folding function in past few days.
> >Maybe... It is like below?
> >
> >```
> >grapheme_setlocale($locale);
> >grapheme_icontains($haystack, $needle);
> >```
> >
> >First, grapheme_* function supports locale.
> >Second, add grapheme_icontains function for case insensitive version
> >for str_contains.
>
> I don't think it's a good idea to rely on a global state containing the locale.
>
> cheers
> Derick

Hi, Derick (and Internals)

Thank you for your feedback.

In that case, I can think of two approaches.

First, add a $locale parameter to the grapheme_* functions.
For example, in grapheme_strpos:

```
grapheme_strpos(string $haystack, string $needle, int $offset = 0, string $locale): int|false
```

Second, hold the locale in an object instance (though I could not find a suitable existing object for this).

By the way, intl already has a Locale class:
https://www.php.net/manual/en/class.locale.php
However, it does not seem to be used much anymore.

Regards
Yuya

--
---------------------------
Yuya Hamada (tekimen)
- https://tekitoh-memdhoi.info
- https://github.com/youkidearitai
-----------------------------
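
To make the two options in the reply above a bit more concrete, here is a minimal userland sketch. Everything in it is hypothetical: sketch_fold, sketch_icontains and SketchMatcher are illustrative names, not part of intl or the grapheme_* family, and the Turkish/Azeri dotted and dotless "i" rule is only a simplified stand-in for real ICU locale handling. It assumes PHP 8.0 or later with mbstring loaded.

```php
<?php
declare(strict_types=1);

// Hypothetical helper (not an intl API): lowercases UTF-8 text, and applies
// the Turkish/Azeri "I" -> "ı", "İ" -> "i" rule when the locale asks for it.
// A real implementation would delegate case folding to ICU.
function sketch_fold(string $text, string $locale): string
{
    $lang = strtolower(substr($locale, 0, 2));
    if ($lang === 'tr' || $lang === 'az') {
        $text = strtr($text, ['I' => 'ı', 'İ' => 'i']);
    }
    return mb_strtolower($text, 'UTF-8');
}

// Option 1: the locale is an explicit parameter, so no global state is read.
function sketch_icontains(string $haystack, string $needle, string $locale = ''): bool
{
    return str_contains(
        sketch_fold($haystack, $locale),
        sketch_fold($needle, $locale)
    );
}

// Option 2: the locale lives in an object instance instead of a global.
final class SketchMatcher
{
    public function __construct(private string $locale)
    {
    }

    public function icontains(string $haystack, string $needle): bool
    {
        return sketch_icontains($haystack, $needle, $this->locale);
    }
}

// The same input matches or not depending on the locale that is passed in.
var_dump(sketch_icontains('ISPARTA', 'ısparta', 'tr_TR')); // bool(true)
var_dump(sketch_icontains('ISPARTA', 'ısparta', 'en_US')); // bool(false)
var_dump((new SketchMatcher('tr_TR'))->icontains('DIŞ', 'dış')); // bool(true)
```

In both variants the result depends only on the arguments and the object's own state, which is the property a global grapheme_setlocale() call would not give.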