Newsgroups: php.internals
Subject: Re: [PHP-DEV][Discussion] Should All String Functions Become Multi-Byte Safe?
From: mike@newclarity.net (Mike Schinkel)
Date: Fri, 16 Aug 2024 19:25:13 -0400
To: "Rowan Tommins [IMSoP]"
Cc: internals@lists.php.net
Message-ID: <37013714-371C-4871-9E15-FE91657A662B@newclarity.net>
References: <1AFE8300-D363-43D8-A989-15D001B9879C@newclarity.net> <270D6057-626D-4720-B44A-3CB7A7B9320B@newclarity.net>

> On Aug 16, 2024, at 3:36 PM, Rowan Tommins [IMSoP] wrote:
> On 16 August 2024 19:44:22 BST, Mike Schinkel wrote:
>> Let me see if I understand your argument correctly? You are asserting that Unicode is "too complex" to be handled in the standard library so that complexity should instead be shouldered individually by each and every PHP developer who needs to work with Unicode text in PHP, which is many PHP developers if not eventually most. Is that your argument?
>
> Not really, no. I'm definitely in favour of including more Unicode-based string handling functionality, by improving and extending ext/intl, or coming up with new convenience wrappers for common tasks.

Your prior reply came across to me as a closed-ended slamming of the door on the topic because of "what we can't do," which is why I commented on it.

This most recent reply, OTOH, was an example of exploring "what we can do" to improve PHP. So kudos for this reply; big plus.

> What I'm always sceptical of is the idea that you could ever consider such functionality "complete", or that "Unicode support" can ever be a single deliverable, rather than an ongoing aspiration. (And consequently, I'm sceptical of any language which says it has achieved that.)

To be clear, I don't think anything can ever be considered "complete" unless you are talking about something as limited in scope as numeric addition. And even then, the question of supporting imaginary numbers might arise.

So while there is no harm in being explicit about that, even if it is redundant, using it as an underlying reason to argue against useful functionality (if it is ever used that way) seems counter-productive. Nothing precludes a follow-up RFC after an initially successful RFC is implemented and shipped.

> I also think "Unicode support" is probably the wrong angle to approach from; it leads to features like IntlChar, which technically provides access to tons of data from the Unicode standard, but practically has no use for 99% of PHP developers. Instead we should be talking about "internationalisation support", of which handling different writing systems is one (fairly big) part.

I am not sure I agree with you that adding Unicode support is the wrong angle, per se.

A strong argument could be made that Unicode support is a necessary but not sufficient building block for "internationalization support." IOW, if you want to get to the latter, it is probably a lot easier to start with the former, as the scope of the latter is by nature larger. After all, perfect is the enemy of the good, and waiting for a full-press effort on internationalization support could well push Unicode support far down the road.

Still, *if* "full" internationalization support can be achieved in the shorter term, it would be bikeshedding for me to argue against it.

-Mike