Newsgroups: php.internals
Subject: Re: [PHP-DEV][Discussion] Should All String Functions Become Multi-Byte Safe?
From: tim@bastelstu.be (Tim Düsterhus)
To: Nick Lockheart, internals@lists.php.net
Date: Sun, 11 Aug 2024 18:18:15 +0200
In-Reply-To: <8a60a5d76bf3bbdda821160c6141b45914a33b98.camel@ageofdream.com>

Hi

On 8/11/24 17:50, Nick Lockheart wrote:
> It seems like there's still a lot of string functions that assume that
> a character is a single byte, and these may actually work as expected
> when dealing with Latin characters, but may fail unexpectedly if a
> sequence is more than one byte.

PHP's strings are byte-strings containing arbitrary sequences of bytes.
Unless you specifically select functions that interpret the byte-strings
as something else, you get a byte-string interpretation. There is nothing
unexpected about that.

> Are there any use cases for PHP where **single-byte** characters are
> the norm?

Dealing with binary formats.

> It seems that if everything on the Internet is multi-byte encoded now,
> then all of the PHP string functions should be multi-byte safe.

The premise is false. Everything on the Internet is byte-strings (also
called "octet-strings").

--------

You might be interested in https://externals.io/message/119149#119149.

Best regards
Tim Düsterhus
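
The byte-string point above can be made concrete with a minimal PHP sketch. The sample word and the length-prefixed frame layout are assumptions chosen purely for illustration, and `mb_strlen()` is only available when the mbstring extension is loaded.

```php
<?php
// "\u{00FC}" is U+00FC (ü), which UTF-8 encodes as two bytes: 0xC3 0xBC.
$word = "\u{00FC}ber";

// strlen() makes no assumption about encoding: it counts bytes.
$byteLength = strlen($word);             // 5

// mb_strlen() is an explicit opt-in to "interpret these bytes as UTF-8 text".
$charLength = mb_strlen($word, 'UTF-8'); // 4

// A binary format (hypothetical layout): 32-bit big-endian length prefix
// followed by the payload. Byte-oriented functions are exactly what is
// needed here, because the prefix must describe bytes, not characters.
$frame = pack('N', strlen($word)) . $word;

// Decoding the frame again with byte offsets.
$payloadLength = unpack('N', substr($frame, 0, 4))[1];
$decoded = substr($frame, 4, $payloadLength);

var_dump($byteLength, $charLength, strlen($frame), $decoded === $word);
```

If the length prefix were computed with `mb_strlen()` instead, the decoder would read one byte too few and cut the payload short, which is why byte semantics remain the sensible default for this kind of code.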