Newsgroups: php.internals
To: internals@lists.php.net
From: alec@alec.pl (Aleksander Machniak)
Subject: Re: [PHP-DEV] What should we do with utf8_encode and utf8_decode?
Date: Mon, 22 Mar 2021 16:04:35 +0100
Message-ID: <29d5329c-bea2-7944-4820-515d4a10ae86@alec.pl>
In-Reply-To: <693767b5-a25b-b4d9-f535-6b985bf26d67@gmail.com>

On 22.03.2021 15:30, Rowan Tommins wrote:
> - Make utf8_decode() throw errors for unrepresentable characters.

I'm not sure I understand this, but it sounds like it would be a BC
break for my case. I'm using utf8_encode()/utf8_decode() to make an
input string safe to store in the DB, and to restore it afterwards. In
most cases the input is UTF-8, but it may occasionally contain "broken
characters".

    $str = '';
    for ($x = 0; $x < 256; $x++) {
        $str .= chr($x);
    }
    $this->assertSame($str, utf8_decode(utf8_encode($str)));

    $str = "グーグル谷歌中信фδοκιμήóźdźрöß😁😃";
    $this->assertSame($str, utf8_decode(utf8_encode($str)));

Could anyone point to a sample input that will not work with my use case?

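To make the round trip in the two assertions concrete, here is a minimal sketch of the byte-level mapping involved. The helpers latin1ToUtf8() and utf8ToLatin1() are hypothetical names, not PHP functions. The first follows the documented behaviour of utf8_encode() (every input byte is read as an ISO-8859-1 code point). The second is only a simplified model of utf8_decode(), and its handling of malformed input is an assumption rather than a copy of the real function.

```php
<?php
// Hypothetical helper: models utf8_encode(). Each input byte is treated as
// an ISO-8859-1 code point (U+0000..U+00FF) and written out as UTF-8.
function latin1ToUtf8(string $bytes): string
{
    $out = '';
    for ($i = 0, $n = strlen($bytes); $i < $n; $i++) {
        $b = ord($bytes[$i]);
        $out .= $b < 0x80
            ? chr($b)                                          // ASCII: 1 byte
            : chr(0xC0 | ($b >> 6)) . chr(0x80 | ($b & 0x3F)); // else: 2 bytes
    }
    return $out;
}

// Hypothetical helper: simplified model of utf8_decode(). Code points up to
// U+00FF become one byte; everything else is replaced by '?'.
function utf8ToLatin1(string $utf8): string
{
    $out = '';
    $n = strlen($utf8);
    for ($i = 0; $i < $n; $i++) {
        $b = ord($utf8[$i]);
        if ($b < 0x80) {
            $out .= chr($b); // ASCII passes through unchanged
        } elseif (($b === 0xC2 || $b === 0xC3) && $i + 1 < $n
            && (ord($utf8[$i + 1]) & 0xC0) === 0x80) {
            // Two-byte sequence for U+0080..U+00FF fits in one byte
            $out .= chr((($b & 0x1F) << 6) | (ord($utf8[$i + 1]) & 0x3F));
            $i++;
        } else {
            // Unrepresentable or malformed: emit one '?' and skip any
            // continuation bytes (an assumption of this sketch)
            $out .= '?';
            while ($i + 1 < $n && (ord($utf8[$i + 1]) & 0xC0) === 0x80) {
                $i++;
            }
        }
    }
    return $out;
}

// Encode-then-decode: every possible byte value survives.
$bytes = '';
for ($x = 0; $x < 256; $x++) {
    $bytes .= chr($x);
}
var_dump(utf8ToLatin1(latin1ToUtf8($bytes)) === $bytes); // bool(true)

// Decoding text that was not produced by the encoder is where data is lost:
// U+017A (ź) has no single-byte equivalent.
var_dump(utf8ToLatin1("\u{17A}")); // string(1) "?"
```

Under this model the encoder only ever emits code points up to U+00FF, so decoding its output never meets an unrepresentable character. That suggests the quoted change would matter only when utf8_decode() is called on text that did not come from utf8_encode().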
-- 
Aleksander Machniak
Kolab Groupware Developer          [https://kolab.org]
Roundcube Webmail Developer    [https://roundcube.net]
----------------------------------------------------
PGP: 19359DC1 # Blog: https://kolabian.wordpress.com