From: rowan.collins@gmail.com (Rowan Tommins)
To: PHP Internals
Subject: Re: [PHP-DEV] What should we do with utf8_encode and utf8_decode?
Date: Mon, 22 Mar 2021 14:30:33 +0000
Message-ID: <693767b5-a25b-b4d9-f535-6b985bf26d67@gmail.com>
Newsgroups: php.internals

On 22/03/2021 13:10, Sara Golemon wrote:
> > * People who just want to replace calls to utf8_decode won't want to go
> >   through every call and make it exception safe.
>
> Then they shouldn't use these replacements, it's not for them. It's
> for people using iso-8859-1.

This is a non-sequitur.
Someone using the function correctly to convert to ISO 8859-1 may also
be relying on the documented and consistent error-handling behaviour.
Substituting the character may not always be the best approach, but in
some cases it's more useful than discarding the entire string, let alone
aborting the entire process with an unhandled Throwable.

> The goal is only to not punish users by taking away a valid API that
> they were using correctly (for those users who were using it correctly).

I'm sympathetic to that aim, but if the new function is not the same,
you *are* taking away the existing API, and introducing a new one.
Neither of the following seems like it would be accepted:

- Make utf8_decode() throw errors for unrepresentable characters.
- Introduce a function specifically for converting from UTF-8 to
  Latin-1, if we didn't already have one.

So it feels questionable to me to design a new function, which is
neither compatible with what we have, nor a reasonable addition on its
own merits.

Regards,

-- 
Rowan Tommins
[IMSoP]