Newsgroups: php.internals
To: PHP internals
References: <693767b5-a25b-b4d9-f535-6b985bf26d67@gmail.com> <29d5329c-bea2-7944-4820-515d4a10ae86@alec.pl> <16ecfc31-33aa-4223-fb67-b5a4b5895f05@gmail.com> <11e9a312-ed10-412e-506d-ccf9f24457f8@alec.pl>
Message-ID: <7a6196e9-5b9f-5a82-d14b-4a6f933243e4@gmail.com>
Date: Mon, 22 Mar 2021 19:43:12 +0000
Subject: Re: [PHP-DEV] What should we do with utf8_encode and utf8_decode?
From: rowan.collins@gmail.com (Rowan Tommins)

On 22/03/2021 18:18, Chase Peeler wrote:
>
> Even if it is by accident, removing or changing the behavior of the
> function is guaranteed to make something that currently works (by
> skill or by luck) and risk it no longer working.

This is absolutely true. However, at some point you have to draw the line between supported use cases, and requests to re-enable spacebar heating: https://xkcd.com/1172/

I think using utf8_encode to store binary data in a text column crosses that line: the code was added because of a misunderstanding of the function, it works by accident, and there are plenty of better ways to solve the actual problem.

Just to be clear, the trick Aleksander and Alexandru stumbled on doesn't just work for "corrupted UTF-8"; you could store a JPEG in a text column by using utf8_encode(file_get_contents($image_file)). It's probably best not to, though.

I *also* agree that users should have a clear guide to how to replace their current usages. Fortunately, there are at least 4 other ways of writing this functionality in PHP (iconv, mbstring, intl, and the Symfony polyfill).

Regards,

--
Rowan Tommins
[IMSoP]
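
As a rough illustration of the replacements named in the message above, here is a minimal sketch of the same ISO-8859-1 to UTF-8 conversion written with iconv, mbstring and intl. The helper name and the sample bytes are hypothetical, each call is guarded because the extension may not be loaded, and the Symfony polyfill is not shown. The second half sketches why the "binary in a text column" trick appears to work: every byte value is a valid ISO-8859-1 character, so the conversion is reversible for arbitrary input.

```php
<?php
// Sketch only: the helper name and the sample bytes are hypothetical.
// utf8_encode() treats its input as ISO-8859-1 and emits UTF-8; the three
// calls below perform the same conversion with the source encoding spelled out.

function latin1_to_utf8_variants(string $bytes): array
{
    $out = [];
    if (function_exists('iconv')) {
        // iconv takes (from, to, string).
        $out['iconv'] = iconv('ISO-8859-1', 'UTF-8', $bytes);
    }
    if (function_exists('mb_convert_encoding')) {
        // mbstring takes (string, to, from).
        $out['mbstring'] = mb_convert_encoding($bytes, 'UTF-8', 'ISO-8859-1');
    }
    if (class_exists('UConverter')) {
        // intl takes (string, to, from).
        $out['intl'] = UConverter::transcode($bytes, 'UTF-8', 'ISO-8859-1');
    }
    return $out;
}

// "café" with é stored as the single ISO-8859-1 byte 0xE9.
$variants = latin1_to_utf8_variants("caf\xE9");
foreach ($variants as $name => $utf8) {
    echo $name, ': ', bin2hex($utf8), PHP_EOL; // 636166c3a9 for each variant
}

// Arbitrary bytes, several of which are not valid UTF-8 on their own.
$binary = "\x00\xFF\xD8\x80\xE9";
$roundTripOk = null;
if (function_exists('mb_convert_encoding')) {
    $asText = mb_convert_encoding($binary, 'UTF-8', 'ISO-8859-1');
    $roundTripOk = mb_convert_encoding($asText, 'ISO-8859-1', 'UTF-8') === $binary;
    var_dump($roundTripOk); // bool(true)
}
```

Note that the argument order differs between iconv (from, to, string) and the other two (string, to, from), which is easy to get wrong when migrating existing utf8_encode calls.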