Newsgroups: php.internals
Message-ID: <64095373-f73b-0231-dbd2-3b3271ab0e96@gmail.com>
Date: Mon, 21 Feb 2022 09:09:12 +0000
To: PHP Internals
Subject: Re: [PHP-DEV] [RFC] Deprecate and Remove utf8_encode and utf8_decode
From: rowan.collins@gmail.com (Rowan Tommins)

On 20/02/2022 23:54, Craig Francis wrote:
> I'm just wondering, and this would not be necessary...
> considering how most systems need to deal with UTF-8 data today,
> could an argument be made for enabling ext/mbstring by default?
>
> I'm fairly sure Ubuntu and CentOS need to install the package
> `php-mbstring` separately; whereas in my limited experience with
> cheap/shared hosting, they tend to have it enabled.

Unfortunately, enabling it by default in the distributed source files won't make any difference to that situation, as anything that can be built as a separate library file can (and seemingly will) be split into a separate package in a binary distribution.

Making the extension always available (impossible to compile without it) is a potential option, and I think it has been suggested before; I'm not sure of the exact pros and cons.

> everyone could trust functions like `mb_strlen()` are available as well.

I would personally encourage everyone to have ext/intl installed and use grapheme_strlen() instead of mb_strlen(), because knowing whether a particular instance of the string "Nguyễn" is written with 6, 7, or 8 code points is not nearly as useful as knowing that it looks like 6 "characters" to a user either way.

Regards,

-- 
Rowan Tommins
[IMSoP]
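
To make the "Nguyễn" point concrete, here is a minimal sketch comparing the two functions on three encodings of the same visible text. The `$forms` array, its labels, and the `count_both()` helper are made up for illustration; the `function_exists()` guards are there because, as discussed above, neither ext/mbstring nor ext/intl can be assumed to be installed.

```php
<?php
// Three different code point sequences that all render as "Nguyễn".
$forms = [
    'precomposed'      => "Nguy\u{1EC5}n",          // ễ as a single code point (U+1EC5)
    'partly composed'  => "Nguy\u{00EA}\u{0303}n",  // ê (U+00EA) + combining tilde (U+0303)
    'fully decomposed' => "Nguye\u{0302}\u{0303}n", // e + combining circumflex + combining tilde
];

// Hypothetical helper: returns both counts, or null where the extension is missing.
function count_both(string $s): array
{
    return [
        'code_points' => function_exists('mb_strlen') ? mb_strlen($s, 'UTF-8') : null,
        'graphemes'   => function_exists('grapheme_strlen') ? grapheme_strlen($s) : null,
    ];
}

// array_map() with a single array preserves the string keys.
$counts = array_map('count_both', $forms);

foreach ($counts as $label => $c) {
    printf(
        "%-16s code points: %s, graphemes: %s\n",
        $label,
        $c['code_points'] ?? 'n/a',
        $c['graphemes'] ?? 'n/a'
    );
}
```

With both extensions loaded, this should report 6, 7, and 8 code points respectively, but 6 graphemes for every form.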