Newsgroups: php.internals
Xref: news.php.net php.internals:128639
Message-ID: <5c882e40-fa5e-4983-b35e-df1e61c4a2b9@rwec.co.uk>
Date: Thu, 4 Sep 2025 20:50:47 +0100
Subject: Re: [PHP-DEV] mbstring/iconv encoding deprecation
To: internals@lists.php.net
From: imsop.php@rwec.co.uk ("Rowan Tommins [IMSoP]")

On 04/09/2025 07:44, Jorg Sowa wrote:
> + enigmatic "all functions that take encoding option use
> php.internal_encoding as default (e.g.
> htmlentities/mb_strlen/mb_regex/etc)"

This is not very well worded, but I believe what it's saying is that a
number of functions have an "encoding" parameter, which defaults to some
INI setting. Since some of those settings were proposed to be removed,
and others added, they needed a new default.

However, further up it seems to propose a different default:

> Use default_charset as default for encoding related php.ini settings
> and module/functions.

What's more, there isn't actually a setting called
"php.internal_encoding", it's just called "internal_encoding".

Regardless of what was intended 11 years ago, we need to decide what to
do now.

htmlentities is currently documented like this
[https://www.php.net/htmlentities]:

> An optional argument defining the encoding used when converting
> characters.
>
> If omitted, encoding defaults to the value of the default_charset
> configuration option.

So, I guess that doesn't need to change, because that setting isn't
deprecated?

The standard wording for all the ext/mbstring functions is currently
this [e.g. https://www.php.net/manual/en/function.mb-convert-case.php]:

> The encoding parameter is the character encoding. If it is omitted or
> null, the internal character encoding value will be used.

This is rather vague. In practice, it takes the value of
"mbstring.internal_encoding"; if not set, "internal_encoding"; if not
set, "default_charset" - but I can't find anywhere in the manual stating
this directly.

> Without INI values following functions are becoming pointless:
> - iconv_set_encoding
> - iconv_get_encoding
> - mb_internal_encoding
> - mb_http_output

Neither of the RFCs explicitly mentions any of these functions, and none
has any deprecation note in the manual.

iconv_set_encoding triggers a deprecation notice at run-time, but
iconv_get_encoding does not.

mb_internal_encoding does not trigger any deprecation notice even when
it's used to set a new value. Until 5 years ago, it was also heavily
implied to be the correct function to use - until they were rebuilt from
stub files, the synopses for mbstring functions in the manual looked
like this:

> mb_convert_case ( string $str , int $mode [, string $encoding =
> mb_internal_encoding() ] ) : string

I suspect mb_internal_encoding is rather widely used to set a *run-time*
default, unrelated to the INI settings. If we want to remove it, let's
go all the way and make the encoding parameter mandatory in some future
version, rather than defaulting to a different global value.

-- 
Rowan Tommins
[IMSoP]
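
As a rough illustration of the fallback order described above for the
mbstring "encoding" default, here is a minimal sketch in PHP. It is only
a model, not how ext/mbstring is implemented: the function name
resolve_mb_default_encoding() is hypothetical, passing the INI values in
as an array is an assumption made to keep the example self-contained,
and treating an empty string as "not set" is likewise an assumption.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper modelling the lookup order described in the message:
 * "mbstring.internal_encoding", then "internal_encoding", then
 * "default_charset". $ini maps setting names to their configured values.
 * Assumption: a missing key or an empty string means "not set".
 */
function resolve_mb_default_encoding(array $ini): ?string
{
    $order = ['mbstring.internal_encoding', 'internal_encoding', 'default_charset'];

    foreach ($order as $name) {
        if (isset($ini[$name]) && $ini[$name] !== '') {
            // First configured setting in the chain wins.
            return $ini[$name];
        }
    }

    // Nothing configured at any level of the chain.
    return null;
}

// Only default_charset is set, so it is the value used.
$onlyCharset = resolve_mb_default_encoding(['default_charset' => 'UTF-8']);

// The most specific setting takes precedence over the other two.
$allSet = resolve_mb_default_encoding([
    'mbstring.internal_encoding' => 'SJIS',
    'internal_encoding'          => 'ISO-8859-1',
    'default_charset'            => 'UTF-8',
]);
```

Passing the encoding explicitly at each call site, e.g.
mb_strlen($s, 'UTF-8'), sidesteps this lookup entirely, which is the
direction the last paragraph of the message argues for.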