Hello everyone,
I'm writing regarding the deprecation of the mbstring/iconv encoding INI
settings that was voted on more than 10 years ago.
RFC: https://wiki.php.net/rfc/default_encoding
It was accepted that the following INI settings would be deprecated:
"iconv.input_encoding (Default: php.input_encoding)
iconv.internal_encoding (Default: php.internal_encoding)
iconv.output_encoding (Default: php.output_encoding)
mbstring.http_input (Default: php.input_encoding)
mbstring.internal_encoding (Default: php.internal_encoding)
mbstring.http_output"
- enigmatic "all functions that take encoding option use
php.internal_encoding as default (e.g. htmlentities/mb_strlen/mb_regex/etc)"
Without the INI values, the following functions become pointless:
- iconv_set_encoding
- iconv_get_encoding
- mb_internal_encoding
- mb_http_output
The function mb_http_input would likely require a separate RFC.
PRs[1] deprecating the INI settings are already waiting for the next major
PHP release, but the problem is that those values never went through a
proper deprecation phase: there has been none since 5.6. So the decision to
remove those INI settings may be controversial for now.
My suggestion is to properly deprecate all the INI settings, plus the
functions using them, before 9.0. The question is: does this require an
additional RFC, or is there already consensus, given that the formal
deprecation was agreed upon 10 years ago, albeit without specified rules for
how it would be achieved?
Kind regards,
Jorg
Hi!
All of these were vote-accepted for removal in PHP 7:
https://wiki.php.net/rfc/remove_deprecated_functionality_in_php7#iconv_and_mbstring_encoding_ini_directives
But there was an unforeseen issue that needed to be resolved before that
could happen, and it apparently never got fixed. See this thread:
https://externals.io/message/80100#80718
Cheers,
Andrey.
This was resolved in PHP 7.4.
So yes, it did get fixed. I remember the discussion I had with Nikita when I submitted PRs to remove them in 8.0: the concern was that one minor version might not have been a long enough deprecation period, given the confusion around those INI settings.
As it has now been over 5 years, it is, IMHO, safe to remove them in the next version.
Best regards,
Gina P. Banyard
- enigmatic "all functions that take encoding option use
php.internal_encoding as default (e.g.
htmlentities/mb_strlen/mb_regex/etc)"
This is not very well worded, but I believe what it's saying is that a
number of functions have an "encoding" parameter, which defaults to some
INI setting. Since some of those settings were proposed to be removed,
and others added, they needed a new default.
However, further up it seems to propose a different default:
Use default_charset as default for encoding related php.ini settings
and module/functions.
What's more, there isn't actually a setting called
"php.internal_encoding", it's just called "internal_encoding".
Regardless of what was intended 11 years ago, we need to decide what to
do now.
htmlentities is currently documented like this
[https://www.php.net/htmlentities]:
An optional argument defining the encoding used when converting
characters. If omitted, encoding defaults to the value of the default_charset
configuration option.
So, I guess that doesn't need to change, because that setting isn't
deprecated?
The standard wording for all the ext/mbstring functions is currently
this [e.g. https://www.php.net/manual/en/function.mb-convert-case.php]:
The encoding parameter is the character encoding. If it is omitted or
null, the internal character encoding value will be used.
This is rather vague. In practice, it takes the value of
"mbstring.internal_encoding"; if not set, "internal_encoding"; if not
set, "default_charset" - but I can't find anywhere in the manual stating
this directly.
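The lookup order described above can be sketched in a few lines. This is only a model: resolve_mbstring_default() is a hypothetical helper, not part of PHP, and it reads from a plain array rather than real INI state. The final 'UTF-8' fallback assumes the stock default_charset value.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper (not part of PHP): models the lookup order
 * described above, using a plain array in place of real INI state.
 */
function resolve_mbstring_default(array $ini): string
{
    $order = ['mbstring.internal_encoding', 'internal_encoding', 'default_charset'];
    foreach ($order as $key) {
        // An unset or empty setting falls through to the next one.
        if (isset($ini[$key]) && $ini[$key] !== '') {
            return $ini[$key];
        }
    }
    // Assumption: nothing is configured, so use the stock default_charset value.
    return 'UTF-8';
}

// mbstring.internal_encoding is unset, so internal_encoding wins over default_charset.
echo resolve_mbstring_default([
    'internal_encoding' => 'ISO-8859-1',
    'default_charset'   => 'UTF-8',
]), PHP_EOL;
```

With the deprecated mbstring.internal_encoding setting gone, the chain would presumably shrink to the last two entries.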
Without the INI values, the following functions become pointless:
- iconv_set_encoding
- iconv_get_encoding
- mb_internal_encoding
- mb_http_output
Neither of the RFCs explicitly mentions any of these functions, and none
has any deprecation note in the manual.
iconv_set_encoding triggers a deprecation notice at run-time, but
iconv_get_encoding does not.
mb_internal_encoding does not trigger any deprecation notice even when
it's used to set a new value. Until 5 years ago, it was also heavily
implied to be the correct function to use - until they were rebuilt from
stub files, the synopses for mbstring functions in the manual looked
like this:
mb_convert_case ( string $str , int $mode [, string $encoding = mb_internal_encoding() ] ) : string
I suspect mb_internal_encoding is rather widely used to set a run-time
default, unrelated to the INI settings. If we want to remove it, let's
go all the way and make the encoding parameter mandatory in some future
version, rather than defaulting to a different global value.
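To make the two styles concrete, here is a minimal sketch that assumes ext/mbstring is loaded. The sample string and variable names are illustrative only.

```php
<?php
declare(strict_types=1);

// "h\u{e9}llo" is "héllo" encoded as UTF-8: 5 characters, 6 bytes.
$text = "h\u{e9}llo";

// Style 1: set a process-wide default at run time, then omit the argument.
mb_internal_encoding('UTF-8');
$implicit = mb_strlen($text);

// Style 2: pass the encoding explicitly, with no reliance on global state.
$explicit = mb_strlen($text, 'UTF-8');

// The same bytes read as a single-byte encoding give a different count,
// which is why a hidden global default can change results silently.
$asLatin1 = mb_strlen($text, 'ISO-8859-1');

var_dump($implicit, $explicit, $asLatin1);
```

Code written in the second style would be unaffected by either removing mb_internal_encoding or making the parameter mandatory.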
--
Rowan Tommins
[IMSoP]
Hello internals,
After discussions with RMs the agreement is that the PR [1] can be merged if there is no objection within 1 week.
I'll recapitulate some of the history behind that RFC and delay in implementation.
The original RFC [2] was approved for PHP 5.6 and slated for removal in PHP 7.
(The original text says PHP 6, but this is because the choice to skip version 6 wasn't made yet.)
However, it didn't get removed in PHP 7 due to a technical TODO which wasn't done in time.
See the following discussion: https://externals.io/message/103087.
The various technical issues were fixed in PHP 7.4.
However, when I submitted a PR to remove those INI settings for PHP 8.0, the short time span, the confusion around those settings and their behaviour, and time constraints meant that it got punted to PHP 9.
(A few Room 11 discussions from the time can still be found. [4][5])
Those INI settings have never been "undeprecated", and the documentation confirms that they are slated for removal.
Specifically, the manual pages for the iconv [6] and mbstring [7] INI settings state the following:
This feature has been DEPRECATED as of PHP 5.6.0. Relying on this feature is highly discouraged.
and
This deprecated feature will certainly be removed in the future.
respectively.
Moreover, multiple discussions clearly point out how nonsensical it would be to keep the functions while removing the INI settings, because what would the functions then do?
Do nothing? Affect the global INI settings?
As such, deprecating those functions can be seen as a natural follow-up, and something that was simply forgotten in the RFC body.
This seems to be confirmed by an email that Yasuo (the original author of the RFC) sent to the list in January 2015. [8]
Therefore, I see no reason to delay a deprecation of those functions any further, especially considering that the existence of a version 8.6 has not yet been decided.
Best regards,
Gina P. Banyard
[1] https://github.com/php/php-src/pull/19664
[2] https://wiki.php.net/rfc/default_encoding
[3] https://github.com/php/php-src/pull/533
[4] https://chat.stackoverflow.com/transcript/11?m=49250309#49250309
[5] https://chat.stackoverflow.com/transcript/11?m=50205227#50205227
[6] https://www.php.net/manual/en/iconv.configuration.php
[7] https://www.php.net/manual/en/mbstring.configuration.php
[8] https://externals.io/message/81099#81229