Hi Internals,
I have opened a Pull Request to add \f (form feed) to the list of characters stripped by default in trim(), ltrim(), and rtrim().
Currently, trim() strips the following characters by default: \n, \r, \t, \v, \0, and space. The form feed character \f is notably missing, despite being widely recognized as a whitespace character (for example, in Python and Rust).
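For illustration, here is a minimal sketch of how \f can already be stripped today by passing an explicit character list as the second argument. The sample input and variable names are made up for this example. The list is simply the defaults named above plus \f, and the result does not depend on what trim() strips by default.

```php
<?php
// Hypothetical sample input: form feeds and spaces around a word.
$input = "\f  hello \f";

// The default characters listed above, plus \f.
// Double quotes are required so that PHP interprets the escape sequences.
$withFormFeed = " \n\r\t\v\0\f";

// ltrim() and rtrim() accept the same second argument.
$trimmed = trim($input, $withFormFeed);

var_dump($trimmed); // string(5) "hello"
```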
Although I think this change aligns trim() with standard whitespace definitions, it is technically a backward compatibility break. I am writing to check if there are any strong objections to this change or if it requires further discussion.
References:
Issue: https://github.com/php/php-src/issues/20783
Pull Request: https://github.com/php/php-src/pull/20788
Thanks,
Weilin Du
Notably, mb_trim(), introduced in PHP 8.4, includes \f in the list of
characters it trims.
https://www.php.net/mb_trim
I’m not opposed to this change, but the BC break could lead to really
difficult and tricky bugs in applications that rely on the form feed to be
retained. I can’t think of any use cases that would expect form feed to
remain after trimming, but there might be some.
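If such a use case does exist, one possible mitigation is to pin the character list explicitly instead of relying on the default. The sketch below assumes the default list is exactly the one quoted in this thread. The helper name trim_keep_form_feed() and the sample string are hypothetical, not part of PHP.

```php
<?php
// Hypothetical helper, not part of PHP: trims the characters named in this
// thread as the current defaults, and deliberately leaves \f alone.
function trim_keep_form_feed(string $s): string
{
    // Spelling the list out means the result does not depend on
    // whatever trim() strips by default.
    return trim($s, " \n\r\t\v\0");
}

// Made-up input: a form feed used as a page separator, then a newline.
$page = "\fPage 2\n";

var_dump(trim_keep_form_feed($page) === "\fPage 2"); // bool(true)
```

Code written this way should behave the same whether or not the default list changes.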
On the other hand, many users might consider it a bug that form feed isn’t
trimmed, so this might be argued as a bug fix, especially since other
languages consider form feed as a whitespace character to remove in similar
situations.
I’m for this change, but on the fence about the BC break.
Cheers,
Ben
Hi Internals,
I have opened a Pull Request to add \f (form feed) to the list of
characters stripped by default in trim(), ltrim(), and rtrim().
Currently, the default behavior of trim() strips the following
characters: \n, \r, \t, \v, \0, and space. The form feed character
\f is notably missing, despite being widely recognized as a whitespace
character (in python, rust...).
Also in PHP, where ctype_space() recognises \f as whitespace as well as
all of those that trim() removes. (As the name implies, of course, this
is because \f is recognised as whitespace in C.)
Uh, except of course for NUL, which trim() removes but ctype_space() does not treat as whitespace.
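A quick way to check this is to run ctype_space() over each character mentioned in the thread. This is only a sketch: it assumes the ctype extension is available and the default "C" locale is in effect, and the array layout is just for readability.

```php
<?php
// Characters named in this thread, keyed by a readable label.
// Single-quoted keys are literal text; double-quoted values are the real bytes.
$chars = [
    'space' => " ",
    '\n'    => "\n",
    '\r'    => "\r",
    '\t'    => "\t",
    '\v'    => "\v",
    '\f'    => "\f",
    '\0'    => "\0",
];

$result = [];
foreach ($chars as $label => $char) {
    // ctype_space() returns true only if every character is whitespace.
    $result[$label] = ctype_space($char);
}

// Every entry is true except '\0', which is false.
var_dump($result);
```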