Greetings internal,
While reviving the "Permit trailing whitespace in numeric strings" RFC [1]
I didn't see Nikita's comment on Andrea's original PR which commented on
the fact that we should get rid of the "leading-numeric string" concept.
Therefore, mostly due to time constraints, I've merge the trailing
whitespace RFC with the removal of this concept in the following RFC:
https://wiki.php.net/rfc/saner-numeric-strings
The main gist is to convert all E_NOTICEs “A non well formed numeric value
encountered” to the usual E_WARNING
“A non-numeric value encountered” and
allow trailing whitespaces, which would be most reasonables cases where the
E_NOTICE
has been emitted.
This RFC is heavily based on Andrea's original RFC [1] and I would like to
thank her once more for the preliminary work she's done that I could reuse.
Best regards
George P. Banyard
Le 29 juin 2020 à 12:30, G. P. B. george.banyard@gmail.com a écrit :
Greetings internal,
While reviving the "Permit trailing whitespace in numeric strings" RFC [1]
I didn't see Nikita's comment on Andrea's original PR which commented on
the fact that we should get rid of the "leading-numeric string" concept.Therefore, mostly due to time constraints, I've merge the trailing
whitespace RFC with the removal of this concept in the following RFC:
https://wiki.php.net/rfc/saner-numeric-stringsThe main gist is to convert all E_NOTICEs “A non well formed numeric value
encountered” to the usualE_WARNING
“A non-numeric value encountered” and
allow trailing whitespaces, which would be most reasonables cases where the
E_NOTICE
has been emitted.This RFC is heavily based on Andrea's original RFC [1] and I would like to
thank her once more for the preliminary work she's done that I could reuse.Best regards
George P. Banyard
Hi,
It is not clear for me, by reading the RFC, whether or not the behaviour of either implicit or explicit conversion of so-called leading numeric string will be preserved, beyond some Notices that will be converted to Warnings in the implicit case. For example, does (int) "2px" still produce 2, even when the string is now considered as non-numeric?
I think it is important to make sure that the current results of casting to int/float will not change. For example, I can imagine some script that reads a CSS value like "2px" and does some calculation on it, converting it to a number on the process. A Warning instead of a Notice will prompt the programmer to add explicit casts, which is probably a good thing. But changing the casted value to 0 would lead to bugs that are (1) harder to detect especially if the program already uses explicit casts, and (2) harder to correct.
—Claude
Le 29 juin 2020 à 12:30, G. P. B. george.banyard@gmail.com a écrit :
Greetings internal,
While reviving the "Permit trailing whitespace in numeric strings" RFC
[1]
I didn't see Nikita's comment on Andrea's original PR which commented on
the fact that we should get rid of the "leading-numeric string" concept.Therefore, mostly due to time constraints, I've merge the trailing
whitespace RFC with the removal of this concept in the following RFC:
https://wiki.php.net/rfc/saner-numeric-stringsThe main gist is to convert all E_NOTICEs “A non well formed numeric
value
encountered” to the usualE_WARNING
“A non-numeric value encountered” and
allow trailing whitespaces, which would be most reasonables cases where
the
E_NOTICE
has been emitted.This RFC is heavily based on Andrea's original RFC [1] and I would like
to
thank her once more for the preliminary work she's done that I could
reuse.Best regards
George P. Banyard
Hi,
It is not clear for me, by reading the RFC, whether or not the behaviour
of either implicit or explicit conversion of so-called leading numeric
string will be preserved, beyond some Notices that will be converted to
Warnings in the implicit case. For example, does (int) "2px" still produce
2, even when the string is now considered as non-numeric?I think it is important to make sure that the current results of casting
to int/float will not change. For example, I can imagine some script that
reads a CSS value like "2px" and does some calculation on it, converting it
to a number on the process. A Warning instead of a Notice will prompt the
programmer to add explicit casts, which is probably a good thing. But
changing the casted value to 0 would lead to bugs that are (1) harder to
detect especially if the program already uses explicit casts, and (2)
harder to correct.—Claude
The behaviour of explicit casts (or any other conversion which accepts
errors in the string,
which corresponds to the 1 mode of allow_errors in the relevant C
is_numeric_string function)
will be preserved, so this should still produce 2.
I will add test cases and clarify this in the RFC.
Best regards
George P. Banyard
I've added a test case just to ensure that casting of leading numeric
strings yields
the previous behaviour and have added this to the BC Break section of the
RFC:
https://wiki.php.net/rfc/saner-numeric-strings
Any other feedback or questions are welcomed.
George P. Banyard
Hello again,
While reviewing the implementation Nikita spotted a peculiar behaviour that
PHP currently has in regards to string offsets.
The offset is validated using the is_numeric_string but integer
representation of
it is computed using an explicit int cast, this is what leads to the
E_NOTICE
in
a roundabout way.
Therefore, a string offset access such as $str['2str'] or $str['2.5'] would
still
evaluate to 2 under the current implementation but would generate an
E_WARNING
"Illegal string offset".
The question becomes should such string offsets evaluate to 0 or not.
The answer isn't that straight forward as a float index like 14.5 would also
emit the same warning but evaluate to 14.
I see 3 options:
- Keep the current behaviour which emit the
E_WARNING
"Illegal string
offset" and use the explicit int cast for both types of offsets ( '2str'
and '2.5'), which would bring the behaviour of leading integer numeric
strings in line with the one for float numeric strings. - Emit the
E_WARNING
"Illegal string offset" and evaluate to 0 for
offsets like '2str', and emit theE_WARNING
"String offset cast
occurred" for float numeric strings like '2.5' - Emit the
E_WARNING
"Illegal string offset" and evaluate to 0 for
non-numeric integer strings.
As this is relatively tricky and all of the options have some change in
behaviour compared to PHP 7 feedback would be very much appreciated.
I'll amend the RFC to detail this behaviour.
Best regards
George P. Banyard
Hi George,
- Emit the
E_WARNING
"Illegal string offset" and evaluate to 0 for
offsets like '2str', and emit theE_WARNING
"String offset cast
occurred" for float numeric strings like '2.5'
I'm not clear how this option fits with the rest of the RFC; there's no
explicit mention of evaluating values to 0 which would previously have been
evaluated differently. That would fall into the most significant category
of BC breaks: working code in version X still works in version Y, but
produces different results.
At the moment, the "Backward Incompatible Changes" section just says:
code with liberal use of leading-numerical strings will need to be
updated
It would be good to expand that with examples of what such code would look
like, how it would behave before and after, and what updates the user would
need to make.
Regards,
Rowan Tommins
[IMSoP]
Hi George,
- Emit the
E_WARNING
"Illegal string offset" and evaluate to 0 for
offsets like '2str', and emit theE_WARNING
"String offset cast
occurred" for float numeric strings like '2.5'I'm not clear how this option fits with the rest of the RFC; there's no
explicit mention of evaluating values to 0 which would previously have been
evaluated differently. That would fall into the most significant category
of BC breaks: working code in version X still works in version Y, but
produces different results.At the moment, the "Backward Incompatible Changes" section just says:
code with liberal use of leading-numerical strings will need to be
updatedIt would be good to expand that with examples of what such code would look
like, how it would behave before and after, and what updates the user would
need to make.Regards,
Rowan Tommins
[IMSoP]
I've had another thought about while I was implementing this change.
I've updated the RFC to reflect what I think is a more reasonable approach:
For string offsets accessed using numeric strings the following changes
will be made:
- numeric strings which correspond to well formed floats will emit the
more usual “String offset cast occurred” warning instead of the “Illegal
string offset” one.- leading numeric strings will emit the “Illegal string offset”
instead of the “A non well formed numeric value encountered” notice, and
continue to evaluate to their respective values.- Non-numeric strings which emmited the “Illegal string offset”
warning will throw an “Illegal offset type” TypeErrorThe only aspect I'm not totally convinced is making well formed float
strings more "legal" than they were in PHP 7.x.
I will also flesh out the RFC with more examples and which code is affected
and how it would need to be updated to produce the previous behaviour.
Best regards
George P. Banyard
Good afternoon Internals,
I rewrote the RFC to improve clarity but the content is mostly identical
with one small change is that the warning change from 'Illegal offset type'
to 'String cast occurred' for string offsets such as '5.5' will be a
secondary
implementation vote as it adds some boilerplate code in the engine.
https://wiki.php.net/rfc/saner-numeric-strings
This email also serves as a reminder that voting is expected to start
next Monday (13th of July).
Best regards
George P. Banyard
Hello internals,
Apologies again for the delay, I was pointed out a sort of inconsistency
between arithmetic/bitwise ops and type declarations.
I've tried to address this in the RFC by promoting all current warnings to
TypeErrors: https://wiki.php.net/rfc/saner-numeric-strings
Please have a look at it, as the current plan is to open the vote on
Thursday
the 16th.
Best regards
George P. Banyard