[RFC] Warning for implicit float to int conversions

4 years ago by G. P. B.

Greetings internals,

I'm proposing a new RFC which would warn when an implicit conversion
from float to int occurs.
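
For readers unfamiliar with the term: an implicit conversion happens when a float is used where PHP expects an int and no cast is written. The sketch below shows three such places (an array key, the modulo operator, and an int parameter in coercive typing mode). takesInt() is a hypothetical function added only for illustration. At the time of this thread all three dropped the fractional part silently; newer PHP versions may additionally emit a diagnostic for them.

```php
<?php
// No declare(strict_types=1): this file runs in coercive typing mode.

// Hypothetical helper, used only to show argument coercion.
function takesInt(int $n): int
{
    return $n;
}

// A float used as an array key is truncated to an int key.
$list = [];
$list[1.7] = 'a';
$keys = array_keys($list);   // [1]

// The modulo operator converts both operands to int first.
$remainder = 7.5 % 2;        // evaluated as 7 % 2, i.e. 1

// A float with a fractional part passed to an int parameter.
$coerced = takesInt(3.5);    // 3

var_dump($keys, $remainder, $coerced);
```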

The draft is currently located on GitHub:
https://github.com/Girgias/float-int-warning/
for ease of commenting/providing changes to it.

The official discussion phase won't start before I convert it to DokuWiki
markup and post it on the wiki, something I'm planning to do next week.

Any feedback is appreciated.

Best regards,

George P. Banyard

4 years ago by Benjamin Morel

Hi George,

Thank you for this proposal, I'm all for it, though I'd go one step
further, and actually issue a warning during explicit casting as well:

(int) 3.5; // E_WARNING
(int) round(3.5); // OK

In my experience, it's better to be explicit about your intent, so forcing
the user to round() before casting when the float has a fractional part is
OK to me.
This would help prevent weird silent conversions such as when you cast to
(int) from user input:

$age = $_GET['age']; // '25.75'
$x = (int) $age; // I'd expect a warning, not a silent conversion to 25

— Benjamin

4 years ago by Rowan Tommins

> $age = $_GET['age']; // '25.75'
> $x = (int) $age; // I'd expect a warning, not a silent conversion to 25

If we want to make explicit casts "fussy", that would be a much bigger
change. Right now, the following are all valid, with no warnings, even
though they are all lossy conversions:

    (int)25.75,
    (int)"25.75",
    (int)"hello",
    (int)"99 red balloons",
    (bool)2,
    (bool)"yes",
    (float)"3.1415 is not pi",
    (string)M_PI, // calling this "lossy" is a bit of a stretch, but it
                  // will retain decimal rather than binary precision
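
To make the list concrete, this sketch evaluates each cast and records the result in a comment. None of them raises a diagnostic. The string form of M_PI depends on the "precision" ini setting, so no value is shown for it.

```php
<?php
$results = [
    (int) 25.75,                 // 25   (fraction dropped)
    (int) "25.75",               // 25
    (int) "hello",               // 0    (no leading digits)
    (int) "99 red balloons",     // 99   (leading digits only)
    (bool) 2,                    // true
    (bool) "yes",                // true (any string other than "" and "0")
    (float) "3.1415 is not pi",  // 3.1415
];
var_dump($results);

// The number of digits kept here depends on the "precision" ini setting.
var_dump((string) M_PI);
```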

There's a case to be made for "strict casts" - I've pointed out before
that so-called "strict types" mode encourages users to write code that
is actually less strict, because explicit casts are very forgiving.
However, they should probably be a new syntax, because there will be an
ENORMOUS amount of code deliberately using the existing casts in
lossy situations.

Regards,

--
Rowan Tommins
[IMSoP]

4 years ago by G. P. B.

On Thu, 4 Feb 2021 at 17:55, Benjamin Morel benjamin.morel@gmail.com
wrote:

> Thank you for this proposal, I'm all for it, though I'd go one step
> further, and actually issue a warning during explicit casting as well:
>
> (int) 3.5; // E_WARNING
> (int) round(3.5); // OK

As Rowan said, explicit casts are already very fuzzy and should be handled
globally, although I don't agree that they should be changed.
An explicit cast is a choice to force the value to a given type, and many
cases are somewhat reasonable
(e.g. (array) 5 giving array(1) { [0]=> int(5) }). There are IMHO only two
types of blatantly bogus type casts: arrays being cast to string,
and arrays/objects being cast to int/float.

And by the looks of it these casts were added because of the usage of the
strict_types declare, which encourages the usage of explicit casts despite
them leading to less type-safe code.

This proposal is one of the stepping stones needed to make the strict_types
declare obsolete by converging the behaviour of the coercive typing mode to
the strict typing one.
The more I think about it, the more I think strict_types is just a bandaid
on some of the suboptimal behaviour PHP traditionally had.
However, many of these behaviours have been somewhat fixed in PHP 8; the two
major ones are systematic TypeErrors for internal functions and a saner
definition of what a numeric string is.
There are currently 3 remaining reasons, at least from my
perspective/knowledge, as to why the strict_types mode is still needed in
PHP 8:

1. Internal functions being implicitly nullable, something being
   deprecated with Nikita's RFC [1] which is currently passing with flying
   colours.
2. Implicit float to int conversions, which this RFC is trying to address.
3. Implicit boolean to scalar conversions, something I want to look into
   afterwards, but it has a larger surface area which needs to be considered.
All other implicit conversions, which currently do not warn or error, are
reasonable from my PoV, and if a codebase wants to adhere to strict type
safety rules there are enough different static analysers nowadays to
achieve this, as strict_types does not achieve this but provides a false
sense that it does.

> On Thu, Feb 4, 2021 at 7:22 PM Larry Garfield larry@garfieldtech.com
> wrote:
>
> > > I get the idea behind your proposal, but equally I'm not convinced this
> > > is comparable to numeric vs non-numeric or malformed/partially numeric
> > > strings. There isn't any value of the string "foobar" which makes sense
> > > as an integer. But there is a value for a float which makes sense as an
> > > integer; the integral part. In the float 3.81232 the integral portion 3
> > > is completely unambiguous. It's not like it would make just as much
> > > sense to interpret it as any other arbitrary integer value.
> >
> > Except that example is ambiguous. Specifically, which is more logical,
> > to truncate it to 3 or round it up to 4? It probably depends heavily on
> > your context. Implicitly doing one or the other can result in surprises.
>
> I disagree this is ambiguous. The integral portion of a float is what it
> is, any notion of rounding it up is no more relevant here than multiplying
> it by 20, calculating its sin value or anything else you can do with a
> number. These are operations you explicitly choose to perform on a scalar.

If you don't care that it has a fractional part, then yes it is unambiguous
(which is kind of debatable given that floating-point numbers can end up
slightly below the integer value they should represent after operations
have been done on them).
However, if you don't care then an explicit cast is totally in order to
signal this. IMHO most people want some sort of indication that they are
receiving a float instead of an integer, as that can mean there is an issue
upstream and the value may need some further processing (be that rounding,
stricter validation of the value received, etc.).
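
The parenthetical above, about floats landing slightly below the integer they should represent, is the classic case where truncation surprises people. A minimal sketch using the well-known 0.1 + 0.7 example:

```php
<?php
// Mathematically (0.1 + 0.7) * 10 is 8, but the binary floating-point
// result is slightly below 8.
$value = (0.1 + 0.7) * 10;

$truncated = (int) $value;         // 7, the fraction is simply dropped
$rounded   = (int) round($value);  // 8

var_dump($truncated, $rounded);
```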

Considering what my long term goal is with this proposal I think I will
change this to a deprecation and write an annexe document which I can refer
to.

The feedback so far has thus been productive.

Best regards,

George P. Banyard

4 years ago by AllenJB

> (int) 3.5; // E_WARNING
> (int) round(3.5); // OK
>
> In my experience, it's better to be explicit about your intent, so forcing
> the user to round() before casting when the float has a fractional part is
> OK to me.

There are legitimate cases for explicitly casting floats to int. For
example floor() outputs a float, but in the context of the domain I'm
working I might know that the result is never going to exceed a certain
value and want the result explicitly as an int.

(And after checking the manual, I'd also note here that round() also
returns a float, so how exactly does your example here work? Is it only
OK to explicitly cast a float that's the return value of a function? Or
only explicitly cast a float if the fractional part is .0? Is that viable
given the "inaccuracy" of floats? Or would it be better for PHP to have
some non-range/accuracy-sensitive representation for integers (and
decimals) here?) (and now we're getting into "why are we still using
floating point math by default in 2021" territory, so I'll stop right here)

AllenJB

PS. Apologies for the dupe to those who did receive my original send,
but I forgot to amend my from address to my subscribed address and, as I
recall, the list doesn't like list-only replies.

4 years ago by David Gebler

If this were to be done, my gut feeling is a notice would be preferable to
a warning. There must be many scripts which would suddenly start churning
out warnings under the proposed change but which probably ignore lower
error levels, and emitting a warning for a previously common script
behaviour is quite a significant backwards-incompatible change.

The bit which makes me more nervous about the proposed change is your
rationale that implicit float to int conversion dropping the fractional
portion means there is "no way to know if the data provided is erroneous".

I get the idea behind your proposal, but equally I'm not convinced this is
comparable to numeric vs non-numeric or malformed/partially numeric
strings. There isn't any value of the string "foobar" which makes sense as
an integer. But there is a value for a float which makes sense as an
integer; the integral part. In the float 3.81232 the integral portion 3 is
completely unambiguous. It's not like it would make just as much sense to
interpret it as any other arbitrary integer value.

So in these cases, via coercion you're just straightforwardly giving a
valid, unambiguous integer to something which expects an integer. I'd
question why should that raise a warning or TypeError.

In favour of the proposal are a couple of the other issues you mentioned
which mean this would make PHP a bit more consistent all-round...but I'm
not entirely persuaded at this point.

-Dave

4 years ago by Larry Garfield

> I get the idea behind your proposal, but equally I'm not convinced this is
> comparable to numeric vs non-numeric or malformed/partially numeric
> strings. There isn't any value of the string "foobar" which makes sense as
> an integer. But there is a value for a float which makes sense as an
> integer; the integral part. In the float 3.81232 the integral portion 3 is
> completely unambiguous. It's not like it would make just as much sense to
> interpret it as any other arbitrary integer value.

Except that example is ambiguous. Specifically, which is more logical, to truncate it to 3 or round it up to 4? It probably depends heavily on your context. Implicitly doing one or the other can result in surprises.
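
The two readings give different integers for the same float, and negative values add a further wrinkle. A small sketch of what each explicit choice produces for the 3.81232 example:

```php
<?php
$x = 3.81232;

$truncated = (int) $x;          // 3 (what a cast does: drop the fraction)
$rounded   = (int) round($x);   // 4
$floored   = (int) floor($x);   // 3
$ceiled    = (int) ceil($x);    // 4

// For negative values, truncation and floor() disagree.
$negTruncated = (int) -3.81232;         // -3 (towards zero)
$negFloored   = (int) floor(-3.81232);  // -4

var_dump($truncated, $rounded, $floored, $ceiled, $negTruncated, $negFloored);
```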

My main concern is if you're casting floats to ints and the floats are usually ints anyway, and so no error, you may not even realize the error remains for a long time until you suddenly start getting a warning if your incoming data shifts. I have no idea how common that pattern is in practice, though.

--Larry Garfield

4 years ago by David Gebler

> Except that example is ambiguous. Specifically, which is more logical,
> to truncate it to 3 or round it up to 4? It probably depends heavily on
> your context. Implicitly doing one or the other can result in surprises.

I disagree this is ambiguous. The integral portion of a float is what it
is, any notion of rounding it up is no more relevant here than multiplying
it by 20, calculating its sin value or anything else you can do with a
number. These are operations you explicitly choose to perform on a scalar.

4 years ago by Benjamin Morel

> (And after checking the manual, I'd also note here that round() also
> returns a float, so how exactly does your example here work? Is it only
> OK to explicitly cast a float that's the return value of a function? Or
> only explicitly cast a float if the fractional part is .0? Is that viable
> given the "inaccuracy" of floats? Or would it be better for PHP to have
> some non-range/accuracy-sensitive representation for integers (and
> decimals) here?) (and now we're getting into "why are we still using
> floating point math by default in 2021" territory, so I'll stop right here)

Floats (doubles) can accurately represent all integers up to 2⁵³, so there
is no inaccuracy in this range; the result from round() or floor() could
therefore be safely passed to (int) even if the cast operator checked for a
0 fractional part, which is what I'm advocating for.
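
A short sketch of the claim about exact integers, assuming a 64-bit PHP build (so that 2**53 is still an int). hasFraction() is a hypothetical helper showing the kind of check a stricter cast would need; it is not an existing PHP function.

```php
<?php
// Hypothetical check: does this float carry a fractional part?
function hasFraction(float $f): bool
{
    return is_finite($f) && floor($f) !== $f;
}

// round() and floor() return floats, but integral ones.
$a = hasFraction(round(3.5));  // false
$b = hasFraction(floor(7.9));  // false
$c = hasFraction(3.5);         // true

// Below 2**53 an integer survives the round trip through float unchanged...
$limit = 2 ** 53;                                      // int on 64-bit builds
$exact = (int) (float) ($limit - 1) === $limit - 1;    // true

// ...but just above it, neighbouring integers collapse to the same float.
$collapsed = (float) ($limit + 1) === (float) $limit;  // true

var_dump($a, $b, $c, $exact, $collapsed);
```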

> There are legitimate cases for explicitly casting floats to int. For
> example floor() outputs a float, but in the context of the domain I'm
> working I might know that the result is never going to exceed a certain
> value and want the result explicitly as an int.

Perfect, so (int) floor() would work wonders for you, even with the strict
casting I'm talking about.
And if the result does overflow an integer one day, I'm sure you'd be happy
to know it by getting an exception, rather than getting silently ZERO:

echo (int) 1e60; // 0

— Benjamin

4 years ago by David Gebler

> Floats (doubles) can accurately represent all integers up to 2⁵³, so there
> is no inaccuracy in this range; the result from round() or floor() could
> therefore be safely passed to (int) even if the cast operator checked for
> a 0 fractional part, which is what I'm advocating for.

Generating a warning on explicit casts of (non-integer) floats to int would
IMO make no sense at all, it would put PHP at odds with other major
languages such as C, Python and Java and go against normal, reasonable
expectations of how a programming language behaves.

You said in an earlier comment "it's better to be explicit about your
intent", but doing something like (int)3.5 is being explicit about your
intent - and truncating casts on float to int is the widely established
norm.

This was exactly my reservation about deprecating this behaviour even as an
implicit cast - in my mind it isn't a bug or flaw, it's there by design.

If developers want to round/ceil/floor/do whatever with a float prior to
using it as an int, they already have that option and the greatest
flexibility.

At least with the implicit case, I understand the motivation and argument
for bringing coercion more in line with strict typing behaviour and
catching cases where such a cast may not have been intentional (though I
still think a warning is too high an error level for this and would favour
a notice or deprecation, were it to be done at all).

On Fri, Feb 5, 2021 at 12:52 PM Benjamin Morel benjamin.morel@gmail.com
wrote:

> Perfect, so (int) floor() would work wonders for you, even with the strict
> casting I'm talking about.
> And if the result does overflow an integer one day, I'm sure you'd be happy
> to know it by getting an exception, rather than getting silently ZERO:
>
> echo (int) 1e60; // 0

4 years ago by Benjamin Morel

> Generating a warning on explicit casts of (non-integer) floats to int
> would IMO make no sense at all, it would put PHP at odds with other major
> languages such as C, Python and Java and go against normal, reasonable
> expectations of how a programming language behaves.
>
> You said in an earlier comment "it's better to be explicit about your
> intent", but doing something like (int)3.5 is being explicit about your
> intent - and truncating casts on float to int is the widely established
> norm.

Let's agree to disagree on what would be the ideal behaviour of type casts.
I do understand that it would be a big concern for BC if (int) stopped
working for floats with a fractional part.

Could we at least fix the odd cases where the cast is definitely a failure?
Like:

(int) 1e60; // 0
(int) "foo"; // 0

— Benjamin

4 years ago by Christian Schneider

On 06.02.2021 at 01:17, Benjamin Morel benjamin.morel@gmail.com wrote:

> Could we at least fix the odd cases where the cast is definitely a failure?
> Like:
>
> (int) 1e60; // 0
> (int) "foo"; // 0

Are you talking about the constant values 1e60 and "foo"? If not then please don't add a warning (or worse).

I'm sure there is a lot of code which takes user input and uses (int) casts to ensure they are dealing with integers.
There is also intval() as an alternative but my guess would be that real world code uses 50% (int) and 50% intval() to do this.

This would be a big BC break IMHO.

Chris

4 years ago by Rowan Tommins

> I'm sure there is a lot of code which takes user input and uses (int) casts to ensure they are dealing with integers.
> There is also intval() as an alternative but my guess would be that real world code uses 50% (int) and 50% intval() to do this.

My thoughts exactly. Code along these lines is common and, in my
opinion, perfectly reasonable:

    $id = (int)$_GET['id'];
    if ( $id === 0 ) {
        // throw an exception, return false, look up a default, etc, as
        // the application's design requires
    }
    // proceed knowing that $id is a non-zero integer

I would however welcome a new function or syntax that either performs a
"strict cast" (producing an error if the cast is lossy in any way) or
checks in advance if a cast would be lossy.
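
A userland sketch of the kind of function described above. strict_int() is a hypothetical name, and the rules it applies (integral floats within the int range, and strings accepted by FILTER_VALIDATE_INT) are just one possible definition of "lossless", not something proposed in this thread. The range check assumes a 64-bit build.

```php
<?php
// Hypothetical helper: convert to int only when no information is lost.
function strict_int($value): int
{
    if (is_int($value)) {
        return $value;
    }
    if (is_float($value)) {
        // Reject NAN/INF, fractional values and values outside the int range.
        // On 64-bit builds (float) PHP_INT_MAX rounds up to 2**63, hence ">=".
        if (!is_finite($value) || floor($value) !== $value
            || $value < (float) PHP_INT_MIN || $value >= (float) PHP_INT_MAX) {
            throw new InvalidArgumentException('Lossy float to int conversion');
        }
        return (int) $value;
    }
    if (is_string($value)) {
        // FILTER_VALIDATE_INT accepts only well-formed decimal integers in range.
        $int = filter_var($value, FILTER_VALIDATE_INT);
        if ($int === false) {
            throw new InvalidArgumentException('Not an integer string');
        }
        return $int;
    }
    throw new InvalidArgumentException('Unsupported type');
}

var_dump(strict_int(4.0));   // int(4)
var_dump(strict_int("25"));  // int(25)
```

With this definition strict_int(3.5), strict_int("25.75"), strict_int("hello") and strict_int(1e60) all throw instead of silently yielding 3, 25, 0 and 0.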

Regards,

--
Rowan Tommins
[IMSoP]

4 years ago by David Gebler

This is all a bit moot anyway, the RFC proposal is for warnings or notices
on implicit casts only.

I'm not a voting member for RFCs so my opinion is mere food for thought,
nonetheless my two cents is that:

a) The proposal relies on a premise that an implicit cast of (non-zero
fractional) float to int is inherently ambiguous or a mistake.
I disagree with this as outlined in my previous messages; namely my
objection is truncating a float on cast to int is the widely established
normal behaviour in numerous programming languages.
There should not be a penalty in the form of an error just for doing such a
conversion implicitly, in accordance with how PHP's type coercion works by
design.

b) String offsets, where a warning already occurs, are something of a
special case; this warning was added, I believe (5.4?), because malformed
string offsets were a known common error in the community. It's not even
entirely consistent: $foo["2"] is fine, $foo[2.5] is a warning with offset
[2], $foo["2x"] is a warning with offset [2] and $foo["2.5"] is a TypeError.

c) It's a substantial BC breaking change likely to affect a lot of existing
code, even though that code works as intended.

d) If it is implemented at all, it should not be an error level as high as
a warning.

-Dave

On Sat, Feb 6, 2021 at 7:32 PM Rowan Tommins rowan.collins@gmail.com
wrote:

> I would however welcome a new function or syntax that either performs a
> "strict cast" (producing an error if the cast is lossy in any way) or
> checks in advance if a cast would be lossy.

4 years ago by Chase Peeler

> At least with the implicit case, I understand the motivation and argument
> for bringing coercion more in line with strict typing behaviour and
> catching cases where such a cast may not have been intentional (though I
> still think a warning is too high an error level for this and would favour
> a notice or deprecation, were it to be done at all).

A notice is fine, but PLEASE don't make it a warning. I'm in the process of
upgrading to 8.0 right now and I have so much code that works perfectly
fine but generates warnings (undefined array key for example - in 99.9% of
the cases where an array key is not defined, the null value that used to
result from that was perfectly fine).

--
Chase Peeler
chasepeeler@gmail.com

4 years ago by Rowan Tommins

> I still think a warning is too high an error level for this and would
> favour a notice or deprecation, were it to be done at all.

I'd just like to point out that a deprecation notice implies (or should imply!) that the behaviour will be completely removed in a future version. So it doesn't really make sense to say "a warning is too much, but I'm fine with it being an error in the future".

Probably, E_DEPRECATED shouldn't be considered its own severity level, but an extra flag paired with either E_WARNING or E_NOTICE.

Regards,

--
Rowan Tommins
[IMSoP]