Hi internals,
I would like to present the RFC to add the
"is_representable_as_float()" and "is_representable_as_int()"
functions. These functions provide developers with a way to check
whether values can be losslessly converted between integer and
floating-point representations.
https://wiki.php.net/rfc/is-representable-as-float-int
Best,
Alexandre Daubois
This looks lovely.
Since only int, float, and string can ever be true, does it make sense to type the function parameters as int|float|string rather than mixed, to automatically exclude arrays, objects, and null? (All of which would always be false.)
(I could probably argue either way on it, but my first inclination is yes; push the initial validation upstream. You should always know if a variable is a primitive or not, at the very least.)
--Larry Garfield
> Since only int, float, and string can ever be true, does it make sense
> to type the function parameters as int|float|string rather than mixed,
> to automatically exclude arrays, objects, and null? (All of which would
> always be false.)
There is prior art, and I'd personally recommend sticking with mixed so as
not to open new gaps while closing old ones.
$ php --rf is_countable
Function [ <internal:standard> function is_countable ] {
  - Parameters [1] {
    Parameter #0 [ <required> mixed $value ]
  }
  - Return [ bool ]
}
> (I could probably argue either way on it, but my first inclination is
> yes; push the initial validation upstream. You should always know if a
> variable is a primitive or not, at the very least.)
IMHO this is not a validation but a test function.
So if you think one should know, you probably should not need this function
at all.
Btw. is_primitive() was not a thing last time I looked. It would be great,
though, to have the combined test of is_scalar() and is_null() in a single
function call.
This is perhaps something for the line-up.
Just my 2 cents
-- hakre
> So if you think one should know, you probably should not need this function
> at all.
That's right. I'd also say that since the aim is to have better
interoperability between systems, mixed seems judicious. For example,
a bool or a null can occur.
> Btw. is_primitive() was not a thing last time I looked. It would be great
> though to have the combined test of is_scalar() and is_null() in a single
> function call.
I like the idea. Would it have its place in this RFC? I feel like it
could with a separate vote, but maybe this would require its own RFC.
> > Btw. is_primitive() was not a thing last time I looked. It would be great
> > though to have the combined test of is_scalar() and is_null() in a single
> > function call.
> I like the idea.
Thanks.
> Would it have its place in this RFC? I feel like it
> could with a separate vote, but maybe this would require its own RFC.
Nah, I at least would not put it in there. But after that RFC is done, we
should probably revisit the idea.
With is_representable_as_float() in place, is_primitive() could gain even
more clarity beyond the ordinary type test, and could tie into the common
ground reflected in the JSON value model of the relatively recent JSONPath
Internet RFC, which has had a PHP implementation since its beginning some 20
years ago. If we are able to capture such long-standing wisdom within the
language, it would be a testament to software engineering in web/PHP tech and
could become a beautiful example of unveiling true ergonomics.
My 2 cents
-- hakre
On 29.07.2025 at 14:52 CEST, Alexandre Daubois alex.daubois+php@gmail.com wrote:
> Hi internals,
> I would like to present the RFC to add the
> "is_representable_as_float()" and "is_representable_as_int()"
> functions. These functions provide developers with a way to check
> whether values can be losslessly converted between integer and
> floating-point representations.
> https://wiki.php.net/rfc/is-representable-as-float-int
> Best,
> Alexandre Daubois
Thanks for the RFC.
Since frameworks already have things like Strings::is_stringable() and Arr::arrayable(), I'd suggest using is_floatable(), is_intable() or is_integerable(), etc.
To me, this would also feel more consistent with is_float(), is_int(), etc.
Regards
Thomas
> Since frameworks already have things like Strings::is_stringable() and Arr::arrayable(), I'd suggest to use is_floatable(), is_intable() or is_integerable(), etc.
> To me, this would also feel more consistent with is_float(), is_int(), etc.
I get what you mean. Naming is definitely challenging here. However,
with alternatives like "is_floatable()", I'm afraid that it is not
clear enough that there's a potential loss of precision. I would
personally feel that "is_floatable()" would mean "can this be cast to
float?" without further "limitation".
Best,
Alexandre Daubois
In general, I think the proposed semantics make sense. I'm not sure
about the names, but it's hard to think of a name that accurately
describes what they do.
> The value is a float that is not NaN or infinity
It feels a bit odd to have a value where is_float($v) would be true, but
is_representable_as_float($v) would be false. I'd be interested to
understand the thinking behind this case.
> Because of this, the function return value is also platform-dependent:
> is_representable_as_float(PHP_INT_MAX); // true on 64-bit platforms, false on 32-bit platforms
This is, at best, misleading: it's not the function that's behaving
differently, it's just being given different input. By that reasoning,
strlen() is platform-dependent, because strlen(PHP_OS_FAMILY) returns 5
on Linux but 7 on Windows.
As written, it's also the wrong way around: on a 32-bit platform, you
are passing 2147483647, which is a "safe" value; on a 64-bit platform,
you are passing 9223372036854775807, which is an odd number outside the
"safe" range.
is_representable_as_float(2147483647) === true on any platform
is_representable_as_float(9223372036854775807) === false on a 64-bit build
Confusingly, passing that absolute value on a 32-bit system will appear
to return true, but that's because the loss of precision has already
happened in the compiler: 9223372036854775807 will actually be compiled
as 9223372036854775808.0, the nearest representable float. So again,
it's actually running the same function with different input, not
different behaviour inside the function.
I suggest removing that sentence and example completely. There is
nothing the function can or should do differently on different platforms.
> Here also, the function return value is platform-dependent:
> is_representable_as_int(2**31); // true on 64-bit platforms, false on 32-bit platforms
This one is fair enough, but I'd suggest tweaking the example slightly
to avoid the same problem with different inputs: on a 64-bit build,
2**31 already is an integer, so it's trivially true. So either force a
floating point input:
is_representable_as_int(2.0**31); // true on 64-bit platforms, false on 32-bit platforms
or, perhaps clearer, use an out-of-range string input:
is_representable_as_int('2147483648'); // true on 64-bit platforms, false on 32-bit platforms
Regards,
--
Rowan Tommins
[IMSoP]
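To make the "same function, different input" point concrete, here is a minimal userland sketch. The helper name int_survives_float_roundtrip() is hypothetical and is not the RFC's implementation; it only checks whether an int comes back unchanged after an int to float to int conversion. The second call is expected to differ between builds only because PHP_INT_MAX itself differs.

```php
<?php
// Hypothetical helper for illustration only; not the RFC's implementation.
function int_survives_float_roundtrip(int $i): bool
{
    $f = (float) $i;

    // On 64-bit builds, ints close to PHP_INT_MAX round up to 2**63, which is
    // outside the int range, so reject that value before casting back.
    if ($f >= (float) PHP_INT_MAX + 1.0) {
        return false;
    }

    return (int) $f === $i;
}

var_dump(int_survives_float_roundtrip(2147483647));  // inside the 2**53 "safe" range
var_dump(int_survives_float_roundtrip(PHP_INT_MAX)); // odd and far outside that range on 64-bit
```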
> > The value is a float that is not NaN or infinity
> It feels a bit odd to have a value where is_float($v) would be true, but is_representable_as_float($v) would be false. I'd be interested to understand the thinking behind this case.
As the strings “INF” and “NAN” can't be cast as floats directly, I
think it's safer to return false on these extreme values. See
https://3v4l.org/TabMK#vnull for the cast example.
> I suggest removing that sentence and example completely. There is nothing the function can or should do differently on different platforms.
You are right, it's misleading. I'll update the RFC accordingly.
> This one is fair enough, but I'd suggest tweaking the example slightly to avoid the same problem with different inputs: on a 64-bit build, 2**31 already is an integer, so it's trivially true. So either force a floating point input:
> is_representable_as_int(2.0**31); // true on 64-bit platforms, false on 32-bit platforms
> or, perhaps clearer, use an out-of-range string input:
> is_representable_as_int('2147483648'); // true on 64-bit platforms, false on 32-bit platforms
I'll also update accordingly. Thank you!
Best,
Alexandre Daubois
> > > The value is a float that is not NaN or infinity
> > It feels a bit odd to have a value where is_float($v) would be true, but is_representable_as_float($v) would be false. I'd be interested to understand the thinking behind this case.
> As the strings “INF” and “NAN” can't be cast as floats directly, I
> think it's safer to return false on these extreme values. See
> https://3v4l.org/TabMK#vnull for the cast example.
The sentence I quoted isn't talking about string inputs, it's talking about values that are already floats. The fact that you can't create one from a string cast doesn't seem relevant, if you have in fact created one some other way.
Rowan Tommins
[IMSoP]
> The sentence I quoted isn't talking about string inputs, it's talking about values that are already floats. The fact that you can't create one from a string cast doesn't seem relevant, if you have in fact created one some other way.
Ah yes, my bad. I just updated the RFC page with your previous
comments. Also, I changed the examples of INF and NAN to return true with
"is_representable_as_float()". The example of the float overflowing to
infinity (2**1024) is also updated to return true.
Best,
Alexandre Daubois
> I would like to present the RFC to add the
> "is_representable_as_float()" and "is_representable_as_int()"
> functions. These functions provide developers with a way to check
> whether values can be losslessly converted between integer and
> floating-point representations.
The RFC is now updated with the draft implementation of the current
state of the document. The draft PR is available at
https://github.com/php/php-src/pull/19308.
Best,
Alexandre Daubois
Hi Internals,
This is a friendly reminder that this RFC has been under discussion
for a couple of weeks. Today marks two weeks since the announcement of
this RFC. After some uneventful discussions, I am wondering how we
should proceed now. Would you like to continue the discussion, or
perhaps initiate a vote? With the feature freeze imminent (tomorrow,
if I'm right?), I am not sure that the RFC would be implemented in 8.5
if it is accepted given that the vote lasts at least a week.
What's the best way to proceed?
The implementation is still available at
https://github.com/php/php-src/pull/19308.
Best,
Alexandre Daubois
On Tue, Aug 12, 2025 at 4:35 AM Alexandre Daubois <alex.daubois+php@gmail.com> wrote:
> Hi Internals,
> This is a friendly reminder that this RFC has been under discussion
> for a couple of weeks. Today marks two weeks since the announcement of
> this RFC. After some uneventful discussions, I am wondering how we
> should proceed now. Would you like to continue the discussion, or
> perhaps initiate a vote? With the feature freeze imminent (tomorrow,
> if I'm right?), I am not sure that the RFC would be implemented in 8.5
> if it is accepted given that the vote lasts at least a week.
It is too late for RFCs to target PHP 8.5; they needed to finish voting
today in order to be included.
Given that this cannot be included in PHP 8.5, I would suggest waiting at
least a few weeks before starting the vote - some people (myself definitely
included) may not have had a chance to review the RFC because of the work
surrounding trying to finalize and implement things before the PHP 8.5
freeze.
-Daniel
> Given that this cannot be included in PHP 8.5, I would suggest waiting at least a few weeks before starting the vote - some people (myself definitely included) may not have had a chance to review the RFC because of the work surrounding trying to finalize and implement things before the PHP 8.5 freeze.
Tim kindly told me so as well. Let's wait a few more weeks so everyone
interested can get into the discussion.
Good luck with the release!
Best,
Alexandre Daubois
Hi
On 2025-07-29 14:52, Alexandre Daubois wrote:
> I would like to present the RFC to add the
> "is_representable_as_float()" and "is_representable_as_int()"
> functions. These functions provide developers with a way to check
> whether values can be losslessly converted between integer and
> floating-point representations.
Thank you for the RFC. I fundamentally disagree that there is a
single, obvious and meaningful definition for
is_representable_as_float(). In fact, right from the beginning of the
“Proposal” section:
> The value is a string that represents a valid floating-point number
> (e.g., “3.14”, “1e10”, “-0.001”)
3.14 is not exactly representable as an IEEE-754 double-precision
floating point number. The two nearest representable values are
3.14000000000000012434 (this one is the nearest) and
3.13999999999999968026. Returning true for
is_representable_as_float("3.14") is therefore wrong to me.
The introduction is also wrong. JSON does not specify that numbers
must be representable as IEEE-754 double-precision floating point
numbers. The grammar for “Numbers” is independent of any particular
representation and transmitting numbers > 2^53 is perfectly fine if
every system involved can deal with 64 bit integers. Also just casting a
non-representable value to string will break any consumers that are
strongly typed and expect a number rather than a string. So that doesn't
make sense to me either.
I also question how meaningful it is to know that a given number is
exactly representable as a float (e.g. a power of two) when almost
any operation on it will incur implicit rounding.
For is_representable_as_int(), finding a reasonable definition is much
easier, but I don't think I've ever had a use case where I needed to
know whether a non-integer value is exactly representable as an int.
Long story short: What actual real-world problem does this RFC attempt
to solve? The list in the “Introduction” section is not particularly
meaningful with regard to the real world and the JSON example is wrong
as outlined above.
Best regards
Tim Düsterhus
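The 3.14 point can be checked directly. A minimal sketch, assuming the default precision INI setting and that sprintf() emits correctly rounded digits at this width: the expanded form should show a nonzero tail like the one quoted above, while the default string conversion hides it.

```php
<?php
// The double nearest to the decimal literal 3.14, with 20 fractional digits.
$expanded = sprintf('%.20f', 3.14);
echo $expanded, "\n";

// The default string conversion rounds to fewer digits, so the string
// round trip looks lossless even though the stored value is not exactly 3.14.
var_dump((string) (float) '3.14' === '3.14');

// If 3.14 were stored exactly, the expansion would be all zeros after "3.14".
var_dump($expanded === '3.14000000000000000000');
```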
I have multiple times just recently had need of "I have a numeric string, should I cast it to an int or a float?", for which an is_representable_as_int() function (or similar) would be quite helpful, and neater than the messy solution I usually use.
I haven't had a use for is_representable_as_float() that I can recall.
--Larry Garfield
Hi Larry,
> I have multiple times just recently had need of "I have a numeric string, should I cast it to an int or a float?", for which an is_representable_as_int() function (or similar) would be quite helpful, and neater than the messy solution I usually use.
Indeed, this is a nice use case! This is also something I encountered
a few times.
> I haven't had a use for is_representable_as_float() that I can recall.
Just thinking out loud, if this RFC ever makes it to the voting phase,
maybe we should have two separate votes, one for each function.
Best,
Alexandre Daubois
Hi
> I have multiple times just recently had need of "I have a numeric string, should I cast it to an int or a float?", for which an is_representable_as_int() function (or similar) would be quite helpful, and neater than the messy solution I usually use.
It would've been nice to know what that use-case is, rather than just
knowing that you had that use-case.
I'm having a hard time thinking of something where I don't a-priori know
what type I expect to get and would need to inspect the value to make a
decision.
I see how having a function that safely coerces a string into an int,
returning null if coercion fails, basically intval() with better error
handling and taking only strings, could be useful, but that's not what
is being asked here.
Best regards
Tim Düsterhus
When doing generic serialization, the input is often always-strings (eg, environment variables, HTTP Query parameters, etc.) When doing generic code (not type generics, but "works on anything" kind of generic), I often have to resort to this:
https://github.com/Crell/EnvMapper/blob/master/src/EnvMapper.php#L88
Which gets the job done, but feels ugly to me.
Even if I know what the target type is, I still need to ask the question "so does this string match the target type?"
https://github.com/Crell/Carica/blob/master/src/Middleware/NormalizeArgumentTypesMiddleware.php#L73
"I have a string, the parameter wants an int, is that even possible?" Being able to replace that floor() nonsense with is_integerable() (by whatever name) would make my life a lot easier.
For float, is_numeric() is already sufficient for my purposes. I just need to be able to differentiate between "3" and "3.14" to cast to the correct type.
--Larry Garfield
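The "int or float?" decision described above can be approximated today with existing functions. A minimal sketch, assuming PHP 8.0 or later; the name cast_numeric_string() is hypothetical, and this is neither the RFC's proposed function nor the code in the linked repositories. It relies on filter_var() with FILTER_VALIDATE_INT, which returns false for anything that is not an integer string within the platform's int range.

```php
<?php
// Hypothetical helper: pick int when the string is a valid in-range integer,
// float when it is otherwise numeric, and null when it is not numeric at all.
function cast_numeric_string(string $s): int|float|null
{
    $int = filter_var($s, FILTER_VALIDATE_INT);
    if ($int !== false) {
        return $int;
    }

    return is_numeric($s) ? (float) $s : null;
}

var_dump(cast_numeric_string('3'));    // int(3)
var_dump(cast_numeric_string('3.14')); // float(3.14)
var_dump(cast_numeric_string('abc'));  // NULL
```

Note that under this sketch "3.0" becomes a float, and an integer string too large for the platform's int falls through to float, with whatever precision loss that implies.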
On Tue, Aug 26, 2025 at 21:38, Larry Garfield larry@garfieldtech.com wrote:
> Even if I know what the target type is, I still need to ask the question
> "so does this string match the target type?"
> https://github.com/Crell/Carica/blob/master/src/Middleware/NormalizeArgumentTypesMiddleware.php#L73
> "I have a string, the parameter wants an int, is that even possible?"
> Being able to replace that floor() nonsense with is_integerable() (by
> whatever name) would make my life a lot easier.
> For float, is_numeric() is already sufficient for my purposes. I just
> need to be able to differentiate between "3" and "3.14" to cast to the
> correct type.
> --Larry Garfield
Why not just 0 + $theValue? Then the engine will decide. No?
> Even if I know what the target type is, I still need to ask the question "so does this string match the target type?"
> https://github.com/Crell/Carica/blob/master/src/Middleware/NormalizeArgumentTypesMiddleware.php#L73
> "I have a string, the parameter wants an int, is that even possible?" Being able to replace that floor() nonsense with is_integerable() (by whatever name) would make my life a lot easier.
> For float, is_numeric() is already sufficient for my purposes. I just need to be able to differentiate between "3" and "3.14" to cast to the correct type.
> --Larry Garfield
Isn't this the entire use-case of type coercion?
— Rob
Hi Tim,
Thank you for your thorough review.
> 3.14 is not exactly representable as an IEEE-754 double-precision
> floating point number. The two nearest representable values are
> 3.14000000000000012434 (this one is the nearest) and
> 3.13999999999999968026. Returning true for
> is_representable_as_float("3.14") is therefore wrong to me.
You're right about mathematical precision. Perhaps the function name
is misleading. What the function actually should check is whether a
string > float > string roundtrip preserves the value as PHP
developers would expect it, not whether it's mathematically exactly
representable in IEEE-754.
Maybe a more accurate name would better reflect this behavior. Naming
is super hard again here. The goal is pragmatic: "will this value
survive a type conversion cycle without surprising changes?"
> The introduction is also wrong. JSON does not specify that numbers
> must be representable as IEEE-754 double-precision floating point
> numbers. The grammar for “Numbers” is independent of any particular
> representation and transmitting numbers > 2^53 is perfectly fine if
> every system involved can deal with 64 bit integers. Also just casting a
> non-representable value to string will break any consumers that are
> strongly typed and expect a number rather than a string. So that doesn't
> make sense to me either.
Fair point about the JSON specification itself. However, the
real-world issue is JavaScript's Number type limitation (safe integers
only up to 2^53). The fact that it is fine if "every system involved
can deal with 64 bits integers" is true, but nothing guarantees that
all systems involved are systematically x64. Many PHP applications
need to communicate with JavaScript frontends, and knowing when a
numeric ID exceeds JavaScript's safe integer range would improve
interoperability.
> I also question how meaningful it is to know that a given number is
> exactly representable as a float (e.g. a power of two) when almost
> any operation on it will incur implicit rounding.
To me, the code snippet in the RFC introduction illustrates a valid
use-case of a numeric identifier that could be dangerous to cast.
> Long story short: What actual real-world problem does this RFC attempt
> to solve? The list in the “Introduction” section is not particularly
> meaningful with regard to the real world and the JSON example is wrong
> as outlined above.
A few examples, as interoperability with JSON/JavaScript is
the main point here, could be:
- Snowflake IDs
- High-precision timestamps in microseconds
- Large auto-increment IDs in mature applications
Hope I addressed your concerns.
Best,
Alexandre Daubois
Hi
> You're right about mathematical precision. Perhaps the function name
> is misleading. What the function actually should check is whether a
> string > float > string roundtrip preserves the value as PHP
> developers would expect it, not whether it's mathematically exactly
> representable in IEEE-754.
> Maybe a more accurate name would better reflect this behavior. Naming
> is super hard again here. The goal is pragmatic: "will this value
> survive a type conversion cycle without surprising changes?"
"Surprising changes" is not a meaningful term. Keep in mind that the
behavior of the function will need to be accurately documented and that
folks need something to work with to determine whether a report is valid
or not when bugs in the function get reported - which will inevitably
happen.
In fact to use one of the examples for the RFC: 1e10 does not roundtrip.
php > var_dump((string)(float)"1e10");
string(11) "10000000000"
"1e10" clearly is different than "10000000000".
Generally speaking, I'm also not sure if “printing raw floats” is a
use-case we should encourage. Instead almost any application will need
to format a number for a specific use-case. Formatting numbers for human
consumption [1] has different requirements compared to formatting
numbers for programmatic consumption.
Also note that the stringification of floats unfortunately is controlled
by the INI settings precision and serialize_precision, which adds
additional complexity.
[1] My recommendation is using ext/intl's NumberFormatter:
https://www.php.net/manual/en/class.numberformatter.php
> need to communicate with JavaScript frontends, and knowing when a
> numeric ID exceeds JavaScript's safe integer range would improve
> interoperability.
For this use case we would not need a complex function. A simple
constant FLOAT_INTEGER_RANGE or similar would be sufficient.
> > I also question how meaningful it is to know that a given number is
> > exactly representable as a float (e.g. a power of two) when almost
> > any operation on it will incur implicit rounding.
> To me, the code snippet in the RFC introduction illustrates a valid
> use-case of a numeric identifier that could be dangerous to cast.
Please note my remark regarding “strongly typed consumers”. Magically
switching from JSON's number type to JSON's string type for specific
values is needlessly increasing complexity for everyone who interacts
with the JSON payload, particularly for compiled languages. If you know
that the consumers of your JSON are having troubles with certain numbers
you emit and you want to emit strings for those kind of numbers instead,
then you should just emit strings all the time.
In fact I'd argue that using JSON's number type for numeric IDs is
wrong, just like treating phone numbers or zip codes as numbers is
wrong. If it's not reasonable to do math on a value then it's likely not
a number, but instead a numeric string.
> > Long story short: What actual real-world problem does this RFC attempt
> > to solve? The list in the “Introduction” section is not particularly
> > meaningful with regard to the real world and the JSON example is wrong
> > as outlined above.
> A few examples, as interoperability with JSON/JavaScript is
> the main point here, could be:
> - Snowflake IDs
At the application boundary these should just be a string, since
consumers of your API should treat IDs as opaque values, unless you want
to give some guarantees (e.g. rough ordering, which would still work
with strings).
> - High-precision timestamps in microseconds
Using a number is reasonable here, but see above regarding “magically
switching types”.
(From my experience timestamps are generally represented as ISO-8601
strings in JSON APIs anyways)
> - Large auto-increment IDs in mature applications
Same as snowflakes.
> Hope I addressed your concerns.
I'm afraid not.
Best regards
Tim Düsterhus
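The constant-based alternative suggested above could look roughly like this. FLOAT_SAFE_INTEGER_MAX and fits_float_integer_range() are hypothetical names chosen for this sketch (the message only proposes "FLOAT_INTEGER_RANGE or similar"); the value 2**53 - 1 mirrors JavaScript's Number.MAX_SAFE_INTEGER, and a 64-bit build is assumed.

```php
<?php
// Hypothetical constant: the largest integer n such that n and n + 1 are both
// exactly representable as IEEE-754 doubles (2**53 - 1).
const FLOAT_SAFE_INTEGER_MAX = 9007199254740991;

// Hypothetical helper: true when the ID stays within the symmetric safe range.
function fits_float_integer_range(int $id): bool
{
    return $id >= -FLOAT_SAFE_INTEGER_MAX && $id <= FLOAT_SAFE_INTEGER_MAX;
}

var_dump(fits_float_integer_range(1234567890123));    // bool(true)
var_dump(fits_float_integer_range(9007199254740993)); // bool(false)
```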
> Generally speaking, I'm also not sure if “printing raw floats” is a
> use-case we should encourage. Instead almost any application will need
> to format a number for a specific use-case. Formatting numbers for human
> consumption [1] has different requirements compared to formatting
> numbers for programmatic consumption.
Indeed. In some parts of the world, 1.000,01 is interpreted as 1000.01 in computers. That format also doesn’t roundtrip but is a valid numeric string, depending on locale.
— Rob
"Surprising changes" is not a meaningful term. Keep in mind that the behavior of the function will need to be accurately documented and that folks need something to work with to determine whether a report is valid or not when bugs in the function get reported - which will inevitably happen.
I think there's a sliding scale of "strictness" of casts, with string to float probably the hardest to pin down.
At one end, you have "purely lossless" or "symmetrical", where (string)(float)$x === $x. This would mean accepting '4' but not '4.0', which is probably not what users would expect.
From there, you can add some "normalisation" - ignore insignificant zeros at start and end; perhaps also ignore leading and trailing spaces.
Then you get alternative representations - most commonly, scientific notation like '2.5E10'. This hugely increases the number of inputs that result in the same output, e.g. '0.2E1' is yet another synonym for '2'.
Then there are inputs which don't fully match the normal format, but are unambiguous - '.1', '1.', '+1', etc.
At the other end of the scale, there's the "best guess" approach that PHP's explicit cast operators use - (string)(float)' +4.2E1 bananas ' results in '42'
And that's before we check the precision of the resulting interpretation, and how close the floating point value is to the target decimal.
I'd really like something stricter than is_numeric but more flexible than ctype_digit; but exactly what it should do is a bit of a minefield.
Rowan Tommins
[IMSoP]
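The ends of that scale can be compared side by side. A minimal sketch, assuming the default precision INI setting; it prints, for each of the inputs mentioned above, whether the strict (string)(float) round trip holds, what is_numeric() says, and what the explicit cast produces.

```php
<?php
// Inputs taken from the discussion above, from strictest to loosest case.
$inputs = ['4', '4.0', '2.5E10', '.1', ' +4.2E1 bananas '];

foreach ($inputs as $s) {
    // "Symmetrical" check: does the string survive string -> float -> string?
    $symmetric = (string) (float) $s === $s;

    printf(
        "%-20s symmetric=%-5s is_numeric=%-5s cast=%s\n",
        var_export($s, true),
        var_export($symmetric, true),
        var_export(is_numeric($s), true),
        (string) (float) $s
    );
}
```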
Hi,
You pointed out many valid issues here. After taking a few steps back,
I wonder indeed if Gina's proposition [1] wouldn't be enough.
Also, as Rowan said, knowing what these functions should exactly do is
complex and I'm afraid we won't find a happy middle ground that suits
everyone and every situation. Based on this thread's feedback and
suggestions, the use cases seem niche, for is_representable_as_float()
at least.
Best,
Alexandre Daubois