[DISCUSSION] Adding the "is_integer_safe()" function

3 months ago by Alexandre Daubois

Hello Internals,

I would like to gather feedback on the proposal to add an
"is_integer_safe()" function to PHP. The idea came to me while
replying to a thread [1], but I think creating a discussion is more
appropriate.

Here are a few reasons I think this would be useful:

- PHP automatically converts integers that exceed PHP_INT_MAX to
  float, which can cause undetected precision loss. This function would
  allow checking that an integer remains within safe bounds before
  critical operations;
- Integer limits vary between 32-bit and 64-bit systems. This function
  would provide a standardized way to check the portability of integer
  values across different architectures;
- This would improve code readability. We had this case in Symfony [2]
  when implementing a JSONPath crawler, and the solution feels hacky to
  say the least;
- PHP already has functions like is_finite() and is_nan() for
  floats. Adding is_integer_safe() would logically complete this set
  of type validation functions;
- It would of course be more efficient than working with strings and
  applying arithmetic operators to them;
- As PHP continues to evolve towards stricter type safety
  (strict_types, typed properties, etc.), having more granular type
  validation functions supports this direction.
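
The first point can be illustrated with a short sketch. It only uses built-in behaviour (integer overflow turning into a float); the variable names are illustrative, and the last line assumes a 64-bit build.

```php
<?php
// Adding to PHP_INT_MAX does not raise an error: the result silently
// becomes a float.
$max = PHP_INT_MAX;
$overflowed = $max + 1;

var_dump(is_int($max));          // bool(true)
var_dump(is_float($overflowed)); // bool(true)

// Assumption: 64-bit build. The float cannot distinguish PHP_INT_MAX
// from PHP_INT_MAX + 1, because both round to 2^63.
var_dump($overflowed === (float) $max);
```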

If this proposition gets some interest, I'll take care of the
implementation in the engine. I guess it would also require an RFC.

Best,
Alex

[1] https://externals.io/message/128032
[2] https://github.com/symfony/symfony/blob/d3a0df0243e1c68598fa066eaa6cd0cf39f256dc/src/Symfony/Component/JsonPath/JsonPathUtils.php#L243

3 months ago by Rowan Tommins [IMSoP]

I would like to gather feedback on the proposal to add an
"is_integer_safe()" function to PHP.

From the name, I don't understand what this function would do. Do I pass in an integer, and get a boolean telling me if it's safe? Safe for what? Do I pass in some other value and get a boolean if it's safe to convert to an integer? What value exactly, and safe in what way?

That tells me two things:

1. You need to flesh out your proposal to be explicit about what it would do, as well as why.
2. The function needs a better name, to avoid confusion over what "safe" means.

Regards,

Rowan Tommins
[IMSoP]

3 months ago by Alexandre Daubois

Hello Rowan,

On Fri, 25 Jul 2025 at 23:10, Rowan Tommins [IMSoP]
<imsop.php@rwec.co.uk> wrote:

You need to flesh out your proposal to be explicit about what it would do, as well as why.

Indeed, I was too focused on the why and I forgot the most important part.

The idea is to have a function that receives an integer or a float and
returns true if the provided argument is inside the safe JavaScript
integer interval, namely [-(2^53)+1 ; (2^53)-1]. Its signature would
be is_integer_safe(int|float $num): bool. This interval is
considered safe because the floating-point mantissa is stored on 52
bits. This is nicely described in MDN [1].
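
As a rough illustration of that interval, a userland sketch could look like the following. The names is_integer_safe_sketch() and MAX_SAFE_INTEGER are hypothetical, and rejecting fractional floats, NaN and INF is an assumption borrowed from JavaScript's Number.isSafeInteger(), not something the proposal states.

```php
<?php
// Hypothetical constant: the upper bound of the interval, 2^53 - 1.
const MAX_SAFE_INTEGER = 2 ** 53 - 1; // 9007199254740991

// Hypothetical userland sketch, not an existing PHP function.
function is_integer_safe_sketch(int|float $num): bool
{
    // Assumption: non-finite and fractional floats are never "safe integers".
    if (is_float($num) && (!is_finite($num) || floor($num) !== $num)) {
        return false;
    }

    return $num >= -MAX_SAFE_INTEGER && $num <= MAX_SAFE_INTEGER;
}

var_dump(is_integer_safe_sketch(2 ** 53 - 1)); // bool(true)
var_dump(is_integer_safe_sketch(2 ** 53));     // bool(false)
var_dump(is_integer_safe_sketch(3.5));         // bool(false)
```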

The function needs a better name, to avoid confusion over what "safe" means.

I agree with you. I haven't come up with a better name yet as this one
could be ambiguous.

Best,
Alex

[1] https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Number/MAX_SAFE_INTEGER#description

3 months ago by Rowan Tommins [IMSoP]

The idea is to have a function that receives an integer or a float and
returns bool if the provided argument is inside the safe Javascript
integer interval, namely [-(2^53)+1 ; (2^53)-1].

That suggests maybe the name should communicate "safe for JavaScript";
or more generally "safe for 64-bit IEEE floating point".

is_safe_for_js()
is_safe_for_float()

Its signature would
be is_integer_safe(int|float $num): bool.

I'm not sure if accepting floats in the same function makes sense. If
the question is "can this value safely be stored in a float?" and you
give it a float, then the answer is surely "yes". You have no way of
knowing if it already lost precision during a previous operation.

Possibly there could be a separate function which asks "can this float
value be safely converted to an integer?", but that implies a different
definition - it wouldn't make much sense to answer "yes" for 3.5, or NaN.

That then makes me wonder if this is really part of a family of
functions relating to casts - something I've been thinking about off and
on for some time. Namely, "can this value be losslessly converted to
this type?"

My thinking is that this needs new syntax, to avoid dozens of specific
functions. For instance, a generic (or generic-like) form:

function can_lossless_cast<T>(mixed $value): bool

can_lossless_cast<float>(2 ** 50) === true
can_lossless_cast<float>(2 ** 54) === false

can_lossless_cast<int>(2.0 ** 54) === true
can_lossless_cast<int>(2.0 ** 65) === false
can_lossless_cast<int>(3.5) === false
can_lossless_cast<int>(NaN) === false

can_lossless_cast<int>('9007199254740991000') == true // less than 2^64
can_lossless_cast<float>('9007199254740991000') == false // more than 2^53

can_lossless_cast<int|float>('9007199254740991000') == true
can_lossless_cast<int|float>('3.5') == true

Regards,

--
Rowan Tommins
[IMSoP]

3 months ago by Alexandre Daubois

That suggests maybe the name should communicate "safe for JavaScript"; or more generally "safe for 64-bit IEEE floating point".

is_safe_for_js()
is_safe_for_float()

I'm not sure we should mention Javascript, because it's actually
related to floating point storage.

I'm not sure if accepting floats in the same function makes sense. If the question is "can this value safely be stored in a float?" and you give it a float, then the answer is surely "yes". You have no way of knowing if it already lost precision during a previous operation.

Possibly there could be a separate function which asks "can this float value be safely converted to an integer?", but that implies a different definition - it wouldn't make much sense to answer "yes" for 3.5, or NaN.

I think we can see this function more like "can this number be
compared safely?", e.g. var_dump(2**53 === (2**53)+1); returns true
as these numbers are not in the safe interval. A name like
is_safely_comparable() would fit better maybe.

My thinking is that this needs new syntax, to avoid dozens of specific functions. For instance, a generic (or generic-like) form:

function can_lossless_cast<T>(mixed $value): bool

can_lossless_cast<float>(2 ** 50) === true
can_lossless_cast<float>(2 ** 54) === false

can_lossless_cast<int>(2.0 ** 54) === true
can_lossless_cast<int>(2.0 ** 65) === false
can_lossless_cast<int>(3.5) === false
can_lossless_cast<int>(NaN) === false

can_lossless_cast<int>('9007199254740991000') == true // less than 2^64
can_lossless_cast<float>('9007199254740991000') == false // more than 2^53

can_lossless_cast<int|float>('9007199254740991000') == true
can_lossless_cast<int|float>('3.5') == true

I really like the idea of generic functions, I however imagine it
would bring a lot more complexity and really profound changes to the
engine, well beyond the scope of the proposal. Maybe the better naming
of the function would solve this?

Best,
Alex

3 months ago by Claude Pache

On 26 Jul 2025 at 18:13, Alexandre Daubois <alex.daubois+php@gmail.com> wrote:

I'm not sure if accepting floats in the same function makes sense. If the question is "can this value safely be stored in a float?" and you give it a float, then the answer is surely "yes". You have no way of knowing if it already lost precision during a previous operation.

Possibly there could be a separate function which asks "can this float value be safely converted to an integer?", but that implies a different definition - it wouldn't make much sense to answer "yes" for 3.5, or NaN.

I think we can see this function more like "can this number be
compared safely?", e.g. var_dump(2**53 === (2**53)+1); returns true
as these numbers are not in the safe interval. A name like
is_safely_comparable() would fit better maybe.

This is not correct: 2**53 + 1 is perfectly “safe” (for 64-bit builds of PHP), see: https://3v4l.org/P939d

The specific notion of “safe integer” as introduced in JavaScript makes sense only for numbers encoded using IEEE 754, which is what PHP uses for float. In PHP, there is a specialised type for integers, so that the need of such a function is not clear, because every integer encoded as int is “safe”. Or maybe you want something like is_safe_integer_when_interpreted_as_float()?
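
A small sketch of the difference between the two types, assuming a 64-bit build of PHP where both operands are of type int:

```php
<?php
// Assumption: 64-bit build, so 2**53 and 2**53 + 1 are both ints.
$a = 2 ** 53;
$b = 2 ** 53 + 1;

// As ints the two values are distinct.
var_dump($a === $b);                 // bool(false)

// Once converted to float, 2^53 + 1 rounds to 2^53 and they collide.
var_dump((float) $a === (float) $b); // bool(true)
```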

Also, note that the particular Symfony example given at the beginning of this thread uses a function that expects a string, not an int or a float. In that example, something like is_numeric_value_producing_exact_integer_when_interpreted_as_float() could have been useful, but this use case is very specialised.

—Claude

3 months ago by Alexandre Daubois

On Mon, 28 Jul 2025 at 12:18, Claude Pache <claude.pache@gmail.com> wrote:

This is not correct: 2**53 + 1 is perfectly “safe” (for 64-bit builds of PHP), see: https://3v4l.org/P939d

The specific notion of “safe integer” as introduced in JavaScript makes sense only for numbers encoded using IEEE 754, which is what PHP uses for float. In PHP, there is a specialised type for integers, so that the need of such a function is not clear, because every integer encoded as int is “safe”. Or maybe you want something like is_safe_integer_when_interpreted_as_float()?

Also, note that the particular Symfony example given at the beginning of this thread uses a function that expects a string, not an int or a float. In that example, something like is_numeric_value_producing_exact_integer_when_interpreted_as_float() could have been useful, but this use case is very specialised.

—Claude

Right! I think Gina proposed a nice alternative. I wrote a
few specs in my latest message in this thread, in response to Gina.
Please let me know if anything is unclear or bothers you.

Best,
Alexandre Daubois

3 months ago by tim@bastelstu.be

On 2025-07-26 15:45, Rowan Tommins [IMSoP] wrote:

can_lossless_cast<float>(2 ** 54) === false

That one is incorrect, 2^54 can be exactly represented as a float.
The magic 2^53 boundary just means that the difference between
consecutive representable values becomes larger than 1.

See: https://float.exposed/0x4350000000000000 and try incrementing or
decrementing the "Raw Decimal Integer Value".
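
The same observation can be checked from PHP with an int-to-float-to-int round trip. This is only a sketch, assuming a 64-bit build; the closure name $roundTrips is illustrative.

```php
<?php
// Assumption: 64-bit build, so every operand below is an int that is
// comfortably inside the int range after the float conversion.
$roundTrips = static fn (int $n): bool => (int) (float) $n === $n;

var_dump($roundTrips(2 ** 54));     // bool(true): a power of two is exact
var_dump($roundTrips(2 ** 53 + 1)); // bool(false): rounds to 2^53
var_dump($roundTrips(2 ** 53 + 2)); // bool(true): the spacing is 2 here

// In float arithmetic, adding 1 at the boundary changes nothing.
var_dump((float) (2 ** 53) + 1.0 === (float) (2 ** 53)); // bool(true)
```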

Best regards
Tim Düsterhus

3 months ago by Gina P. Banyard

Naming aside, I would be very much in favour of anything that makes it easier to make bidirectional transformations between floats and ints reliable.
For naming, maybe the following pair of functions make sense?

is_representable_as_int()
is_representable_as_float()

Best regards,

Gina P. Banyard

3 months ago by Alexandre Daubois

On Mon, 28 Jul 2025 at 12:14, Gina P. Banyard <internals@gpb.moe> wrote:

For naming, maybe the following pair of functions make sense?

is_representable_as_int()

Definitely, thank you for proposing. Having a pair of functions for
bidirectional transformations makes much more sense than a single
ambiguous function. Here are some quick specs for what these functions
could do.

is_representable_as_int(mixed $value): bool checks if a given value
can be losslessly converted to an integer.

- Floats: returns true if the float is within PHP_INT_MIN to
  PHP_INT_MAX range AND has no fractional part
- Strings: returns true if the string represents a valid integer
  within the platform's integer range
- Integers: always returns true
- Special float values: returns false for NaN, INF, -INF

This gives the following example:

is_representable_as_int(42.0);        // true
is_representable_as_int(42.5);        // false
is_representable_as_int(2**31);       // false on 32-bit, true on 64-bit
is_representable_as_int(2**63);       // false on both platforms
is_representable_as_int("123");       // true
is_representable_as_int("123.0");     // true
is_representable_as_int("123.5");     // false
is_representable_as_int(NAN);         // false

is_representable_as_float()

Now, is_representable_as_float(mixed $value): bool. The function
would check if a value can be represented as a float without precision
loss.

- Integers: returns true if within the IEEE 754 safe integer range
  (+/-(2^53-1)) regardless of the system's PHP_INT_MAX
- Strings: returns true if parseable as a float within safe precision bounds
- Floats: always returns true
- The IEEE 754 safe integer limit applies universally

This gives the following examples:

is_representable_as_float(2**53 - 1); // true
is_representable_as_float(2**53);     // false, precision loss when cast to float
is_representable_as_float(PHP_INT_MAX); // false on 64-bit, true on 32-bit
is_representable_as_float(42);        // true
is_representable_as_float("123.456"); // true
is_representable_as_float("1e308");   // true
is_representable_as_float("1e400");   // false

What do you think of this new approach?

Best,
Alexandre Daubois

3 months ago by tim@bastelstu.be

On 2025-07-28 15:00, Alexandre Daubois wrote:

is_representable_as_float(2**53); // false, precision loss when cast to float

Similarly to my other email: This is false. 2**53 is exactly
representable as float. Every power of two (<= 1024) is.

is_representable_as_float("1e308"); // true

This is false. 1e308 is not exactly representable. The nearest
representable value is 1.00000000000000001098e+308:
https://float.exposed/0x7fe1ccf385ebc8a0
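
For integer inputs, the exact rule can be sketched without any float conversion: an int survives the cast when it has at most 53 significant bits once trailing zero bits are dropped. The helper name below is hypothetical, and the sketch deliberately does not cover strings such as "1e308".

```php
<?php
// Hypothetical helper: true when an int converts to float without rounding.
function int_is_exact_as_float(int $n): bool
{
    if ($n === PHP_INT_MIN) {
        // A negative power of two, hence exact; abs() would overflow here.
        return true;
    }

    $m = abs($n);
    while ($m !== 0 && ($m & 1) === 0) {
        $m >>= 1; // trailing zero bits only shift the exponent
    }

    return $m < 2 ** 53; // at most 53 significant bits remain
}

var_dump(int_is_exact_as_float(2 ** 53));     // bool(true)
var_dump(int_is_exact_as_float(2 ** 53 + 1)); // bool(false) on 64-bit builds
var_dump(int_is_exact_as_float(2 ** 54));     // bool(true)
```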

Best regards
Tim Düsterhus

3 months ago by tim@bastelstu.be

On 2025-07-28 15:35, Tim Düsterhus wrote:

Similarly to my other email: This is false. 2**53 is exactly
representable as float. Every power of two (<= 1024) is.

Small correction: That should've read <= 1023 (or < 1024).

Best regards
Tim Düsterhus

3 months ago by Larry Garfield

On Mon, 28 Jul 2025 at 12:14, Gina P. Banyard <internals@gpb.moe> wrote:

For naming, maybe the following pair of functions make sense?

is_representable_as_int()

Definitely, thank you for proposing. Having a pair of functions for
bidirectional transformations makes much more sense than a single
ambiguous function. Here are some quick specs for what these functions
could do.

is_representable_as_int(mixed $value): bool checks if a given value
can be losslessly converted to an integer.

Floats: returns true if the float is within PHP_INT_MIN to
PHP_INT_MAX range AND has no fractional part

Strings: returns true if the string represents a valid integer
within the platform's integer range

Always returns true for integers

Special float values: returns false for NaN, INF, -INF

This gives the following example:
is_representable_as_int(42.0);        // true
is_representable_as_int(42.5);        // false
is_representable_as_int(2**31);       // false on 32-bit, true on 64-bit
is_representable_as_int(2**63);       // false on both platforms
is_representable_as_int("123");       // true
is_representable_as_int("123.0");     // true
is_representable_as_int("123.5");     // false
is_representable_as_int(NAN);         // false

Even without the precision loss question, such a function would be very useful when parsing input data of unknown type (which therefore arrives as a string). "Can I cast this string to an int to use with an int parameter without any loss" is currently not as simple as it ought to be. Providing a single standardized function for that would be most welcome.

The name is quite long though. :-)

is_representable_as_float()

The same argument applies here, although in most cases is_numeric() is close enough. A more precise option would not be unwelcome, though.

--Larry Garfield

3 months ago by Rowan Tommins [IMSoP]

Even without the precision loss question, such a function would be very useful when parsing input data of unknown type (which therefore arrives as a string). "Can I cast this string to an int to use with an int parameter without any loss" is currently not as simple as it ought to be. Providing a single standardized function for that would be most welcome.

The name is quite long though. 🙂

This is basically where I was going with "can_lossless_cast". The reason
I went with a generic-like syntax is that it makes it extensible to
any representable type, without an ever-growing list of long names like
"is_representable_as_int_or_float".

I also see it as part of a wider set of missing cast functions; for any
value V and type T:

1. Is V of type T?
2. Can V be safely converted to T (for some definition of "safe")?
3. If V can safely be converted to T, give that result; otherwise, throw
   an Exception|Error
4. If V can safely be converted to T, give that result, else return null
5. If V can safely be converted to T, give that result, else return a
   default value of type T

Given (5), you can implement (2), (3), and (4), but not elegantly,
particularly if null is a valid output, e.g.:

( cast<?int>($v, default: false) !== false )
    ? cast<?int>($v, default: false)
    : throw new TypeError('Not null, and not castable to int');

cast_or_throw<?int>($v)
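
For comparison, variants (3) to (5) can be approximated today for the string-to-int case only. In this sketch the function names are hypothetical, and "safe" is assumed to mean whatever filter_var() accepts with FILTER_VALIDATE_INT, which is just one possible definition.

```php
<?php
// (4) Hypothetical: the int when the string validates, null otherwise.
function int_or_null(string $v): ?int
{
    return filter_var($v, FILTER_VALIDATE_INT, FILTER_NULL_ON_FAILURE);
}

// (5) Hypothetical: fall back to a caller-supplied default.
function int_or_default(string $v, int $default): int
{
    return int_or_null($v) ?? $default;
}

// (3) Hypothetical: throw when the string is not a valid int.
function int_or_throw(string $v): int
{
    return int_or_null($v) ?? throw new TypeError("Not castable to int: {$v}");
}

var_dump(int_or_null('42'));        // int(42)
var_dump(int_or_null('3.5'));       // NULL
var_dump(int_or_default('abc', 7)); // int(7)
```

This only works cleanly because null is not a valid result here; the nullable case is exactly where the dedicated syntax would help.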

I've got a half-drafted RFC around this, but stalled on a) naming; and
b) what a "safe" cast means for different types. It seems like both
problems have already come to light in this thread.

--
Rowan Tommins
[IMSoP]