[PATCH] double to long conversion change

16 years ago by Matt Wilmas

Hi all,

Last year I noticed and reported [1] that casting out-of-range doubles to
int behaves differently since the DVAL_TO_LVAL() macro was updated. Since
then I've wondered how to get back the behavior I had observed and thought
could be relied on (wrongly, since it was undefined or
implementation-defined), and what should be expected, while keeping in mind
the reason for the change: consistent behavior for tests. [2] However, the
current code does not give consistent results either; they depend on which
DVAL_TO_LVAL definition is used on a platform. [3]

[1] http://marc.info/?l=php-internals&m=120799720922202&w=2
[2] http://marc.info/?l=php-internals&m=123495655802226&w=2
[3] http://marc.info/?l=php-internals&m=123496364812725&w=2

So after I finally started to test my ideas for "consistent/reliable
overflow across platforms" a few days ago, I noticed that my workaround
technique quit working (0 instead of overflow) with doubles over 2^63,
without resorting to fmod(). That's on Windows, but I suspect the same may
happen on other systems that are limited to 64-bit integer processing
internally or something (32-bit platforms?). On 64-bit Linux anyway, it
looks like doubles > 2^63 do rollover as expected (128-bit "internal
processing?"): http://marc.info/?l=php-internals&m=123376495021789&w=2

I wasn't sure how to rethink things after that... But of course with
doubles, precision has been lost long before 2^63 anyway, as far as
increments of 1 (it's 1024 at 2^63).
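
To make the precision remark concrete, here is a minimal C sketch. It is not part of the patch, and the helper name smallest_visible_step is made up for illustration. It assumes IEEE 754 binary64 doubles with round-to-nearest and no excess precision, and looks for the smallest power-of-two decrement that actually changes a double:

```c
/* Hypothetical helper, not from the patch: returns the smallest
 * power-of-two step that, subtracted from x, yields a different double.
 * Assumes IEEE 754 binary64, round-to-nearest, no excess precision.
 * x must be finite and at least 1.0. */
static double smallest_visible_step(double x)
{
    double step = 1.0;

    /* Smaller steps are absorbed by rounding, so keep doubling. */
    while (x - step == x) {
        step *= 2.0;
    }
    return step;
}
```

Under those assumptions, calling it with 2^63 (9223372036854775808.0) should give 1024, while 2^53 (9007199254740992.0) still gives 1.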

What I wound up with for now is using 5.2's method on 64-bit platforms, and
on 32-bit, overflow behavior should be reliable up to 2^63 on platforms that
have a zend_long64 type available (long long, __int64), which I'm guessing
is most (?), because of the unsigned long involvement. Finally, there is a
fallback workaround for 32-bit platforms without a 64-bit type.

I updated a few other places in the code where only a (long) cast was used.
And sort of unrelated, but I added an 'L' conversion specifier for
zend_parse_parameters() in case it would be useful for PHP functions that
want to limit values to LONG_MAX/LONG_MIN, without overflow, which I thought
the DVAL_TO_LVAL change was trying to do.

http://realplain.com/php/dval_to_lval.diff
http://realplain.com/php/dval_to_lval_5_3.diff

And here is an initial version of zend_dval_to_lval() (before 2^63 issue and
thinking of zend_long64 + unsigned long), where some configure checks would
set ZEND_DVAL_TO_LVAL_USE_* as needed.

http://realplain.com/php/dval_to_lval.txt

Any general feedback, comments, questions, suggestions? Hoping these
conversion issues could be sorted out for good in a "nice," logical way. :-)
Unfortunately on Windows, I'm just guessing, rather than testing, conversion
results in different environments...

Thanks,
Matt

16 years ago by Dmitry Stogov

Hi Matt,

I tried to look into this issue once again, but I completely fail to
understand why we need all this magic. Why do we need conversion of a
positive double into a negative long?

I would stay with a single DVAL_TO_LVAL() definition and use it in places
instead of (long)Z_DVAL().

#define DVAL_TO_LVAL(d, l)          \
    if ((d) > LONG_MAX) {           \
        (l) = LONG_MAX;             \
    } else if ((d) < LONG_MIN) {    \
        (l) = LONG_MIN;             \
    } else {                        \
        (l) = (long) (d);           \
    }

Or maybe we need a second macro for conversion into unsigned long where
it is needed?

#define DVAL_TO_ULONG(d, l)         \
    if ((d) > ULONG_MAX) {          \
        (l) = ULONG_MAX;            \
    } else if ((d) < 0) {           \
        (l) = 0;                    \
    } else {                        \
        (l) = (unsigned long) (d);  \
    }

It is also possible to add notices in case of overflow detection.
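
For illustration, the clamping idea can also be sketched as a plain C function. This is an assumption-laden sketch rather than anything from the Zend sources: the name dval_to_lval_clamp is hypothetical, and it deviates from the macro in two places, both marked in the comments (it uses >= / <= because (double)LONG_MAX can round up to 2^63 when long is 64 bits, and it maps NaN to 0).

```c
#include <limits.h>

/* Hypothetical function form of a saturating double-to-long conversion.
 * Not the Zend implementation; a sketch under the assumptions below. */
static long dval_to_lval_clamp(double d)
{
    /* Assumption: NaN has no sensible integer value, so 0 is returned. */
    if (d != d) {
        return 0;
    }
    /* (double)LONG_MAX may round up to 2^63 with a 64-bit long, so
     * compare with >= to keep the cast below inside the valid range. */
    if (d >= (double)LONG_MAX) {
        return LONG_MAX;
    }
    if (d <= (double)LONG_MIN) {
        return LONG_MIN;
    }
    /* In range: the cast truncates toward zero. */
    return (long)d;
}
```

With this sketch, any out-of-range input collapses to LONG_MAX or LONG_MIN, so the low-order bits of the original value are not recoverable afterwards.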

Thanks. Dmitry.

16 years ago by Matt Wilmas

Hi Dmitry,

----- Original Message -----
From: "Dmitry Stogov"
Sent: Thursday, April 02, 2009

Hi Matt,

I tried to look into this issue once again, but I completely fail to
understand why we need all this magic. Why do we need conversion of a
positive double into a negative long?

I don't really have any more information than what has been given in my
various earlier messages that I've referenced. :-) But it's no problem!
It's probably too much to keep track of, or try to find which message I said
something in, I know (I have to do that myself to refresh memory about some
parts). So feel free to ask for an explanation about anything. :-)

OK, regarding conversion of positive double into negative long (or it could
be positive if it "rolls over" again above ULONG_MAX, etc...): 1) for me,
the original issue I noticed, is preserving the least significant bits when
using bitwise AND on a large number (old ref again: [1]). Although now I
know the 5.2 behavior I was getting can't be relied on (<= ULONG_MAX it's
probably OK however), that's what I'm trying to do -- make conversions
consistent and reliable. And 2) unsigned specifiers in sprintf() (%u, %x,
etc.) rely on this conversion (though it currently won't work in 5.3 on
64-bit non-Windows). See references in Bugs #30695 and #42868.

[1] http://marc.info/?l=php-internals&m=120799720922202&w=2

The magic (different methods...?) is needed depending on what type of
conversion works on a platform. BTW, I wasn't satisfied with what I ended
up with for my patch (unsure about how some things would behave, some
guessing), so a few days ago I started to try coming up with something more
complete and precise depending on what works on a platform. Not done yet,
and will need to add some configure checks, etc. (new for me).

I would stay with a single DVAL_TO_LVAL() definition and use it in places
instead of (long)Z_DVAL().

That (single DVAL_TO_LVAL()) is basically what 5.2 has now until you added
more definitions (from Zoe) ;-) (which behave differently [2]) for 5.3 in
Nov '07 for Bug #42868.

[2] http://marc.info/?l=php-internals&m=123496364812725&w=2

#define DVAL_TO_LVAL(d, l)          \
    if ((d) > LONG_MAX) {           \
        (l) = LONG_MAX;             \
    } else if ((d) < LONG_MIN) {    \
        (l) = LONG_MIN;             \
    } else {                        \
        (l) = (long) (d);           \
    }

That's close to 5.3's new version (depending which is used for a platform),
and precisely what was committed to zend_operators.c in Sep '04 (v1.195
"Resolve undefined behavior (joe at redhat)" [3]). After Bug #30695, it was
reverted in Nov: v1.203 "Revert Joe's work around a bug in GCC patch as it
breaks too many things." [4]

[3]
http://cvs.php.net/viewvc.cgi/ZendEngine2/zend_operators.c?r1=1.194&r2=1.195&view=patch
[4]
http://cvs.php.net/viewvc.cgi/ZendEngine2/zend_operators.c?r1=1.202&r2=1.203&view=patch

Matt

16 years ago by Dmitry Stogov

Hi Matt,

I don't really see why we should "preserve the least significant bits"
and I don't think we should support bitwise operations with doubles.

Stas, could you please look into this too.

Thanks. Dmitry.

16 years ago by Matt Wilmas

Hi Dmitry,

I finally updated the patches:
http://realplain.com/php/dval_to_lval.diff
http://realplain.com/php/dval_to_lval_5_3.diff

After seeing how things work on Windows and [new for me] 32- and 64-bit
Linux, there's no longer all that magic stuff I had before. :-) Now the
conversion method is basically just like what 5.2 has, and its behavior
that's been there for years (I assume, for the majority of users, at least
for positive numbers). The difference is that on 32-bit, it now uses a
(zend_long64) cast, which should fix the inconsistency with the Mac that Zoe
reported in Bug #42868, which led to the change in 5.3. (It appears that
the (unsigned long) cast was limiting to ULONG_MAX, instead of overflowing
and preserving least significant bits like other platforms.)

NOTE: The (zend_long64) part is exactly the same thing you did (to ensure
overflow?), for positive values, in April '07, zend_operators.c v1.269 for
"WIN64 support" (I don't think Win64 is any different than Win32 -- a
difference in compilers between VC6 and newer? Yes).

This should handle the 64-bit range of values. From what I've been able to
check, once a double's exponent is above 64 (or is it 63? whatever), the
conversion doesn't work (no different than PHP 5.2), without fmod...
Outside of -2^63 and 2^63 on 32-bit Linux/Windows, you get 0; on 64-bit
Linux (like 5.2), above 2^64 is 0 and below -2^63 is LONG_MIN.
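
As a rough illustration of the idea (not the code in the patch), the following C sketch wraps a double into a 32-bit signed integer by going through a 64-bit integer first, so the low 32 bits survive. The function name dval_to_int32_wrap is hypothetical, int32_t/int64_t stand in for a 32-bit long and zend_long64, and returning 0 outside the 64-bit range is a choice made for the sketch that happens to match the 32-bit results described above.

```c
#include <stdint.h>

/* Hypothetical sketch: keep the low 32 bits of a double's integer part
 * by converting through a 64-bit integer. int64_t stands in for
 * zend_long64 and int32_t for a 32-bit long. */
static int32_t dval_to_int32_wrap(double d)
{
    /* Outside (-2^63, 2^63) the double-to-int64 cast is undefined in C.
     * Return 0 there; NaN also fails this test and ends up here. */
    if (!(d > -9223372036854775808.0 && d < 9223372036854775808.0)) {
        return 0;
    }
    /* int64 -> uint32 is a well-defined reduction modulo 2^32.
     * uint32 -> int32 is implementation-defined, but wraps on the
     * usual two's-complement compilers. */
    return (int32_t)(uint32_t)(int64_t)d;
}
```

For example, 2^32 + 5 (4294967301.0) comes out as 5 and 2^31 (2147483648.0) comes out as INT32_MIN, which is the "overflow" behavior being discussed rather than a clamp.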

I also added a configure check and flag if a simple (long) cast is
sufficient to preserve LSB. Don't know if I put it in the "right" place
(Zend.m4) or if it should be kept, but it's there.

BTW, I used a function so it's easier to use than the current macro. With
zend_always_inline, it should usually be the same...? I would have used a
macro with ?: but that was used previously before removal from
zend_operators.c in Sep '06, v1.256: "use if() instead of ?: and avoid
possible optimization problems."

More below...

----- Original Message -----
From: "Dmitry Stogov"
Sent: Friday, April 03, 2009

Hi Matt,

I don't really see why we should "preserve the least significant bits"

Because that's presumably what most users have seen, at least for positive
numbers, in previous versions for years? And I think it makes sense,
effectively truncating the number in that way. Obviously no out-of-range
number can be correct, but at least preserving LSB has a chance of giving
the correct result in the case of bitwise ANDing a large number with a small
one, for example.
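
A small C sketch of that bitwise AND case, under the assumption of a 32-bit integer type (int32_t here) and with made-up helper names, compares a clamping conversion with one that keeps the low bits:

```c
#include <stdint.h>

/* Hypothetical clamping conversion to a 32-bit integer. */
static int32_t clamp32(double d)
{
    if (d != d) return 0;                      /* NaN: 0 by choice */
    if (d >= 2147483647.0) return INT32_MAX;
    if (d <= -2147483648.0) return INT32_MIN;
    return (int32_t)d;
}

/* Hypothetical wrapping conversion; only valid for |d| < 2^63.
 * The final uint32 -> int32 step is implementation-defined but wraps
 * on the usual two's-complement compilers. */
static int32_t wrap32(double d)
{
    return (int32_t)(uint32_t)(int64_t)d;
}

/* AND a (possibly out-of-range) double with a small mask, both ways. */
static int32_t and_clamped(double d, int32_t mask)
{
    return clamp32(d) & mask;
}

static int32_t and_wrapped(double d, int32_t mask)
{
    return wrap32(d) & mask;
}
```

For d = 4294967301.0 (2^32 + 5) and mask 0xFF, the mathematically expected result is 5. The wrapped version yields 5, while the clamped version yields 255 because every out-of-range input turns into INT32_MAX first.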

You asked Stas to take a look as well, and I see that he was the one who
introduced the 5.2 and prior behavior to overflow/preserve LSB (again, for
most users, etc.) by adding the (unsigned long) cast in Jul '01,
zend_operators.c v1.105: "fix double->long conversion" (though it doesn't
say what was broken :-)).

and I don't think we should support bitwise operations with doubles.

But we do. :-) (I don't know if you're talking about keeping LSB or my
other "64-bit operators" patch for doubles on 32-bit platforms.)

It's not an error ("Invalid operand types"), even if out of long range. As
I said above, the 5.2 and prior method has a chance of working correctly in
some cases, unlike a limit.

Anyway, I don't see why bitwise operations with doubles wouldn't be
supported, as much as is possible. With the PHP way of "do what we can to
make it work" and all. :-) That was my thinking with the "64-bit operators"
patch -- it's possible, simple, and brings consistency with 64-bit
platforms.

Stas, could you please look into this too.

Thanks. Dmitry.

Thanks,
Matt

16 years ago by johannes@php.net

Hi Dmitry,

I finally updated the patches:
http://realplain.com/php/dval_to_lval.diff
http://realplain.com/php/dval_to_lval_5_3.diff

Has anybody (Dmitry?) reviewed this or other feedback?

johannes

16 years ago by Dmitry Stogov

I looked into it some time ago.
Really, I don't understand what it's going to fix (the need to preserve
low bits looks strange to me).
Also it breaks ~30 tests.

Thanks. Dmitry.

16 years ago by Stanislav Malyshev

Hi!

Really, I don't understand what it's going to fix (the need to preserve
low bits looks strange to me).

From what I understand, the patch is about what we do when we try to
convert a double that's too big for a long into a long (please correct me
if I'm talking nonsense here). We then have to "cut" it, and as I
understand it there are two approaches: either a bitwise cut, or putting in
some predetermined value like LONG_MAX. I also understand the former was
the behavior in 5.2 and before (but not in 5.3 as of now), and thus I think
it makes sense to keep this behavior unless there is some serious problem
with it, and I understand we don't know of any such problem now. Am I
right?

Also it breaks ~30 tests.

Now that should be checked out: are those directly related to the said
conversion (i.e. the issue is that the tests just expect different
behavior, which is natural, and we just have to fix the tests), or are
those independent tests?

Stanislav Malyshev, Zend Software Architect
stas@zend.com http://www.zend.com/
(408)253-8829 MSN: stas@zend.com

16 years ago by Matt Wilmas

Hi all,

----- Original Message -----
From: "Matt Wilmas"
Sent: Friday, April 10, 2009

Hi Dmitry,

I finally updated the patches:
http://realplain.com/php/dval_to_lval.diff
http://realplain.com/php/dval_to_lval_5_3.diff

The patches were updated again, against current CVS (no changes by me).

After some off- and on-list discussion weeks ago, it sounds like (from Stas)
these changes are acceptable to restore float->int conversion to the
long-standing behavior that exists through PHP 5.2 (for most
users/platforms), as long as the necessary tests were updated, which I've
done for ones I checked. At least I haven't heard any negative feedback
since my final proposal? :-)

Patches for test updates:
http://realplain.com/php/dval_to_lval_tests.diff
http://realplain.com/php/dval_to_lval_tests_5_3.diff

ext/standard/tests/url/parse_url_variation_002.phpt was split into 32- and
64-bit versions again (like 5.2). If other failing tests are found, let me
know.

I guess I could go ahead and commit the changes next week if there aren't
objections... Was wondering about two minor parts, explained previously, OK
to keep?:

Configure check for whether just a simple (long) cast gives the desired
result, to save runtime range checking. (Not sure how many platforms would
pass this, other than Windows' VC6.)

The 'L' conversion specifier for zend_parse_parameters() (I've updated
README.PARAMETER_PARSING_API). Like I said, some functions could be updated
to use this (with "limit" or "length" type params) to prevent
overflow-related weirdness if huge numbers are passed. It should be noted
that such unexpected behavior has existed through 5.2, however, though the
changes in 5.3 started to prevent it in a lot of cases (very inconsistent
though; Bug #47854, etc.).

Thanks,
Matt

(Previous message continues below, for some explanation of final version.)

After seeing how things work on Windows and [new for me] 32- and 64-bit
Linux, there's no longer all that magic stuff I had before. :-) Now the
conversion method is basically just like what 5.2 has, and its behavior
that's been there for years (I assume, for the majority of users, at least
for positive numbers). The difference is that on 32-bit, it now uses a
(zend_long64) cast, which should fix the inconsistency with the Mac that
Zoe reported in Bug #42868, which led to the change in 5.3. (It appears
that the (unsigned long) cast was limiting to ULONG_MAX, instead of
overflowing and preserving least significant bits like other platforms.)

NOTE: The (zend_long64) part is exactly the same thing you did (to ensure
overflow?), for positive values, in April '07, zend_operators.c v1.269 for
"WIN64 support" (I don't think Win64 is any different than Win32 -- a
difference in compilers between VC6 and newer? Yes).

This should handle the 64-bit range of values. From what I've been able
to check, once a double's exponent is above 64 (or is it 63? whatever),
the conversion doesn't work (no different than PHP 5.2), without fmod...
Outside of -2^63 and 2^63 on 32-bit Linux/Windows, you get 0; on 64-bit
Linux (like 5.2), above 2^64 is 0 and below -2^63 is LONG_MIN.

I also added a configure check and flag if a simple (long) cast is
sufficient to preserve LSB. Don't know if I put it in the "right" place
(Zend.m4) or if it should be kept, but it's there.

BTW, I used a function so it's easier to use than the current macro. With
zend_always_inline, it should usually be the same...? I would have used a
macro with ?: but that was used previously before removal from
zend_operators.c in Sep '06, v1.256: "use if() instead of ?: and avoid
possible optimization problems."

16 years ago by Matt Wilmas

Hi again,

OK, I will go ahead and commit the patch + test updates in about 24 hours...

Matt

16 years ago by Daniel Convissor

Hi Matt:

But of course with
doubles, precision has been lost long before 2^63 anyway, as far as
increments of 1 (it's 1024 at 2^63).

I just ran into these issues in PHP 5.2.8 on 64 bit Linux while running
examples I'm using to improve our documentation surrounding bitwise
operators, bindec() and decbin().

The integer float conversions start failing above 2^53, meaning
9007199254740993 is the first integer that doesn't work. As you noted,
ints divisible by 1024 are fine. I can provide the test script to anyone
who needs it.
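
The 2^53 boundary can be checked with a short C sketch. It is independent of PHP and only assumes IEEE 754 binary64 doubles without excess precision; the helper names are invented for the example.

```c
#include <stdint.h>

/* Hypothetical helper: does n survive a trip through double unchanged?
 * Assumes IEEE 754 binary64 and |n| well below 2^63, so the cast back
 * to int64_t stays in range. */
static int round_trips(int64_t n)
{
    return (int64_t)(double)n == n;
}

/* Hypothetical helper: first integer in [start, end) that does not
 * round-trip, or -1 if all of them do. */
static int64_t first_non_round_trip(int64_t start, int64_t end)
{
    int64_t n;

    for (n = start; n < end; n++) {
        if (!round_trips(n)) {
            return n;
        }
    }
    return -1;
}
```

Scanning a window around 2^53, for instance first_non_round_trip(9007199254740000, 9007199254741000), should report 9007199254740993, matching the number quoted above.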

A bash script I put together makes it look like things work correctly in
the shell. So it seems this is a PHP thing, not a Linux thing.

Am I correct in believing your changes will take care of these issues?
It seems you're looking to patch 5.3 and HEAD. Are there thoughts of
applying this to 5.2?

Thanks,

--Dan

Note: The guy who put the box together says everything on it is built
against 64 bit libraries.

--
T H E A N A L Y S I S A N D S O L U T I O N S C O M P A N Y
data intensive web and database programming
http://www.AnalysisAndSolutions.com/
4015 7th Ave #4, Brooklyn NY 11232 v: 718-854-0335 f: 718-854-0409

16 years ago by Matt Wilmas

Hi Dan,

----- Original Message -----
From: "Daniel Convissor"
Sent: Thursday, April 02, 2009

Hi Matt:

But of course with
doubles, precision has been lost long before 2^63 anyway, as far as
increments of 1 (it's 1024 at 2^63).

I just ran into these issues in PHP 5.2.8 on 64 bit Linux while running
examples I'm using to improve our documentation surrounding bitwise
operators, bindec() and decbin().

The integer float conversions start failing above 2^53, meaning
9007199254740993 is the first integer that doesn't work. As you noted,
ints divisible by 1024 are fine. I can provide the test script to anyone
who needs it.

Sorry I didn't get back to you sooner, but unless I'm missing something,
you're talking about converting long/int to double/float. That's the
opposite of this thread subject, which is how to convert a double to a long
when it's out of the range of a long. :-)

But, for what you're testing, that's the behavior I'd expect -- once you've
reached the precision of a double, you'll only get the closest
representation possible (and of course a 64-bit long is more precise than a
double since there's no floating point to represent). Also, I assume what
can be represented by a double is the same across platforms, if it's IEEE 754.

Just curious though, you're saying that all whole numbers (from long) below
2^53 are representable? (Powers of 2 should always be OK.) When writing a
big literal number on a 32-bit system, I'm seeing much lower than that
(around 2^40), but in that case conversion is happening from a string,
digit-by-digit, instead of directly from a 64-bit integer type, so maybe
that's why...
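
As a rough check on the 2^53 figure: a double has a 53-bit significand, so the gap between neighboring doubles is 1 from 2^52 up to 2^53, 2 from 2^53 up to 2^54, and so on. Below is a minimal C sketch, assuming binary IEEE 754 doubles; double_spacing_at is a hypothetical name.

```c
#include <assert.h>
#include <float.h>

/* Hypothetical helper: gap between x and the next larger double,
 * for finite x >= 1. Assumes binary IEEE 754 doubles (FLT_RADIX == 2). */
double double_spacing_at(double x)
{
    double gap = DBL_EPSILON; /* spacing in [1, 2): 2^-52 */

    assert(x >= 1.0 && x <= DBL_MAX);

    /* Each doubling of the magnitude doubles the spacing. */
    while (x >= 2.0) {
        x /= 2.0;
        gap *= 2.0;
    }
    return gap;
}
```

Under those assumptions the gap is 1 at 2^52, 2 at 2^53, 1024 at 2^62 (and all the way up to just below 2^63), and 2048 from 2^63 upward. So every whole number up to 2^53 has an exact double, and beyond that only multiples of the gap do.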

> A bash script I put together makes it look like things work correctly in
> the shell. So it seems this is a PHP thing, not a Linux thing.

That's interesting, though I don't know much about shell script stuff (or
other *nix details ;-)). Like I said, I figure a double type should behave
the same everywhere. Unless the shell/bash uses a long double type (twice
as big as a regular double)? shrug

> Am I correct in believing your changes will take care of these issues?
> It seems you're looking to patch 5.3 and HEAD. Are there thoughts of
> applying this to 5.2?
>
> Thanks,
>
> --Dan
>
> Note: The guy who put the box together says everything on it is built
> against 64 bit libraries.

Matt

16 years ago by Daniel Convissor — view source

Hi Matt, and everyone:

> unless I'm missing something,
> you're talking about converting long/int to double/float. That's the
> opposite of this thread subject, which is how to convert a double to a
> long when it's out of the range of a long. :-)

It's a two way street. If the floats don't have enough precision, things
get jumbled when converting the floats back to integers.

> But, for what you're testing, that's the behavior I'd expect -- once
> you've reached the precision of a double, you'll only get the closest
> representation possible (and of course a 64-bit long is more precise than
> a double since there's no floating point to represent). Also, I assume
> what can be represented by a double is the same across platforms, if it's
> IEEE 754.

Yes. But I was expecting that since long on 64-bit machines holds 64
bits in PHP (et al), that PHP would use C's long double type for floats
on 64-bit platforms rather than plain old doubles. It seems like the
user-friendly, PHP way to handle the situation, particularly as 64-bit
computers are commonplace these days.

> Just curious though, you're saying that all whole numbers (from long)
> below 2^53 are representable? (Powers of 2 should always be OK.) When
> writing a big literal number on a 32-bit system, I'm seeing much lower
> than that (around 2^40)

I'm talking about 64-bit machines.

> Like I said, I figure a double type should
> behave the same everywhere. Unless the shell/bash uses a long double
> type (twice as big as a regular double)? shrug

Exactly.

The test scripts in question are now available for download from
http://www.analysisandsolutions.com/php/intfloat/

Thanks,

--Dan

16 years ago by Christian Seiler — view source

Hi Daniel,

>> But, for what you're testing, that's the behavior I'd expect -- once
>> you've reached the precision of a double, you'll only get the closest
>> representation possible (and of course a 64-bit long is more precise than
>> a double since there's no floating point to represent). Also, I assume
>> what can be represented by a double is the same across platforms, if it's
>> IEEE 754.
>
> Yes. But I was expecting that since long on 64-bit machines holds 64
> bits in PHP (et al), that PHP would use C's long double type for floats
> on 64-bit platforms rather than plain old doubles. It seems like the
> user-friendly, PHP way to handle the situation, particularly as 64-bit
> computers are commonplace these days.

If you talk about 64 bit platforms, the 64 bit refers only to integers -
not floats. There is no difference between the supported floating point
types and operations of the most recent "32 bit" x86 processors and the
newer "64 bit" x86_64 counterparts. Both have an i387 compatible FPU and
both implement the SSE standard for vectorized floating point operations.

SSE only supports single precision (23 + 8 + 1 = 32 bit) and double
precision (52 + 11 + 1 = 64 bit) data types for vectorized operations.
The i387 FPU supports single precision, double precision and a
proprietary Intel "double extended" precision data type which uses 80
bits (actually only 79 are really necessary).

Other processor architectures may only support single and double
precision data types, yet others support some kind of "double double"
precision which is a combination of two double precision values (total
128 bits) to support higher mantissa (but not higher exponent) and yet
others support a real quad precision data type which has a larger
mantissa and exponent.

Some compilers allow the use of any of the above data types (Intel
"double extended", "double-double", quad) via the »long double« C data
type. Others (ESPECIALLY ALL the Microsoft compilers!) do not support
»long double« but rather make »long double« be a normal double value.
And for double-double data types the calculations are nearly always done
with software emulation.

So basically the situation is the following: You have 4 different
possible data types for "long double" in C. Each with different mantissa
and different exponent and 3 different sizes (64, 80 and 128 bit). To
me, this sounds like hell.
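
Which of these cases a given compiler picks can be read from the significand width that <float.h> reports. Below is a minimal C sketch, assuming a binary floating point format (FLT_RADIX == 2); the enum and function names are hypothetical and only label the four cases listed above.

```c
#include <float.h>

/* Hypothetical labels for the four cases listed above. */
enum long_double_kind {
    LD_SAME_AS_DOUBLE,  /* 53-bit significand, long double is a double */
    LD_X87_EXTENDED,    /* 64-bit significand, 80-bit x87 format */
    LD_DOUBLE_DOUBLE,   /* 106-bit significand, pair of doubles */
    LD_QUAD,            /* 113-bit significand, 128-bit quad format */
    LD_UNKNOWN
};

/* Classify a long double by its significand width in bits, as reported
 * by LDBL_MANT_DIG from <float.h>. */
enum long_double_kind classify_long_double(int mant_dig)
{
    switch (mant_dig) {
    case 53:  return LD_SAME_AS_DOUBLE;
    case 64:  return LD_X87_EXTENDED;
    case 106: return LD_DOUBLE_DOUBLE;
    case 113: return LD_QUAD;
    default:  return LD_UNKNOWN;
    }
}

/* Kind of long double on the platform this file is compiled for. */
enum long_double_kind this_platform_long_double(void)
{
    return classify_long_double(LDBL_MANT_DIG);
}
```

The same source can return a different value from this_platform_long_double() depending on the compiler and target, which is the portability problem described here.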

One of the changes I made in PHP 5.3 was actually to make sure all
platforms use the normal IEEE double data type also for calculations.
(there is something like "internal precision" in x87 compatible FPUs
which makes life very complicated) This was done to ensure portability
of the code. Because with floating point operations, even "simple"
numbers (such as 0.01 or 1.0/3.0) cannot be exactly represented by a
computer and thus every bit of the mantissa is required to approximate
the number. If you then have different precisions on different platforms
or even with different compilers, life gets complicated. For example, if
you check that $a + $b >= $c on one system, this need not be true on
another system if the precisions don't match - there are examples in
both ways! So please, please, please: Don't complicate life and
introduce "long double" in PHP. At least as long as there is no
standardized >64 bit floating point data type that works across
platforms and compilers.
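
How sensitive such comparisons are to rounding can be seen even on a single platform. Below is a minimal C sketch, assuming IEEE 754 doubles; sum_equals is a hypothetical helper, written in C rather than PHP.

```c
/* Hypothetical helper: does a + b compare equal to c once the sum has
 * been rounded to a plain 64-bit double? */
int sum_equals(double a, double b, double c)
{
    /* volatile forces the sum to be rounded to double precision instead
     * of staying in a wider x87 register. */
    volatile double sum = a + b;
    return sum == c;
}
```

With IEEE doubles, sum_equals(0.5, 0.25, 0.75) is 1 because all three values are exact in binary, while sum_equals(0.1, 0.2, 0.3) is 0 because none of the three is. A comparison that sits this close to a rounding boundary is the kind that can flip when the precision changes.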

Regards,
Christian

[PATCH] double to long conversion change

Now that should be checked out: are those directly related to the said
conversion (i.e. the tests just expect different behavior, which is
natural, and we just have to fix the tests), or are those independent
tests?