Hi all,
Assuming there are no objections, I'll commit this fix in a few hours...
Besides the bug report(s), I had also found awhile ago that currently an
array key can be LONG_MAX or LONG_MIN as a string and/or integer because of
a check in ZEND_HANDLE_NUMERIC() (I assume to avoid a slow errno check for
ERANGE originally). I changed it to use the same method that's used in
the scanner, is_numeric_string(), etc., and those 2 values are now treated
as an integer.
It's just a few lines changed in zend_hash.h, although I had to move some of
the definitions from zend_operators.h to zend.h (is that OK?).
Patches (didn't check HEAD's yet):
http://realplain.com/php/array_key_limit.diff
http://realplain.com/php/array_key_limit_5_3.diff
- Matt
Hi Matt,
I have no objections against proposed behaviour, but can't we use just a
check for (errno == ERANGE) after strtol()?
Thanks. Dmitry.
Matt Wilmas wrote:
Hi all,
Assuming there are no objections, I'll commit this fix in a few hours...
Besides the bug report(s), I had also found awhile ago that currently an
array key can be LONG_MAX or LONG_MIN as a string and/or integer because
of a check in ZEND_HANDLE_NUMERIC() (I assume to avoid a slow errno
check for ERANGE originally). I changed it to use the same method
that's used in the scanner, is_numeric_string(), etc., and those 2
values are now treated as an integer.It's just a few lines changed in zend_hash.h, although I had to move
some of the definitions from zend_operators.h to zend.h (is that OK?).Patches (didn't check HEAD's yet):
http://realplain.com/php/array_key_limit.diff
http://realplain.com/php/array_key_limit_5_3.diff
- Matt
BTW may be we should eliminate strtol() at all.
There's no need to scan the string twice.
Dmitry.
Matt Wilmas wrote:
Hi all,
Assuming there are no objections, I'll commit this fix in a few hours...
Besides the bug report(s), I had also found awhile ago that currently an
array key can be LONG_MAX or LONG_MIN as a string and/or integer because
of a check in ZEND_HANDLE_NUMERIC() (I assume to avoid a slow errno
check for ERANGE originally). I changed it to use the same method
that's used in the scanner, is_numeric_string(), etc., and those 2
values are now treated as an integer.It's just a few lines changed in zend_hash.h, although I had to move
some of the definitions from zend_operators.h to zend.h (is that OK?).Patches (didn't check HEAD's yet):
http://realplain.com/php/array_key_limit.diff
http://realplain.com/php/array_key_limit_5_3.diff
- Matt
Hi Dmitry,
----- Original Message -----
From: "Dmitry Stogov"
Sent: Wednesday, March 18, 2009
BTW may be we should eliminate strtol() at all.
There's no need to scan the string twice.
Your change to remove strtol() [1] is not checking for overflow correctly
(for example, zend_u_strtol()'s checks are more complicated). This breaks
handling of keys above ULONG_MAX (it's correct again after
ULONG_MAX+LONG_MAX, until ULONG_MAX * 2, etc.). See:
var_dump(array('5000000000' => 1));
array(1) {
[705032704]=>
int(1)
}
And of course the new code is a bit slower for keys that aren't fully
numeric, e.g. "123a"
[1] http://news.php.net/php.zend-engine.cvs/7465
Dmitry.
- Matt
Matt Wilmas wrote:
Hi all,
Assuming there are no objections, I'll commit this fix in a few hours...
Besides the bug report(s), I had also found awhile ago that currently an
array key can be LONG_MAX or LONG_MIN as a string and/or integer because
of a check in ZEND_HANDLE_NUMERIC() (I assume to avoid a slow errno check
for ERANGE originally). I changed it to use the same method that's
used in the scanner, is_numeric_string(), etc., and those 2 values are
now treated as an integer.It's just a few lines changed in zend_hash.h, although I had to move some
of the definitions from zend_operators.h to zend.h (is that OK?).Patches (didn't check HEAD's yet):
http://realplain.com/php/array_key_limit.diff
http://realplain.com/php/array_key_limit_5_3.diff
- Matt
Hi Matt,
Matt Wilmas wrote:
Hi Dmitry,
----- Original Message -----
From: "Dmitry Stogov"
Sent: Wednesday, March 18, 2009BTW may be we should eliminate strtol() at all.
There's no need to scan the string twice.Your change to remove strtol() [1] is not checking for overflow
correctly (for example, zend_u_strtol()'s checks are more complicated).
This breaks handling of keys above ULONG_MAX (it's correct again after
ULONG_MAX+LONG_MAX, until ULONG_MAX * 2, etc.). See:var_dump(array('5000000000' => 1));
array(1) {
[705032704]=>
int(1)
}
Thank you for catching this.
I'll think a bit and then revert patch if don't find better solution.
And of course the new code is a bit slower for keys that aren't fully
numeric, e.g. "123a"
Agree, but it's not a usual case, "1234" would occur much faster.
Thanks. Dmitry.
[1] http://news.php.net/php.zend-engine.cvs/7465
Dmitry.
- Matt
Matt Wilmas wrote:
Hi all,
Assuming there are no objections, I'll commit this fix in a few hours...
Besides the bug report(s), I had also found awhile ago that currently
an array key can be LONG_MAX or LONG_MIN as a string and/or integer
because of a check in ZEND_HANDLE_NUMERIC() (I assume to avoid a slow
errno check for ERANGE originally). I changed it to use the same
method that's used in the scanner, is_numeric_string(), etc., and
those 2 values are now treated as an integer.It's just a few lines changed in zend_hash.h, although I had to move
some of the definitions from zend_operators.h to zend.h (is that OK?).Patches (didn't check HEAD's yet):
http://realplain.com/php/array_key_limit.diff
http://realplain.com/php/array_key_limit_5_3.diff
- Matt
Hi Matt,
I suppose I fixed the patch.
Could you please check which patch is better yours or the attached one?
According to attached benchmark my one is faster for most usual cases,
but may be I forget something again.
$a["abcdefghij"] 0.130 0.130
$a["1234567890"] 0.187 0.104
$a["2147483648"] 0.168 0.142
$a["0"] 0.116 0.082
$a["214748364x"] 0.136 0.141
$a[0] 0.080 0.080
Also, do you have other patches which I could look into before RC?
Thanks. Dmitry.
Matt Wilmas wrote:
Hi Dmitry,
----- Original Message -----
From: "Dmitry Stogov"
Sent: Wednesday, March 18, 2009BTW may be we should eliminate strtol() at all.
There's no need to scan the string twice.Your change to remove strtol() [1] is not checking for overflow
correctly (for example, zend_u_strtol()'s checks are more complicated).
This breaks handling of keys above ULONG_MAX (it's correct again after
ULONG_MAX+LONG_MAX, until ULONG_MAX * 2, etc.). See:var_dump(array('5000000000' => 1));
array(1) {
[705032704]=>
int(1)
}And of course the new code is a bit slower for keys that aren't fully
numeric, e.g. "123a"[1] http://news.php.net/php.zend-engine.cvs/7465
Dmitry.
- Matt
Matt Wilmas wrote:
Hi all,
Assuming there are no objections, I'll commit this fix in a few hours...
Besides the bug report(s), I had also found awhile ago that currently
an array key can be LONG_MAX or LONG_MIN as a string and/or integer
because of a check in ZEND_HANDLE_NUMERIC() (I assume to avoid a slow
errno check for ERANGE originally). I changed it to use the same
method that's used in the scanner, is_numeric_string(), etc., and
those 2 values are now treated as an integer.It's just a few lines changed in zend_hash.h, although I had to move
some of the definitions from zend_operators.h to zend.h (is that OK?).Patches (didn't check HEAD's yet):
http://realplain.com/php/array_key_limit.diff
http://realplain.com/php/array_key_limit_5_3.diff
- Matt
Hi Dmitry,
----- Original Message -----
From: "Dmitry Stogov"
Sent: Wednesday, March 18, 2009
Hi Matt,
I suppose I fixed the patch.
Could you please check which patch is better yours or the attached one?
According to attached benchmark my one is faster for most usual cases,
but may be I forget something again.$a["abcdefghij"] 0.130 0.130
$a["1234567890"] 0.187 0.104
$a["2147483648"] 0.168 0.142
$a["0"] 0.116 0.082
$a["214748364x"] 0.136 0.141
$a[0] 0.080 0.080
I gotta be away from the computer for a while in a second, so didn't
actually test your patch myself, but just wanted to say quick that it looks
good when I skimmed it. And those benchmarks look cool. :-)
Also, do you have other patches which I could look into before RC?
I think just the 2 I sent to the list a few days ago, and hoped to get some
feedback from others (haven't). If you didn't see them:
http://marc.info/?l=php-internals&m=123704111325725&w=2
This is about double to long conversion, what should happen, consistency,
etc. that I've been wondering about since last year. All info should be in
that message (and linked ones).
http://marc.info/?l=php-internals&m=123722650225792&w=2
Suggestion to have bitwise and modulus operators operate in "64-bit mode" if
necessary on 32-bit platforms. Just a bit ago it came to mind that I think
I can improve this patch -- the behavior would be exactly the same, just
better, more optimized code. (I'll do that as soon as I can, but it
wouldn't affect you looking into it. I plan to eliminate the big macro I
added, FYI.)
I'd imagine they could be discussed, etc. past the RC...
Thanks. Dmitry.
- Matt
Matt Wilmas wrote:
Hi Dmitry,
----- Original Message -----
From: "Dmitry Stogov"
Sent: Wednesday, March 18, 2009BTW may be we should eliminate strtol() at all.
There's no need to scan the string twice.Your change to remove strtol() [1] is not checking for overflow
correctly (for example, zend_u_strtol()'s checks are more complicated).
This breaks handling of keys above ULONG_MAX (it's correct again after
ULONG_MAX+LONG_MAX, until ULONG_MAX * 2, etc.). See:var_dump(array('5000000000' => 1));
array(1) {
[705032704]=>
int(1)
}And of course the new code is a bit slower for keys that aren't fully
numeric, e.g. "123a"[1] http://news.php.net/php.zend-engine.cvs/7465
Dmitry.
- Matt
Matt Wilmas wrote:
Hi Dmitry,
----- Original Message -----
From: "Dmitry Stogov"
Sent: Wednesday, March 18, 2009Hi Matt,
I suppose I fixed the patch.
Could you please check which patch is better yours or the attached one?
According to attached benchmark my one is faster for most usual cases,
but may be I forget something again.$a["abcdefghij"] 0.130 0.130
$a["1234567890"] 0.187 0.104
$a["2147483648"] 0.168 0.142
$a["0"] 0.116 0.082
$a["214748364x"] 0.136 0.141
$a[0] 0.080 0.080I gotta be away from the computer for a while in a second, so didn't
actually test your patch myself, but just wanted to say quick that it
looks good when I skimmed it. And those benchmarks look cool. :-)
ok. I'll wait till tomorrow.
Also, do you have other patches which I could look into before RC?
I think just the 2 I sent to the list a few days ago, and hoped to get
some feedback from others (haven't). If you didn't see them:http://marc.info/?l=php-internals&m=123704111325725&w=2
This is about double to long conversion, what should happen,
consistency, etc. that I've been wondering about since last year. All
info should be in that message (and linked ones).http://marc.info/?l=php-internals&m=123722650225792&w=2
Suggestion to have bitwise and modulus operators operate in "64-bit
mode" if necessary on 32-bit platforms. Just a bit ago it came to mind
that I think I can improve this patch -- the behavior would be exactly
the same, just better, more optimized code. (I'll do that as soon as I
can, but it wouldn't affect you looking into it. I plan to eliminate
the big macro I added, FYI.)
I'll try to look into them too.
Thanks. Dmitry.
I'd imagine they could be discussed, etc. past the RC...
Thanks. Dmitry.
- Matt
Matt Wilmas wrote:
Hi Dmitry,
----- Original Message -----
From: "Dmitry Stogov"
Sent: Wednesday, March 18, 2009BTW may be we should eliminate strtol() at all.
There's no need to scan the string twice.Your change to remove strtol() [1] is not checking for overflow
correctly (for example, zend_u_strtol()'s checks are more complicated).
This breaks handling of keys above ULONG_MAX (it's correct again after
ULONG_MAX+LONG_MAX, until ULONG_MAX * 2, etc.). See:var_dump(array('5000000000' => 1));
array(1) {
[705032704]=>
int(1)
}And of course the new code is a bit slower for keys that aren't fully
numeric, e.g. "123a"[1] http://news.php.net/php.zend-engine.cvs/7465
Dmitry.
- Matt
Hi again Dmitry,
Just wanted to say that I think I can make your code a bit smaller after
looking at it closer. Don't quite have enough time to modify and verify it
now, but I'll be back later (around the usual time) to let you know either
way. :-)
- Matt
----- Original Message -----
From: "Dmitry Stogov"
Sent: Wednesday, March 18, 2009
Hi Matt,
I suppose I fixed the patch.
Could you please check which patch is better yours or the attached one?
According to attached benchmark my one is faster for most usual cases,
but may be I forget something again.$a["abcdefghij"] 0.130 0.130
$a["1234567890"] 0.187 0.104
$a["2147483648"] 0.168 0.142
$a["0"] 0.116 0.082
$a["214748364x"] 0.136 0.141
$a[0] 0.080 0.080Also, do you have other patches which I could look into before RC?
Thanks. Dmitry.
Hi Matt,
Matt Wilmas wrote:
Hi again Dmitry,
Just wanted to say that I think I can make your code a bit smaller after
looking at it closer.
It would be great, but keep in mind that performance is more important.
Don't quite have enough time to modify and verify
it now, but I'll be back later (around the usual time) to let you know
either way. :-)
I'm waiting for your changes.
Thanks. Dmitry.
- Matt
----- Original Message -----
From: "Dmitry Stogov"
Sent: Wednesday, March 18, 2009Hi Matt,
I suppose I fixed the patch.
Could you please check which patch is better yours or the attached one?
According to attached benchmark my one is faster for most usual cases,
but may be I forget something again.$a["abcdefghij"] 0.130 0.130
$a["1234567890"] 0.187 0.104
$a["2147483648"] 0.168 0.142
$a["0"] 0.116 0.082
$a["214748364x"] 0.136 0.141
$a[0] 0.080 0.080Also, do you have other patches which I could look into before RC?
Thanks. Dmitry.
Hi Dmitry,
----- Original Message -----
From: "Dmitry Stogov"
Sent: Thursday, March 19, 2009
Hi Matt,
Matt Wilmas wrote:
Hi again Dmitry,
Just wanted to say that I think I can make your code a bit smaller after
looking at it closer.It would be great, but keep in mind that performance is more important.
I know. :-)
Don't quite have enough time to modify and verify it now, but I'll be
back later (around the usual time) to let you know either way. :-)I'm waiting for your changes.
No patch, but here's the straight code:
http://realplain.com/php/handle_numeric.txt
With your benchmark script, if I changed one of the keys, it would sometimes
radically affect the time of another key value, hmm. But in my own tests,
this was always faster. Some notes:
I've never understood why the terminating null check is there... I don't
see why it would be missing if there aren't other problems in code. A place
for UNEXPECTED()?
I removed the special handling of "0" and it's faster.
(Oh no, I see you just commited your code right now, LOL. Sorry I didn't
send this sooner. :-/)
For the leading 0 check, I changed it back to "length > 2" again instead of
the "end - tmp != 1" calculation.
I used ++tmp instead of 2 *tmp++ because I assume having the increment in
just one place creates less instructions. ;-) (For size, not speed.)
Finally, in the Unicode version (copied, untested), I'm using the 0x values
that were there originally instead of '0' char values...
Thanks. Dmitry.
Again, sorry I was about 20 minutes late with this! :-(
- Matt
Hi Matt,
I ran you version but it looked little bit slower (I checked it with
callgrind). So I kept the current CVS version. Anyway, it is not a big
problem to change it after RC.
Thanks. Dmitry.
Matt Wilmas wrote:
Hi Dmitry,
----- Original Message -----
From: "Dmitry Stogov"
Sent: Thursday, March 19, 2009Hi Matt,
Matt Wilmas wrote:
Hi again Dmitry,
Just wanted to say that I think I can make your code a bit smaller
after looking at it closer.It would be great, but keep in mind that performance is more important.
I know. :-)
Don't quite have enough time to modify and verify it now, but I'll
be back later (around the usual time) to let you know either way. :-)I'm waiting for your changes.
No patch, but here's the straight code:
http://realplain.com/php/handle_numeric.txtWith your benchmark script, if I changed one of the keys, it would
sometimes radically affect the time of another key value, hmm. But in
my own tests, this was always faster. Some notes:I've never understood why the terminating null check is there... I
don't see why it would be missing if there aren't other problems in
code. A place for UNEXPECTED()?I removed the special handling of "0" and it's faster.
(Oh no, I see you just commited your code right now, LOL. Sorry I
didn't send this sooner. :-/)For the leading 0 check, I changed it back to "length > 2" again instead
of the "end - tmp != 1" calculation.I used ++tmp instead of 2 *tmp++ because I assume having the increment
in just one place creates less instructions. ;-) (For size, not speed.)Finally, in the Unicode version (copied, untested), I'm using the 0x
values that were there originally instead of '0' char values...Thanks. Dmitry.
Again, sorry I was about 20 minutes late with this! :-(
- Matt
Hi Dmitry,
While updating my version (more below), I noticed another problem that I
overlooked before with your leading zero check: "-0" is being treated as
numeric.
----- Original Message -----
From: "Dmitry Stogov"
Sent: Friday, March 20, 2009
Hi Matt,
I ran you version but it looked little bit slower (I checked it with
callgrind). So I kept the current CVS version. Anyway, it is not a big
problem to change it after RC.
Yeah, after I sent that message, I noticed other times were strange
(bench.php, etc.), so it looks like I was getting compiler weirdness with
that code layout... And of course looking at the code, there shouldn't have
been anything faster about it, though it was shorter, which you confirmed
with callgrind.
Anyway, my updated version seems good, still smaller, and should do better
with callgrind. :-) On 32-bit, the first digit of a 10 digit number can be
checked which avoids having "2x overflow" on "5000000000", etc. that I
pointed out with your original code. (No problem on 64-bit.) Also made the
terminating null check be last again since it's least likely to fail...
http://realplain.com/php/handle_numeric.txt
OLD: http://realplain.com/php/handle_numeric-v1.txt
Thanks. Dmitry.
- Matt
Hi Matt,
Matt Wilmas wrote:
Hi Dmitry,
While updating my version (more below), I noticed another problem that I
overlooked before with your leading zero check: "-0" is being treated as
numeric.
Yeah. Thank you for catching this. It should be fixed.
----- Original Message -----
From: "Dmitry Stogov"
Sent: Friday, March 20, 2009Hi Matt,
I ran you version but it looked little bit slower (I checked it with
callgrind). So I kept the current CVS version. Anyway, it is not a big
problem to change it after RC.Yeah, after I sent that message, I noticed other times were strange
(bench.php, etc.), so it looks like I was getting compiler weirdness
with that code layout... And of course looking at the code, there
shouldn't have been anything faster about it, though it was shorter,
which you confirmed with callgrind.Anyway, my updated version seems good, still smaller, and should do
better with callgrind. :-) On 32-bit, the first digit of a 10 digit
number can be checked which avoids having "2x overflow" on "5000000000",
etc. that I pointed out with your original code. (No problem on
64-bit.) Also made the terminating null check be last again since it's
least likely to fail...http://realplain.com/php/handle_numeric.txt
OLD: http://realplain.com/php/handle_numeric-v1.txt
It looks like now your version must be better.
Are you sure that it can't miss overflow? Can it be proved?
Thanks. Dmitry.
Thanks. Dmitry.
- Matt
Hi Dmitry,
----- Original Message -----
From: "Dmitry Stogov"
Sent: Monday, March 23, 2009
Hi Matt,
Matt Wilmas wrote:
Hi Dmitry,
While updating my version (more below), I noticed another problem that I
overlooked before with your leading zero check: "-0" is being treated as
numeric.Yeah. Thank you for catching this. It should be fixed.
----- Original Message -----
From: "Dmitry Stogov"
Sent: Friday, March 20, 2009Hi Matt,
I ran you version but it looked little bit slower (I checked it with
callgrind). So I kept the current CVS version. Anyway, it is not a big
problem to change it after RC.Yeah, after I sent that message, I noticed other times were strange
(bench.php, etc.), so it looks like I was getting compiler weirdness with
that code layout... And of course looking at the code, there shouldn't
have been anything faster about it, though it was shorter, which you
confirmed with callgrind.Anyway, my updated version seems good, still smaller, and should do
better with callgrind. :-) On 32-bit, the first digit of a 10 digit
number can be checked which avoids having "2x overflow" on "5000000000",
etc. that I pointed out with your original code. (No problem on 64-bit.)
Also made the terminating null check be last again since it's least
likely to fail...http://realplain.com/php/handle_numeric.txt
OLD: http://realplain.com/php/handle_numeric-v1.txtIt looks like now your version must be better.
Are you sure that it can't miss overflow? Can it be proved?
Well, as far as I can tell, it can be proved. :-) On 32-bit, <= 9 digits is
obviously OK, and if there's 10, it's only used if it starts with a 1 or 2.
Overflow with 2500000000 is caught by the </> 0 check, and 5000000000, which
the </> 0 check in your first version missed, won't be used since we can
tell from the first digit that it wouldn't fit. On 64-bit, LONG_MAX starts
with a 9, and there wouldn't be an overflow missed by the </> 0 check until
ULONG_MAX, which is another digit longer, and would be skipped by the
"numbers too long" check.
Thanks. Dmitry.
- Matt
Matt Wilmas wrote:
Hi Dmitry,
----- Original Message -----
From: "Dmitry Stogov"
Sent: Monday, March 23, 2009Hi Matt,
Matt Wilmas wrote:
Hi Dmitry,
While updating my version (more below), I noticed another problem
that I overlooked before with your leading zero check: "-0" is being
treated as numeric.Yeah. Thank you for catching this. It should be fixed.
----- Original Message -----
From: "Dmitry Stogov"
Sent: Friday, March 20, 2009Hi Matt,
I ran you version but it looked little bit slower (I checked it with
callgrind). So I kept the current CVS version. Anyway, it is not a
big problem to change it after RC.Yeah, after I sent that message, I noticed other times were strange
(bench.php, etc.), so it looks like I was getting compiler weirdness
with that code layout... And of course looking at the code, there
shouldn't have been anything faster about it, though it was shorter,
which you confirmed with callgrind.Anyway, my updated version seems good, still smaller, and should do
better with callgrind. :-) On 32-bit, the first digit of a 10 digit
number can be checked which avoids having "2x overflow" on
"5000000000", etc. that I pointed out with your original code. (No
problem on 64-bit.) Also made the terminating null check be last
again since it's least likely to fail...http://realplain.com/php/handle_numeric.txt
OLD: http://realplain.com/php/handle_numeric-v1.txtIt looks like now your version must be better.
Are you sure that it can't miss overflow? Can it be proved?Well, as far as I can tell, it can be proved. :-) On 32-bit, <= 9
digits is obviously OK, and if there's 10, it's only used if it starts
with a 1 or 2. Overflow with 2500000000 is caught by the </> 0 check,
and 5000000000, which the </> 0 check in your first version missed,
won't be used since we can tell from the first digit that it wouldn't
fit. On 64-bit, LONG_MAX starts with a 9, and there wouldn't be an
overflow missed by the </> 0 check until ULONG_MAX, which is another
digit longer, and would be skipped by the "numbers too long" check.
OK. I'll test and ally your patch after RC.
Dmitry.
So, what is the final conclusion on this one? Are we at a combination
of Matt's and Dmitry's patches here?
I think we definitely need to fix this even in the 5.2 branch and get it
back to 5.1.x and earlier behavior. I consider it a bug that:
$arr[3500000000] = 'blah';
print_r($arr);
results in:
[-2147483648] => blah
if someone has written brand new 5.2-specific code that relies on this
weird behavior, then we will just have to bite the bullet and break that
code. It is way more likely that people are relying on the earlier
behavior and will end up with subtle problems in 5.2. I just had
someone at Yahoo get bitten by this when they upgraded from 5.1.x to 5.2.x.
-Rasmus
Hi Rasmus,
----- Original Message -----
From: "Rasmus Lerdorf"
Sent: Thursday, March 19, 2009
So, what is the final conclusion on this one? Are we at a combination
of Matt's and Dmitry's patches here?I think we definitely need to fix this even in the 5.2 branch and get it
back to 5.1.x and earlier behavior. I consider it a bug that:$arr[3500000000] = 'blah';
print_r($arr);results in:
[-2147483648] => blah
if someone has written brand new 5.2-specific code that relies on this
weird behavior, then we will just have to bite the bullet and break that
code. It is way more likely that people are relying on the earlier
behavior and will end up with subtle problems in 5.2. I just had
someone at Yahoo get bitten by this when they upgraded from 5.1.x to
5.2.x.
This thread is about conversion of numeric string keys to int/long, rather
than double conversion in your example. Bug #45877's behavior has been
around forever I think until the fix a couple days ago (5.2 hasn't been
changed).
With the conversion behavior you're talking about, it sounds like you want
the "double to long conversion change" thread. [1] :-) A change was made
there over a year ago in 5.3+, and I've been wondering about what should be
expected and how to make that expected behavior consistent across platforms.
All details are in that message or the linked ones.
However, nothing has been changed in a 5.2.x release (there was some partial
backport shortly before getting reverted). If you noticed a change in
5.2.x, perhaps it was platform related?
[1] http://marc.info/?l=php-internals&m=123704111325725&w=2
-Rasmus
- Matt
Matt Wilmas wrote:
However, nothing has been changed in a 5.2.x release (there was some
partial backport shortly before getting reverted). If you noticed a
change in 5.2.x, perhaps it was platform related?
Right, sorry, same people, wrong thread. I don't think it is platform
related. The change happened from 5.1 to 5.2, not within the 5.2 cycle.
-Rasmus
So, what is the final conclusion on this one? Are we at a combination
of Matt's and Dmitry's patches here?I think we definitely need to fix this even in the 5.2 branch and
get it
back to 5.1.x and earlier behavior. I consider it a bug that:$arr[3500000000] = 'blah';
print_r($arr);results in:
[-2147483648] => blah
if someone has written brand new 5.2-specific code that relies on this
weird behavior, then we will just have to bite the bullet and break
that
code. It is way more likely that people are relying on the earlier
behavior and will end up with subtle problems in 5.2. I just had
someone at Yahoo get bitten by this when they upgraded from 5.1.x to
5.2.x.
If I understood it properly, the issue Matt/Dmitry are working on is
something else. So where do we stand on the issue Rasmus's notes (is
there a ticket for this one already)?
regards,
Lukas Kahwe Smith
mls@pooteeweet.org
Lukas Kahwe Smith wrote:
So, what is the final conclusion on this one? Are we at a combination
of Matt's and Dmitry's patches here?I think we definitely need to fix this even in the 5.2 branch and get it
back to 5.1.x and earlier behavior. I consider it a bug that:$arr[3500000000] = 'blah';
print_r($arr);results in:
[-2147483648] => blah
if someone has written brand new 5.2-specific code that relies on this
weird behavior, then we will just have to bite the bullet and break that
code. It is way more likely that people are relying on the earlier
behavior and will end up with subtle problems in 5.2. I just had
someone at Yahoo get bitten by this when they upgraded from 5.1.x to
5.2.x.If I understood it properly, the issue Matt/Dmitry are working on is
something else. So where do we stand on the issue Rasmus's notes (is
there a ticket for this one already)?
It's related to another Matt's proposal which cares about float to long
conversion.
Matt, what will we get with your patch and code above?
[LONG_MAX] => blah?
Thanks. Dmitry.
regards,
Lukas Kahwe Smith
mls@pooteeweet.org
Hi Dmitry,
----- Original Message -----
From: "Dmitry Stogov"
Sent: Monday, March 23, 2009
Lukas Kahwe Smith wrote:
So, what is the final conclusion on this one? Are we at a combination
of Matt's and Dmitry's patches here?I think we definitely need to fix this even in the 5.2 branch and get it
back to 5.1.x and earlier behavior. I consider it a bug that:$arr[3500000000] = 'blah';
print_r($arr);results in:
[-2147483648] => blah
if someone has written brand new 5.2-specific code that relies on this
weird behavior, then we will just have to bite the bullet and break that
code. It is way more likely that people are relying on the earlier
behavior and will end up with subtle problems in 5.2. I just had
someone at Yahoo get bitten by this when they upgraded from 5.1.x to
5.2.x.If I understood it properly, the issue Matt/Dmitry are working on is
something else. So where do we stand on the issue Rasmus's notes (is
there a ticket for this one already)?It's related to another Matt's proposal which cares about float to long
conversion.Matt, what will we get with your patch and code above?
[LONG_MAX] => blah?
You mean from 3500000000 as a double? No, I'm going for the overflow
behavior that most people have been getting in 5.2 (before 5.3's
DVAL_TO_LVAL() change), though that overflow isn't guaranteed (which I'm
trying to do). So:
var_dump(array(3500000000 => 1));
array(1) {
[-794967296]=>
int(1)
}
However, the array example may not be the best, since DVAL_TO_LVAL() is
not used in 5.2 for double array keys, but a simple (long) cast, which
could behave differently than using PHP's (int) cast on the same value.
I originally brought up the double->long conversion not for array keys, but
because I wanted overflow to keep the least significant bits when using the
bitwise & operator. Last year's message (linked in the recent patch
message): http://marc.info/?l=php-internals&m=120799720922202&w=2
Thanks. Dmitry.
- Matt
Hi Lukas,
----- Original Message -----
From: "Lukas Kahwe Smith"
Sent: Monday, March 23, 2009
So, what is the final conclusion on this one? Are we at a combination
of Matt's and Dmitry's patches here?I think we definitely need to fix this even in the 5.2 branch and get it
back to 5.1.x and earlier behavior. I consider it a bug that:$arr[3500000000] = 'blah';
print_r($arr);results in:
[-2147483648] => blah
if someone has written brand new 5.2-specific code that relies on this
weird behavior, then we will just have to bite the bullet and break that
code. It is way more likely that people are relying on the earlier
behavior and will end up with subtle problems in 5.2. I just had
someone at Yahoo get bitten by this when they upgraded from 5.1.x to
5.2.x.If I understood it properly, the issue Matt/Dmitry are working on is
something else. So where do we stand on the issue Rasmus's notes (is
there a ticket for this one already)?
Yeah, you understood that the subject about Bug #45877 is different. :-)
Rasmus' issue is the "DVAL_TO_LVAL() change, different behavior" stuff I've
been bringing up for a while. My last message from 9 days ago with all info
either in it or the linked previous messages:
http://marc.info/?l=php-internals&m=123704111325725&w=2 I hoped there might
be some discussion about what should be expected, and how to ensure desired
behavior. I kinda hit a limit with what I can do and test, without being
able to verify things on other platforms.
However, Rasmus is talking about a change in 5.2, and DVAL_TO_LVAL() didn't
change there, but his symptoms are the same, at least.
Of course, my message/patch from a week ago [1] to extend bitwise and
modulus operators (on 32-bit platforms) would help with my original problem
caused by DVAL_TO_LVAL(), as well as issues others may run into (like Bug
#46579). Regardless, the current DVAL_TO_LVAL() definitely doesn't behave
consistently across platforms [2], so it should be updated somehow.
I don't think there's much to worry about with these things and the RC
(after). It's not like either of them should "wreak havoc" with code by
changing how large numbers are handled (isolated, easy to check results,
etc.).
[1] http://marc.info/?l=php-internals&m=123722650225792&w=2
[2] http://marc.info/?l=php-internals&m=123496364812725&w=2
regards,
Lukas Kahwe Smith
mls@pooteeweet.org
- Matt
Hi Rasmus,
----- Original Message -----
From: "Rasmus Lerdorf"
Sent: Thursday, March 19, 2009
So, what is the final conclusion on this one? Are we at a combination
of Matt's and Dmitry's patches here?I think we definitely need to fix this even in the 5.2 branch and get it
back to 5.1.x and earlier behavior. I consider it a bug that:$arr[3500000000] = 'blah';
print_r($arr);results in:
[-2147483648] => blah
I didn't think to ask at the time, but what do you think the result should
be in your example...?
if someone has written brand new 5.2-specific code that relies on this
weird behavior, then we will just have to bite the bullet and break that
code. It is way more likely that people are relying on the earlier
behavior and will end up with subtle problems in 5.2. I just had
someone at Yahoo get bitten by this when they upgraded from 5.1.x to
5.2.x.-Rasmus
- Matt