Patches for Review

13 years ago by Anthony Ferrara

Hey all,

I had messaged about these patches before, but with the 5.4 release
process happening I think it slipped through the cracks. I have 3
patches that are ready for inclusion...

https://bugs.php.net/bug.php?id=60813
Adding a new hash_pbkdf2() function to allow for a C-level
implementation of the PKCS #5 recommended password key stretching
algorithm.

https://bugs.php.net/bug.php?id=60789
Bringing pow() in line with docs in that it will attempt to return an
integer (by casting one or more of the arguments to int) if possible.
This fixes a precision loss that can occur with float arguments that
are exactly integers on 64 bit platforms.

https://bugs.php.net/bug.php?id=60596
A little code-cleanup to remove a superfluous if statement in a switch
in the spl_offset_convert_to_long function...

If anyone can commit these, or if I can be granted the karma to commit
them, or if anyone can comment as to why they should not be committed,
that would be great.

Thanks,

Anthony

13 years ago by Gustavo Lopes

On Sun, 04 Mar 2012 14:29:49 +0100, Anthony Ferrara ircmaxell@gmail.com
wrote:

I had messaged about these patches before, but with the 5.4 release
process happening I think it slipped through the cracks. I have 3
patches that are ready for inclusion...

[...]

https://bugs.php.net/bug.php?id=60789
Bringing pow() in line with docs in that it will attempt to return an
integer (by casting one or more of the arguments to int) if possible.
This fixes a precision loss that can occur with float arguments that
are exactly integers on 64 bit platforms.

[...]

This doesn't seem right. It's overly clever. It's like saying that

4.0 + 1

should return int(5) and not float(5.). Or perhaps a closer analogy would
be:

if we're on a 64-bit platform (only then can integers have a larger
precision than doubles);
if one of the operands is exactly an integer;
if the other operand is a float exactly representable as an integer
(perhaps a large power of 2), but its accuracy is negative (where accuracy
is 15.9546 - log10(abs(x)), or, informally, the effective number of
digits to the right of the decimal point);
if the result of the sum is exactly representable as a 64-bit integer;
then return such integer.

It's easy to see that these sorts of tricks would have to be applied all
over the place (additions, products, and all other mathematical operations).
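
To make the conditions listed above concrete, here is a minimal sketch of what such an "integer-preferring" addition would look like. It is written in Python rather than PHP, purely as an illustration of the rule being argued against; the names add_preferring_int and accuracy are hypothetical, and the 64-bit range is hard-coded as an assumption.

```python
import math

INT64_MIN, INT64_MAX = -2**63, 2**63 - 1
# A double has a 53-bit significand, i.e. about 15.9546 decimal digits.
MACHINE_PRECISION = 53 * math.log10(2)


def accuracy(x: float) -> float:
    """Effective number of digits to the right of the decimal point."""
    return MACHINE_PRECISION - math.log10(abs(x))


def add_preferring_int(a, b):
    """Hypothetical rule: return an exact int only when every condition holds."""
    # Normalise so that `a` is the int operand and `b` the float operand.
    if isinstance(a, float) and isinstance(b, int):
        a, b = b, a
    if (isinstance(a, int) and isinstance(b, float)
            and math.isfinite(b) and b.is_integer()
            and b != 0 and accuracy(b) < 0):
        exact = a + int(b)
        # The sum must also fit in a signed 64-bit integer.
        if INT64_MIN <= exact <= INT64_MAX:
            return exact
    # In every other case, fall back to ordinary arithmetic.
    return a + b


print(add_preferring_int(1, 4.0))        # ordinary float addition
print(add_preferring_int(1, 2.0 ** 55))  # exact 64-bit integer
```

Even this one operator needs five separate checks, which is the point being made about applying the same trick to every mathematical operation.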

In sum, I think the rule that if a mathematical operation involves a
float, the result should be a float ought to be kept.

--
Gustavo Lopes

13 years ago by Pierre Joye

In sum, I think the rule that if a mathematical operation involves a float,
the result should be a float ought to be kept.

Yes, it is a must.

--
Pierre

@pierrejoye | http://blog.thepimp.net | http://www.libgd.org

13 years ago by Anthony Ferrara

In sum, I think the rule that if a mathematical operation involves a float,
the result should be a float ought to be kept.

Yes, it is a must.

Well, I'm not so sure about that.

If you look at the docs of pow(), it describes the return type as
"base raised to the power of exp. If the result can be represented as
integer it will be returned as type integer, else it will be returned
as type float.".

That's not the current behavior. pow((float) 2, 1) returns float(2).
If you want to update the docs instead of the code, then that's fine.

However, this can lead to precision loss in a number of cases (like
pow(2, (float) 55), which is not representable exactly by a float), and
hence cause data loss. Whereas converting to an integer allows (on 64
bit platforms) an exact representation.

I'm not suggesting changing addition to follow these rules, but to
bring pow() into alignment with the documentation. And I think
because of the nature of pow(), the likelihood that you'll have
precision loss is a lot higher than other areas...

But if the sentiment is strong against it, then an update of docs is in order...

Additionally, I thought the whole point of type juggling in PHP was
such that the type didn't matter? So that a float and an int are the
same from a usage standpoint? But this is a case where the exact same
input (just type is different) leads to different output. Which I
believe is inconsistent and incorrect.

But if the consensus is otherwise, that's fine...

Anthony

13 years ago by Gustavo Lopes

On Sun, 04 Mar 2012 15:02:23 +0100, Anthony Ferrara ircmaxell@gmail.com
wrote:

In sum, I think the rule that if a mathematical operation involves a
float, the result should be a float ought to be kept.

Yes, it is a must.

Well, I'm not so sure about that.

If you look at the docs of pow(), it describes the return type as
"base raised to the power of exp. If the result can be represented as
integer it will be returned as type integer, else it will be returned
as type float.".

That's not the current behavior. pow((float) 2, 1) returns float(2).
If you want to update the docs instead of the code, then that's fine.

You're reading too much into the documentation. The intention of that
phrase seems to be that if the result overflows, it will be a float instead
(just like when you add two large integers, for instance).

Yes, the plain text doesn't preclude your interpretation, but it's not
supported by the implementation or the general behavior of mathematical
operators in PHP. By all means, change it in order to make it more clear.

However, this can lead to precision loss in a number of cases (like
pow(2, (float) 55), which is not representable exactly by a float), and
hence cause data loss. Whereas converting to an integer allows (on 64
bit platforms) an exact representation.

You're making a critical error here. float(55.) is in fact different from
int(55). Float values have a rounding error associated, while integers
don't. The accuracy of float(55.) is 14.2142, which means the uncertainty
is 6.10623*10^-15, or that float(55.) = 55 +- 3.05311*10^-15.
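
As a rough check of those figures, the accuracy formula quoted earlier in the thread (15.9546 - log10(abs(x))) can be evaluated directly. This is a small Python sketch, not PHP or Mathematica; it assumes that 15.9546 is the decimal precision of a 53-bit significand, and the helper name accuracy is hypothetical.

```python
import math

# 53 significand bits expressed in decimal digits: about 15.9546.
MACHINE_PRECISION = 53 * math.log10(2)


def accuracy(x: float) -> float:
    """Effective number of digits to the right of the decimal point."""
    return MACHINE_PRECISION - math.log10(abs(x))


acc = accuracy(55.0)
uncertainty = 10 ** -acc

print(round(acc, 4))             # 14.2142
print(f"{uncertainty:.5e}")      # 6.10623e-15
print(f"{uncertainty / 2:.5e}")  # 3.05311e-15
```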

Mathematical operations will only increase the (relative) error and
therefore decrease the precision. This is an example with Mathematica's
arbitrary precision numbers (NOT machine precision numbers, even though
I'm setting the arbitrary precision to $MachinePrecision); this keeps
track of the accumulated errors:

In[30]:= SetPrecision[55, $MachinePrecision] // Precision
Out[30]= 15.9546
In[31]:= 2^SetPrecision[55, $MachinePrecision] // Precision
Out[31]= 14.3734

This means that if you do 2^55., the result will have total uncertainty:

In[34]:= 2^SetPrecision[55, $MachinePrecision] // 10^-Accuracy[#] &
Out[34]= 152.492

Take your example:

var_dump((int) (pow(2, (float) 55) - 1));

You're subtracting 1 from a result that has a total uncertainty of more
than 152... It's pointless, don't you think?

Of course, PHP doesn't keep track of errors, and the reason subtracting 1
has no effect is because the rounding error itself in the result is
already larger than 2:

In[45]:= 2^55. // 10^-Accuracy[#] &
Out[45]= 4.
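
The same effect can be reproduced outside PHP and Mathematica. The following Python sketch relies only on the assumption that PHP floats and Python floats are both IEEE 754 doubles; math.ulp needs Python 3.9 or later.

```python
import math

x = 2.0 ** 55

print(x == 2 ** 55)   # True: a power of two is stored exactly
print(math.ulp(x))    # 8.0: gap to the next larger double
print(x - 1 == x)     # True: the subtraction is lost to rounding
print(int(x - 1))     # 36028797018963968
print(2 ** 55 - 1)    # 36028797018963967 with integer arithmetic
```

The float result is off by one from the integer result, which is the discrepancy the original bug report describes.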

[...]
Additionally, I thought the whole point of type juggling in PHP was
such that the type didn't matter? So that a float and an int are the
same from a usage standpoint?

This is a very vague statement. "Doesn't matter" for what? We have
different types for integers and floats; 1. and 1 are different things and
this is reflected in several places. What type juggling does is that when
one specific type is expected but another one is given, PHP tries to
convert the value. This doesn't mean the types are "implementation
details".

But this is a case where the exact same
input (just type is different) leads to different output. Which I
believe is inconsistent and incorrect.

But if the consensus is otherwise, that's fine...

--
Gustavo Lopes

13 years ago by Nikita Popov

Hey all,
[snip]
https://bugs.php.net/bug.php?id=60596
A little code-cleanup to remove a superfluous if statement in a switch
in the spl_offset_convert_to_long function...

Applied in http://svn.php.net/viewvc/?view=revision&revision=323863.

Nikita