Newsgroups: php.internals
Path: news.php.net
Xref: news.php.net php.internals:22607
Mailing-List: contact internals-help@lists.php.net; run by ezmlm
To: internals@lists.php.net,Rasmus Lerdorf <rasmus@lerdorf.com>
Message-ID: <4429B78A.4080907@web.de>
Date: Wed, 29 Mar 2006 00:24:10 +0200
User-Agent: Thunderbird 1.5 (Windows/20051201)
MIME-Version: 1.0
References: <KAEJKFABDLGCCFCHEBNJGEIHFJAA.benipmiller@comcast.net> <43FEB500.7030200@lerdorf.com>
In-Reply-To: <43FEB500.7030200@lerdorf.com>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Subject: native arbitrary precision datatype for PHP?
From: akorthaus@web.de (Andreas Korthaus)

Hi Rasmus!

Rasmus Lerdorf wrote:

>> I ran some tests, and did the following:
>>
>>         $Order_Total = sprintf("%01.20f",$Order_Total);
>>         $Refund_Amount = sprintf("%01.20f",$Refund_Amount);
>>
>> which produced:
>>
>>         $Order_Total = 102.84999999999999431566
>> and        $Refund_Amount = 102.85000000000000852651
>>
> 
> Read the note on floating point precision here:
> 
>   http://us3.php.net/manual/en/language.types.float.php
> 
> You need to either work completely in pennies and only convert to 
> dollars for display purposes so you are always working with integers, or 
> you need to introduce a little fuzz factor whenever you are doing 
> operations on floating point values.

I don't understand why people still use FLOAT at all for monetary
calculations in a typical PHP web application. Floating point was chosen
for the first computers some 70 years ago because of their limited
capabilities. There are alternatives: other tools and languages today use
fixed-point or arbitrary-precision arithmetic to avoid the problems of
floating-point arithmetic. FLOAT is completely useless for
"financial calculations", and the small 32-bit INTEGER in PHP is by far
too small (you often have to multiply monetary data by very detailed
factors with 2-5 and sometimes even more decimal places) and can't help
you with division.
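
To make these two complaints concrete, here is a small sketch. The
amount and the factor are made-up illustration values and the variable
names are hypothetical; intdiv() assumes PHP 7 or later.

```php
<?php
// Binary floats cannot represent most decimal fractions exactly,
// so this comparison fails on IEEE 754 doubles.
$sum = 0.1 + 0.2;
$floatsAgree = ($sum == 0.3);

// Integer "pennies" with a scaled factor: 2,000,000.00 * 1.23456
$amountCents  = 200000000;   // 2,000,000.00 expressed in cents
$factorScaled = 123456;      // 1.23456 scaled by 10^5
$int32Max     = 2147483647;  // largest signed 32-bit integer

// Detect the overflow without performing the multiplication itself.
$overflows32 = $amountCents > intdiv($int32Max, $factorScaled);
```

The product would be about 2.4 * 10^13, far beyond what a signed 32-bit
integer can hold.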

I know about the GMP and BC extensions, but why not implement something
like that in core, usable transparently like float but with
arbitrary precision (as done in GNUCash, java.math.BigDecimal,
PostgreSQL 'NUMERIC')? Does it need too many resources? How many
floating-point calculations do people do in PHP code? Do you think it's
really noticeable if each calculation is even 100 times slower?
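
For comparison, this is roughly what the existing BC route looks like
today. It is only a sketch and assumes the bcmath extension is loaded;
the variable names are borrowed from the quoted test, with the values
written as decimal strings.

```php
<?php
// Assumes the bcmath extension is loaded; operands are decimal strings.
$orderTotal   = '102.85';
$refundAmount = '102.85';

// bccomp() returns 0 when both operands are equal at the given scale.
$equal = (bccomp($orderTotal, $refundAmount, 2) === 0);

// Sums stay exact: 0.10 + 0.20 gives the string "0.30" at scale 2.
$sum = bcadd('0.10', '0.20', 2);
```

It works, but every operation is a function call on strings, which is
exactly what a transparent core datatype would hide.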

> There is simply no way for a computer to accurately represent a fraction 
> with anything other than an estimation.

Why not? Every child in elementary school could do this, given enough
paper and time ;-)
Simply do what an elementary school child does (without a calculator).

Perhaps store arbitrary-precision numbers in a struct with an INT array
of all digits, and also store the decimal point's position. Then simply
calculate with two arbitrary-precision numbers using the same steps
learned in school. Of course a lot can be optimized without losing
precision (I don't think it's necessary to reinvent the wheel here...).
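
The digit-array idea can be sketched in userland PHP as follows. This is
only an illustration, not a proposal for the actual core implementation:
the dec_* function names and the array layout are hypothetical, it
handles addition of non-negative, well-formed numbers only, and it
assumes PHP 7 or later.

```php
<?php
/** Parse "102.85" into ['digits' => [1,0,2,8,5], 'scale' => 2]. */
function dec_parse(string $s): array
{
    $parts = explode('.', $s, 2);
    $frac  = $parts[1] ?? '';
    return [
        'digits' => array_map('intval', str_split($parts[0] . $frac)),
        'scale'  => strlen($frac),   // digits after the decimal point
    ];
}

/** Add two parsed numbers digit by digit, right to left, with carry. */
function dec_add(array $a, array $b): array
{
    // Align the decimal points: pad the shorter fraction with zeros.
    $scale = max($a['scale'], $b['scale']);
    $x = array_merge($a['digits'], array_fill(0, $scale - $a['scale'], 0));
    $y = array_merge($b['digits'], array_fill(0, $scale - $b['scale'], 0));

    // Pad the shorter integer part with leading zeros.
    $len = max(count($x), count($y));
    $x = array_pad($x, -$len, 0);
    $y = array_pad($y, -$len, 0);

    $sum = [];
    $carry = 0;
    for ($i = $len - 1; $i >= 0; $i--) {
        $t = $x[$i] + $y[$i] + $carry;
        array_unshift($sum, $t % 10);
        $carry = intdiv($t, 10);
    }
    if ($carry > 0) {
        array_unshift($sum, $carry);
    }
    return ['digits' => $sum, 'scale' => $scale];
}

/** Format a parsed number back into a decimal string. */
function dec_format(array $n): string
{
    $s = implode('', $n['digits']);
    if ($n['scale'] === 0) {
        return $s;
    }
    return substr($s, 0, -$n['scale']) . '.' . substr($s, -$n['scale']);
}

echo dec_format(dec_add(dec_parse('99.99'), dec_parse('0.01'))), "\n";
```

One digit per array element is wasteful, of course; a real
implementation would pack several digits into each machine word.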

The difference will be that you don't lose precision, which is good,
but you will need more memory and more CPU cycles (and won't use its
FPU). But if you look at the average PHP script, I don't think this will
be a big problem today. Communication with databases, manipulation of
large strings, output buffering, on-the-fly compression... will still
cost far more resources.

> The typical fix if you don't 
> want to switch to using integer math is to add an appropriate fuzz 
> factor.  Like this:
> 
>   $fuzz = 0.0000001;
>   if(floor($value+$fuzz) == 10) ...

Why is something like that still recommended in the days of CPUs with
billions of cycles/second and PHP-HEAD with Unicode (>1.000.000 possible
characters), if you simply want to calculate some monetary data?
Most people don't use workarounds like that (or the BC/GMP extensions),
and their applications only work by luck, or they have simply overlooked
errors caused by floating-point arithmetic.

Perhaps the new operator overloading feature can be used to create
something like that as a PECL extension, but IMO something like that
belongs in the core.

I think a 64-bit INTEGER and an "arbitrary precision numbers" datatype
are the last major features missing in PHP!


best regards
Andreas

PS: will a 64 bit INTEGER make it into PHP6?