Newsgroups: php.internals
From: akorthaus@web.de (Andreas Korthaus)
To: internals@lists.php.net, Rasmus Lerdorf
Date: Wed, 29 Mar 2006 00:24:10 +0200
Subject: native arbitrary precision datatype for PHP?
Message-ID: <4429B78A.4080907@web.de>
In-Reply-To: <43FEB500.7030200@lerdorf.com>

Hi Rasmus!

Rasmus Lerdorf wrote:
>> I ran some tests, and did the following:
>>
>> $Order_Total = sprintf("%01.20f",$Order_Total);
>> $Refund_Amount = sprintf("%01.20f",$Refund_Amount);
>>
>> which produced:
>>
>> $Order_Total = 102.84999999999999431566
>> and $Refund_Amount = 102.85000000000000852651
>>
>
> Read the note on floating point precision here:
>
> http://us3.php.net/manual/en/language.types.float.php
>
> You need to either work completely in pennies and only convert to
> dollars for display purposes so you are always working with integers, or
> you need to introduce a little fuzz factor whenever you are doing
> operations on floating point values.
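
The "work completely in pennies" advice quoted above can be illustrated
with a minimal sketch. The variable names and amounts are made up for
illustration, and intdiv() assumes PHP 7 or later:

```php
<?php
// Prices held as floats: 0.10 + 0.20 is not exactly 0.30 in binary
// floating point, so the comparison fails.
$floatSum   = 0.10 + 0.20;
$floatEqual = ($floatSum == 0.30);      // false on IEEE 754 doubles

// The same prices held as integer cents (hypothetical values).
$centsSum   = 10 + 20;
$centsEqual = ($centsSum === 30);       // true, integers are exact

// Convert to dollars only for display.
$display = sprintf('%d.%02d', intdiv($centsSum, 100), $centsSum % 100);

var_dump($floatEqual, $centsEqual, $display);
```

This only covers addition of non-negative amounts; multiplying by
factors with several decimal places or dividing still needs a rounding
rule of its own.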

I don't understand why people still use FLOAT at all for monetary
calculations in a typical PHP web application. FLOAT was used on the
first computers 70 years ago because of their limited capabilities.
There are alternatives: other tools and languages today use fixed-point
or arbitrary-precision arithmetic to avoid the problems of
floating-point arithmetic.

FLOAT is completely useless if you want to use it for "financial
calculations", and the small 32-bit INTEGER in PHP is by far too small
(you often have to multiply monetary data by very detailed factors with
2-5 and sometimes even more decimal places) and can't help you with
division.

I know about the GMP and BC extensions, but why not implement something
like that in the core, which can be used transparently like float but
with arbitrary precision (as done in GNUCash, java.math.BigDecimal,
PostgreSQL 'NUMERIC')?

Would it need too many resources? How many floating-point calculations
do people actually perform in PHP code? Do you think it would really be
noticeable if each calculation were even 100 times slower?

> There is simply no way for a computer to accurately represent a fraction
> with anything other than an estimation.

Why not? Every child in elementary school can do this, given enough
paper and time ;-)

Simply try to do what an elementary school child does (without a
calculator). Perhaps store arbitrary precision numbers in a struct with
an INT array of all digits, and also store the position of the decimal
point. Now simply calculate with two arbitrary precision numbers using
the same steps learned in school. Of course a lot can be optimized
without losing precision (I don't think it's necessary to reinvent the
wheel here...).

The difference will be that you don't lose precision, which is good,
but you will need more memory and more CPU cycles (and won't use the
FPU). But if you look at the average PHP script, I don't think this
will be a big problem today.
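
A minimal sketch of the schoolbook approach described above, in plain
PHP. The function names (dec_parse, dec_add, dec_format) and the array
layout are assumptions made for illustration. It handles only addition
of non-negative, well-formed decimal strings and assumes PHP 7 or
later; a core implementation would be written in C and could reuse
existing code such as BC Math or GMP.

```php
<?php
// Parse a decimal string such as "102.85" into an array of digits plus
// the number of digits after the decimal point ("scale").
function dec_parse(string $s): array
{
    $parts = explode('.', $s, 2);
    $frac  = $parts[1] ?? '';
    return [
        'digits' => array_map('intval', str_split($parts[0] . $frac)),
        'scale'  => strlen($frac),
    ];
}

// Add two parsed numbers column by column, as done on paper.
function dec_add(array $a, array $b): array
{
    // Align the decimal points by padding the shorter fraction with zeros.
    $scale = max($a['scale'], $b['scale']);
    $x = array_merge($a['digits'], array_fill(0, $scale - $a['scale'], 0));
    $y = array_merge($b['digits'], array_fill(0, $scale - $b['scale'], 0));

    // Pad the shorter number with leading zeros (negative size pads left).
    $len = max(count($x), count($y));
    $x = array_pad($x, -$len, 0);
    $y = array_pad($y, -$len, 0);

    // Walk from the rightmost column to the left, carrying into the next.
    $sum   = [];
    $carry = 0;
    for ($i = $len - 1; $i >= 0; $i--) {
        $t = $x[$i] + $y[$i] + $carry;
        array_unshift($sum, $t % 10);
        $carry = intdiv($t, 10);
    }
    if ($carry > 0) {
        array_unshift($sum, $carry);
    }
    return ['digits' => $sum, 'scale' => $scale];
}

// Turn the digit array back into a string with the decimal point restored.
function dec_format(array $n): string
{
    $s = implode('', $n['digits']);
    if ($n['scale'] === 0) {
        return $s;
    }
    $k   = strlen($s) - $n['scale'];
    $int = $k > 0 ? substr($s, 0, $k) : '0';
    return $int . '.' . substr($s, $k);
}

echo dec_format(dec_add(dec_parse('102.85'), dec_parse('0.15'))), "\n"; // 103.00
```

Every digit is kept exactly, so no rounding error can appear; the cost
is one array element per digit and one loop iteration per column.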
Communication with databases, manipulation of large strings, output
buffering, on-the-fly compression... will still cost far more
resources.

> The typical fix if you don't
> want to switch to using integer math is to add an appropriate fuzz
> factor. Like this:
>
> $fuzz = 0.0000001;
> if(floor($value+$fuzz) == 10) ...

Why is something like that still recommended in the days of CPUs with
billions of cycles per second and PHP-HEAD with Unicode (>1,000,000
possible characters), when you simply want to calculate some monetary
data?

Most people don't use workarounds like that (or the BC/GMP extensions),
and their applications only work by luck, or they have simply
overlooked errors caused by floating-point arithmetic.

Perhaps the new operator overloading feature can be used to create
something like that as a PECL extension, but IMO something like that
belongs in the core.

I think a 64-bit INTEGER and an "arbitrary precision numbers" datatype
are the last major features missing in PHP!

best regards
Andreas

PS: will a 64-bit INTEGER make it into PHP6?