Newsgroups: php.internals
From: php_lists@realplain.com ("Matt Wilmas")
To: "Ilia Alshanetsky"
Date: Tue, 19 Dec 2006 07:35:49 -0600
Subject: Re: [PHP-DEV] [PATCH] New, optimized is_numeric_string, and other number stuff
Message-ID: <022101c72372$98a829f0$0201a8c0@pc1>
References: <014f01c71dfd$f0431cd0$0201a8c0@pc1> <018301c71e06$9ef4f840$0201a8c0@pc1> <019301c71e08$9837bdb0$0201a8c0@pc1> <016e01c722a3$d35c3e80$0201a8c0@pc1>

Hi (again) Ilia, all,

After thinking about the zendi_smart_strcmp() issue more, I believe a numeric comparison can be used except when *both* values overflow *and* have the same sign (both INF or both -INF). I also think it's OK to say that an underflow of both values (each becoming exactly 0.0) can simply be considered a floating-point precision loss. I've updated the function to reflect this. Sound like an acceptable/good solution?

The old is_numeric_* function caused a string comparison if *either* value overflowed, so an expression like ('123' < str_repeat('1', 500)) was FALSE; it's now TRUE, which seems correct.

BTW, I noticed another bug that considered '-2000000000' to be *greater than* '2000000000' because of overflow when subtracting longs. I changed it to be the same as in compare_function(), where this bug was fixed exactly 6 years ago. :-)

Finally, speaking of compare_function(), _smart_strcmp() now uses a (double) cast too, instead of _strtod(), when dealing with a long/double combo.

Patches are updated:
http://realplain.com/php/is_numeric.diff
http://realplain.com/php/is_numeric_5_2.diff

Hopefully everything is now in good shape for inclusion. :-?

Matt

----- Original Message -----
From: "Matt Wilmas"
Sent: Monday, December 18, 2006

> Hi Ilia, all,
>
> I was just looking at zendi_smart_strcmp() and realized something I hadn't
> considered (more on that in a sec.). First, and unrelated to the new
> is_numeric_*, because _smart_strcmp() uses zend_strtod() if one operand is a
> double and the other isn't, it results in ('0.0' == '0x123') being TRUE!
> Seems like a bug. I think a (double) cast can be used there instead of
> zend_strtod(), since that's what's done in compare_function()? Would be
> faster too...
>
> All right, now what's different with my new is_numeric_* functions:
> previously is_numeric_* ignored doubles that overflowed (INF), and returned
> 0 -- think like 500 digits -- and the operands would be compared as strings.
> So 2 strings that overflowed wouldn't wrongly be considered equal, I assume.
> Now that's different with IS_DOUBLE being returned always, which is
> appropriate for most cases like arithmetic. And it wasn't just overflown
> numbers that previously returned 0, but underflown also ('1e-1000' or a VERY
> small decimal number; they set ERANGE).
>
> I think the simplest way to keep the old behavior with the new is_numeric_*
> is to use errno in zendi_smart_strcmp(). That'll make it a little slower,
> but no slower on longs than the old version I don't think, and it'll still
> be much faster with doubles/non-numeric strings.
>
> Any other thoughts on how it should be handled? That is the correct and
> desired behavior to only compare numerically if both values can be
> accurately represented? I'll update the patches ASAP once I find out what
> to do...
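
For illustration only, the two rules described in the reply at the top of this message (compare numerically unless both values overflow to the same infinity, and compare longs without subtracting them) could be sketched in C roughly as follows. This is not the code from the linked patches: the function names compare_longs() and compare_numeric_strings() are hypothetical, and the standard strtod() stands in for zend_strtod(), so some inputs (hex strings, for example) may be parsed differently than in PHP.

```c
#include <math.h>
#include <stdlib.h>

/* Hypothetical helper, not taken from the patch: three-way comparison of two
 * longs without computing a - b, so that a pair such as -2000000000 and
 * 2000000000 cannot overflow and flip the sign of the result. */
static int compare_longs(long a, long b)
{
    return (a > b) - (a < b);
}

/* Hypothetical helper, not taken from the patch: compare two numeric strings
 * as doubles.
 * Returns 1 and stores -1, 0 or 1 in *result when a numeric comparison is
 * usable. Returns 0 when both values overflowed to the same infinity; a
 * caller would then fall back to a string comparison. Values that underflow
 * are simply compared as the doubles strtod() produced, i.e. treated as
 * ordinary precision loss. */
static int compare_numeric_strings(const char *s1, const char *s2, int *result)
{
    double d1 = strtod(s1, NULL);
    double d2 = strtod(s2, NULL);

    /* Both INF or both -INF: the doubles cannot tell the values apart. */
    if (isinf(d1) && isinf(d2) && (d1 > 0) == (d2 > 0)) {
        return 0;
    }

    *result = (d1 > d2) - (d1 < d2);
    return 1;
}
```

Under these assumptions, comparing "123" with a string of 500 '1' digits is still done numerically, because only one side overflows, while comparing two such 500-digit strings reports that a string comparison is needed.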