Newsgroups: php.internals
From: andi@zend.com ("Andi Gutmans")
To: "'Matt Wilmas'"
Date: Tue, 12 Dec 2006 23:12:17 -0800
Subject: RE: [PHP-DEV] [PATCH] New, optimized is_numeric_string, and other number stuff

I briefly reviewed the patch and it looks interesting.
Will take me a bit longer though, and I'd also like some others here to
review it. In any case, I think it's risky for PHP 5.2.1 and would prefer to
defer it to the next release, even if I end up in favor of including the
patch. It changes a lot of subtleties, which can be dangerous in such a short
time frame.

Thanks for the patch. Will dig deeper now...

Andi

> -----Original Message-----
> From: Matt Wilmas [mailto:php_lists@realplain.com]
> Sent: Tuesday, December 12, 2006 6:58 AM
> To: internals@lists.php.net
> Subject: [PHP-DEV] [PATCH] New, optimized is_numeric_string,
> and other number stuff
>
> Hi all,
>
> I rewrote is_numeric_string/unicode to be faster and to change a
> couple of things. The changes are:
>
> 1) Previously, large numbers (very long or "1e500") that became INF
>    were ignored (Bug #26349), which is not the behavior anywhere else.
> 2) Leading whitespace with hex numbers or ones that started with .
>    (" .123") also caused them to be ignored.
> 3) Hex strings were limited to LONG_MAX, and in scripts/parser,
>    ULONG_MAX. I added a zend_hex_strtod() function to handle numbers
>    > LONG_MAX in both places. From the previous comments like
>    "strtod() messes up hex numbers," it seems there was desire to
>    support them. :-)
> 4) Small change: the string "0x" was considered non-numeric before,
>    but is now a partial match of the 0 (basically to get a more
>    accurate error level/message with zend_parse_parameters(), for
>    example).
>
> Now the performance... The errno stuff has been removed from
> is_numeric_* (and optimized in the parser) to save function calls with
> thread-safe libraries (are they used even when ZTS is disabled?). In
> my tests on Windows, I saw a 5-15% improvement with longs (less with
> more digits; on 64-bit systems, it could be slower at 12-15+ digits,
> but they're not common). (With HEAD, everything I checked was
> consistent, but in 5.2, a few random long tests were slower; must be
> some compiler weirdness?
> :-/) So not much difference there for these changes, BUT doubles are
> over *twice* as fast, and non-numeric string comparisons are up to
> nearly 3 times faster! (Slightly less % improvement in Unicode mode.)
> Yeah, non-numeric strings are detected very fast, which may be more
> significant since is_numeric_* is always used on them (from
> compare_function(), zendi_smart_strcmp(), etc.). Also, no number
> conversion is done if there's no corresponding pointer to fill -- much
> faster when code is "just checking."
>
> The larger inline function did increase the binary size by a few K...
>
> The patches:
>
> http://realplain.com/php/is_numeric.diff
> http://realplain.com/php/is_numeric_5_2.diff
>
> You can see that I changed MAX_LENGTH_OF_LONG to be accurate on
> 32-/64-bit, which my changes rely on. I also fixed a few places where
> memory calculations that use it could be too small, in theory.
>
> I wanted to get this in before Ilia's Thursday deadline (if it's still
> on :-)), in case it can be applied soon. Finally, don't know if you'd
> want to use it as is, but I've attached possible NEWS file updates
> about this stuff.
>
> Thoughts, questions? Thanks.
>
>
> Matt
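
For readers following the thread, here is a rough sketch of two ideas from the quoted message: rejecting non-numeric strings after looking at only a character or two, and skipping the number conversion when the caller passes no pointer to fill. It is not the code from the patch. The name sketch_is_numeric, the num_kind enum, and the 9-digit cutoff are assumptions made for illustration only; the sketch handles decimal strings only (no hex) and, unlike the real function, does not report partial matches.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical result codes; not the Zend Engine's own constants. */
enum num_kind { NUM_NONE = 0, NUM_LONG = 1, NUM_DOUBLE = 2 };

/*
 * Classify a NUL-terminated decimal string as long, double, or non-numeric.
 * lval and dval may be NULL; in that case the string is only checked and
 * no conversion is performed.
 */
static enum num_kind sketch_is_numeric(const char *s, long *lval, double *dval)
{
    const char *p, *start;
    int int_digits = 0;
    int is_double = 0;

    assert(s != NULL);

    /* Skip leading whitespace, so " .123" is still recognized. */
    for (p = s; *p == ' ' || *p == '\t' || *p == '\n' ||
                *p == '\r' || *p == '\v' || *p == '\f'; p++) {
    }
    start = p;

    if (*p == '+' || *p == '-') {
        p++;
    }

    /* Fast rejection: must start with a digit, or '.' followed by a digit. */
    if (!isdigit((unsigned char)*p) &&
        !(*p == '.' && isdigit((unsigned char)p[1]))) {
        return NUM_NONE;
    }

    while (isdigit((unsigned char)*p)) {
        p++;
        int_digits++;
    }

    if (*p == '.') {
        is_double = 1;
        p++;
        while (isdigit((unsigned char)*p)) {
            p++;
        }
    }

    /* An exponent counts only if at least one digit follows it. */
    if (*p == 'e' || *p == 'E') {
        const char *q = p + 1;
        if (*q == '+' || *q == '-') {
            q++;
        }
        if (isdigit((unsigned char)*q)) {
            is_double = 1;
            while (isdigit((unsigned char)*q)) {
                q++;
            }
            p = q;
        }
    }

    /* This sketch requires the whole string to be numeric. */
    if (*p != '\0') {
        return NUM_NONE;
    }

    /*
     * Assumption: up to 9 digits always fit in a long, so anything longer
     * is conservatively treated as a double. The patch relies on an
     * accurate MAX_LENGTH_OF_LONG instead.
     */
    if (!is_double && int_digits > 9) {
        is_double = 1;
    }

    if (is_double) {
        if (dval != NULL) {
            *dval = strtod(start, NULL);   /* "1e500" overflows to HUGE_VAL */
        }
        return NUM_DOUBLE;
    }
    if (lval != NULL) {
        *lval = strtol(start, NULL, 10);
    }
    return NUM_LONG;
}
```

Calling sketch_is_numeric(str, NULL, NULL) corresponds to the "just checking" case: the string is scanned once and neither strtol() nor strtod() is called.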