Newsgroups: php.internals
Message-ID: <014f01c71dfd$f0431cd0$0201a8c0@pc1>
Date: Tue, 12 Dec 2006 08:58:09 -0600
Subject: [PATCH] New, optimized is_numeric_string, and other number stuff
From: php_lists@realplain.com ("Matt Wilmas")

Hi all,

I rewrote is_numeric_string/unicode to be faster
and change a couple of things. The changes are:

1) Previously, large numbers (very long ones, or ones like "1e500") that became INF were ignored (Bug #26349), which is not the behavior anywhere else.

2) Leading whitespace before hex numbers, or before numbers that start with "." (" .123"), also caused them to be ignored.

3) Hex strings were limited to LONG_MAX, and in scripts/the parser, to ULONG_MAX. I added a zend_hex_strtod() function to handle numbers > LONG_MAX in both places. From previous comments like "strtod() messes up hex numbers," it seems there was a desire to support them. :-)

4) A small change: the string "0x" was considered non-numeric before, but it is now a partial match of the 0 (basically to get a more accurate error level/message with zend_parse_parameters(), for example).

Now the performance... The errno handling has been removed from is_numeric_* (and optimized in the parser) to save function calls with thread-safe libraries (are they used even when ZTS is disabled?). In my tests on Windows, I saw a 5-15% improvement with longs (less with more digits; on 64-bit systems, it could be slower at 12-15+ digits, but those aren't common). (With HEAD, everything I checked was consistent, but in 5.2, a few random long tests were slower; must be some compiler weirdness? :-/)

So not much difference for longs, BUT doubles are over *twice* as fast, and non-numeric string comparisons are up to nearly 3 times faster! (Slightly less % improvement in Unicode mode.) Non-numeric strings are detected very fast, which may be more significant since is_numeric_* is always used on them (from compare_function(), zendi_smart_strcmp(), etc.). Also, no number conversion is done if there's no corresponding pointer to fill, which is much faster when code is "just checking."

The larger inline function did increase the binary size by a few K...

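For readers without the diff at hand, the idea behind point 3 can be illustrated with a short sketch. This is not the code from the patch: hex_to_double_sketch() and hex_digit_value() are hypothetical names made up for this example, and the real zend_hex_strtod() may differ in signature and details. The only assumptions are that accumulating hex digits in a double avoids the LONG_MAX ceiling, and that a bare "0x" consumes only the leading 0, in the spirit of point 4.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: value of one hex digit, or -1 if c is not a hex digit. */
static int hex_digit_value(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

/*
 * Hypothetical sketch, NOT the patch's zend_hex_strtod().
 * Converts a hex string (optional "0x"/"0X" prefix) to a double, so the
 * result is not capped at LONG_MAX. If endptr is non-NULL, it receives a
 * pointer to the first character that was not consumed.
 */
double hex_to_double_sketch(const char *str, const char **endptr)
{
    const char *s = str;
    double value = 0.0;
    int digit;

    assert(str != NULL);

    /* Skip the prefix only when a hex digit follows it; a bare "0x"
     * then falls through and matches just the leading "0". */
    if (s[0] == '0' && (s[1] == 'x' || s[1] == 'X') && hex_digit_value(s[2]) >= 0) {
        s += 2;
    }

    /* Accumulate in floating point instead of a long. */
    while ((digit = hex_digit_value(*s)) >= 0) {
        value = value * 16.0 + digit;
        s++;
    }

    if (endptr != NULL) {
        *endptr = s;
    }
    return value;
}
```

With this sketch, "0x100000000" yields 4294967296.0, which would not fit in a 32-bit long, and for "0x" the end pointer stops right after the "0". Precision is that of a double, so values needing more than 53 significant bits get rounded.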
The patches:

http://realplain.com/php/is_numeric.diff
http://realplain.com/php/is_numeric_5_2.diff

You can see that I changed MAX_LENGTH_OF_LONG to be accurate on 32-/64-bit, which my changes rely on. I also fixed a few places where memory calculations that use it could be too small, in theory.

I wanted to get this in before Ilia's Thursday deadline (if it's still on :-)), in case it can be applied soon. Finally, don't know if you'd want to use it as is, but I've attached possible NEWS file updates about this stuff.

Thoughts, questions?

Thanks.

Matt

[Attachment: NEWS.diff.txt]

Index: NEWS
===================================================================
RCS file: /repository/php-src/NEWS,v
retrieving revision 1.2027.2.547.2.426
diff -u -r1.2027.2.547.2.426 NEWS
--- NEWS 12 Dec 2006 07:38:04 -0000 1.2027.2.547.2.426
+++ NEWS 12 Dec 2006 13:29:19 -0000
@@ -5,12 +5,15 @@
   the page. (Ilia)
 - Added new function, sys_get_temp_dir(). (Hartmut)
 - Added missing object support to file_put_contents(). (Ilia)
+- Added support for hex numbers of any size. (Matt Wilmas)
 - Changed double-to-string utilities to use BSD implementation. (Dmitry, Tony)
 - Updated bundled libcURL to version 7.16.0 in the Windows distro. (Edin)
 - Updated timezone database to version 2006.16. (Derick)
 - cgi.* and fastcgi.* directives are moved to INI subsystem.
   The new directive cgi.check_shebang_line can be used to ommiting checnk
   for "#! /usr/bin/php" line. (Dmitry).
+- Improved performance of numeric string detection and non-identical comparison
+  of strings. (Matt Wilmas)
 - Windows related optimizations (Dmitry, Stas)
   . COM initialization/deinitialization are done only if necessary
   . removed unnecessary checks for ISREG file and corresponding stat() calls
@@ -182,6 +185,8 @@
   (Ilia,Dmitry, Matt Wilmas)
 - Fixed bug #29840 (is_executable() does not honor safe_mode_exec_dir
   setting). (Ilia)
+- Fixed bug #26349 (is_numeric() returns false for strings with more than 308
+  digits). (Matt Wilmas)
 
 02 Nov 2006, PHP 5.2.0
 - Updated bundled OpenSSL to version 0.9.8d in the Windows distro. (Edin)