Newsgroups: php.internals
From: glopes@nebm.ist.utl.pt ("Gustavo Lopes")
To: "PHP Internals", "Stas Malyshev"
Cc: Johannes Schlüter
Date: Sun, 13 May 2012 12:57:44 +0200
Subject: Re: [PHP-DEV] bug 54547
In-Reply-To: <4FAF1EE5.5010409@sugarcrm.com>
Organization: Núcleo de Eng. Biomédica do I.S.T.

On Sun, 13 May 2012 04:39:33 +0200, Stas Malyshev wrote:

> I know this was discussed a number of times here, but just to bring it
> to a conclusion - I intend to apply patch in the bug report - which
> removes conversion for strings that do not convert to integers - to 5.4.
> If anybody sees anything that breaks because of this please tell. Not
> sure what about 5.3 - Johannes, could you please comment?
>

I've run the tests and the patch does not seem to cause any breakage.

I should point out that "remov[ing] conversion for strings that do not convert to integers" is not exactly what the patch does. Let's call "the strings convert to integers" proposition A, so that description amounts to acting whenever ~A holds. In addition to that condition, one other is required:

* The floats a and b the strings convert to are such that a - b == 0.0 (B)

It's implemented as

    if (oflow1 != 0 && oflow1 == oflow2 && dval1 - dval2 == 0.) { ... }

This is irrelevant for == or != comparisons. Let's call C "strings are equal in the memcmp() sense". If only A is evaluated, then if ~A, the result of the comparison is the value C, i.e., if strings do not convert to integers, the result of the comparison is the result of memcmp(). Also under ~A, if we add B into the mix, we have the following results for ==:

         B    ~B
  C      1    can't occur
 ~C      0    0

But for the comparisons >, <, etc. the result may be surprising. Under ~A && ~B && ~C, testing only the first condition results in a memcmp() comparison, while testing both conditions results in the floats being compared.
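The two rules contrasted above can be sketched in standalone C. This is only an illustration, not the Zend engine code: overflow_sign(), compare_rule_a() and compare_rule_ab() are hypothetical names, strtoll()/strtod() stand in for PHP's numeric-string parsing, a 64-bit long long is assumed, and "A only" is read here as "both strings overflow to the same side".

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the patch's oflow flag: +1 or -1 when the
 * decimal integer in s does not fit in a long long, 0 otherwise. */
static int overflow_sign(const char *s)
{
    assert(s != NULL);
    errno = 0;
    long long v = strtoll(s, NULL, 10);
    if (errno != ERANGE)
        return 0;
    return v == LLONG_MAX ? 1 : -1;
}

/* Byte-wise string comparison, normalised to -1 / 0 / 1. */
static int cmp_bytes(const char *s1, const char *s2)
{
    int r = strcmp(s1, s2);
    return (r > 0) - (r < 0);
}

/* Numeric comparison of two doubles, normalised to -1 / 0 / 1. */
static int cmp_double(double a, double b)
{
    return (a > b) - (a < b);
}

/* Rule "~A only": both strings overflow to the same side, so compare bytes.
 * Simplification: every other case is compared as doubles here. */
int compare_rule_a(const char *s1, const char *s2)
{
    int o1 = overflow_sign(s1), o2 = overflow_sign(s2);
    if (o1 != 0 && o1 == o2)
        return cmp_bytes(s1, s2);
    return cmp_double(strtod(s1, NULL), strtod(s2, NULL));
}

/* Rule "~A and B": same overflow test, plus the doubles must be equal
 * before falling back to the byte comparison (mirrors the quoted if). */
int compare_rule_ab(const char *s1, const char *s2)
{
    int o1 = overflow_sign(s1), o2 = overflow_sign(s2);
    double d1 = strtod(s1, NULL), d2 = strtod(s2, NULL);
    if (o1 != 0 && o1 == o2 && d1 - d2 == 0.)
        return cmp_bytes(s1, s2);
    return cmp_double(d1, d2);
}
```

With "9223372036854775809" and " 09443372036854775809" the two doubles differ, so compare_rule_ab() orders them numerically (-1) while compare_rule_a() orders them byte-wise (1). With "9223372036854775809" and " 09223372036854775810" both round to the same double, so both functions fall back to the byte comparison and return 1.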
The memcmp() and float orderings can disagree:

$ php -r 'var_dump(strcmp(" 9223372036854775809", "-9223372036854775808"));'
int(-1)
(str1 is less than str2)

$ php -r 'var_dump((float)" 9223372036854775809" < (float)"-9223372036854775808");'
bool(false)

So the float comparison behavior under ~B (what's in the patch) may seem more desirable because it preserves the numerical comparison when possible (and we don't have to add leading whitespace and zeros to the mix; strcmp("9", "11") returns 1). That is, until you realize it's alternating between two behaviors depending on whether B or ~B holds. So:

"9223372036854775809" < " 09443372036854775809" (true -- floats differ, compare as float)
"9223372036854775809" < " 09223372036854775810" (false -- floats are the same, memcmp)

In both cases (incorporating the test for B or not), there's no escaping a discontinuity in behavior, unless we fall back not to memcmp() but to a custom string comparison function that strips whitespace and leading zeros, compares the lengths of the inputs and finally calls memcmp().

-- 
Gustavo Lopes