Newsgroups: php.internals Path: news.php.net Xref: news.php.net php.internals:25340 Return-Path: Mailing-List: contact internals-help@lists.php.net; run by ezmlm Delivered-To: mailing list internals@lists.php.net Received: (qmail 28258 invoked by uid 1010); 13 Aug 2006 14:18:47 -0000 Delivered-To: ezmlm-scan-internals@lists.php.net Delivered-To: ezmlm-internals@lists.php.net Received: (qmail 28243 invoked from network); 13 Aug 2006 14:18:47 -0000 Received: from unknown (HELO lists.php.net) (127.0.0.1) by localhost with SMTP; 13 Aug 2006 14:18:47 -0000 X-PHP-List-Original-Sender: php_lists@realplain.com X-Host-Fingerprint: 69.179.208.43 msa3-mx.centurytel.net Linux 2.4/2.6 Received: from ([69.179.208.43:60701] helo=msa3-mx.centurytel.net) by pb1.pair.com (ecelerity 2.1.1.3 r(11751M)) with ESMTP id 31/41-19138-56B2FD44 for ; Sun, 13 Aug 2006 09:38:46 -0400 Received: from pc1 (d19-198.rt-bras.wnvl.centurytel.net [69.179.146.198]) by msa3-mx.centurytel.net (8.13.6/8.13.6) with SMTP id k7DDcb9A008538; Sun, 13 Aug 2006 08:38:37 -0500 Message-ID: <006d01c6bedd$c86c7a00$0201a8c0@pc1> To: , "Pierre" References: <005301c6bec7$783e5c80$0201a8c0@pc1> Date: Sun, 13 Aug 2006 08:38:38 -0500 MIME-Version: 1.0 Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: 7bit X-Priority: 3 X-MSMail-Priority: Normal X-Mailer: Microsoft Outlook Express 6.00.2800.1807 X-MIMEOLE: Produced By Microsoft MimeOLE V6.00.2800.1807 Subject: Re: [PHP-DEV] [PATCH] array_count_values bug: mishandling of numbers with leading whitespace From: php_lists@realplain.com ("Matt W") Hi Pierre, ----- Original Message ----- From: "Pierre" Sent: Sunday, August 13, 2006 > Hello, > > On 8/13/06, Matt W wrote: > > Hi there, > > > > I didn't know whether to make an official bug report or send to the list -- > > so I'm trying this first. :-) > > You already tried in the previous thread about is_numeric. I was [originally] just mentioning it as an aside, since it uses is_numeric_string(). But the "is_numeric..." subject isn't right IMO for this separate problem (even though it uses is_numeric...). :-) And I wouldn't send a patch for the array function in the other thread. Oh, and finally, when I first mentioned it there I had just discovered it, and didn't realize the difference with 4.x. Now I *know* the behavior is wrong -- the fact that it was changed in 5 *is the bug* which caused the previous reports. > > var_dump(array_count_values(array(' 001', 1, ' 1 ', '1'))); > > > > Expected result (and what PHP 4 gives): > > array(3) { > > [" 001"]=> > > int(1) > > [1]=> > > int(2) > > [" 1 "]=> > > int(1) > > } > > > > Actual result: > > array(2) { > > [1]=> > > int(3) > > [" 1 "]=> > > int(1) > > } > > Can this patch please be applied for 5.2's release? To maintain what I'm > > sure is the correct behavior from PHP 4, the function needs to use the > > zend_[u_]symtable* functions, which take care of *correctly* handling > > numeric strings, as zend_hash_[find|update] did in 4.x. If they would've > > been used in 5+ in the first place, there wouldn't have been bugs like > > #34723 (still present with leading whitespace, as you can see), #30833, > > #29808, etc. (I guess that's all, actually). :-) > > The updated code should be faster, too... > > As I said in the previous discussion (why in the world have you > splitted it out?), I don't think it is reasonable or safe to apply > this patch in 5.2. Secondly it seems to wrong address the problem. You > work around the actual is_numeric "problems" by suppressing its usage. Not reasonable or safe? Then why were the other bugs fixed instead of being marked "Bogus" if array_count_values() was *supposed* to behave differently in 5? Because it's not supposed to! There should never have been a change -- just simply changing the word "hash" to "symtable". That's *it*. Functionality would've remained the same, as it should. But for some reason, this wasn't realized, and instead of trying to figure out *how* the function worked as it did in 4.x, this hack with is_numeric() was added, which still doesn't work correctly. (Also makes things a good amount slower, which I verified since the last e-mail.) And of course I'm suppressing is_numeric's usage, because it shouldn't be there. You see that leading 0's are *still* being lost if preceded by whitespace? > Doing so reintroduce the numeric string keys problem (what addresses > the removed code). No, why would it?! :-) 4.x didn't have the problem. With the updated code, the HANDLE_NUMERIC() macro (in symtable functions) will take care of *true* numeric string keys (which is *exactly* what happened in 4.x), just as it does with $array['123'] -- is_numeric isn't used there because it would screw up stuff like it is in array_count_values(). > I think it is much more safer to first determine the general policy > for is_numeric and then address this problem. It is too late in the > game to change such behaviors in 5.x (I care little about 4.x > compatibility). There is nothing unsafe about this change, as it will simply be back to 4.x's *bug-free* behavior. Run the existing array_count_values() tests and they'll pass. That's what they're there for, right? This has nothing to do with "4.x compatibility," but correct operation, which happened to be there in 4.x. :-) is_numeric should have nothing to do with this, because leading whitespace can't be ignored, just like it isn't with $array[' 123']. HANDLE_NUMERIC() does everything that's trying to be done now with "hacks" and then some (and much faster and more efficiently). Just compare my updated 5.2 code to 4.x's and you'll see it's the same (except for layout changes, etc. and my using the ZVAL_LONG() macro in 2 places :-)) other than the word "symtable" instead of "hash". Then, compare 5.2's "symtable" functions to 4.x's "hash" ones, and you'll see that their end result is exactly the same. > Cheers, > --Pierre Matt