Newsgroups: php.internals
Date: Thu, 22 Jan 2015 09:33:00 +0100
From: ben.coutu@zeyos.com (Benjamin Coutu)
To: Dmitry Stogov, Xinchen Hui, Nikita Popov, "internals@lists.php.net"
Subject: [PHP-DEV] Improvements to for-each implementation

Hi,

this post is a fork of the "[PHP-DEV] Fixing strange foreach behavior" thread.
It proposes a more efficient for-each mechanism (that does NOT change the conceptual behaviour).

Currently, on for-each the engine has to copy the array if that array is visible anywhere else in the program, because it resets the internal position pointer (which is part of the underlying hashtable structure) and another part of the program might rely on it.

Essentially the array gets duplicated prematurely, only because of the internal position pointer. Of course it might have to be duplicated within the for-each loop anyway, but if (and only if) it is actually altered. In most cases one just iterates over the array without altering it. Please consider the following sample, taken from my recent post:

    $arr = $obj->arr; // property "arr" is an array
    foreach ($arr as $val) ...;

This will currently copy the array, because $arr is also visible through $obj->arr, although this is not really necessary unless the array is actually changed during iteration.

If one used an external position variable that is initialized in FE_RESET (TEMPVAR) and then incremented in FE_FETCH, one could just increment the ref_count of the array while it is being traversed, without the initial need to perform copy-on-write.

Now, if the hashtable is altered in any way during the traversal, the usual copy-on-write would kick in, because for-each initialization made sure that ref_count was incremented before starting traversal. In that case PHP would, just like currently, have to duplicate, but only on the first actual alteration, not prematurely on for-each initialization.

So in 90% (just a guess) of the cases, when you just traverse without altering, you get the full benefit of no-copy-necessary, while in the other cases you will basically have the previous performance penalty of duplication, but at least postponed to the first alteration (which might be inside a branch that is not even taken).

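To make the idea above concrete, here is a minimal C sketch. It is a toy model only, not Zend engine code: `toy_array`, `toy_iter`, `toy_fe_reset`, `toy_fe_fetch`, `toy_set` and the `toy_copies` counter are all hypothetical names invented for illustration, and a plain int array stands in for the hashtable. The iterator keeps its position outside the array and only takes a reference on reset, so duplication happens (if at all) on the first write.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Toy refcounted array standing in for a PHP hashtable (hypothetical). */
typedef struct {
    int refcount;
    size_t len;
    int *data;
} toy_array;

static int toy_copies = 0; /* counts duplications, for demonstration only */

static toy_array *toy_new(const int *vals, size_t len) {
    toy_array *a = malloc(sizeof *a);
    if (!a) abort();
    a->refcount = 1;
    a->len = len;
    a->data = malloc(len * sizeof *a->data);
    if (!a->data) abort();
    memcpy(a->data, vals, len * sizeof *a->data);
    return a;
}

static void toy_release(toy_array *a) {
    if (--a->refcount == 0) { free(a->data); free(a); }
}

/* Copy-on-write: separate *slot from other holders before writing. */
static void toy_set(toy_array **slot, size_t i, int v) {
    toy_array *a = *slot;
    if (a->refcount > 1) {           /* shared, e.g. with a running iterator */
        toy_array *dup = toy_new(a->data, a->len);
        toy_copies++;
        a->refcount--;               /* the slot gives up its reference */
        *slot = a = dup;
    }
    a->data[i] = v;
}

/* Iterator state kept outside the array, like the proposed TEMPVAR. */
typedef struct { toy_array *arr; size_t pos; } toy_iter;

/* Analogue of FE_RESET: take a reference, start at position 0, no copy. */
static toy_iter toy_fe_reset(toy_array *a) {
    toy_iter it = { a, 0 };
    a->refcount++;
    return it;
}

/* Analogue of FE_FETCH: stores the next value and returns 1, or 0 at end. */
static int toy_fe_fetch(toy_iter *it, int *out) {
    if (it->pos >= it->arr->len) return 0;
    *out = it->arr->data[it->pos++];
    return 1;
}

static void toy_fe_free(toy_iter *it) { toy_release(it->arr); }

/* Read-only traversal: no duplication is triggered. */
static int sum_readonly(toy_array **var) {
    toy_iter it = toy_fe_reset(*var);
    int v, sum = 0;
    while (toy_fe_fetch(&it, &v)) sum += v;
    toy_fe_free(&it);
    return sum;
}

/* Traversal that writes to the variable: duplication on the first write only. */
static int sum_with_write(toy_array **var) {
    toy_iter it = toy_fe_reset(*var);
    int v, sum = 0;
    while (toy_fe_fetch(&it, &v)) {
        sum += v;
        toy_set(var, 2, 100);        /* iterator keeps seeing the original */
    }
    toy_fe_free(&it);
    return sum;
}
```

In this model, calling `sum_readonly` on a variable leaves `toy_copies` untouched, while `sum_with_write` duplicates once, on the first `toy_set`, and the loop still iterates over the original values. How the real engine would store the position and handle by-reference iteration is outside the scope of this sketch.
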
Nested for-each loops would not have to revert to copy-on-write either, because they have their own pointer.

This would effectively speed up most for-each operations and would have the extra benefit of not having to store an internal pointer in the hashtable structure.

Please let me know your thoughts!

Cheers,

Ben

-- 
Benjamin Coutu
Zeyon Technologies Inc.
http://www.zeyos.com