Newsgroups: php.internals
Date: Mon, 31 Aug 2020 21:13:28 +0300
To: PHP internals
Subject: Request for couple memory optimized array improvements
From: riikka.kalliomaki@riimu.net (Riikka Kalliomäki)

Hello,

For the past couple of years I've been working with a PHP code base that at times deals with quite large payloads, memory-wise. This has made me pay more attention to some array operations that are rather frustrating to deal with in userland PHP, but that could perhaps be optimized further in PHP core.

A common pattern that I've seen, and that could dearly use an internal optimization if one is possible, is:

    foreach (array_keys($array) as $key) {
    }

The problem with this pattern, of course, is that it needlessly creates a copy of the array's keys just to pass it to foreach, as can be seen from this example: https://3v4l.org/MRSv6

I would be ever so grateful if the PHP engine could be improved to detect that the fully qualified function name array_keys is used with foreach, in which case it would simply iterate over the keys of the array without creating a copy. Optimizing this wouldn't even require any userland changes. I'm not sure whether the PHP engine makes this at all feasible, though.

Of course, you could just use something like this in code:

    foreach ($array as $key => $_) {
    }

This has actually become a pattern for us in some memory-sensitive places, but using array_keys inside foreach is a very intuitive and common approach and doesn't require the unused variable, so it would be nice to see the usage enshrined.

Another similar problem with creating array copies is the detection of "indexed" arrays (as opposed to associative arrays). Particularly when dealing with JSON, it's a common need to detect whether an array has keys from 0 to n-1, in that order. My understanding is that at least in some cases this would be trivial and fast to tell internally in PHP, but the functionality is not exposed to userland. Current common practices include, for example:

    array_keys($array) === range(0, count($array) - 1)

The memory-optimized way of dealing with this is via foreach, but it's quite cumbersome and, again, you must not use array_keys in the foreach.
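
A minimal sketch of what such a foreach-based check might look like. The helper name is_indexed_array is hypothetical (it is not a PHP built-in), and this is an illustration under the assumption that "indexed" means keys exactly 0 to n-1 in order, rather than the exact code from any particular code base:

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical userland helper (not a PHP built-in): returns true when the
 * keys of $array are exactly 0, 1, ..., count($array) - 1, in that order.
 */
function is_indexed_array(array $array): bool
{
    $expected = 0;

    // Iterate over the array itself, so no separate array of keys is built.
    foreach ($array as $key => $_) {
        if ($key !== $expected) {
            return false; // stop at the first key that is out of place
        }
        $expected++;
    }

    return true;
}

var_dump(is_indexed_array(['a', 'b', 'c']));      // bool(true)
var_dump(is_indexed_array([1 => 'a', 0 => 'b'])); // bool(false)
var_dump(is_indexed_array(['x' => 1]));           // bool(false)
var_dump(is_indexed_array([]));                   // bool(true)
```

The loop returns as soon as a key is out of place, and it never materializes the list of keys or a range to compare against, which is what keeps the memory usage flat.
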
The following example demonstrates that, in the worst case, using range triples the memory usage: https://3v4l.org/FiWdk

Interestingly, using "array_values($array) === $array" is the fastest and most memory-efficient way in the best case, since PHP just returns the array itself when it is "packed" and "without holes". However, this could get hairy in the worst case, since it starts comparing the values as well.

So, it would be nice to have a core PHP function implementing this test, because the userland way of doing it is unnecessarily inefficient. I don't know what the function should be called. In our code base the function is called is_indexed_array, but PHP doesn't really have a standard term for this, afaik.

I regret my lack of C skills, so I can't really propose implementations, but I would be truly appreciative if these suggestions gained some traction.

-- 
Riikka Kalliomäki
https://github.com/Riimu