Date: Tue, 1 Jun 2021 18:34:08 -0400
From: mike@newclarity.net (Mike Schinkel)
To: Nikita Popov
Cc: "internals@lists.php.net"
Subject: Regarding array_shift()/array_unshift()

Hi Nikita,

>> On May 28, 2021, at 10:31 AM, Nikita Popov wrote:
>
>> I don't think there's much need for mutable operations. sort() and shuffle() would be best implemented by returning a new array instead. array_push() is redundant with $array[]. array_shift() and array_unshift() should never be used.
>
> Why do you say array_shift() and array_unshift() should never be used?
> When I wrote the above questions the use-case I was thinking about most was $a->unshift($value), as I use array_unshift() more than most of the other array functions.
>
> Do you mean that these, if applied as "methods" to an array, should not be used mutably (meaning in-place is bad but returning an array value that has been shifted would be okay), or do you have some other reason you believe that shifting an array is bad? Note the reason I have used them in the past is when I need to pass an array to a function written by someone else that expects the array to be ordered.
>
> Also, what about very large arrays? I assume (which could be a bad assumption) that PHP internally can be more efficient about how it handles array_unshift() instead of just duplicating the large array so as to add an element at the beginning?
>
> Arrays only support efficient push/pop operations. Performing an array_shift() or array_unshift() requires going through the whole array to reindex all the keys, even though you're only adding/removing one element. In other words, array_shift() and array_unshift() are O(n) operations, not O(1) as one would intuitively expect. If you use shift/unshift as common operations, you're better off using a different data-structure or construction approach.

I appreciate your explanation regarding array_shift()/array_unshift().

It left me curious, though, to see how those functions have been used in userland, and what use-cases they frequently solve.

I decided to focus on array_unshift() because I have found reason to use it numerous times in the past but rarely use array_shift().
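
To make the reindexing in the quoted explanation concrete, here is a minimal sketch; the array contents are made up for illustration. Integer keys are renumbered from zero after an unshift or a shift, while appending and popping leave the existing keys alone:

```php
<?php
// Made-up data: a list with non-sequential integer keys.
$list = [10 => 'a', 20 => 'b', 30 => 'c'];

// Prepending renumbers every integer key from 0, so each element is visited.
array_unshift($list, 'z');
$afterUnshift = $list;   // [0 => 'z', 1 => 'a', 2 => 'b', 3 => 'c']

// Removing the first element renumbers the remaining keys again.
$first = array_shift($list);
$afterShift = $list;     // [0 => 'a', 1 => 'b', 2 => 'c']

// Appending and popping only touch the end; existing keys stay as they are.
$sparse = [10 => 'a', 20 => 'b'];
$sparse[] = 'c';              // stored under key 21
$afterPush = $sparse;
$last = array_pop($sparse);   // 'c'; keys 10 and 20 are unchanged
```

This only shows the observable effect on keys, not how the engine implements it.
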
I searched some of the most popular PHP projects on GitHub and found the following number of uses of array_unshift() (the last project is one of yours):

Phan: 10
https://github.com/search?q=array_unshift+language%3APHP+repo%3Aphan%2Fphan&type=Code

Phabricator: 27
https://github.com/search?l=&p=1&q=array_unshift+language%3APHP+repo%3Aphacility%2Fphabricator&type=Code

Laravel: 8
https://github.com/search?q=array_unshift+language%3APHP+repo%3Alaravel%2Fframework&type=Code

Symfony: 34
https://github.com/search?q=array_unshift+language%3APHP+repo%3Asymfony%2Fsymfony&type=Code

PHP-CS-Fixer: 8
https://github.com/search?q=array_unshift+language%3APHP+repo%3AFriendsOfPHP%2FPHP-CS-Fixer&type=Code

Composer: 7
https://github.com/search?q=array_unshift+language%3APHP+repo%3Acomposer%2Fcomposer&type=Code

Statamic: 2
https://github.com/search?q=array_unshift+language%3APHP+repo%3Astatamic%2Fcms&type=Code

Magento2: 75
https://github.com/search?q=array_unshift+language%3APHP+repo%3Amagento%2Fmagento2&type=Code

WordPress: 22
https://github.com/search?q=array_unshift+language%3APHP+repo%3AWordPress%2Fwordpress-develop&type=Code

PHPMailer: 1
https://github.com/search?q=array_unshift+language%3APHP+repo%3APHPMailer%2FPHPMailer&type=Code

PHPBackporter: 1
https://github.com/nikic/PHP-Backporter/blob/master/lib/PHPBackporter/Converter/Closure.php#L100

From a manual review of those usages I have found roughly the following use-case categories:

------

Testing: It seems a lot of tests use array_unshift(), for a variety of pre- and post-test uses.

Path Manipulation: Such as when relative paths are exploded, the base path is inserted, and the path segments are imploded.

Ordered Processing/Filtering: Where a list of elements is filtered to insert values to be processed first, such as loading "plugins", inserting regular expressions for URL route processing, "policy modes", adding optional positional arguments to default values for a "builder" (command line, SQL, etc.), inserting custom menu items in front of default menu items, inserting higher-priority middleware, inserting column definitions to be displayed, inserting custom paths to look for theme templates in front of default paths, inserting mime-types into a list of default acceptable mime-types, inserting directories for autoloaders, etc.

Parsers/Tokenizers: When writing a parser or tokenizer and there is a need to insert a value or token at the head of the list.

Parameter Delegation: When a function accepts a list of parameters, especially when they include variadics, and then needs to pass on the same set of parameters to another function, such as with call_user_func_array(), sprintf(), etc. Additional similar uses are to insert $this into the return value of func_get_args() to allow delegation to a global function, also using call_user_func_array().

------

There may be more, but in general those were the main reasons I found for using array_unshift() that I am not sure could be replaced by another data structure to any advantage. It's hard to replace an array with something else when your source for the data is a function that returns an array. (BTW, Phabricator uses array_unshift() more than any other, but uses it in contexts that would be easily replaced with something more performant. Ironic given that the project is no longer being maintained.)

So I was also intrigued by your statement that "you're better off using a different data-structure or construction approach", since I think we mostly use these functions when we have to use an array, not because we chose to.
That is the case when we are either given an array by some function we are not in control of, or when we need to pass a value to a function we are not in control of that expects an ordered array. For the above use-cases I would really like to know if you can envision other data structures or constructions being better and, if so, what they are.

Also, almost every one of the use-cases mentioned above generally deals with very short arrays, such as parameter delegation. Is O(n) really problematic when n is rather small?

-----

But from all this I do agree with you that just returning an array would likely be acceptable.

-----

This would not be a great solution for really large arrays, but then we can't eliminate the need for a developer to have a reasonable level of knowledge, right? So:

    $unshifted = $array->unshift($new_element);
    $shifted = $array->shift();

But we could also possibly use better names for these:

    $right_shifted = $array->shift_right(...$new_element(s));
    $left_shifted = $array->shift_left([element_count]);

OR

    $prepended = $array->prepend(...$new_element(s));
    $prebated = $array->prebate([element_count]);

-Mike
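
P.S. Until something like the method syntax above exists, the "return a new array" behaviour can be approximated in userland. A minimal sketch, assuming hypothetical helper names with_prepended() and without_first() (neither is a PHP built-in nor part of any proposal); it relies on PHP passing arrays by value, so the caller's array is not modified:

```php
<?php
/**
 * Hypothetical helper: returns a copy of $array with $values added at the front.
 * $array is a by-value parameter, so the caller's array is left untouched.
 */
function with_prepended(array $array, ...$values): array
{
    array_unshift($array, ...$values);
    return $array;
}

/**
 * Hypothetical helper: returns a copy of $array without its first $count elements.
 * array_slice() renumbers integer keys by default, as array_shift() does.
 */
function without_first(array $array, int $count = 1): array
{
    return array_slice($array, $count);
}

// Made-up example in the spirit of parameter delegation.
$args = ['b', 'c'];
$prepended = with_prepended($args, 'a');   // ['a', 'b', 'c']
$shifted = without_first($prepended);      // ['b', 'c']
// $args is still ['b', 'c']
```

Both helpers still cost O(n) per call, so this only changes the calling style, not the performance characteristics discussed above.
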