Newsgroups: php.internals
Date: Fri, 28 May 2021 16:31:18 +0200
From: nikita.ppv@gmail.com (Nikita Popov)
To: Mike Schinkel
Cc: Hendra Gunawan, "internals@lists.php.net"
Subject: Re: [PHP-DEV] A little syntactic sugar on array_* function calls?

On Fri, May 28, 2021 at 3:11 AM Mike Schinkel wrote:

> > On May 26, 2021, at 7:44 PM, Hendra Gunawan wrote:
> >
> > Hello.
> >
> >>
> >> Yes, but Nikita wrote this note about technical limitations at the
> >> bottom of the repo README:
> >>
> >> Due to technical limitations, it is not possible to create mutable APIs
> >> for primitive types. Modifying $self within the methods is not possible
> >> (or rather, will have no effect, as you'd just be changing a copy).
> >>
> >
> > If it is solved, this is a great accomplishment for PHP. But I think
> > scalar object is not going anywhere in the near future.
> > If you are not convinced, please take a look
> > https://github.com/nikic/scalar_objects/issues/20#issuecomment-569520181
>
> Nikita's comment actually causes me more questions, not fewer.
>
> Nikita says "We need to know that $a[$b][$c] is an array in order to
> determine that the call should be performed by-reference. However, we
> already need to convert $a, $a[$b] and $a[$b][$c] into references before we
> know about that."
>
> How then are we able to do the following?:
>
> $a[$b][$c][] = 1;
>

In this case, we're clearly performing a write operation on the array. If
you want to know the technical details, the compiler will convert this into
a sequence of FETCH_DIM_W ops followed by ASSIGN_DIM. The "W" bit here is
for "write", which will perform all the necessary special handling, such as
copy-on-write separation and auto-vivification.

> How also can we do this:
>
> byref($a[$b][$c]);
> function byref(&$x) {
>    $x[]= 2;
> }
>
> See https://3v4l.org/aPvTD
>

This is a more complex case. In this case the compiler doesn't know in
advance whether the argument is passed by value or by reference. What
happens here is:

1. INIT_FCALL determines that we're calling byref().
2. CHECK_FUNC_ARG for the first arg determines that this argument is passed
   by-reference for this function.
3. FETCH_DIM_FUNC_ARG on the array will perform either a FETCH_DIM_R or a
   FETCH_DIM_W operation, depending on what CHECK_FUNC_ARG determined.

> I assume that in both my examples $a[$b][$c] would be considered an
> "lvalue"[1] and can be a target of assignment triggered by either the
> assignment operator or calling the function and passing to a by-ref
> parameter.
>
> [1]
> https://en.wikipedia.org/wiki/Value_(computer_science)#Assignment:_l-values_and_r-values
>
> So is there a reason that -> on an array could not trigger the same?
> Is Nikita saying that the performance of those calls performed by-reference
> would not matter because they are always being assigned, at least in the
> former case, but to do so with array expressions would be problematic?
> (Ignoring there is no code in the wild that currently uses the -> operator,
> or does that matter?)
>

Note that the byref($a[$b][$c]) case only works because we know which
function is being called at the time the argument is passed. If you have
$a[$b][$c]->test() we need to pass $a[$b][$c] by reference (FETCH_DIM_W) or
by value (FETCH_DIM_R) depending on whether $a[$b][$c]->test() accepts the
argument by-value or by-reference. But we can only know that once we have
already evaluated $a[$b][$c] and found out that it is indeed an array.

The only way around this is to *always* perform a for-write fetch of
$a[$b][$c], even though we don't know that the end result is going to be an
array. However, doing so would pessimize the performance of code operating
on objects. Consider $some_huge_shared_array[0]->foo(). If we fetch
$some_huge_shared_array for write, we'll be required to perform a full
duplication of the array in preparation for a possible future write. If it
turns out that $some_huge_shared_array[0] is actually an object, or that
$some_huge_shared_array[0] is an array and the performed operation is
by-value, then we have performed this copy unnecessarily. I don't believe
this is acceptable.

> I ask honestly to understand, and not as a rhetorical question.
>
> Additionally, if the case of updating an array variable is not a problem
> but updating an array expression is a problem then why not just limit the
> -> operator to only work on expressions for immutable methods and require
> variables for mutable methods? I would think it should be easy enough to
> throw an error for those specific "methods" that would be mutable, such as
> shift() and unshift() if $a[$b][$c]->shift('foo') were called?
>

There are externalities associated even with the simple $x->foo() case,
though they are less severe. They primarily involve reduced ability to
analyze code in opcache. In either case, this limitation does not seem
reasonable to me from a language design perspective. If $a->push($b) works,
then $a[$k]->push($b) can reasonably be expected to work as well.

> Or maybe just completely limit using the -> operator on array variables.
> Don't work on any array expressions for consistency. There is already
> precedent in PHP for operators that work on variables and not on
> expressions: ++, --, and &.
>
> If we can get a thumbs up from Nikita that one of these would actually be
> possible then I think the next step should be to write up a list of
> proposed array methods that would be implemented to support the -> operator
> with arrays and put them in an RFC, and to flesh out any edge cases.
>

The only correct way to resolve this issue is to not support mutable
operations. I don't think there's much need for mutable operations. sort()
and shuffle() would be best implemented by returning a new array instead.
array_push() is redundant with $array[]. array_shift() and array_unshift()
should never be used. array_pop() and array_splice() are the only sensible
mutable array methods that come to mind, and I daresay we can do without
them.

Regards,
Nikita
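
P.S. For readers who want to see the user-visible side of the fetch modes discussed in this message, here is a minimal sketch. It assumes stock PHP 8 semantics. The variable names ($shared, $copy, $probe) are made up for illustration, and the opcode names in the comments are simply the ones mentioned in the message. The script only demonstrates observable behaviour, not the opcodes themselves.

```php
<?php
// Illustrative sketch only; all variable names are hypothetical.

function byref(&$x) {
    $x[] = 2;
}

// 1. Direct write (FETCH_DIM_W ... ASSIGN_DIM in the message): the missing
//    intermediate arrays are created on the way down (auto-vivification).
$a = [];
$a['b']['c'][] = 1;

// 2. By-reference argument: byref() declares &$x, so the same nested
//    expression is fetched for write and the append happens in place.
byref($a['b']['c']);

// 3. Copy-on-write separation: $copy shares storage with $shared until the
//    first write through $copy, which leaves $shared untouched.
$shared = ['k' => [1, 2, 3]];
$copy = $shared;
$copy['k'][] = 4;

// 4. Read-style fetch: probing a missing key with ?? creates nothing.
$probe = [];
$missing = $probe['nope']['deeper'] ?? null;

var_dump($a, $shared, $copy, $probe, $missing);
```

After this runs, $a['b']['c'] holds [1, 2], $shared['k'] is still [1, 2, 3] while $copy['k'] is [1, 2, 3, 4], and $probe is still an empty array. Case 3 is the one that matters for the performance argument: the separation is invisible in the result, but it is the step that would require duplicating a large shared array if every fetch had to be for-write.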