Newsgroups: php.internals
From: mike@newclarity.net (Mike Schinkel)
To: Nikita Popov
Cc: "internals@lists.php.net"
Date: Fri, 28 May 2021 17:01:29 -0400
Subject: Re: [PHP-DEV] A little syntactic sugar on array_* function calls?
Message-ID: <1A9FD7A8-0BFD-475F-896A-DA8579FC0D9F@newclarity.net>

Hi Nikita,

Thank you for taking the time to explain in detail.

One more question below.

-Mike

> On May 28, 2021, at 10:31 AM, Nikita Popov wrote:
>
> On Fri, May 28, 2021 at 3:11 AM Mike Schinkel wrote:
> > On May 26, 2021, at 7:44 PM, Hendra Gunawan wrote:
> >
> > Hello.
> >
> >>
> >> Yes, but Nikita wrote this note about technical limitations at the
> >> bottom of the repo README:
> >>
> >> Due to technical limitations, it is not possible to create mutable
> >> APIs for primitive types. Modifying $self within the methods is not
> >> possible (or rather, will have no effect, as you'd just be changing
> >> a copy).
> >>
> >
> > If it is solved, this is a great accomplishment for PHP. But I think
> > scalar objects are not going anywhere in the near future. If you are
> > not convinced, please take a look at
> > https://github.com/nikic/scalar_objects/issues/20#issuecomment-569520181
> > .
>
> Nikita's comment actually raises more questions for me, not fewer.
>
> Nikita says "We need to know that $a[$b][$c] is an array in order to
> determine that the call should be performed by-reference. However, we
> already need to convert $a, $a[$b] and $a[$b][$c] into references
> before we know about that."
>
> How then are we able to do the following?:
>
> $a[$b][$c][] = 1;
>
> In this case, we're clearly performing a write operation on the array.
> If you want to know the technical details, the compiler will convert
> this into a sequence of FETCH_DIM_W ops followed by ASSIGN_DIM. The "W"
> bit here is for "write", which will perform all the necessary special
> handling, such as copy-on-write separation and auto-vivification.
>
> How also can we do this:
>
> byref($a[$b][$c]);
> function byref(&$x) {
>     $x[] = 2;
> }
>
> See https://3v4l.org/aPvTD
>
> This is a more complex case. In this case the compiler doesn't know in
> advance whether the argument is passed by value or by reference. What
> happens here is:
>
> 1. INIT_FCALL determines that we're calling byref().
> 2. CHECK_FUNC_ARG for the first arg determines that this argument is
>    passed by-reference for this function.
> 3. FETCH_DIM_FUNC_ARG on the array will perform either a FETCH_DIM_R
>    or a FETCH_DIM_W operation, depending on what CHECK_FUNC_ARG
>    determined.
>
> I assume that in both my examples $a[$b][$c] would be considered an
> "lvalue"[1] and can be a target of assignment triggered by either the
> assignment operator or calling the function and passing to a by-ref
> parameter.
>
> [1] https://en.wikipedia.org/wiki/Value_(computer_science)#Assignment:_l-values_and_r-values
>
> So is there a reason that -> on an array could not trigger the same?
> Is Nikita saying that the performance of those calls performed
> by-reference would not matter because they are always being assigned,
> at least in the former case, but to do so with array expressions would
> be problematic? (Ignoring that there is no code in the wild that
> currently uses the -> operator on arrays, or does that matter?)
>
> Note that the byref($a[$b][$c]) case only works because we know which
> function is being called at the time the argument is passed. If you
> have $a[$b][$c]->test() we need to pass $a[$b][$c] by reference
> (FETCH_DIM_W) or by value (FETCH_DIM_R) depending on whether
> $a[$b][$c]->test() accepts the argument by-value or by-reference. But
> we can only know that once we have already evaluated $a[$b][$c] and
> found out that it is indeed an array.
>
> The only way around this is to *always* perform a for-write fetch of
> $a[$b][$c], even though we don't know that the end result is going to
> be an array. However, doing so would pessimize the performance of code
> operating on objects. Consider $some_huge_shared_array[0]->foo(). If we
> fetch $some_huge_shared_array for write, we'll be required to perform a
> full duplication of the array in preparation for a possible future
> write. If it turns out that $some_huge_shared_array[0] is actually an
> object, or that $some_huge_shared_array[0] is an array and the
> performed operation is by-value, then we have performed this copy
> unnecessarily.
>
> I don't believe this is acceptable.
>
> I ask honestly to understand, and not as a rhetorical question.
>
> Additionally, if the case of updating an array variable is not a
> problem but updating an array expression is a problem then why not just
> limit the -> operator to only work on expressions for immutable methods
> and require variables for mutable methods? I would think it should be
> easy enough to throw an error for those specific "methods" that would
> be mutable, such as shift() and unshift(), if
> $a[$b][$c]->shift('foo') were called?
>
> There are externalities associated even with the simple $x->foo() case,
> though they are less severe. They primarily involve reduced ability to
> analyze code in opcache.
>
> In either case, this limitation does not seem reasonable to me from a
> language design perspective. If $a->push($b) works, then
> $a[$k]->push($b) can reasonably be expected to work as well.
>
> Or maybe just limit the -> operator to array variables entirely, and
> for consistency do not support it on any array expression. There is
> already precedent in PHP for operators that work on variables and not
> on expressions: ++, --, and &.
>
> IF we can get a thumbs up from Nikita that one of these would actually
> be possible then I think the next step should be to write up a list of
> proposed array methods that would be implemented to support the ->
> operator with arrays and put them in an RFC, and to flesh out any edge
> cases.
>
> The only correct way to resolve this issue is to not support mutable
> operations.

I don't think I agree that this is the only correct way, but I respect
your position of authority on the matter.

> I don't think there's much need for mutable operations. sort() and
> shuffle() would be best implemented by returning a new array instead.
> array_push() is redundant with $array[]. array_shift() and
> array_unshift() should never be used.

Why do you say array_shift() and array_unshift() should never be used?
When I wrote the above questions the use-case I was thinking about most
was $a->unshift($value), as I use array_unshift() more than most of the
other array functions.

Do you mean that these, if applied as "methods" to an array, should not
be used mutably (meaning in-place is bad but returning an array value
that has been shifted would be okay), or do you have some other reason
you believe that shifting an array is bad? Note the reason I have used
them in the past is when I need to pass an array to a function written
by someone else that expects the array to be ordered.

Also, what about very large arrays? I assume (which could be a bad
assumption) that PHP internally can be more efficient about how it
handles array_unshift() instead of just duplicating the large array so
as to add an element at the beginning?

-Mike
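
For anyone following the in-place versus new-array distinction discussed above, here is a minimal PHP sketch of the two styles. The helper names with_unshifted() and unshift_in_place() are hypothetical and exist only for illustration; array_unshift(), by-reference parameters and the spread operator are the only real PHP features used. It assumes list-style arrays with integer keys, and it says nothing about how the engine implements array_unshift() internally or what either style costs on very large arrays.

```php
<?php
declare(strict_types=1);

// Hypothetical helper (not part of PHP): returns a new list with $value
// prepended and leaves the caller's array untouched.
function with_unshifted(array $items, $value): array
{
    return [$value, ...$items];
}

// Hypothetical helper: changes the caller's array in place through a
// by-reference parameter, which is also how array_unshift() is declared.
function unshift_in_place(array &$items, $value): void
{
    array_unshift($items, $value);
}

// Immutable style: the original is unchanged, the result is a new array.
$original = ['b', 'c'];
$copy = with_unshifted($original, 'a');

// Mutable style: the variable passed in is modified.
$mutable = ['b', 'c'];
unshift_in_place($mutable, 'a');

// By-reference passing also works on a nested array expression, because
// the function being called is known when the argument is fetched.
$nested = ['x' => ['y' => ['b']]];
unshift_in_place($nested['x']['y'], 'a');
```

After this runs, $original still holds its two elements, while $copy, $mutable and $nested['x']['y'] each have 'a' at the front. The nested call is the byref($a[$b][$c]) situation from the quoted discussion; a hypothetical $nested['x']['y']->unshift('a') differs precisely because the callee is not known until the expression has been evaluated.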