Newsgroups: php.internals
Date: Sun, 23 Feb 2020 02:03:38 -0500
From: mike@newclarity.net (Mike Schinkel)
To: Rowan Tommins
Cc: PHP internals
Subject: Re: [PHP-DEV] [RFC] Explicit call-site pass-by-reference (again)
Message-ID: <03009039-9B2B-49BC-9CFF-CFB77BEFE2DC@newclarity.net>
In-Reply-To: <73831D4E-B53C-4350-BDAB-2F5BB31E2E8D@gmail.com>
References: <8B2AFC37-9425-440C-B89D-61CBAAB0CDDD@gmail.com> <08159EA6-61E5-4E85-AC43-9637783315EA@newclarity.net> <73831D4E-B53C-4350-BDAB-2F5BB31E2E8D@gmail.com>

> On Feb 23, 2020, at 2:00 AM, Mike Schinkel wrote:
> On Feb 22, 2020, at 5:56 AM, Rowan Tommins wrote:
>> One of the reasons it is confusing is because
>> developers are currently
>> required to use the ampersand in one place and not the other. Making
>> it always used removes said confusion as they would no longer be a
>> reason to have to remember when and when not to use the ampersand
>> anymore.
>
> Maybe. I think a larger part of it is that references themselves are a slightly confusing concept, and the fact that & looks like an operator of its own (and is often documented that way) but is really an annotation on other operators/commands. That is, the & in $foo = &$bar and return &$bar doesn't modify $bar, it modifies = and return, respectively.
>
> Making the rules more logical and symmetrical would perhaps be more helpful to new users than it is to established users, particularly those who've known multiple versions of the language already.

You call out as problematic the fact that the ampersand is viewed as an operator acting on a variable, but that is already baked into current PHP, is not going to change any time soon (if ever), and is orthogonal to this RFC.

So whether or not people find the ampersand operator confusing is irrelevant to the debate posed by Nikita's RFC, which is whether we should make the use of the ampersand for passing by reference more consistent.

>> There is a potential "PR" cost of this change that should be weighed
>> against the advantages.
>>
>> To say "We fixed something that in hindsight we've since determined was
>> a problem." How is this a concern?
>
> The concern is that the costs will be much more visible to users than the benefits, and they will resent the core developers pushing that requirement onto them, rather than thanking then for their hard work.
>
> As I said, that's not an absolute reason not to do it, it's a cost to be weighed.

Nikita's RFC proposes that the ampersand would be optional at the calling site, so is it really a concern that developers will resent something that is optional?
Yes, Nikita mentioned that a future "edition" might make it a requirement, but even then it will still be optional (developers can choose not to use the new edition), and I think the resentment will come more from the concept of forcing an "edition" on developers than from any specific feature. Note that I plan to post soon about how I think we can alleviate that.

So we can debate the PR "cost" of requiring ampersands at the call site when an RFC that requires them is on the table.

As a side note, I remember thinking "WTF?!?" when the requirement to use an ampersand at the calling site was removed. It is possible your analysis of the PR cost is discounting the potentially large number of people who will think adding it back is a good thing.

>>> I'm also not very keen on internal functions being able to do things
>>> that can't be replicated on userland, and this RFC adds two: additional
>>> behaviour for existing "prefer-ref" arguments, and new "prefer-value"
>>> arguments
>>
>> So what specific problems would having these enhancement cause for the
>> language?
>
> There are two problems I have with internal-only features in general: the inability to polyfill and extend, and the requirement for a separate mental model.
>
> As an example of the first, the RFC mentions using call_user_func with a call-site annotation to forward the parameter by reference. The reason for allowing that also applies to a user-defined wrapper like call_with_current_user or call_with_swapped_parameters, but there's no syntax for those to be marked "prefer-val".

Let's analyze.

In this case there does not appear to be a need for "prefer-val."
And Nikita's RFC adds functionality we currently do not have (the ability to pass by reference to call_user_func()), so that is a win over the status quo, as it gains a feature that was previously internal-only:

    [...] $f ) {
            $args[$i]++;
        }
    }

    function call_with_current_user(Callable $callable, int &...$args ) {
        array_unshift(&$args,current_user());
        $temp_args = $args;
        $result = call_user_func_array( $callable, &$temp_args );
        foreach( $temp_args as $i => $t ) {
            if ( func_is_byref_arg( $i, $args ) ) {
                // copy the updated value back into the by-ref argument
                $args[$i] = $temp_args[$i];
            }
        }
        return $result;
    }

    $foo = 0;
    $bar = 0;
    $baz = 0;
    call_with_current_user( 'foobar', &$foo, $bar, $baz );
    echo $foo; // prints 1
    echo $bar; // prints 0
    echo $baz; // prints 0

In my example func_is_byref_arg($pos[,$variadic_arg]):bool accepts one parameter if you are checking for by-ref positionally, and two if you are introspecting a variadic parameter.

So I argue we should fill in the holes of an RFC that introduces a feature to help developers write more robust code, instead of declining the RFC for imperfections in its first draft.

> As an example of the second, even under strict settings, calls to certain internal functions will have an optional & at the call site, which changes their behaviour.
>
> To those without knowledge of the core, those functions simply have to be remembered as "magic", because their behaviour can't be modelled as part of the normal language.

I am unclear how the optional ampersand at the call site will change the behavior.

As I understand the RFC, the behavior will still be driven by the ampersand at the declaration site. The presence or absence of an ampersand at a call site will merely be decoration that allows developers to better convey their intent.

Can you please give an example of how, under this RFC, behavior at a call site with the ampersand would differ from behavior at the same call site without it?
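
To make the point about the declaration site concrete, here is a minimal sketch in current PHP (no call-site marker involved). The function names increment_by_ref and increment_by_val are made up for illustration and are not from the RFC; the only difference between them is the & in the parameter declaration, and the two calls look identical to a reader of the calling code.

```php
<?php
// Hypothetical helper: the & in the declaration makes $n an alias of
// the caller's variable.
function increment_by_ref(int &$n): void {
    $n++; // modifies the caller's variable
}

// Hypothetical helper: same body, but $n is a local copy.
function increment_by_val(int $n): void {
    $n++; // modifies only the local copy
}

$a = 0;
$b = 0;

// Nothing at either call site tells the reader which one mutates.
increment_by_ref($a);
increment_by_val($b);

echo $a, "\n"; // prints 1
echo $b, "\n"; // prints 0
```

This only shows how declaration-site references behave today; it says nothing about how the RFC's "prefer-ref" arguments would behave.
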
>>> My current opinion is that I'd rather wait for the details of out and
>>> inout parameters to be worked out, and reap higher gains for the same
>>> cost. For instance, if preg_match could mark $matches as "out", I'd be
>>> more happy to run in a mode where I needed to add a call-site keyword.
>>
>> This sounds like preferring perfect in the (potentially distant) future
>> vs. much better today.
>
> No, it's preferring to hold out for a little bit more value to weigh against my evaluation of the cost.
>
> This is, when followed through to its conclusion of mandatory marking, a disruptive change to every piece of code, so we need to decide if the disruption is worth it.

That is disingenuous. The RFC does not require mandatory use, period.

The "cost" you worry about will not exist unless and until a future RFC proposes to make it mandatory and that RFC is accepted.

Further, your cost analysis does not appear to consider the cost of the status quo and this RFC's ability to reduce that cost.

Using Nikita's RFC example, there is a potential real-world cost to getting the following wrong in a userland project:

    $ret = array_slice($array, 0, 3);
    $ret = array_splice($array, 0, 3);

With Nikita's RFC developers could choose to start using the ampersand at the calling site for these types of functions. Let's consider that I write the following:

    array_slice(&$array, 0, 3);
    array_splice(&$array, 0, 3);

With this RFC (I assume) an error could be generated on array_slice(&$array, 0, 3), saying that I cannot pass the array by reference, because array_slice() takes its array by value while array_splice() takes it by reference. Today we don't get that. This alone could reduce errors that I have seen in source code and, admittedly, have committed myself.

Said succinctly, there is a (IMO significant) cost to doing nothing that your analysis appears to ignore.
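
For reference, a minimal sketch of how the two functions differ in current PHP, where neither call carries a call-site marker. The sample array and the variable names ($first_three, $after_slice, $removed) are made up for illustration.

```php
<?php
$array = [1, 2, 3, 4, 5];

// array_slice() takes the array by value: it returns a copy of the
// requested range and leaves $array untouched.
$first_three = array_slice($array, 0, 3);
$after_slice = $array;

// array_splice() takes the array by reference: it removes the range
// from $array and returns the removed elements.
$removed = array_splice($array, 0, 3);

print_r($first_three); // 1, 2, 3
print_r($after_slice); // 1, 2, 3, 4, 5 (unchanged)
print_r($removed);     // 1, 2, 3
print_r($array);       // 4, 5 (modified in place)
```

Both calls return the same three elements here, which is why mixing them up is easy to miss: only the side effect on $array differs.
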
> It's also the second change in the same place, and we should be sure that we've got it right this time, and won't require a third change in the near future.

I don't particularly see a problem with requiring a third change in the future. Hindsight is a wonderful clarifier. And I believe elsewhere you have been debating me over the need for incremental change. Caveat emptor.

> For instance, if out parameters were added, would the same line of code end up going from optional &, to forbidden &, to mandatory &, to mandatory "out"?

My view is that we should actually hash those concerns out and move forward, rather than state them in the abstract and let the fact that legitimate concerns *might* exist derail an improvement to the language.

Since the possibilities are not infinite, let's just address your specific concerns here. My straw man proposal is that if an `out` keyword is added, a developer could use either `out` or `&` but not both. Then in a future "edition" of PHP we could disallow `&` if enough people agree that is better. Or we could leave it as either/or. Allowing the ampersand at a call site today does not block a potential future `out` keyword AFAICT.

For me, I don't care which it is as long as there is a calling-site notation that allows a developer to write code indicating intent and allows other developers to read the code and see that intent.

Staying with the status quo while waiting for some future potential that may not arrive for years does not get us there in the near term, but Nikita's RFC would.

> I'm not strongly against the idea, but the advantages just don't feel quite strong enough, so if I had a vote, I'd currently be inclined to vote no.

Heh. My vote would be yes. And since neither of us has a vote, I guess it would be applicable to say our votes cancel each other out. Or not. :-D

-Mike