Newsgroups: php.internals
Date: Tue, 20 Aug 2024 01:21:22 +0200
From: rob@bottled.codes ("Rob Landers")
To: internals@lists.php.net
Subject: Re: [PHP-DEV] function autoloading v4 RFC
Message-ID: <97d993ee-c558-43a1-a36a-8e06aa2d829a@app.fastmail.com>
In-Reply-To: <6197f6e1-d123-41e1-87c8-887c6508bfa2@rwec.co.uk>

On Mon, Aug 19, 2024, at 23:17, Rowan Tommins [IMSoP] wrote:
> On 19/08/2024 17:23, Rob Landers wrote:
>
> > As far as performance for ambiguous functions go, I was thinking of
> > submitting an RFC, where ambiguous function calls are tagged during
> > compilation and always resolve lexically, sorta like how it works now:
> >
> > echo strlen($x); // resolves to global, always
> > require_once "my-strlen";
> > echo strlen($x); // resolves to my strlen, always
> >
> > This works by basically rewriting the function name once resolved and
> > may make the code more predictable. If I can pull it off, it can be
> > relegated to a technical change that doesn’t need an RFC. Still
> > working on it.
>
>
> I'm not entirely clear what you're suggesting, but I think it might be
> either what already happens, or the same as Gina is proposing.

Yeah, that would be a separate RFC, so I didn't go into the weeds, but
my point is that it would result in no change. But here we go :D

> Consider this code [https://3v4l.org/184k3]:
>
> namespace Foo;
>
> foreach ( [1,2,3] as $i ) {
>     echo strlen('hello'), ' ';
>     shadow_strlen();
>     echo strlen('hello'), '; ';
> }
>
> function shadow_strlen() {
>     if ( ! function_exists('Foo\\strlen') ) {
>         function strlen($s) {
>             return 42;
>         }
>     }
> }
>
> In PHP 5.3, this outputs '5 42; 42 42; 42 42;'  That's fairly
> straight-forward: each time the function is called, the engine checks if
> "Foo\strlen" is defined.
>
> Since PHP 5.4, it outputs '5 42; 5 42; 5 42;'  The engine caches the
> result of the lookup against each compiled opcode, so the first strlen()
> is cached as a call to \strlen() and the second as a call to \Foo\strlen().
>
> As I understand it, Gina is proposing that it would instead output '5 5;
> 5 5; 5 5;' - the function would be "pinned" by making "\Foo\strlen" an
> alias to "\strlen" for the rest of the program, and the function_exists
> call would immediately return true.
>
> Neither, as far as I can see, can happen at compile time, because the
> compiler doesn't know if and when a definition of \Foo\strlen() will be
> encountered.

To be fair, this isn't even really completely figured out yet. I was
mostly wanting to point out that it maybe could be a totally separate
issue. But the gist is that at compile time, we can mark a function as
"ambiguous," meaning we don't really know if the function exists
(because it isn't fully qualified).
The main issue with Gina's implementation (if I am understanding it
properly, and I potentially am not, so take this with a grain of salt)
is that this (https://3v4l.org/0jCpW) could fail if it were pinned,
where before, it would not.

In my idea, a function becomes pinned until it isn't, with strict rules
that kick it out of being pinned, and I think those rules are complex
enough to warrant being a completely different RFC or PR.

>
>
> > In other words, maybe pinning could be solved more generally in a
> > future RFC, decrease your RFC’s scope and chance for sharp edge cases.
>
> If anything, I think it would need to be the other way around: change
> the name resolution logic, so that an autoloading proposal was more
> palatable, because it didn't require running the autoloader an unbounded
> number of times for the same name.
>
> Proposing both at once seems reasonable, as the autoloading gives an
> extra benefit to outweigh the breaking change to shadowing behaviour.
>
> Regards,

I assume you are worried about something like this passing test?

--TEST--
show called only once
--FILE--
<?php

namespace test;

spl_autoload_register(function($name) {
    echo "name=$name\n";
}, true, false, SPL_AUTOLOAD_FUNCTION);

echo strlen('foo');
echo strlen('bar');
echo strlen('baz');
?>
--EXPECT--
name=test\strlen
333

In my RFC, I mention it is called exactly once. I could maybe add it as
a test in the PR. I've committed it as another test on the RFC
implementation.

>
> --
> Rowan Tommins
> [IMSoP]
>

— Rob
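
For anyone skimming the thread, here is a minimal sketch of what
"ambiguous" (not fully qualified) means for a call like strlen(). The
namespace Demo and its strlen() are made up for illustration; what it
shows is PHP's existing namespace fallback, not anything taken from the
RFC implementation:

```php
<?php
namespace Demo;

// Hypothetical shadowing function, declared in the Demo namespace.
function strlen(string $s): int
{
    return 42;
}

// Unqualified call: the engine looks for Demo\strlen() first and only
// falls back to the global strlen() if that function does not exist.
$unqualified = strlen('hello');

// Fully qualified call: always the global function, nothing to resolve.
$qualified = \strlen('hello');

echo $unqualified, ' ', $qualified, "\n";
```

Only the unqualified call depends on whether Demo\strlen() has been
defined by the time it runs, so that is the kind of call that would get
tagged.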