Newsgroups: php.internals
Date: Sat, 24 Aug 2024 11:00:13 +0200
From: rob@bottled.codes ("Rob Landers")
To: internals@lists.php.net
Subject: Re: [PHP-DEV] [Concept] Flip relative function lookup order (global, then local)

On Fri, Aug 23, 2024, at 23:57, Ilija Tovilo wrote:
> On Fri, Aug 23, 2024 at 9:41 PM Rowan Tommins [IMSoP] wrote:
> >
> > On 23 August 2024 18:32:41 BST, Ilija Tovilo <tovilo.ilija@gmail.com> wrote:
> > >IMO, 1. is too drastic. As people have mentioned, there are tools to
> > >automate disambiguation. But unless we gain some other benefit from
> > >dropping the lookup entirely, why do it?
> >
> > I can think of a few disadvantages of "global first":
> >
> > - Fewer code bases will be affected, but working out which ones is harder. The easiest migration will probably be to make sure all calls to namespaced functions are fully qualified, as though it was "global only".
>
> To talk about more concrete numbers, I now also analyzed how many
> relative calls to local functions there are in the top 1000 composer
> packages.
>
> https://gist.github.com/iluuu1994/9d4bbbcd5f378d221851efa4e82b1f63
>
> There were 4229 calls to local functions that were statically visible.
> Of those, 1534 came from thecodingmachine/safe, which I'm excluding
> again for a fair comparison. The remaining 2695 calls were split
> across 210 files and 27 repositories, which is less than I expected.
>
> The calls that need to be fixed by swapping the lookup order are a
> subset of these calls, namely only the ones also clashing with some
> global function. Hence, the process of identifying them doesn't seem
> fundamentally different. Whether the above are "few enough" to justify
> the BC break, I don't know.
>
> > - The engine won't be able to optimise calls where the name exists locally but not globally, because a userland global function could be defined at any time.
>
> When relying on the lookup, the lookup will be slower. But if the
> hypothesis is that there are few people relying on this in the first
> place, it shouldn't be an issue. It's also worth noting that many of
> the optimizations don't apply anyway, because the global function is
> also unknown and hence a user function, with an unknown signature.
>
> > - Unlike with the current way around, there's unlikely to be a use case for shadowing a namespaced name with a global one; it will just be a gotcha that trips people up occasionally.
>
> Indeed. But this is a downside of both these approaches.
>
> > None of these seem like showstoppers to me, but since we can so easily go one step further to "global only", and avoid them, why wouldn't we?
> >
> > Your answer to that seems to be that you think "global only" is a bigger BC break, but I wonder how much difference it really makes. As in, how many codebases are using unqualified calls to reference a namespaced function, but *not* shadowing a global name?
>
> I hope this provides some additional insight. Looking at the analysis,
> I'm not completely opposed to your approach. There are some open
> questions. For example, how do we handle functions declared and called
> in the same file?
>
> namespace Foo;
> function bar() {}
> bar();
>
> Without a local fallback, it seems odd for this call to fail. An
> option might be to auto-use Foo\bar when it is declared, although that
> will require a separate pass over the top functions so that functions
> don't become order-dependent.
>
> Ilija


Hey Ilija,

I'm actually coming around to global first, then local. I don't have statistically significant results yet, but preliminary results show that global first gives Symfony/Laravel their speed boost, and function autoloading gives things like WordPress theirs. Everyone wins.

The function autoloader is only called for the local check, so the lookup looks roughly like this:

  1. does it exist in the global namespace?
    1. yes: load the function; done.
    2. no: continue
  2. does it exist in the local namespace?
    1. yes: load the function; done.
    2. no: continue
  3. call the autoloader for the local namespace.
  4. does it exist in the local namespace?
    1. yes: load the function; done.
    2. no: continue
  5. does it exist in the global namespace?
    1. yes: load the function; done.
    2. no: continue
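
To make that order concrete, here is a rough userland simulation. It is only a sketch, not engine code: a plain array stands in for the function table, and `resolve_unqualified`, `$table` and the `$autoload` callback are names made up for illustration (PHP has no function autoloader today).

```php
<?php
/**
 * Simulates the proposed lookup order for an unqualified call.
 * $table is a stand-in for the function table (name => callable).
 * $autoload is a hypothetical function autoloader that may add entries.
 */
function resolve_unqualified(string $name, string $namespace, array &$table, ?callable $autoload = null): ?callable
{
    $local = $namespace . '\\' . $name;

    // 1. Global first: the common case never reaches the autoloader.
    if (isset($table[$name])) {
        return $table[$name];
    }
    // 2. Then the current namespace.
    if (isset($table[$local])) {
        return $table[$local];
    }
    // 3. Call the autoloader, for the local name only.
    if ($autoload !== null) {
        $autoload($local, $table);
    }
    // 4. After autoloading, check local before global (reverse order).
    if (isset($table[$local])) {
        return $table[$local];
    }
    // 5. Last check: the autoloader may have defined a global function.
    return $table[$name] ?? null;
}

$table = ['strlen' => 'strlen'];
$calls = 0; // counts how often the hypothetical autoloader runs
$autoload = function (string $fqn, array &$table) use (&$calls): void {
    $calls++;
    if ($fqn === 'Foo\\bar') {
        $table[$fqn] = fn (): string => 'Foo\\bar called';
    }
};

$strlen = resolve_unqualified('strlen', 'Foo', $table, $autoload); // global hit, no autoload
$bar = resolve_unqualified('bar', 'Foo', $table, $autoload);       // autoloaded local function
```

In this sketch the `strlen` lookup returns at step 1 without touching the autoloader, while `bar` only resolves at step 4 after one autoloader call.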

It checks the scopes in reverse order after autoloading because the autoloader is more likely to have loaded a local function than a global one. This adds a small inconsistency (if the autoloader were to load both a global and a non-global function of the same name), but keeps autoloading fast for unqualified function calls.

By checking global first, OOP-centric codebases like Symfony and Laravel that call unqualified global functions never hit the autoloader. Code that does call qualified local-namespace functions hits the autoloader and immediately starts loading them. The worst case for performance then becomes autoloading global functions that are called unqualified. Not only do you have to strip out the current namespace in the autoloader, but you also have to deal with being the absolute last check in the function table. However (and I'm still trying to figure out how to quantify this), I'm reasonably certain projects do not use global functions that often.
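
For that worst case, the stripping step in a function autoloader might look something like the sketch below. This is an assumption about how a userland autoloader could be written, not an existing API: `strip_namespace`, `file_for_global` and the `$globalMap` array are hypothetical names.

```php
<?php
// Hypothetical helper: drop the namespace prefix from a name such as "Foo\Baz\bar".
function strip_namespace(string $qualifiedName): string
{
    $pos = strrpos($qualifiedName, '\\');
    return $pos === false ? $qualifiedName : substr($qualifiedName, $pos + 1);
}

// Hypothetical lookup: $globalMap maps global function names to the files defining them.
// The autoloader is handed the local name, so it has to strip the namespace first.
function file_for_global(string $qualifiedName, array $globalMap): ?string
{
    return $globalMap[strip_namespace($qualifiedName)] ?? null;
}
```

An autoloader built this way would only fall back to the global map after failing to find the namespaced name, which is the extra work described above.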

-- Rob