Newsgroups: php.internals
From: rob@bottled.codes ("Rob Landers")
To: internals@lists.php.net
Date: Sat, 24 Aug 2024 11:09:27 +0200
Subject: Re: [PHP-DEV] [Concept] Flip relative function lookup order (global, then local)
Message-ID: <3e8eaf60-1778-4579-b058-e0849a7b7106@app.fastmail.com>

On Sat, Aug 24, 2024, at 11:00, Rob Landers wrote:
> On Fri, Aug 23, 2024, at 23:57, Ilija Tovilo wrote:
>> On Fri, Aug 23, 2024 at 9:41 PM Rowan Tommins [IMSoP] wrote:
>> >
>> > On 23 August 2024 18:32:41 BST, Ilija Tovilo wrote:
>> > >IMO, 1. is too drastic. As people have mentioned, there are tools to
>> > >automate disambiguation. But unless we gain some other benefit from
>> > >dropping the lookup entirely, why do it?
>> >
>> > I can think of a few disadvantages of "global first":
>> >
>> > - Fewer code bases will be affected, but working out which ones is harder. The easiest migration will probably be to make sure all calls to namespaced functions are fully qualified, as though it was "global only".
>>
>> To talk about more concrete numbers, I now also analyzed how many
>> relative calls to local functions there are in the top 1000 composer
>> packages.
>>
>> https://gist.github.com/iluuu1994/9d4bbbcd5f378d221851efa4e82b1f63
>>
>> There were 4229 calls to local functions that were statically visible.
>> Of those, 1534 came from thecodingmachine/safe, which I'm excluding
>> again for a fair comparison. The remaining 2695 calls were split
>> across 210 files and 27 repositories, which is less than I expected.
>>
>> The calls that need to be fixed by swapping the lookup order are a
>> subset of these calls, namely only the ones also clashing with some
>> global function. Hence, the process of identifying them doesn't seem
>> fundamentally different. Whether the above are "few enough" to justify
>> the BC break, I don't know.
>>
>> > - The engine won't be able to optimise calls where the name exists locally but not globally, because a userland global function could be defined at any time.
>>
>> When relying on the lookup, the lookup will be slower. But if the
>> hypothesis is that there are few people relying on this in the first
>> place, it shouldn't be an issue. It's also worth noting that many of
>> the optimizations don't apply anyway, because the global function is
>> also unknown and hence a user function, with an unknown signature.
>>
>> > - Unlike with the current way around, there's unlikely to be a use case for shadowing a namespaced name with a global one; it will just be a gotcha that trips people up occasionally.
>>
>> Indeed. But this is a downside of both these approaches.
>>
>> > None of these seem like showstoppers to me, but since we can so easily go one step further to "global only", and avoid them, why wouldn't we?
>> >
>> > Your answer to that seems to be that you think "global only" is a bigger BC break, but I wonder how much difference it really makes. As in, how many codebases are using unqualified calls to reference a namespaced function, but *not* shadowing a global name?
>>
>> I hope this provides some additional insight. Looking at the analysis,
>> I'm not completely opposed to your approach. There are some open
>> questions. For example, how do we handle functions declared and called
>> in the same file?
>>
>> namespace Foo;
>> function bar() {}
>> bar();
>>
>> Without a local fallback, it seems odd for this call to fail. An
>> option might be to auto-use Foo\bar when it is declared, although that
>> will require a separate pass over the top functions so that functions
>> don't become order-dependent.
>>
>> Ilija
>>
>
> Hey Ilija,
>
> I'm actually coming around to global first, then local second. I haven't gotten statistically significant results yet though, but preliminary results show that global first gives symfony/laravel their speed boost and function autoloading gives things like wordpress their speed boost. Everyone wins.
>
> For function autoloading, it is only called on the local check. So, it looks kinda like this:
>
>   1. does it exist in global namespace?
>     1. yes: load the function; done.
>     2. no: continue
>   2. does it exist in local namespace?
>     1. yes: load the function; done.
>     2. no: continue
>   3. call the autoloader for local namespace.
>   4. does it exist in local namespace?
>     1. yes: load the function; done.
>     2. no: continue
>   5. does it exist in the global namespace?
>     1. yes: load the function; done.
>     2. no: continue
>
> It checks the scopes in reverse order after autoloading because it is more likely that the autoloader loaded a local scope function than a global one. This adds a small inconsistency (if the autoloader were to load both a global and non-global function of the same name), but keeps autoloading fast for unqualified function calls. By checking global first, for OOP-centric codebases like Symfony and Laravel that call unqualified global functions, they never hit the autoloader. For things that do call qualified local-namespace functions, they hit the autoloader and immediately start loading them. The worst performance then becomes autoloading global functions that are called unqualified. Not only do you have to strip out the current namespace in the autoloader, but you have to deal with being the absolute last check in the function table. However, (and I'm still trying to figure out how to quantify this), I'm reasonably certain projects do not use global functions that often.
>
> — Rob

Amendment:

Actually, I may skip allowing the second check in the global space for autoloaders. In other words, if you want to autoload a global function, you need to call it fully qualified. It's not 100% ideal, but better than pinning and better performance for everyone.

— Rob
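
For anyone who prefers code to a numbered list, here is a rough userland sketch of the lookup order described above (steps 1 to 5), with the amendment as a switch. It is only a toy model: FunctionTable, resolve_unqualified() and the $globalAfterAutoload flag are hypothetical names made up for illustration, not engine or userland APIs, and the real change would live in the engine's function table.

```php
<?php
// Toy model only: FunctionTable and resolve_unqualified() are hypothetical
// names for illustration, not engine or userland APIs.

final class FunctionTable
{
    /** @var array<string, callable> lowercased fully qualified name => function */
    private array $functions = [];

    public function define(string $fqName, callable $fn): void
    {
        // Function names are case-insensitive in PHP, so normalise the key.
        $this->functions[strtolower($fqName)] = $fn;
    }

    public function has(string $fqName): bool
    {
        return isset($this->functions[strtolower($fqName)]);
    }
}

/**
 * Resolve an unqualified call to $name made from inside $namespace.
 * Returns the fully qualified name that would be called, or null if none.
 */
function resolve_unqualified(
    string $name,
    string $namespace,
    FunctionTable $table,
    callable $autoloader,
    bool $globalAfterAutoload = false
): ?string {
    $local = $namespace . '\\' . $name;

    if ($table->has($name)) {       // 1. global namespace first
        return $name;
    }
    if ($table->has($local)) {      // 2. then the current namespace
        return $local;
    }
    $autoloader($local);            // 3. autoloader is asked for the local name only
    if ($table->has($local)) {      // 4. local again, the likelier result of autoloading
        return $local;
    }
    if ($globalAfterAutoload && $table->has($name)) {
        return $name;               // 5. global again (dropped by the amendment)
    }
    return null;                    // nothing found: undefined function
}

// Usage: a global function resolves without touching the autoloader,
// a namespaced one is loaded on demand.
$table = new FunctionTable();
$table->define('strlen', 'strlen');

$asked = [];
$autoloader = function (string $fqName) use ($table, &$asked): void {
    $asked[] = $fqName;
    if ($fqName === 'App\\helper') {
        $table->define($fqName, fn () => 'helper');
    }
};

$first  = resolve_unqualified('strlen', 'App', $table, $autoloader);
$second = resolve_unqualified('helper', 'App', $table, $autoloader);
```

With the flag left at false (the amendment), an autoloader that only defines a global function in response to the local name leaves the unqualified call unresolved, which is the "call it fully qualified" trade-off mentioned above.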

