Subject: Re: [PHP-DEV] [Concept] Flip relative function lookup order (global, then local)
Date: Sat, 24 Aug 2024 00:46:48 +0700
From: php-lists@koalephant.com (Stephen Reay)
To: John Coggeshall <john@coggeshall.org>
Cc: "Rowan Tommins [IMSoP]" <imsop.php@rwec.co.uk>, PHP internals <internals@lists.php.net>

> On 23 Aug 2024, at 23:56, John Coggeshall <john@coggeshall.org> wrote:
>
> On Aug 23 2024, at 12:27 pm, Stephen Reay <php-lists@koalephant.com> wrote:
>
>> The current inconsistencies between symbol types can be avoided in userland in a 100% consistent way. Import or qualify the symbols you use, all the time, and you have 0 inconsistencies or bizarreness in terms of what is used when.
>
> So are you essentially arguing that we should put the burden on the majority of users, most of whom (documented by us or not) likely will have no idea what the problem is or potential consequences are?

No, I'm saying that the "problem" of performance has had a pretty simple, consistent solution since namespaces were added to the language, and that for the vast, vast majority of projects it's a stretch to even classify it as a problem.

The claims about "security" because a function you defined (or included via a package) is resolved in place of a global one are irrelevant.
If you're including compromised code in your project, all bets are off.

> BC breaks happen. While I am all for avoiding BC breaks when possible, sometimes they make sense -- and I think this is a clear example of when it does.

Please be specific about what you mean by "this". The original proposal by Ilija creates a recurring BC break over time, whenever a new global function is introduced.

> I think you are exaggerating the impact of the BC break here. In fact Ilija measured the impact on the top 1000 composer packages:
>
> https://gist.github.com/iluuu1994/4b83481baac563f8f0d3204c697c5551

Great, so 0.24% of public packages represented, and 0% of private code represented. That certainly seems representative.

You've also missed the other aspect here, which I mentioned earlier: namespaced function usage is low because the language hasn't traditionally supported it anywhere near as well as namespaced classes. There have been multiple people proclaiming recently that "static utility classes" are the 'wrong' approach, that people should use namespaced functions in their code. There are two active RFCs about function autoloading.

This change would, at best, make those functions slower to use within the same namespace and, at worst, mean more work, with a brand new inconsistency, to use within the same namespace.

>> I was specifically pointing out that a small number of people complaining about this is a ridiculous reason to even consider the change.
>
> That's one take. Another take is this is an easy win for a few percentage points bump in speed, with improved supply-chain security for composer packages that has a minimal impact on users.

I was clarifying (to someone else) that the claim about who objects or doesn't was never mine. It's a bit weird that in one email you admit you have no data, and in the next claim "minimal impact on users". Either you have data or you don't.
>> Great, how about the solution that doesn't have any BC break, and works in every version back to 5.3?
>
> By this logic, we should never introduce BC breaks.

We should aim to reduce BC breaks as much as possible, and especially BC breaks that have an ongoing impact over time (i.e. new breaks into the future). The point is that every single *technical* problem pointed out in the original issue, and the email that arose from it, can be solved, and could have been solved 15 years ago, by using a `\` for global functions (or using a `use` statement), exactly the same way you do with global classes and interfaces.

>> Great, so then we can resolve this whole thing by adding a footnote to the "Name resolution rules" page in the manual that (a) recommends using qualified names (i.e. prefix with a `\`) and (b) provides deeper details of the reasons for those who care.
>
> From the perspective of programming language _design_ (which is what we're talking about here), the goal is to create a language that helps the developer do something faster/better/easier, not do the wrong thing (slower code, etc.) by default and dump the responsibility for that on developers by expecting them to read a footnote buried in a doc. Especially when the justification is because there's concerns that code written in 2009 won't work anymore.

To be clear, we aren't "creating" a language. We're talking about a hypothetical *change* to a core aspect of an existing language that is used by literally millions of developers around the planet.

The change we're talking about is in the range of maybe 2-4%, and is 100% solvable in userland - and has been for those 15 years, in a way that has zero impact on developers using the language to write their own functions, and is consistent with the way other symbol lookups (e.g. classes) work. I'll concede you one point. A footnote is clearly not important enough for a 2% performance benefit.
Let's make it the subtext on the header of every php.net page, just to make sure people know.

> I mean this:

I'm honestly not even sure where to begin here. If you add a namespaced function to your code, and call it from within that namespace, it will run. That's literally by design. If that is somehow surprising to you, I'd suggest the aforementioned name resolution page in the PHP manual. It's not exactly long; you can probably read it quicker than this email.

As I and others have said: if your project has a credible security risk because of this functionality, you have bigger problems than needing to use a leading backslash.

> At its core a vast majority of the functionality of the PHP language exists within internally-implemented functions, not classes.

There are a lot of procedural APIs in the standard library/extensions, sure. But people still *use* classes *a lot*, and there is in general a push towards more OOP APIs and fewer groups of functions - particularly for anything with state (e.g. see recent discussions and RFCs about Curl objects, HTTP Request data objects, BCMath Number object, Tokenizer, etc).

> So yes, I think it's entirely reasonable that people would expect that internal functions resolve at a higher priority than user-defined functions with the same name

OK, you can think that. I don't agree. One of the top things people complain about in PHP, if not the top one, is inconsistency in the standard library - the order of needle/haystack arguments being different in string vs array functions is probably one of the best-known examples.
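
For readers following along, here is a minimal sketch of the current lookup behaviour and of the qualify-or-import approach referred to above. The namespace `App\Util`, its own `strlen()` and the alias `native_strlen` are all made up for illustration; none of them comes from the thread. The sketch assumes PHP 7 or later for the type declarations, and `use function` itself needs PHP 5.6 or later, whereas the leading `\` form works wherever namespaces do.

```php
<?php
// Hypothetical namespace, chosen only for this illustration.
namespace App\Util;

// Import the global function under an alias (use function needs PHP 5.6+).
use function strlen as native_strlen;

// A namespaced function that shares its name with a global one.
function strlen(string $s): int
{
    return -1; // sentinel value so it is obvious which function ran
}

// Unqualified call: the current namespace is checked first, and the global
// function is only used as a fallback when App\Util\strlen() does not exist.
$unqualified = strlen('abc');        // runs App\Util\strlen()

// Fully qualified call: always the global function, no fallback lookup.
$qualified = \strlen('abc');         // runs the global strlen()

// Imported call: resolved to the global function via the alias.
$imported = native_strlen('abc');    // runs the global strlen()

var_dump($unqualified, $qualified, $imported); // int(-1), int(3), int(3)
```

The unqualified call is the only one of the three whose target depends on what else is defined in the namespace, which is the behaviour the proposal would change.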