Newsgroups: php.internals Path: news.php.net Xref: news.php.net php.internals:125187 X-Original-To: internals@lists.php.net Delivered-To: internals@lists.php.net Received: from php-smtp4.php.net (php-smtp4.php.net [45.112.84.5]) by qa.php.net (Postfix) with ESMTPS id 6EF5C1A00BD for ; Sat, 24 Aug 2024 18:16:35 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=php.net; s=mail; t=1724523506; bh=pHnH2i3kfkXfZ/Xv6rBFg6o+JYums+hkxbPuRG5Wp14=; h=Subject:From:In-Reply-To:Date:Cc:References:To:From; b=CcmGWWGHaO25nIV0THV2dekJTFfJjioDRtit9eyyQ0KoJvhIH16GcY08sb29dCF4F G/QgngbBLjp3zYoXEc+ZIncjzvmi7epZli1gIsHKJMvWJ300AULTM4+ze5ooW7D097 +J2C6AAmZNgzY+HQxHgSq0CJ3ilX3GBN/x5anHWfvKzyR05nVI1qbjmfYQQaOV99R3 XqE2TDpVEbysnqpvNV37tPnIDz1dsUIa5lllyrxiFPTVZsRwx2rpFwBaNbMoIoiI85 wh1Zzli2PpN3VO8JnymjCz151FQImF7lACECF5/FoiT+/rfzS9vrZRxjlDt1fSGqne F4UlxbXBHNx5A== Received: from php-smtp4.php.net (localhost [127.0.0.1]) by php-smtp4.php.net (Postfix) with ESMTP id 3A54F18003E for ; Sat, 24 Aug 2024 18:18:25 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 4.0.0 (2022-12-13) on php-smtp4.php.net X-Spam-Level: X-Spam-Status: No, score=0.8 required=5.0 tests=BAYES_50,DMARC_MISSING, SPF_HELO_NONE,SPF_PASS autolearn=no autolearn_force=no version=4.0.0 X-Spam-Virus: No X-Envelope-From: Received: from mail1.25mail.st (mail1.25mail.st [206.123.115.54]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by php-smtp4.php.net (Postfix) with ESMTPS for ; Sat, 24 Aug 2024 18:18:24 +0000 (UTC) Received: from smtpclient.apple (unknown [49.48.221.224]) by mail1.25mail.st (Postfix) with ESMTPSA id F18456054D; Sat, 24 Aug 2024 18:16:26 +0000 (UTC) Content-Type: text/plain; charset=utf-8 Precedence: bulk list-help: list-post: List-Id: internals.lists.php.net x-ms-reactions: disallow Mime-Version: 1.0 (Mac OS X Mail 16.0 \(3776.700.51\)) Subject: Re: [PHP-DEV] [Concept] Flip relative function lookup order (global, then local) In-Reply-To: Date: Sun, 25 Aug 2024 01:16:13 +0700 Cc: PHP internals Content-Transfer-Encoding: quoted-printable Message-ID: <642C34C4-9E6E-4190-8EF9-9765F4A0F175@koalephant.com> References: <181C78B1-65AD-44B4-AD95-A59791B7FF86@koalephant.com> To: Ilija Tovilo X-Mailer: Apple Mail (2.3776.700.51) From: php-lists@koalephant.com (Stephen Reay) > On 25 Aug 2024, at 00:01, Ilija Tovilo wrote: >=20 > Hi Stephen >=20 > On Sat, Aug 24, 2024 at 1:54=E2=80=AFPM Stephen Reay = wrote: >>=20 >> Thanks for clarifying. Out of curiosity, how much optimisation do you = imagine would be possible if the lookups were done the same was as = classes (ie no fallback, names must be local, qualified or imported with = `use`)? >=20 > I haven't measured this case specifically, but if unqualified calls to > local functions are indeed rare (which the last analysis seems to > indicate), then it should make barely any difference. Of course, if > your code makes lots of use of them, then the story might be > different. That said, the penalty of an ambiguous internal call is > much higher than that of a user, local call, given that internal calls > sometimes have special optimizations or can even be entirely executed > at compile time. For local calls, it will simply lead to a double > lookup on first execution. >=20 >> I am aware this is a BC break. But if it's kosher to discuss = introducing a never ending BC break I don't see why this isn't a valid = discussion either. It would give *everyone* that elusive 2-4% = performance boost, would resolve any ambiguity about which function a = person intended to call (the claimed security issue) and would bring = consistency with the way classes/etc are referenced. >=20 >> =46rom my analysis, there were 2 967 unqualified calls to local > functions in the top 1 000 repositories. (Disclaimer: There might be a > "use function" at the top for some of these, the analysis isn't that > sophisticated.) >=20 > I also ran the script to check for unqualified calls to global > functions (or at least functions that weren't statically visible in > that scope in any of the repositories files), and there were ~139 000 > of them. It seems like this is quite a different beast. To summarize: >=20 > 1. Flipping lookup order: ~a few dozens of changes > 2. Global only: ~3 000 changes > 3. Local only: ~139 000 changes >=20 > While much of this can be automated, huge diffs still require > reviewing time, and can lead to many merge conflicts which also take > time to resolve. I would definitely prefer to go with 1. or > potentially 2. >=20 > Ilija >=20 Hi Ilija, I understand that a change like (3) is a huge BC break, and as I said = earlier, I wasn't actually suggesting that is the action to take, = because I don't think there is sufficient reason to take *any* action. = But given that some people in this thread seem convinced that *a* change = to functionality is apparently required, I do think every potential = change, and it's pros and cons, should be discussed. As I've said numerous times, and been either outright dismissed or = ignored: there has been a consistent push from a non-trivial number of = internals members that userland developers should make better use of = regular functions, rather than using classes as fancy namespaces. There = was a recent RFC vote that implicitly endorsed this opinion. Right now, the lookup rules make namespaced regular functions a = consistent experience for developers, but the lack of autoload makes it = unpopular, and the lack of visibility for such symbols can be = problematic.=20 With the change you're proposing, there will be *another* hurdle that = makes the use of regular namespaced functions harder/less intuitive, or = potentially (with option 1) unpredictable over PHP versions, due to the = constant threat of BC breaks due to new builtin functions - right when = we have not one but two RFCs for function autoloading (arguably the = biggest barrier to their increased usage in userland). So the whole reason I asked about (3) is because it would - (a) bring consistency with class/interface/trait symbols; - (b) inherently bring the much desired 2% performance boost for = function calls, because people would be forced to qualify the names; - (c) have zero risk of of future WTF BC break when a new global = function interrupting local function lookups; - (d) have no need for a new "simpler" qualifying syntax (you can't get = shorter than 1 character); - (e) presumably simplify function autoloading, because there's no = longer any "fallback" step to worry about before triggering an = autoloader; - (e) even solve the "security" concerns John raised, because the = developer would be forced to qualify their usage if they wanted to use = the builtin function - their intent is always explicit, never guessed. Yes, it is a huge BC break in terms of the amount of code that's = affected. But it's almost certainly one of the simplest BC break to = "fix" in the history of PHP BC breaks. How much code was affected when register globals was removed? Or when = magic quotes was removed? Or when short codes were removed?=20 Surely any change being proposed here would mean a deprecation notice in = the next release after 8.4, and then whatever actual change is proposed, = in the next major version after that. So possibly 8.5 and then 9.0, but = potentially 9.0 and then 10.0. If either of (1) or (2) is chosen, and the "acceptability" of such a = choice depends on something less verbose than "namespace\" to qualify a = local function, projects literally can't future proof (or = no-deprecation-notice-proof, if you prefer) their code against the = eventual flip until that change is implemented - in a scenario where a = deprecation (and new local qualifier) goes out in 2025 as part of 8.5, = and a flip happens in 2026 as part of 9.0, that would cuts the time = projects have to effectively adapt, in half, and it means any code = that's updated for it, can't make use of the new "less verbose" local = qualifier if they also need to support versions prior to it being = available. If it happened to be 9.0 and 10.0 being the deprecation and "change" = versions, obviously people have longer to make the required change - but = that argument cuts both ways. If you have 5 years to change every = `strlen` to `\strlen` it's hardly going to cause a huge and sudden swath = of noise in revision history. I would imagine most projects would just = adopt a new code style, and prefixing with `\` would occur automatically = whenever a file is otherwise modified by a developer. There's also an impact on internals development/RFC with either (1) or = (2): *any* proposed new global function in the standard library now has = a BC barrier to pass if it *might* conflict with one defined by anyone = in userland, in any namespace. JS is a living embodiment of of this = problem: see String#includes, Array#includes, and Array#flat - and = that's with people doing the "wrong thing" (extending builtin JS = prototypes is arguably the same as using the `\php` namespace) Multiple people have lamented the way function fallbacks were originally = implemented. If you're going to insist on making a change, let's at = least aim for a change that brings *MORE* consistency, and fixes a = previous mistake, rather than adding a brand new inconsistency and who = knows how many years of unexpected BC breaks for unsuspecting userland = developers - who apparently *already* stuggle to understand the way = symbol lookup happens - into the future, and adding yet *another* = reason for people to not use namespaced functions. Cheers Stephen=20=