Newsgroups: php.internals
List-Id: internals.lists.php.net
From: jordan.ledoux@gmail.com (Jordan LeDoux)
Date: Mon, 8 Jul 2024 13:54:26 -0700
Subject: Re: [PHP-DEV] [PHP-Dev] Versioned Packagers (Iteration IV)
To: "Rowan Tommins [IMSoP]"
Cc: internals@lists.php.net
In-Reply-To: <1BE6A849-A2A9-4E17-9C11-5099EF74F5C0@rwec.co.uk>

On Mon, Jul 8, 2024 at 2:42 AM Rowan Tommins [IMSoP] <imsop.php@rwec.co.uk> wrote:

> On 8 July 2024 04:25:45 CEST, Jordan LeDoux <jordan.ledoux@gmail.com>
> wrote:
> >I think it's strange that this discussion has driven deep down the tangent
> >of versioning...
> [...]
> >Things like separating global scope between importer and importee, managed
> >visibility of symbols and exports from modules/packages, allowing for
> >separate autoloaders for things which are called or included via an import,
> >etc. Those are the things that the language itself can do.
> >
> >All this other stuff feels like a distraction.
>
> I agree. I wrote most of the below a couple of days ago, but I don't think
> it posted correctly, so apologies if some people see it twice:
>
> Autoloading is just a way to load files later, by the engine telling you
> when a class is first needed. PHP does not, and should not, make any
> assumptions about how files are laid out on disk; an autoloader doesn't
> actually need to load any files at all, and if it does, it uses the same
> include or require statements which have been in PHP for decades.
>
> Likewise, installing packages and defining version schemes is a completely
> separate problem space that can probably be served by a few small tweaks to
> Composer once the language provides the underlying functionality.
>
> The core of the problem you seem to want to solve is this: if you have two
> files foo_1.php and foo_2.php, which both define a class \Acme\Foo, how do
> you load both of them, so that you end up with two differently named
> classes?
>
> In JS, that's easy, because functions and object constructors (and
> "classes") exist as objects you can pass around as variables, they don't
> need to know their own name. In PHP, everything is based on the idea that
> functions and classes are identified by name. You can rewrite the name in
> the class declaration, and in direct references to it, but what about code
> using ::class, or constructing a name and using "new $name", and so on? How
> will tools using static analysis or reflection handle the renaming - e.g.
> how does DI autowiring work if names are in some sense dynamic?
>
> You've also got to work out what to do with transitive dependencies - if I
> "import 'foo_1.php' as MyFoo", but Foo in turn has "import 'guzzle_2.php'
> as MyGuzzle", what namespace do all Guzzle's classes get rewritten into?
> What about dependencies that are specifically intended to bridge between
> packages, like PSR-7 RequestInterface?
>
> My advice: start with the assumption that something has already installed
> all the files you need into an arbitrary directory structure, and something
> is going to generate a bunch of statements to load them. What happens next,
> in the language itself, to make them live side by side without breaking? If
> we get a solid solution to that (which I'm skeptical of), we can discuss
> how Composer, or the WordPress plugin installer, would generate whatever
> include/import/alias/rewrite statements we end up creating.
>
> Regards,
> --
> Rowan Tommins
> [IMSoP]

I think it could be done somewhat simply (relative to the other things that have been discussed) if the engine reserved a specific namespace for imported symbols internally. Something like:

`\__Imported\MyImportStatement`

Here the `\__Imported` namespace is reserved and causes a parse error if it occurs anywhere in user code, and `MyImportStatement` corresponds to an application importing the code using something like `import MyPackage as MyImportStatement;`

Then, all symbols which would be loaded into the global space as a result of the import are instead rewritten into the hidden namespace that the engine uses under the hood, and any use of the import in the application code that declared it would reference the symbols in the prefixed namespace.

This would not be trivial, however. The engine code which supports this would need to keep track of a kind of "context" for each file, based on what namespace the file was included from. For instance, if an autoload occurs inside the package that was loaded into `MyImportStatement`, the engine would need to be aware that the code being executed is defined in that namespace, REGARDLESS of whether it was a class, function, or statement, and load ALL symbols that are created as a result into the rewritten namespace. It would also need to translate in the other direction for `use` statements inside the package, since the package would not know ahead of time what rewritten namespace it would actually be loaded in.

However, this is the simplest solution I see that doesn't involve writing a second PHP engine just for this sort of thing.

Jordan

PS: For those unaware, for each "symbol" (roughly, something that has a unique referenceable name in the code), there is at least one name that refers to ONLY that thing internally. (I'm fairly certain that there are NO situations where one name can refer to two things at all, but I am not enough of an expert in the C code to be completely certain about this, and it's entirely possible that this does happen in some niche case I've never encountered before.) When something is namespaced, the entire namespace is prefixed to the "name" of the thing in the engine when it is created. So a function `foo` in the namespace `Bar` has the name "\Bar\foo". Any time you use it as just "foo", the engine knows from context to put "Bar" in front of it before looking up its definition to execute it.

The global symbol space consists of the items which have nothing prepended. Suppose a namespace were inaccessible because the parser errored on `use`, `namespace`, `new`, and other similar statements that used a part of that namespace, as I outlined. The engine would then treat all of the code as if it were one application for which it knows some extremely complex namespace replacement rules (because that's what it actually would be), but to PHP devs it would act almost like sandboxes where code from one area cannot access or affect other areas.

This would create some edge cases, like how `global` behaves, or how any of the superglobal variables could be used, etc. But those are probably easier to nail down than writing a different engine or running a second process and setting up some messaging between the two, though that might be the more "correct" way to handle something like this. However, I could be wrong about the difficulty of this, as I've never attempted that kind of change in the Zend engine before.
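To make the prefixing idea a little more concrete, here is a rough userland sketch of the two-way name mapping described above. It is only an illustration under stated assumptions: the helper names `toHiddenName` and `toPackageName` and the constant `IMPORT_ROOT` are hypothetical, nothing like this exists in the engine today, and a real implementation would operate on the engine's symbol tables rather than on strings. It also assumes PHP 8.0 or later for `str_starts_with`.

```php
<?php
declare(strict_types=1);

// Hypothetical reserved root namespace, taken from the proposal in this thread.
// This is NOT an existing engine feature.
const IMPORT_ROOT = '__Imported';

/**
 * Map a symbol name as the package author wrote it to the hidden,
 * prefixed name the engine would store for a given import alias.
 */
function toHiddenName(string $importAlias, string $symbol): string
{
    // Strip any leading backslash so the result has a single separator.
    return IMPORT_ROOT . '\\' . $importAlias . '\\' . ltrim($symbol, '\\');
}

/**
 * Translate in the other direction: recover the name the package itself
 * uses, or null if the hidden name does not belong to this import.
 */
function toPackageName(string $importAlias, string $hiddenName): ?string
{
    $prefix = IMPORT_ROOT . '\\' . $importAlias . '\\';

    return str_starts_with($hiddenName, $prefix)
        ? substr($hiddenName, strlen($prefix))
        : null;
}

// Two imports of the same class name end up as two distinct symbols.
$v1 = toHiddenName('FooV1', '\Acme\Foo');
$v2 = toHiddenName('FooV2', '\Acme\Foo');

echo $v1, PHP_EOL;                          // __Imported\FooV1\Acme\Foo
echo $v2, PHP_EOL;                          // __Imported\FooV2\Acme\Foo
echo toPackageName('FooV1', $v1), PHP_EOL;  // Acme\Foo
```

The sketch only covers names that are known statically. It says nothing about the harder cases raised earlier in the thread, such as `::class` strings, `new $name`, or transitive imports, which is where the per-file "context" tracking would have to come in.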

