Newsgroups: php.internals
From: tendoaki@gmail.com (Michael Morris)
Date: Sun, 4 May 2025 03:34:05 -0400
Subject: [PHP-DEV] Modules, again.
To: PHP internals

It's been 9 months. I've been researching, working on other projects, and
mulling over points raised the last time I brought this up. At the moment I
don't think PHP 8.5 is in its final weeks, so this shouldn't be a distraction
from that. The previous discussion got seriously, seriously derailed, and I
got lost even though I started it. I'm not going to ask anyone to dig into the
archives; let's just start afresh.

--------------------------------------------------------------------------------
THE PROBLEM
--------------------------------------------------------------------------------
PHP has no way of dealing with userland code that tries to write to the same
entry in the symbol tables. Namespaces are a workaround for this, and combined
with autoloaders they've yielded the package environments we currently have
(usually Composer). But at the end of the day, if two blocks of code want
different versions of the same package loaded at the same time, there's going
to be a crash. This is seen most visibly in WordPress plugins that use
Composer, as they resort to monkey-patching the Composer packages they consume
to avoid such collisions.
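
To make the collision concrete, here is a minimal sketch that runs on current PHP. The class name Acme\Http\Client and its VERSION constants are made up for illustration, and eval() merely stands in for two plugins that each ship their own vendor copy of the same package. It shows the quieter form of the problem: the first copy to load claims the symbol table entry, and the second plugin silently gets the wrong version.

```php
<?php
// Plugin A bundles version 1.4.0 of a hypothetical package.
spl_autoload_register(function (string $class): void {
    if ($class === 'Acme\Http\Client') {
        // eval() stands in for: require __DIR__ . '/plugin-a/vendor/...';
        eval('namespace Acme\Http; class Client { const VERSION = "1.4.0"; }');
    }
});

// Plugin B bundles version 2.0.0 of the same package.
spl_autoload_register(function (string $class): void {
    if ($class === 'Acme\Http\Client') {
        eval('namespace Acme\Http; class Client { const VERSION = "2.0.0"; }');
    }
});

// Plugin B asks for "its" client, but plugin A's loader is first in the queue
// and the class name can only be bound once per request.
echo \Acme\Http\Client::VERSION, PHP_EOL; // 1.4.0
```

If both copies were pulled in with a plain require instead, the second declaration would stop the request with a fatal error, which is the crash described above.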

--------------------------------------------------------------------------------
PROPOSAL
--------------------------------------------------------------------------------
Modules - blocks of PHP code with independent symbol tables and autoload queues.

Rather than introducing new keywords, with the backwards compatibility
problems that creates, three existing keywords will be used in a new way:
"use", "require", and "yield".

The first file that PHP loads will always be on the "main thread". To bring in
code as a module, the use require structure is used. The simplest possible
version of this is as follows:

PHP Code -----------------------------------------------------------------------

use require 'mymodule.php';
--------------------------------------------------------------------------------

The contents of 'mymodule.php' must meet two requirements. First, a namespace
is *required* of a module. Second, yield statements mark the functions,
classes, and constants the module exports, and at least one such yield must be
present. Hence the file may look something like this:

PHP Code -----------------------------------------------------------------------

namespace MyModule;

yield function sum($a, $b) { return $a + $b; }

--------------------------------------------------------------------------------

Returning to our caller, it could make use of this function as follows:

PHP Code -----------------------------------------------------------------------

use require 'mymodule.php';

echo MyModule::sum(3, 4);
--------------------------------------------------------------------------------

So far there is nothing here that couldn't have been done with a static class.
The important difference, though, is in behavior.

1. The module does not affect or see variables on the main thread - including
the superglobals. A module can only get to them if it receives them as
arguments to some sort of yielded setter function.

2. The module does not affect or see constants or functions established on the
main thread or in other modules. It can see and autoload classes from the main
thread if the module author opts into this (discussed below).
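
For comparison, here is the static-class version mentioned above, as a sketch that runs on current PHP. The constant MAIN_THREAD_FLAG and the peek() method are invented for the demonstration. The point is that a static class shares the caller's symbol tables, so it can read a constant the caller defined, which is exactly what point 2 says a module could not do.

```php
<?php
// Current PHP: a static class stands in for the proposed module.
final class MyModule
{
    public static function sum(int $a, int $b): int
    {
        return $a + $b;
    }

    // Unlike the proposed module, this code shares the caller's symbol tables.
    public static function peek(): string
    {
        return defined('MAIN_THREAD_FLAG') ? MAIN_THREAD_FLAG : 'not visible';
    }
}

// A constant established by the calling code ("main thread" in the proposal).
define('MAIN_THREAD_FLAG', 'set by the caller');

echo MyModule::sum(3, 4), PHP_EOL; // 7
echo MyModule::peek(), PHP_EOL;    // set by the caller
```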

The use case above is not typical - usually inclusions from modules are more
targeted.

PHP Code -----------------------------------------------------------------------

use sum require 'mymodule.php';

echo sum(3, 4); // 7
--------------------------------------------------------------------------------

Here the MyModule class is not created on the main thread; only the yielded
function is.

Aliases and multiple symbols can be declared in the use clause, just as they
can now.

PHP Code -----------------------------------------------------------------------

use sum, difference as subtract require 'mymodule.php';
--------------------------------------------------------------------------------

And if desired, the namespace of the module can be aliased on the fly using the
wildcard operator:

PHP Code -----------------------------------------------------------------------

use * as AliasModule require 'mymodule.php';

AliasModule::sum(3, 4);
--------------------------------------------------------------------------------

Autoloaders continue to be invoked, including in the context of use require.
The callback registered with spl_autoload_register will receive a second
argument in this context: boolean true if the main thread is loading a module,
or the requesting module's namespace string if a module is loading another
module. Hence:

PHP Code -----------------------------------------------------------------------

require('./vendor/composer/autoload.php');

use require 'mymodule';
// Callback will receive args ('mymodule', true)
new A();
// Callback will receive args ('A', false) - the false case should preserve BC.
--------------------------------------------------------------------------------

And in a module:

PHP Code -----------------------------------------------------------------------
namespace MyModule;

/*
 * Mod Author can elect to use the global autoloader
 * by passing the string "global" instead of a callback
 */
spl_autoload_register('global');
use require 'othermodule';
// Callback will receive args ('othermodule', 'MyModule')
new A();
// Callback will receive args ('A', 'MyModule')
--------------------------------------------------------------------------------
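
On the BC point: current PHP only ever passes the class name to an autoload callback, so a callback that declares the proposed second parameter with a default of false already runs today. A minimal sketch follows. The parameter name $module and the $seen log are my own labels, not part of the proposal.

```php
<?php
$seen = [];

// The second parameter is the proposed addition. Today the engine never
// supplies it, so the default (false) is what the callback sees.
spl_autoload_register(
    function (string $name, bool|string $module = false) use (&$seen): void {
        $seen[] = [$name, $module];
    }
);

// Referencing an unknown class triggers the autoload queue.
class_exists('A');

echo json_encode($seen), PHP_EOL; // [["A",false]]
```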

Autoload callbacks are currently required to load the file directly. For
modules this *will not work*, because loading a module file involves setting
up a new symbol table, a new autoload queue, and slightly different parsing
rules (again - the namespace is required, not optional, and at least one
top-level yield statement must be present). So when an autoloader is asked
about a module it must return the absolute path to the module, or false if it
can't resolve it (handing off to the next loader in the queue). PHP will then
load the module as if the file path had been placed in the use require
statement in the first place.
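
The resolve-or-hand-off contract above can be sketched in current PHP, with the caveat that today's engine ignores autoloader return values, so the queue walk is simulated in userland. The names makeDirectoryResolver and resolveModule, and the temporary directory layout, are assumptions made up for this example.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical module resolver: returns an absolute path, or false to hand
 * off to the next resolver. $context mirrors the proposed second argument
 * (true from the main thread, or the requesting module's namespace).
 */
function makeDirectoryResolver(string $root): callable
{
    return function (string $name, bool|string $context) use ($root): string|false {
        $path = realpath($root . DIRECTORY_SEPARATOR . $name . '.php');
        return ($path !== false && is_file($path)) ? $path : false;
    };
}

/** Walks the queue the way the engine might under the proposal (assumed). */
function resolveModule(array $queue, string $name, bool|string $context): string|false
{
    foreach ($queue as $resolver) {
        $path = $resolver($name, $context);
        if ($path !== false) {
            return $path; // first resolver with an answer wins
        }
    }
    return false; // nobody could resolve the module
}

// Demo: one empty directory and one that contains mymodule.php.
$base = sys_get_temp_dir() . DIRECTORY_SEPARATOR . 'modules_' . uniqid();
mkdir($base . '/a', 0777, true);
mkdir($base . '/b', 0777, true);
file_put_contents($base . '/b/mymodule.php', "<?php\nnamespace MyModule;\n");

$queue = [makeDirectoryResolver($base . '/a'), makeDirectoryResolver($base . '/b')];

var_dump(resolveModule($queue, 'mymodule', true));          // absolute path under $base/b
var_dump(resolveModule($queue, 'othermodule', 'MyModule')); // bool(false)
```

Returning a path instead of loading the file keeps the resolver ignorant of how the module gets set up, which is the separation the paragraph above asks for.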


--------------------------------------------------------------------------------
CLOSING REMARKS
--------------------------------------------------------------------------------

I want to take a moment to ruminate on what doors are opened by the above, but
all that follows is NOT part of my proposal. The above gets what I feel is
needed - a way to cleanly run disparate packages in an application whose
authors refuse to update it to embrace Composer (cough - WordPress - cough).
But even for projects that do embrace Composer and its package management, the
above makes the prospect of a large API change in a major version far less
frightening. In the current setup all the extensions have to be kept current
and more or less on the same page. The larger the set of extension authors
becomes, the less feasible this is - projects get abandoned or stagnate, and if
your site needs one of them, finding a replacement or upgrading it yourself
can be a pain.

The string that gets passed to the autoloader could be anything, btw. Rules
for which versions of the module are acceptable, like 'mymodule@7.x', can work
if the package manager is written to parse them out, but those rules, not to
mention what the composer.json file or equivalent would need to look like, are
outside the scope of this proposal.
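
Purely as an illustration of what "parse them out" could mean, here is a sketch in current PHP. The "name@constraint" split and the "7.x" wildcard rule are assumptions taken from the one example string above, and parseModuleRequest and satisfies are invented names. None of this is proposed behavior.

```php
<?php
declare(strict_types=1);

/** Hypothetical: split "name@constraint" the way a package manager might. */
function parseModuleRequest(string $request): array
{
    $at = strrpos($request, '@');
    if ($at === false) {
        return ['name' => $request, 'constraint' => null];
    }
    return [
        'name'       => substr($request, 0, $at),
        'constraint' => substr($request, $at + 1),
    ];
}

/** Hypothetical: "7.x" accepts any version whose leading parts match "7". */
function satisfies(string $version, ?string $constraint): bool
{
    if ($constraint === null) {
        return true; // no constraint given: anything goes
    }
    $actual = explode('.', $version);
    foreach (explode('.', $constraint) as $i => $part) {
        if ($part === 'x' || $part === '*') {
            return true; // wildcard: the remaining parts are unconstrained
        }
        if (!isset($actual[$i]) || $actual[$i] !== $part) {
            return false;
        }
    }
    return true;
}

$request = parseModuleRequest('mymodule@7.x');
echo $request['name'], ' ', $request['constraint'], PHP_EOL; // mymodule 7.x
var_dump(satisfies('7.4.2', $request['constraint']));        // bool(true)
var_dump(satisfies('8.0.0', $request['constraint']));        // bool(false)
```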

It should be noted that modules offer much more of a black box than the
current namespaces and autoloaders can provide. The only part of a module the
outside world can see is what it yields. If a class isn't yielded, the outside
world can't make an instance of it. This shielding of internal APIs should be
useful because no matter how big a note you put in the comments saying a class
shouldn't be used by outside code, sooner or later someone will do it, and
their code will break when you change the internal API. While that is their
fault, it can be a concern, especially if the use of such "internal" APIs
becomes commonplace (Drupal has several instances of this).
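
The closest thing current PHP offers to "only what is yielded is visible" is a closure that returns its exports, sketched below. It only works for functions, which is part of why it is no substitute for the proposal: a class declared inside would still land in the global class table. The names $module and $add are placeholders for the example.

```php
<?php
// A closure scope plays the role of the module body.
$module = (function (): array {
    // Internal helper: never returned, so callers cannot reach it.
    $add = fn(int $a, int $b): int => $a + $b;

    // The returned array is the "yielded" surface.
    return [
        'sum' => fn(int ...$numbers): int => array_reduce($numbers, $add, 0),
    ];
})();

echo $module['sum'](3, 4), PHP_EOL;               // 7
echo implode(', ', array_keys($module)), PHP_EOL; // sum
```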

Finally, my previous writings on this have mused about possibly having module
files possess vastly different parsing rules. I bring this up as a possibility,
but I won't dwell on it as it could easily become a thread-derailing
distraction. That said, it should be possible to use the autoload callback to
signal to PHP that a block of code should be loaded using the module parsing
rules, if having them differ in any way is desired. That, or a new
require_module statement could be used to pull in code under the module
parsing rules.

As to why this might be desirable - I'm no expert on the engine, but I'm going
to guess that giving each module an independent symbol table will incur
overhead, possibly significant overhead. One way to claw back performance is to
fix bugs that have been unfixable for backwards compatibility reasons. There is
no existing module code, so in one fell swoop this code can step away from
those problems. If this is done, though, it has to be done right, as the window
closes once projects with modules start appearing.