function autoloading v4 RFC

11 months ago by Stephen Reay — view source

unread

Hello internals,

I've decided to attempt an RFC for function autoloading. After reading hundreds of ancient (and recent) emails relating to the topic along with several abandoned RFCs from the past, and after much review, I've decided to put forth a variation of a previous RFC, as it seemed the least ambitious and the most likely to work:

https://wiki.php.net/rfc/function_autoloading4

Please let me know what you think. An implementation should be along before opening it for a vote (now that I realize how important that is).

— Rob

Hi Rob,

I like the simplicity of this, however your RFC doesn't document the changes required to spl_autoload[1] to allow it to keep working with this new functionality.

The same issue (unexpected additional argument) potentially affects userland autoloaders too, but obviously the individual authors can fix that themselves (whether this would count as a BC break is not immediately clear to me)

Slightly tangentially, you may also want to look at a change to spl_autoload_call to accept a SPL_AUTOLOAD_* argument, so that it works consistently.

Cheers

Stephen

1: https://www.php.net/spl_autoload

10 months ago by Rob Landers — view source

unread

Hello internals,

I've decided to attempt an RFC for function autoloading. After reading hundreds of ancient (and recent) emails relating to the topic along with several abandoned RFCs from the past, and after much review, I've decided to put forth a variation of a previous RFC, as it seemed the least ambitious and the most likely to work:

https://wiki.php.net/rfc/function_autoloading4

Please let me know what you think. An implementation should be along before opening it for a vote (now that I realize how important that is).

— Rob

Hi Rob,

I like the simplicity of this, however your RFC doesn't document the changes required to spl_autoload[1] to allow it to keep working with this new functionality.

Ah, good catch. I've updated this and gone through other relevant functions.

The same issue (unexpected additional argument) potentially affects userland autoloaders too, but obviously the individual authors can fix that themselves (whether this would count as a BC break is not immediately clear to me)

Userland functions don't throw that error, so it shouldn't be an issue. (You can pass as many arguments as you want to a userland function, as long as there are enough of them).

Slightly tangentially, you may also want to look at a change to spl_autoload_call to accept a SPL_AUTOLOAD_* argument, so that it works consistently.

Done.

Thank you Stephen for pointing out this oversight.

Cheers

Stephen

1: https://www.php.net/spl_autoload

— Rob

10 months ago by Stephen Reay — view source

unread

Sent from my iPhone

Userland functions don't throw that error, so it shouldn't be an issue. (You can pass as many arguments as you want to a userland function, as long as there are enough of them).

Hi Rob,

I didn't mean the actual internal error about too many args I meant the concept - it's possible a userland autoloader callback has some signature other than a single string parameter.

I don't know if that happens in the wild or if that's considered a BC or not but there is a hypothetical case there where old code would stop working.

Cheers

Stephen

10 months ago by Rob Landers — view source

unread

Sent from my iPhone

Userland functions don't throw that error, so it shouldn't be an issue. (You can pass as many arguments as you want to a userland function, as long as there are enough of them).

Hi Rob,

I didn't mean the actual internal error about too many args I meant the concept - it's possible a userland autoloader callback has some signature other than a single string parameter.

I don't know if that happens in the wild or if that's considered a BC or not but there is a hypothetical case there where old code would stop working.

Cheers

Stephen

I wouldn't consider it a BC break, no. But (ironically?), Symfony crashes with this change. It really shouldn't but if you take a look at https://github.com/symfony/symfony/blob/7.2/src/Symfony/Component/Config/Resource/ClassExistenceResource.php#L70 and it's definition in https://github.com/symfony/symfony/blob/7.2/src/Symfony/Component/Config/Resource/ClassExistenceResource.php#L142, you'll see that they define a function that accepts two parameters. It crashes when it receives a second parameter of int instead of it's expected exception.

I've opened a drive-by PR as an attempt to fix it (https://github.com/symfony/symfony/pull/58030), but I suspect there will be other projects as well that do something silly like that. They'll just have to fix it.

— Rob

10 months ago by Rowan Tommins [IMSoP] — view source

unread

I wouldn't consider it a BC break, no. But (ironically?), Symfony crashes with this change. It really shouldn't but ...

I don't think it makes sense to say "it breaks existing code, but it's not a compatibility break".

Perhaps what you're saying is "it's only a BC break for code that's not following best practices"?

But more relevant than whether you think the current code is "correct" is the fact that a) it will need to be changed to work with your proposal; and b) the change is simple and can be done in advance.

So the RFC should acknowledge this BC break, but could argue that it's small enough to include in a minor version. This is actually really common - RFCs that introduce a new global function often acknowledge that it would break existing userland functions with that name. Between that and obviously serious BC breaks like removing a function, there's a big grey area where we have to make a judgement call.

Regards,
Rowan Tommins
[IMSoP]

10 months ago by Rob Landers — view source

unread

I wouldn't consider it a BC break, no. But (ironically?), Symfony crashes with this change. It really shouldn't but ...

I don't think it makes sense to say "it breaks existing code, but it's not a compatibility break".

Perhaps what you're saying is "it's only a BC break for code that's not following best practices"?

But more relevant than whether you think the current code is "correct" is the fact that a) it will need to be changed to work with your proposal; and b) the change is simple and can be done in advance.

So the RFC should acknowledge this BC break, but could argue that it's small enough to include in a minor version. This is actually really common - RFCs that introduce a new global function often acknowledge that it would break existing userland functions with that name. Between that and obviously serious BC breaks like removing a function, there's a big grey area where we have to make a judgement call.

Regards,
Rowan Tommins
[IMSoP]

Hey Rowan,

Ah, that's a good tip and point. Thank you. I'll update the RFC!

— Rob

10 months ago by Stephen Reay — view source

unread

Hi Rob,

I wouldn't consider it a BC break, no. But (ironically?), Symfony crashes with this change.

I wasn't aware of that specific code before but it's exactly the type of issue I was talking about earlier.

Ah, good catch. I've updated this and gone through other relevant functions.

The RFC now says "The spl_autoload function will not be updated.", but that will also break if it isn't updated to at least account for, even if it doesn't use the second argument given.

However I'm also curious why you would specifically make it not support function loading?
The current implementation should work unmodified, once the signature is changed to accept an int as the second parameter (and move the current 2nd parameter to 3rd), There is nothing "class specific" in the existing implementation except for a couple of variable names.

I would imagine you also need to update spl_autoload_unregister[1] - it also needs to be able to identify the type of autoloader it's operating on (because the same autoloader might be defined for both types).

And lastly I think you'd need to adapt spl_autoload_functions[2] too - perhaps the same as the others, introduce a parameter int $type = SPL_AUTOLOAD_CLASS, so existing code works as-is, otherwise it'd be impossible to know how a given autoloader was registered.

1: https://www.php.net/manual/en/function.spl-autoload-unregister.php
2: https://www.php.net/manual/en/function.spl-autoload-functions.php

Cheers

Stephen

10 months ago by Rob Landers — view source

unread

Hi Rob,

I wouldn't consider it a BC break, no. But (ironically?), Symfony crashes with this change.

I wasn't aware of that specific code before but it's exactly the type of issue I was talking about earlier.

Ah, good catch. I've updated this and gone through other relevant functions.
The RFC now says "The spl_autoload function will not be updated.", but that will also break if it isn't updated to at least account for, even if it doesn't use the second argument given.

However I'm also curious why you would specifically make it not support function loading?
The current implementation should work unmodified, once the signature is changed to accept an int as the second parameter (and move the current 2nd parameter to 3rd), There is nothing "class specific" in the existing implementation except for a couple of variable names.

I would imagine you also need to update spl_autoload_unregister[1] - it also needs to be able to identify the type of autoloader it's operating on (because the same autoloader might be defined for both types).

And lastly I think you'd need to adapt spl_autoload_functions[2] too - perhaps the same as the others, introduce a parameter int $type = SPL_AUTOLOAD_CLASS, so existing code works as-is, otherwise it'd be impossible to know how a given autoloader was registered.

1: https://www.php.net/manual/en/function.spl-autoload-unregister.php
2: https://www.php.net/manual/en/function.spl-autoload-functions.php

Cheers

Stephen

Hello again internals,

The implementation has now reached a stable point and things discovered during the implementation have been reflected in the RFC text itself.

I did my best to assess the impact to Opcache, but I suspect someone much more familiar with how opcache works will need to take a look. From my understanding of the changes needed in zend_execute, it looks like there are some changes needed to the JIT helpers, but there may be more changes required that aren't so obvious. I've added a helper that can be used as a drop-in replacement for looking up functions from the function table directly, which should help.

Lastly, I do plan to open a docs PR in the coming week that will probably trigger some smaller updates to the RFC; mostly to smooth out the wording and make it more friendly for non-technical people, but the technical aspects shouldn't change (barring the discussion here, of course).

The RFC now says "The spl_autoload function will not be updated.", but that will also break if it isn't updated to at least account for, even if it doesn't use the second argument given.

I've updated the text to reflect exactly that, it did require updating. ;)

However I'm also curious why you would specifically make it not support function loading?
The current implementation should work unmodified, once the signature is changed to accept an int as the second parameter (and move the current 2nd parameter to 3rd), There is nothing "class specific" in the existing implementation except for a couple of variable names.

I mostly decided not to support it to avoid the inevitable bikeshedding of how a "default function autoloader" would work. I really want to push that to a separate RFC, unless there was a general consensus of an implementation. If we are fine reusing the existing default class autoloading, then I am fine with that.

And lastly I think you'd need to adapt spl_autoload_functions[2] too - perhaps the same as the others, introduce a parameter int $type = SPL_AUTOLOAD_CLASS, so existing code works as-is, otherwise it'd be impossible to know how a given autoloader was registered.

I've also added those functions as well.

— Rob

11 months ago by Rowan Tommins [IMSoP] — view source

unread

Hello internals,

I've decided to attempt an RFC for function autoloading. After reading
hundreds of ancient (and recent) emails relating to the topic along
with several abandoned RFCs from the past, and after much review, I've
decided to put forth a variation of a previous RFC, as it seemed the
least ambitious and the most likely to work:

https://wiki.php.net/rfc/function_autoloading4

Hi Rob,

While brevity can sometimes be a virtue, I feel like there's a lot left
to the reader's interpretation here.

Specifically, one of the main issues that has come up in previous
discussions of the topic is the behaviour of unqualified function names,
which check the current namespace first, then fall back to global scope.
Your RFC implies an approach to this, but doesn't actually spell it out,
nor discuss its pros and cons.

The fully qualified case is straight-forward: the autoloader is called,
and if still not defined, an error is thrown. But for the unqualified
case, there are multiple scenarios, and you only give the behaviour for
one of them:

Defined in current namespace? | Defined in global namespace? | Proposed
behaviour
------------------------------+------------------------------+--------------------------------------
No                            | No                           |
Autoloader called, with prefixed name
No                            | Yes                          | ???
Yes                           | No                           | ???
Yes                           | Yes                          | ???

The third and fourth cases (where the function exists in the current
namespace) are straight-forward, although it wouldn't hurt to spell them
out: presumably, the namespaced function is used as now, so no
autoloading is needed.

The complex case has always been the second one: the function doesn't
exist in the current namespace, but does exist in the global
namespace. (Or, an autoloader defines it in the global namespace.)

In concrete terms, what does this code output:

spl_autoload_register( function($function, $type) { echo "$function...";
}, type:SPL_AUTOLOAD_FUNCTION);

namespace Foo {
foreach (['hello', 'goodbye'] as $word) {
echo strlen($word), ';';
}
}

a) "Foo\strlen...5;Foo\strlen...7;" (the autoloader is called every time
the function is encountered)
b) "Foo\strlen...5;7;" (the autoloader is called once, then somehow
marked not to run again for this name
c) "5;7;" (the autoloader is never run for this code)

Note that there is an open RFC implementing option (b) by Gina and Dan
here: https://wiki.php.net/rfc/core-autoloading Last I heard, Gina was
still hoping to get back to it.

On a different note, there is no mention of autoloading namespaced
constants in this RFC, unlike in some previous proposals. Is this a
conscious decision to leave them out of scope, or an oversight?

Regards,

--
Rowan Tommins
[IMSoP]

10 months ago by Rob Landers — view source

unread

Hello internals,

I've decided to attempt an RFC for function autoloading. After reading hundreds of ancient (and recent) emails relating to the topic along with several abandoned RFCs from the past, and after much review, I've decided to put forth a variation of a previous RFC, as it seemed the least ambitious and the most likely to work:

https://wiki.php.net/rfc/function_autoloading4

Hi Rob,

While brevity can sometimes be a virtue, I feel like there's a lot left to the reader's interpretation here.

Specifically, one of the main issues that has come up in previous discussions of the topic is the behaviour of unqualified function names, which check the current namespace first, then fall back to global scope. Your RFC implies an approach to this, but doesn't actually spell it out, nor discuss its pros and cons.

It doesn't go too much into details here on purpose, especially since there was some recent discussion on changing the order. That being said, while writing the reply below, I realized it simply wasn't clear enough. I've updated the RFC to be more clear, in regards to the current behavior.

The fully qualified case is straight-forward: the autoloader is called, and if still not defined, an error is thrown. But for the unqualified case, there are multiple scenarios, and you only give the behaviour for one of them:

To fill in your chart:

Defined in current namespace? | Defined in global namespace? | Proposed behaviour
------------------------------+------------------------------+--------------------------------------
No | No | Prefixed name
No | Yes | Prefixed name
Yes | No | No change
Yes | Yes | No change

The third and fourth cases (where the function exists in the current namespace) are straight-forward, although it wouldn't hurt to spell them out: presumably, the namespaced function is used as now, so no autoloading is needed.

The complex case has always been the second one: the function doesn't exist in the current namespace, but does exist in the global namespace. (Or, an autoloader defines it in the global namespace.)

This should have the same behavior as in the class autoloader. In my attempt to be vague (in case the loading order is changed, which I assumed would affect class autoloading as well), I wasn't very clear on this. Meaning if you create a class called "Fiber" and a function called "strlen" in the current namespace, in unloaded files, an autoloader should be given the opportunity to load them.

I should also probably define some vocabulary so this is all less confusing. and Done.

In concrete terms, what does this code output:

spl_autoload_register( function($function, $type) { echo "$function..."; }, type:SPL_AUTOLOAD_FUNCTION);

namespace Foo {
foreach (['hello', 'goodbye'] as $word) {
echo strlen($word), ';';
}
}

a) "Foo\strlen...5;Foo\strlen...7;" (the autoloader is called every time the function is encountered)
b) "Foo\strlen...5;7;" (the autoloader is called once, then somehow marked not to run again for this name
c) "5;7;" (the autoloader is never run for this code)

I believe the "most correct" answer is (a) -- and is what the current class autoloader does as well. Option (b) would make it impossible to do any kind of dynamic code generation or old-fashioned file-based deployments.

namespace global {
spl_autoload_register(function ($function) {
echo "$function...";
});
}

namespace Foo {

foreach (['hello', 'goodbye'] as $word) {
   if (!class_exists(Fiber::class)) {
      echo "Test...";
   }
}

}

Since I foresee (based on previous conversations) about people being worried about performance:

I think this is best left to autoloader authors to manage. If there isn't a function autoloader, then there won't be any performance impact (or at least, minimal). However, if there is one, the author can take into account how their codebase works. For example, I highly suspect FIG will create a PSR for this (eventually), based on how they expect things to work. I suspect a project like WordPress could make use of it and implement things how they expect things to work, etc.

For the case where you have a file full of unqualified strlen()'s, the autoloader author could create an option "do not shadow global functions" where it just returns immediately for built-in functions and doesn't even try autoloading (I also suspect there will be someone who code-golfs this to death so this check is extremely efficient). Is there a performance impact there? Probably not as big as you'd think -- and there already is a performance impact by using unqualified global functions (as shown in a recent proposal to change the order). So, if you are doing that already; you likely don't care about performance anyway.

But I'm going to reserve any serious discussions about performance for when I have an actual implementation to play with (hopefully a PoC this weekend).

Note that there is an open RFC implementing option (b) by Gina and Dan here: https://wiki.php.net/rfc/core-autoloading Last I heard, Gina was still hoping to get back to it.

I went looking for this (I had sworn I had seen it last year), but unfortunately, I did not find it while searching the RFCs for prior work. I had thought it abandoned so I was looking in the wrong place. This RFC does not conflict with that RFC, if Gina/Dan were to pick it back up again. I may borrow the function_exists() change, which I just updated the RFC to do.

On a different note, there is no mention of autoloading namespaced constants in this RFC, unlike in some previous proposals. Is this a conscious decision to leave them out of scope, or an oversight?

I'm leaving them out on purpose to keep the scope tight. I've got a whole year ahead of me.

Regards,

--
Rowan Tommins
[IMSoP]

— Rob

10 months ago by Claude Pache — view source

unread

Le 15 août 2024 à 17:27, Rob Landers rob@bottled.codes a écrit :

Hello internals,

I've decided to attempt an RFC for function autoloading. After reading hundreds of ancient (and recent) emails relating to the topic along with several abandoned RFCs from the past, and after much review, I've decided to put forth a variation of a previous RFC, as it seemed the least ambitious and the most likely to work:

https://wiki.php.net/rfc/function_autoloading4

Please let me know what you think. An implementation should be along before opening it for a vote (now that I realize how important that is).

— Rob

Hi Rob,

There is the following RFC, presented last year; it appears currently inactive, but not necessarily abandoned:

https://wiki.php.net/rfc/core-autoloading

—Claude

10 months ago by Gina P. Banyard — view source

unread

Hello internals,

I've decided to attempt an RFC for function autoloading. After reading hundreds of ancient (and recent) emails relating to the topic along with several abandoned RFCs from the past, and after much review, I've decided to put forth a variation of a previous RFC, as it seemed the least ambitious and the most likely to work:

https://wiki.php.net/rfc/function_autoloading4

Please let me know what you think. An implementation should be along before opening it for a vote (now that I realize how important that is).

I had a quick glance at the RFC, and I really don't think this is a good approach or good API.
Having autoloading in SPL causes dumb issues, for example ext/session must register it has a dependency on ext/spl just for the autoloader. [1]
This means that currently ext/session depends on ext/spl, ext/spl depends on ext/standard, and ext/standard depends on ext/session, a glorious circular dependency.

This main problem has pushed me again to rebase my PR [2] for the Core Autoloading RFC. [3]
The wording of the RFC hasn't been updated, as this is frankly my least favourite part of any RFC.
The PR passes CI and has also produced benchmark results from CI. [4]

I would rather have some collaboration on a proper solution than, IMHO, this suboptimal solution.
I still need to get opinions from other people if it makes sense to remove the zend_autoload_class function pointer API and have the VM directly call zend_autoload.
Because from what I remember 2 years ago, some profiling tools hook into it to track autoloading time.
This might be improved by introducing new observer hooks.

Best regards,

Gina P. Banyard

[1] https://github.com/php/php-src/pull/14544#issuecomment-2294907817
[2] https://github.com/php/php-src/pull/8294
[3] https://wiki.php.net/rfc/core-autoloading
[4] https://github.com/php/php-src/actions/runs/10441267948

10 months ago by Rob Landers — view source

unread

Hello internals,

I've decided to attempt an RFC for function autoloading. After reading hundreds of ancient (and recent) emails relating to the topic along with several abandoned RFCs from the past, and after much review, I've decided to put forth a variation of a previous RFC, as it seemed the least ambitious and the most likely to work:

https://wiki.php.net/rfc/function_autoloading4

Please let me know what you think. An implementation should be along before opening it for a vote (now that I realize how important that is).

I had a quick glance at the RFC, and I really don't think this is a good approach or good API.
Having autoloading in SPL causes dumb issues, for example ext/session must register it has a dependency on ext/spl just for the autoloader. [1]
This means that currently ext/session depends on ext/spl, ext/spl depends on ext/standard, and ext/standard depends on ext/session, a glorious circular dependency.

This main problem has pushed me again to rebase my PR [2] for the Core Autoloading RFC. [3]
The wording of the RFC hasn't been updated, as this is frankly my least favourite part of any RFC.
The PR passes CI and has also produced benchmark results from CI. [4]

I would rather have some collaboration on a proper solution than, IMHO, this suboptimal solution.
I still need to get opinions from other people if it makes sense to remove the zend_autoload_class function pointer API and have the VM directly call zend_autoload.
Because from what I remember 2 years ago, some profiling tools hook into it to track autoloading time.
This might be improved by introducing new observer hooks.

Best regards,

Gina P. Banyard

[1] https://github.com/php/php-src/pull/14544#issuecomment-2294907817
[2] https://github.com/php/php-src/pull/8294
[3] https://wiki.php.net/rfc/core-autoloading
[4] https://github.com/php/php-src/actions/runs/10441267948

Hey Gina,

I’d love to collaborate on this feature. For what it’s worth though, I did a ton of research on it (mostly reading every discussion I could find on the topic, and prior RFCs) and I felt that this API was the most likely to be accepted. You even have a comment on your PR asking why not an API similar to this one (!) though your reasoning is sound for why it is a bad idea, and I believe it is the superior API.

There are some key differences between our two RFC’s that I think are worth discussing (besides the obvious API differences):

in your RFC, it calls the autoloader with the global function it isn’t found in both scopes. In mine, it calls it once not found in the local scope and calls the autoloader once (for the local scope). This seemed to be a highly liked proposal in one of the older discussions (2013-ish?) that appeared to not result in a new RFC, as it would bypass a lot of perceived performance issues in earlier RFCs. If an autoloader so desires, the autoloader can check the global scope (by getting the “base name” of the function), but autoloading the global scope should be a niche application these days.
I like the “pinning” aspect. I haven’t seen your code yet, but I suspect it just registers the global function in the current namespace? If so, does this affect the namespace global? Probably not, but I am just curious. What happens if I manually require a file with a pinned function in it?
Should we update function_exists like I did in mine to include an autoload argument?
You mention no default autoloader for classes and functions, while I agree that this should be the case, will the spl library still provide the default class autoloader that can be registered?
As far as performance for ambiguous functions go, I was thinking of submitting an RFC, where ambiguous function calls are tagged during compilation and always resolve lexically, sorta like how it works now:

echo strlen($x); // resolves to global, always
require_once "my-strlen";
echo strlen($x); // resolves to my strlen, always

This works by basically rewriting the function name once resolved and may make the code more predictable. If I can pull it off, it can be relegated to a technical change that doesn’t need an RFC. Still working on it.

In other words, maybe pinning could be solved more generally in a future RFC, decrease your RFC’s scope and chance for sharp edge cases.

In any case, if you are up for a high bandwidth conversation to collaborate, we can join a call, collaborate in the open, or whatever you feel most comfortable with. I’m very excited to see this feature.

— Rob

10 months ago by Rowan Tommins [IMSoP] — view source

unread

As far as performance for ambiguous functions go, I was thinking of
submitting an RFC, where ambiguous function calls are tagged during
compilation and always resolve lexically, sorta like how it works now:

echo strlen($x); // resolves to global, always
require_once "my-strlen";
echo strlen($x); // resolves to my strlen, always

This works by basically rewriting the function name once resolved and
may make the code more predictable. If I can pull it off, it can be
relegated to a technical change that doesn’t need an RFC. Still
working on it.

I'm not entirely clear what you're suggesting, but I think it might be
either what already happens, or the same as Gina is proposing.

Consider this code [https://3v4l.org/184k3]:

namespace Foo;

foreach ( [1,2,3] as $i ) {
     echo strlen('hello'), ' ';
     shadow_strlen();
     echo strlen('hello'), '; ';
}

function shadow_strlen() {
    if ( ! function_exists('Foo\strlen') ) {
        function strlen($s) {
            return 42;
        }
    }
}

In PHP 5.3, this outputs '5 42; 42 42; 42 42;' That's fairly
straight-forward: each time the function is called, the engine checks if
"Foo\strlen" is defined.

Since PHP 5.4, it outputs '5 42; 5 42; 5 42;' The engine caches the
result of the lookup against each compiled opcode, so the first strlen()
is cached as a call to \strlen() and the second as a call to \Foo\strlen().

As I understand it, Gina is proposing that it would instead output '5 5;
5 5; 5 5;' - the function would be "pinned" by making "\Foo\strlen" an
alias to "\strlen" for the rest of the program, and the function_exists
call would immediately return true.

Neither, as far as I can see, can happen at compile time, because the
compiler doesn't know if and when a definition of \Foo\strlen() will be
encountered.

In other words, maybe pinning could be solved more generally in a
future RFC, decrease your RFC’s scope and chance for sharp edge cases.

If anything, I think it would need to be the other way around: change
the name resolution logic, so that an autoloading proposal was more
palatable, because it didn't require running the autoloader an unbounded
number of times for the same name.

Proposing both at once seems reasonable, as the autoloading gives an
extra benefit to outweigh the breaking change to shadowing behaviour.

Regards,

--
Rowan Tommins
[IMSoP]

10 months ago by Rob Landers — view source

unread

As far as performance for ambiguous functions go, I was thinking of
submitting an RFC, where ambiguous function calls are tagged during
compilation and always resolve lexically, sorta like how it works now:

echo strlen($x); // resolves to global, always
require_once "my-strlen";
echo strlen($x); // resolves to my strlen, always

This works by basically rewriting the function name once resolved and
may make the code more predictable. If I can pull it off, it can be
relegated to a technical change that doesn’t need an RFC. Still
working on it.

I'm not entirely clear what you're suggesting, but I think it might be
either what already happens, or the same as Gina is proposing.

Yeah, that would be a separate RFC, so I didn't go into the weeds, but my point, is that it would result in no change. But here we go :D

Consider this code [https://3v4l.org/184k3]:

namespace Foo;

foreach ( [1,2,3] as $i ) {
echo strlen('hello'), ' ';
shadow_strlen();
echo strlen('hello'), '; ';
}

function shadow_strlen() {
if ( ! function_exists('Foo\strlen') ) {
function strlen($s) {
return 42;
}
}
}

In PHP 5.3, this outputs '5 42; 42 42; 42 42;' That's fairly
straight-forward: each time the function is called, the engine checks if
"Foo\strlen" is defined.

Since PHP 5.4, it outputs '5 42; 5 42; 5 42;' The engine caches the
result of the lookup against each compiled opcode, so the first strlen()
is cached as a call to \strlen() and the second as a call to \Foo\strlen().

As I understand it, Gina is proposing that it would instead output '5 5;
5 5; 5 5;' - the function would be "pinned" by making "\Foo\strlen" an
alias to "\strlen" for the rest of the program, and the function_exists
call would immediately return true.

Neither, as far as I can see, can happen at compile time, because the
compiler doesn't know if and when a definition of \Foo\strlen() will be
encountered.

To be fair, this isn't even really completely figured out yet. I was mostly wanting to point out that it maybe could be a totally separate issue. But, the gist is that at compile time, we can mark a function as "ambiguous," meaning we don't really know if the function exists (because it isn't fully qualified). The main issue with Gina's implementation (if I am understanding it properly, and I potentially am not, so take this with a grain of salt) is that this (https://3v4l.org/0jCpW) could fail if it were pinned, where before, it would not.

In my idea, a function becomes pinned until it isn't, with strict rules that kick it out of being pinned and I think those rules are complex enough to warrant being a completely different RFC or PR.

In other words, maybe pinning could be solved more generally in a
future RFC, decrease your RFC’s scope and chance for sharp edge cases.

If anything, I think it would need to be the other way around: change
the name resolution logic, so that an autoloading proposal was more
palatable, because it didn't require running the autoloader an unbounded
number of times for the same name.

Proposing both at once seems reasonable, as the autoloading gives an
extra benefit to outweigh the breaking change to shadowing behaviour.

Regards,

I assume you are worried about something like this passing test?

--TEST--
show called only once
--FILE--
<?php

namespace test;

spl_autoload_register(function($name) {
echo "name=$name\n";
}, true, false, SPL_AUTOLOAD_FUNCTION);

echo strlen('foo');
echo strlen('bar');
echo strlen('baz');
?>
--EXPECT--
name=test\strlen
333

In my RFC, I mention it is called exactly once. I could maybe add it as a test in the PR. I've committed it as another test on the RFC implementation.

--
Rowan Tommins
[IMSoP]

— Rob

10 months ago by Rowan Tommins [IMSoP] — view source

unread

I assume you are worried about something like this passing test?

--TEST--
show called only once
--FILE--
<?php

namespace test;

spl_autoload_register(function($name) {
echo "name=$name\n";
}, true, false, SPL_AUTOLOAD_FUNCTION);

echo strlen('foo');
echo strlen('bar');
echo strlen('baz');
?>
--EXPECT--
name=test\strlen
333

In my RFC, I mention it is called exactly once.

I haven't looked at the PR, only the RFC, and I did not see this very important detail explained anywhere. The only thing I can see is this rather ambiguous sentence:

The function autoloader will not be called again.

That could mean not called again for the current call (compared with proposals that call it a second time with the unequalled name); it could mean not called again for the current line of code (based on the current caching behaviour); or never called again for that combination of namespace and name; or possibly, never called again for that combination of namespace, name, and callback function.

That's not a small detail of the implementation, it's a really fundamental difference from previous proposals.

So I would like to repeat my first response to your RFC: that it should sound more time explaining your approach to the multiple lookup problem.

Regards,
Rowan Tommins
[IMSoP]

10 months ago by Rob Landers — view source

unread

I assume you are worried about something like this passing test?

--TEST--
show called only once
--FILE--
<?php

namespace test;

spl_autoload_register(function($name) {
echo "name=$name\n";
}, true, false, SPL_AUTOLOAD_FUNCTION);

echo strlen('foo');
echo strlen('bar');
echo strlen('baz');
?>
--EXPECT--
name=test\strlen
333

In my RFC, I mention it is called exactly once.

I haven't looked at the PR, only the RFC, and I did not see this very important detail explained anywhere. The only thing I can see is this rather ambiguous sentence:

The function autoloader will not be called again.

That could mean not called again for the current call (compared with proposals that call it a second time with the unequalled name); it could mean not called again for the current line of code (based on the current caching behaviour); or never called again for that combination of namespace and name; or possibly, never called again for that combination of namespace, name, and callback function.

That's not a small detail of the implementation, it's a really fundamental difference from previous proposals.

So I would like to repeat my first response to your RFC: that it should sound more time explaining your approach to the multiple lookup problem.

Regards,
Rowan Tommins
[IMSoP]

Thanks Rowan,

That's a fair critique.

I expect some of the wording will be more clear once I write out the documentation -- even if it isn't used directly, I tend to write out documentation to force myself to reconcile the code with the plan, find logic bugs, perform larger scale tests, and create tests to verify assertions in the documentation. From there, I'll update the plan or code to get everything to match and spend some time on clarity. It's the hardest part, IMHO, as it requires diligently ensuring everything is correct. In other words, writing the documentation makes it feel like a "real thing" and it triggers what small amount of perfectionism I have.

— Rob

10 months ago by Rob Landers — view source

unread

I assume you are worried about something like this passing test?

--TEST--
show called only once
--FILE--
<?php

namespace test;

spl_autoload_register(function($name) {
echo "name=$name\n";
}, true, false, SPL_AUTOLOAD_FUNCTION);

echo strlen('foo');
echo strlen('bar');
echo strlen('baz');
?>
--EXPECT--
name=test\strlen
333

In my RFC, I mention it is called exactly once.

I haven't looked at the PR, only the RFC, and I did not see this very important detail explained anywhere. The only thing I can see is this rather ambiguous sentence:

The function autoloader will not be called again.

That could mean not called again for the current call (compared with proposals that call it a second time with the unequalled name); it could mean not called again for the current line of code (based on the current caching behaviour); or never called again for that combination of namespace and name; or possibly, never called again for that combination of namespace, name, and callback function.

That's not a small detail of the implementation, it's a really fundamental difference from previous proposals.

So I would like to repeat my first response to your RFC: that it should sound more time explaining your approach to the multiple lookup problem.

Regards,
Rowan Tommins
[IMSoP]

Thanks Rowan,

That's a fair critique.

I expect some of the wording will be more clear once I write out the documentation -- even if it isn't used directly, I tend to write out documentation to force myself to reconcile the code with the plan, find logic bugs, perform larger scale tests, and create tests to verify assertions in the documentation. From there, I'll update the plan or code to get everything to match and spend some time on clarity. It's the hardest part, IMHO, as it requires diligently ensuring everything is correct. In other words, writing the documentation makes it feel like a "real thing" and it triggers what small amount of perfectionism I have.

— Rob

I have an experimental library that I use for testing these kinds of things. There are aspects of it that I could work with to make use of function autoloading. Thus, I did so and benchmarked the performance of unit tests. The unit testing library makes a ton of "unqualified function calls".

I spent some time working on two autoloaders:

A naive autoloader: parses out the file to load, checks if it exists, and then requires the file.
An optimized autoloader: only cares about the namespace it has registered. All others are an instant return.
In the "vanilla" case, I was mostly concerned with variation. I wanted a statistically significant result, so once I got my system into a stable state and I was no longer seeing any variance, I started benchmarking.

For the naive autoloader, I saw a performance degradation of about 6% and lots of variability. This is probably due to the "file_exists" check being done every time an unqualified name was called.

However, for the optimized autoloader, I ended up with less variability (🤔) than the vanilla approach and absolutely no measurable performance degradation.

Now, time to try this on a larger scale... WordPress. It's pretty much the only large codebase I know of that makes use of tons of functions.

— Rob

10 months ago by Matthew Weier O'Phinney — view source

unread

I assume you are worried about something like this passing test?

--TEST--
show called only once
--FILE--
<?php

namespace test;

spl_autoload_register(function($name) {
echo "name=$name\n";
}, true, false, SPL_AUTOLOAD_FUNCTION);

echo strlen('foo');
echo strlen('bar');
echo strlen('baz');
?>
--EXPECT--
name=test\strlen
333

In my RFC, I mention it is called exactly once.

I haven't looked at the PR, only the RFC, and I did not see this very
important detail explained anywhere. The only thing I can see is this
rather ambiguous sentence:

The function autoloader will not be called again.

That could mean not called again for the current call (compared with
proposals that call it a second time with the unequalled name); it could
mean not called again for the current line of code (based on the current
caching behaviour); or never called again for that combination of namespace
and name; or possibly, never called again for that combination of
namespace, name, and callback function.

That's not a small detail of the implementation, it's a really fundamental
difference from previous proposals.

So I would like to repeat my first response to your RFC: that it should
sound more time explaining your approach to the multiple lookup problem.

Regards,
Rowan Tommins
[IMSoP]

Thanks Rowan,

That's a fair critique.

I expect some of the wording will be more clear once I write out the
documentation -- even if it isn't used directly, I tend to write out
documentation to force myself to reconcile the code with the plan, find
logic bugs, perform larger scale tests, and create tests to verify
assertions in the documentation. From there, I'll update the plan or code
to get everything to match and spend some time on clarity. It's the hardest
part, IMHO, as it requires diligently ensuring everything is correct. In
other words, writing the documentation makes it feel like a "real thing"
and it triggers what small amount of perfectionism I have.

— Rob

I have an experimental library that I use for testing these kinds of
things. There are aspects of it that I could work with to make use of
function autoloading. Thus, I did so and benchmarked the performance of
unit tests. The unit testing library makes a ton of "unqualified function
calls".

I spent some time working on two autoloaders:

A naive autoloader: parses out the file to load, checks if it
exists, and then requires the file.

An optimized autoloader: only cares about the namespace it has
registered. All others are an instant return.

Not to sound flippant, but you do realize that composer does a lot of
optimizations just like this, right?

PSR-4 exists in large part due to a realization that better optimizations
were possible when you mapped namespaces to source code directories. And
Composer takes it a step further when you have it create an optimized
loader, because then it maps classes directly to the files that provide
them, preventing I/O up to the point that a require is performed.

My point is that this is why folks have been suggesting that this is a
solved problem. Globally qualify functions, and you get immediate
performance benefits due to removal of the need to look up via the
namespace first. Most CS tools will even add the imports or qualifiers for
you, and you can have your IDE or editor configured to do it as well.

I wouldn't spend your time on this facet.

In the "vanilla" case, I was mostly concerned with variation. I wanted a
statistically significant result, so once I got my system into a stable
state and I was no longer seeing any variance, I started benchmarking.

For the naive autoloader, I saw a performance degradation of about 6% and
lots of variability. This is probably due to the "file_exists" check being
done every time an unqualified name was called.

However, for the optimized autoloader, I ended up with less variability
(🤔) than the vanilla approach and absolutely no measurable performance
degradation.

Now, time to try this on a larger scale... WordPress. It's pretty much the
only large codebase I know of that makes use of tons of functions.

— Rob

10 months ago by Rob Landers — view source

unread

__

I assume you are worried about something like this passing test?

--TEST--
show called only once
--FILE--
<?php

namespace test;

spl_autoload_register(function($name) {
echo "name=$name\n";
}, true, false, SPL_AUTOLOAD_FUNCTION);

echo strlen('foo');
echo strlen('bar');
echo strlen('baz');
?>
--EXPECT--
name=test\strlen
333

In my RFC, I mention it is called exactly once.

I haven't looked at the PR, only the RFC, and I did not see this very important detail explained anywhere. The only thing I can see is this rather ambiguous sentence:

The function autoloader will not be called again.

That could mean not called again for the current call (compared with proposals that call it a second time with the unequalled name); it could mean not called again for the current line of code (based on the current caching behaviour); or never called again for that combination of namespace and name; or possibly, never called again for that combination of namespace, name, and callback function.

That's not a small detail of the implementation, it's a really fundamental difference from previous proposals.

So I would like to repeat my first response to your RFC: that it should sound more time explaining your approach to the multiple lookup problem.

Regards,
Rowan Tommins
[IMSoP]

Thanks Rowan,

That's a fair critique.

I expect some of the wording will be more clear once I write out the documentation -- even if it isn't used directly, I tend to write out documentation to force myself to reconcile the code with the plan, find logic bugs, perform larger scale tests, and create tests to verify assertions in the documentation. From there, I'll update the plan or code to get everything to match and spend some time on clarity. It's the hardest part, IMHO, as it requires diligently ensuring everything is correct. In other words, writing the documentation makes it feel like a "real thing" and it triggers what small amount of perfectionism I have.

— Rob

I have an experimental library that I use for testing these kinds of things. There are aspects of it that I could work with to make use of function autoloading. Thus, I did so and benchmarked the performance of unit tests. The unit testing library makes a ton of "unqualified function calls".

I spent some time working on two autoloaders:

A naive autoloader: parses out the file to load, checks if it exists, and then requires the file.

An optimized autoloader: only cares about the namespace it has registered. All others are an instant return.

Not to sound flippant, but you do realize that composer does a lot of optimizations just like this, right?

PSR-4 exists in large part due to a realization that better optimizations were possible when you mapped namespaces to source code directories. And Composer takes it a step further when you have it create an optimized loader, because then it maps classes directly to the files that provide them, preventing I/O up to the point that a require is performed.

My point is that this is why folks have been suggesting that this is a solved problem. Globally qualify functions, and you get immediate performance benefits due to removal of the need to look up via the namespace first. Most CS tools will even add the imports or qualifiers for you, and you can have your IDE or editor configured to do it as well.

I wouldn't spend your time on this facet.

There are people worried about the performance issues with adding function autoloading (due to having to look up functions at every call, for unqualified global). And yes, Composer does this for classes, but it doesn't (yet) do it for functions, meaning it has to be done manually. So, my goal here is to either (1), convince people it is a non-issue, or (2) show that it is an issue and we should take steps to fix it (such as flipping the order functions are looked up, or whatever).

Now ... on to WordPress!

WordPress makes very few unqualified function calls; however, it can benefit from function autoloading since it uses a number of functions files and such. I had to patch it for this, remove all the require's and generate a function map for the autoloader. I ran three tests:

Vanilla WordPress with no changes
Vanilla WordPress with no changes, but an autoloader that just returned
Modified WordPress to take full advantage of function autoloading
Graphs are coming in the RFC, but it looks like WordPress would gain a 1-3% speed improvement if they used function autoloading. If anyone knows of any other function-heavy projects, I'd love to throw them at this as well. Tomorrow I want to take a look at some of the projects Ilija looked into that use a number of unqualified functions and see if they will take a major performance hit by having function autoloading enabled.

— Rob

10 months ago by Mel Dafert — view source

unread

Hello internals,

I've decided to attempt an RFC for function autoloading. After reading hundreds of ancient (and recent) emails relating to the topic along with several abandoned RFCs from the past, and after much review, I've decided to put forth a variation of a previous RFC, as it seemed the least ambitious and the most likely to work:

https://wiki.php.net/rfc/function_autoloading4

Please let me know what you think. An implementation should be along before opening it for a vote (now that I realize how important that is).

— Rob

Hello,

I've also noted that on another RFC, but I think it applies here too:
Any new constants that are shaped like an enum and used like an enum should just be
an actual enum. I think instead of SPL_AUTOLOAD_CLASS and SPL_AUTOLOAD_FUNCTION,
there should be an enum AutoloadType (to be bikeshed), and the new parameters
should be typed as such, instead of int.

Regards,
Mel

10 months ago by Rob Landers — view source

unread

Hello internals,

I've decided to attempt an RFC for function autoloading. After reading hundreds of ancient (and recent) emails relating to the topic along with several abandoned RFCs from the past, and after much review, I've decided to put forth a variation of a previous RFC, as it seemed the least ambitious and the most likely to work:

https://wiki.php.net/rfc/function_autoloading4

Please let me know what you think. An implementation should be along before opening it for a vote (now that I realize how important that is).

— Rob

Hello,

I've also noted that on another RFC, but I think it applies here too:
Any new constants that are shaped like an enum and used like an enum should just be
an actual enum. I think instead of SPL_AUTOLOAD_CLASS and SPL_AUTOLOAD_FUNCTION,
there should be an enum AutoloadType (to be bikeshed), and the new parameters
should be typed as such, instead of int.

Regards,
Mel

Hey Mel,

I spent a couple of hours trying to do this today. But due to the weird dependency hell that SPL lives in right now, I'm not sure it can be an enum (or at least, the linker kept messing up my version of PHP to the point where it couldn't even run—it was bizarre; I've only seen stuff like that with incompatible cgo shenanigans. I will have to try again later when I next rebase my branch).

I've reached out to Gina about their RFC and my RFC joining forces. Gina's RFC introduces an entirely new API for autoloading and neatly side-steps the whole thing.

— Rob

10 months ago by Rob Landers — view source

unread

Hello internals,

I've decided to attempt an RFC for function autoloading. After reading hundreds of ancient (and recent) emails relating to the topic along with several abandoned RFCs from the past, and after much review, I've decided to put forth a variation of a previous RFC, as it seemed the least ambitious and the most likely to work:

https://wiki.php.net/rfc/function_autoloading4

Please let me know what you think. An implementation should be along before opening it for a vote (now that I realize how important that is).

— Rob

Hello internals,

I've updated the RFC text and clarified some language (hopefully) while also addressing a number of concerns.

I've removed the BC break—the 'type' of the autoloadee will not be passed to the autoloader. This can allow someone to use spl_autoload for function autoloading if they so desire.
I've removed artificial restrictions on the constants in which all functions that take them can accept both at the same time and behave appropriately.
I've performed multiple benchmarks on several projects (WordPress, Symfony, and others from techempower) and couldn't detect a statistically significant performance difference with this feature in multiple configurations.
Some simplified aspects of Gina's implementation for core autoloading were integrated into the patch. However, I see that RFC as a general refactoring of the API more than function autoloading. This RFC is strictly about function autoloading while being very conservative with the autoloading API.
While minor issues in the PR are still being worked on, I plan to open the vote next Wednesday, September 11th 2024, barring any significant pushback or objections.

— Rob

10 months ago by Jakob Givoni — view source

unread

I've removed the BC break—the 'type' of the autoloadee will not be
passed to the autoloader. This can allow someone to use spl_autoload for
function autoloading if they so desire.

Unless I'm missing something, the main example in the RFC still shows a
function which expects the $type as the second parameter. Is that
intentional?

I've removed artificial restrictions on the constants in which all
functions that take them can accept both at the same time and behave
appropriately.

I'm not a big fan passing flags and using binary operations to combine
options into a single parameter.
It works, but it's opaque and old-school.
We have both named parameters and enums now, can't we just use those going
forward, making each option a separate parameter or using enums with 3
cases, FUNCTION, CLASS or BOTH?

Best,
Jakob

10 months ago by Rob Landers — view source

unread

I've removed the BC break—the 'type' of the autoloadee will not be passed to the autoloader. This can allow someone to use spl_autoload for function autoloading if they so desire.

Unless I'm missing something, the main example in the RFC still shows a function which expects the $type as the second parameter. Is that intentional?

Nope! Thanks for the heads-up; despite reading it four or five times, I missed this! I've now fixed it.

I've removed artificial restrictions on the constants in which all functions that take them can accept both at the same time and behave appropriately.

I'm not a big fan passing flags and using binary operations to combine options into a single parameter.
It works, but it's opaque and old-school.
We have both named parameters and enums now, can't we just use those going forward, making each option a separate parameter or using enums with 3 cases, FUNCTION, CLASS or BOTH?

Thank you for your opinion, but for cases like this, enums is probably one of the worst choices IMHO. As mentioned towards the end of the RFC, I'd like to add further support for other things, such as constants and stream filters. Further, it appears that enums cannot be used in SPL (at least, I couldn't get it to link) due to SPL having a recursive dependency on itself. This is what Gina's RFC seeks to rectify by breaking autoloading out of SPL. This RFC focuses purely on function autoloading.

— Rob

10 months ago by Jakob Givoni — view source

unread

I've removed artificial restrictions on the constants in which all
functions that take them can accept both at the same time and behave
appropriately.

I'm not a big fan passing flags and using binary operations to combine
options into a single parameter.
It works, but it's opaque and old-school.
We have both named parameters and enums now, can't we just use those going
forward, making each option a separate parameter or using enums with 3
cases, FUNCTION, CLASS or BOTH?

Thank you for your opinion, but for cases like this, enums is probably one
of the worst choices IMHO. As mentioned towards the end of the RFC, I'd
like to add further support for other things, such as constants and stream
filters. Further, it appears that enums cannot be used in SPL (at least, I
couldn't get it to link) due to SPL having a recursive dependency on
itself. This is what Gina's RFC seeks to rectify by breaking autoloading
out of SPL. This RFC focuses purely on function autoloading.

I see that under "Future scope" you put:

Potentially, constants and stream wrappers can be added in a similar
fashion.

Trying to figure out if you are referring to the possibility of autoloading
stream wrappers and constants? Is that something there's a need/desire for?

In any case, it seems less than ideal to introduce a change to existing
functions now only to change them again after Gina's RFC.
Which means we'll be locked into binary flags even if enums (not even sure
what you have against their use in this case) will be possible later.

I also mentioned named parameters as an alternative, any thoughts on that?
F.ex. changing one of your examples to:

spl_autoload_call('func', function: true, class: true); // Calls both
autoloaders with the name 'func'

Best,
Jakob

10 months ago by Rob Landers — view source

unread

__

I've removed artificial restrictions on the constants in which all functions that take them can accept both at the same time and behave appropriately.

I'm not a big fan passing flags and using binary operations to combine options into a single parameter.
It works, but it's opaque and old-school.
We have both named parameters and enums now, can't we just use those going forward, making each option a separate parameter or using enums with 3 cases, FUNCTION, CLASS or BOTH?

Thank you for your opinion, but for cases like this, enums is probably one of the worst choices IMHO. As mentioned towards the end of the RFC, I'd like to add further support for other things, such as constants and stream filters. Further, it appears that enums cannot be used in SPL (at least, I couldn't get it to link) due to SPL having a recursive dependency on itself. This is what Gina's RFC seeks to rectify by breaking autoloading out of SPL. This RFC focuses purely on function autoloading.

I see that under "Future scope" you put:

Potentially, constants and stream wrappers can be added in a similar fashion.
Trying to figure out if you are referring to the possibility of autoloading stream wrappers and constants? Is that something there's a need/desire for?

In any case, it seems less than ideal to introduce a change to existing functions now only to change them again after Gina's RFC.

Which means we'll be locked into binary flags even if enums (not even sure what you have against their use in this case) will be possible later.

I also mentioned named parameters as an alternative, any thoughts on that?
F.ex. changing one of your examples to:

spl_autoload_call('func', function: true, class: true); // Calls both autoloaders with the name 'func'

Hey Jakob,

Thank you for sharing your thoughts on this. I see where you’re coming from regarding enums and named parameters as a more modern approach, and I appreciate your suggestion to consider them.

For the current RFC, I leaned towards using binary flags because they provide a level of consistency with existing PHP functions. There are also some technical challenges with using enums here, especially since enums seem incompatible with SPL due to its recursive dependency. This is something I hope we can address in the future, potentially through Gina's RFC.

I think named parameters could be a good middle ground and might make the function calls clearer. If you don't mind me asking, what do you have against binary flags?

I also understand your concern about making adjustments to the API twice and how it might have to change again if Gina's RFC passes. My thinking was that by moving forward with this proposal, we at least make some progress on function autoloading, which has been debated, off-and-on, for a long time.

I'm not opposed to holding off on merging the implementation if it makes sense to wait and see how Gina's proposal develops, but I also don't want to risk withdrawing my RFC and Gina's not passing for unrelated reasons; I hope that makes sense.

— Rob

10 months ago by Jakob Givoni — view source

unread

Hi Rob,

I've removed artificial restrictions on the constants in which all
functions that take them can accept both at the same time and behave
appropriately.

I'm not a big fan passing flags and using binary operations to combine
options into a single parameter.
It works, but it's opaque and old-school.
We have both named parameters and enums now, can't we just use those going
forward, making each option a separate parameter or using enums with 3
cases, FUNCTION, CLASS or BOTH?

Thank you for your opinion, but for cases like this, enums is probably one
of the worst choices IMHO. As mentioned towards the end of the RFC, I'd
like to add further support for other things, such as constants and stream
filters. Further, it appears that enums cannot be used in SPL (at least, I
couldn't get it to link) due to SPL having a recursive dependency on
itself. This is what Gina's RFC seeks to rectify by breaking autoloading
out of SPL. This RFC focuses purely on function autoloading.

I see that under "Future scope" you put:

Potentially, constants and stream wrappers can be added in a similar
fashion.

Trying to figure out if you are referring to the possibility
of autoloading stream wrappers and constants? Is that something there's a
need/desire for?

In any case, it seems less than ideal to introduce a change to existing
functions now only to change them again after Gina's RFC.

Which means we'll be locked into binary flags even if enums (not even sure
what you have against their use in this case) will be possible later.

I also mentioned named parameters as an alternative, any thoughts on that?
F.ex. changing one of your examples to:

spl_autoload_call('func', function: true, class: true); // Calls both
autoloaders with the name 'func'

Hey Jakob,

Thank you for sharing your thoughts on this. I see where you’re coming
from regarding enums and named parameters as a more modern approach, and I
appreciate your suggestion to consider them.

For the current RFC, I leaned towards using binary flags because they
provide a level of consistency with existing PHP functions. There are also
some technical challenges with using enums here, especially since enums
seem incompatible with SPL due to its recursive dependency. This is
something I hope we can address in the future, potentially through Gina's
RFC.

I think named parameters could be a good middle ground and might make the
function calls clearer. If you don't mind me asking, what do you have
against binary flags?

Sure! I've always just intuitively disliked them, but here's an attempt at
rationalizing my distaste:

It's not intuitive and easy to remember. If you are a casual user of
functions accepting bitmasks, you may remember that you can pass several
flags, but what is the syntax again? Say I want to pass FLAG_1 AND FLAG_2,
so I use the AND operator, right? That would be the most intuitive
answer... But no, it's OR - and how do we write OR in php? Using double
pipe ||, right? Ah, no for bitwise operations it's a single pipe |. Not to
mention that sometimes you need to use the exclusive or the not operator to
achieve your goal. The error_reporting function is another famous example
that needs its own bitmask calculator page! It's not a great user
experience and can easily lead to subtle bugs...
There's no standard way to document them. F.ex. see the documentation for
json_encode. The function API just says it's an integer. You have to look
further to understand that it's actually a "bitmask" - which is not a
native type in PHP. And you have to go to another page about bitwise
operators to understand how to compose the flags. Your IDE will most
probably not help you the least, the way it would if it was an enum or a
named boolean parameter.
Using a bitmask parameter seems "lazy" on part of the implementer (no
offense intended!), putting all the work of understanding the flags and how
to compose them on the end-user.

That being said, if the PHP community is generally enthusiastic about
bitmasks, it's not a deal breaker for me :-)

I also understand your concern about making adjustments to the API twice
and how it might have to change again if Gina's RFC passes. My thinking was
that by moving forward with this proposal, we at least make some progress
on function autoloading, which has been debated, off-and-on, for a long
time.

Function autoloading is much needed, thank you for working on this!

I'm not opposed to holding off on merging the implementation if it makes
sense to wait and see how Gina's proposal develops, but I also don't want
to risk withdrawing my RFC and Gina's not passing for unrelated reasons; I
hope that makes sense.

Your proposed PHP version is 8.5 or later, maybe there's still time to see
how it unfolds?

— Rob

10 months ago by Jordi Boggiano — view source

unread

I'm not opposed to holding off on merging the implementation if it
makes sense to wait and see how Gina's proposal develops, but I also
don't want to risk withdrawing my RFC and Gina's not passing for
unrelated reasons; I hope that makes sense.

I would not suggest withdrawing it but considering it would target a PHP
release coming in late 2025 at the earliest, I think it might be ok to
delay the vote a bit and first see how Gina's RFC is faring. Or even
collaborate further on a more unified RFC? Up to you of course, but I
just wanted to emphasize that there is no rush here.

Best,
Jordi

--
Jordi Boggiano
@seldaek -https://seld.be

10 months ago by Rob Landers — view source

unread

__

I've removed artificial restrictions on the constants in which all functions that take them can accept both at the same time and behave appropriately.

I'm not a big fan passing flags and using binary operations to combine options into a single parameter.
It works, but it's opaque and old-school.
We have both named parameters and enums now, can't we just use those going forward, making each option a separate parameter or using enums with 3 cases, FUNCTION, CLASS or BOTH?

Thank you for your opinion, but for cases like this, enums is probably one of the worst choices IMHO. As mentioned towards the end of the RFC, I'd like to add further support for other things, such as constants and stream filters. Further, it appears that enums cannot be used in SPL (at least, I couldn't get it to link) due to SPL having a recursive dependency on itself. This is what Gina's RFC seeks to rectify by breaking autoloading out of SPL. This RFC focuses purely on function autoloading.

I see that under "Future scope" you put:

Potentially, constants and stream wrappers can be added in a similar fashion.
Trying to figure out if you are referring to the possibility of autoloading stream wrappers and constants? Is that something there's a need/desire for?

Hey Jakob,

I'm replying this in a separate thread because it is more or less 'meta' than my other reply. There have been discussions about this off-and-on for awhile, but the gist is that there are people that would find it useful. For example, in my experimental "time" library there is a SECOND and MINUTE constant that must be required if you have the library, even if your code execution path never uses them. So, I think there might be some usefulness to constant autoloading. Further, there are some great async libraries that use stream wrappers to hijack/use file_get_contents and friends in an async way, so having autoloading there may also be useful.

That being said, I wanted to constrain the scope initially as this appeared to be a hot topic (autoloading functions) every time it came up. Thus I was preparing myself for a long and drawn-out discussion and kept the scope minimal.

I hope that helps explain why it is under future scope.

— Rob

10 months ago by Jakob Givoni — view source

unread

I've removed artificial restrictions on the constants in which all
functions that take them can accept both at the same time and behave
appropriately.

I'm not a big fan passing flags and using binary operations to combine
options into a single parameter.
It works, but it's opaque and old-school.
We have both named parameters and enums now, can't we just use those going
forward, making each option a separate parameter or using enums with 3
cases, FUNCTION, CLASS or BOTH?

Thank you for your opinion, but for cases like this, enums is probably one
of the worst choices IMHO. As mentioned towards the end of the RFC, I'd
like to add further support for other things, such as constants and stream
filters. Further, it appears that enums cannot be used in SPL (at least, I
couldn't get it to link) due to SPL having a recursive dependency on
itself. This is what Gina's RFC seeks to rectify by breaking autoloading
out of SPL. This RFC focuses purely on function autoloading.

I see that under "Future scope" you put:

Potentially, constants and stream wrappers can be added in a similar
fashion.

Trying to figure out if you are referring to the possibility
of autoloading stream wrappers and constants? Is that something there's a
need/desire for?

Hey Jakob,

I'm replying this in a separate thread because it is more or less 'meta'
than my other reply. There have been discussions about this off-and-on for
awhile, but the gist is that there are people that would find it useful.
For example, in my experimental "time" library there is a SECOND and MINUTE
constant that must be required if you have the library, even if your code
execution path never uses them. So, I think there might be some usefulness
to constant autoloading. Further, there are some great async libraries that
use stream wrappers to hijack/use file_get_contents and friends in an async
way, so having autoloading there may also be useful.

Ok, I had not thought about it, but it makes sense. Thanks for the
explanation!

That being said, I wanted to constrain the scope initially as this
appeared to be a hot topic (autoloading functions) every time it came up.
Thus I was preparing myself for a long and drawn-out discussion and kept
the scope minimal.

I hope that helps explain why it is under future scope.

— Rob

10 months ago by Gina P. Banyard — view source

unread

Hello internals,

I've decided to attempt an RFC for function autoloading. After reading hundreds of ancient (and recent) emails relating to the topic along with several abandoned RFCs from the past, and after much review, I've decided to put forth a variation of a previous RFC, as it seemed the least ambitious and the most likely to work:

https://wiki.php.net/rfc/function_autoloading4

Please let me know what you think. An implementation should be along before opening it for a vote (now that I realize how important that is).

— Rob

Hello internals,

I've updated the RFC text and clarified some language (hopefully) while also addressing a number of concerns.

I've removed the BC break—the 'type' of the autoloadee will not be passed to the autoloader. This can allow someone to use spl_autoload for function autoloading if they so desire.

I've removed artificial restrictions on the constants in which all functions that take them can accept both at the same time and behave appropriately.

I've performed multiple benchmarks on several projects (WordPress, Symfony, and others from techempower) and couldn't detect a statistically significant performance difference with this feature in multiple configurations.

Some simplified aspects of Gina's implementation for core autoloading were integrated into the patch. However, I see that RFC as a general refactoring of the API more than function autoloading. This RFC is strictly about function autoloading while being very conservative with the autoloading API.

While minor issues in the PR are still being worked on, I plan to open the vote next Wednesday, September 11th 2024, barring any significant pushback or objections

What is the rush in wanting to put this RFC to a vote now?
You know that I was on holidays for 2 weeks and that I am open to collaborate, as I replied to you off list.
What benefit does opening the vote on the RFC give?
You are giving busy work for RFC voters when there is another related RFC and there is still a whole year to work on it.
I'd expect this behaviour if this were June, not September....

I can understand wanting to get stuff moving quickly, but this is not how things work, adding support for something in a jank way is not better than not adding it at all.
I still do not believe that shoehorning more stuff into the current SPL mechnaism is good, I do not think using a bitflag for setting the autoloaders is a good nor sensible API (because if you class and function autoloader behave the same I have questions).
Reading the RFC I have no idea how you are tackling the global namespace fallback, nor how you are going to prevent the lookup happening multiple times.

As such in the current state I will vote against it, and be annoyed that you are making me do busy work.

Best regards,
Gina P. Banyard

10 months ago by Rob Landers — view source

unread

Hello internals,

I've decided to attempt an RFC for function autoloading. After reading hundreds of ancient (and recent) emails relating to the topic along with several abandoned RFCs from the past, and after much review, I've decided to put forth a variation of a previous RFC, as it seemed the least ambitious and the most likely to work:

https://wiki.php.net/rfc/function_autoloading4

Please let me know what you think. An implementation should be along before opening it for a vote (now that I realize how important that is).

— Rob

Hello internals,

I've updated the RFC text and clarified some language (hopefully) while also addressing a number of concerns.

I've removed the BC break—the 'type' of the autoloadee will not be passed to the autoloader. This can allow someone to use spl_autoload for function autoloading if they so desire.

I've removed artificial restrictions on the constants in which all functions that take them can accept both at the same time and behave appropriately.

I've performed multiple benchmarks on several projects (WordPress, Symfony, and others from techempower) and couldn't detect a statistically significant performance difference with this feature in multiple configurations.

Some simplified aspects of Gina's implementation for core autoloading were integrated into the patch. However, I see that RFC as a general refactoring of the API more than function autoloading. This RFC is strictly about function autoloading while being very conservative with the autoloading API.
While minor issues in the PR are still being worked on, I plan to open the vote next Wednesday, September 11th 2024, barring any significant pushback or objections

What is the rush in wanting to put this RFC to a vote now?
You know that I was on holidays for 2 weeks and that I am open to collaborate, as I replied to you off list.
What benefit does opening the vote on the RFC give?
You are giving busy work for RFC voters when there is another related RFC and there is still a whole year to work on it.
I'd expect this behaviour if this were June, not September....

I can understand wanting to get stuff moving quickly, but this is not how things work, adding support for something in a jank way is not better than not adding it at all.
I still do not believe that shoehorning more stuff into the current SPL mechnaism is good, I do not think using a bitflag for setting the autoloaders is a good nor sensible API (because if you class and function autoloader behave the same I have questions).
Reading the RFC I have no idea how you are tackling the global namespace fallback, nor how you are going to prevent the lookup happening multiple times.

As such in the current state I will vote against it, and be annoyed that you are making me do busy work.

Best regards,

Gina P. Banyard

Hi Gina & Jordi,

I believe there has been some confusion, so I'm postponing the vote indefinitely.

From the beginning, I have stated that I don't see these two RFCs as competing or mutually exclusive. The core autoloading RFC (which I'd still love to collaborate on) seems to be about

The proposal consists of adding a better designed class autoloading mechanism and a new function autoloading mechanism, and aliasing the existing autoload functions to the new versions.

(emphasis mine) From my perspective, function autoloading is a secondary aspect of the core autoloading RFC, even if it is the main motivation. The new core autoloading API includes function autoloading, but separating it wouldn't change the proposed API much, if at all, and would remove the need to introduce function autoloading as a "new" feature.

Here's a summary of potential outcomes with what we have today:

If this RFC isn't put towards a vote, there's only a 50% chance of having function autoloading in some form. However, if it is, there is a 75% chance. So, to maximize our chances of achieving function autoloading (my primary goal atm), it makes sense to follow the RFC process and put it to a vote. This RFC tries really hard to "be as PHP as possible" and fit in with the legacy API it is surrounded by. It isn't trying to be modern, by any stretch of the imagination; rather, it is trying to look as though it belongs there from the beginning.

If this RFC succeeds, it will simplify the design of core autoloading, allowing us to focus expressly on the API, which I believe would improve its chances of success. This aligns with what I shared with Gina off-list 18 days ago—there's no rush, but there's also no reason to delay (except for this discussion, which can hopefully be productive).

If this RFC fails, nothing has to be done for core autoloading, it can remain as-is and try to reintroduce function autoloading. In either case, I believe there may be valuable feedback from voters that would be missing in this discussion—there almost always is, from seeing other RFCs pass/fail.

Maybe I am looking at this all wrong. I tend to look at problems differently than other people and 'just solve them' in as pragmatic a manner that is possible. Sometimes, this means I end up doing the wrong thing, but I hope that isn't the case here, though it seems like it might be. I'm not trying to be challenging or waste people's time; just following a process and trying to be logical.

Perhaps it is more logical to wait and see how core autoloading plays out?

— Rob

10 months ago by Rowan Tommins [IMSoP] — view source

unread

From the beginning, I have stated that I don't see these two RFCs as competing or mutually exclusive.
...
If this RFC succeeds, it will simplify the design of core autoloading, allowing us to focus expressly on the API, which I believe would improve its chances of success.

If your RFC didn't change the autoloading API at all, and concentrated on details of behaviour that weren't covered by Dan and Gina's existing draft, I could see how this might make sense.

But from what I can see, your RFC is almost entirely about the API, and doesn't even explain the behaviour it is proposing very clearly (I'm still not clear what "the function autoloader will be called only once for the current namespace" means, exactly).

Even if there weren't a separate RFC, I would urge you not to put the current text to a vote, because the issues which have been forefront in previous discussions are barely addressed at all.

there's no rush, but there's also no reason to delay

The reason to delay is that we're still discussing alternatives; the aim of the vote is to confirm that a conclusion and a consensus has been reached.

Regards,
Rowan Tommins
[IMSoP]