Idea: Function autoloading using dummy namespaces

8 years ago by Rowan Collins

Hi All,

In the discussion of escaping mechanisms, it's once again come up that
functions lack autoloading, and thus are hard to work with in larger
code bases.

Previous solutions to this, most notably Anthony Ferrara's very thorough
RFC [1], have looked at adding new modes for the existing autoload
functions, or new functions alongside, to specify the type being autoloaded.

A common response is that you can just use a class with static methods,
and thus leverage the existing autoload mechanism. But the introduction
of "static class" or "abstract final class" was rejected [2] in part
with the opposite justification: that you shouldn't need a class to hold
static methods now that we have namespaced functions.

How about an alternative approach where a function inside a namespace
can be autoloaded using the existing callback, by using a reserved
namespace segment? So to autoload function "foo\bar()", the engine would
construct a string like "__function\foo\bar" or "foo\__function\bar",
and pass that to the registered autoloader stack.

This shouldn't result in errors or misbehaviour from existing
autoloaders, it just won't find anything to load. An autoloader that
knows how can then use the namespace path to determine what to load,
probably something like "src/foo/functions.php".
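
As a rough illustration of the paragraph above, here is a minimal sketch of an autoloader that understands the reserved segment. It assumes the "__function\" prefix form and the "src/foo/functions.php" layout mentioned above; the helper name functionsFileFor() and the $baseDir parameter are made up for this example, and the engine does not currently pass such names to autoloaders.

```php
<?php
// Hypothetical helper: map a reserved-segment name to a functions file.
// Assumes names of the form "__function\<namespace>\<function>".
function functionsFileFor(string $name, string $baseDir = 'src'): ?string
{
    $prefix = '__function\\';
    if (strncmp($name, $prefix, strlen($prefix)) !== 0) {
        return null; // an ordinary class name: leave it to other autoloaders
    }

    $qualified = substr($name, strlen($prefix)); // e.g. "foo\bar"
    $lastSep = strrpos($qualified, '\\');
    if ($lastSep === false) {
        return null; // global function: there is no namespace to map
    }

    $namespace = substr($qualified, 0, $lastSep); // e.g. "foo"
    return $baseDir . '/' . str_replace('\\', '/', $namespace) . '/functions.php';
}

// Registered like any other autoloader; class names fall through untouched.
spl_autoload_register(function (string $name): void {
    $file = functionsFileFor($name);
    if ($file !== null && is_file($file)) {
        require_once $file;
    }
});
```

An autoloader that does not know about the prefix would simply look for a class file that does not exist and give up, which is the "won't find anything to load" case.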

The focus on namespaced functions reflects the fact that
one-file-per-function is rare and somewhat unwieldy, so a call to load
"__function\foo" is unlikely to be that useful in practice.

Note that, like many previous proposals, this could apply to namespaced
constants, too, using a token such as __constant.

Any thoughts? Good idea, horrible idea?

[1] https://wiki.php.net/rfc/function_autoloading
[2] https://wiki.php.net/rfc/abstract_final_class

Regards,

--
Rowan Collins
[IMSoP]

8 years ago by Nikita Popov

On Sun, Jul 17, 2016 at 8:21 PM, Rowan Collins rowan.collins@gmail.com
wrote:

Hi All,

In the discussion of escaping mechanisms, it's once again come up that
functions lack autoloading, and thus are hard to work with in larger code
bases.

Previous solutions to this, most notably Anthony Ferrara's very thorough
RFC [1], have looked at adding new modes for the existing autoload
functions, or new functions alongside, to specify the type being autoloaded.

A common response is that you can just use a class with static methods,
and thus leverage the existing autoload mechanism. But the introduction of
"static class" or "abstract final class" was rejected [2] in part with the
opposite justification: that you shouldn't need a class to hold static
methods now that we have namespaced functions.

How about an alternative approach where a function inside a namespace can
be autoloaded using the existing callback, by using a reserved namespace
segment? So to autoload function "foo\bar()", the engine would construct a
string like "__function\foo\bar" or "foo\__function\bar", and pass that to
the registered autoloader stack.

This shouldn't result in errors or misbehaviour from existing autoloaders,
it just won't find anything to load. An autoloader that knows how can then
use the namespace path to determine what to load, probably something like
"src/foo/functions.php".

The focus on namespaced functions reflects the fact that
one-file-per-function is rare and somewhat unwieldy, so a call to load
"__function\foo" is unlikely to be that useful in practice.

Note that, like many previous proposals, this could apply to namespaced
constants, too, using a token such as __constant.

Any thoughts? Good idea, horrible idea?

I don't like this and strongly prefer handling function autoloading using
either a separate handler or mode flags, and not magic strings. I don't think
this "solution" solves a problem that actually exists.

The main problem with function autoloading is not integration, but loading
order. Unqualified function calls in PHP have a global namespace fallback,
i.e. a call like "foo()" in namespace "Bar" will first check if "Bar\foo()"
exists and if it doesn't, try global "foo()" instead. How does function
autoloading integrate in this?

Option A: foo() in namespace Bar will
a) Check if Bar\foo() exists
b) Otherwise try to load 'Bar\foo'
c) If that fails, check if foo() exists
d) Otherwise try to load 'foo'
e) If that fails, throw.

This has the distinct disadvantage that any unqualified use of a function
in a namespace will result in an autoloader call. As practically nobody
qualifies all their calls, this would add a huge overhead to each internal
function call (those are nowadays the only ones in the global namespace). I
don't think this is realistic. As such:

Option B: foo() in namespace Bar will
a) Check if Bar\foo() exists
b) Otherwise check if foo() exists
c) Otherwise try to load 'Bar\foo'
d) Otherwise try to load 'foo'
e) If all fails, throw.

This avoids the autoloading overhead when calling functions from the global
namespace. However this also means that you cannot autoload a function with
the same name as a global function through an unqualified call.
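
To make the difference between the two orderings concrete, here is a toy model of the lookup steps listed above. It is not engine code: the Resolver class, its $defined and $loadable tables, and the autoloadCalls counter are all invented for this sketch.

```php
<?php
// Toy model of unqualified function resolution under Option A and Option B.
final class Resolver
{
    public int $autoloadCalls = 0;
    private array $defined;  // names of functions that already exist
    private array $loadable; // names an imaginary autoloader could define

    public function __construct(array $defined, array $loadable)
    {
        $this->defined = $defined;
        $this->loadable = $loadable;
    }

    private function exists(string $name): bool
    {
        return isset($this->defined[$name]);
    }

    private function tryLoad(string $name): bool
    {
        $this->autoloadCalls++; // every attempt costs an autoloader call
        if (isset($this->loadable[$name])) {
            $this->defined[$name] = true;
            return true;
        }
        return false;
    }

    // Option A: namespaced check, namespaced load, global check, global load.
    public function optionA(string $ns, string $fn): string
    {
        $qualified = $ns . '\\' . $fn;
        if ($this->exists($qualified) || $this->tryLoad($qualified)) {
            return $qualified;
        }
        if ($this->exists($fn) || $this->tryLoad($fn)) {
            return $fn;
        }
        throw new \Error("Call to undefined function {$fn}()");
    }

    // Option B: both existence checks first, then both load attempts.
    public function optionB(string $ns, string $fn): string
    {
        $qualified = $ns . '\\' . $fn;
        if ($this->exists($qualified)) {
            return $qualified;
        }
        if ($this->exists($fn)) {
            return $fn;
        }
        if ($this->tryLoad($qualified) || $this->tryLoad($fn)) {
            return $this->exists($qualified) ? $qualified : $fn;
        }
        throw new \Error("Call to undefined function {$fn}()");
    }
}
```

In this model, calling strlen() from namespace Bar costs one autoloader call per invocation under Option A and none under Option B. Conversely, if a global foo() exists and Bar\foo is only loadable, Option A resolves to Bar\foo while Option B silently resolves to the global foo.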

In practice, this is probably less problematic than it might sound, because
realistically function autoloading would likely operate on the
namespace-level rather than the function level, i.e. if one function from a
namespace is loaded all of them are, because they are all defined in the
same file. In such a setting you would never run into this problem as there
would be no unqualified calls to functions that have not been loaded yet.
However, it would be an issue if you tried to, say, map each function to
a separate file.

Regards,
Nikita

8 years ago by Levi Morrison

I don't think this is realistic.

I'll come back to this one in a moment.

As such:

Option B: foo() in namespace Bar will
a) Check if Bar\foo() exists
b) Otherwise check if foo() exists
c) Otherwise try to load 'Bar\foo'
d) Otherwise try to load 'foo'
e) If all fails, throw.

This avoids the autoloading overhead when calling functions from the global
namespace. However this also means that you cannot autoload a function with
the same name as a global function through an unqualified call.

In practice, this is probably less problematic than it might sound, because
realistically function autoloading would likely operate on the
namespace-level rather than the function level, i.e. if one function from a
namespace is loaded all of them are, because they are all defined in the
same file. In such a setting you would never run into this problem as there
would be no unqualified calls to functions that have not been loaded yet.
However, it would be an issue if you tried to, say, map each function to
a separate file.

If we are going to use the "one file for all functions" as
justification for defining this behavior a certain way then please
acknowledge it helps both options. One extra autoload from option A is
not a big deal if we are okay with this "one file for all functions"
justification. If we are not okay with that justification then I don't
think option B is viable either.

I think a better option is to try to solve this in 8 by unifying how
functions and classes (and interfaces and traits) work in namespaces.
Ideally the lookup behavior should be the same and all that differs
are the symbol tables being looked in.

8 years ago by Nikita Popov

I don't think this is realistic.

I'll come back to this one in a moment.

As such:

Option B: foo() in namespace Bar will
a) Check if Bar\foo() exists
b) Otherwise check if foo() exists
c) Otherwise try to load 'Bar\foo'
d) Otherwise try to load 'foo'
e) If all fails, throw.

This avoids the autoloading overhead when calling functions from the global
namespace. However this also means that you cannot autoload a function with
the same name as a global function through an unqualified call.

In practice, this is probably less problematic than it might sound, because
realistically function autoloading would likely operate on the
namespace-level rather than the function level, i.e. if one function from a
namespace is loaded all of them are, because they are all defined in the
same file. In such a setting you would never run into this problem as there
would be no unqualified calls to functions that have not been loaded yet.
However, it would be an issue if you tried to, say, map each function to
a separate file.

If we are going to use the "one file for all functions" as
justification for defining this behavior a certain way then please
acknowledge it helps both options. One extra autoload from option A is
not a big deal if we are okay with this "one file for all functions"
justification. If we are not okay with that justification then I don't
think option B is viable either.

I don't follow what you mean here. The problem is not the case where the
unqualified call actually refers to a namespaced function. The problem is
the case where it refers to a global function (which is also far more
likely). In this case option A requires that we invoke the autoloader for
each call. A relaxed variant would only invoke the autoloader once for each
call, assuming that a function that is not loadable at one time will never
become loadable (which is of course a significant restriction). The former
variant would be prohibitively slow. The latter is more realistic, but I
expect it to have non-trivial performance impact nonetheless.
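
A minimal sketch of the relaxed variant, assuming the cache is keyed by function name (it could equally be kept per call site). The NegativeCachingLoader class and its counter are invented for illustration and do not correspond to anything in the engine.

```php
<?php
// Toy model: remember names the autoloader has already failed to load,
// so that repeated calls do not invoke it again.
final class NegativeCachingLoader
{
    public int $autoloaderInvocations = 0;
    private array $failed = []; // name => true for every known miss
    private $autoloader;        // callable(string): bool

    public function __construct(callable $autoloader)
    {
        $this->autoloader = $autoloader;
    }

    public function load(string $name): bool
    {
        if (isset($this->failed[$name])) {
            return false; // known miss: skip the autoloader entirely
        }

        $this->autoloaderInvocations++;
        if ((bool) ($this->autoloader)($name)) {
            return true;
        }

        // Restriction: this name is never retried, even if it becomes
        // loadable later in the request.
        $this->failed[$name] = true;
        return false;
    }
}
```

With an autoloader that always fails, a thousand attempts to load 'Bar\strlen' invoke it once; the remaining cost is the cache lookup on every call.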

As all of this only applies to the case where the unqualified call refers
to a global function, it does not actually matter how you namespace things
(single or multiple files). As such, I'm not sure I understand your
argument.

I think a better option is to try to solve this in 8 by unifying how
functions and classes (and interfaces and traits) work in namespaces.
Ideally the lookup behavior should be the same and all that differs
are the symbol tables being looked in.

Are you suggesting that we require global function references to be written
as fully qualified names? This would be a huge BC break for namespaced code.

Regards,
Nikita

8 years ago by Levi Morrison

I think a better option is to try to solve this in 8 by unifying how
functions and classes (and interfaces and traits) work in namespaces.
Ideally the lookup behavior should be the same and all that differs
are the symbol tables being looked in.

Are you suggesting that we require global function references to be written
as fully qualified names? This would be a huge BC break for namespaced code.

Not necessarily, but yes that is an option. Class-like types could
also fall back to global namespace. Even though this change probably
would not break any code I am aware of (since it fails at runtime and
seems unlikely to be relied on) I'd still rather make that in a major
version.

8 years ago by Jesse Schalken

A relaxed variant would only invoke the autoloader once for each
call, assuming that a function that is not loadable at one time will never
become loadable (which is of course a significant restriction). The former
variant would be prohibitively slow. The latter is more realistic, but I
expect it to have non-trivial performance impact nonetheless.

Is that really a significant restriction? The only difference it would make
is for code that expects "namespace Bar { foo(); }" to refer to global
function "foo()" at one point in the program, and later to refer to
"Bar\foo()" at another point in the program, after "Bar\foo" has become
loadable. That kind of temporal coupling with the symbols that happen to be
loaded/loadable at a given point in time sounds just awful, and I can't
imagine what the use case would be.

Considering Option B can cause the introduction of global functions to
break code (my previous email), the caching of negative results from the
autoloader sounds like the only feasible option.

8 years ago by Jesse Schalken

Option B: foo() in namespace Bar will
a) Check if Bar\foo() exists
b) Otherwise check if foo() exists
c) Otherwise try to load 'Bar\foo'
d) Otherwise try to load 'foo'
e) If all fails, throw.

This avoids the autoloading overhead when calling functions from the global
namespace. However this also means that you cannot autoload a function with
the same name as a global function through an unqualified call.

Doesn't this mean that an innocent introduction of global function "foo()"
could break "namespace Bar { foo(); }", since it would then refer to the
global function "foo" (step b) and "Bar\foo" would not be autoloaded (step
c)?

8 years ago by Stanislav Malyshev

Hi!

How about an alternative approach where a function inside a namespace
can be autoloaded using the existing callback, by using a reserved
namespace segment? So to autoload function "foo\bar()", the engine would
construct a string like "__function\foo\bar" or "foo\__function\bar",
and pass that to the registered autoloader stack.

Magic out-of-domain values usually are bad design, and lead to a lot of
trouble since now the system needs to deal with two sets of assumptions
instead of one. I wouldn't recommend doing it.

Stas Malyshev
smalyshev@gmail.com

8 years ago by Rowan Collins

Hi!

How about an alternative approach where a function inside a namespace
can be autoloaded using the existing callback, by using a reserved
namespace segment? So to autoload function "foo\bar()", the engine would
construct a string like "__function\foo\bar" or "foo\__function\bar",
and pass that to the registered autoloader stack.

Magic out-of-domain values usually are bad design, and lead to a lot of
trouble since now the system needs to deal with two sets of assumptions
instead of one. I wouldn't recommend doing it.

Hi Stas,

My original idea was actually to have the autoloader look up
"foo\bar\__functions" for any function in namespace "foo\bar", with
the idea that some existing autoloaders would actually be able to work
without any modification (just call your file "__functions.php").
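
To show why an unmodified autoloader could cope with that, here is a small sketch. It assumes the requested name has "__functions" as its final namespace segment and that the autoloader maps names to paths in the usual PSR-4 style; classPath() and functionsPseudoClass() are names invented for this example.

```php
<?php
// A plain class-to-path mapper in the PSR-4 style. It knows nothing
// about functions and is not changed for this idea.
function classPath(string $class, string $baseDir = 'src'): string
{
    return $baseDir . '/' . str_replace('\\', '/', ltrim($class, '\\')) . '.php';
}

// Hypothetical: the name the engine would request for any function in
// the same namespace as $qualifiedFunction.
function functionsPseudoClass(string $qualifiedFunction): ?string
{
    $pos = strrpos($qualifiedFunction, '\\');
    if ($pos === false) {
        return null; // global function: nothing to request
    }
    return substr($qualifiedFunction, 0, $pos) . '\\__functions';
}
```

Under these assumptions, a call to foo\bar\baz() would lead to a request for "foo\bar\__functions", which the mapper turns into "src/foo/bar/__functions.php" exactly as it would for a class of that name.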

If you include the function name (to give users more flexibility), that
doesn't really work (unless you have one function per file), so you're
right that it requires some muddling assumptions.

I agree that it's all a bit magic either way, but I thought it might get
the feature moving quicker than a complete rewrite of the existing
autoload system.

Like I say, it was just an idea. :)

Regards,

Rowan Collins
[IMSoP]

8 years ago by michal.brzuchalski@gmail.com

On 18 Jul 2016 15:58, "Rowan Collins" rowan.collins@gmail.com wrote:

Hi!

How about an alternative approach where a function inside a namespace
can be autoloaded using the existing callback, by using a reserved
namespace segment? So to autoload function "foo\bar()", the engine would
construct a string like "__function\foo\bar" or "foo\__function\bar",
and pass that to the registered autoloader stack.

I was thinking of passing some context information to the autoloader about
what is being loaded, e.g. class | function. Wouldn't that be sufficient
information for the autoloader?
IMHO it would be the easiest way to let the autoloader find the right
implementation.
Problems such as the naming convention for function files could then be
resolved in userland, without a complex implementation in core.

Magic out-of-domain values usually are bad design, and lead to a lot of
trouble since now the system needs to deal with two sets of assumptions
instead of one. I wouldn't recommend doing it.

Hi Stas,

My original idea was actually to have the autoloader look up
"foo\bar\__functions" for any function in namespace "foo\bar", with the
idea that some existing autoloaders would actually be able to work without
any modification (just call your file "__functions.php").

If you include the function name (to give users more flexibility), that
doesn't really work (unless you have one function per file), so you're
right that it requires some muddling assumptions.

I agree that it's all a bit magic either way, but I thought it might get
the feature moving quicker than a complete rewrite of the existing autoload
system.

Like I say, it was just an idea. :)

Regards,

Rowan Collins
[IMSoP]

8 years ago by Rowan Collins

I was thinking of passing some context information to the autoloader about
what is being loaded, e.g. class | function. Wouldn't that be sufficient
information for the autoloader?
IMHO it would be the easiest way to let the autoloader find the right
implementation.

Yep, that's how previous suggestions have worked, e.g.
https://wiki.php.net/rfc/function_autoloading

There are a few complexities around making the callbacks compatible
(some already take a second argument for a different purpose), but the
above RFC takes care of those.

Regards,

Rowan Collins
[IMSoP]
