Expansion of PHP Symbols?

1 year ago by Deleu

Hi Internals, sorry for the potentially long email, but I can't sleep and I
want to dump this out of my brain.

Just a quick intro, let's recap PHP's simple nature with this index.php file:

<?php

interface Foo {}

final class Bar implements Foo {}

function blah() {}

enum Options {}

This is legitimate and valid PHP code today. Now if I were to have a
sample.php:

<?php
class T implements Foo {} // Fatal error: Uncaught Error: Interface "Foo" not found

new Bar(); // Fatal error: Uncaught Error: Class "Bar" not found

blah(); // Fatal error: Uncaught Error: Call to undefined function blah()

Options::Option1; // Fatal error: Uncaught Error: Class "Options" not found

Nothing new or fancy here. Simple and pure PHP. If you want to fix all
these errors you can:
1- Use include/require
2- Custom Autoloading (not for functions but bear with me)
3- Use Composer/PSR-4

Nowadays we barely use options 1 and 2 because Composer/PSR takes care of
all that for us, but autoloading seems to be a matter of "I haven't found
this symbol X, would you like to include/require a file you think has this
symbol before I give up?"
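
A minimal sketch of that "last chance" callback, assuming a hand-written map from symbol name to file. The names DemoShape and DemoCircle, and the temporary file, are made up for illustration so the snippet can run on its own; this is not how Composer implements it.

```php
<?php
// Write a throwaway file that defines two symbols (hypothetical names).
$dir = sys_get_temp_dir() . '/symbols_demo_' . uniqid();
mkdir($dir);
file_put_contents(
    $dir . '/shapes.php',
    "<?php\ninterface DemoShape {}\nfinal class DemoCircle implements DemoShape {}\n"
);

// Symbol name => file that is believed to define it.
$classMap = [
    'DemoShape'  => $dir . '/shapes.php',
    'DemoCircle' => $dir . '/shapes.php',
];

// Record which names PHP asks for, then include the mapped file if there is one.
$autoloadCalls = [];
spl_autoload_register(function (string $name) use ($classMap, &$autoloadCalls): void {
    $autoloadCalls[] = $name;
    if (isset($classMap[$name])) {
        require_once $classMap[$name];
    }
});

// DemoCircle is unknown at this point, so PHP calls the autoloader before giving up.
$circle = new DemoCircle();
```

Both symbols live in one file here, which PHP itself is perfectly happy with.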

Now let's talk about Function Autoloading (
https://wiki.php.net/rfc/core-autoloading) and Callable Interfaces (
https://externals.io/message/120083) and particularly this comment
https://externals.io/message/120083#120088 which I'll take a snippet here:

Mainly the type alias question. Every time I see callable types discussed,
it immediately sidetracks into "how do we make that less fugly to write,
because callable types are naturally very verbose?" That leads directly to
typedefs/type aliases, which take one of two forms:

type TwoInts = callable(int $x, int $y): int

type LinkedResponse = ResponseInterface&LinkCollectionInterface

which raises autoloading questions and means a dependency on a type
defined in another package, in many cases

Here I want to raise the discussion about the little I know about PHP
symbols. If a symbol interface Foo is discovered and registered by PHP,
it will be usable as an interface throughout its entire execution. If the
symbol is not discovered and is used, it will trigger autoloading as a
last-chance before fatal error.

Can type Number = int|number; be a symbol?
Can type TwoInts = callable(int $x, int $y): int; be a symbol?
and lastly, can function bar() {} be a symbol?

Here is how I see that unfolding:

file-1.php

<?php
class Foo {}

class Bar {}

type Blah = Foo|Bar;
?>

file-2.php

<?php

function foo(Blah $blah) {}

foo(new Foo()); // Class Foo not found
?>

file-3.php

<?php
require 'file-1.php';

function foo(Blah $blah) {}

foo(new Foo()); // Works!
?>

How do we solve this?

Include/Require
Custom Autoloading
Composer/PSR-4

Sounds familiar?

PHP already has a namespace to solve for grouping of types such that
\Foo\Bar and \Bar\Bar can already co-exist because they are two
different symbols. We already rely on classes/interfaces/enums defined on
third-party packages, why not make type aliases work the same?
And if functions could be symbols, it would work out the same as well.

One limitation I see is that symbols cannot have conflicts.
Enums/Interfaces/Classes cannot be named exactly the same under the exact
same namespace. Type aliases would follow the same limitation. Functions would
be able to escape one part of that limitation if the engine could prefix
every function name with f_ or fn_ when registering its symbol, making
it backward compatible and allowing class Foo and function Foo to
co-exist, but not two functions called Foo (which is already an error
anyway).

Now one questionable user experience I see is defining:

1 Class = 1 File
1 Interface = 1 File
1 Enum = 1 File (this is already annoying)
1 function = 1 file
1 symbol = 1 file

But this user experience does not come from PHP's nature, but instead it
comes from Composer/PSR-4 because PSR-4 maps namespaces to directories and
symbols to files. We need a static analyser to scan our entire project,
discover every symbol and create a mapping such as:

Symbol Class_Foo -> x.php
Symbol Interface_Bar -> x.php
Symbol Enum_Options -> y.php
Symbol Enum_ExtraOptions -> y.php
Symbol fn_foo -> z.php

so that our registered autoload can include/require the necessary file when
the symbol triggers an autoload. This static analyser already exists and is
called composer dump-autoload. It just has not implemented a PSR-X which
allows for this pattern?
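
As a rough illustration of such a scanner, here is a sketch that tokenizes source strings and records every top-level symbol it sees. It assumes PHP 8.1+ (for PhpToken and T_ENUM) with the tokenizer extension enabled, ignores namespaces, and uses made-up file contents; the fn_ / Class_ prefixes simply mirror the mapping above. It is not Composer's actual implementation.

```php
<?php
// Return labels such as "Class_Foo" or "fn_foo" for declarations at brace depth 0.
function scanTopLevelSymbols(string $code): array
{
    $kinds = [
        T_CLASS     => 'Class',
        T_INTERFACE => 'Interface',
        T_TRAIT     => 'Trait',
        T_ENUM      => 'Enum',
        T_FUNCTION  => 'fn',
    ];
    $tokens  = PhpToken::tokenize($code);
    $symbols = [];
    $depth   = 0;

    foreach ($tokens as $i => $token) {
        if ($token->id === ord('{') || $token->is([T_CURLY_OPEN, T_DOLLAR_OPEN_CURLY_BRACES])) {
            $depth++;
        } elseif ($token->id === ord('}')) {
            $depth--;
        } elseif ($depth === 0 && isset($kinds[$token->id])) {
            // Skip whitespace and expect the declared name right after the keyword.
            $j = $i + 1;
            while (isset($tokens[$j]) && $tokens[$j]->is(T_WHITESPACE)) {
                $j++;
            }
            if (isset($tokens[$j]) && $tokens[$j]->is(T_STRING)) {
                $symbols[] = $kinds[$token->id] . '_' . $tokens[$j]->text;
            }
        }
    }

    return $symbols;
}

// Hypothetical file contents, keyed by the file they would live in.
$sources = [
    'x.php' => "<?php\nclass Foo { public function run() {} }\ninterface Bar {}\n",
    'y.php' => "<?php\nenum Options { case A; }\nenum ExtraOptions { case B; }\n",
    'z.php' => "<?php\nfunction foo() {}\n",
];

$symbolMap = [];
foreach ($sources as $file => $code) {
    foreach (scanTopLevelSymbols($code) as $symbol) {
        $symbolMap[$symbol] = $file;
    }
}
```

Methods are skipped because they sit inside a class body, so only the symbols an autoloader could be asked for end up in the map.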

In conclusion

PHP scripts are very simple things, and the language already has a
limitation around symbol discovery. We have already built tools to work
within that limitation: autoloading, PSR and Composer. We could extend the PHP
symbol system to include functions and type aliases. If this can be done in
a backward-compatible manner, we can integrate those into PSR-4
from day one. Lastly, we lift the social-construct limitation of 1 file = 1
symbol with PSR and Composer, since PHP never really had this limitation
built in. We come out of it with 3 major wins (from my pov):

Function autoloading
Type Aliasing
Never creating 3 files for 3 Enums again

If you managed to read up to here, I apologize for late confessing I know
nearly nothing about PHP internals codebase. Is this obviously wrong and am
I just hallucinating a new awesome PHP version here?

--
Marco Deleu

1 year ago by Robert Landers

Hey Deleu,

I ended up on this list in the first place because I wanted to figure
out how to implement type aliases. I can confidently say why no one
wants to touch it with a ten-foot pole: implementing this will require
refactoring the entire type system. There currently isn't a 'type
table'; supporting type aliases will probably require one. I was able
to hack something together that used the import statements as a way to
work without that many changes (I never got autoloading to work, but I
suspect that was my unfamiliarity with the code, not that it was
impossible):

use int|float as Number;

at the top and it would work with fairly minimal changes to the
engine. Anything more complex or different required a huge
refactoring. However, from previous discussions, when that syntax was
brought up, it was shot down. It appears that people want code-wide
aliases, which goes back to refactoring (to the point of
re-implementing an entire type system) and people can't agree that
it's a good idea in the first place (i.e., what if you think int|float
is Number, but some other library has only float aliased to Number?)
and then that can be solved via imports ... but how does that work?

Basically, since it requires reimplementing an entire type system, I
think the maintainers (this is my impression, I'm not speaking for
them) want to be 100% sure what the syntax will be, how it will work,
and more importantly, agree as much as possible on that before ripping
out the existing system and replacing it with something else.

Anyway, that's my opinion on the matter, I'm sure others will have
different opinions. :)

Rob Landers
Utrecht, Netherlands

1 year ago by Deleu

On Fri, Apr 21, 2023 at 4:10 AM Robert Landers landers.robert@gmail.com
wrote:

Hey Deleu,

I ended up on this list in the first place because I wanted to figure
out how to implement type aliases. I can confidently say why no one
wants to touch it with a ten-foot pole: implementing this will require
refactoring the entire type system. There currently isn't a 'type
table'; supporting type aliases will probably require one.

Hey Rob, thanks for this info! I just have some small follow-up for me to
make some sense of it.

What is fundamentally different about Type Alias as opposed to something
like Enums that was recently introduced?

enum Options{}

class Options{} // Fatal error: Cannot declare class Options, because the
name is already in use

What I would expect here is

type Number = int|float;

class Number {} // Fatal error: Cannot declare class Number, because the
name is already in use

When the engine needs to know about a type alias and it's not in the
symbols table, trigger autoload. When the type alias is registered, check
if its value matches what was sent (Type checking). I can kind of imagine
the implementation of this could be somewhat awkward because it would
involve instanceof, is_int(), is_float(), is_string(), etc -
meaning we don't have a unified verification system/syntax that encapsulates
types and scalar types all at once, but it's not like we have infinite
scalar types so it's still doable?
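
To make that concrete, here is a userland sketch of such a check, assuming an alias is just a flat list of member type names. The function name matchesUnion is hypothetical, it needs PHP 8.0+ (mixed, match), and it says nothing about how the engine would do this internally.

```php
<?php
// Check a value against a flat union of type names, mixing scalar checks and instanceof.
function matchesUnion(mixed $value, array $members): bool
{
    foreach ($members as $member) {
        $matches = match ($member) {
            'int'    => is_int($value),
            'float'  => is_float($value),
            'string' => is_string($value),
            'bool'   => is_bool($value),
            // Anything else is treated as a class-like name.
            default  => $value instanceof $member,
        };
        if ($matches) {
            return true;
        }
    }

    return false;
}

// "type Number = int|float" expressed as data.
$number = ['int', 'float'];
$isNumber = matchesUnion(1.5, $number);
$isNotNumber = matchesUnion('1.5', $number);
```

Note that instanceof with an unknown class name simply yields false without triggering autoload, which is exactly the short-circuit discussed later in the thread.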

I was able
to hack something together that used the import statements as a way to
work without that many changes (I never got autoloading to work, but I
suspect that was my unfamiliarity with the code, not that it was
impossible):

use int|float as Number;

at the top and it would work with fairly minimal changes to the
engine. Anything more complex or different required a huge
refactoring. However, from previous discussions, when that syntax was
brought up, it was shot down. It appears that people want code-wide
aliases, which goes back to refactoring (to the point of
re-implementing an entire type system) and people can't agree that
it's a good idea in the first place (i.e., what if you think int|float
is Number, but some other library has only float aliased to Number?)
and then that can be solved via imports ... but how does that work?

What if you think \Marco\Deleu\Stringable is something that works like
PHP strings, but it turns out it's actually a math interface that only
works with numbers? I know I'm exaggerating, but it's only to make it more
obvious that namespaces have solved that problem a long time ago. A library
can use \Lib1\Number = int|float and make use of it internally and I can
also import that. If I want to import \Lib1\Number and Lib2\Number
simultaneously, we have

use \Lib1\Number as Lib1_Number;
use \Lib2\Number as Lib2_Number;

They can be different things and mean different things and be used
differently. PHP has "solved" this problem ages ago. If users want a
standard meaning for Numbers, PSR can do that. If there is consensus on
what Number should be, perhaps we will have a psr/type-aliases package
that defines common type aliases and everyone that wants it can use it for
interoperability.

Sorry if it sounds stupid or obviously broken, I really am just looking to
understand where the flaw in my rationale is.

--
Marco Deleu

1 year ago by tim@bastelstu.be

What is fundamentally different about Type Alias as opposed to something
like Enums that was recently introduced?

Type aliases might include an unbounded number of types and are …
aliases, which based on [1] is a major issue.

To my understanding, PHP currently assumes that no object of a
non-existent type can exist, thus is able to short-circuit checks like:

 $foo instanceof Bar; // If Bar does not exist,
                      // $foo cannot be Bar by definition.

However this might be violated if such type aliases exist, as Bar could
be an alias to a type that actually exists. Bar could probably even be
an alias to another alias:

 type Bar = Baz|Quux;
 type Baz = Foo;

So to check if $foo instanceof Bar in this case, the engine would need
to load Bar, Baz, Quux and Foo in order to determine if $foo is any of
these.
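
A small sketch of that recursive expansion, assuming aliases are stored as a plain array from alias name to member names. expandAlias is a hypothetical helper for illustration, not an engine API.

```php
<?php
// Expand an alias into the concrete type names it ultimately refers to.
function expandAlias(string $name, array $aliases, array $seen = []): array
{
    if (!isset($aliases[$name])) {
        return [$name]; // not an alias: a concrete class-like name
    }
    if (in_array($name, $seen, true)) {
        throw new LogicException("Circular type alias: $name");
    }
    $seen[] = $name;

    $result = [];
    foreach ($aliases[$name] as $member) {
        foreach (expandAlias($member, $aliases, $seen) as $type) {
            $result[$type] = true; // keyed by name to drop duplicates
        }
    }

    return array_keys($result);
}

// type Bar = Baz|Quux; type Baz = Foo;
$aliases = [
    'Bar' => ['Baz', 'Quux'],
    'Baz' => ['Foo'],
];

$concrete = expandAlias('Bar', $aliases);
```

Every name visited on the way would have to be loaded before the instanceof check could give a definite answer.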

If folks also want intersection types in there, which they likely want,
this gets complicated quickly, because nested type definitions would not
necessarily be in DNF and thus expensive to check. Consider:

 type Foo = Bar&Baz;
 type Bar = Quux|Asdf;
 type Baz = Apple|Banana;

So you'd likely have to eagerly load all pieces of the definition and
normalize it, whereas to my understanding currently many type checks can
happen more lazily. The normalization in itself can be expensive as
shown with the combinatorial explosion above.
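
The growth is easy to see in a sketch that distributes an intersection of unions into DNF. It assumes each union is given as a list of names; intersectUnions is a hypothetical helper.

```php
<?php
// Distribute (A|B)&(C|D)&... into a list of intersection terms.
function intersectUnions(array ...$unions): array
{
    $terms = [[]];
    foreach ($unions as $union) {
        $next = [];
        foreach ($terms as $partial) {
            foreach ($union as $type) {
                $next[] = [...$partial, $type];
            }
        }
        $terms = $next;
    }

    return array_map(fn (array $parts): string => implode('&', $parts), $terms);
}

// type Foo = Bar&Baz; with Bar = Quux|Asdf and Baz = Apple|Banana
$dnf = intersectUnions(['Quux', 'Asdf'], ['Apple', 'Banana']);
```

Two unions of two members already give four terms; the count is the product of the union sizes, so each extra nested union multiplies the work.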

Disclaimer though: I'm just parroting what smarter people than me have
said before and also drawing my own conclusions from my understanding of
what they said.

Best regards
Tim Düsterhus

[1] https://chat.stackoverflow.com/transcript/message/55310273#55310273

1 year ago by Rowan Tommins

Hi Marco!

First of all, please do stick around, and keep learning and being
curious. Sometimes out of the box thinking does get somewhere.

That said, a lot of what you've written here is actually what already
happens, and the problems are elsewhere.

1- Use include/require
2- Custom Autoloading (not for functions but bear with me)
3- Use Composer/PSR-4

The first thing to note is that these aren't options, they're layers
on top of each other: to load some code from a file, you use
include/require; to load some code on demand, you write an autoload
function, which uses include/require; to let someone else write that
function, you lay your files out according to PSR-4 and use Composer,
which writes the autoload function, which uses include/require.

Here I want to raise the discussion about the little I know about PHP
symbols. If a symbol interface Foo is discovered and registered by PHP,
it will be usable as an interface throughout its entire execution. If the
symbol is not discovered and is used, it will trigger autoloading as a
last-chance before fatal error.

Yes, when PHP needs the definition of a class/interface, etc, it looks
up its name in a table. If it's not there, it can pass the name to an
autoload function, and then try again (after the function has,
hopefully, included a file with an appropriate definition).

In fact, the table is shared between classes, interfaces, traits, and
enums, which is why you can't have overlapping names between them, and
why they all trigger the same autoloader. This makes sense because
they're often used in the same context - e.g. in "$foo instanceof Foo",
"Foo" can be a class, an interface, or an enum.

I'll come back to type aliases in a second, but first:

can function bar() {} be a symbol?

It already is. When you run foo('bar'), PHP has to look up what the
definition of "foo" is in a table of functions. In principle,
autoloading could work for those just the same way it does for classes
... except for one awkward feature which seemed like a good idea 15
years ago. If you write this:

namespace Foo {
echo strlen('hello');
}

Should PHP lookup the function "\Foo\strlen", or the function "\strlen"?
The "clever" answer 15 years ago was to do both: try the namespaced
name first, then if it doesn't exist, try the global name. This does not
happen for classes, only for functions and constants.

This is a real pain for autoloading: how do you avoid repeatedly calling
the autoloader looking for "\Foo\strlen", finding there's still no
definition, and falling back to "\strlen"? That's one of the main points
of discussion whenever function autoloading comes up.
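
The lookup order can be imitated in userland with function_exists(). This is only a model of the rule described above; resolveUnqualifiedCall is a made-up helper, and the engine does not work through such a function.

```php
<?php
// Imitate how an unqualified call is resolved: namespaced name first, global name second.
function resolveUnqualifiedCall(string $currentNamespace, string $name): ?string
{
    $namespaced = $currentNamespace . '\\' . $name;
    if (function_exists($namespaced)) {
        return $namespaced;
    }

    return function_exists($name) ? $name : null;
}

// Assuming nothing has defined \Foo\strlen, the call ends up at the global function.
$target = resolveUnqualifiedCall('Foo', 'strlen');
```

With function autoloading, the first function_exists() step is where an autoloader would be consulted, on every call site that relies on the fallback.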

Can type Number = int|number; be a symbol?
Can type TwoInts = callable(int $x, int $y): int; be a symbol?

Fundamentally, yes. They'd probably need to occupy the same lookup table
as classes - if I write "function foo(SomeType $someArg)", PHP doesn't
know until it's looked it up whether "SomeType" is a type alias or class,
but that's not a big problem.

I think there is some discussion around whether people want it to work
that way - should it be more like a "use" statement, and just take
effect for the current file? Then there's questions around how to
actually implement it, and do so efficiently. But creating a table of
names to types isn't the hard part of the problem.

Now one questionable user experience I see is defining:

1 Class = 1 File
[...]

But this user experience does not come from PHP's nature, but instead it
comes from Composer/PSR-4 because PSR-4 maps namespaces to directories and
symbols to files.

Precisely. PHP doesn't need to do anything to change this, it's entirely
up to users. There are ways that PHP might be able to optimise for
different use cases, but the power of the autoloader being a callback
function is that it can do whatever you want it to. It doesn't even need
to involve files at all, if you don't want it to.
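
For instance, here is a sketch of an autoloader that never touches the filesystem and satisfies a lookup with class_alias() instead. The class names are invented for the example.

```php
<?php
final class DemoCollection {}

// Satisfy a request for an old name by aliasing an already defined class.
spl_autoload_register(static function (string $name): void {
    if ($name === 'DemoLegacyCollection') {
        class_alias(DemoCollection::class, 'DemoLegacyCollection');
    }
});

// DemoLegacyCollection is not defined anywhere, so the callback above runs first.
$object = new DemoLegacyCollection();
```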

Regards,

--
Rowan Tommins
[IMSoP]

1 year ago by Deleu

On Fri, Apr 21, 2023 at 6:02 AM Rowan Tommins rowan.collins@gmail.com
wrote:

Hi Marco!

I'll come back to type aliases in a second, but first:

can function bar() {} be a symbol?

It already is. When you run foo('bar'), PHP has to look up what the
definition of "foo" is in a table of functions. In principle,
autoloading could work for those just the same way it does for classes
... except for one awkward feature which seemed like a good idea 15
years ago. If you write this:

namespace Foo {
echo strlen('hello');
}

Should PHP lookup the function "\Foo\strlen", or the function "\strlen"?
The "clever" answer 15 years ago was to do both: try the namespaced
name first, then if it doesn't exist, try the global name. This does not
happen for classes, only for functions and constants.

This is a real pain for autoloading: how do you avoid repeatedly calling
the autoloader looking for "\Foo\strlen", finding there's still no
definition, and falling back to "\strlen"? That's one of the main points
of discussion whenever function autoloading comes up.

This is super interesting, thank you for the info. I guess maybe I'm a bit
naive on why there would be controversy here. I see it kind of like this:

1- PHP looks up \Foo\strlen. It's not on the Symbols table. Trigger autoload
for it.
2- Autoload finds a definition, registers it on the Symbols table and executes
\Foo\strlen.

I'm guessing this already works today. Now let's see the other case:

1- PHP looks up \Foo\strlen. It's not on the Symbols table. Trigger autoload
for it.
2- Autoload does NOT find a definition, so register in the Symbols table that
\Foo\strlen is undefined.
3- PHP looks up \strlen. It's not on the Symbols table, so trigger autoload for
it.
4- Autoload finds a definition, registers it in the Symbols table and executes
\strlen.

Since \Foo\strlen already had an autoload execution, it already had a
chance to be loaded, if it wasn't we can "cache" that result. However, if
the script is doing something crazy, we can let the definition of
\Foo\strlen be overwritten if it ever gets "naturally discovered as a
symbol". That is, if you e.g. try to use a class \Foo\Bar and autoloading
loads the file ./foo/Bar.php and inside that file there is a definition for
\Foo\strlen, even though \Foo\strlen has already been written as
"undefined", the Symbols table could overwrite the "undefined" value
instead of crashing with "\Foo\strlen already exists".

This wouldn't go against PHP's core nature and it would avoid triggering an
autoload for the same function over and over again even though it's
naturally expected that it will not exist.
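
A toy model of that idea, assuming a table that stores false as the "autoload tried, nothing found" marker. FunctionTable and everything in it is hypothetical and only mimics the proposal in userland (PHP 8.0+); it is not how the engine stores functions.

```php
<?php
final class FunctionTable
{
    /** @var array<string, callable|false> false means "autoload tried, not found" */
    private array $entries = [];
    public int $autoloadCalls = 0;

    public function __construct(private Closure $autoloader) {}

    public function define(string $name, callable $fn): void
    {
        // A later, "natural" definition overwrites a cached undefined marker.
        $this->entries[$name] = $fn;
    }

    private function lookup(string $name): ?callable
    {
        if (!array_key_exists($name, $this->entries)) {
            $this->autoloadCalls++;
            ($this->autoloader)($name, $this);
            $this->entries[$name] ??= false; // remember the miss
        }

        return $this->entries[$name] ?: null;
    }

    public function resolve(string $namespace, string $name): callable
    {
        return $this->lookup($namespace . '\\' . $name)
            ?? $this->lookup($name)
            ?? throw new Error("Call to undefined function $name()");
    }
}

// Pretend only the global strlen can be autoloaded.
$table = new FunctionTable(function (string $name, FunctionTable $table): void {
    if ($name === 'strlen') {
        $table->define('strlen', 'strlen');
    }
});

for ($i = 0; $i < 3; $i++) {
    $length = $table->resolve('Foo', 'strlen')('hello');
}
$callsAfterLoop = $table->autoloadCalls;

// A definition discovered later replaces the cached "undefined" entry.
$table->define('Foo\\strlen', fn (string $s): int => 42);
$overridden = $table->resolve('Foo', 'strlen')('hello');
```

Three calls cost two autoload attempts in total: one miss for Foo\strlen and one hit for strlen. The memory cost mentioned next is the one false entry kept per namespaced name.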

I can imagine one would argue that caching an "undefined" entry for every
native function in every namespace that calls it could take up some
memory, but wouldn't that just encourage the PHP Community, PHP IDEs
and static analysers to promote a top-level "use function \strlen" in each file,
avoid the recursive trial-and-error and avoid registering many namespaced names
and consuming more memory? It's already natural (and not just in PHP) for
IDEs to just hide the entire import statement block, so :shrug:

Here at least I can understand why it's a bit controversial. The obvious
solution to me might not be an acceptable take for other people and it's
somewhat understandable.

Can type Number = int|number; be a symbol?
Can type TwoInts = callable(int $x, int $y): int; be a symbol?

Fundamentally, yes. They'd probably need to occupy the same lookup table
as classes - if I write "function foo(SomeType $someArg)", PHP doesn't
know until it's looked it up whether "SomeType" is a type alias or class,
but that's not a big problem.

I think there is some discussion around whether people want it to work
that way - should it be more like a "use" statement, and just take
effect for the current file? Then there's questions around how to
actually implement it, and do so efficiently. But creating a table of
names to types isn't the hard part of the problem.

From Rob's email (https://externals.io/message/120094#120097), the argument
against a simple "use" statement seems quite natural. I certainly don't
want to redefine "use int|float as Number" in every PHP file I work with,
so naturally we would go back to type alias definition, symbol registration
and autoloading. So I guess my final question is: what is fundamentally
different about Type Alias when compared to interfaces, classes, enums that
make this controversial?

--
Marco Deleu

1 year ago by Rowan Tommins

PHP looks up \Foo\strlen. It's not on the Symbols table. Trigger
autoload for it.

Autoload does NOT find a definition, register in the Symbols table
that \Foo\strlen is undefined.

PHP looks up \strlen. It's not on the Symbols table, trigger
autoload for it.

Autoload finds a definition, registers in the Symbols table and
executes \strlen.

Yep, that's pretty much what's currently being proposed. The devil is in
the detail of how to implement the "register that \Foo\strlen is
undefined" part. In the current implementation, there's some unfortunate
side effects.

I can imagine one would argue that caching every native function with
a huge amount of namespace with a "undefined" definition could take up
some memory space, but wouldn't that just encourage the PHP Community,
PHP IDEs and static analysers to promote top-level file use function \strlen

Indeed, although bear in mind that the impact would be immediate on all
existing code bases. Saying "all your code will be slower / use more
memory in PHP 8.3 until you run this tool" is not the greatest message,
so we still want to minimise that impact.

(As an aside, I'd personally much rather type \ in front of functions
than maintain a long "use function" list in every file, even with the
aid of tools; but apparently I'm in a minority on that.)

Regards,

--
Rowan Tommins
[IMSoP]

1 year ago by Ilija Tovilo

Hi Deleu

From Rob's email (https://externals.io/message/120094#120097), the argument
against a simple "use" statement seems quite natural. I certainly don't
want to redefine "use int|float as Number" in every PHP file I work with,
so naturally we would go back to type alias definition, symbol registration
and autoloading. So I guess my final question is: what is fundamentally
different about Type Alias when compared to interfaces, classes, enums that
make this controversial?

I don't think autoloading is the fundamental issue with type aliases,
nor are the symbol tables. Enums live in the class symbol table, as
they are just classes. Type aliases don't need most things classes
need, but they could live there too with a discriminator flag if we're
ready to waste that space for "convenience" of not rewriting all
accesses to the class table.

I believe the bigger issue is typing itself. There are multiple complications.

1- Currently, any name that is not a known named type is considered a
class type. With type aliases this assumption is no longer correct,
which may require many changes in the engine (and in the optimizer).
2- Combinations of union and intersection types are limited at the
moment (Foo|Bar, Foo&Bar, (Foo&Bar)|Baz). With type aliases we can
nest types indirectly to create new combinations that were previously
disallowed on a syntax level. We'll either have to handle these
correctly (which from what I understand is quite complicated) or
disallow them at runtime.
3- Type variance may be challenging. E.g. do we allow substituting a
type alias with its concrete types and vice versa? What about
substituting two equivalent typealiases? There are infinite
combinations.
4- For runtime type checking itself we would need to compare the value
against the concrete types instead of the typealias, thus complicating
and slowing down the type check.

All of those could be solved (to some extent) by substituting the
typealias with the concrete types as early as possible and reusing the
existing type system. This is the approach I've tried some years ago:
https://github.com/php/php-src/compare/master...iluuu1994:php-src:typealias

The main issue with this approach is that classes/functions are
generally immutable (with OPcache) because we want to store them in
shared memory where all processes can access them. We have mechanisms
to make parts of the class/function mutable per request but
adjusting this for all types might once again require many code
changes. Furthermore, every type (with a typealias, at least) would
require copying to process space to substitute the typealiases with
the concrete type, for every request. This might or might not be
significant, it's hard to tell without measuring.

But the main reason why I stopped working on this was, what do we use
it for? Right now the main use cases are union and intersection types
which are fairly limited or short in my personal PHP code. A
reasonable use case might be closure types. However, I have become
increasingly sceptical whether runtime types for closures are the
direction we should take, as 1. they may be slow, hard to implement or
both and 2. most code doesn't want to add closure types that could
be inferred in most other typed languages.

This e-mail is not too structured and not exhaustive, let me know if
you have any more questions.

Ilija

1 year ago by Larry Garfield

Hi Marco!

First of all, please do stick around, and keep learning and being
curious. Sometimes out of the box thinking does get somewhere.

That said, a lot of what you've written here is actually what already
happens, and the problems are elsewhere.

1- Use include/require
2- Custom Autoloading (not for functions but bear with me)
3- Use Composer/PSR-4

The first thing to note is that these aren't options, they're layers
on top of each other: to load some code from a file, you use
include/require; to load some code on demand, you write an autoload
function, which uses include/require; to let someone else write that
function, you lay your files out according to PSR-4 and use Composer,
which writes the autoload function, which uses include/require.

To piggyback on this point and go deeper into the "Composer/PSR-4" part:

PSR-4 (and PSR-0 before it) are just standard rules for mapping a class name to a file. The intent is that an arbitrary autoloader written by anyone can then "find" classes from any package. PSR-0 came out in 2009.

Composer came out in ~2012. It actually has four different loading mechanisms: PSR-0, PSR-4, "one big lookup map", and "files".

PSR-0 and PSR-4 use those specs' class-name-to-file-name translation rules to look up a file on the fly.

The "one big lookup map" scans the entire code base at dump time and builds up a big table of where every class-like is. Whether it conforms to PSR-4 or not is irrelevant. Then at runtime it just does an array lookup. I've many times run into the "fun" situation where a class was not properly named per PSR-4, but because the autoloader was set to auto-dump always, it never caused an issue as the built-map still worked! (Until it didn't, of course, because the lookup table is a cache that can get out of sync and give you no helpful error messages about it. I've lost much time this way.)

And the "files" approach just greedily includes those files as soon as Composer initializes, and lets PHP take it from there. It's not even using an "autoloader" at all.

(Fun fact: In Drupal 7, in the pre-PSR days, we built our own autoloader that worked on a "register and scan" model and stored a lookup table in the database. It... was not the best code I've ever written, but it did work for many years, and it's still running in the millions of Drupal 7 sites that still exist. For Drupal 8 we, thankfully, switched to Composer as the autoloader.)

I've long felt that the advent of universal opcache usage and preloading (although few people use the latter, sadly) means that leveraging "files" should be a lot more popular than it is. That's what I do for my functional code (in Crell/fp); I just "files" include everything (which is admittedly not much) and move on with life. PHP is a lot faster at this than it once was.

It's also why I'm only kind of lukewarm on function autoloading. I'm not against it, but I don't see it as the blocker to more functional code that it once was. And as noted, how the heck do you then organize your functions so they're autoloadable? One per file is silly. So now we need to build some kind of map, and... we're back to the "one big lookup map" approach, or something.

I think there's still a lot of cultural aversion to front-loading code from the days when it was a lot more costly. PSR-4 is very good for what it is, but it's not the final word on code organization. It doesn't have to be treated that way, and that's a change that requires no core code changes.

If we actually got working shared typedefs in the language, TBH I'd probably recommend people put all of their package's defs in a single file, "file" load it with Composer, and move on. Don't even bring PSR-4 into the picture.

--Larry Garfield

1 year ago by Deleu

On Fri, Apr 21, 2023 at 11:18 AM Larry Garfield larry@garfieldtech.com
wrote:

To piggyback on this point and go deeper into the "Composer/PSR-4" part:

PSR-4 (and PSR-0 before it) are just standard rules for mapping a class
name to a file. The intent is that an arbitrary autoloader written by
anyone can then "find" classes from any package. PSR-0 came out in 2009.

Composer came out in ~2012. It actually has four different loading
mechanisms: PSR-0, PSR-4, "one big lookup map", and "files".

PSR-0 and PSR-4 use those specs' class-name-to-file-name translation rules
to look up a file on the fly.

The "one big lookup map" scans the entire code base at dump time and
builds up a big table of where every class-like is. Whether it conforms to
PSR-4 or not is irrelevant. Then at runtime it just does an array lookup.
I've many times run into the "fun" situation where a class was not properly
named per PSR-4, but because the autoloader was set to auto-dump always, it
never caused an issue as the built-map still worked! (Until it didn't, of
course, because the lookup table is a cache that can get out of sync and
give you no helpful error messages about it. I've lost much time this way.)

And the "files" approach just greedily includes those files as soon as
Composer initializes, and lets PHP take it from there. It's not even using
an "autoloader" at all.

(Fun fact: In Drupal 7, in the pre-PSR days, we built our own autoloader
that worked on a "register and scan" model and stored a lookup table in the
database. It... was not the best code I've ever written, but it did work
for many years, and it's still running in the millions of Drupal 7 sites
that still exist. For Drupal 8 we, thankfully, switched to Composer as the
autoloader.)

I've long felt that the advent of universal opcache usage and preloading
(although few people use the latter, sadly) means that leveraging "files"
should be a lot more popular than it is. That's what I do for my
functional code (in Crell/fp); I just "files" include everything (which is
admittedly not much) and move on with life. PHP is a lot faster at this
than it once was.

It's also why I'm only kind of lukewarm on function autoloading. I'm not
against it, but I don't see it as the blocker to more functional code that
it once was. And as noted, how the heck do you then organize your
functions so they're autoloadable? One per file is silly. So now we need
to build some kind of map, and... we're back to the "one big lookup map"
approach, or something.

I think there's still a lot of cultural aversion to front-loading code
from the days when it was a lot more costly. PSR-4 is very good for what
it is, but it's not the final word on code organization. It doesn't have
to be treated that way, and that's a change that requires no core code
changes.

If we actually got working shared typedefs in the language, TBH I'd
probably recommend people put all of their package's defs in a single file,
"files" load it with Composer, and move on. Don't even bring PSR-4 into the
picture.

--Larry Garfield

Hey thanks for the reply. I just want to go an extra mile to make my points
clearer. Please take no negative connotation from what I posted towards
Composer, PSR-0, PSR-4 or such. I was merely describing things from an
end-user perspective to bring everything I think about into context.

My simplistic and naive view of things comes from reading about typed
callables, function autoloading and type aliases. My biggest question is
still understanding what is fundamentally different between class Bar and
type Foo = Bar|Baz. I understand that, given the current state of things in
the PHP world, folks will likely hate declaring 1 type = 1 file to conform
with PSR-4, but that has nothing to do with the engine. It's neither good
nor bad for the PHP engine and just extends an extremely well-defined
existing principle of PHP symbols.

The recursive resolution of function autoloading does explain a little bit
why it has been controversial and why no final decision has come out of it
yet, but even there it's just a matter of whether consensus can be reached
that a combination of PHP engine and community work can bring it to a great
final state, and that not every detail needs to be a burden on internals.

At the end of the day I would love to declare a few functions in a single
file, a few enums in another file, a few type aliases in another file and
perhaps even a few intertwined/related small classes in the same file. PHP
autoloading functionality is 90% there, and we would just need function
autoloading, type alias definitions and PSR-X + a Composer static symbol
scanner (something that seems to already exist?) to make it a great
developer experience. If you don't use Composer, nothing gets worse (except
maybe function autoloading + performance), but if you use Composer already,
this can be a seamless transition and perhaps even bring more interest
towards preloading.

To maybe beat up more on function autoloading, if I understand correctly
the current open proposal sidesteps the performance impact by making
function autoloading a "separate autoloading filtered by type" kind of
mechanism. If you upgrade from PHP 8.x to 8.x+1 with no code changes, your
performance impact is zero. Your first performance impact arises when you
register a function autoloader (making it opt-in), and then you can choose
a strategy that perhaps lets you fade away the performance impact with the
Composer static scanner dumping mechanism.

I feel like this all sounds too good to be true/possible because if it were
easy, it would maybe have been done by now. Even if we park function
autoloading altogether (for its controversy) and focus just on type
aliases, the question remains: Why is it not possible to make Type Alias
the same way that Enum was recently introduced?

--
Marco Deleu

1 year ago by Larry Garfield

> Hey thanks for the reply. I just want to go an extra mile to make my points
> clearer. Please take no negative connotation from what I posted towards
> Composer, PSR-0, PSR-4 or such. I was merely describing things from an
> end-user perspective to bring everything I think about into context.

No offense taken. A lot of people get confused about the Composer/PSR-4 relationship, and don't actually understand what Composer is doing, so I just wanted to be sure we were all on the same page, explicitly.

> I feel like this all sounds too good to be true/possible because if it were
> easy, it would maybe have been done by now. Even if we park function
> autoloading altogether (for its controversy) and focus just on type
> aliases, the question remains: Why is it not possible to make Type Alias
> the same way that Enum was recently introduced?

I think Tim already answered this effectively. Enums are, to the engine, classes with funny syntax. They're not a new type, they're classes with some extra machinery. So piggybacking on the class autoloading is trivial.
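
A small sketch of that piggybacking, assuming PHP 8.1 or later: the ordinary
class autoloader is the one asked to resolve an unknown enum name, and an enum
case is an object of the enum's type. "MissingEnum" is a hypothetical name
that is deliberately left undefined; Options mirrors the enum from the first
message in this thread.

```php
<?php
enum Options
{
    case Option1;
}

// Record every symbol name the engine asks the class autoloader to resolve.
$requested = [];
spl_autoload_register(function (string $name) use (&$requested): void {
    $requested[] = $name;
});

// Looking up an undefined enum goes through the class autoloader callback above.
$found = enum_exists('MissingEnum');

// An enum case is an object instance of its enum, so instanceof works on it.
$isInstance = Options::Option1 instanceof Options;
```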

Type aliases are vastly more complex. They may or may not exist as their own thing; depending on the implementation approach, they may be nestable; and they cannot be used everywhere that classes can. (I don't think using an alias in instanceof would actually work, but it would work with the "is" keyword that has been proposed.) They're a fundamentally different beast with different syntactic implications.

--Larry Garfield