Hi Internals,
SPL is an extension that is always available in PHP.
It provides some classes, interfaces and functions etc. such as
- ArrayObject class
- Countable interface
- iterator_count function
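For reference, a minimal sketch of those three in use. The variable names are only illustrative:

```php
<?php
// ArrayObject wraps an array and implements Countable and IteratorAggregate.
$list = new ArrayObject(['a', 'b', 'c']);

// Countable is what lets count() work on an object.
$viaCountable = count($list);

// iterator_count() walks an iterator and counts its elements.
$viaIteratorCount = iterator_count($list->getIterator());

$isCountable = $list instanceof Countable;
```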
What I'd like to wrap my head around is the position of this extension
in PHP and the sentiments towards SPL from the Internals developers.
- What is the status of this extension currently? Is it being actively
  developed or just supported?
- Is there any interest in adding stuff here, e.g. new classes,
  interfaces or traits?
- And technically, how is something like the ArrayObject class
  implemented? And if you were to implement it again, would it be done
  the same way?
Thanks for your time,
Jakob
Hi Jakob,
Obviously, all of the following is my own personal opinion, and other
people may have different opinions.
There are two main lessons learnt from the SPL experience.
i) Some APIs need to evolve separately from the PHP release schedule*,
as otherwise any mistake in the design of the API is locked in until
the next minor release.
ii) PHP needs a better way of installing extensions. There was a lot
of work done on this in https://github.com/FriendsOfPHP/pickle but
that effort seems to have been abandoned after a huge amount of work
was done for it.
If anyone has any info on what the problems were with the approach
taken in that project, sharing that knowledge with the rest of the PHP
community would be helpful in a new attempt to solve that problem.
To answer your questions:
- What is the status of this extension currently? Is it being actively
developed or just supported?
It's pretty dead and the code is quite scary to even touch.
- Is there any interest in adding stuff here - f.ex. new classes,
interfaces or traits?
Absolutely no interest from me. As I said, new attempts to provide
common libraries should be done separately from PHP core, preferably
in userland code where possible.
- And technically, how is something like ArrayObject class
implemented? And should you implement it again, would it be done the
same way?
The code is in:
https://github.com/php/php-src/blob/d7f7080bb5b42a4dd2d08c91c02645b9d9a74a50/ext/spl/spl_array.c
And a different approach should be taken.
I've posted some notes of my thoughts on the individual parts of the SPL below.
cheers
Dan
Ack
Iterators
These are quite useful, though possibly could do with a better
developer experience around using them.
The file related ones are best avoided though - see File Handling below.
Datastructures
People tend not to use them, but it is hard to express exactly why.
It is partly due to some issues in their implementations. For example,
the fact that the method SplPriorityQueue::recoverFromCorruption()
exists is a pretty bad sign.
There is a better set of datastructures available in the Ds
extension.
I think it is possibly also related to the difficulty of converting
from arrays to custom data structures and back again, which is not a
good experience.
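As a rough illustration of that round trip, here is a sketch using SplFixedArray. The choice of SplFixedArray and the variable names are mine and only one example of the friction:

```php
<?php
$plain = [3, 1, 2];

// Array -> SplFixedArray: an explicit conversion step.
$fixed = SplFixedArray::fromArray($plain);

// array_map() only accepts real arrays, so convert back first.
$doubled = array_map(fn (int $n): int => $n * 2, $fixed->toArray());

// ... and convert once more if the data structure is wanted again.
$fixedDoubled = SplFixedArray::fromArray($doubled);
```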
Exceptions
The attempt has two fundamental mistakes in my opinion.
First, I think all exceptions should extend a base exception that is
specific to the library that the code is in. e.g.
    try {
        $image = new Imagick("foo.png");
        $image->someMethodThatMightThrow();
    }
    catch (ImagickException $e) {
        // This catch should be guaranteed to catch all exceptions
        // that could possibly be thrown by 'someMethodThatMightThrow'
    }
Any exceptions to that rule, like a TypeError that is not caught
internally and rethrown with an Imagick specific version, should be
considered as a bug.
Second, having a hierarchy of exceptions that builds up more specific
meaning is something that has a strong aesthetic appeal to developers,
but has no actual benefit beyond extending a base exception for
that library.
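A minimal sketch of the first point, assuming a made-up library: Thumbnailer, ThumbnailerException and doResize() are hypothetical names and not part of any real extension. Anything thrown internally is rethrown as the library's own base exception, so callers need a single catch block.

```php
<?php
// Hypothetical library base exception; every error leaving the library extends it.
class ThumbnailerException extends Exception {}

// Hypothetical library class, for illustration only.
class Thumbnailer
{
    public function resize(string $path, int $width): string
    {
        try {
            return $this->doResize($path, $width);
        } catch (ThumbnailerException $e) {
            throw $e; // already library-specific
        } catch (Throwable $e) {
            // Wrap anything else, keeping the original as the previous exception.
            throw new ThumbnailerException($e->getMessage(), 0, $e);
        }
    }

    private function doResize(string $path, int $width): string
    {
        if ($width <= 0) {
            throw new InvalidArgumentException("Width must be positive, got $width");
        }
        return "$path@{$width}w";
    }
}

$caught = null;
try {
    (new Thumbnailer())->resize('foo.png', 0);
} catch (ThumbnailerException $e) {
    $caught = $e;
}
```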
File Handling
There are multiple issues with them. They are not well designed
classes, are kind of difficult to work with, and some of the
assumptions in them are unsafe. For example, cloning a
FilesystemIterator assumes that the directory has not changed:
https://bugs.php.net/bug.php?id=69291
More fundamentally I think these classes are also a mistake. Here is a
quote from a paper by Edsger W. Dijkstra*:
the purpose of abstracting is not to be vague, but to create
a new semantic level in which one can be absolutely precise.
I think this can be turned round the other way. If an abstraction does
not provide a new, more precise semantic level, then it does not
provide any value.
Those classes are not simpler to use than the low level unix file
handling routines. And so they do not provide any value over just
using the low level functions. In fact they are harmful as they hide
some details that you probably want to know about.
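To make the comparison concrete, here is a small sketch that reads the same two lines both ways. It uses in-memory streams so it can be run anywhere, and the variable names are illustrative. Whether one version is clearer than the other is a matter of opinion; the point is only that the class version is not obviously simpler.

```php
<?php
// Low-level functions: each failure mode is visible at the call site.
$handle = fopen('php://memory', 'w+');
if ($handle === false) {
    throw new RuntimeException('Could not open stream');
}
fwrite($handle, "first\nsecond\n");
rewind($handle);

$lines = [];
while (($line = fgets($handle)) !== false) {
    $lines[] = rtrim($line, "\n");
}
fclose($handle);

// SplFileObject family: the same job, but what each iteration yields
// depends on a combination of flags.
$file = new SplTempFileObject();
$file->fwrite("first\nsecond\n");
$file->setFlags(
    SplFileObject::DROP_NEW_LINE | SplFileObject::READ_AHEAD | SplFileObject::SKIP_EMPTY
);

$splLines = [];
foreach ($file as $splLine) {
    $splLines[] = $splLine;
}
```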
That is an opinion that I think also applies to the idea of providing
an OO API to the functions for handling HTTP requests, e.g.
get_headers(). Although they could do with improvement through having
less magic, and being more complete (e.g. why isn't there a get_body()
function?), putting them all in an OO API seems the wrong thing to do
to me.
cheers
Dan
Ack
* Edsger W. Dijkstra -
  https://www.cs.utexas.edu/~EWD/transcriptions/EWD03xx/EWD340.html
* PHP release schedule problem -
  https://github.com/Danack/RfcCodex/blob/master/rfc_attitudes.md#not-being-compatible-with-the-php-release-schedule
Obviously, all of the following is my own personal opinion, and other
people may have different opinions.
There are two main lessons learnt from the SPL experience.
i) Some APIs need to evolve separately from the PHP release
schedule*.
As otherwise any mistake in the design of the API is locked in until
the next minor release.
ii) PHP needs a better way of installing extensions. There was a lot
of work done on this in https://github.com/FriendsOfPHP/pickle but
that effort seems to have been abandoned after a huge amount of work
was done for it.
If anyone has any info on what the problems were with the approach
taken in that project, sharing that knowledge with the rest of the
PHP community would be helpful in a new attempt to solve that problem.
I don't think this is the specific SPL learning.
The goal of having many of those things in SPL (aside from helly having
time and fun and doing things he liked) was that they were supposed to
be the "Standard". Having one "observer" interface around everywhere.
(picking that, since I never saw code using SPL's observer ...)
Having it in a pecl module (even with a better tool) doesn't make them
ubiquitous standards.
The fact is that the internals group is a good group to discuss how the
language itself should change and how the implementation should work.
It isn't, however, good at defining APIs and interfaces.
This is not only due to experience of the ones involved, but also since
best practices evolve. What PHP provides is locked in the BC trap.
Evolving it, changing it is hard.
Luckily we are not in the times SPL was created, but 15 or so years
later. We now have composer and a way smaller gap between C code and
PHP userland for such things.
It is now way simpler and way more approachable to put such things in
userland packages and have other groups (like PSR process) design
those.
In the rare case where a design shows obvious benefits but is slow, one
can consider putting it in C and then in the distribution, but we
should aim to keep as many things in userland as we can. Actual users
can contribute their experience more easily, it's simpler to debug, and
it's simpler to evolve or replace.
APIs and interfaces in the implementation should be the basic
foundation: as unopinionated as possible, so that "nice" interfaces
matching the framework of the week can be built in userland on top.
For these things having a "simpler way" to install extensions isn't
really critical, although rethinking PECL and its tooling is important!
(What is its role, its function? Regarding code ownership,
maintenance, hosting, being a directory, offering [Windows] build
services, distribution, ...) If somebody were willing to think through
all the related aspects and invest time in execution ... I'm happy
to share my thoughts in more detail, though they contain no proposed
path, if somebody has the energy to drive it.
Of course the line between what to put in C or userland is complicated.
Let's take the iterators in SPL. A userland implementation will often
have something like
    function next()
    {
        $this->innerIterator->next();
    }
in it. In userland code we need to do the function lookups each time.
In the extension the lookup can be cached and the call can be sped up.
With lots of iterator nesting and iterations over large collections
this can be noticeable. Whether that warrants doing it in C is open to
discussion. (I haven't measured in the PHP 7 era, but SPL contains pure
PHP implementations of these things if somebody is intrigued.)
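For anyone who wants to measure, here is a sketch of what such a userland wrapper looks like next to the built-in IteratorIterator. ForwardingIterator is a hypothetical name, the return types assume PHP 8.0 or later, and no timing claims are made here:

```php
<?php
// Userland decorator: every step is a PHP-level method call that is
// forwarded to the inner iterator.
final class ForwardingIterator implements Iterator
{
    private Iterator $innerIterator;

    public function __construct(Iterator $innerIterator)
    {
        $this->innerIterator = $innerIterator;
    }

    public function current(): mixed { return $this->innerIterator->current(); }
    public function key(): mixed { return $this->innerIterator->key(); }
    public function next(): void { $this->innerIterator->next(); }
    public function rewind(): void { $this->innerIterator->rewind(); }
    public function valid(): bool { return $this->innerIterator->valid(); }
}

// Both wrappers yield the same elements; only where the forwarding
// happens (PHP code versus the extension's C code) differs.
$userland = iterator_to_array(new ForwardingIterator(new ArrayIterator([1, 2, 3])));
$builtin = iterator_to_array(new IteratorIterator(new ArrayIterator([1, 2, 3])));
```

Wrapping each of these in a loop with hrtime() around it would be one way to compare them on a given PHP version.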
For archaeologists on PHP internals: This is the reason helly added
the set of zend_call_method functions and macros in zend_interfaces.h,
allowing the cached lookups over the previous call_user_function()
in zend_API.h.
johannes
In my opinion, another key takeaway: inheritance as a code reuse
mechanism can really bite you. SplStack extends SplDoublyLinkedList
and this exposes a bunch of methods on a stack that don't make any
sense. It also means there are constraints on how well you can
optimize the stack, because you have these methods that don't make
sense on a stack that you must support because some poor bloke
somewhere actually used them (may God have mercy on them).
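A short sketch of the problem, plus one possible composition-based alternative. MinimalStack is a hypothetical class written for this example (PHP 8.0 or later for the mixed type), not a proposal for core:

```php
<?php
$stack = new SplStack();
$stack->push('a');
$stack->push('b');
$stack->push('c');

// Inherited from SplDoublyLinkedList: removes from the *bottom* of the stack.
$bottom = $stack->shift();

// The operation a stack is actually supposed to have.
$top = $stack->pop();

// Hypothetical alternative: wrap the storage instead of inheriting from it,
// so only stack operations are exposed.
final class MinimalStack
{
    private array $items = [];

    public function push(mixed $item): void
    {
        $this->items[] = $item;
    }

    public function pop(): mixed
    {
        if ($this->items === []) {
            throw new UnderflowException('Stack is empty');
        }
        return array_pop($this->items);
    }

    public function isEmpty(): bool
    {
        return $this->items === [];
    }
}

$minimal = new MinimalStack();
$minimal->push('x');
$minimal->push('y');
$last = $minimal->pop();
```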