Hello everyone. I would like to announce that the RFC for "StreamWrapper
Support for glob()" is now ready for Discussion.
RFC
https://wiki.php.net/rfc/glob_streamwrapper_support
Feature Request and Discussions
https://github.com/php/php-src/issues/9224
Regards,
Tim
Hi Timmy,
Good suggestion.
This seems like a no brainer, and definitely good to add streams support
for glob filepaths. You're right, we currently have to work around this in
userland. Your code example workaround is super ugly :)
I'm interested to know, from others, if there are deeper technical reasons
why this wouldn't be a good idea, because I'm not it
Unless my memory is wrong, Sara was heavily involved in the initial streams
API, and might be good to see what she thinks too, from an implementation
POV :)
Many thanks,
Paul
Couldn't find the original message to reply to, but #1 I'd like to see the
RFC actually fleshed out. There's a PR linked, but we should define the
actual goals/APIs etc... in the RFC itself as well as the numerous edge
cases Christophe noted and why they aren't (or are?) concerns.
I do recall discussing this with Wez many years ago and the sharp corners
felt like a "wait until someone is asking for it" kind of situation.
-Sara
Hi @Dan, hi @Sara. Thanks for giving us your feedback on this.
I think that although the RFC discussion can go ahead without a patch,
it would be better to have a patch before it went to vote, as there
seem to be quite a few hidden details that might not be able to be
made to work.
I wanted that too. But it's hard to find someone willing to invest the time
and effort. And it's even harder when that time invested is not guaranteed
to lead to something. Therefore a PoC patch was produced that should give a
good understanding of what is intended. It was not my choice but I
understood the reasons.
but as noted in the PR (https://github.com/php/php-src/issues/9224),
where cmb69 wrote:
In some cases it just makes no sense (e.g. compress.zlib:// and data://),
in some cases it is impossible (e.g. http://), and in some cases it still
might be impractical (e.g. ftp://).
The exact details of edge cases probably need to be at least listed as
people (well at least myself) would view a lack of those details as a
reason to vote 'no', even if they thought the general idea was good.
The RFC probably also needs to make an argument of why something that
sounds like it would be a leaky abstraction should be in core (and so
generating more support requests) rather than people using the already
existing userland package
(https://packagist.org/packages/webmozart/glob) which currently has
over 10 million installs.
I don't know what you mean by "a leaky abstraction". If someone is
proposing we should raise awareness of file operations with
compress.zlib://, data://, ftp://, or http:// then that should be a
discussion for all filesystem functions and not only glob(). scandir() for
one already supports streamwrappers and I have no idea why someone would
attempt scandir('http://'). The current error handling in glob() is
returning an empty array if it fails. I wouldn't find it odd if it also did
that for glob('http://').
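To make the scandir() comparison concrete, here is a minimal sketch of a
userland wrapper that supports directory listing. The MemDirWrapper class
and the mem:// scheme are made up for illustration and are not part of PHP
or of the RFC; only the dir_* method names and stream_wrapper_register()
come from PHP's documented streamWrapper prototype.

```php
<?php
// Hypothetical in-memory wrapper; class name and "mem" scheme are
// illustrative assumptions, not anything defined by PHP or the RFC.
final class MemDirWrapper
{
    /** @var resource|null Filled in by PHP when a stream context is used. */
    public $context;

    /** @var array<string, string[]> Directory URL => entry names. */
    public static array $dirs = [
        'mem://docs' => ['a.txt', 'b.md', 'c.txt'],
    ];

    private array $entries = [];
    private int $pos = 0;

    public function dir_opendir(string $path, int $options): bool
    {
        $key = rtrim($path, '/');
        if (!isset(self::$dirs[$key])) {
            return false;               // unknown directory
        }
        $this->entries = self::$dirs[$key];
        $this->pos = 0;
        return true;
    }

    public function dir_readdir()
    {
        // Return the next entry name, or false when the listing is exhausted.
        return $this->entries[$this->pos++] ?? false;
    }

    public function dir_rewinddir(): bool
    {
        $this->pos = 0;
        return true;
    }

    public function dir_closedir(): bool
    {
        $this->entries = [];
        return true;
    }
}

stream_wrapper_register('mem', MemDirWrapper::class);

// scandir() is routed through MemDirWrapper::dir_opendir()/dir_readdir().
$listing = scandir('mem://docs');

// At the time of this thread glob() does not consult the wrapper, so this
// call is not expected to see the entries above.
$matches = glob('mem://docs/*.txt');
```

With this in place $listing holds the three entry names, while the glob()
call never reaches the wrapper, which is the gap the RFC is about.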
Regarding webmozart's workaround. I don't see third party libraries as a
reason why we can't have nice things on the core level. Rather the
opposite. They tell me how much effort was needed just to cover for
something that wasn't available on the core level, and how many people
came across this limitation.
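For readers who have not had to write such a workaround, a rough sketch of
the userland approach follows. It is not the code from the RFC and not how
webmozart/glob is implemented; wrapper_glob() is a hypothetical helper that
only matches entries of a single directory with fnmatch(), with no
recursion, no brace expansion and no GLOB_* flags. The example uses file://
on a temporary directory only so that it runs anywhere; the same calls are
assumed to work for any wrapper that implements directory listing.

```php
<?php
/**
 * Hypothetical helper: list one directory through its stream wrapper and
 * keep the entries whose names match a shell-style pattern.
 *
 * @return string[] Matching paths, sorted; an empty array on failure.
 */
function wrapper_glob(string $dir, string $pattern): array
{
    $handle = @opendir($dir);           // needs a wrapper with directory support
    if ($handle === false) {
        return [];                      // mirror glob()'s "nothing found" result
    }

    $matches = [];
    while (($entry = readdir($handle)) !== false) {
        if ($entry === '.' || $entry === '..') {
            continue;
        }
        if (fnmatch($pattern, $entry)) {
            $matches[] = rtrim($dir, '/') . '/' . $entry;
        }
    }
    closedir($handle);

    sort($matches);                     // glob() also sorts unless GLOB_NOSORT is set
    return $matches;
}

// Usage: build a throwaway directory and query it through a wrapper URL.
$dir = sys_get_temp_dir() . '/wrapper_glob_' . uniqid();
mkdir($dir);
foreach (['a.txt', 'b.md', 'c.txt'] as $name) {
    touch($dir . '/' . $name);
}

$found = wrapper_glob('file://' . $dir, '*.txt');
```

Even this small version has to handle error cases, the dot entries and
sorting by hand, and it still covers only a fraction of what glob() does.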
There's a PR linked, but we should define the actual goals/APIs etc... in
the RFC itself as well as the numerous edge cases
I agree. If you wanna put it in writing I can put it in the RFC. For the
edge cases my proposal is to return that empty array like it does for any
invalid path today. We could leave a note or tip in the documentation, but
again then you might need to do so with scandir() and all the other
filesystem functions.
@Sara let me know if you would like to take over this RFC? You are more
experienced than I am.
/Tim
On Sun, Sep 18, 2022 at 5:02 PM Timmy Almroth timmy.almroth@gmail.com
wrote:
There's a PR linked, but we should define the actual goals/APIs etc...
in the RFC itself as well as the numerous edge cases
I agree. If you wanna put it in writing I can put it in the RFC. For the
edge cases my proposal is to return that empty array like it does for any
invalid path today. We could leave a note or tip in the documentation, but
again then you might need to do so with scandir() and all the other
filesystem functions.
@Sara let me know if you would like to take over this RFC? You are more
experienced than I am.
I don't. I'm split too many ways atm to take on a PHP RFC.
I was using "we" in a more collective sense to mean "us as a project when
writing any RFC". The mechanics of wiring in a method in a wrapper to be
called as a proxy from a global function isn't hard. What's hard in this
case is defining what is expected from that function, specifically what a
typical user might expect. One might expect glob('https://www.php.net/*')
to return all names of all root pages on the site, and perhaps all short
aliases as well (which effectively includes all functions and classes).
There's no way a StreamWrapper implementation can accomplish that in the
general case, the protocol simply doesn't support it. But okay, we don't
implement it for http(s), but maybe we do for ftp(s), and might as well let
end users decide when they have a custom wrapper where it makes sense.
Cool. I'm just saying write all that out and put it in the RFC. There's
no need to know C or PHP's internal APIs to write that much. Write out the
API from the user's point of view, what assumptions you're making about
their use cases, and what risks are presented by cases like http(s) where
we won't be able to pull off a meaningful implementation.
-Sara
Whether a streamwrapper can be used with glob() or not is not a limitation
of the intended glob() implementation, but a limitation of the
StreamWrappers themselves. The problem already applies to scandir() today.
scandir('ftp://.../') would only resolve contents if the streamwrapper
supports it. Someone can substitute PHP's streamwrapper for ftp with one
that does resolve content. I'm not sure how many would attempt these border
cases, but the documentation could state some words of advice. We could
list which streamwrappers work out of the box, or vice versa.
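As an illustration of the substitution mentioned above, the following
sketch swaps the built-in ftp wrapper for a userland class. FakeFtpWrapper
and its canned listing are assumptions made up for this example (it never
contacts a server); stream_wrapper_unregister(), stream_wrapper_register()
and stream_wrapper_restore() are the existing PHP functions for this.

```php
<?php
// Hypothetical replacement wrapper: returns a fixed listing instead of
// talking to an FTP server, purely to show the substitution mechanism.
final class FakeFtpWrapper
{
    /** @var resource|null Filled in by PHP when a stream context is used. */
    public $context;

    private array $entries = [];
    private int $pos = 0;

    public function dir_opendir(string $path, int $options): bool
    {
        // A real implementation would ask the server for this listing.
        $this->entries = ['README', 'pub'];
        $this->pos = 0;
        return true;
    }

    public function dir_readdir()
    {
        return $this->entries[$this->pos++] ?? false;
    }

    public function dir_rewinddir(): bool
    {
        $this->pos = 0;
        return true;
    }

    public function dir_closedir(): bool
    {
        return true;
    }
}

// Remove the built-in ftp wrapper if this PHP build has one.
$hadBuiltin = in_array('ftp', stream_get_wrappers(), true);
if ($hadBuiltin) {
    stream_wrapper_unregister('ftp');
}
stream_wrapper_register('ftp', FakeFtpWrapper::class);

// Directory functions now dispatch ftp:// URLs to FakeFtpWrapper.
$listing = scandir('ftp://example.invalid/');

// Put the original wrapper back so later code is unaffected.
stream_wrapper_unregister('ftp');
if ($hadBuiltin) {
    stream_wrapper_restore('ftp');
}
```

Whether a given scheme lists anything therefore depends on whichever
wrapper is registered for it, not on the function doing the listing.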
I would not change the way glob() returns data. For invalid paths or
streamwrappers it would just continue to return an empty array() like it
does today.
I edited the RFC to mention these edge cases and limitations. And how they
would be handled.
If you feel I missed something out I'm grateful if you wanna send a few
words fit for copy and pasting to the RFC.
/Tim
Hi,
On Sun, Sep 18, 2022 at 11:02 PM Timmy Almroth timmy.almroth@gmail.com
wrote:
Hi @Dan, hi @Sara. Thanks for giving us your feedback on this.
I think that although the RFC discussion can go ahead without a patch,
it would be better to have a patch before it went to vote, as there
seem to be quite a few hidden details that might not be able to be
made to work.
I wanted that too. But it's hard to find someone willing to invest the time
and effort. And it's even harder when that time invested is not guaranteed
to lead to something. Therefore a PoC patch was produced that should give a
good understanding of what is intended. It was not my choice but I
understood the reasons.
I had a quick look at that PoC and it's basically just a quick wrapper that
depends on GLOB_ALTDIRFUNC. Unfortunately that's a non-standard extension
that might be missing on some platforms (e.g. Alpine probably won't work
because, from a quick look, the musl libc doesn't seem to implement it -
https://git.musl-libc.org/cgit/musl/tree/include/glob.h ). The code
obviously needs more work and we will need a bunch of tests for this. Well,
obviously it's just a PoC, but what I want to say is that the implementation
is really the main thing here and should also help to provide more details
in the RFC. For example, it's not currently clear that you would change the
underlying glob implementation on some platforms - quite an important thing
to mention though.
To be honest, if there is a good implementation, I think it's quite unlikely
that the RFC will fail.
Regards
Jakub
Hi Jakub. Indeed, it looks like Alpine users would not benefit from this. I
made notes of this in the RFC, thank you.
I understand you would have wanted to see the final PR. I do too. I will
not be involved in producing the final PR and these were the conditions I
agreed on. It's hard to find someone who will invest the time not knowing
if all their work will carry through. I wish there were two steps in the
RFC process. One voting for qualifying the idea/concept, and a second part
to vote for the final PR. The first qualifying part could filter out most
of the undesired ideas. And the second could attract coders and maybe allow
for several PR proposals. Does the PHP team have any future intentions to
maybe implement an RFC management system that could optimize the creation,
user experience, and voting process?
Hi Tim,
Hello everyone. I would like to announce that the RFC for "StreamWrapper
Support for glob()" is now ready for Discussion.
The RFC has:
Final patch will be produced ... if this RFC is approved.
I think that although the RFC discussion can go ahead without a patch,
it would be better to have a patch before it went to vote, as there
seem to be quite a few hidden details that might not be able to be
made to work.
The RFC has:
Consistently implement StreamWrapper support for glob(). Example:
glob('vfs://*.ext')
but as noted in the PR (https://github.com/php/php-src/issues/9224),
where cmb69 wrote:
In some cases it just makes no sense (e.g. compress.zlib:// and data://), in some
cases it is impossible (e.g. http://), and in some cases it still might be
impractical (e.g. ftp://).
The exact details of edge cases probably need to be at least listed as
people (well at least myself) would view a lack of those details as a
reason to vote 'no', even if they thought the general idea was good.
The RFC probably also needs to make an argument of why something that
sounds like it would be a leaky abstraction should be in core (and so
generating more support requests) rather than people using the already
existing userland package
(https://packagist.org/packages/webmozart/glob) which currently has
over 10 million installs.
cheers
Dan
Ack