Hey Everyone,
As I mentioned in a previous email of mine (
https://externals.io/message/129486#130077),
I recently separated the query parameter handling sub-proposal from
https://wiki.php.net/rfc/uri_followup into its own RFC because it was way
too complex.
Therefore I'm officially opening its discussion.
After the separation, I reworked the proposal quite a lot: the single
biggest change
is that now, only a single class would be added: Uri\QueryParams instead of
both an RFC 3986
and a WHATWG URL compatible implementation.
The focus of the RFC is now to move away from the usage of the $_GET
superglobal,
which goal comes with two additional expectations:
- the new implementation should have comparable performance to $_GET
- the new implementation should support most capabilities of $_GET (e.g.
arrays)
The first one is probably straightforward to achieve, the latter one has
fundamental
problems: PHP's feature set (mostly: array support) is not compatible with
the WHATWG URL,
so some behavior likely wouldn't comply with this specification. That's why
the RFC
still has some TBD parts (e.g. Array API), or some contradicting info
related to some getters' and setters' signature/behavior.
Other than the API itself, the proposal doesn't have many questions, except
one thing: whether the new class should be readonly or not? I tend to make
it readonly, but I'm still not sure (since
this class can be used as a Builder).
Regards,
Máté
Hi
Therefore I'm officially opening its discussion.
You forgot to link the RFC. I'm doing that here:
https://wiki.php.net/rfc/query_params
Best regards
Tim Düsterhus
Hi
Other than the API itself, the proposal doesn't have many questions, except
one thing: whether the new class should be readonly or not? I tend to make
it readonly, but I'm still not sure (since
this class can be used as a Builder).
I don't think it is as necessary to make it readonly as it is with the
URI classes, since the query parameters class is likely to be used for
“local modification” and is less likely to be passed around as a value
object. If efficient copy-on-write is possible internally, then making
it readonly is the safer choice, of course.
With regard to the RFC text, I've given it a first quick skim and have
the following notes:
- Within the "Relation to the query component" section.
The $uri = new Uri\Rfc3986\Uri("https://example.com?foo=a b"); example
is invalid. An URI containing a space is not valid and the Uri class
will throw an exception. Can you double-check that part of the RFC?
Without looking it up, I suspect that the space should be %20 for RFC3986?
- Within the "Modification" section:
“Neither append(), nor set() do any percent-encoding or decoding of
their arguments.”
This should explain that during stringification, the % will be encoded
as %25. Thus the example will result in:
foo%255B%255D=ab%2563&bar%255B%255D=de%2566
- Within the "Modification" section:
“Finally, sort() sorts the query parameter list alphabetically:”
It is implied by the example, but should be spelled out explicitly:
Parameters with the same name should keep their original order and
should not be sorted by value.
- Within the "Supported types" section.
“The above conversion rules work for both UriQueryParams and
UrlQueryParams.”
and the examples also use Uri\Rfc3986\UriQueryParams /
Uri\WhatWg\UrlQueryParams.
This seems like an oversight from the original implementation, since
there are no longer separate classes. Can you double-check the entire
section?
- Within the "Array API" section.
This also uses Uri\Rfc3986\UriQueryParams in the examples.
- Within the "RFC Impact" section.
The ecosystem impact can probably be “None”.
I'll give this an in-depth read later, when the obvious mistakes are fixed.
Best regards
Tim Düsterhus
Hey Tim,
Thank you for the help, I forgot to link the RFC indeed.
Let me answer your questions:
I don't think it is as necessary to make it readonly as it is with the
URI classes, since the query parameters class is likely to be used for
“local modification” and is less likely to be passed around as a value
object. If efficient copy-on-write is possible internally, then making
it readonly is the safer choice, of course.
Yes, as far as I know from Ilija, it's possible to optimize modification
based on
refcount: if a readonly property which is to be modified has a refcount of
1 and
there's no weakref, it can be safely updated in-place.
With regard to the RFC text, I've given it a first quick skim and have
the following notes:
- Within the "Relation to the query component" section.
The
$uri = new Uri\Rfc3986\Uri("https://example.com?foo=a b");example
is invalid. An URI containing a space is not valid and theUriclass
will throw an exception. Can you double-check that part of the RFC?
Without looking it up, I suspect that the space should be%20for
RFC3986?
Nice catch, I'll review and fix the examples using "a b". It is indeed
incorrect, and %20
should be used instead. as you said.
- Within the "Supported types" section.
“The above conversion rules work for both UriQueryParams and
UrlQueryParams.”and the examples also use Uri\Rfc3986\UriQueryParams /
Uri\WhatWg\UrlQueryParams.This seems like an oversight from the original implementation, since
there are no longer separate classes. Can you double-check the entire
section?
I intentionally haven't updated this section yet: because there's important
info about
the difference between how WHATWG specification and uriparser handles null,
so I didn't want to "lose" this (I know it remains in the changelog).
- Within the "Array API" section.
This also uses Uri\Rfc3986\UriQueryParams in the examples
Same situation as above, but basically the whole section is a big question
mark:
I couldn't come up with a good enough idea how arrays could be handled
efficiently and ergonomically, and in the same time, how to (mostly) conform
to WHATWG URL.
Some of my problems:
- How should $params->get("foo[bar]") behave? Should we try to parse the
supplied key as an array and return the
nested item if it exists? I suppose, we would internally store the query
params as an array (just like how $_GET does),
so then we would definitely need parsing in order to be able to return item
"bar" stored within array "foo". - However, what should a $params->append("foo[bar]") call do when the query
param "foo" is already added with a string value?
Should it modify the existing item to be an array (but then how?) or we
should just add "foo[bar]" to the query params as-is? - How to recompose list parameters? Ideally, we should not append [] to
these parameters (e.g.
"list=1&list=2" vs "list[]=1&list[]=2"), and actually, it is not needed for
arrays in the root level, but as far as I understood,
it's a must-have for nested lists (e.g. "map[list]=1&map[list]=2" vs.
map[list[]]=1&map[list[]]=2), otherwise the representation
would be ambiguous for lists with 1 item: "map[list]=1" would mean that the
map has a list item with a value of "1". So
we either always have to use the [] suffix for lists, or we should make a
distinction between root-level and nested lists.
I know, most of my questions revolve around parsing the $name parameter,
but I would really want to minimize the number
of times when the $name param has to be parsed run-time so that the
performance is comparable to native array key
lookups. I don't have any idea how much the parsing would cost though...
So at this point, coming up with a sensible plan is needed, and I can fix
most inconsistencies in the RFC afterwards.
Regards,
Máté
Hey Máté,
Nice work, I really appreciate the effort! While adopting Uri I stumbled upon this missing feature as well.
In the examples you mention the usage of parse and toString. I assume there is no “default” and this is a placeholder for one of the given implementations. But it’s perhaps better to pick one of for the examples as it suggests it would also exist.
When it comes to the implementation I believe it would be nice to also have a getKeys() method. That would only return the keys.
You do mention the focus is to move away from $_GET. I like the idea, but implementing QueryString would still require something like the following:
Uri\QueryParams::fromArray($_GET);
If we really want to make $_GET obsolete, wouldn’t it be nice to have a fromRequest() which would directly parse $_GET or use $_SERVER["QUERY_STRING”].
As for whether or not this should be a readonly class. I think it should be. The class itself is following various specs. And since all other classes in the same namespace are also readonly, it should follow the same principle (also for consistency) I believe.
Regards,
Jordi
Hey Everyone,
As I mentioned in a previous email of mine (https://externals.io/message/129486#130077),
I recently separated the query parameter handling sub-proposal from https://wiki.php.net/rfc/uri_followup into its own RFC because it was way too complex.
Therefore I'm officially opening its discussion.After the separation, I reworked the proposal quite a lot: the single biggest change
is that now, only a single class would be added: Uri\QueryParams instead of both an RFC 3986
and a WHATWG URL compatible implementation.The focus of the RFC is now to move away from the usage of the $_GET superglobal,
which goal comes with two additional expectations:
- the new implementation should have comparable performance to $_GET
- the new implementation should support most capabilities of $_GET (e.g. arrays)
The first one is probably straightforward to achieve, the latter one has fundamental
problems: PHP's feature set (mostly: array support) is not compatible with the WHATWG URL,
so some behavior likely wouldn't comply with this specification. That's why the RFC
still has some TBD parts (e.g. Array API), or some contradicting info related to some getters' and setters' signature/behavior.Other than the API itself, the proposal doesn't have many questions, except one thing: whether the new class should be readonly or not? I tend to make it readonly, but I'm still not sure (since
this class can be used as a Builder).Regards,
Máté
Hi Jordi,
In the examples you mention the usage of parse and toString. I assume
there is no “default” and this is a placeholder for one of the given
implementations. But it’s perhaps better to pick one of for the examples as
it suggests it would also exist.
Ah, thanks for the info! Most of these were omission in fact, as I forgot
to update the original method names after I unified the two QueryParams
classes.
When it comes to the implementation I believe it would be nice to also
have agetKeys()method. That would only return the keys.You do mention the focus is to move away from $_GET. I like the idea, but
implementing QueryString would still require something like the following:Uri\QueryParams::fromArray($_GET);
If we really want to make
$_GETobsolete, wouldn’t it be nice to have a
fromRequest()which would directly parse$_GETor use
$_SERVER["QUERY_STRING”].
As far as I can see, Uri\QueryParams::fromArray($_GET); wouldn't be needed.
Do you have any use-case in mind where the suggested
"Uri\QueryParams::parseRfc3986($_SERVER["QUERY_STRING"]);" call wouldn't
work instead? I'm not against adding a fromRequest()
method though if others find it useful (I'm a bit neutral about it). And
I'd probably use a fromCurrentRequest() or a similar name to
highlight the fact that it uses the current request as source.
As for whether or not this should be a readonly class. I think it should
be. The class itself is following various specs. And since all other
classes in the same namespace are also readonly, it should follow the same
principle (also for consistency) I believe.
I think the most problematic case is to return a mutable class from a
readonly class (e.g. Uri\Rfc3986\Uri::getQueryParams()). So we should either
omit this method, or make Uri\QueryParams readonly indeed.
Regards,
Máté