Newsgroups: php.internals
Xref: news.php.net php.internals:129733
From: nyamsprod@gmail.com (ignace nyamagana butera)
Date: Sat, 3 Jan 2026 14:05:43 +0100
Subject: Re: [PHP-DEV] [RFC] [Discussion] Followup Improvements for ext/uri
To: Máté Kocsis
Cc: PHP Internals List
References: <83238ad3-c844-4457-dfb3-11321787e022@php.net>

On Sun, Dec 21, 2025 at 4:51 PM ignace nyamagana butera wrote:

> On Sun, Dec 21, 2025 at 1:13 PM Máté Kocsis wrote:
>
>> Hi Ignace,
>>
>>> When I talk about data mangling I am talking about this
>>>
>>> parse_str('foo.bar=baz', $params);
>>> var_dump($params); //returns ['foo_bar' => 'baz']
>>
>> Sure! I just wanted to point it out that name mangling will still be
>> present due to arrays, and this will have some disadvantages,
>> namely that adding params and retrieving them won't be symmetric:
>>
>> $params = new Uri\Rfc3986\UriQueryParams()
>>     ->append("foo", [2 => "bar", 4 => "baz"]);
>>
>> var_dump($params->getFirst("foo")); // NULL
>>
>> One cannot be sure if a parameter that was added can really be retrieved
>> later via a get*() method.
>>
>> Another edge cases:
>>
>> $params = new Uri\Rfc3986\UriQueryParams()
>>     ->append("foo", ["bar", "baz"])             // Value is a list, so "foo" is added without brackets
>>     ->append("foo", [2 => "qux", 4 => "quux"]); // Value is an array, so "foo" is added with brackets
>>
>> var_dump($params->toRfc3986String()); // foo=bar&foo=baz&foo%5B2%5D=qux&foo%5B4%5D=quux
>>
>> var_dump($params->getLast("foo")) // Should it be "baz" or "quux"?
>> var_dump($params->getAll("foo")) // Should it only include the params with name "foo", or also "foo[]"?
>>
>> And of course this behavior also makes the implementation incompatible
>> with the WHATWG URL specification: although I do think this part of
>> the specification is way too underspecified and vague, so URLSearchParams
>> doesn't seem well-usable in practice...
>>
>> So an idea that I'm now pondering about is to keep the append() and set()
>> methods compatible with WHATWG: and they would only support
>> scalar values, therefore the param name wouldn't be mangled. And an extra
>> appendArray() and setArray() method could be added that would possibly
>> mangle the param names, and they would only support passing arrays. This
>> solution would hopefully result in a slightly less surprising
>> behavior: one could immediately know if a previously added parameter is
>> really retrievable (when append() or set() was used), or extra
>> checks may be needed (when using appendArray() or setArray()).
>>
>>> This to me should yield the same result as ->append('foo', null); as
>>> the array construct is only indicative of a repeating parameter name
>>> if there is no repeat then it means no data is attached to the name.
>>
>> Alright, that seems the least problematic solution indeed, and
>> http_build_query() also represents empty arrays just like null values
>> (omitting them).
>>
>>> So in your implementation it would mean:
>>>
>>> allowed type: null, int, float, string, boolean, and Backed Enum (to
>>> mimic json_encode and PHP8.4+ behaviour)
>>> arrays with values containing valid allowed type or array. are also
>>> supported to allow complex type support.
>>>
>>> Any other type (object, resource, Pure Enum) are disallowed they should
>>> throw a TypeError
>>>
>>
>> +1
>>
>>> Maybe in the future scope of this RFC or in this RFC depending on how
>>> you scope the RFC you may introduce an Interface which will allow
>>> serializing objects using a representation that
>>> follows the described rules above. Similar to what the
>>> JsonSerializable interface is for json_encode.
>>>
>>
>> Hm, good idea! I'm not particularly interested in this feature, but I
>> agree it's a good way to add support for objects.
>>
>>> Last but not least, all this SHOULD not affect how http_build_query works.
>>> The function should never have been modified IMHO so it should be left
>>> untouched by all this except if we allow it
>>> to opt-in the behaviour once the interface is approved and added to PHP.
>>>
>>
>> +1
>>
>> Regards,
>> Máté
>>
>
> Hi Máté,
>
>> And an extra appendArray() and setArray() method could be added that
>> would possibly mangle the param names, and they would only support
>> passing arrays. This solution would hopefully result in a slightly less
>> surprising behavior: one could immediately know if a previously added
>> parameter is really retrievable (when append() or set() was used), or
>> extra checks may be needed (when using appendArray() or setArray()).
>
> I believe adding the appendArray and setArray is the way forward as the
> bracket addition and thus mangling is really a PHP specificity that we MUST
> keep to avoid a hard BC break.
> I would even go a step further and add getArray and hasArray methods
> which will lead to the following API
>
> $params = (new Uri\Rfc3986\UriQueryParams())
>     ->append("foo", ["bar", "baz"])        // Value is a list, so "foo" is added without brackets
>     ->appendArray("foo", ["qux", "quux"]); // Value is a list, using PHP serialization "foo" is added with brackets
>
> var_dump($params->toRfc3986String()); // foo=bar&foo=baz&foo%5B0%5D=qux&foo%5B1%5D=quux
>
> $params->hasArray('foo'); //returns true
> $params->getArray("foo"); //returns ["qux", "quux"]
>
> $params->has('foo'); //returns true
> $params->getFirst("foo"); //returns "bar"
> $params->getLast("foo"); //returns "baz"
> $params->getAll('foo'); //returns ["bar", "baz"]
>
> Hope this makes sense
>
> Regards,
> Ignace
>

Hi Máté,

I have been playing around with your Query Param API and I have a couple
of questions:

Question 1) While I am not a proponent of adding getQueryParams to both
classes, even though I know the method exists in the WHATWG URL spec,
what I find strange is that the method may return null. To me this makes
for an awkward API where the user will always have to add some conditional
checks before using the method's return value. Why can't this be true?

$url = Uri\Rfc3986\Uri::parse('https://www.example.com/path/to/whatever');
$url->getQueryParams(); // should return an empty UriQueryParams instance

$url = Uri\Rfc3986\Uri::parse('https://www.example.com/path/to/whatever?');
$url->getQueryParams(); // should return UriQueryParams with a pair
// represented like this ['' => null] or like this ['', null]

This IMHO should also be the case for the UrlQueryParams instance.

Question 1-bis) I prefer having some extra named constructors on the
UrlQueryParams instead of having a getter on the Uri/Url classes. This
fully decouples the Ur(i|l)QueryParams from the Uri/Url classes and lets
the user opt in to the new API if needed. In case of errors/bugs etc...
only the QueryParams container bags would be affected ... not the Url/Uri
classes.

Question 2) I see you have

- UriQueryParams::fromArray,
- UriQueryParams::list,

If I read it correctly, this returns 2 array representations of the query?
My question is: shouldn't we have either a fromList named constructor
and/or a toArray which return both distinctive forms? This might confuse
the developer, who will have a hard time understanding which form is
which, when to use it, and which one can be used in which context to
instantiate a new instance.

Question 3) I wanted to know how the following code will be processed?

$query = 'a[]=foo&a[]=bar&a=qux';
parse_str($query, $result);
$result['a']; //returns "qux"

As seen in the example with parse_str, the full array notation is
overwritten and cannot be used/accessed. Will the getArray API still be
able to access the array data, or will it act like parse_str and skip the
array notation?

Best regards,
Ignace
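
A minimal sketch of the behaviour behind Question 3, for anyone who wants to run it. It contrasts parse_str() with a plain ordered list of name/value pairs. Note that parseQueryPairs() is a hypothetical helper written only for this illustration; it is not part of ext/uri or of the RFC, and it ignores details such as "+" decoding.

```php
<?php
declare(strict_types=1);

// Hypothetical helper (an assumption for illustration, not an ext/uri API):
// split a query string into an ordered list of [name, value] pairs
// without mangling or merging parameter names.
function parseQueryPairs(string $query): array
{
    $pairs = [];
    foreach (explode('&', $query) as $part) {
        // A part without "=" has a name but no value, represented as null.
        [$name, $value] = array_pad(explode('=', $part, 2), 2, null);
        $pairs[] = [
            rawurldecode($name),
            $value === null ? null : rawurldecode($value),
        ];
    }

    return $pairs;
}

$query = 'a[]=foo&a[]=bar&a=qux';

// parse_str(): the later scalar "a" overwrites the array built from "a[]".
parse_str($query, $result);
var_dump($result === ['a' => 'qux']);

// parse_str() also rewrites the dot in a name to an underscore.
parse_str('foo.bar=baz', $mangled);
var_dump($mangled === ['foo_bar' => 'baz']);

// The pair list keeps every pair, in order, under its original name.
$pairs = parseQueryPairs($query);
var_dump($pairs === [['a[]', 'foo'], ['a[]', 'bar'], ['a', 'qux']]);
```

With a pair list like this, the "a[]" entries are still present next to "a", which is the information a getArray-style accessor would need; whether the proposed API keeps them is exactly what Question 3 asks.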

