From: nyamsprod@gmail.com (ignace nyamagana butera)
Date: Sun, 21 Dec 2025 16:51:18 +0100
Subject: Re: [PHP-DEV] [RFC] [Discussion] Followup Improvements for ext/uri
To: Máté Kocsis
Cc: PHP Internals List
Newsgroups: php.internals
References: <83238ad3-c844-4457-dfb3-11321787e022@php.net>

On Sun, Dec 21, 2025 at 1:13 PM Máté Kocsis <kocsismate90@gmail.com> wrote:

> Hi Ignace,
>
>> When I talk about data mangling I am talking about this

>> parse_str('foo.bar=baz', $params);
>> var_dump($params); // returns ['foo_bar' => 'baz']
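
The mangling quoted above can be reproduced with plain PHP. A minimal sketch, relying only on the documented behaviour of parse_str(); the variable names $dotted and $collided are just for illustration:

```php
<?php
// parse_str() rewrites a dot in a parameter name to an underscore.
parse_str('foo.bar=baz', $dotted);

// Because of that rewrite, two distinct names can collapse into one key,
// and the later value silently overwrites the earlier one.
parse_str('foo.bar=baz&foo_bar=qux', $collided);

var_dump($dotted);   // the key is "foo_bar", not "foo.bar"
var_dump($collided); // a single "foo_bar" entry holding "qux"
```

The second call is the reason the mangling loses data: the original "foo.bar" pair can no longer be told apart from "foo_bar".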

> Sure! I just wanted to point out that name mangling will still be present due to arrays, and this will have some disadvantages,
> namely that adding params and retrieving them won't be symmetric:

> $params = new Uri\Rfc3986\UriQueryParams()
>     ->append("foo", [2 => "bar", 4 => "baz"]);
>
> var_dump($params->getFirst("foo")); // NULL

> One cannot be sure if a parameter that was added can really be retrieved later via a get*() method.
>
> Another edge case:

> $params = new Uri\Rfc3986\UriQueryParams()
>     ->append("foo", ["bar", "baz"])             // Value is a list, so "foo" is added without brackets
>     ->append("foo", [2 => "qux", 4 => "quux"]); // Value is an array, so "foo" is added with brackets
>
> var_dump($params->toRfc3986String()); // foo=bar&foo=baz&foo%5B2%5D=qux&foo%5B4%5D=quux
>
> var_dump($params->getLast("foo")); // Should it be "baz" or "quux"?
> var_dump($params->getAll("foo"));  // Should it only include the params with name "foo", or also "foo[]"?

> And of course this behavior also makes the implementation incompatible with the WHATWG URL specification, although I do think this part of
> the specification is way too underspecified and vague, so URLSearchParams doesn't seem very usable in practice...

> So an idea that I'm now pondering is to keep the append() and set() methods compatible with WHATWG: they would only support
> scalar values, so the param name wouldn't be mangled. And extra appendArray() and setArray() methods could be added that would possibly
> mangle the param names, and they would only support passing arrays. This solution would hopefully result in slightly less surprising
> behavior: one could immediately know if a previously added parameter is really retrievable (when append() or set() was used), or whether extra
> checks may be needed (when using appendArray() or setArray()).

>> This to me should yield the same result as ->append('foo', null); as the array construct is only indicative of a repeating parameter name:
>> if there is no repeat, then no data is attached to the name.

> Alright, that seems the least problematic solution indeed, and http_build_query() also represents empty arrays just like null values (omitting them).
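
The http_build_query() behaviour mentioned here is easy to check. A small sketch, assuming nothing beyond the standard function; the array keys are made up for the example:

```php
<?php
// Null values and empty arrays produce no output at all,
// while a non-empty list is serialized with bracketed, indexed names.
$query = http_build_query(
    [
        'kept'  => 'x',
        'null'  => null,
        'empty' => [],
        'list'  => ['bar', 'baz'],
    ],
    '',  // no numeric prefix
    '&'  // explicit separator, independent of arg_separator.output
);

var_dump($query); // kept=x&list%5B0%5D=bar&list%5B1%5D=baz
```

Neither "null" nor "empty" appears in the result, which is the precedent for treating an empty array like a null value.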

>> So in your implementation it would mean:
>>
>> allowed types: null, int, float, string, boolean, and Backed Enum (to mimic json_encode and PHP 8.4+ behaviour);
>> arrays whose values are of an allowed type, or are themselves arrays, are also supported to allow complex type support.
>>
>> Any other type (object, resource, Pure Enum) is disallowed and should throw a TypeError.

> +1

>> Maybe in the future scope of this RFC, or in this RFC depending on how you scope it, you may introduce an interface which will allow serializing objects using a representation that
>> follows the rules described above, similar to what the JsonSerializable interface is for json_encode.

> Hm, good idea! I'm not particularly interested in this feature, but I agree it's a good way to add support for objects.

>> Last but not least, all this SHOULD NOT affect how http_build_query works. The function should never have been modified IMHO, so it should be left untouched by all this, unless we allow it
>> to opt in to the behaviour once the interface is approved and added to PHP.

> +1
>
> Regards,
> Máté


Hi Máté,

> And extra appendArray() and setArray() methods could be added that would possibly
> mangle the param names, and they would only support passing arrays. This solution would hopefully result in slightly less surprising
> behavior: one could immediately know if a previously added parameter is really retrievable (when append() or set() was used), or whether extra
> checks may be needed (when using appendArray() or setArray()).

I believe adding appendArray and setArray is the way forward, as the bracket addition, and thus the mangling, is really a PHP specificity that we MUST keep to avoid a hard BC break.
I would even go a step further and add getArray and hasArray methods, which would lead to the following API:

$params = (new Uri\Rfc3986\UriQueryParams())
    ->append("foo", ["bar", "baz"])        // Value is a list, so "foo" is added without brackets
    ->appendArray("foo", ["qux", "quux"]); // Value is a list, using PHP serialization "foo" is added with brackets

var_dump($params->toRfc3986String()); // foo=bar&foo=baz&foo%5B0%5D=qux&foo%5B1%5D=quux

$params->hasArray('foo'); // returns true
$params->getArray("foo"); // returns ["qux", "quux"]

$params->has('foo');      // returns true
$params->getFirst("foo"); // returns "bar"
$params->getLast("foo");  // returns "baz"
$params->getAll('foo');   // returns ["bar", "baz"]
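
To make these semantics concrete, here is a rough userland sketch of the split between plain and array accessors. Everything in it is an assumption for illustration only: the class name QueryParamsSketch is hypothetical, values are limited to strings, arrays are flat (no nesting), and rawurlencode() stands in for whatever encoding the extension would apply. It is not the ext/uri implementation.

```php
<?php
declare(strict_types=1);

// Hypothetical userland model of the proposed API; NOT the ext/uri class.
final class QueryParamsSketch
{
    /** @var list<array{0: string, 1: string}> ordered name/value pairs */
    private array $pairs = [];

    /** Adds one pair per value; the name is never mangled. */
    public function append(string $name, string|array $value): self
    {
        foreach ((array) $value as $item) {
            $this->pairs[] = [$name, (string) $item];
        }
        return $this;
    }

    /** Adds one pair per element, using PHP bracket notation for the name. */
    public function appendArray(string $name, array $values): self
    {
        foreach ($values as $key => $item) {
            $this->pairs[] = [$name . '[' . $key . ']', (string) $item];
        }
        return $this;
    }

    /** @return list<string> values stored under the exact, unbracketed name */
    public function getAll(string $name): array
    {
        $values = [];
        foreach ($this->pairs as [$key, $value]) {
            if ($key === $name) {
                $values[] = $value;
            }
        }
        return $values;
    }

    public function has(string $name): bool
    {
        return $this->getAll($name) !== [];
    }

    public function getFirst(string $name): ?string
    {
        return $this->getAll($name)[0] ?? null;
    }

    public function getLast(string $name): ?string
    {
        $all = $this->getAll($name);
        return $all === [] ? null : $all[count($all) - 1];
    }

    /** @return array<int|string, string> values stored under "name[key]" */
    public function getArray(string $name): array
    {
        $result = [];
        $prefix = $name . '[';
        foreach ($this->pairs as [$key, $value]) {
            if (str_starts_with($key, $prefix) && str_ends_with($key, ']')) {
                // Strip "name[" and the trailing "]" to recover the array key.
                $result[substr($key, strlen($prefix), -1)] = $value;
            }
        }
        return $result;
    }

    public function hasArray(string $name): bool
    {
        return $this->getArray($name) !== [];
    }

    public function toRfc3986String(): string
    {
        return implode('&', array_map(
            static fn (array $pair): string => rawurlencode($pair[0]) . '=' . rawurlencode($pair[1]),
            $this->pairs
        ));
    }
}

$sketch = (new QueryParamsSketch())
    ->append('foo', ['bar', 'baz'])
    ->appendArray('foo', ['qux', 'quux']);

var_dump($sketch->toRfc3986String()); // foo=bar&foo=baz&foo%5B0%5D=qux&foo%5B1%5D=quux
var_dump($sketch->getAll('foo'));     // only the unbracketed values
var_dump($sketch->getArray('foo'));   // only the bracketed values
```

The point of the sketch is that the two families of methods never see each other's entries, so a value added with append() is always retrievable with get*(), and one added with appendArray() with getArray().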
Hope this makes sense 
Regards,
Ignace
