Newsgroups: php.internals
Xref: news.php.net php.internals:129668
From: kocsismate90@gmail.com (Máté Kocsis)
To: ignace nyamagana butera
Cc: PHP Internals List
Date: Sun, 21 Dec 2025 13:13:46 +0100
Subject: Re: [PHP-DEV] [RFC] [Discussion] Followup Improvements for ext/uri
References: <83238ad3-c844-4457-dfb3-11321787e022@php.net>

Hi Ignace,

> When I talk about data mangling I am talking about this
>
> parse_str('foo.bar=baz', $params);
> var_dump($params); //returns ['foo_bar' => 'baz']

Sure! I just wanted to point out that name mangling will still be present
because of arrays, and this has some disadvantages, namely that adding
params and retrieving them won't be symmetric:

    $params = new Uri\Rfc3986\UriQueryParams()
        ->append("foo", [2 => "bar", 4 => "baz"]);

    var_dump($params->getFirst("foo"));    // NULL

One cannot be sure whether a parameter that was added can really be
retrieved later via a get*() method.

Another edge case:

    $params = new Uri\Rfc3986\UriQueryParams()
        ->append("foo", ["bar", "baz"])              // Value is a list, so "foo" is added without brackets
        ->append("foo", [2 => "qux", 4 => "quux"]);  // Value is a non-list array, so "foo" is added with brackets

    var_dump($params->toRfc3986String());  // foo=bar&foo=baz&foo%5B2%5D=qux&foo%5B4%5D=quux

    var_dump($params->getLast("foo"));     // Should it be "baz" or "quux"?
    var_dump($params->getAll("foo"));      // Should it only include the params with name "foo", or also "foo[]"?
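
For readers who want to try the asymmetry without the proposed class, here is a minimal sketch. Only parse_str() and array_is_list() are real PHP functions; sketchAppend() and sketchGetFirst() are hypothetical stand-ins written for this illustration, assuming that lists repeat the name, other arrays add the key in brackets, and lookups match the stored name exactly. They are not the RFC's API, and nested arrays are not handled.

```php
<?php
declare(strict_types=1);

// Existing behavior: parse_str() rewrites the dot in the name to an underscore.
parse_str('foo.bar=baz', $params);
var_dump($params); // the only key is "foo_bar"

// Hypothetical stand-in for the append() under discussion (assumption, not the RFC's API).
// Lists repeat the name; any other array adds the key in brackets.
function sketchAppend(array $pairs, string $name, mixed $value): array
{
    if (!is_array($value)) {
        $pairs[] = [$name, (string) $value];
        return $pairs;
    }

    $isList = array_is_list($value);
    foreach ($value as $key => $item) {
        $pairs[] = [$isList ? $name : "{$name}[{$key}]", (string) $item];
    }

    return $pairs;
}

// Hypothetical getFirst(): exact-name lookup over the stored pairs.
function sketchGetFirst(array $pairs, string $name): ?string
{
    foreach ($pairs as [$storedName, $storedValue]) {
        if ($storedName === $name) {
            return $storedValue;
        }
    }

    return null;
}

$pairs = sketchAppend([], 'foo', [2 => 'bar', 4 => 'baz']);

var_dump(sketchGetFirst($pairs, 'foo'));    // NULL: nothing is stored under the plain name
var_dump(sketchGetFirst($pairs, 'foo[2]')); // string(3) "bar"
```

Under these assumptions the caller has to know the mangled name ("foo[2]") to read back what was appended as "foo", which is the lack of symmetry described above.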

And of course this behavior also makes the implementation incompatible
with the WHATWG URL specification, although I do think this part of the
specification is far too underspecified and vague, so URLSearchParams
doesn't seem very usable in practice...

So an idea I'm now pondering is to keep the append() and set() methods
compatible with WHATWG: they would only support scalar values, so the
param name would never be mangled. Extra appendArray() and setArray()
methods could then be added that would possibly mangle the param names,
and they would only accept arrays. This solution would hopefully result in
slightly less surprising behavior: one could immediately know whether a
previously added parameter is really retrievable (when append() or set()
was used), or whether extra checks may be needed (when appendArray() or
setArray() was used).

> This to me should yield the same result as ->append('foo', null); as the
> array construct is only indicative of a repeating parameter name
> if there is no repeat then it means no data is attached to the name.

Alright, that seems the least problematic solution indeed, and
http_build_query() also represents empty arrays just like null values
(by omitting them).

So in your implementation it would mean:

> allowed type: null, int, float, string, boolean, and Backed Enum (to mimic
> json_encode and PHP8.4+ behaviour)
> arrays with values containing valid allowed type or array. are also
> supported to allow complex type support.
>
> Any other type (object, resource, Pure Enum) are disallowed they should
> throw a TypeError

+1

> Maybe in the future scope of this RFC or in this RFC depending on how you
> scope the RFC you may introduce an Interface which will allow serializing
> objects using a representation that
> follows the described rules above. Similar to what the
> JsonSerializable interface is for json_encode.

Hm, good idea!
I'm not particularly interested in this feature, but I agree it's a good
way to add support for objects.

> Last but not least, all this SHOULD not affect how http_build_query works.
> The function should never have been modified IMHO so it should be left
> untouched by all this except if we allow it
> to opt-in the behaviour once the interface is approved and added to PHP.

+1

Regards,
Máté