Newsgroups: php.internals
From: nyamsprod@gmail.com (ignace nyamagana butera)
Date: Tue, 16 Dec 2025 21:38:18 +0100
Subject: Re: [PHP-DEV] [RFC] [Discussion] Followup Improvements for ext/uri
To: Juris Evertovskis, PHP Internals List
References: <83238ad3-c844-4457-dfb3-11321787e022@php.net>

On Tue, Dec 16, 2025 at 7:14 PM Juris Evertovskis <juris@glaive.pro> wrote:

> On 2025-12-16 09:53, ignace nyamagana butera wrote:
>
> > Since we will be dealing with arrays the following rules could be updated
> > when parsing the string using PHP behaviour:
> >
> >   - "&a" should be converted to ['a' => null]
>
> Hey Ignace,
>
> In practice valueless arguments like `?debug` are most often "flags" or
> "booleans" and their presence implies truthiness.
>
> Do you think it would be wrong or confusing to have it converted to
> `['debug' => true]`?
>
> I'm worried that `['a' => null]` would not be that handy since both
> `$params['a']` and `isset($params['a'])` would return falsy which would
> likely be opposite to the intended value.
>
> BR,
> Juris


Hi Juris,

> Do you think it would be wrong or confusing to have it converted to `['debug' => true]`?

Yes, IMHO it would be wrong, because flag parameters or booleans are converted to ['debug' => 1].
['debug' => null] expresses the presence of the name in the pair and the absence of a value associated with it.
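
To make the concern and the answer concrete, here is a small sketch of how a null value behaves in a PHP array. The `$params` array is an assumed parse result for `?debug&foo=bar&baz=` under the proposed rule, not the output of any existing function.

```php
<?php
declare(strict_types=1);

// Assumed parse result for "?debug&foo=bar&baz=" under the proposed rule.
$params = ['debug' => null, 'foo' => 'bar', 'baz' => ''];

var_dump(isset($params['debug']));              // bool(false): isset() treats null as unset
var_dump(array_key_exists('debug', $params));   // bool(true): the name is present
var_dump(array_key_exists('other', $params));   // bool(false): the name is absent
var_dump($params['debug'] === $params['baz']);  // bool(false): null differs from ''
```

So presence would have to be checked with array_key_exists() rather than isset(), which is the trade-off being discussed here.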
Let's see how it is currently done.

The WHATWG URL living standard does the following:

    let url = new URL('https://example.com?debug&foo=bar&debug=');
    console.log(
        url.searchParams.toString(), // returns 'debug=&foo=bar&debug='
    );

The valueless pair gets converted to ['debug' => '']. The round trip does not preserve the query string as is, but all key/value pairs (tuples) are present.

In PHP you currently have the following behaviour:

Example 1:

    parse_str('debug&foo=bar&debug=', $params);
    var_dump($params, http_build_query($params));
    // $params is ['debug' => '', 'foo' => 'bar']
    // after the round trip you get 'debug=&foo=bar'

Example 2:

    parse_str('debug&foo=bar&debug=1', $params);
    var_dump($params, http_build_query($params));
    // $params is ['debug' => '1', 'foo' => 'bar']
    // after the round trip you get 'debug=1&foo=bar'

So you lose data, and the pairs may end up in a different order than in the original string:

- parse_str first converts the valueless debug into ['debug' => ''];
- parse_str then overwrites that value with the later debug pair (this may be a security concern if you need to hash/validate your query string).
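
A minimal sketch of why the hashing case matters, assuming a query string signed with an HMAC. The secret and the variable names are hypothetical; only parse_str(), http_build_query(), hash_hmac() and hash_equals() are real functions.

```php
<?php
declare(strict_types=1);

// Hypothetical shared secret, for illustration only.
$secret = 'not-a-real-key';

$original = 'debug&foo=bar&debug=1';

// Round trip through the existing functions.
parse_str($original, $params);
$rebuilt = http_build_query($params);

// Sign both strings the same way.
$signedOriginal = hash_hmac('sha256', $original, $secret);
$signedRebuilt = hash_hmac('sha256', $rebuilt, $secret);

var_dump($rebuilt);                                     // "debug=1&foo=bar"
var_dump(hash_equals($signedOriginal, $signedRebuilt)); // bool(false)
```

Any check computed over the original string fails once the query has gone through the round trip.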

Since IMHO interoperability and security are important, you should prefer an algorithm that preserves the original query.
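
One possible shape for such an algorithm, as a rough sketch only: keep an ordered list of [name, value] tuples, with null for a pair that has no "=". The function names `parseQueryPairs` and `buildQueryFromPairs` are hypothetical, and the sketch deliberately ignores details such as "+" handling and PHP's bracket syntax.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: split a query string into ordered [name, value] tuples.
 * A pair without "=" gets a null value; a pair ending in "=" gets ''.
 *
 * @return list<array{0: string, 1: ?string}>
 */
function parseQueryPairs(string $query): array
{
    $pairs = [];
    if ($query === '') {
        return $pairs;
    }

    foreach (explode('&', $query) as $segment) {
        $parts = explode('=', $segment, 2);
        $name = rawurldecode($parts[0]);
        // isset() is true for '' here, so "debug=" and "debug" stay distinct.
        $value = isset($parts[1]) ? rawurldecode($parts[1]) : null;
        $pairs[] = [$name, $value];
    }

    return $pairs;
}

/**
 * Hypothetical helper: rebuild the string, emitting no "=" for null values.
 *
 * @param list<array{0: string, 1: ?string}> $pairs
 */
function buildQueryFromPairs(array $pairs): string
{
    $segments = [];
    foreach ($pairs as [$name, $value]) {
        $segments[] = $value === null
            ? rawurlencode($name)
            : rawurlencode($name) . '=' . rawurlencode($value);
    }

    return implode('&', $segments);
}

$pairs = parseQueryPairs('debug&foo=bar&debug=');
echo buildQueryFromPairs($pairs), PHP_EOL; // debug&foo=bar&debug=
```

Because nothing is keyed by name, duplicates and ordering survive, and the null/'' distinction is what lets the round trip reproduce the input.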

The proposed solution is already in use, for instance in League/Uri or in Guzzle:

    use GuzzleHttp\Psr7\Uri;
    use GuzzleHttp\Psr7\Utils;

    echo Uri::withQueryValues(Utils::uriFor('https://example.com'), [
        'debug' => null,
        'foo' => 'bar',
        'baz' => '',
    ]), PHP_EOL;
    // https://example.com?debug&foo=bar&baz=

Because Guzzle uses an associative array, the debug variable can only appear once, but null and the empty string are treated differently.
This improves interoperability with other languages, and you no longer have data loss or unexpected query re-ordering.

Last but not least, the Query objects proposed by Máté all:
- expose a `has` method, which always tells whether the key is present regardless of its value, equivalent to array_key_exists;
- provide a way to have the same parameter appear multiple times in the query string.
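
To illustrate those two properties, here is a hypothetical userland stand-in. The class `QueryPairsSketch` and its `getAll` method are my own assumptions for the sake of the example and are NOT the API from the RFC.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical stand-in, not the RFC's Query class: it only mimics
 * "presence regardless of value" and "same name several times".
 */
final class QueryPairsSketch
{
    /** @var list<array{0: string, 1: ?string}> ordered name/value tuples */
    private array $pairs = [];

    public function append(string $name, ?string $value): self
    {
        $this->pairs[] = [$name, $value];

        return $this;
    }

    /** True when the name occurs at all, even with a null or empty value. */
    public function has(string $name): bool
    {
        foreach ($this->pairs as [$candidate]) {
            if ($candidate === $name) {
                return true;
            }
        }

        return false;
    }

    /** @return list<?string> every value recorded for the name, in order */
    public function getAll(string $name): array
    {
        $values = [];
        foreach ($this->pairs as [$candidate, $value]) {
            if ($candidate === $name) {
                $values[] = $value;
            }
        }

        return $values;
    }
}

$query = (new QueryPairsSketch())
    ->append('debug', null)
    ->append('foo', 'bar')
    ->append('debug', '');

var_dump($query->has('debug'));    // bool(true)
var_dump($query->has('missing'));  // bool(false)
var_dump($query->getAll('debug')); // two entries: NULL, then ""
```
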

So IMHO it is an improvement to also allow the distinction between null and the empty string, so we can finally write in PHP:

    echo (new Uri\Rfc3986\Query())
        ->append('debug', null)
        ->append('foo', 'bar')
        ->append('debug', '')
        ->toRfc3986String();
    // debug&foo=bar&debug=

Best regards,
Ignace