Newsgroups: php.internals
Date: Sat, 6 Dec 2025 12:45:48 +0100
Subject: Re: [PHP-DEV] [RFC] [Discussion] Followup Improvements for ext/uri
From: kocsismate90@gmail.com (Máté Kocsis)
To: Larry Garfield
Cc: php internals
References: <9873b03e-1260-44b3-8285-af9511d8766e@app.fastmail.com>
In-Reply-To: <9873b03e-1260-44b3-8285-af9511d8766e@app.fastmail.com>

Hi Larry,

> - I really, really hate the "set" prefix on all the methods. It's a
> builder object, surely the "set" is implied?

I like that the "set" prefix keeps all setters grouped together. For
example, if we had to add some extra methods (like authority()), then
without the "set" prefix the build() method would appear between
authority() and fragment() in IDE autocomplete lists.

> - It really feels like there's an interface to extract here from the
> Url/UriBuilder classes. There's literally only one type-specific method
> (build()).

It's the very same thing that we discussed at great length last time...
What would be the purpose of the interface? To make the two builders
interchangeable? But they produce fundamentally different URIs. Even if we
don't include build(), we still have some differences between the two
implementations:

- components are different: even though most components match, RFC 3986
  and WHATWG URL still differ; notably, the userinfo component is only
  acknowledged by RFC 3986, whereas the username and password components
  are only modifiable in the case of WHATWG URL
- validation rules are different: each implementation has its own
  validation rules for each component.
  I'll need to clarify this in the RFC, but some purely syntax-based
  validations should be performed during the setter/wither calls (e.g. a
  scheme cannot contain "%"), while the ones that rely on the "global
  state" (e.g. the host is required when the userinfo is set) should be
  performed by the build() method in order to avoid the temporal coupling
  I mentioned in the RFC.

> - UriQueryParams::hasWithValue(), could that be just hasValue()? You
> still need to specify the key anyway, and that's self-evident from the
> signature.

Yes, I was already considering updating this name, but I was sure that
someone (you, with 99% confidence) would point this out and suggest a
better one. I agree that hasValue() is probably the right choice, although
hasNameAndValue() would be the most technically correct name...

> - There's a count() method, so shouldn't Ur{i|l}QueryParams implement
> Countable?

Yes, it can. I thought that implementing Countable on its own (without
IteratorAggregate) was less useful, so I omitted it. Ignace suggested
another approach that would allow implementing IteratorAggregate: if that
happens, then I'm totally fine with also implementing Countable.

> - As above, there really is an interface lurking in UriQueryParams...

I have the same comment as for the builder, with one small caveat: as far
as I know the implementations, the biggest differences between
UriQueryParams and UrlQueryParams are how they parse the input and how
they percent-encode it during recomposition. The rest is fairly similar,
for now at least.

I even had a brief moment when I thought that merging the two
implementations into one was a good idea, but I came to the conclusion
that it isn't, so that the two classes have the possibility to evolve
separately if needed. So I'd follow the path of the original URI/URL
debate and would not try to make the two implementations interoperable.
They are only interoperable on the surface.
:)

> - Why both Uri getRawQueryParams() and getQueryParams()? It looks like
> they would return the same value, no? (If not, that should be explained.)

This is actually already briefly explained in the RFC (but thanks to
Ignace, who also described this part):

> The difference between Uri\Rfc3986\Uri::getRawQueryParams() and
> Uri\Rfc3986\Uri::getQueryParams() is that the former one passes the “raw”
> (non-normalized) query string as an input when instantiating
> Uri\Rfc3986\Uri\UriQueryParams.

> - The sort() method... should it take an optional user callback, or do we
> lock people in to lexical ordering?

Only WHATWG URL specifies its behavior, and it uses basic alphanumeric
sorting. Even though nothing stops us from implementing fancier sorting, I
think it's already fine as-is. sort() can be used to guarantee that the
query components are in a deterministic order, and I think that's all that
we need.

> - It would be quite convenient if set() and append() returned $this,
> allowing them to be chained.

That's fine for me. WHATWG URL specifies their return type as void, so I
went with that, but there's nothing wrong with returning $this.

> - The fromArray() logic is... totally weird and unexpected and I hate it.
> :-) Why can't you support repeated query parameters using nested arrays
> rather than gumming up all calls with a wonky format?

Do you mean something like ["foo" => [0, 1, 2, 3]]? I think it is indeed
possible to implement what you suggested. Once the basic structure of the
proposal settles a little bit, I'll update the implementation, and I'll
try to work out a sensible behavior for arrays/objects.

> - It's not clear how one would start a new query from scratch, with the
> private constructor. There doesn't seem to be a justification for the
> private. I can't see why new UriQueryParams()->set('foo', 'bar') is a
> bad thing.
>

Yes, starting from scratch is only possible by using
UriQueryParams::fromArray([]) or UriQueryParams::parse(""). But I don't
have any fundamental issue with adding support for the empty constructor
variant.

> - Url::isSpecial() Could we come up with a better name here? "Special"
> could mean anything unless you know the RFC; it feels like "real escape
> string" all over again.

"Special URL" is indeed the technical term that WHATWG URL uses. The RFC
explains the concept briefly:

> The WHATWG URL specification defines some special schemes (http, https,
> ftp, file, ws, wss), which have distinct parsing and serialization rules.

I don't have any issues with the current name; the only alternative I can
imagine is Uri\WhatWg\Url::isSpecialScheme().

Regards,
Máté
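To illustrate the validation split mentioned in the builder discussion (syntax checks in the setters, "global state" checks in build()), here is a minimal sketch. SketchUriBuilder and its method bodies are assumptions made up for this illustration; this is not the builder proposed in the RFC, and the real validation rules are likely more extensive.

```php
<?php
declare(strict_types=1);

// Hypothetical builder, NOT the API proposed in the RFC: it only shows
// where each kind of validation could run.
final class SketchUriBuilder
{
    private ?string $scheme = null;
    private ?string $userInfo = null;
    private ?string $host = null;

    public function setScheme(?string $scheme): static
    {
        // Purely syntactic check: it depends only on the value being set.
        // RFC 3986: scheme = ALPHA *( ALPHA / DIGIT / "+" / "-" / "." )
        if ($scheme !== null && preg_match('/^[A-Za-z][A-Za-z0-9+.-]*$/', $scheme) !== 1) {
            throw new InvalidArgumentException("Invalid scheme: {$scheme}");
        }
        $this->scheme = $scheme;
        return $this;
    }

    public function setUserInfo(?string $userInfo): static
    {
        // No host check here: the host may legitimately be set later.
        $this->userInfo = $userInfo;
        return $this;
    }

    public function setHost(?string $host): static
    {
        $this->host = $host;
        return $this;
    }

    public function build(): string
    {
        // Cross-component ("global state") check, deferred to build() so
        // that the order of the setter calls does not matter.
        if ($this->userInfo !== null && $this->host === null) {
            throw new LogicException('A host is required when userinfo is set');
        }
        $uri = $this->scheme !== null ? $this->scheme . ':' : '';
        if ($this->host !== null) {
            $uri .= '//' . ($this->userInfo !== null ? $this->userInfo . '@' : '') . $this->host;
        }
        return $uri;
    }
}

// The userinfo is set before the host, and nothing fails until build().
echo (new SketchUriBuilder())
    ->setUserInfo('user')
    ->setHost('example.com')
    ->setScheme('https')
    ->build(), "\n"; // prints "https://user@example.com"
```

With this arrangement, setScheme('ht%tp') fails immediately, whereas a missing host is only reported by build(), which is what avoids the temporal coupling between setters.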
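As a rough sketch of the nested-array idea raised for fromArray() (["foo" => [0, 1, 2, 3]] standing for a repeated parameter), the expansion could look like the following. The helper names sketchPairsFromArray() and sketchQueryString() are hypothetical and not part of ext/uri, and rawurlencode() is only a stand-in: the actual classes may percent-encode differently.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper, not part of ext/uri: expands nested arrays into
 * repeated name/value pairs.
 *
 * @param array<string, scalar|list<scalar>> $params
 * @return list<array{string, string}>
 */
function sketchPairsFromArray(array $params): array
{
    $pairs = [];
    foreach ($params as $name => $value) {
        // An array value means "repeat this name once per element".
        foreach (is_array($value) ? $value : [$value] as $item) {
            $pairs[] = [(string) $name, (string) $item];
        }
    }
    return $pairs;
}

/**
 * Hypothetical serializer for the pairs produced above.
 *
 * @param list<array{string, string}> $pairs
 */
function sketchQueryString(array $pairs): string
{
    // rawurlencode() applies RFC 3986 style percent-encoding (assumption:
    // the real implementations may use different encode sets).
    return implode('&', array_map(
        static fn (array $pair): string => rawurlencode($pair[0]) . '=' . rawurlencode($pair[1]),
        $pairs
    ));
}

echo sketchQueryString(sketchPairsFromArray(['foo' => [0, 1, 2, 3], 'bar' => 'baz'])), "\n";
// prints "foo=0&foo=1&foo=2&foo=3&bar=baz"
```

How deeper nesting or objects should behave is left open here, as it is in the discussion itself.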