Newsgroups: php.internals
From: kocsismate90@gmail.com (Máté Kocsis)
Date: Fri, 13 Mar 2026 10:38:45 +0100
Subject: Re: [PHP-DEV] [RFC] [Discussion] Followup Improvements for ext/uri
To: ignace nyamagana butera
Cc: PHP Internals List
References: <83238ad3-c844-4457-dfb3-11321787e022@php.net>

Hi Ignace,

> I just re-read the RFC and I like the updates and precision you've brought
> to it here's my review:
> For the builders I have nothing more design wise to add this is already
> solid. I may nitpick on the *Builder::clear() method name I would have gone
> with *Builder::reset() but I presume other developers would go with clear.
> Other than that the public API is spot on.

I also like the reset() method name much more! So I'll update the RFC accordingly.

> For the Enum, my only concern is that they serve just as flags and their
> usage is tightly coupled to the Uri classes. I would add 2 static named
> constructors fromUrl and tryFromUrl just for completeness. I believe the
> maintenance cost is negligible but the developer DX is improved and allows
> for a broader usage of the Enum.

I don't really understand why it would make sense to invert the coupling (that is, to decouple
the UriHostType/UrlHostType and UriType enums from the Uri/Url classes and, at the same time,
couple the enums to the Uri/Url classes). IMO it's far more ergonomic to retrieve the URI type/host
type from the URI class directly than to instantiate an enum each time.

> Last but not least, The Percent encoding feature should be IMHO improved
> by moving the encode/decode methods from being static methods on the URI
> classes to becoming public API on the Enum. This would indeed imply
> renaming the enum from Uri\Rfc3986\UriPercentEncodingMode to
> Uri\Rfc3986\UriPercentEncoder with two methods encode/decode. Again it
> makes for a more self-contained feature and adds to the DX. Developer will
> not have to always statically call the URI classes for encoding/decoding
> strings as the Enums and their cases already convey the information
> correctly.

Personally (and I may be in the minority), I don't see an issue with having two static methods on Uri/Url.
Uri::percentEncode() and Uri::percentDecode(), as well as Url::percentEncode() and Url::percentDecode(),
could indeed be implemented via dedicated UriPercentEncoder and UrlPercentEncoder classes, or even a
shared PercentEncoder one, but:

- Its methods would still be static
- I don't think it's worth adding one or two dedicated classes just for this purpose
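
For illustration only, the two call shapes under discussion can be sketched in userland PHP (8.1+). The names SketchUri and SketchEncodingMode are hypothetical, and rawurlencode() merely stands in for the real encoder; this is not the RFC's actual API.

```php
<?php
declare(strict_types=1);

// Hypothetical userland names; none of this is the RFC's actual API.
enum SketchEncodingMode
{
    case AllReservedCharacters;

    // Shape B: the enum case itself performs the encoding.
    public function encode(string $input): string
    {
        return match ($this) {
            // rawurlencode() is only a stand-in for the real implementation.
            self::AllReservedCharacters => rawurlencode($input),
        };
    }
}

final class SketchUri
{
    // Shape A: a static method on the URI class that takes the mode as an argument.
    public static function percentEncode(string $input, SketchEncodingMode $mode): string
    {
        return $mode->encode($input);
    }
}

$viaStatic = SketchUri::percentEncode('a b', SketchEncodingMode::AllReservedCharacters);
$viaEnum   = SketchEncodingMode::AllReservedCharacters->encode('a b');

echo $viaStatic, "\n"; // a%20b
echo $viaEnum, "\n";   // a%20b
```

Both shapes need the same mode information at the call site; the difference is only where the method lives.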

I also got feedback that these functions could be free-standing in the URI namespace. But I really don't like free-standing functions,
so I won't go in this direction. IMO even two static methods on the Uri and Url classes are easier to use and find.

So if we reject these ideas, then the next candidate is your suggestion of having a UriComponent/UrlComponent enum
with an encode() and decode() method (https://github.com/thephpleague/uri-src/pull/186#issuecomment-4016602880):

$uri = new Uri\Rfc3986\Uri("https://example.com/?q=%3A%29");
$query = $uri->getQuery(); // returns "q=%3A%29"
echo Uri\Rfc3986\UriComponent::Query->decode($query); // returns "q=:)"

But as I mentioned in my comment on the PR, percent-encoding/decoding is not necessarily tied to URI/URL components.
For example, the proposal currently contains the Uri\Rfc3986\UriPercentEncodingMode::AllReservedCharacters,
Uri\Rfc3986\UriPercentEncodingMode::AllButUnreservedCharacters, and Uri\Rfc3986\UriPercentEncodingMode::PathSegment cases.

Some of the enum cases which don't relate to a component could be removed, but at least the AllReservedCharacters case is
important because it provides a direct alternative for rawurlencode() and rawurldecode(). My side-quest is to gradually phase out
the *urlencode() and *urldecode() functions because their naming is very confusing, and people usually don't know when to use which.

And I've just noticed that yet another enum case would probably be needed to provide a direct alternative for urlencode() and urldecode(),
because they differ from rawurlencode() and rawurldecode() with regard to how "~" is handled, in addition to the " " character.
But at this point, I've become unsure whether it's worth pursuing this goal, because this is not RFC 3986 compliant behavior anymore (and TBH
not even Uri\Rfc3986\UriPercentEncodingMode::FormQuery is compliant), so it has nothing to do with the Uri\Rfc3986 namespace.
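
To make the difference concrete, here is a minimal sketch that uses only the existing built-in functions (no ext/uri API is assumed); the sample input string is arbitrary.

```php
<?php
// Only built-in functions are used here; nothing from the RFC is assumed.
$input = 'a b~c';

$raw  = rawurlencode($input); // space becomes %20, "~" is left as-is
$form = urlencode($input);    // space becomes "+", "~" becomes %7E

echo $raw, "\n";  // a%20b~c
echo $form, "\n"; // a+b%7Ec

// Decoding differs for "+": only urldecode() turns it back into a space.
$rawDecoded  = rawurldecode('a+b%7Ec');
$formDecoded = urldecode('a+b%7Ec');

echo $rawDecoded, "\n";  // a+b~c
echo $formDecoded, "\n"; // a b~c
```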

So all in all... As far as I can see, not even a Uri\Rfc3986\UriComponent enum could provide a complete solution for the custom
percent-encoding/decoding part of the proposal. If we used a Uri\Rfc3986\UriEncoding enum name instead, then there would be no issue
with the various kinds of encoding/decoding modes not referring to URI components, but the naming would probably still not be right,
as I wouldn't expect a class name with an "ing" suffix to perform percent-encoding/decoding itself. But I'm happy to be corrected by
native English speakers :)

As I don't have any other ideas, I think I still prefer the static method based approach.

Regards,
Máté

