Newsgroups: php.internals
From: nyamsprod@gmail.com (ignace nyamagana butera)
Date: Tue, 2 Dec 2025 15:30:28 +0100
Subject: Re: [PHP-DEV] [RFC] [Discussion] Followup Improvements for ext/uri
To: Máté Kocsis, PHP Internals List, Tim Düsterhus

Hi Máté,

I read the "Accessing Path Segments as an Array" sub-RFC and I have a couple of remarks and suggestions.
The RFC text says:

> The getter methods return null if the path is empty (https://example.com), an empty array when the path
> consists of a single slash (https://example.com/), and a non-empty array otherwise.
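
To make the quoted behaviour concrete, here is a minimal userland sketch of those three states. `rfcStylePathSegments()` is a hypothetical helper written for this mail, not the RFC implementation; it works on a plain path string and assumes that a path without a leading slash is split as-is, which the quoted text does not spell out.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical helper mimicking the getter semantics quoted above:
 * null for an empty path, [] for "/", a non-empty list otherwise.
 *
 * @return list<string>|null
 */
function rfcStylePathSegments(string $path): ?array
{
    if ($path === '') {
        return null;
    }

    if ($path === '/') {
        return [];
    }

    // Assumption: one leading slash is a delimiter, not part of a segment.
    $trimmed = str_starts_with($path, '/') ? substr($path, 1) : $path;

    return explode('/', $trimmed);
}

// Every caller has to deal with null before it can treat the result as a list.
$segments = rfcStylePathSegments('');
$count = $segments === null ? 0 : count($segments);
```

The last two lines are the kind of guard a caller would need on every call when the return type is nullable.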

This is suboptimal to me because it means that the return type of the getter methods is `array|null`, which would force
developers to add a check every time they call the method in order to tell whether the path is absolute or not.
Instead, I would rather always get a single type, an array, as the return value. The issue you are facing is that
you want to convey via the return type whether the path is absolute or not. But we already have access to this
information via the UriType enum, at least in the case of the Uri\Rfc3986\Uri class. For Uri\WhatWg\Url
the information is less crucial, as the validation and normalization rules of the WHATWG specification
will autocorrect the path if needed. This leads me to propose the following alternative:

For Uri\Rfc3986\Uri:

```
/** @return list<string> */
Uri::getPathSegments(): array {}
/** @return list<string> */
Uri::getRawPathSegments(): array {}
#[\NoDiscard(message: "as Uri\Rfc3986\Uri::withPathSegments() does not modify the object itself")]
Uri::withPathSegments(array $segments, Uri\Rfc3986\UriType $uriType = Uri\Rfc3986\UriType::RelativePathReference): static {}
```
(the default value for the `$uriType` parameter is TBD).

For Uri\WhatWg\Url:

```
/** @return list<string> */
Url::getPathSegments(): array {}
#[\NoDiscard(message: "as Uri\WhatWg\Url::withPathSegments() does not modify the object itself")]
/** @param list<UrlValidationError> $errors */
Url::withPathSegments(array $segments): static {}
```

with the following behaviour:

*The getter methods return an empty array if the path is empty (https://example.com) or a single slash (https://example.com/),
and a non-empty array otherwise.*
To distinguish between an absolute path and a relative path, you can refer to the Uri\Rfc3986\Uri::getUriType()
method in the case of an RFC 3986 URI; the information does not matter otherwise (i.e. for a WHATWG URL).

During an update of an RFC 3986 URI, the additional `$uriType` argument would tell whether a `/` should be prepended to the generated
path string. For the WHATWG URL, no soft errors are emitted, which shows that the starting slash does not really matter.
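
As a rough illustration of the behaviour described above, here is a userland sketch on plain path strings. `pathSegments()` and `pathFromSegments()` are hypothetical helpers, and the boolean `$absolute` flag only stands in for the proposed `$uriType` argument; none of this is the ext/uri implementation, and percent-encoding of the segments is deliberately left out.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical getter: always returns a list, so "" and "/" both give [].
 *
 * @return list<string>
 */
function pathSegments(string $path): array
{
    if ($path === '' || $path === '/') {
        return [];
    }

    // A single leading slash is treated as a delimiter, not as a segment.
    $trimmed = str_starts_with($path, '/') ? substr($path, 1) : $path;

    return explode('/', $trimmed);
}

/**
 * Hypothetical counterpart of withPathSegments(): $absolute plays the role
 * of the proposed $uriType argument and decides whether "/" is prepended.
 *
 * @param list<string> $segments
 */
function pathFromSegments(array $segments, bool $absolute): string
{
    return ($absolute ? '/' : '') . implode('/', $segments);
}

// No null check is needed; whether the path is absolute is tracked separately.
foreach (['', '/', '/foo/bar', 'foo/bar'] as $path) {
    $absolute = str_starts_with($path, '/');
    $rebuilt = pathFromSegments(pathSegments($path), $absolute);
    echo var_export($path, true), ' => ', var_export($rebuilt, true), PHP_EOL;
}
```

In this sketch the segment list and the "is it absolute" information travel separately, which is the split I am suggesting between the getters and the URI type.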

Best regards,
Ignace

On Mon, Dec 1, 2025 at 9:53 PM Máté Kocsis <kocsismate90@gmail.com> wrote:

> Hi Everyone,
>
> I'd like to introduce my latest RFC that I've been working on for a while now: https://wiki.php.net/rfc/uri_followup.
>
> It proposes 5 followup improvements for ext/uri in the following areas:
> - URI Building
> - Query Parameter Manipulation
> - Accessing Path Segments as an Array
> - Host Type Detection
> - URI Type Detection
> - Percent-Encoding and Decoding Support
>
> I did my best to write an RFC that was at least as extensive as https://wiki.php.net/rfc/url_parsing_api had become by the end. Despite my efforts,
> there are still a couple things which need a final decision, or which need to be polished/improved. Some examples:
>
> - How to support array/object values for constructing query strings? (https://wiki.php.net/rfc/uri_followup#type_support)
> - How to make the UriQueryParams and UrlQueryParams classes more interoperable with the query string component (mainly with respect to percent-encoding)? (https://wiki.php.net/rfc/uri_followup#percent-encoding_and_decoding)
> - Exactly how the advanced percent-decoding capabilities should work? Does it make sense to support all the possible modes (UriPercentEncodingMode) for percent-decoding as well (https://wiki.php.net/rfc/uri_followup#percent-encoding_and_decoding_support)
> - etc.
>
> Regards,
> Máté