Newsgroups: php.internals
Xref: news.php.net php.internals:124061
From: krinkle@fastmail.com (Krinkle)
Date: Sat, 29 Jun 2024 21:27:50 +0100
To: internals@lists.php.net
Subject: Re: [PHP-DEV] [RFC] [Discussion] Add WHATWG compliant URL parsing API

On Fri, 28 Jun 2024, at 21:06, Máté Kocsis wrote:
> […] add a WHATWG compliant URL parser functionality to the standard library. The API itself is not final by any means, the RFC only represents how I imagined it first.
>
> You can find the RFC at the following link: https://wiki.php.net/rfc/url_parsing_api

First-pass comments/thoughts.

As others have mentioned, it seems the class would/could not actually satisfy PSR-7. Realistically, the PSR-7 interface package or someone else would need to create a new class that combines the two, potentially as part of a transition away from it to the built-in class, with future PSRs building directly on Url. If we take that as given, we might as well design for the end state, and accept that there will be a (minimal) transition. This end state would benefit from being designed with the logical constraints of PSR-7 (so that migration is possible without major surprises), but without restricting us to its exact API shape, since an intermediary class would come into existence either way.

For example, Url could be a value class with merely 8 public properties. Possibly with a UrlImmutable subclass, akin to DateTime, where the properties are read-only instead (a clone method could return Url?).

It might be more ergonomic to leave the parser as an implementation detail, allowing the API to be accessed from a single import rather than requiring two. This could look like Url::parse() or Url::parseFromString().

For the Url::parseComponent() method, did you consider accepting the existing PHP_URL_* constants? They appear to fit exactly, in naming, description, and associated return types.

Without UrlParser/UrlComponent, I'd adopt it directly in applications and frameworks. Otherwise, further wrapping seems likely for improved usability. This is sometimes beneficial when exposing low-level APIs, but it seems like this is close to fitting in a single class, as demonstrated by the WHATWG URL API.

One thing I feel is missing is a method to parse a (partial) URL relative to another, e.g. to expand or translate paths between two URLs. Consider expanding "/w/index.php" or "index.php" relative to "https://wikipedia.org/w/". Or expanding "//example.org" relative to either "https://wikipedia.org" or "http://wikipedia.org". The WHATWG URL API does this in the form of a second optional string|Stringable parameter to Url::parse(). Implementing "expand URL" with parsing of incomplete URLs is error-prone and hard to get right. Including this would be valuable.

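To make the cases above concrete, here is a minimal TypeScript sketch using the WHATWG URL API as available in browsers and Node.js, where the base is passed as the second argument to the URL constructor. The helper name expandUrl is made up for illustration and is not part of the RFC; a PHP API could expose this quite differently.

```typescript
// Resolve a (possibly partial) URL against a base with the WHATWG URL API.
// `expandUrl` is a hypothetical helper name, not something the RFC defines.
function expandUrl(input: string, base: string): string {
  // The second constructor argument is the base URL. It is only consulted
  // when `input` is not already an absolute URL.
  return new URL(input, base).href;
}

// The inputs mentioned in the paragraph above, as [input, base] pairs.
const examples: Array<[string, string]> = [
  ["/w/index.php", "https://wikipedia.org/w/"],
  ["index.php", "https://wikipedia.org/w/"],
  ["//example.org", "https://wikipedia.org"],
  ["//example.org", "http://wikipedia.org"],
];

for (const [input, base] of examples) {
  console.log(`${input} relative to ${base} -> ${expandUrl(input, base)}`);
}
```

The first two inputs both resolve to https://wikipedia.org/w/index.php. The protocol-relative "//example.org" takes its scheme from the base, giving https://example.org/ and http://example.org/ respectively.
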
See also Net_URL2 and its resolve() method:
https://pear.php.net/package/Net_URL2
https://github.com/pear/Net_URL2

--
Timo Tijhof
https://timotijhof.net/