Newsgroups: php.internals
Subject: Re: [PHP-DEV] [RFC] [Discussion] Add WHATWG compliant URL parsing API
Date: Wed, 7 May 2025 14:16:11 -0500
To: Máté Kocsis
Cc: PHP Internals List
From: pmjones@pmjones.io ("Paul M. Jones")

Hi Máté and all,

> On May 5, 2025, at 16:36, Máté Kocsis wrote:
>
> Hello Internals,
>
> After more than a hundred emails refining even the tiniest details, we have reached a point where I'd like to call for a vote.
> I know that the new API still doesn't support many use-cases, it still has missing pieces, but now it includes a cohesive set
> of functionality that could be a very useful basic building block for most people.
>
> That said, I don't intend to change anything about the RFC anymore, unless there's still some factual error in it. There are a lot of
> possibilities how such a large API can look like, and this RFC approaches the problem the way it is currently described,
> and not in any other way.
>
> So unless some very serious issues arise, I'm going to start the vote on 8th May, possibly in the morning (according to UTC).

I am on record as wanting very much to see some decent web-centric objects in core PHP (Request, Response, Uri/Url, etc.).

To my chagrin, despite the fact that its goals are laudable, I do not think this RFC is in a ready state to provide such objects.

Among other things I find troubling, the RFC as presented ...

- is too broad in scope;
- acknowledges it is incomplete, with work left undone;
- admits to standards non-compliance; and,
- has an uncertain API.

## Too Broad In Scope

The RFC attempts to do too much at once: not just making URI/URL parsing "pluggable" for internals, and providing an RFC 3986 compliant parser, but also creating from scratch entirely new RFC 3986 URI and related Exception classes for userland consumption, along with entirely new WHATWG-URL classes and Exceptions.

The RFC itself remarks on "[t]he already large scope of the RFC" -- and the same has been observed during the on-list discussions. Even Máté's message above mentions "There are a lot of possibilities how such a large API can look like".

It would be better to narrow the scope of the RFC to something more manageable.

## Incomplete, Work Left Undone

This is a consequence of the overly broad scope. The work remaining is by no means certain to be completed or voted in after followup RFCs, either on a short timeline or a long one.

Máté notes above that the RFC "has missing pieces" -- and here are some examples from the RFC itself:

- "Builder classes are not offered by the present RFC just yet. ... this feature is one of the top candidates of a followup RFC."
- "The topic of query parameter manipulation should be discussed as a followup to the current RFC."
- "There are multiple planned features in future scope that should be supported."
- "There are immediate plans to add new capabilities to the new API"
- "the position of this RFC is not to include this interface [URLSearchParams] yet"

It would be better to present a single finished product instead of multiple partially-finished products.

## Standards Non-Compliance

The RFC states early on that "the parse_url() function is offered for parsing URLs, however, it isn't compliant with any standards. ... Incompatibility with current standards is a serious issue" -- but later it says:

> Getters of Uri\WhatWg\Url have a few gotchas for the ones who are inherently familiar with the WHATWG URL specification: they don't (entirely) follow the “getter steps” that are defined by the specification, but the individual components are returned directly without any other changes that the “getter steps” would otherwise specify.

The RFC doesn't fully follow the WHATWG-URL standard. This is reminiscent of the complaint regarding parse_url().

Further, "the WHATWG URL specification contains a URLSearchParams interface" but "the position of this RFC is not to include this interface yet".

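For anyone not familiar with what the "getter steps" do, here is a minimal sketch. It assumes the WHATWG URL class that ships with browsers and Node.js, written in TypeScript; it is not the PHP API proposed by the RFC, and the example URLs are made up for illustration.

```typescript
// Illustration only: the built-in WHATWG URL class (browsers, Node.js),
// NOT the Uri\WhatWg\Url class proposed by the RFC.
const url = new URL("HTTPS://Example.COM:443/a/../b?x=1#frag");

// The spec's getter steps decorate or adjust the stored components.
const viaGetters = {
  protocol: url.protocol, // scheme with a trailing ":" appended
  port: url.port,         // empty string: 443 is the https default, so no port is stored
  search: url.search,     // query with a leading "?" prepended
  hash: url.hash,         // fragment with a leading "#" prepended
};

// The same API also exposes the URLSearchParams interface mentioned above.
const x = url.searchParams.get("x");

// A present-but-empty query is hidden by the getter steps,
// yet it still survives in the serialized URL.
const emptyQuery = new URL("https://example.com/?");
const emptySearch = emptyQuery.search;
const emptyHref = emptyQuery.href;

console.log(viaGetters, x, emptySearch, emptyHref);
```

Going by the RFC text quoted above, the proposed getters would return the underlying components without these adjustments, so code ported from a spec-following implementation could not rely on the values matching.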
It would be better to actually follow the WHATWG-URL standard, and not add a partially-compliant and somewhat-nonstandard implementation to core.

## Uncertain API

Because of the unfinished work, and because of the "living standard" nature of WHATWG-URL, the foundation of the API is unsteady:

> WHATWG URL doesn't specify percent-decoding rules for most components ... But since the WHATWG URL specification is subject to constant updates, it's possible that normalization or percent-decoding rules change in the future.

"Constant updates" makes me think it is too early to include a WHATWG-URL implementation in core.

Then we have this ...

> the current RFC chooses to make the built-in URI implementations final ... until the new API becomes mature enough and becomes tested in practice.

... and this:

> Once the API settles, we plan to lift these restrictions [around final classes] at some extent.

If the API needs to "become tested in practice" so that it can "mature" and "settle", it would be better to do that in userland (maybe published on Packagist or PECL) instead of in core.

## Remedies

I think all of the above can be remedied, so that we can finally have some decent web-centric objects in core. But that's a discussion for a later time, one we can have if the RFC does not pass.

-- pmj