List-Id: internals.lists.php.net
Date: Fri, 28 Jun 2024 22:14:19 +0000
To: "php internals"
Subject: Re: [PHP-DEV] [RFC] [Discussion] Add WHATWG compliant URL parsing API
From: larry@garfieldtech.com ("Larry Garfield")

On Fri, Jun 28, 2024, at 8:06 PM, Máté Kocsis wrote:
> Hi Everyone,
>
> I've been working on a new RFC for a while now, and time has come to
> present it to a wider audience.
>
> Last year, I learnt that PHP doesn't have built-in support for parsing
> URLs according to any well established standards (RFC 1738 or the
> WHATWG URL living standard), since the parse_url() function is
> optimized for performance instead of correctness.
>
> In order to improve compatibility with external tools consuming URLs
> (like browsers), my new RFC would add a WHATWG compliant URL parser
> functionality to the standard library. The API itself is not final by
> any means, the RFC only represents how I imagined it first.
>
> You can find the RFC at the following link:
> https://wiki.php.net/rfc/url_parsing_api
>
> Regards,
> Máté

I am all for proper data modeling of all the things, so I support this
effort.

Comments:

* There's no need for UrlComponent to be backed.

* I don't understand why UrlParser is a static class.
  We just had a whole big debate about that. :-)

There's a couple of ways I could see it working, and I'm not sure which
I prefer:

1. Better if we envision the parser getting options or configuration in
   the future.

   $url = new UrlParser()->parseUrl(): Url;

2. The named-constructor pattern is quite common.

   $url = Url::parseFromString()
   $url = Url::parseToArray();

* I... do not understand the point of having public properties AND
  getters/withers. A readonly class with withers, OK, a bit clunky to
  implement but it would be your problem in C, not mine, so I don't
  care. :-) But why getters AND public properties? If going that far,
  why not finish up clone-with and then we don't need the withers,
  either? :-)

* Making all the parameters to Url required except port makes little
  sense to me. User/pass is more likely to be omitted 99% of the time
  than port. In practice, most components are optional, in which case it
  would be inaccurate to not make them nullable. Empty string wouldn't
  be quite the same, as that is still a value and code that knows to
  skip empty string when doing something is basically the same as code
  that knows to skip nulls. We should assume people are going to
  instantiate this class themselves often, not just get it from the
  parser, so it should be designed to support that.

* I would not make Url final. "OMG but then people can extend it!"
  Exactly. I can absolutely see a case for an HttpUrl subclass that
  enforces scheme as http/https, or an FtpUrl that enforces a scheme of
  ftp, etc. Or even an InternalUrl that assumes the host is one
  particular company, or something. (If this sounds like scope creep,
  it's because I am confident that people will want to creep this
  direction and we should plan ahead for it.)

* If the intent of the withers is to mimic PSR-7, I don't think it does
  so effectively. Without the interface, it couldn't be a drop-in
  replacement for UriInterface anyway.
  And we cannot extend it to add the interface if it's final. Widening
  the parameters in PSR-7 interfaces to support both wouldn't work, as
  that would be a hard-BC break for any existing implementations. So I
  don't really see what the goal is here.

* If we ever get "data classes", this would be a good candidate. :-)

* Crazy idea:

  new UriParser(HttpUrl::class)->parse(string);

  To allow a more restrictive set of rules. Or even just to cast the
  object to that child class.

--Larry Garfield
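
To make the nullable-components and non-final points above concrete, here is a rough sketch in PHP 8.1+. Everything in it is an assumption for illustration: the property names, the ALLOWED_SCHEMES constant, and the validation approach are hypothetical and are not taken from the RFC.

```php
<?php
declare(strict_types=1);

// Hypothetical shape, NOT the RFC's API: only the scheme is required,
// every other component is nullable, and the class is not final.
class Url
{
    /** Schemes this class accepts; null means "any". Subclasses may narrow it. */
    protected const ALLOWED_SCHEMES = null;

    public function __construct(
        public readonly string $scheme,
        public readonly ?string $host = null,
        public readonly ?int $port = null,
        public readonly ?string $user = null,
        public readonly ?string $password = null,
        public readonly ?string $path = null,
        public readonly ?string $query = null,
        public readonly ?string $fragment = null,
    ) {
        // Late static binding picks up the constant of the concrete subclass.
        $allowed = static::ALLOWED_SCHEMES;
        if ($allowed !== null && !in_array($scheme, $allowed, true)) {
            throw new InvalidArgumentException(
                "Scheme '$scheme' is not allowed for " . static::class
            );
        }
    }
}

// A subclass that enforces http/https, as in the HttpUrl case above.
class HttpUrl extends Url
{
    protected const ALLOWED_SCHEMES = ['http', 'https'];
}

// Omitted components stay null instead of becoming empty strings.
$url = new HttpUrl(scheme: 'https', host: 'example.com', path: '/docs');
var_dump($url->port);

// A scheme outside the subclass's list is rejected at construction time.
try {
    new HttpUrl(scheme: 'ftp', host: 'example.com');
} catch (InvalidArgumentException $e) {
    echo $e->getMessage(), "\n";
}
```

With named arguments, user code can build a Url directly without passing placeholder values for user/password, which is the main reason to prefer nullable defaults here.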