Newsgroups: php.internals
List-Id: internals.lists.php.net
Message-ID: <6b078c2a-91fa-48dd-b7d9-154eb6136249@app.fastmail.com>
References: <71a73b87-cc2f-4ee5-a961-7bf2b191fbb6@gmail.com> <5159E0AB-C8B0-4A54-9654-986C1D9C858F@koalephant.com> <07160e83-7333-44a1-81f2-b121e2cf0ffd@gmail.com>
Date: Mon, 15 Jul 2024 13:23:10 +0000
To: "php internals"
Subject: Re: [PHP-DEV] [RFC] [Discussion] Add WHATWG compliant URL parsing API
From: larry@garfieldtech.com ("Larry Garfield")

On Mon, Jul 15, 2024, at 9:20 AM, Máté Kocsis wrote:
> Hey Ignace, Nicolas,
>
> Based on your request for adding support for RFC 3986 spec compatible
> parsing, I evaluated another library
> (https://github.com/uriparser/uriparser/) in the recent days in order
> to add support for the requested functionality. As far as I can tell,
> the results were very promising, so I'm ok to include this into my
> proposal (I haven't pushed my changes yet and haven't updated the RFC
> yet).
>
> Regarding the reference resolution
> (https://uriparser.github.io/doc/api/latest/#resolution) feature which
> has also already been asked for, I'm genuinely wondering what the
> use-case is? But in any case, I'm fine with incorporating this as well
> into the RFC, since apparently both Lexbor and uriparser support this
> (naturally).
>
> What I became puzzled about is the correct object structure and
> naming. Now that uriparser which can deal with URIs came into the
> picture, while Lexbor can parse URLs, I don't know if it's a good idea
> to have a dedicated URI and a URL class extending the former one... If
> it is, then in my opinion, the logical behavior would be that Lexbor
> always instantiates URL classes, while uriparser would have to decide
> if the passed-in URI is actually an URL, and choose the instantiated
> class based on this factor... But in this case the differences between
> the RFC 3986 and WHATWG specifications couldn't be spelled out, since
> URL objects could hold URLs parsed based on both specs (and therefore
> having a unified interface is required).
>
> Or rather we should have a separate URI and a WhatwgUrl class so that
> the former one would always be created by uriparser, while the latter
> one by Lexbor? This way we could have a dedicated object interface for
> both standards (e.g. the RFC 3986 related one could have a
> getUserInfo() method, while the WHATWG related one could have both
> getUser() and getPassword() methods). But then the question is how
> interchangeable these classes should be? I.e. should we be able to
> convert them back and forth, or should there be an interface that is
> implemented by the two classes?
>
> I'd appreciate any suggestions regarding these questions.
>
> P.S. due to its bad receptance, I got rid of the UrlParser class as
> well as the UrlComponent enum from my implementation in the meantime.
>
> Regards,
> Máté

I apologize if I missed this up-thread somewhere, but what precisely are the differences between URI and URL? My understanding was that URL is a subset of URI (all URLs are URIs, but not all URIs are URLs). You're saying they're slightly disjoint sets?
Can you give some concrete examples of where the parsing rules would produce different results? That may give us a better sense of what the logic should be.

--Larry Garfield
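
For illustration only, here is one difference the quoted message already names: an RFC 3986 style API would expose a single userinfo component, while a WHATWG style API would expose a separate username and password. The Python sketch below is a rough approximation, not either library's behavior. It uses the generic component regex from RFC 3986 Appendix B, and the function names (split_authority, rfc3986_style, whatwg_style) are hypothetical. It does no validation, percent-encoding, or host normalization, all of which real parsers perform.

```python
import re

# Generic component regex from RFC 3986, Appendix B.
URI_RE = re.compile(r'^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?')


def split_authority(uri):
    """Return (userinfo, hostport), or (None, None) when there is no authority."""
    authority = URI_RE.match(uri).group(4)  # the regex matches any string
    if authority is None:
        return None, None
    # Split at the last "@"; valid RFC 3986 userinfo contains no raw "@".
    userinfo, sep, hostport = authority.rpartition('@')
    return (userinfo if sep else None), hostport


def rfc3986_style(uri):
    """Hypothetical RFC 3986 view: userinfo stays one opaque component."""
    userinfo, _ = split_authority(uri)
    return {"userinfo": userinfo}


def whatwg_style(uri):
    """Hypothetical WHATWG view: credentials split at the first ":"."""
    userinfo, _ = split_authority(uri)
    if userinfo is None:
        # WHATWG URL records default to empty strings, not null.
        return {"username": "", "password": ""}
    username, _, password = userinfo.partition(':')
    return {"username": username, "password": password}


if __name__ == "__main__":
    example = "https://user:secret@example.com/path"
    print(rfc3986_style(example))
    print(whatwg_style(example))
```

The same input yields one component in the first view and two in the second, which is the kind of interface mismatch a shared base class or common interface would have to paper over.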