Newsgroups: php.internals
Xref: news.php.net php.internals:124258
From: rob@bottled.codes ("Rob Landers")
To: "ignace nyamagana butera", internals@lists.php.net, Máté Kocsis
Date: Sun, 07 Jul 2024 13:10:11 +0200
Subject: Re: [PHP-DEV] [RFC] [Discussion] Add WHATWG compliant URL parsing API
In-Reply-To: <5b83b423-f95f-4bec-9bc0-ec1f0114426d@gmail.com>

On Sun, Jul 7, 2024, at 12:55, ignace nyamagana butera wrote:
> Hi Máté,
>
> > Supporting IANA registered schemes is a valid request, and is
> > definitely useful. However, I think this feature is not strictly
> > required to have in the current RFC.
>
> True. Having a WHATWG compliant parser in PHP source code is a big +1
> from me; I have nothing against that inclusion.
>
> > Based on your and others' feedback, it has now become clear for me
> > that parse_url() is still useful and ext/url needs quite some additional
> > capabilities until this function really becomes superfluous.
>
> `parse_url` can only be deprecated when an RFC3986 compliant parser is
> added to php-src, which is why I insist on having that parser present
> too.
>
> I will also add that everything up to now in PHP uses RFC3986 as the basis
> for generating or representing URLs (cURL extension, streams, etc.).
> Having the first and only OOP representation of a URL in the language
> not follow that same specification seems odd to me. It opens the door
> to inconsistencies that will only be resolved once an equivalent RFC3986
> URL object makes its way into the source code.
>
> On the public API side I would recommend the following:
>
> - if you are to strictly follow the WHATWG specification, no URI
> component can be null. They must all be strings. If we plan to
> use the same object for an RFC3986 compliant parser, then all components
> should be nullable except for the path component, which can never be null
> as it is always present.

This isn't true. It's just that in the language the standard is specified in, any element can be null (i.e., there are no nullable types). The standard specifies which components may be null here: URL Standard (whatwg.org)
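
A minimal sketch of that distinction, in Python purely for illustration. It assumes the URL record as described in the WHATWG URL Standard, where host, port, query and fragment may be null, while the getters of the standard's URL class return strings. `UrlRecord`, `port_getter`, `search_getter` and `hash_getter` are hypothetical names, not part of the RFC or of any PHP API, and the path is simplified to a list of segments.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class UrlRecord:
    """Hypothetical model of a WHATWG-style URL record (no parsing here)."""
    scheme: str = ""
    username: str = ""                 # string, initially empty
    password: str = ""                 # string, initially empty
    host: Optional[str] = None         # null or a host
    port: Optional[int] = None         # null or a 16-bit unsigned integer
    path: List[str] = field(default_factory=list)  # simplified: opaque paths omitted
    query: Optional[str] = None        # null or a string
    fragment: Optional[str] = None     # null or a string


def port_getter(url: UrlRecord) -> str:
    # API-style getter: a null port is exposed as the empty string.
    return "" if url.port is None else str(url.port)


def search_getter(url: UrlRecord) -> str:
    # A null query and an empty query both come out as the empty string.
    return "" if not url.query else "?" + url.query


def hash_getter(url: UrlRecord) -> str:
    # Same collapsing rule for the fragment.
    return "" if not url.fragment else "#" + url.fragment


# The record keeps "no query" and "empty query" apart; the getter does not.
no_query = UrlRecord(scheme="https", host="example.com")
empty_query = UrlRecord(scheme="https", host="example.com", query="")
```

Under these assumptions both views hold at once: a string-only public API is compatible with a record whose components are nullable, and only the record can tell `no_query` from `empty_query`.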

— Rob