Newsgroups: php.internals
Path: news.php.net
Xref: news.php.net php.internals:124258
X-Original-To: internals@lists.php.net
Delivered-To: internals@lists.php.net
Precedence: bulk
list-help: <mailto:internals+help@lists.php.net>
list-unsubscribe: <mailto:internals+unsubscribe@lists.php.net>
list-post: <mailto:internals@lists.php.net>
List-Id: internals.lists.php.net
MIME-Version: 1.0
Message-ID: <f328f0f8-fa07-48dd-ae9d-cb93ac2ddb61@app.fastmail.com>
In-Reply-To: <5b83b423-f95f-4bec-9bc0-ec1f0114426d@gmail.com>
References: <71a73b87-cc2f-4ee5-a961-7bf2b191fbb6@gmail.com>
 <5159E0AB-C8B0-4A54-9654-986C1D9C858F@koalephant.com>
 <07160e83-7333-44a1-81f2-b121e2cf0ffd@gmail.com>
 <CAH5C8xVVHMn3Q4U2Aa9n3sED3tGdnWoabryTSL6Dk0MrenD+3g@mail.gmail.com>
 <5b83b423-f95f-4bec-9bc0-ec1f0114426d@gmail.com>
Date: Sun, 07 Jul 2024 13:10:11 +0200
To: "ignace nyamagana butera" <nyamsprod@gmail.com>, internals@lists.php.net,
 =?UTF-8?Q?M=C3=A1t=C3=A9_Kocsis?= <kocsismate90@gmail.com>
Subject: Re: [PHP-DEV] [RFC] [Discussion] Add WHATWG compliant URL parsing API
Content-Type: multipart/alternative;
 boundary=ba67ad66c48b46ebbd26cfe30cb16bb4
From: rob@bottled.codes ("Rob Landers")

--ba67ad66c48b46ebbd26cfe30cb16bb4
Content-Type: text/plain;charset=utf-8
Content-Transfer-Encoding: quoted-printable



On Sun, Jul 7, 2024, at 12:55, ignace nyamagana butera wrote:
> Hi M=C3=A1t=C3=A9,
>=20
> > Supporting IANA registered schemes is a valid request, and is=20
> > definitely useful. However, I think this feature is not strictly=20
> > required to have in the current RFC.
>=20
> True. Having a WHATWG compliant parser in PHP source code is a big +1=20
> from me; I have nothing against that inclusion.
>=20
> > Based on your and others' feedback, it has now become clear to me=20
> > that parse_url() is still useful and ext/url needs quite some=20
> > additional capabilities until this function really becomes=20
> > superfluous.
>=20
> `parse_url` can only be deprecated when an RFC3986 compliant parser is=20
> added to php-src, which is why I insist on that parser being present=20
> too.
>=20
> I will also add that everything in PHP so far uses RFC3986 as the=20
> basis for generating or representing URLs (cURL extension, streams,=20
> etc.). Having the first and only OOP representation of a URL in the=20
> language not follow that same specification seems odd to me. It opens=20
> the door to inconsistencies that will only be resolved once an=20
> equivalent RFC3986 URL object makes its way into the source code.
>=20
> On the public API side I would recommend the following:
>=20
> - If you are to strictly follow the WHATWG specification, no URI=20
> component can be null. They must all be strings. If we plan to use=20
> the same object for an RFC3986 compliant parser, then all components=20
> should be nullable except for the path component, which can never be=20
> null as it is always present.

This isn't true. The language the standard is written in has no nullable =
types, so in principle any element can be null. The standard spells out =
which components may be null here: URL Standard (whatwg.org) =
<https://url.spec.whatwg.org/#url-representation>
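
To illustrate, here is a minimal sketch in Python (not PHP, and not taken
from the standard or the RFC) of how the URL record on that page could be
typed. The names UrlRecord and null_components are hypothetical, and the
host and path are simplified to plain strings; the point is only that
host, port, query and fragment are the components allowed to be null.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass
class UrlRecord:
    """Hypothetical mirror of the components of a WHATWG URL record."""

    scheme: str              # ASCII string, never null
    username: str            # ASCII string, never null
    password: str            # ASCII string, never null
    host: Optional[str]      # null or a host (simplified to a string)
    port: Optional[int]      # null or a 16-bit unsigned integer
    path: str                # never null (simplified to a string)
    query: Optional[str]     # null or an ASCII string
    fragment: Optional[str]  # null or an ASCII string


def null_components(record):
    """Return the names of the components that are currently None."""
    return [
        f.name
        for f in fields(record)
        if getattr(record, f.name) is None
    ]


# Roughly what https://example.org/ would look like as a record.
print(null_components(
    UrlRecord("https", "", "", "example.org", None, "/", None, None)
))
```

For that record the sketch prints ['port', 'query', 'fragment'], while
the username and password are empty strings rather than null.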

=E2=80=94 Rob
--ba67ad66c48b46ebbd26cfe30cb16bb4--