Newsgroups: php.internals
From: nyamsprod@gmail.com (ignace nyamagana butera)
To: Niels Dossche, internals@lists.php.net
Date: Tue, 23 Jul 2024 08:38:40 +0200
Subject: Re: [PHP-DEV] [RFC] [Discussion] Add WHATWG compliant URL parsing API

> Hi Máté
>
> Something that I thought about lately is how the existing URL parser in PHP is used in various different places.
> So for example, in the http fopen wrapper or in the filter extension we rely on the built-in URL parser.
> I think it would be beneficial if a URL parser was "pluggable" and the url extension could be used instead of the current one for those usages (opt-in).
>
> Kind regards
> Niels

Hi Niels,

As mentioned before, I believe the "pluggable" system can only be applied once an RFC3986 URL object is available; using the WHATWG URL would constitute a major BC break. I would even go a step further and state that even with the RFC3986 URL object you would still face some issues, for instance with regard to `file` scheme based URLs. Those are not parsed the same way by the `parse_url` function and under RFC3986 rules. Maybe that change could land in PHP 9, or the behaviour could be deprecated and then removed in PHP 10, whenever that one happens.

On Sun, Jul 21, 2024 at 1:22 PM Niels Dossche wrote:

> On 28/06/2024 22:06, Máté Kocsis wrote:
> > Hi Everyone,
> >
> > I've been working on a new RFC for a while now, and time has come to present it to a wider audience.
> >
> > Last year, I learnt that PHP doesn't have built-in support for parsing URLs according to any well established standards (RFC 1738 or the WHATWG URL living standard), since the parse_url() function is optimized for performance instead of correctness.
> >
> > In order to improve compatibility with external tools consuming URLs (like browsers), my new RFC would add a WHATWG compliant URL parser functionality to the standard library. The API itself is not final by any means, the RFC only represents how I imagined it first.
> >
> > You can find the RFC at the following link: https://wiki.php.net/rfc/url_parsing_api
> >
> > Regards,
> > Máté
> >
>
> Hi Máté
>
> Something that I thought about lately is how the existing URL parser in PHP is used in various different places.
> So for example, in the http fopen wrapper or in the filter extension we rely on the built-in URL parser.
> I think it would be beneficial if a URL parser was "pluggable" and the url extension could be used instead of the current one for those usages (opt-in).
>
> Kind regards
> Niels