Newsgroups: php.internals
From: lnearwaju@gmail.com (Lanre)
Date: Fri, 19 Jul 2024 16:55:27 -0600
Subject: Re: [PHP-DEV] [RFC] [Discussion] Add WHATWG compliant URL parsing API
To: Máté Kocsis
Cc: PHP Internals List

On Mon, Jul 8, 2024 at 11:24 AM Lanre wrote:

> On Fri, Jun 28, 2024 at 3:38 PM Máté Kocsis wrote:
>
>> Hi Everyone,
>>
>> I've been working on a new RFC for a while now, and time has come to
>> present it to a wider audience.
>>
>> Last year, I learnt that PHP doesn't have built-in support for parsing
>> URLs according to any well established standards (RFC 1738 or the WHATWG
>> URL living standard), since the parse_url() function is optimized for
>> performance instead of correctness.
>>
>> In order to improve compatibility with external tools consuming URLs
>> (like browsers), my new RFC would add a WHATWG compliant URL parser
>> functionality to the standard library. The API itself is not final by any
>> means, the RFC only represents how I imagined it first.
>>
>> You can find the RFC at the following link:
>> https://wiki.php.net/rfc/url_parsing_api
>>
>> Regards,
>> Máté
>>
>
> I was exploring wrapping ada_url for PHP
> (https://github.com/lnear-dev/ada-url). It works, but it's a bit slower,
> likely due to the implementation of the objects. I was planning to embed
> the zvals directly in the object, similar to PhpToken, but I haven't had
> the chance and don't really need it anymore. Shouldn't be too much work to
> clean it up though
>

I've updated the implementation, and with Ada 2.9.0, the performance is now
closer to `parse_url` for short URLs and even outperforms it for longer
URLs. You can see the benchmarks in the "Run benchmark script" section of
[this GitHub Actions run](https://github.com/lnear-dev/ada-url/actions/runs/9982725628/job/27589011554).

cheers,
Lanre