Newsgroups: php.internals
Xref: news.php.net php.internals:127115
Message-ID: <2fde0354-7d45-47a7-8ac1-b9b8eea8b3e7@gmail.com>
Date: Tue, 15 Apr 2025 19:12:37 +0200
List-Id: internals.lists.php.net
Subject: Re: [PHP-DEV] [RFC] [Discussion] Add WHATWG compliant URL parsing API
From: nyamsprod@gmail.com (Ignace Nyamagana Butera)
To: Tim Düsterhus, Máté Kocsis
Cc: PHP Internals List
In-Reply-To: <33427cd03035ef084245c44290b56a55@bastelstu.be>
References: <1BCB4144-231D-45EA-A914-98EE8F0F503A@automattic.com> <8E614C9C-BA85-45D8-9A4E-A30D69981C5D@automattic.com> <8df04e01-deac-404b-beb7-cd982423db63@bastelstu.be> <33427cd03035ef084245c44290b56a55@bastelstu.be>

> Perhaps the correct solution would be to offer only the non-raw
> methods for WHATWG URL and to not attempt any additional
> percent-decoding there? My reasoning is that the WHATWG URL is a
> living standard anyways, so trying to add additional semantics on top
> will result in sadness. My understanding is also that it is primarily
> intended for interaction with web browsers or to embed these URLs into
> HTML. For access control, e.g. in your framework the RFC3986 URI
> should be used. It's what HTTP uses internally and it supports
> well-defined normalization.
>
> What do you think?

Hi Tim and Máté,

As a primary user of RFC 3986/3987, and given my experience with the WHATWG URL specification, I fully support removing the `raw` methods from the WhatWgUrl implementation. The specification defines parsing, validation, and normalization in one go via a state machine, so you are basically always working with normalized URLs. I believe JavaScript developers and browser vendors expect normalization out of the box, for security and for consistency between browsers. So in the context of browsers, raw values are neither expected nor wanted.

I have always wondered how you could extract a raw value at all, since the specification always talks about code points and normalizes the input while parsing it. As Tim also pointed out, WHATWG URL is a living standard, so the URL produced today may not be the one produced tomorrow. That would add to the maintenance burden if you constantly need to update how raw values are extracted from a specification that does not even consider them (it offers no official way to access them). Last but not least, I have tried several times to implement a polyfill for the WHATWG URL and failed for that specific reason.

I keep going back to my initial comment: both specs are great in that they complement each other. They may overlap, but they are fundamentally different, so their public APIs should probably reflect that as well (e.g. WHATWG URL supports IDN hosts while RFC 3986 does not, the encoding of the query string differs, and so on). Trying to offer the same API for both, even for the `raw` methods, is IMHO not helping. Dropping them would probably even ease your implementation, since you would not have to worry about as many edge cases.

Best regards,
Ignace Nyamagana Butera
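
For illustration only, here is a minimal TypeScript sketch of the "always normalized" behaviour described above. It relies on the standard global `URL` class that browsers and Node.js provide as their WHATWG URL implementation. It is not the proposed PHP API, and `parseNormalized` is a hypothetical helper name introduced just for this example.

```typescript
// Assumption: a runtime exposing the standard global `URL` class
// (browsers, Node.js, Deno). `parseNormalized` is a hypothetical helper.
function parseNormalized(input: string): { href: string; host: string; pathname: string } {
  // Parsing, validation and normalization happen in this single step;
  // invalid input throws a TypeError.
  const url = new URL(input);
  return { href: url.href, host: url.host, pathname: url.pathname };
}

const parsed = parseNormalized("HTTP://EXAMPLE.com:80/a/./b/../c");

// The scheme and host are lowercased, the default port is dropped and
// the dot segments are resolved.
console.log(parsed.href); // http://example.com/a/c

// An IDN host is accepted and exposed in its ASCII (Punycode) form.
console.log(new URL("https://bücher.example/").hostname);
```

Note that the `URL` object exposes only these normalized components. It has no getter that returns the original, pre-normalization input, which is the gap a `raw` accessor would have to fill on its own.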