Newsgroups: php.internals,php.internals Path: news.php.net Xref: news.php.net php.internals:126502 php.internals:126503 X-Original-To: internals@lists.php.net Delivered-To: internals@lists.php.net Received: by qa.php.net (Postfix, from userid 65534) id 545101A00CF; Tue, 25 Feb 2025 16:00:33 +0000 (UTC) To: internals@lists.php.net,Nicolas Grekas , internals@lists.php.net Message-ID: <9265d300-6f9b-43a3-8479-c280bfab42d3@gmail.com> Date: Tue, 25 Feb 2025 17:00:32 +0100 Precedence: bulk list-help: list-post: List-Id: internals.lists.php.net x-ms-reactions: disallow MIME-Version: 1.0 User-Agent: Mozilla Thunderbird Reply-To: nyamsprod@gmail.com Subject: Re: [PHP-DEV] [RFC] [Discussion] Add WHATWG compliant URL parsing API References: <1BCB4144-231D-45EA-A914-98EE8F0F503A@automattic.com> <8E614C9C-BA85-45D8-9A4E-A30D69981C5D@automattic.com> <9bf11a89-39d9-457b-b0ea-789fd07d7370@gmail.com> Content-Language: fr In-Reply-To: Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 8bit X-Posted-By: 94.225.114.231 From: nyamsprod@gmail.com (ignace nyamagana butera) On 24/02/2025 12:08, Nicolas Grekas wrote: > Hi, > > Thanks for all the efforts making this RFC happen, it'll be a game > changer in the domain! > > I'm seeing a push to make the classes final. Please don't! > This would badly break the open/closed principle to me. > > When shipping a new class, one ships two things: a behavior and a type. > The behavior is what some want to close by making the class final. But > the result is that the type will also be final. And this would lead to a > situation where people tighly couple their code to one single > implementation - the internal one. > > The situation I'm telling about is when one will accept an argument > described as > function (\Uri\WhatWg\Url $url) > > If the Url class is final, this signature means only one possible > implementation can ever be passed: the native one. Composition cannot be > achieve because there's no type to compose. > > Fine-tuning the behavior provided by the RFC is what we might be most > interested in, but we should not forget that we also ship a type. By > making the type non-final, we keep things open enough for userland to > build on it. If not, we're going to end up with a fragmented community: > some will tightly couple to the native Url implementation, some others > will define a UriInterface of their own and will compose it with the > native implementation, all these with non-interoperable base types of > course, because interop is hard. > > By making the classes non-final, there will be one base type to build > upon for userland. > (the alternative would be to define native UrlInterface, but that'd > increase complexity for little to no gain IMHO - althought that'd solve > my main concern). > > > > 5 - Can the returned array from __debugInfo be used in a "normal" > > method like `toComponents` naming can be changed/improve to ease > > migration from parse_url or is this left for userland library ? > > I would prefer not expose this functionality for the same reason that > there are no raw properties provided: The user must make an explicit > choice whether they are interested in the raw or in the normalized > version of the individual components. > > > The RFC is also missing whether __debugInfo returns raw or non-raw > components. Then, I'm wondering if we need this per-component break for > debugging at all? It might be less confusing (on this encoding aspect) > to dump basically what __serialize() returns (under another key than > __uri of course). > This would also close the avenue of calling __debugInfo() directly (at > the cost of making it possibly harder to move away from parse_url(), but > I don't think we need to make this simpler - getting familiar with the > new API before would be required and welcome actually.) > > It can make sense to normalize a hostname, but not the path. My usual > example against normalizing the path is that SAML signs the *encoded* > URI instead of the payload and changing the case in percent-encoded > characters is sufficient to break the signature > > > I would be careful with this argument: signature validation should be > done on raw bytes. Requiring an object to preserve byte-level accuracy > while the very purpose of OOP is to provide abstractions might be > conflicting. The signing topic can be solved by keeping the raw signed > payload around. > > Hi Nicolas, > > 5 - Can the returned array from __debugInfo be used in a "normal" > > method like `toComponents` naming can be changed/improve to ease > > migration from parse_url or is this left for userland library ? > > I would prefer not expose this functionality for the same reason that > there are no raw properties provided: The user must make an explicit > choice whether they are interested in the raw or in the normalized > version of the individual components. I only mention this because I saw the debugInfo method being implemented. TBH I would be more be in favor of removing the method all together I fail to see the added value of such method unless we want to hide the class internal property in which case it should then "just" show the raw URL and nothing more.