Newsgroups: php.internals
Xref: news.php.net php.internals:126524
List-Id: internals.lists.php.net
Date: Fri, 28 Feb 2025 10:26:48 +0100
From: rob@bottled.codes ("Rob Landers")
To: Lynn
Cc: "Faizan Akram Dar", "Paul M. Jones", "ignace nyamagana butera", "Tim Düsterhus", "Gina P. Banyard", "PHP Internals List"
Message-ID: <7baeea00-b41d-47ea-9edd-dd8005a9fc14@app.fastmail.com>
Subject: Re: [PHP-DEV] [RFC] [Discussion] Add WHATWG compliant URL parsing API

On Fri, Feb 28, 2025, at 09:38, Lynn wrote:
> On Fri, Feb 28, 2025 at 12:05 AM Rob Landers <rob@bottled.codes> wrote:
>> On Thu, Feb 27, 2025, at 22:01, Faizan Akram Dar wrote:
>>> Hi,
>>>
>>> On Thu, 27 Feb 2025, 20:55 Paul M. Jones, <pmjones@pmjones.io> wrote:
>>>> > On Feb 25, 2025, at 09:55, ignace nyamagana butera <nyamsprod@gmail.com> wrote:
>>>> >
>>>> > The problem with your suggestion is that the specifications from WHATWG and RFC 3986/3987 are so different that the function you are proposing won't be able to cover the outcome correctly (i.e., give the developer all the needed information). This is why, for instance, Maté added the getRaw* methods alongside the normalized getters (methods without the Raw prefix).
>>>>
>>>> The two functions need not return an identical array of components; e.g., the 3986 parsing function might return an array much like parse_url() does now, and the WHATWG function might return a completely different array of components (one that includes the normalized and/or raw components).
>>>>
>>>> All of this is to say that the parsing functionality does not have to be in an object to be useful *both* to the internal API *and* to userland.
>>>
>>> It most definitely needs to be an object. Arrays are awful DX-wise. There are array shapes, which modern IDEs like PhpStorm and static analysis tools support, but the overall experience remains subpar compared to classes (and objects).
>>
>> I’m curious why you say this, other than as an opinion about developer experience? Arrays are values, objects are not. A parsed URI seems more like a value and less like an object. Just reading through the comments so far, it appears that whatever is used will just be wrapped in library code regardless, for userland code, but the objective is to be useful for other extensions and core code. In that case, a hashmap is much easier to work with than a class.
>>
>> Looking at the objectives of the RFC and the comments here, it almost sounds like it is begging to be a simple array instead of an object.
>>
>> — Rob
>
> Depends on there being the intention to have it as a parameter type. If it's designed to be passed around to functions, I really don't want it to be an array. I am maintaining a legacy codebase where arrays are being used as hashmaps pretty much everywhere, and it's error-prone. We lose all kinds of features like "find usages" and refactoring key/property names. Silly typos in array keys, with no actual validation of any kind, cause null values and annoying-to-find bugs.
>
> I agree that hashmaps can be really easy to use, but not as data structures outside of the function/method scope they were defined in. If value vs object semantics are important here, then something that is forward compatible with whatever structs may hold in the future could be interesting.

I meant hashmaps from within C, not within PHP. If it is just going to be wrapped in userland libraries, as people seem to be suggesting in this thread, then you only have to get it right once, and it is easy to use from C.

— Rob
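
For readers following the array-versus-object argument in this thread, here is a minimal PHP sketch of the two styles. It is only an illustration under stated assumptions: parse_url() is the existing core function, while the ParsedUrl class and its fromArray() helper are hypothetical names invented for this example and are not the API proposed in the RFC. It assumes PHP 8.1 or later (readonly properties).

```php
<?php
declare(strict_types=1);

// Array style: parse_url() returns an associative array of components.
$parts = parse_url('https://example.com:8080/path?x=1');

// A mistyped key is just another string to the engine. With ?? it
// silently evaluates to null, which is the kind of bug Lynn describes.
$host = $parts['hots'] ?? null; // typo for 'host'

// Arrays have value semantics: assignment copies, so changing the copy
// leaves the original untouched (Rob's "arrays are values" point).
$copy = $parts;
$copy['host'] = 'other.test';

// Object style. ParsedUrl is a HYPOTHETICAL wrapper for illustration only.
final class ParsedUrl
{
    public function __construct(
        public readonly ?string $scheme,
        public readonly ?string $host,
        public readonly ?int $port,
        public readonly ?string $path,
    ) {
    }

    // Hypothetical helper: the array-to-object mapping is written once here.
    public static function fromArray(array $parts): self
    {
        return new self(
            $parts['scheme'] ?? null,
            $parts['host'] ?? null,
            $parts['port'] ?? null,
            $parts['path'] ?? null,
        );
    }
}

$url = ParsedUrl::fromArray($parts);

// Property names are declared on the class, so IDEs and static analysers
// can offer "find usages", renaming, and typo detection for $url->host.
echo $url->host, "\n";
```

The wrapper is roughly what the thread means by "wrapped in userland libraries": the array keys are touched in one place, and the rest of the code works with declared, typed properties.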