Newsgroups: php.internals
From: kocsismate90@gmail.com (Máté Kocsis)
Date: Sun, 30 Jun 2024 08:00:00 +0200
Subject: Re: [PHP-DEV] [RFC] [Discussion] Add WHATWG compliant URL parsing API
To: Niels Dossche
Cc: internals@lists.php.net
In-Reply-To: <71a73b87-cc2f-4ee5-a961-7bf2b191fbb6@gmail.com>

Hi Niels,

First of all, thank you for your support!

> Why did you choose UrlParser to be a "static" class? Right now it's just a
> fancy namespace.

That's a good question, so let me explain. One of my major design goals was to
make the UrlParser class extensible and configurable (e.g. via an "engine"
property, similar to what Random\Randomizer has). Of course, UrlParser doesn't
support any of this yet, but because the class is final, the possibility
remains open for follow-up RFCs.

Since requiring a UrlParser instance would be overkill for a stateless task
such as URL parsing, I settled on static methods. Later, if the need arises,
the static methods could be converted to non-static ones with minimal BC impact.
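
To make the comparison concrete, here is a minimal sketch, not part of the RFC. The Random\Randomizer lines use the real PHP 8.2+ API, where the engine is passed to the constructor. SketchUrlParser is a hypothetical stand-in for a final, stateless parser with a static method; its name and the delegation to parse_url() are assumptions made only for illustration.

```php
<?php

use Random\Engine\Mt19937;
use Random\Randomizer;

// Real precedent (PHP 8.2+): the Randomizer is configured through its engine.
$randomizer = new Randomizer(new Mt19937(1234));
$number = $randomizer->getInt(1, 6); // always within 1..6

// Hypothetical stand-in for a final, stateless parser exposed via a static
// method. The class name and the parse_url() delegation are assumptions.
final class SketchUrlParser
{
    public static function parse(string $url): array|false
    {
        return parse_url($url);
    }
}

// No instance is needed for a stateless call.
$parts = SketchUrlParser::parse('https://example.com/path');
```

Because the class is final, no userland subclass can depend on the methods being static, which is what would keep a later switch to instance methods cheap.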

> It's a bit of a shame that the PSR interface treats queries as strings. In
> Javascript we have the URLSearchParams class that we can use as a key-value
> storage for query parameters.

Hm, yes, I agree with that observation. However, this restriction shouldn't
prevent follow-ups from adding key-value storage support for query parameters.
That said, as far as I could determine, Lexbor isn't capable of this currently
either.
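
For context, this is roughly how key-value access to a query string works in PHP today, using only existing functions (parse_url(), parse_str(), http_build_query()). It is a sketch of current behaviour, not of anything proposed in the RFC, and the example URL is made up. Note that parse_str() follows PHP's own rules (e.g. bracket syntax for arrays), which differ from URLSearchParams.

```php
<?php

// Today, the query component comes back as a plain string...
$query = parse_url('https://example.com/search?q=php&page=2', PHP_URL_QUERY);

// ...and key-value access needs a separate step with parse_str().
parse_str($query, $params);

// Modify one parameter and serialize the map back into a query string.
$params['page'] = '3';
$rebuilt = http_build_query($params);
```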

> Why is UrlComponent a backed enum?

To be honest, there is no specific reason apart from that's what I am used to.
I'm fine with either choice, even with getting rid of UrlComponent completely.
I added the UrlParser::parseUrlComponent() method (and hence the UrlComponent
enum) to the proposal in order to have a direct replacement for parse_url()
when it's called with the $component parameter set, but I wasn't really sure
whether this is needed at all. So I'm eager to hear any recommendations
regarding this.
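
As a reference point, the sketch below shows the existing parse_url() call with the $component parameter set, next to a pure (non-backed) enum playing the same role. SketchUrlComponent, its two cases, and sketchParseComponent() are hypothetical names chosen for illustration; they are not the enum cases or the method signature from the RFC.

```php
<?php

// Existing behaviour: parse_url() with the $component parameter set.
$host = parse_url('https://example.com:8080/path?x=1', PHP_URL_HOST);
$port = parse_url('https://example.com:8080/path?x=1', PHP_URL_PORT);

// Hypothetical pure (non-backed) enum; the case names are assumptions.
enum SketchUrlComponent
{
    case Host;
    case Port;
}

// Hypothetical helper that maps an enum case to the existing constants.
function sketchParseComponent(string $url, SketchUrlComponent $component): string|int|null|false
{
    return parse_url($url, match ($component) {
        SketchUrlComponent::Host => PHP_URL_HOST,
        SketchUrlComponent::Port => PHP_URL_PORT,
    });
}

$viaEnum = sketchParseComponent('https://example.com:8080/path?x=1', SketchUrlComponent::Host);
```

A pure enum is enough here because the cases only select a component; a backing value would only matter if the cases had to be serialized or mapped from scalars.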

> A nit: We didn't bundle the entire Lexbor engine, only select parts of it.
> Just thought I'd make it clear.

Yes, my wording was slightly misleading. I'll clarify this in the RFC.

> About edge cases: e.g. what happens if I call the Url constructor and leave
> every string field empty?

Nothing :) The Url class in its current form can store invalid URLs. I know
that URLs are generally modeled as value objects (that's also why the proposed
class is immutable), and generally speaking, value objects should protect
their invariants. However, because the parser is separated into its own class,
I abandoned this "rule". So this is one more downside of the current API.
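
For comparison, this is what "protecting invariants" could look like in an immutable value object. It is a deliberately simplified sketch: SketchUrl, its two fields, and the non-empty check are assumptions for illustration and do not mirror the Url class in the RFC. Real URL validity rules are far more involved (some schemes have no host at all, for example).

```php
<?php

// Hypothetical immutable value object; the class name and its two fields
// are assumptions and do not mirror the Url class proposed in the RFC.
final class SketchUrl
{
    public function __construct(
        public readonly string $scheme,
        public readonly string $host,
    ) {
        // Simplified invariant: refuse to construct an obviously empty URL.
        if ($scheme === '' || $host === '') {
            throw new InvalidArgumentException('Scheme and host must not be empty');
        }
    }
}

$url = new SketchUrl('https', 'example.com');

// Leaving every string field empty is rejected instead of silently stored.
try {
    new SketchUrl('', '');
    $rejected = false;
} catch (InvalidArgumentException $e) {
    $rejected = true;
}
```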

Regards,
Máté