Newsgroups: php.internals
Path: news.php.net
Xref: news.php.net php.internals:117969
Return-Path: <larry@garfieldtech.com>
Delivered-To: mailing list internals@lists.php.net
Received: (qmail 47914 invoked from network); 16 Jun 2022 18:00:02 -0000
Received: from unknown (HELO php-smtp4.php.net) (45.112.84.5)
  by pb1.pair.com with SMTP; 16 Jun 2022 18:00:02 -0000
Received: from php-smtp4.php.net (localhost [127.0.0.1])
	by php-smtp4.php.net (Postfix) with ESMTP id 00330180211
	for <internals@lists.php.net>; Thu, 16 Jun 2022 12:48:06 -0700 (PDT)
X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on php-smtp4.php.net
X-Spam-Level: 
X-Spam-Status: No, score=-2.8 required=5.0 tests=BAYES_00,DKIM_SIGNED,
	DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_LOW,SPF_HELO_PASS,
	SPF_NONE,T_SCC_BODY_TEXT_LINE autolearn=no autolearn_force=no
	version=3.4.2
X-Spam-ASN: AS29838 64.147.123.0/24
X-Spam-Virus: No
X-Envelope-From: <larry@garfieldtech.com>
Received: from wout4-smtp.messagingengine.com (wout4-smtp.messagingengine.com [64.147.123.20])
	(using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)
	 key-exchange ECDHE (P-256) server-signature RSA-PSS (2048 bits) server-digest SHA256)
	(No client certificate requested)
	by php-smtp4.php.net (Postfix) with ESMTPS
	for <internals@lists.php.net>; Thu, 16 Jun 2022 12:48:05 -0700 (PDT)
Received: from compute1.internal (compute1.nyi.internal [10.202.2.41])
	by mailout.west.internal (Postfix) with ESMTP id 52519320091B
	for <internals@lists.php.net>; Thu, 16 Jun 2022 15:48:03 -0400 (EDT)
Received: from imap52 ([10.202.2.102])
  by compute1.internal (MEProxy); Thu, 16 Jun 2022 15:48:03 -0400
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=
	garfieldtech.com; h=cc:content-transfer-encoding:content-type
	:date:date:from:from:in-reply-to:in-reply-to:message-id
	:mime-version:references:reply-to:sender:subject:subject:to:to;
	 s=fm2; t=1655408882; x=1655495282; bh=Asst2VY+DT8z1pweCjnFvBXev
	0G5BUoKt0KEOcx/gec=; b=gNAKADNm4Dpva3kIWqSz8T8kCaF7xSVea6t7pvd6l
	7RElg9cAapswUt8yMxJu3Rs6UXxkRtd/DEb2uf/Z3LfPZQYr6cOmoc3MA+Hly5k5
	fdCSsoftimGoNWcb8AFt7Cb/TcDYMl188V9B3dhN9Ubi+D/+k0gxNHvYWNT0HLwO
	Sa+DqhorliF4OuAafYGIcVkwculI0GKHvZ9jmQEIwHIZyh5Agx2O5fHSqQwUIsPh
	KSft8vPGhmYIyZJRL1ivYrNFpqujttIwHq0MTe65rMWvhQ4/xsj6Z24Q/SclxB3S
	8xrvmJh/d5jIZX153G99/22C9pOz0afx94rFiarezte4g==
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=
	messagingengine.com; h=cc:content-transfer-encoding:content-type
	:date:date:feedback-id:feedback-id:from:from:in-reply-to
	:in-reply-to:message-id:mime-version:references:reply-to:sender
	:subject:subject:to:to:x-me-proxy:x-me-proxy:x-me-sender
	:x-me-sender:x-sasl-enc; s=fm2; t=1655408882; x=1655495282; bh=A
	sst2VY+DT8z1pweCjnFvBXev0G5BUoKt0KEOcx/gec=; b=DVxdBQMESF4zuVIJs
	DjfLW/tF6ohSAWGgfzzw+IFKoOhJpatp7WxbXdF8S76iRAgnPyztRN52t8CDm30q
	GcgsKCbevNlLF0aAKxTQX3TBbgxcfwisOAc0cXnCn5RioJiHt4GLMKPkjX2IAcM1
	gaDy5aJVPLFD29fVh4VuSOqkE/YCBYWUhkeQkp5ulRQyCDUBPaEpPIjvDnBW26qs
	7AbgPsRlPXPsrVbd41jfDPOKI10aL3yQQrY96ZunAkfGzeglVjo3oT+6nEhxE9ht
	baiCu/fKVq40fopXoBe0PN/WxOe7Pcsww6s0C04Klh6rEAnDoCN22NeSI10N8+Fl
	Teivw==
X-ME-Sender: <xms:8oirYoI9jpdXyXeBNn4btfJixzuIhO8hgyXrRJ23b70FvXIzmv8s-A>
    <xme:8oirYoIYoQo8ycs7D9QR8kQ048G_5QP71bbJHcrPULfCvIqNhOP14Qrg7jonnvQrv
    WYqQRoB_BDWOg>
X-ME-Proxy-Cause: gggruggvucftvghtrhhoucdtuddrgedvfedruddvfedgudefhecutefuodetggdotefrod
    ftvfcurfhrohhfihhlvgemucfhrghsthforghilhdpqfgfvfdpuffrtefokffrpgfnqfgh
    necuuegrihhlohhuthemuceftddtnecusecvtfgvtghiphhivghnthhsucdlqddutddtmd
    enucfjughrpefofgggkfgjfhffhffvufgtgfesthhqredtreerjeenucfhrhhomhepfdfn
    rghrrhihucfirghrfhhivghlugdfuceolhgrrhhrhiesghgrrhhfihgvlhguthgvtghhrd
    gtohhmqeenucggtffrrghtthgvrhhnpedtgfdtvefghfejhfffgeeuledttdeihfeugfdt
    uefgheejgeejvdeitedtheffkeenucffohhmrghinhepphgvrghkugdrtghomhenucevlh
    hushhtvghrufhiiigvpedtnecurfgrrhgrmhepmhgrihhlfhhrohhmpehlrghrrhihsehg
    rghrfhhivghlughtvggthhdrtghomh
X-ME-Proxy: <xmx:8oirYouT1I0fs1zlu1XwzP0wthyCUn0Qzrw9bbSqjJPzFShhbeqYgQ>
    <xmx:8oirYlYgGTNLDpoLTX8bxT_KzhGckfDcBCWPmYR49lTCjf9ioFo-QA>
    <xmx:8oirYvbMS_TsJcf7w7QM2YOhdDExTFOOEq9X7CdFSO1dpmJvDuXQwQ>
    <xmx:8oirYnq6n0vr64tA_1l82vwbD1W5FlzB05-GnEgTBKJujjo0-M21xA>
Feedback-ID: i8414410d:Fastmail
Received: by mailuser.nyi.internal (Postfix, from userid 501)
	id 7BA09C6008B; Thu, 16 Jun 2022 15:48:02 -0400 (EDT)
X-Mailer: MessagingEngine.com Webmail Interface
User-Agent: Cyrus-JMAP/3.7.0-alpha0-712-gb9e94258b0-fm-20220610.001-gb9e94258
Mime-Version: 1.0
Message-ID: <b8333cb6-eaa0-4170-a3ba-29eadb2fd8fa@www.fastmail.com>
In-Reply-To: 
 <CAOfKkdRzoo7JuD07F8nkZBYvS5eDMiV2r6cNV-FgeLxBrdCsiQ@mail.gmail.com>
References: 
 <CAOfKkdRzoo7JuD07F8nkZBYvS5eDMiV2r6cNV-FgeLxBrdCsiQ@mail.gmail.com>
Date: Thu, 16 Jun 2022 14:47:41 -0500
To: "php internals" <internals@lists.php.net>
Content-Type: text/plain;charset=utf-8
Content-Transfer-Encoding: quoted-printable
Subject: Re: [PHP-DEV] Discussion about new Curl URL API and ext/curl improvements
From: larry@garfieldtech.com ("Larry Garfield")

On Thu, Jun 16, 2022, at 2:10 AM, Pierrick Charron wrote:
> Hi internals,
>
> Since its version 6.62.0 [1], libcurl features a brand new URL API [2]=
 that
> can be used to parse and generate URLs, using libcurl=E2=80=99s own pa=
rser. One of
> the goals of this API is to tighten a problematic vulnerable area for
> applications where the URL parser library would believe one thing and
> libcurl another. This could and has sometimes led to security problems=
 [3].
> The new API is handled base and composed of 5 new functions [4] (and o=
ne
> more since 7.80.0 to get more verbose information upon error).
>
> I started to work on an implementation [5] to expose this API to PHP
> userland in the curl extension so that PHP users can benefit from it. =
My
> first reflex was to stay consistent on how the extension is currently
> working and do a one to one mapping of the new curl functions. I got
> feedback from both Ilija and Christoph saying that the API is, I quote=
, "a
> bit clunky" which I can't disagree with.
>
> For a long time, ext/curl worked with handles as "curl resources" and
> functions mapped to the libcurl functions, but since PHP8.0 "curl
> resources" were converted to opaque classes with no methods. So my main
> goal here is to come up with a solution for the new Curl URL API, but =
maybe
> also to be consistent, make some changes to existing curl classes like
> `CurlHandle` to add methods and improve the current state of the exten=
sion.
>
> Here is the solution I would propose :
> - First of course keep all the current ext/curl functions as is not to
> break any compatibility (Seems obvious but just to be sure everybody i=
s on
> the same page).
> - For consistency expose the new Curl URL API as functions mapped one =
to
> one to libcurl functions :
>
> function curl_url(?string $url =3D null): CurlUrl|false {}
> function curl_url_set(CurlUrl $url, int $part, string $content, int $f=
lags
> =3D 0): bool {}
> function curl_url_get(CurlUrl $url, int $part, int $flags =3D 0):
> string|false {}
> function curl_url_errno(CurlUrl $url): int {}
> function curl_url_strerror(int $error_code): ?string {}
>
> - Add methods to the CurlUrl object to make it less opaque and expose =
an
> object oriented style API. I would keep it minimal and let userlanAd A=
PI
> provide higher level APIs as guzzle for example. (You can see the curr=
ent
> implementation [5])
>
> final class CurlUrl implements Stringable
> {
>     public function __construct(?string $url =3D null) {}
>     public function get(int $part, int $flags =3D 0): string {}
>     public function set(int $part, string $content, int $flags =3D 0):=
 void {}
>     public function getErrno(): int {}
>     public function __toString(): string {}
>     public function __clone() {}
> }
>
> - It would also be nice to add this object oriented API for existing
> CurlHandle, CurlMultiHandle and CurlShareHandle classes. For example t=
he
> CurlHandle class would look like that (First implementation [6])
>
> final class CurlHandle
> {
>     public function __construct(?string $url =3D null) {}
>     public function setOpt(int $option, mixed $value): bool {}
>     public function getInfo(?int $option =3D null): mixed {}
>     public function exec(): string|bool {}
>     public function escape(string $string): string|false {}
>     public function unescape(string $string): string|false {}
>     public function pause(int $flags): int {}
>     public function getErrno(): int {}
>     public function reset(): void;
>     public function setOptArray(array $options): bool {}
>     public function upkeep(): bool {}
>     public function __clone() {}
> }
>
> As of right now I still have some unanswered questions like how should=
 we
> handle errors on the new CurlUrl API ?
> - Throw `CurlUrlException` on both the procedural and object oriented =
style
> API (that's how current implementation works [5])
> - Throw `CurlUrlException` with the oriented style API, but return for
> example boolean/null on errors when user uses the procedural API ?
> - Always return null/booleans using both object oriented API and
> procedural API ?
>
> The same question applies if we add a new object oriented style API on
> existing classes. Current functions MUST stay unchanged but should we =
throw
> `CurlException` when the user uses the object oriented style API ? (Th=
at's
> what I would do) Or should we return the same result from OO and proce=
dural
> API ? Should we make the new CurlApi consistent with that ?
>
> What are your thoughts about all this ? Any feedback on any of those
> subjects are more than welcome.
>
> Pierrick


Like most of the responders so far, I would say skip the procedural API,=
 just go OOP and be done with it.  However, I would go a step further an=
d say it should be a *good* OOP API, not just the Curl procedural API wi=
th funny syntax.

For example:

public function get(int $part, int $flags =3D 0): string {}

Absolutely not. :-)  It should be public function getHost(), public func=
tion getScheme(), etc.  The `int $part` bit is frankly bad design even b=
y procedural standards, and has no place in a modern library.  I don't e=
ven know what the $flags are or could be, which means that's also a bad =
design.  They should either be omitted, handled via separate methods, or=
 replaced with named arguments with reasonable defaults.  (We can do tha=
t now, yay!)  Translating those into Curl's nutty flag API is something =
that should be done once, in the wrapping extension/object, and no PHP u=
ser space developer should have to think about it.

Similarly, getErrno() should be replaced.  If a URL cannot even be parse=
d at all, that's an exception.  Not every error should be an exception (=
I recently wrote extensively on the subject here: https://peakd.com/hive=
-168588/@crell/much-ado-about-null), but in this case "I don't know WTF =
to do with this string you gave my constructor" is an exception.  If ins=
tead of a constructor it had a static factory method (or multiple), then=
 we could discuss those returning some other sentinel value instead to i=
ndicate failure, but "there was an error, go call this other thing to fi=
nd out what it was" is a fundamentally flawed API design that should not=
 be replicated.

In short, don't put a wrapper on the existing Curl API that happens to u=
se ->.  Design a good, usable, PHP-developer-friendly API, and put Curl =
behind it.  Trying to just expose a crappy API and make it slightly less=
 crappy is not a good use of time.

--Larry Garfield