Newsgroups: php.internals Path: news.php.net Xref: news.php.net php.internals:117958 Return-Path: Delivered-To: mailing list internals@lists.php.net Received: (qmail 84247 invoked from network); 16 Jun 2022 05:22:26 -0000 Received: from unknown (HELO php-smtp4.php.net) (45.112.84.5) by pb1.pair.com with SMTP; 16 Jun 2022 05:22:26 -0000 Received: from php-smtp4.php.net (localhost [127.0.0.1]) by php-smtp4.php.net (Postfix) with ESMTP id 9A584180212 for ; Thu, 16 Jun 2022 00:10:22 -0700 (PDT) X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on php-smtp4.php.net X-Spam-Level: X-Spam-Status: No, score=-1.7 required=5.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,HTML_MESSAGE,RCVD_IN_DNSWL_NONE, RCVD_IN_MSPIKE_H2,SPF_HELO_NONE,SPF_PASS,T_SCC_BODY_TEXT_LINE autolearn=no autolearn_force=no version=3.4.2 X-Spam-ASN: AS15169 209.85.128.0/17 X-Spam-Virus: No X-Envelope-From: Received: from mail-lj1-f179.google.com (mail-lj1-f179.google.com [209.85.208.179]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange ECDHE (P-256) server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by php-smtp4.php.net (Postfix) with ESMTPS for ; Thu, 16 Jun 2022 00:10:21 -0700 (PDT) Received: by mail-lj1-f179.google.com with SMTP id l20so556382lji.0 for ; Thu, 16 Jun 2022 00:10:21 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:mime-version:from:date:message-id:subject:to; bh=gLOu2WXGsJxLYa4lMN8dGK0SUTJUGcrxLhOt4zJDgt8=; b=U6W1Z/94qchOUFv9lBSKnm5pvFeMbEHqOOgYkcZcfBNVHCsT73dIO1HAztfRS8LP2j o5JZ4TdBALZw7JcUVEzifMtv13e0xRFP3/bu6losNbkWOQTHgeRDQgsdSAi+xYdrslJ4 CoaWaZ90V5Qndi0JfH5KzTLd3NAIO9FgiBQ8zQcR+JBZwfHYwlowwy0smuIMNK+QNBhS NMavxjS4CrBBdk9ysYLdC2PUGHc8CdcGDVO8rNaJJoYQkep/QolMXyEEQ1/NYa4z166M Y4ngdwm6dP8OEdILWRvYP9ZLCVKbWGfu9nPkQzT9HIifPXWeU9KzTBlB0EtUj5BSI642 egZQ== X-Gm-Message-State: AJIora+uV6fhLIVLlleolJU62XUtaUOcNlxPoYelGg5Pu9Foxl1R1g1m TrlBMhkRdnlyxeSHv1wGExL+JRQyM7brRZXGRJYG8xwMGNXeMw== X-Google-Smtp-Source: AGRyM1t4Qm0MCaPe6L4DcxOrsqwSLtu8Y1ygDL3Bf1tRer64NtoISYq0sGxHiYklT+HZiZyqbKIyEzesaaA2pJtLVFM= X-Received: by 2002:a05:651c:893:b0:249:4023:3818 with SMTP id d19-20020a05651c089300b0024940233818mr1899232ljq.44.1655363419780; Thu, 16 Jun 2022 00:10:19 -0700 (PDT) MIME-Version: 1.0 Date: Thu, 16 Jun 2022 03:10:08 -0400 Message-ID: To: PHP internals Content-Type: multipart/alternative; boundary="0000000000009c4dbb05e18b5485" Subject: Discussion about new Curl URL API and ext/curl improvements From: pierrick@php.net (Pierrick Charron) --0000000000009c4dbb05e18b5485 Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable Hi internals, Since its version 6.62.0 [1], libcurl features a brand new URL API [2] that can be used to parse and generate URLs, using libcurl=E2=80=99s own parser.= One of the goals of this API is to tighten a problematic vulnerable area for applications where the URL parser library would believe one thing and libcurl another. This could and has sometimes led to security problems [3]. The new API is handled base and composed of 5 new functions [4] (and one more since 7.80.0 to get more verbose information upon error). I started to work on an implementation [5] to expose this API to PHP userland in the curl extension so that PHP users can benefit from it. My first reflex was to stay consistent on how the extension is currently working and do a one to one mapping of the new curl functions. I got feedback from both Ilija and Christoph saying that the API is, I quote, "a bit clunky" which I can't disagree with. For a long time, ext/curl worked with handles as "curl resources" and functions mapped to the libcurl functions, but since PHP8.0 "curl resources" were converted to opaque classes with no methods. So my main goal here is to come up with a solution for the new Curl URL API, but maybe also to be consistent, make some changes to existing curl classes like `CurlHandle` to add methods and improve the current state of the extension. Here is the solution I would propose : - First of course keep all the current ext/curl functions as is not to break any compatibility (Seems obvious but just to be sure everybody is on the same page). - For consistency expose the new Curl URL API as functions mapped one to one to libcurl functions : function curl_url(?string $url =3D null): CurlUrl|false {} function curl_url_set(CurlUrl $url, int $part, string $content, int $flags =3D 0): bool {} function curl_url_get(CurlUrl $url, int $part, int $flags =3D 0): string|false {} function curl_url_errno(CurlUrl $url): int {} function curl_url_strerror(int $error_code): ?string {} - Add methods to the CurlUrl object to make it less opaque and expose an object oriented style API. I would keep it minimal and let userlanAd API provide higher level APIs as guzzle for example. (You can see the current implementation [5]) final class CurlUrl implements Stringable { public function __construct(?string $url =3D null) {} public function get(int $part, int $flags =3D 0): string {} public function set(int $part, string $content, int $flags =3D 0): void= {} public function getErrno(): int {} public function __toString(): string {} public function __clone() {} } - It would also be nice to add this object oriented API for existing CurlHandle, CurlMultiHandle and CurlShareHandle classes. For example the CurlHandle class would look like that (First implementation [6]) final class CurlHandle { public function __construct(?string $url =3D null) {} public function setOpt(int $option, mixed $value): bool {} public function getInfo(?int $option =3D null): mixed {} public function exec(): string|bool {} public function escape(string $string): string|false {} public function unescape(string $string): string|false {} public function pause(int $flags): int {} public function getErrno(): int {} public function reset(): void; public function setOptArray(array $options): bool {} public function upkeep(): bool {} public function __clone() {} } As of right now I still have some unanswered questions like how should we handle errors on the new CurlUrl API ? - Throw `CurlUrlException` on both the procedural and object oriented style API (that's how current implementation works [5]) - Throw `CurlUrlException` with the oriented style API, but return for example boolean/null on errors when user uses the procedural API ? - Always return null/booleans using both object oriented API and procedural API ? The same question applies if we add a new object oriented style API on existing classes. Current functions MUST stay unchanged but should we throw `CurlException` when the user uses the object oriented style API ? (That's what I would do) Or should we return the same result from OO and procedural API ? Should we make the new CurlApi consistent with that ? What are your thoughts about all this ? Any feedback on any of those subjects are more than welcome. Pierrick [1] https://daniel.haxhttps://github.com/adoy/php-src/tree/curl-object-apix.se/= blog/2018/10/31/curl-7-62-0-moar-stuff/ [2] https://daniel.haxx.se/blog/2018/09/09/libcurl-gets-a-url-api/ [3] https://www.blackhat.com/docs/us-17/thursday/us-17-Tsai-A-New-Era-Of-SSRF-E= xploiting-URL-Parser-In-Trending-Programming-Languages.pdf [4] https://github.com/curl/curl/blob/master/include/curl/urlapi.h [5] https://github.com/php/php-src/pull/8770 [6] https://github.com/adoy/php-src/tree/curl-object-api --0000000000009c4dbb05e18b5485--