Newsgroups: php.internals
Date: Tue, 28 Jun 2022 15:23:27 +0100
From: rowan.collins@gmail.com (Rowan Tommins)
To: PHP internals
Subject: Re: [PHP-DEV] [RFC] [Under Discussion] New Curl URL API
References: <4511aee0-b5a0-6310-270f-38ae5cfd8a06@gmail.com>

On Mon, 27 Jun 2022 at 22:32, Pierrick Charron wrote:

> That's just an example with an old version of PHP, but let's say you have
> some code that makes requests but only to a specific list of servers, so
> you want to analyze the URL and check if the host is in a whitelist. If the
> provided URL is "http://127.0.0.1:11211#@google.com:80/" and that you
> used PHP <= 7.0.13 your parse_url function would tell you that the domain
> you're trying to request is google.com so everything is fine, but in fact
> when the call to curl is made, curl would call 127.0.0.1.
> This one was fixed but the problem could still occur if the parser is not
> the same as the one used in the requester.
> ...
> You can use CurlUrl within your implementation of UriInterface but for the
> same reason if you're using another request engine than curl, you may have
> the same security problem where curl will not parse the same data. If you
> want to make sure that your CurlUrl object represents the same thing as
> your UriInterface you could build the CurlUrl object part by part using
> your UriInterface. When you assign your CurlUrl to your CurlHandle with the
> CURLOPT_CURLU option, curl will use the parts directly instead of parsing
> the URL again, so you're sure that the host will be the one you set with
> `CurlUrl::setHost()` and so on.

This is actually a lot trickier than it sounds. Imagine this code, with the bug you gave as an example still present in one of the libraries we're interacting with:

    $url = new MyUrlObject("http://127.0.0.1:11211#@google.com:80/");
    var_dump($url->getHostName()); // ???

This won't work, because we don't know which parser to call; it needs to be something like this:

    var_dump($url->getHostName(UrlContext::CURL)); // '127.0.0.1'
    var_dump($url->getHostName(UrlContext::BROKEN_PHP)); // 'google.com'

Then we call this:

    $url->setHostName("duckduckgo.com");
    var_dump($url->getFullUrl()); // ???

Again, the result depends on context:

    var_dump($url->getFullUrl(UrlContext::CURL)); // "http://duckduckgo.com#@google.com:80/"
    var_dump($url->getFullUrl(UrlContext::BROKEN_PHP)); // "http://127.0.0.1:11211#@duckduckgo.com:80/"

But that means we can't actually apply the setHostName change until the call to getFullUrl(), because we don't know how the original URL will be parsed. Instead, the object has to internally store a queue of applied modifications, and then reproduce them, in order, on the underlying implementation.
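To make the "queue of modifications" idea concrete, here is a minimal sketch under stated assumptions. Everything in it is hypothetical: `DeferredUrl`, `splitHost()` and the two `UrlContext` constants are illustrative names, and the two "parsers" are toy string splitters that differ only in whether `?` and `#` end the authority. Neither is curl's parser or the old `parse_url()`. In particular, the toy curl-like splitter keeps the original port when the host is replaced, so its output differs slightly from the hand-written string in the example above.

```php
<?php
declare(strict_types=1);

/** Hypothetical contexts: each value lists the characters that end the authority. */
final class UrlContext
{
    public const CURL = '/?#';      // toy stand-in for a parser that stops at '#'
    public const BROKEN_PHP = '/';  // toy stand-in for a parser that ignores '#'
}

/**
 * Toy splitter: returns [prefix, host, suffix] with prefix.host.suffix === $url.
 * Does not handle IPv6 literals or any other edge case.
 */
function splitHost(string $url, string $terminators): array
{
    $start = strpos($url, '://');
    if ($start === false) {
        throw new InvalidArgumentException("Not an absolute URL: $url");
    }
    $start += 3;
    $authority = substr($url, $start, strcspn($url, $terminators, $start));

    // The host begins after the last '@' (if any) and ends at the first ':' after it.
    $at = strrpos($authority, '@');
    $hostStart = $at === false ? 0 : $at + 1;
    $hostLen = strcspn($authority, ':', $hostStart);

    return [
        substr($url, 0, $start + $hostStart),
        substr($authority, $hostStart, $hostLen),
        substr($url, $start + $hostStart + $hostLen),
    ];
}

final class DeferredUrl
{
    /** @var list<string> host names queued by setHostName(), in call order */
    private array $pendingHosts = [];

    public function __construct(private string $original) {}

    public function setHostName(string $host): void
    {
        // Nothing is parsed here: we do not yet know which parser applies.
        $this->pendingHosts[] = $host;
    }

    public function getFullUrl(string $context): string
    {
        // Replay every queued change, in order, using the requested parser.
        $url = $this->original;
        foreach ($this->pendingHosts as $host) {
            [$prefix, , $suffix] = splitHost($url, $context);
            $url = $prefix . $host . $suffix;
        }
        return $url;
    }

    public function getHostName(string $context): string
    {
        [, $host] = splitHost($this->getFullUrl($context), $context);
        return $host;
    }
}

$url = new DeferredUrl('http://127.0.0.1:11211#@google.com:80/');
var_dump($url->getHostName(UrlContext::CURL));       // '127.0.0.1'
var_dump($url->getHostName(UrlContext::BROKEN_PHP)); // 'google.com'

$url->setHostName('duckduckgo.com');
var_dump($url->getFullUrl(UrlContext::CURL));       // 'http://duckduckgo.com:11211#@google.com:80/'
var_dump($url->getFullUrl(UrlContext::BROKEN_PHP)); // 'http://127.0.0.1:11211#@duckduckgo.com:80/'
```

Even in this toy form, the same object yields two different URLs after a single setHostName() call, and the caller has to name a context on every read.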
Alternatively, we can build our implementation to assume it will always be used in the context of curl, or for debugging curl, so it can just have getHostName() and setHostName() directly use curl's implementation.

Which leads us back to where we started, of using CurlUrl directly or via a thin wrapper, as either a URL parser+builder, or as a value object in its own right.

I don't really know how to make all this nuance clear to users, but that makes me a bit wary of adding the object to PHP in its current design.

Regards,
-- 
Rowan Tommins
[IMSoP]