Newsgroups: php.internals Path: news.php.net Xref: news.php.net php.internals:125869 X-Original-To: internals@lists.php.net Delivered-To: internals@lists.php.net Received: from php-smtp4.php.net (php-smtp4.php.net [45.112.84.5]) by qa.php.net (Postfix) with ESMTPS id 816F01A00BD for ; Mon, 28 Oct 2024 11:40:29 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=php.net; s=mail; t=1730115776; bh=0141qOLuRCJ6h1KQ0Iq27hzgtbEIBDaictgVEOlYBB8=; h=Date:From:To:Cc:Subject:In-Reply-To:References:From; b=TPJOZBeH/GE70HUWUwG4WSnnPXD4ooEM3XYIsLnDCOkeNQgm27pQ5wz5MC2VVrnBT DD6yis4THdzXbYXani5WAp6QTkQDCUXBsxZVJToutgArcSzYQUHRnC9Tj1F47sAv2a swPPCuY9j3+ZArRDwRdx5TAwhr1wblCxsKQZrcPvpkn6DmTJu/ZYYTWZW+T7v+wgYy XRiWeRrshA6s6eEjXW3zYQRoLGLr0aT6hpBN4YHt/zp73x9F38oc4gCo4mDtrR+Umf lLYNPnlbvlg+tQenSYu8k0y405mKqnLJH1j+/5vBkNI/sTm+XQ8omZFMAr56GDW5/D im9zyMmgsAtqQ== Received: from php-smtp4.php.net (localhost [127.0.0.1]) by php-smtp4.php.net (Postfix) with ESMTP id EC70518007D for ; Mon, 28 Oct 2024 11:42:54 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 4.0.0 (2022-12-13) on php-smtp4.php.net X-Spam-Level: X-Spam-Status: No, score=0.6 required=5.0 tests=BAYES_50,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,DMARC_PASS,SPF_HELO_NONE, SPF_PASS autolearn=no autolearn_force=no version=4.0.0 X-Spam-Virus: No X-Envelope-From: Received: from chrono.xqk7.com (chrono.xqk7.com [176.9.45.72]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by php-smtp4.php.net (Postfix) with ESMTPS for ; Mon, 28 Oct 2024 11:42:54 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=bastelstu.be; s=mail20171119; t=1730115625; bh=/FnY4TBCPHjiXBXUVp4aV529Wknfs7A/f1m+Yo77w2k=; h=MIME-Version:Date:From:To:Cc:Subject:In-Reply-To:References: Message-ID:Content-Type:from:to:cc:subject:message-id; b=QevLnFOmVhuVjMoFXmK8awf1nesD6qCGhNZLX4o70T1n6cwcM/6f93q+iK7FD5BCJ 7SahrdwJZpzVEzWn0X9g0lwTsLLHAJS/io18xYL2Y42bWSeVmUz56B0xra+bk+ywfK Rx8s363uH7zPD1QOC2KCRIdFIhaSCcbJOMeRBg0Y57A1GL1+tZ/BwYq35nu3G85xBF KSUdFm1s+ksfUp6TmmziRQV4f5udMWfeOWtMqFBMjbtXel55poNv4sM59K22m1csMh qO9pEcDOFfL9JclDvF7oqTzm14LzGZchYZuR8tGpHb/hpBLu4LAkYu6u0d+erNBfo1 0mXNI7IoKlxTg== Precedence: bulk list-help: list-post: List-Id: internals.lists.php.net x-ms-reactions: disallow MIME-Version: 1.0 Date: Mon, 28 Oct 2024 12:40:25 +0100 To: erictnorris@gmail.com Cc: PHP internals Subject: Re: [PHP-DEV] [VOTE] Add persistent curl share handles In-Reply-To: References: <6e533102b9a2e9c8f6a2183440b2601a@bastelstu.be> Message-ID: Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 8bit From: tim@bastelstu.be (=?UTF-8?Q?Tim_D=C3=BCsterhus?=) Hi Am 2024-10-25 16:29, schrieb Eric Norris: >> I'm especially concerned, because the documentation for >> `curl_share_init()` uses `CURL_LOCK_DATA_COOKIE` as the example. I >> would >> also assume that sharing a cookie jar amongst several requests is the >> primary use case for leveraging a curl share handle, given that curl >> already performs in-request connection reuse when reusing a single >> CurlHandle or when performing requests with a CurlMultiHandle. > > I don't believe that sharing a cookie jar is the primary use case; it > certainly wouldn't be for us. Our primary use case is DNS and > connection sharing, for the aforementioned fanout. FWIW, the curl documentation also lists cookie sharing as the first thing when describing the "share interface": https://curl.se/libcurl/c/libcurl-share.html. >> Accidentally sharing a cookie jar for unrelated requests due to a >> badly >> chosen `$persistent_id` sounds like a vulnerability to is bound to >> happen to someone. > > I'll admit that I don't have a good response to this, since while I > agree this is possible, I don't think it outweighs the benefit that > persistence gives to the applications that need it (again, we won't be > using the cookie jar). I see you've noted concerns about > documentation, would this be something you'd at least feel more > comfortable with if curl persistence documentation included a strong > warning? Yes, I would be *more* comfortable if the documentation would ensure to appropriately the possible issues with persisting the share handle across multiple requests, possibly giving suggestions as to how a good persistent ID could be constructed (e.g. by using a hash of the current security context or something like that). I would also be more comfortable, if persisting a share handle with cookie jar sharing enabled would not be possible / would be ignored, because that only leaves the TLS session sharing (see below) as a possible safety issue. > Interestingly, one of the reasons that I began this work was that we > found a measurable performance impact of talking to the caching > resolver on localhost, let alone the connection overhead. Essentially That honestly surprises me. >> As for sharing a TLS or HTTP connection, >> this might or might not lead to vulnerabilities similarly to cookie >> sharing, when TLS client certificates are used for authentication. > > I believe curl itself handles this - it won't re-use a connection if > the settings aren't the same. Okay, the curl documentation does not say anything about this in https://curl.se/libcurl/c/CURLSHOPT_SHARE.html#CURLLOCKDATASSLSESSION. > re-raise it there. We do already have a curl_share_close method, so > the RFC would not need to add a new method, and since the method is > called curl_share_close, it's not really changing the meaning of the > method either. That makes sense to me. >> Also badly chosen `$persistent_id`s might result in a >> large number of handles accumulating, without any kind of garbage >> collection. For the existing persistent handles (e.g. for database >> connections), the ID is chosen by PHP itself, ensuring a somewhat >> predictable behavior. > > I agree that this may be a risk, but it's not clear to me that this is > a risk that people would realistically run into, considering you'd > want to have a small number of persistent IDs in order to benefit from > re-using them. Dynamically picking your persistent IDs would lead to > less re-use. I believe in misuse-resistant API design. Folks should be steered towards doing the right thing by default. As an example: What if a library internally choses to use persistent share handles for whatever reason? Or an add-on you install in your CMS software? PHP for better or worse is not just used for in-house software with an experienced development team. > that developers are reaching for. Frankly, I see very little reason to > use it without persistence, considering that curl_multi_init does the > same thing, and is simpler to use. Yes. And that probably explains why you are not seeing much use of it until now. But if/when it gets “marketed” as the next-big-thing in performance by tech bloggers that don't understand the possible caveats, it will probably start to get used by folks that don't fully understand it. > Regarding the RFC text, I apologize for not filling it out more, but > it seemed straightforward to me: curl supports the share interface as > a primary feature, and there are no glaring warnings in the > documentation (https://curl.se/libcurl/c/libcurl-share.html) > indicating that you might hold the share interface incorrectly. As I The difference to me is that when using libcurl directly, you would generally have a more explicit data flow and don't pull a curl share handle in an unknown state out of thin air. See the library example above. >> After all the pros and cons would also need to be >> documented in PHP’s documentation and the RFC text would be a good >> source for the documentation team. > > My plan was to help add the documentation (and tests) after this > passed; it was unclear to me that documentation may need to be sorted > out up-front in the RFC process. That said, I believe I can summarize The actual documentation is generally written during the RC phase, but if the RFC author is no longer available when that happens, the documentation team needs something to work with. > the remaining cons from your email: > > - A developer could reuse a cookie jar with cookies from user A in a > subsequent request from user B to origin Z if they opted in to > persistence, used CURL_LOCK_DATA_COOKIE, and used the same persistent > handle for both users. Yes. > - A developer could balloon memory usage by using dynamic persistent > IDs, especially so if curl_share_close doesn't close them. Yes. > If I were to add notes about these both to the documentation, and > curl_share_close closed persistent share handles, would you change > your vote for this or a future RFC? I'm not sure about whether I consider documentation alone to be sufficient, but when share handles with cookie sharing enabled would not be persisted (and removed from persistence when enabling the option after the fact), then I would be fine with trusting libcurl to do the right thing. Ideally this could be confirmed with some reference to the libcurl source code or getting upstream confirmation that this is safe to do. Best regards Tim Düsterhus