Newsgroups: php.internals
Date: Tue, 20 Jun 2023 11:25:39 +0200
From: tovilo.ilija@gmail.com (Ilija Tovilo)
To: PHP internals
Subject: RFC1867 (multipart/form-data) PUT requests

Hi internals

A while ago I encountered a limitation in how RFC1867 requests are handled in PHP. PHP populates the $_POST and $_FILES superglobals when the Content-Type is multipart/form-data or application/x-www-form-urlencoded, but only when the method is POST. For application/x-www-form-urlencoded PUT requests this is not a problem because the format is simple, usually limited in size, and PHP offers functions to parse it, namely parse_str and parse_url.

For RFC1867 it's a different story.
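
To make the contrast concrete, here is a minimal sketch of the urlencoded case described above. The helper name parse_urlencoded_body() is hypothetical and only for illustration; parse_str() and the php://input stream are existing PHP features.

```php
<?php
// Hypothetical helper: decode an application/x-www-form-urlencoded body
// into an array, roughly what PHP does for $_POST on POST requests.
function parse_urlencoded_body(string $rawBody): array
{
    $fields = [];
    // parse_str() writes the decoded key/value pairs into $fields.
    parse_str($rawBody, $fields);
    return $fields;
}

// In a PUT handler, the raw body would typically come from php://input:
//   $fields = parse_urlencoded_body(file_get_contents('php://input'));
```

Nothing comparable exists for multipart/form-data bodies, which is the gap this mail is about.
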
The code handling the request needs to use streams because RFC1867 is often used with files, the format is much more complicated, files should be cleaned up when the request ends if unused, etc. Handling this manually is non-trivial. This was reported many years ago, and evidently caused a bit of frustration.

https://bugs.php.net/bug.php?id=55815

This is not limited to PUT either; multipart/form-data bodies are valid with other request methods.

Here's the approach I believe is best. Introduce a new function (currently named populate_post_data()) to read the input stream and populate the $_POST and $_FILES superglobals. The function works for any non-POST request. It assumes that none of the input stream has been consumed, and that the Content-Type is set accordingly.

A nice side-effect of this approach is that it may be used with the enable_post_data_reading ini setting to decide whether to parse the RFC1867 bodies dynamically. For example, a specific endpoint may accept bigger requests.

The function may be implemented in a more generic way 1. by returning the data/files arrays instead of populating the superglobals and 2. by providing an input stream manually. I don't know if there's such a use-case and thus if this is worthwhile, as it would require bigger changes in the RFC1867 handling.

Here's the proof-of-concept implementation:

https://github.com/php/php-src/pull/11472

For completeness, here are other options I considered.

1. Create a new $_PUT superglobal that is always populated. Two issues: The obvious one is that this is limited to PUT requests. While we could also introduce $_PATCH, this seems like a poor solution. While discouraged, other methods can also contain bodies. Another issue is that the code for processing RFC1867 consumes the input stream. This constitutes a BC break for code that currently reads the body itself. Buffering the input is not feasible for the large requests that would be expected here.

2. The same as option 1, but populate the existing $_POST global.
This comes with the same BC break.

3. The same as options 1 or 2 with an additional ini setting to opt into the behavior. The issue with this approach is that both the old and new behavior might be desired in different parts of the same application. The ini option can't be changed at runtime because the superglobals are populated before user code is executed.

Let me know what your thoughts are. If there is consensus in the feedback, I'll update the implementation accordingly and post an update to the list. If there is no consensus, I will create an RFC.

Ilija
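
For illustration only, the proposed approach might be used roughly as follows. This is a sketch under stated assumptions, not the proof-of-concept's actual behavior: it assumes populate_post_data() takes no arguments and fills the superglobals, the helper should_populate() is hypothetical, and the call is guarded so the script still runs on builds where the function does not exist.

```php
<?php
// Hypothetical helper: decide whether the request is a non-POST
// multipart/form-data request, the case the proposal targets.
function should_populate(string $method, string $contentType): bool
{
    return strtoupper($method) !== 'POST'
        && stripos($contentType, 'multipart/form-data') === 0;
}

$method      = $_SERVER['REQUEST_METHOD'] ?? 'GET';
$contentType = $_SERVER['CONTENT_TYPE'] ?? '';

// populate_post_data() is the working name from the proposal; its
// zero-argument signature is an assumption made for this sketch.
if (should_populate($method, $contentType)
    && function_exists('populate_post_data')) {
    // The input stream must not have been read before this call.
    populate_post_data();
    // From here on, $_POST and $_FILES would be used as with a POST upload.
}
```

Because the call is explicit, an application could leave enable_post_data_reading off and invoke it only on the endpoints that should accept such bodies.
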