Newsgroups: php.internals
Xref: news.php.net php.internals:120712
Date: Wed, 28 Jun 2023 14:59:49 +0200
Subject: Re: [PHP-DEV] RFC1867 (multipart/form-data) PUT requests
From: tovilo.ilija@gmail.com (Ilija Tovilo)
To: PHP internals
References: <15D6E65D-97E3-4F82-8C8A-B8C1FB22D972@benramsey.com>
In-Reply-To: <15D6E65D-97E3-4F82-8C8A-B8C1FB22D972@benramsey.com>

Hi Ben

On Tue, Jun 27, 2023 at 9:54 PM Ben Ramsey wrote:
>
> > On Jun 27, 2023, at 04:01, Ilija Tovilo wrote:
> >
> > Hi Ben, Hi Rowan
> >
> > On Mon, Jun 26, 2023 at 8:55 PM Ben Ramsey wrote:
> >>
> >>> On Jun 20, 2023, at 06:06, Rowan Tommins wrote:
> >>>
> >>> On Tue, 20 Jun 2023 at 10:25, Ilija Tovilo wrote:
> >>>
> >>>> Introduce a new function (currently named populate_post_data()) to
> >>>> read the input stream and populate the $_POST and $_FILES
> >>>> superglobals.
>
> In the past, I’ve used something like the following to solve this:
>
> parse_str(file_get_contents('php://input'), $data);
>
> I haven’t looked up how any of the frameworks solve this, but I would be willing to bet they also do something similar.
>
> Rather than implementing functionality to populate globals, would you be interested in introducing some new HTTP request functions. Something like:
>
> http_request_body(): string
> http_parse_query(string $queryString): array
>
> `http_request_body()` would return the raw body and would be the equivalent of calling `file_get_contents('php://input')`. Of special note is that it should _always_ return the raw body, even if `$_POST` is populated, for the sake of consistency and reducing confusion.
>
> `http_parse_query()` would be the opposite of `http_build_query()` and would return a value instead of requiring a reference parameter, like `parse_str()`.

The problem is that the content stream for multipart/form-data is
expected to be big, possibly multiple gigabytes. We can't use
http_request_body() to return the entire content as a single string.
The current RFC1867 implementation reads and processes the body in
chunks, appending each chunk to a file or to a string depending on the
content part. It never has to hold the entire content in memory.

http_request_body() also can't return the content of the request again
after it has been consumed, because that's not how the HTTP protocol
works. We would need to buffer the content somewhere when reading it
for the first time, which again we can't do because it may be very
big.

It may be possible to pass the fopen('php://input', 'r') stream to
this function and let it consume it. However, as mentioned in my
original e-mail, this requires some changes to how RFC1867 requests
are handled.
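
To make the memory argument concrete, here is a minimal userland
sketch of consuming a stream in fixed-size chunks instead of loading
it into one string. The helper name copy_stream_in_chunks() and the
chunk size are assumptions made up for illustration; this is not the
RFC1867 implementation and not part of the proposal. In-memory streams
stand in for php://input and the upload file.

```php
<?php
declare(strict_types=1);

/**
 * Copy everything from $source to $sink, holding at most $chunkSize
 * bytes in memory at a time. Hypothetical helper, for illustration only.
 *
 * @param resource $source readable stream
 * @param resource $sink   writable stream
 * @return int number of bytes copied
 */
function copy_stream_in_chunks($source, $sink, int $chunkSize = 8192): int
{
    $total = 0;
    while (true) {
        $chunk = fread($source, $chunkSize);
        if ($chunk === false) {
            throw new RuntimeException('Failed to read from source stream');
        }
        if ($chunk === '') {
            break; // a blocking stream returns an empty string at end of input
        }
        // fwrite() may write fewer bytes than requested, so loop until done.
        $length = strlen($chunk);
        $written = 0;
        while ($written < $length) {
            $n = fwrite($sink, substr($chunk, $written));
            if ($n === false || $n === 0) {
                throw new RuntimeException('Failed to write to sink stream');
            }
            $written += $n;
        }
        $total += $length;
    }
    return $total;
}

// Stand-ins: in a real request $source would be fopen('php://input', 'r').
$source = fopen('php://memory', 'w+');
fwrite($source, str_repeat('a', 20000));
rewind($source);

$sink = fopen('php://temp', 'w+');
$copied = copy_stream_in_chunks($source, $sink, 4096);
echo $copied, " bytes copied\n";
```

Peak memory here is bounded by the chunk size rather than by the body
size, which is the property a string-returning http_request_body()
cannot offer.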

Currently, the RFC1867 handler calls sapi_module.read_post(), which
reads directly from the TCP socket. Instead, we'd need to read from
the stream, possibly as an additional code path so that the general
case does not become slower. I'll check whether this is an option and
whether the changes are (too) big. However, I don't expect many use
cases for this: RFC1867 is primarily used for requests and not for
responses, so you wouldn't usually need to parse this type of content
from some other source.

As for returning the parsed values as non-globals, that's entirely
possible. However, it's inconsistent with how requests are currently
handled. The values would need to be passed around manually and kept
alive, yet the function would still modify global state (i.e. the
input stream, whether that's sapi_module.read_post() or php://input).
I don't believe it will be common to call this function more than once
per request, so decoupling the state is not really necessary.

Ilija