Newsgroups: php.internals
Xref: news.php.net php.internals:116754
From: tendoaki@gmail.com (Michael Morris)
Date: Sun, 2 Jan 2022 00:19:42 -0500
Subject: Re: [PHP-DEV] RFC: Stop to automatically cast numeric-string to int when using them as array-key
To: Kirill Nesmeyanov
Cc: internals
References: <1640910093.890171965@f721.i.mail.ru> <1641095231.967164658@f750.i.mail.ru>
In-Reply-To: <1641095231.967164658@f750.i.mail.ru>

On Sat, Jan 1, 2022 at 10:47 PM Kirill Nesmeyanov wrote:

>
> >Saturday, 1 January 2022, 17:41 +03:00, from Rowan Tommins <rowan.collins@gmail.com>:
> >
> >On 31/12/2021 00:21, Kirill Nesmeyanov wrote:
> >> I support this behavior fix because in its current form, due to a
> >> similar problem (almost?), all PSR-7 implementations contain bugs that
> >> violate RFC7230 (section 3.2:
> >> https://datatracker.ietf.org/doc/html/rfc7230#section-3.2 ).
> >> Thus, physically, by the standard, all headers can have the name "0"
> >> (like «0: value»), but when stored inside implementations, it is
> >> converted to a string and a problem arises ($message->getHeaders() //
> >> returns array instead of array).
> >
> >You appear to be technically correct - the RFC defines a header name
> >only as "token", which implies the following would all be valid HTTP
> >headers:
> >
> >42: The Answer
> >!: Bang
> >^_^: Surprised
> >
> >In practice, it would be a bad idea to use any of these.
> >
> >Every single one of the field names registered with IANA [1] starts with
> >a letter, and proceeds with only letters, digits, and hyphen ('-'). [The
> >exception is "*", listed there as "reserved" to specifically prevent its
> >use conflicting with the wild-card value in "Vary" lists.]
> >
> >I'm actually surprised this definition hasn't been updated with
> >interoperability advice in recent revisions of the standard. I did find
> >this general advice for internet message headers in RFC 3864 [2]:
> >
> > > Thus, for maximum flexibility, header field names SHOULD further be
> > > restricted to just letters, digits, hyphen ('-') and underscore ('_')
> > > characters, with the first character being a letter or underscore.
> >
> >The additional restriction on underscore ('_') in HTTP arises from CGI,
> >which maps headers to environment variables. For instance, Apache httpd
> >silently drops headers with anything other than letters, digits, and
> >hyphen [3] to avoid security issues caused by environment manipulation.
> >
> >If I was developing a PSR-7 or similar library, I would be inclined to
> >drop any header composed only of digits, and issue a diagnostic warning,
> >so that it wouldn't escalate to a type error later. It certainly doesn't
> >seem reasonable to change the entire language to work around that
> >inconvenience.
> >
> >[1] https://www.iana.org/assignments/http-fields/http-fields.xhtml
> >[2] https://datatracker.ietf.org/doc/html/rfc3864#section-4.1
> >[3] https://httpd.apache.org/docs/trunk/env.html#setting
> >
> >Regards,
> >
> >--
> >Rowan Tommins
> >[IMSoP]
>
> I just gave an example of what at the moment can cause an exception in any
> application that is based on the PSR. It is enough to send the header "0:
> Farewell to the server". In some cases (for example, as is the case with
> RoadRunner) - this can cause a physical stop and restart of the server.
>
> Just in case, I will repeat my thesis: I cannot imagine that anyone is
> using this functionality consciously and that it is part of the real logic
> of the application.

You don't have a lot of experience with legacy code then. PHP, particularly
old PHP (like the 4 and 5.1 era), was used by a lot of idiots. I was one of
those idiots (perhaps I still am an idiot - jury is deliberating on that,
but I digress). Snark aside though, PHP has more than its fair share of
self-taught programmers (again, not trying to be insulting, as I am one
myself), and they do things with the code that veterans and formally
trained programmers would never think to try, let alone implement. I
guarantee fixing how key handling is done will break something - either
code exploiting the weird behavior, or code guarding against the weird
behavior, not to mention any tests that might have been written - though
amateurs rarely write test code (again, speaking from past experience I've
grown beyond).

> And fixing this behavior, I believe, will automatically fix many libraries
> (not necessarily PSR) that do not take this behavior into account.

And blow up who knows how many old code bases - many of which don't have
unit test suites to discover if there is a break ahead of time.
This is the sort of BC break that would leave a cliff of users unable to
migrate to the major version that implements it: a Python 2 vs. 3 style of
break.

Even with all that said, it may indeed be worth fixing - but this will
require the same sort of kid-gloves approach that removing register_globals
had (for the newer folks, there was a time when $_REQUEST["var"] would
auto-populate $var, with lovely security snarls). IIRC PHP 3 had
register_globals always on, PHP 4 created a config toggle to turn it off,
PHP 4.2 turned that toggle off by default, PHP 5.3 (6 without Unicode, more
or less) deprecated it, and PHP 5.4 finally removed support for it entirely
(my memory could be off - it's in the changelogs for the curious).

I leave the decision making to the maintainers and contribs who do the
actual work. Hell, I personally don't even use PHP that much these days,
having gotten a job where I focus on writing Cucumber tests in JavaScript
that run on Node.js. I keep up with PHP and this list though, because one
never knows what the next job will entail. I just dropped out of lurk mode
to underscore, along with others upthread, the massive ramifications of
what is being proposed. As someone who wrote stupid code that I can see
this breaking: tread lightly. And hell, I don't even know how much of that
code is still in use, since I've changed employers many times since it was
written. This situation is not unique, and it can create huge headaches for
companies running projects on legacy code bases.