Newsgroups: php.internals Path: news.php.net Xref: news.php.net php.internals:114639 Return-Path: Delivered-To: mailing list internals@lists.php.net Received: (qmail 98012 invoked from network); 27 May 2021 13:30:26 -0000 Received: from unknown (HELO php-smtp4.php.net) (45.112.84.5) by pb1.pair.com with SMTP; 27 May 2021 13:30:26 -0000 Received: from php-smtp4.php.net (localhost [127.0.0.1]) by php-smtp4.php.net (Postfix) with ESMTP id BFFFA1804D9 for ; Thu, 27 May 2021 06:42:12 -0700 (PDT) X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on php-smtp4.php.net X-Spam-Level: X-Spam-Status: No, score=-2.1 required=5.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,FREEMAIL_FROM,HTML_MESSAGE, RCVD_IN_DNSWL_NONE,RCVD_IN_MSPIKE_H2,SPF_HELO_NONE,SPF_PASS autolearn=no autolearn_force=no version=3.4.2 X-Spam-Virus: No X-Envelope-From: Received: from mail-lj1-f171.google.com (mail-lj1-f171.google.com [209.85.208.171]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange ECDHE (P-256) server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) by php-smtp4.php.net (Postfix) with ESMTPS for ; Thu, 27 May 2021 06:42:12 -0700 (PDT) Received: by mail-lj1-f171.google.com with SMTP id c15so852872ljr.7 for ; Thu, 27 May 2021 06:42:12 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=mime-version:references:in-reply-to:from:date:message-id:subject:to :cc; bh=2OSYQ6FTMG8Lpv9IB+PrrTW/kRRge8ouHJhKwvukRp4=; b=T19v8mA5xiEfpd+IEFBe0YeNzHbmyOQ+59S0pbVZuqPXGHgH7oQwiueE5iY9KV/UdU PArMs2x2OlfxoRK26/Ez/rhHtrVrCjtwmwtR3Uviu2zm+spooWw2bjmpGGXfmoJ9w7Su n/4M7JgetEv6Yx0h/vZucTmKxuoYS2lE0oC7K3rO/hHiek4go77eLwJkdoasfBkN1Y7D fpqj7F/wKCH0yxTBqCWTyZA72Pjdi0tkTrG5y12ojijsHhGrOEmbfXIbYwm/WMfhDNZY AL6uNIXVmMIDMlClwvTrqVzdMElf9DAOC7GdKFPC0HuAFR076ncBygMWhH/hnkaVF0BO DerQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc; bh=2OSYQ6FTMG8Lpv9IB+PrrTW/kRRge8ouHJhKwvukRp4=; b=pbzjW2rNbYknLKGeKWPEM1/OxV3rsIoj+5e5K4hWt1Em2CxIRurteacUHvsvcWk8Cr iV08Vo0CXv96xOFhW7frNFGyBfT/+zj+fgsF4jJs1uTZfsJKrqd4QYnL/CXwutRzoih4 dQLGwGR3mF8eJQ7DkHhZfximyWwKuR+1IsxQ9sQfpA8GUmihbID/6bG/+2Ua5byQLxB1 oHoUKL5MGsbxGDypriwtK/4uAKBnrh1QvXEXIjwMmqOKYm65EQuQkIPgipUXqXKAHxD4 QuQLKKDbpWP+bws3eozlGK4nzK0ET+TD8r8UKI7pArsHH9Pn2Nbp/4vXBpaTqaJF9msn 5b3Q== X-Gm-Message-State: AOAM530UAk7OudHCIttHybsYgUh/J8dsQclGhVEBdebDdkQ9nBomjb6r 16KyJyuyuIo9NVY3wexO71P/dec1W3a97JoN92E= X-Google-Smtp-Source: ABdhPJz2a3IpaOs21ydhQ8AaAVC8CTiKcM3S9AbNNk8iIM0VgZBEy5dhGZ3ThK1kXpDgDXtAyWYNSggt1+b1zdcamSA= X-Received: by 2002:a2e:9189:: with SMTP id f9mr2823739ljg.353.1622122929794; Thu, 27 May 2021 06:42:09 -0700 (PDT) MIME-Version: 1.0 References: In-Reply-To: Date: Thu, 27 May 2021 15:41:53 +0200 Message-ID: To: Sara Golemon Cc: PHP internals Content-Type: multipart/alternative; boundary="00000000000003329305c34fed19" Subject: Re: [PHP-DEV] Escape \0 in var_dump() output From: nikita.ppv@gmail.com (Nikita Popov) --00000000000003329305c34fed19 Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable On Thu, May 27, 2021 at 3:22 PM Sara Golemon wrote: > On Thu, May 27, 2021 at 5:44 AM Nikita Popov wrote= : > >> https://github.com/php/php-src/pull/7059 does a surgical change to >> replace >> null bytes with \0. >> >> > Before delving into any other observations, I first want to state > emphatically that we should not ONLY escape chr(0). We either apply full > on addcslashes() (or similar) or do nothing at all. Half-escaped is wors= e > than not escaped. Sadly, this presents more BC issues than just chr(0) o= r > '\\0' do alone since there are many more potential places where output wi= ll > change for anyone using it programmatically. > > So should we? > > On the one hand: The intent of the var_dump() function is for human > readable debugging. Anyone using this in a programmatic way is Doing It > Wrong=E2=84=A2, so I'm not too fussed about changing the output of this f= unction. > On the other hand: WE use this function programmatically in our tests > across many thousands of occurrences. Others probably rightly do as well= . > Yes, the dissonance is real. > FWIW, updating our own tests is easy, because the bulk of the change is automated, just need to fix up individual tests afterwards. It may be an inconvenience for extension tests though, which require compatibility with multiple PHP versions. > The most prudent and BC-safe thing would be to add another function > `var_dump_escaped()` or even to prefer/advise json_encode() when content > safety is relevant, but additional type information is not (most of the > uses of var_dump we have). Unfortunately, this doesn't actually fix the > initial problem you stated, which is just getting useful data out of CI > failures. > Well, we had https://wiki.php.net/rfc/readable_var_representation a few weeks ago, which is basically a better var_dump() with escaping and all -- but you can tell how it went ;) > The PR you offered is the lightest touch and fixes the issue you cited > without causing any likely damage to current users, but I don't think we > should ignore the ambiguity of the output. Additionally, I think I could > make an easy argument in favor of escaping CR and NL at the very least. = At > this point the urge to escape backslash is extra-real as with windows pat= hs > an innocent \n is far more likely. > > ------ > > Actually. Maybe we're thinking of this wrong. > > Rather than change the output at all, why not just have a post-process > step in our test runner that transforms the output in the test report? > Then we could be as aggressive as we want, going as far as escaping all > non-printables plus backslash since at that point we're in it for the hum= an > readability (and knowing specific byte sequences is essential) and we bre= ak > zero BC for anyone else? > Transforming the test output (such that the transformed output is checked in EXPECT) would indeed be a way to address the original motivation. There's two caveats though: 1. It does not address the issue for other uses of var_dump(). People being confused by var_dump((array) $object) output because of the null bytes in the property names is practically a PHP classic. Of course, fixing this would take away from the authentic PHP experience :) 2. Changes to run-tests.php behavior also affect extensions, and I think this kind of change would be even harder to address for PECL extensions than a simple var_dump() behavior change (where you can at least easily fork two versions of the file for the tests that need it). Regards, Nikita --00000000000003329305c34fed19--