Newsgroups: php.internals Path: news.php.net Xref: news.php.net php.internals:114642 Return-Path: Delivered-To: mailing list internals@lists.php.net Received: (qmail 3690 invoked from network); 27 May 2021 13:55:12 -0000 Received: from unknown (HELO php-smtp4.php.net) (45.112.84.5) by pb1.pair.com with SMTP; 27 May 2021 13:55:12 -0000 Received: from php-smtp4.php.net (localhost [127.0.0.1]) by php-smtp4.php.net (Postfix) with ESMTP id 13F491804C9 for ; Thu, 27 May 2021 07:06:59 -0700 (PDT) X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on php-smtp4.php.net X-Spam-Level: X-Spam-Status: No, score=-2.1 required=5.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,FREEMAIL_FROM,HTML_MESSAGE, RCVD_IN_DNSWL_NONE,RCVD_IN_MSPIKE_H2,SPF_HELO_NONE,SPF_PASS autolearn=no autolearn_force=no version=3.4.2 X-Spam-Virus: No X-Envelope-From: Received: from mail-lf1-f54.google.com (mail-lf1-f54.google.com [209.85.167.54]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange ECDHE (P-256) server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) by php-smtp4.php.net (Postfix) with ESMTPS for ; Thu, 27 May 2021 07:06:58 -0700 (PDT) Received: by mail-lf1-f54.google.com with SMTP id w33so274485lfu.7 for ; Thu, 27 May 2021 07:06:58 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=mime-version:references:in-reply-to:from:date:message-id:subject:to :cc; bh=gtc5hn2cUugxM9cdIpJJ8FZxaxMPJPpbG1zBtJGYuyg=; b=I59jN8jgr8MfCSQjuQv8xeX5Fuc0D3ml5EU1iUHLLkEgc5tysj0O4Jav0rUrlhpQMj M5biLSqrTE6NYfBDbx8O/ilK9PSyFtXTNw5gL9M8qJUXo3zdt2mSLZeh3kt+a3O3LBSB HUQZDNFNHefcf7BDtYbv+h8jbqvZc5/ZXkWSD4+WUiDSHE2GeUZZzMM94Z5orSuGfm92 8HPUZLII4oVmzq3BlCTKdH41is2K67XHg6iaUGwpTcWNFOxzi6dxH+ONnQm/TJokUdkw T6vWXiCIGmIrxdAwJvnExkboXfcCCPNMsdwGqnOxUDWaEQh8cdg8zzXF399HjX+XcXBN UNaw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc; bh=gtc5hn2cUugxM9cdIpJJ8FZxaxMPJPpbG1zBtJGYuyg=; b=V/xUGa/Anth9z1hHFK7R7IRq1qLuT6t4JVufKvsliRV2psYXxvHHCTPKVglFje4Ade fFTBO+h5ubQjuXaatzmMQU1FK6RkmZtPsPyz7odnO5IjbxzVLA//9lIVd5f8CEMzJE+Q 6kP0/h3hHmNQWe0TCbteVuyVU5ZSqfh+26S19v5GJs/PFJOC3CYhAxtk1+safqbmiJJC qWKYe3XEIUU+atny0yW+FUm7iuDrtRwg/4sb87jq2lNiIWjFlxC6y60RW6M0lqaUtSVC s2chHVMjwFmYOZdsgZSTNtmZSKsMu5Lk5woQQvjGQ1hchZSF17JbOFulnSPBYagBNu1v cgLQ== X-Gm-Message-State: AOAM532sGqkzECd+X6y0pwpgtTU7g1EJuwrchKmAcG4gyVKpsghX7MpS 7cBPCyJ6gOOzXLgXYAEq/8vnuufpxmcV09yjifM= X-Google-Smtp-Source: ABdhPJzvH8TT+HVzJjZsiyd/wfW2TcsDJtcykQXCRFL7dS4a7zKh6GucJ2dC2g1o/eOahbBMqAaEaemW9XY6IF9KiKU= X-Received: by 2002:a05:6512:3b22:: with SMTP id f34mr2494718lfv.159.1622124416071; Thu, 27 May 2021 07:06:56 -0700 (PDT) MIME-Version: 1.0 References: In-Reply-To: Date: Thu, 27 May 2021 16:06:40 +0200 Message-ID: To: Sara Golemon Cc: PHP internals Content-Type: multipart/alternative; boundary="00000000000099fcc505c3504596" Subject: Re: [PHP-DEV] Escape \0 in var_dump() output From: nikita.ppv@gmail.com (Nikita Popov) --00000000000099fcc505c3504596 Content-Type: text/plain; charset="UTF-8" On Thu, May 27, 2021 at 3:54 PM Sara Golemon wrote: > On Thu, May 27, 2021 at 8:42 AM Nikita Popov wrote: > >> The most prudent and BC-safe thing would be to add another function >>> `var_dump_escaped()` or even to prefer/advise json_encode() when content >>> safety is relevant, but additional type information is not (most of the >>> uses of var_dump we have). Unfortunately, this doesn't actually fix the >>> initial problem you stated, which is just getting useful data out of CI >>> failures. >>> >> >> Well, we had https://wiki.php.net/rfc/readable_var_representation a few >> weeks ago, which is basically a better var_dump() with escaping and all -- >> but you can tell how it went ;) >> >> > > I don't specifically remember that thread, but suddenly I feel like I can > guess everything I need to know about the outcome. :( > > > Rather than change the output at all, why not just have a post-process >>> step in our test runner that transforms the output in the test report? >>> Then we could be as aggressive as we want, going as far as escaping all >>> non-printables plus backslash since at that point we're in it for the human >>> readability (and knowing specific byte sequences is essential) and we break >>> zero BC for anyone else? >>> >> >> Transforming the test output (such that the transformed output is checked >> in EXPECT) would indeed be a way to address the original motivation. >> There's two caveats though: >> 1. It does not address the issue for other uses of var_dump(). People >> being confused by var_dump((array) $object) output because of the null >> bytes in the property names is practically a PHP classic. Of course, fixing >> this would take away from the authentic PHP experience :) >> > > True, but neither (IMO) would just escaping null byte, which again I > regard as a half-measure. It has all the negative impact of breaking BC > (your point about extension phpts is a good one), without going far enough > to address the human-readability problem for many common cases (CR/NL chief > among them). > > Insert Karate Kid "squash like a bug" reference here. Escape it all, or > escape none of it. > > >> 2. Changes to run-tests.php behavior also affect extensions, and I think >> this kind of change would be even harder to address for PECL extensions >> than a simple var_dump() behavior change (where you can at least easily >> fork two versions of the file for the tests that need it). >> >> > Would it? The report at the end of the run is for human eyes, generally > speaking. A human can interpret escaping (or not) as needed. > Also, we could make the post-processing optional. > > run-test.php --escape-test-output > > Then we use that mode in CI where we need git-friendly diffs, and it works > "normally" elsewhere. > Ah, I think I explained the original issue badly: The test runner output isn't really a problem, or at least I never perceived it to be. The problem is that if you change a test that contains a null byte anywhere (read: in var_dump output), then github will not show a diff for that file. You can see this nicely in the proposed PR itself: https://github.com/php/php-src/pull/7059/files Most of the updated test files show up as "Binary file not shown." As you can imagine, this makes review hard. Using the .patch file doesn't help either, because it only contains entries like: diff --git a/Zend/tests/bug60569.phpt b/Zend/tests/bug60569.phpt index 480c9c8f8a1285956616a119825520471bbe88b5..c658fa28a00017412a3a840498605efa7f8fe10b 100644 GIT binary patch delta 25 gcmZ3i_@% delta 23 ecmZ3@w32DUA4Z0W{}j0y6mnCGixbmRmAC+9#0Sy< To avoid this issue, we need the transformation to apply the actual EXPECTed output, otherwise our test files will still be considered as binary files. Regards, Nikita --00000000000099fcc505c3504596--