Newsgroups: php.internals
Xref: news.php.net php.internals:114630
Date: Thu, 27 May 2021 12:44:21 +0200
From: nikita.ppv@gmail.com (Nikita Popov)
To: PHP internals
Subject: Escape \0 in var_dump() output

Hi internals,

A regular annoyance with our test suite is that var_dump() prints null bytes literally. Those null bytes get into our test expectations, git then treats the test as a binary file, and diffs are not shown on GitHub. The null bytes are also confusing to the reader: many editors do not display them, so the only indication that they are there is the string length.

https://github.com/php/php-src/pull/7059 makes a surgical change to replace null bytes with \0. A potential problem is that this no longer allows an easy distinction between "\0" and "\\0", as both will be printed as "\0". We could resolve this by additionally escaping \ (and presumably "). See https://gist.github.com/nikic/d3d74d88178ea6946305e0b2d38a84f9 for a rough idea of the impact on Zend and ext/standard tests.

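To make the ambiguity concrete, here is a minimal Python sketch that models var_dump()-style output for the two strings in question. It is not the code from the PR: `render` and its `escape_backslash` flag are hypothetical names, and it assumes the length prefix keeps reporting the raw byte count.

```python
def render(value: bytes, escape_backslash: bool = False) -> str:
    """Model a var_dump()-style line for a string, with null bytes escaped.

    Hypothetical helper for illustration only; not php-src behaviour.
    """
    text = value.decode("latin-1")  # one character per byte
    if escape_backslash:
        # Escape the escape character (and the quote) first, so that the
        # backslashes introduced for null bytes below are not doubled.
        text = text.replace("\\", "\\\\").replace('"', '\\"')
    text = text.replace("\x00", "\\0")
    # Assumption: the length is still the raw byte count.
    return f'string({len(value)}) "{text}"'


nul = b"\x00"            # PHP "\0": a single null byte
backslash_zero = b"\\0"  # PHP "\\0": a backslash followed by the digit 0

# Escaping only the null byte: the quoted parts collide, only the length differs.
print(render(nul))             # string(1) "\0"
print(render(backslash_zero))  # string(2) "\0"

# Also escaping \ and ": the two values print differently again.
print(render(nul, escape_backslash=True))             # string(1) "\0"
print(render(backslash_zero, escape_backslash=True))  # string(2) "\\0"
```

With only the null byte escaped, a reader has to compare the lengths to tell the two values apart. Escaping the backslash as well makes the quoted part unambiguous, at the cost of changing the output for every string that contains a backslash or a double quote.
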
So, I'd be interested in knowing a) whether var_dump() should be performing any escaping at all, and if yes b) how much we should be escaping. Just \0, both \0 and \ to avoid ambiguity, or a larger set of control characters?

Regards,
Nikita