Newsgroups: php.internals
Date: Wed, 3 Feb 2021 15:52:00 +0100
To: tyson andre
Cc: Bob Weinand, PHP internals
Subject: Re: [PHP-DEV] [VOTE] Dump results of expressions in `php -a`
From: nikita.ppv@gmail.com (Nikita Popov)

On Wed, Feb 3, 2021 at 3:40 PM tyson andre wrote:

> Hi Bob Weinand,
>
> >>>> Voting has started on https://wiki.php.net/rfc/readline_interactive_shell_result_function
> >>>> on 2021-01-19, and ends on 2021-02-02.
> >>>>
> >>>> This RFC proposes to dump the results of non-null expressions using var_dump/var_export() by default in `php -a` (the interactive shell).
> >>>> Additionally, this adds a new function `readline_interactive_shell_result_function` to the readline PHP module.
> >>>> This function only affects interactive shells - it can optionally be used to set or clear a closure when `extension_loaded('readline') === true`,
> >>>> but that closure would only be called in interactive shells (i.e. php -a).
> >>>> (That closure would be called instead of the native implementation with the snippet of code that was evaluated and the expression's result,
> >>>> if a php statement contained a single expression such as `2+2;` or `$x = [1,2];` (that could be used as the expression of a return statement)
> >>>> - Dumping of expression results can be disabled using an ini setting or at runtime
> >>>>
> >>>> Thanks,
> >>>> - Tyson
> >>>
> >>> Hey Tyson,
> >>>
> >>> My main concern in this iteration of the RFC is: what happens with big/deeply nested objects?
> >>> They tend to spew tons of lines if var_dump()'ed. Do we have reasonable depth/output limitations in default dumping mode?
> >>>
> >>> I'm often enough using php -a to do some quick ad-hoc processing (example, read a big json file, and then access a value; instantiating a mediawiki bot framework and calling replace on it; ...).
> >>>
> >>> It's really cool to have any interactive feedback at all, but please, at least by default, limit the output. (An example is the JS REPL in browser console - it shows you a minimal preview of the object, and then you can expand with your mouse. Obviously with a pure cli application, this needs different - intuitive - navigation.)
> >>>
> >>> As it currently stands, this makes php -a unusable in any but the simplest cases, without just disabling the whole feature.
> >>>
> >>> I like the whole feature, but the missing output limitation (I have yet enough nightmares from var_dump()'ing the wrong object filling my shell with tons of irrelevant information… I don't need that potentially happening on every single evaluated expression)
> >>>
> >>> Thus I'm voting no, for now.
> >>
> >> As-is, the entire object or string would be dumped with var_export/var_dump to stdout.
> >>
> >> Thoughts on the adding following output truncation mechanism
> >> (for the default C result dumper implementation)
> >> before printing the results of the returned expression
> >> (the user output continues to be untruncated, and the existence of cli.pager
> >> would not affect this mechanism in case the binary is not actually a pager)?
> >> For arrays/objects used with var_dump - the equivalent of
> >> `ob_start(); var_dump($result); $result = ob_get_clean();`
> >> would have to be used first from C since var_dump still writes to the output buffer (php_printf(), etc.)
> >
> >var_dump() tends to be quite intensive in newline usage. I don't think var_dump() (as is) is the best mechanism to print. At least I'd use one property/one array argument = single line instead of two lines.
> >Additionally, when dumping a nested object, it's more valuable to see as much as possible from the primary object rather than the deep nesting.
> >
> >
> >> I'd omitted output truncation from the RFC because I wasn't sure how many people
> >> would consider it excessive to include a limit on var_dump output, and there was little feedback before the RFC vote started.
> >
> >Yeah, sorry for the late comment on that :-)
> >
> >> The simplest implementation would be to truncate to a byte limit and append `...` if truncated,
> >> but the main concern that's been brought up is the approximate number of lines.
> >> Obviously, there'd be a value in truncating the output if there were to be megabytes of output,
> >> though exactly what settings make sense as a default would vary.
> >
> >As mentioned a section earlier, there should be limits according to the nesting-level … e.g. if it's the top-level value, a string may have 20 lines printed and array and objects as well. And array entries/properties should then be in a single line or take up to max ... 20 / count(properties or array entries) lines (width before truncation can be determine from terminal width), which then can be mostly in one line flattened output.
> >The general rule here should be, show me my object/array at hand and give me a brief overview of what's nested within - and if I'm interested in detail, let me (manually) output that object.
> >Also consider, when dumping strings within (nested) arrays/objects to replace newlines with \n, for nicer display.
> >
> >> C would not print anything (e.g. `=> `) or even call var_dump/var_export if the limits were set to 0.
> >>
> >> ```
> >>
> >> const ASSUMED_BYTES_PER_LINE = 80;
> >> const ASSUMED_TAB_WIDTH = 4;
> >
> >On that topic, may make sense to also fetch the actual terminal width and heights respectively (ioctl TIOCGWINSZ).
> >I.e. if terminal is 20 lines high, printing 20 lines is a lot. if it's 80 lines high, 20 lines is acceptable.
> >
> >> // unmodified
> >> var_dump(truncate_string("test short string"));
> >> // 5000 'A's followed by "...\n"
> >> var_dump(truncate_string(str_repeat('A', 10000)));
> >> // 100 lines containing "A" followed by "...\n"
> >> var_dump(truncate_string(str_repeat("A\n", 10000)));
> >> ```
> >
> >I'd wager nobody needs a hundred lines, I'd use inspiration from debuggers which typically show 5-20 lines max.
> >IF you want to print it all, it's just as easy as `echo $var;` though.
> >
> >So overall: try to make efficient usage of horizontal as well as vertical space while still remaining usable and readable.
>
> I'm not sure if I want to do anything that complicated or if there'd be a consensus on how to indent each level.
> Having the same rules at any indent level would be my preference.
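
For readers following the thread, here is a minimal userland sketch of the byte-and-line truncation discussed in the quoted text. The body of `truncate_string()` is not part of the quoted message, so everything below is an assumption: the parameter names `$maxBytes` and `$maxLines` are hypothetical, and the defaults of 5000 bytes and 100 lines are chosen only so that the three quoted example calls behave as their comments describe. It is not the C implementation from the RFC.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: the name comes from the quoted example,
 * the body and the default limits are assumptions.
 */
function truncate_string(string $s, int $maxBytes = 5000, int $maxLines = 100): string
{
    $truncated = false;

    // Byte limit first. substr() works on bytes, so a multibyte
    // UTF-8 character at the cut point may be split.
    if (strlen($s) > $maxBytes) {
        $s = substr($s, 0, $maxBytes);
        $truncated = true;
    }

    // Line limit: find the offset just past the $maxLines-th newline.
    $end = 0;
    for ($line = 0; $line < $maxLines; $line++) {
        $newline = strpos($s, "\n", $end);
        if ($newline === false) {
            $end = strlen($s); // fewer than $maxLines newlines: keep everything
            break;
        }
        $end = $newline + 1;
    }
    if ($end < strlen($s)) {
        $s = substr($s, 0, $end);
        $truncated = true;
    }

    // Mark truncated output the way the quoted comments describe.
    return $truncated ? $s . "...\n" : $s;
}
```

The byte limit is applied before the line limit, so a string of megabytes is cut down before any newline scanning happens. The sketch uses fixed limits and does not query the terminal size (TIOCGWINSZ) as suggested in the reply.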
>
> Another idea I'd had would be to add a new function
> `var_dump_as_string(mixed $value, int $use_placeholders_after_lines = -1, int $use_placeholders_after_lines = -1)` to
>
> 1. Escape control characters in the same way as the proposed var_representation
> 2. Replace remaining fields with `...` if the line limits or byte limits would be exceeded by adding a key-value pair
>    in arrays/objects that have more fields. (go back and truncate the string if necessary)
> 3. Possibly revisit other representation choices, e.g. put values on the same lines as keys
>
> ```
> php > echo var_dump_as_string(range(0,99), use_placeholder_after_lines: 5);
> array(100) {
>   [0]=>int(0)
>   [1]=>int(1)
>   ...
> }
> php > var_dump("a\n\"b");
> string(4) "a
> "b"
> php > echo var_dump_as_string("a\n\"b");
> string(4) "a\n\"b"
> ```
>
> This would have the following benefits:
>
> 1. In addition to being usable in an interactive shell/REPL,
>    it would be possible to use this in other libraries/applications where dumping too much debug output is a concern,
>    or to invoke it manually (e.g. through a userland wrapper function `v($value)`) when debugging.
>    (where recursion detection and object ids are useful to have when debugging)
> 2. Performance and memory usage would be less of a concern when dumping extremely large objects.
> 3. This would avoid the chance of mixing php code output from `__debugInfo`/notices with the outputted string,
>    which is possible with ob_start()/ob_get_clean()
> 4. Avoid mixing in control characters such as newlines with debug output when a var_dump-like representation is needed.
>
> -Tyson
>

Seems like we've been circling around this topic for a while: PHP has a lot of dumping functions, but they all have their problems.
https://wiki.php.net/rfc/readable_var_representation adds another one, but it also doesn't cover this particular angle, because it is focussed on generating valid PHP code.
Possibly that RFC should take on a wider scope? One dump to rule them all ;)

Nikita
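
To make the `var_dump_as_string()` idea from the quoted message concrete, the following is a rough userland approximation, assuming PHP 8.0 or later. The function name `dump_sketch()`, its `$maxLines` and `$nested` parameters, and the choice to show nested arrays and objects only as a one-line summary are all assumptions made for illustration. It handles only null, scalars, and arrays, has no byte limit or recursion detection, and is not the function proposed in the thread.

```php
<?php
declare(strict_types=1);

// Characters escaped by the sketch: control characters, DEL, double quote, backslash.
const DUMP_SKETCH_ESCAPES = "\0..\37\"\\\177";

/**
 * Hypothetical sketch: values on the same line as keys, strings escaped
 * onto one line, and "..." once the line budget would be exceeded.
 */
function dump_sketch(mixed $value, int $maxLines = -1, bool $nested = false): string
{
    if ($value === null) {
        return 'NULL';
    }
    if (is_string($value)) {
        return 'string(' . strlen($value) . ') "' . addcslashes($value, DUMP_SKETCH_ESCAPES) . '"';
    }
    if (is_scalar($value)) {
        // int, float, bool: e.g. int(0), bool(true)
        return get_debug_type($value) . '(' . var_export($value, true) . ')';
    }
    if (!is_array($value)) {
        return get_debug_type($value) . ' {...}'; // objects and resources are only named
    }
    if ($nested) {
        return 'array(' . count($value) . ') {...}'; // brief overview of nested arrays
    }

    $lines = [];
    foreach ($value as $key => $item) {
        $keyText = is_int($key) ? (string) $key : '"' . addcslashes($key, DUMP_SKETCH_ESCAPES) . '"';
        $lines[] = "  [$keyText]=>" . dump_sketch($item, -1, true);
    }

    // The header, the footer, and the "..." line use three lines of the budget.
    if ($maxLines >= 0 && count($lines) + 2 > $maxLines) {
        $lines = array_slice($lines, 0, max(0, $maxLines - 3));
        $lines[] = '  ...';
    }

    $body = implode('', array_map(fn (string $l): string => $l . "\n", $lines));
    return 'array(' . count($value) . ") {\n" . $body . '}';
}
```

Calling `dump_sketch(range(0, 99), 5)` yields the same five-line shape as the quoted example, and `dump_sketch("a\n\"b")` keeps the string on one line. Because the result is built as a string rather than captured with `ob_start()`, output from notices cannot get mixed into it, which is one of the benefits listed in the quoted message.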