Hi internals,
I've created a PR https://github.com/php/php-src/pull/6619 to add an alternative to var_export().
The proposed short_var_export() also outputs/returns a parseable representation of a variable,
but adds requested features for var_export.
- Use null instead of NULL - the former is recommended by more coding
  guidelines (https://www.php-fig.org/psr/psr-2/).
- Change the way indentation is done for arrays/objects.
  See ext/standard/tests/general_functions/short_var_export1.phpt
  (e.g. always indent with 2 spaces, never 3 in objects, and put the array start on the
  same line as the key)
- Render lists as "[\n 'item1',\n]" rather than "array(\n 0 => 'item1',\n)"
- Prepend \ to class names so that generated code snippets can be used in
  namespaces (e.g. with eval()) without any issues.
- Support opt-in SHORT_VAR_EXPORT_SINGLE_LINE in int $flags.
  This will use a single-line representation for arrays/objects, though
  strings with embedded newlines will still cause newlines in the output.
  ("['item']")
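To make the list above concrete, here is a minimal userland sketch of the proposed output format for scalars and arrays. It is an illustration under stated assumptions, not the C implementation from the PR: the function name sketch_short_export() and its bool $singleLine parameter are hypothetical stand-ins for short_var_export() and SHORT_VAR_EXPORT_SINGLE_LINE, and objects (including the leading \ on class names) are out of scope.

```php
<?php
// Hypothetical userland sketch; NOT the implementation from the PR.
function sketch_short_export(mixed $value, bool $singleLine = false, int $depth = 0): string
{
    if ($value === null) {
        return 'null'; // var_export() prints NULL here
    }
    if (!is_array($value)) {
        // Other non-array values: defer to var_export() (objects are out of scope).
        return var_export($value, true);
    }
    if ($value === []) {
        return '[]';
    }

    // A list has the keys 0, 1, 2, ... in order; its keys are omitted in the output.
    $isList = array_keys($value) === range(0, count($value) - 1);

    $parts = [];
    foreach ($value as $key => $item) {
        $prefix = $isList ? '' : var_export($key, true) . ' => ';
        // Nested arrays start on the same line as their key.
        $parts[] = $prefix . sketch_short_export($item, $singleLine, $depth + 1);
    }

    if ($singleLine) {
        return '[' . implode(', ', $parts) . ']';
    }

    $indent = str_repeat('  ', $depth + 1); // always 2 spaces per level
    $closing = str_repeat('  ', $depth);
    return "[\n" . $indent . implode(",\n" . $indent, $parts) . ",\n" . $closing . ']';
}

echo sketch_short_export(['item1']), "\n";
echo sketch_short_export(['a' => null, 'b' => [1, 2]], true), "\n";
```

For comparison, var_export(['item1'], true) returns "array (\n  0 => 'item1',\n)" and var_export(null, true) returns 'NULL'.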
Changing var_export() itself was proposed 9 months ago (https://externals.io/message/109415)
but the RFC discussion seems to be on hold.
https://externals.io/message/106674#106684 suggested adding a new function as an alternative.
One of the major objections is that changing var_export()
(instead of adding a new function) would have these drawbacks:
- It would be impractical to backport/polyfill calls with 3 args to older php versions, if
  int $flags was added.
  ArgumentCountErrors are thrown in php 8.0 for too many parameters, and earlier php versions warn
- If defaults were changed, then thousands of unit tests in php-src and other projects may fail due to
  tests expecting the exact representation of a variable, requiring a lot of work to update tests
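The first drawback can be reproduced with a short script. This is a sketch that assumes a PHP 8.x runtime in which var_export() still accepts exactly two parameters; the helper name sketch_call_with_flags() is made up for the example.

```php
<?php
// Assumption: var_export() accepts at most two parameters, as in PHP 8.0.
function sketch_call_with_flags(): string
{
    try {
        // A hypothetical third "flags" argument that var_export() does not define.
        var_export([1], true, 0);
        return 'accepted';
    } catch (ArgumentCountError $e) {
        // Internal functions reject surplus arguments in PHP 8.
        return get_class($e);
    }
}

echo sketch_call_with_flags(), "\n";
```

Code written against a three-argument var_export() could therefore not run unchanged on such a version, and a userland polyfill cannot redefine an existing internal function.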
I feel like the user-friendliness of the enhancements makes adding another function
to PHP worth it (e.g. copying the output of short_var_export() into a code snippet such as the expectation for a unit test or a configuration).
I don't believe I've seen an RFC for a separate function before.
Also, a native C implementation would be much more efficient than a polyfill implemented in PHP,
e.g. https://github.com/brick/varexporter/issues/14
I plan to start an RFC document shortly. Thoughts?
(e.g. for the name, short_var_export() seemed more appropriate than var_export_short(), pretty_var_export() or canonical_var_export())
Thanks,
- Tyson
On Mon, Jan 18, 2021 at 10:12 PM tyson andre tysonandre775@hotmail.com
wrote:
> I've created a PR https://github.com/php/php-src/pull/6619 to add an
> alternative to var_export().
Hi Tyson,
The formatting of var_export is certainly a recurring complaint, and
previous discussions were not particularly open to changing current
var_export behavior, so adding a new function seems to be the way to
address the issue (the alternative would be to add a flag to var_export).
I like the idea of the "one line" flag. Actually, this is the main part I'm
interested in :) With the one line flag, this produces the ideal formatting
for PHPT tests that want to print something like "$v1 + $v2 = $v3". None of
our current dumping functions are suitable for this purpose (json_encode
comes closest, but has edge cases like lack of NAN support.)
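The NAN edge case can be shown with a PHPT-style "$v1 + $v2 = $v3" line. This is a small sketch using only existing functions; the variable names are arbitrary.

```php
<?php
$v1 = NAN;
$v2 = 1.5;
$v3 = $v1 + $v2;

// json_encode() cannot represent NAN: it returns false (the warning is suppressed here).
$json = @json_encode($v3);

// var_export() keeps each float as parseable PHP on one line,
// but would switch to multi-line output as soon as an operand is an array.
$line = var_export($v1, true) . ' + ' . var_export($v2, true) . ' = ' . var_export($v3, true);

var_dump($json);
echo $line, "\n";
```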
Some notes:
- You should drop the $return parameter and make it always return. As this
  is primarily an export and not a dumping function, printing to stdout
  doesn't make sense to me.
- For strings, have you considered printing them as double-quoted and
  escaping more characters? This would avoid newlines in oneline mode. And
  would allow you to escape more control characters. I also find the current
  '' . "\0" . '' format for encoding null bytes quite awkward.
- I don't like the short_var_export() name. Is "short" really the primary
  characteristic of this function? Both var_export_pretty and
  var_export_canonical seem better to me, though I can't say they're great
  either. I will refrain from proposing real_var_export() ... oops :P
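The double-quoted string idea from the second note could look roughly like the following. This is a hypothetical userland sketch, not something the PR is known to implement: the helper name sketch_double_quoted() and the choice of \xNN for unnamed control characters are assumptions.

```php
<?php
// Hypothetical helper: render a string as a single-line, double-quoted PHP literal.
function sketch_double_quoted(string $value): string
{
    // Characters with a short escape; everything else in the class below becomes \xNN.
    $named = ["\n" => '\n', "\r" => '\r', "\t" => '\t', '"' => '\"', '$' => '\$', '\\' => '\\\\'];

    $escaped = preg_replace_callback(
        '/[\x00-\x1f\x7f"$\\\\]/', // control characters, DEL, ", $ and backslash
        static fn (array $m): string => $named[$m[0]] ?? sprintf('\x%02x', ord($m[0])),
        $value
    );

    return '"' . $escaped . '"';
}

echo var_export("a\0b\n", true), "\n";    // current format: concatenation plus a raw newline
echo sketch_double_quoted("a\0b\n"), "\n"; // one line
```

Escaping $ matters here because a double-quoted literal would otherwise interpolate variables when the snippet is evaluated.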
Regards,
Nikita
Sorry for the plug, but you may be interested in getting inspiration from
brick/varexporter:
https://github.com/brick/varexporter
In particular:
- uses short array syntax always
- doesn't output array keys for lists
- can output lists of scalars on one line, under a flag: INLINE_NUMERIC_SCALAR_ARRAY
Benjamin
Hi Nikita,
> The formatting of var_export is certainly a recurring complaint, and previous discussions were not particularly open to changing current var_export behavior, so adding a new function seems to be the way to address the issue (the alternative would be to add a flag to var_export).
> I like the idea of the "one line" flag. Actually, this is the main part I'm interested in :) With the one line flag, this produces the ideal formatting for PHPT tests that want to print something like "$v1 + $v2 = $v3". None of our current dumping functions are suitable for this purpose (json_encode comes closest, but has edge cases like lack of NAN support.)
> Some notes:
> * You should drop the $return parameter and make it always return. As this is primarily an export and not a dumping function, printing to stdout doesn't make sense to me.
It seems inconsistent and prone to bugs when refactoring (e.g. converting to string and not using the result)
to have two functions named var_export where one prints by default and the other doesn't, but otherwise .
Changing to a different name entirely would solve that, such as var_repr(), var_representation(), serialize_[value_]as_php_snippet(), etc.
> * For strings, have you considered printing them as double-quoted and escaping more characters? This would avoid newlines in oneline mode. And would allow you to escape more control characters. I also find the current '' . "\0" . '' format for encoding null bytes quite awkward.
I'd considered that but wasn't sure if anyone else would want that - I also found it annoying for \0 and \n, as well as other control characters (0x00-0x1f, e.g. \r\t)
> * I don't like the short_var_export() name. Is "short" really the primary characteristic of this function? Both var_export_pretty and var_export_canonical seem better to me, though I can't say they're great either. I will refrain from proposing real_var_export() ... oops :P
Potentially not shorter with the extra backslash and the potential for switching to "$var\0\ntest\" for string encoding.
var_representation($var, int $flags=0) seems clearer
Thanks,
Tyson
On 19/01/2021 at 16:12, tyson andre wrote:
> It seems inconsistent and prone to bugs when refactoring (e.g. converting to string and not using the result)
> to have two functions named var_export where one prints by default and the other doesn't, but otherwise .
> Changing to a different name entirely would solve that, such as var_repr(), var_representation(), serialize_[value_]as_php_snippet(), etc.
Hello,
I think that decoupling it from var_export() and naming it differently
is fine. var_repr() is not a bad name, a bit cryptic, but for those who
do Python, it's familiar.
Regards,
--
Pierre
> It seems inconsistent and prone to bugs when refactoring (e.g. converting to string and not using the result)
> to have two functions named var_export where one prints by default and the other doesn't, but otherwise .
> Changing to a different name entirely would solve that, such as var_repr(), var_representation(), serialize_[value_]as_php_snippet(), etc.
I am considering trying to introduce a __repr() magic method that is
similar to the repr() method from Python. Having a var_repr() function
which does not use the __repr() method would be confusing. So I would
like to suggest not using that name.
As the intent of the resulting string is to obtain PHP code that will
construct the variable, another option might be: var_constructor().
Regards,
Dik Takken
> I am considering to try and introduce a __repr() magic method that is
> similar to the repr() method from Python.
No more magic methods, please. We have too many already.
> As the intent of the resulting string is to obtain PHP code that will
> construct the variable, another option might be: var_constructor().
Wouldn't this be extremely confusing given that we use the name constructor
for OOP?
Hi internals,
>> As the intent of the resulting string is to obtain PHP code that will
>> construct the variable, another option might be: var_constructor().
> Wouldn't this be extremely confusing given that we use the name constructor for OOP?
I'd agree, var_constructor($x) could be mistaken for a shorthand for Closure::fromCallable([$x, '__construct'])
- Tyson
> - You should drop the $return parameter and make it always return. As this
>   is primarily an export and not a dumping function, printing to stdout
>   doesn't make sense to me.
I'd argue the opposite. If dumping a particularly large tree of elements,
serializing that to a single string before then being able to write it to
file or wherever seems like packing on a lot of unnecessary effort. What I
would do is expand the purpose of the $output parameter to take a stream.
STDOUT by default, a file stream for writing to include files (one of the
more common uses), or even a tmpfile() if you do actually want it in a var.
** Or.... See my comment about objects further down.
> - I don't like the short_var_export() name. Is "short" really the primary
>   characteristic of this function? Both var_export_pretty and
>   var_export_canonical seem better to me, though I can't say they're great
>   either. I will refrain from proposing real_var_export() ... oops :P
I would also make var_export the dominant part of the name, so
var_export_SOMETHING().
Alternatively how about making a VarExporter class.
$exporter = new VarExporter; // Defaults to basic set of encoding options TBD
$exporter->setIndent('  '); // 2 spaces, 1 tab, whatever blows your dress up
$exporter->setUseShortArray(false); // e.g. use array(...)
// etc...
$serialized = $exporter->serialize($var); // Exports to a var
$exporter->serializeToFile($var, '/tmp/include.inc'); // Exports to a file
$exporter->serializeToStream($var, $stream); // Exports to an already open stream
And if you want the defaults, then just:
$serialized = (new VarExporter)->serialize($var);
Potentially, one could also allow overriding helper methods to perform
transformations along the way:
// VarExporter which encodes all strings as base64 blobs.
class Base64StringVarExporter extends VarExporter {
    public function encodeString(string $var): string {
        // parent behavior is `return '"' . addslashes($var) . '"';`
        return "base64_decode('" . base64_encode($var) . "')";
    }
}
Not the most performant thing, but extremely powerful.
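For readers who want to try the class-based idea, here is a rough runnable userland sketch. Everything in it is an assumption made for illustration: no VarExporter class exists in PHP core, the names SketchVarExporter and Base64StringSketchExporter are invented, and only strings, arrays, null and other scalars are handled.

```php
<?php
// Hypothetical userland sketch of the class-based idea; not a core API.
class SketchVarExporter
{
    private string $indent = '  ';

    public function setIndent(string $indent): void
    {
        $this->indent = $indent;
    }

    public function serialize(mixed $var): string
    {
        return $this->encode($var, 0);
    }

    /** @param resource $stream an already open, writable stream */
    public function serializeToStream(mixed $var, $stream): void
    {
        // The full string is built first, so a failure cannot leave partial output behind.
        fwrite($stream, $this->serialize($var));
    }

    // Overridable hook: subclasses decide how string values are rendered.
    public function encodeString(string $var): string
    {
        return var_export($var, true);
    }

    protected function encode(mixed $var, int $depth): string
    {
        if (is_string($var)) {
            return $this->encodeString($var);
        }
        if (!is_array($var)) {
            return $var === null ? 'null' : var_export($var, true);
        }
        if ($var === []) {
            return '[]';
        }
        $pad = str_repeat($this->indent, $depth + 1);
        $out = "[\n";
        foreach ($var as $key => $item) {
            // Keys are always written with var_export(); only values go through the hook.
            $out .= $pad . var_export($key, true) . ' => ' . $this->encode($item, $depth + 1) . ",\n";
        }
        return $out . str_repeat($this->indent, $depth) . ']';
    }
}

// Encodes all string values as base64 blobs, as in the example above.
class Base64StringSketchExporter extends SketchVarExporter
{
    public function encodeString(string $var): string
    {
        return "base64_decode('" . base64_encode($var) . "')";
    }
}

echo (new Base64StringSketchExporter)->serialize(['k' => "bin\0ary"]), "\n";
```

The generated snippet stays valid PHP because base64_decode() is evaluated when the exported code is run.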
-Sara
Hi Sara Golemon,
>> * You should drop the $return parameter and make it always return. As this
>>   is primarily an export and not a dumping function, printing to stdout
>>   doesn't make sense to me.
> I'd argue the opposite. If dumping a particularly large tree of elements, serializing that to a single string before then being able to write it to file or wherever seems like packing on a lot of unnecessary effort. What I would do is expand the purpose of the $output parameter to take a stream.
> STDOUT by default, a file stream for writing to include files (one of the more common uses), or even a tmpfile()
> if you do actually want it in a var.
There are 3 drawbacks to that proposal that I don't like:
- If a function taking a stream were to throw or encounter a fatal error while converting an object to a stream, then you'd write an incomplete object to the stream or file, which would have to be deleted.
  E.g. internally, fprintf() and printf() call sprintf before writing anything to the stream for related reasons.
- This may be much slower and end users may not expect that - a lot of small stream writes with dynamic C function calls would be something I'd expect to take much longer than converting to a string then writing to the stream.
  (e.g. I assume a lot of small echo $str; is much faster than \fwrite(\STDOUT, $str); in the internal C implementation)
  (if we call serialize() first, then there's less of a reason to expose serializeFile() and serializeStream)
- Adding even more ways to dump to a stream/file. Should that include stream wrappers such as http://?
  For something like XML/YAML/CSV, that makes sense because those are formats many other applications/languages can consume,
  which isn't the case for var_export.
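The "convert to a string first, then write" approach from the first drawback can be sketched in userland with existing functions. The helper name sketch_export_to_file() and the temporary-file naming are assumptions made for this example.

```php
<?php
// Hypothetical helper: export a value to an include file without ever
// leaving a half-written file behind.
function sketch_export_to_file(mixed $value, string $path): void
{
    // Any exception or fatal error happens here, before a single byte is written.
    $code = "<?php\nreturn " . var_export($value, true) . ";\n";

    // Write to a temporary sibling file, then rename it into place.
    $tmp = $path . '.' . bin2hex(random_bytes(4)) . '.tmp';
    if (file_put_contents($tmp, $code) === false || !rename($tmp, $path)) {
        @unlink($tmp);
        throw new RuntimeException("Could not write $path");
    }
}

$path = sys_get_temp_dir() . '/sketch_export_' . getmypid() . '.inc';
sketch_export_to_file(['debug' => false, 'hosts' => ['a', 'b']], $path);
$config = require $path; // the include file returns the original array
unlink($path);
```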
> ** Or.... See my comment about objects further down.
>> * I don't like the short_var_export() name. Is "short" really the primary
>>   characteristic of this function? Both var_export_pretty and
>>   var_export_canonical seem better to me, though I can't say they're great
>>   either. I will refrain from proposing real_var_export() ... oops :P
> I would also make var_export the dominant part of the name, so var_export_SOMETHING().
If I were to go with that in the RFC, the signature would be var_export_something($value, bool $return=false, int $flags=0)
> Alternatively how about making a VarExporter class.
> $exporter = new VarExporter; // Defaults to basic set of encoding options TBD
> $exporter->setIndent('  '); // 2 spaces, 1 tab, whatever blows your dress up
> $exporter->setUseShortArray(false); // e.g. use array(...)
> // etc...
> $serialized = $exporter->serialize($var); // Exports to a var
> $exporter->serializeToFile($var, '/tmp/include.inc'); // Exports to a file
> $exporter->serializeToStream($var, $stream); // Exports to an already open stream
> And if you want the defaults, then just:
> $serialized = (new VarExporter)->serialize($var);
> Potentially, one could also allow overriding helper methods to perform transformations along the way:
> // VarExporter which encodes all strings as base64 blobs.
> class Base64StringVarExporter extends VarExporter {
>     public function encodeString(string $var): string {
>         // parent behavior is `return '"' . addslashes($var) . '"';`
>         return "base64_decode('" . base64_encode($var) . "')";
>     }
> }
> Not the most performant thing, but extremely powerful.
My main concerns are that
- I would want to deal with recursive data structures somehow, probably by throwing or printing recursion.
  Supporting reentrancy such as $this->encode($mixedValue, $depth + 1) is possible, or possibly just limiting to strings.
  Overriding strings may be useful for binary data that's invalid utf-8, but that can also be done by passing
  ['flags' => $flags, 'string_encoder' => $callback, 'indent' => '?']
  if the global function's functionality were to be extended to allow other functionality in a future rfc.
- I don't know how often the functionality of extending VarExporter would be used in practice.
  Userland alternatives such as https://github.com/brick/varexporter/ already exist for extremely customizable output such as putting scalars all on the same line
- Using classes adds verbosity, making it more likely for code to keep using var_export($var, true) over (new \VarExporter())->serialize($var)
-Tyson
> Thoughts?
> (e.g. for the name, short_var_export() seemed more appropriate than var_export_short(), pretty_var_export() or canonical_var_export())
While I agree that all the suggestions in this thread would improve
var_export, I worry that it is failing a "smell test" that I often apply:
"If you're struggling to come up with the appropriate name for something
that you're creating, maybe you're creating the wrong thing."
In this case, the reason it's difficult to name is that PHP already has
rather a lot of different ways to produce a human-readable string from a
variable. The synopses in the manual aren't particularly enlightening:
- print_r — Prints human-readable information about a variable
- var_dump — Dumps information about a variable
- var_export — Outputs or returns a parsable string representation of a
variable
Then there's the slightly more exotic (and rather less useful than it
once was) debug_zval_dump; serialization formats that are reasonably
human-friendly like json_encode; and any number of frameworks and
userland libraries that define their own "dumper" functions because they
weren't satisfied with any of the above.
The name of any new function in this crowded space needs to somehow tell
the user why they'd use this one over the others - and, indeed, when
they wouldn't use it over the others.
Should we be aiming for a single function that can take over from some
or all of the others, and deprecate them, rather than just adding to the
confusion?
Regards,
--
Rowan Tommins
[IMSoP]
On 20 Jan 2021 at 19:50, Rowan Tommins rowan.collins@gmail.com wrote:
> Should we be aiming for a single function that can take over from some or all of the others, and deprecate them, rather than just adding to the confusion?
Or short_var_export() could just reuse the existing __debugInfo() magic method (which is already used by both var_dump() and print_r, but not by var_export()). The programmer would just need to make sure that __set_state() can consume the array produced by __debugInfo().
—Claude
> Or short_var_export() could just reuse the existing __debugInfo()
> magic method (which is already used by both var_dump() and
> print_r, but not by var_export()). The programmer would just need
> to make sure that __set_state() can consume the array produced by
> __debugInfo().
This is an interesting question to explore, but I don't follow how this
relates to my previous e-mail. I'm saying that having print_r, var_dump,
var_export, and var_export_short (or whatever we call it) all in one
language is more confusing than helpful.
Regards,
--
Rowan Tommins
[IMSoP]
On Wed, Jan 20, 2021 at 3:44 PM Rowan Tommins rowan.collins@gmail.com
wrote:
>> Or short_var_export() could just reuse the existing __debugInfo()
>> magic method (which is already used by both var_dump() and
>> print_r, but not by var_export()). The programmer would just need
>> to make sure that __set_state() can consume the array produced by
>> __debugInfo().
> This is an interesting question to explore, but I don't follow how this
> relates to my previous e-mail. I'm saying that having print_r, var_dump,
> var_export, and var_export_short (or whatever we call it) all in one
> language is more confusing than helpful.
IMO print_r/var_dump should be kept out of this discussion. Those are human
readable outputs for human consumption. var_export() is about a machine
readable output for recreating initial state within a runtime. The
requirements presented are wholly different.
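As a reminder of what "recreating initial state" means in practice, here is a small sketch of the var_export() and __set_state() round trip for an object. The class SketchPoint is invented for the example and lives in the global namespace, so the generated snippet resolves with or without a leading backslash on the class name.

```php
<?php
// Hypothetical example class used only to demonstrate the round trip.
final class SketchPoint
{
    public function __construct(public int $x = 0, public int $y = 0)
    {
    }

    // var_export() emits a call to this static method for objects of this class.
    public static function __set_state(array $properties): self
    {
        return new self($properties['x'], $properties['y']);
    }
}

$code = var_export(new SketchPoint(1, 2), true); // parseable PHP source
$copy = eval('return ' . $code . ';');           // rebuilds an equal object

var_dump($copy == new SketchPoint(1, 2));
```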
-Sara
Hi Sara Golemon and Rowan Tommins,
>>> Or short_var_export() could just reuse the existing __debugInfo()
>>> magic method (which is already used by both var_dump() and
>>> print_r, but not by var_export()). The programmer would just need
>>> to make sure that __set_state() can consume the array produced by
>>> __debugInfo().
>> This is an interesting question to explore, but I don't follow how this
>> relates to my previous e-mail. I'm saying that having print_r, var_dump,
>> var_export, and var_export_short (or whatever we call it) all in one
>> language is more confusing than helpful.
> IMO print_r/var_dump should be kept out of this discussion. Those are human
> readable outputs for human consumption. var_export() is about a machine
> readable output for recreating initial state within a runtime. The
> requirements presented are wholly different.
Even python has str(), repr(), and a separate https://docs.python.org/3/library/pprint.html module for pretty printing,
and https://docs.python.org/3/library/reprlib.html
I'd agree with leaving print_r/var_dump out of this - using __debugInfo() in a manner it isn't intended for
and passing different arrays to __set_state in var_export and short_var_export/var_representation would be
an inconsistency I'd find surprising as an end user.
Even as a machine-readable output, var_export has limitations
but there's resistance to changing it and a lot of pre-existing code depending on its current behavior -
it predates namespaces so there's no leading backslash,
the output is much longer than it needs to be and takes longer to generate if the output is saved somewhere or sent over the network
(e.g. million-element array with keys and indentation)
As for print_r(), my stance is that the limitations of print_r() should be documented on php.net with alternatives recommended
(false, null, and the empty string all become the empty string, for example),
and I avoid using it for new code,
but there's not a compelling reason to remove it at this time - doing so would just be a barrier to upgrading
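The print_r() limitation mentioned above is easy to demonstrate with existing functions; the snippet below is only an illustration and the variable names are arbitrary.

```php
<?php
$values = ['false' => false, 'null' => null, 'empty string' => ''];

$printed = [];
$exported = [];
foreach ($values as $label => $value) {
    $printed[$label] = print_r($value, true);     // '' for all three values
    $exported[$label] = var_export($value, true); // 'false', 'NULL' and "''"
}

var_dump(count(array_unique($printed)));  // how many distinct print_r() results
var_dump(count(array_unique($exported))); // how many distinct var_export() results
```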
-Tyson
> IMO print_r/var_dump should be kept out of this discussion. Those are human
> readable outputs for human consumption. var_export() is about a machine
> readable output for recreating initial state within a runtime. The
> requirements presented are wholly different.
In that case, why are we spending so much time discussing how to make it look nicer for human consumption?
Of the changes proposed, I think only the leading backslash on class names actually makes a difference to machine readability, and that could probably just be changed and documented as fixing the existing functionality.
What are the requirements of var_dump that couldn't be merged with var_export?
Regards,
Rowan Tommins
[IMSoP]
On 21 Jan 2021 at 00:19, Sara Golemon pollita@php.net wrote:
> IMO print_r/var_dump should be kept out of this discussion. Those are human
> readable outputs for human consumption. var_export() is about a machine
> readable output for recreating initial state within a runtime. The
> requirements presented are wholly different.
> -Sara
If the goal of var_export is only to have some machine-readable output, the following will do it:
<?php
function my_var_export(mixed $x): string {
$serialized = \base64_encode(\serialize($x));
return "\unserialize(\base64_decode('$serialized'))";
}
?>
In reality, the output of var_export() is both machine-readable and human-readable.
—Claude