Hi internals,
I've created https://wiki.php.net/rfc/readable_var_representation based on
my original proposal in https://externals.io/message/112924
This RFC proposes adding a new function var_representation(mixed $value, int $flags=0): string
with the following differences from var_export:
- var_representation() unconditionally returns a string
- Use null instead of NULL - lowercase is recommended by more coding guidelines (https://www.php-fig.org/psr/psr-2/).
- Change the way indentation is done for arrays/objects. See ext/standard/tests/general_functions/short_var_export1.phpt (e.g. always add 2 spaces, never 3 in objects, and put the array start on the same line as the key)
- Render lists as "['item1']" rather than "array(\n 0 => 'item1',\n)". Always render empty lists on a single line, render multiline by default when there are 1 or more elements
- Prepend \ to class names so that generated code snippets can be used in namespaces without any issues.
- Support VAR_REPRESENTATION_SINGLE_LINE in $flags. This will use a single-line representation for arrays/objects, though strings with embedded newlines will still cause newlines in the output.
- If a string contains control characters ("\x00"-"\x1f" and "\x7f" (backspace)), then represent the entire string as a double quoted string escaping \r, \n, \t, \$, \\, and \", in addition to escaping remaining control characters with hexadecimal encoding (\x00, \x7f, etc)
This is different from my original proposal in two ways:
- The function signature and name changed from my previous proposal. It now always returns a string.
- Backspace control characters (\x7f) are now also escaped.
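Editorial illustration of the string-escaping rule listed above: a minimal userland sketch, not the RFC's implementation. The function name sketch_repr_string is hypothetical, and falling back to var_export() for strings without control characters is an assumption made for brevity.

```php
<?php
// Hypothetical helper; it only mimics the escaping rule described in the list above.
function sketch_repr_string(string $s): string
{
    // No control characters ("\x00"-"\x1f", "\x7f"): keep var_export's single-quoted form.
    if (!preg_match('/[\x00-\x1f\x7f]/', $s)) {
        return var_export($s, true);
    }
    // Characters that get a short, named escape inside double quotes.
    $named = ["\r" => '\r', "\n" => '\n', "\t" => '\t', '$' => '\$', '\\' => '\\\\', '"' => '\"'];
    $out = '';
    foreach (str_split($s) as $ch) {
        if (isset($named[$ch])) {
            $out .= $named[$ch];
        } elseif (ord($ch) < 0x20 || ord($ch) === 0x7f) {
            // Remaining control characters use hexadecimal encoding.
            $out .= sprintf('\x%02x', ord($ch));
        } else {
            $out .= $ch;
        }
    }
    return '"' . $out . '"';
}
```

With this sketch, echo sketch_repr_string("Content-Length: 10\r\n"); prints "Content-Length: 10\r\n" on one line with the backslashes visible, and the result can still be passed back through eval() to recover the original bytes.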
Thanks,
- Tyson
Hi internals,
I've created https://wiki.php.net/rfc/readable_var_representation based on
my original proposal in https://externals.io/message/112924
This RFC proposes adding a new function
var_representation(mixed $value, int $flags=0): string
My hesitation remains that this is just duplicating existing
functionality with only cosmetic differences.
As a user of PHP 8.1, how would I decide whether to use print_r,
var_dump, var_export, or var_representation?
And under what circumstances would I bother to write
"var_representation($var, VAR_REPRESENTATION_SINGLE_LINE);"?
Regards,
--
Rowan Tommins
[IMSoP]
Hi Rowan Tommins,
Hi internals,
I've created https://wiki.php.net/rfc/readable_var_representation based on
my original proposal in https://externals.io/message/112924
This RFC proposes adding a new function
var_representation(mixed $value, int $flags=0): string
My hesitation remains that this is just duplicating existing
functionality with only cosmetic differences.
As a user of PHP 8.1, how would I decide whether to use print_r,
var_dump, var_export, or var_representation?
And under what circumstances would I bother to write
"var_representation($var, VAR_REPRESENTATION_SINGLE_LINE);"?
You have a good point on needing to document the circumstances where this would be useful.
I meant to write up the reasons where an end user would
or wouldn't want this functionality - it's useful when reviewing or voting on the RFC.
I've added those to https://wiki.php.net/rfc/readable_var_representation#when_would_a_user_use_var_representation,
including when I think the VAR_REPRESENTATION_SINGLE_LINE flag would be useful to users.
var_representation may be useful to a user when any of the following apply:
- You are generating a snippet of code to eval() in a situation where the snippet will occasionally or frequently be read by a human. (If the output never needs to be read by a human, return unserialize(' . var_export(serialize($data), true) . '); can be used.)
- The output is occasionally or frequently read by humans (e.g. CLI or web app output, a REPL, unit test output, etc.).
- The output contains control characters such as newlines, tabs, \r or \x00 and may be viewed or edited by other users in a text editor/IDE. Many IDEs may convert from Windows line endings (\r\n) to Unix line endings (\n) automatically.
- You want to unambiguously see control characters in the raw output regardless of how likely they are (e.g. dumping php ini settings, debugging mysterious test failures, etc.)
- You are writing unit tests for applications supporting PHP 8.1+ (or a var_representation polyfill) that test the exact string representation of the output (e.g. phpt tests of php-src and PECL extensions)
- You need to copy the output into a codebase that's following a modern coding style guideline such as PSR-2. It also saves time if you don't have to remove array keys of lists and convert array() to [].
(I added that and a section on when/why a user may decide to use VAR_REPRESENTATION_SINGLE_LINE in the RFC)
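Editorial illustration of the first bullet above: a small sketch, using only existing functions, of the two ways to generate an evaluable snippet today. The sample $data array is made up for the example.

```php
<?php
// Sample data (invented for this sketch).
$data = ['headers' => "Content-Length: 10\r\n", 'retries' => 3];

// Human-readable, evaluable form produced by var_export().
$readable = 'return ' . var_export($data, true) . ';';

// Opaque but evaluable form, as mentioned in the first bullet.
$opaque = 'return unserialize(' . var_export(serialize($data), true) . ');';

// Both snippets evaluate back to the original value.
var_dump(eval($readable) === $data);
var_dump(eval($opaque) === $data);
```

Both snippets round-trip the value; they differ only in how pleasant the generated code is to read and edit, which is the distinction the list is drawing.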
As for print_r(): https://www.php.net/print_r seems underdocumented - it doesn't mention how booleans, null, and floats are represented (same as print()) except in user notes.
The online documentation for var_export/var_dump explains what they do but not why/when they would be used;
it may be worthwhile to submit a PR to expand on those.
Regards,
Tyson
I've added those to https://wiki.php.net/rfc/readable_var_representation#when_would_a_user_use_var_representation,
including when I think the VAR_REPRESENTATION_SINGLE_LINE flag would be useful to users.
Thanks, it's good to at least know your opinion on this, even if I don't
fully agree with it. :)
var_representation may be useful to a user when any of the following apply:
- You are generating a snippet of code to eval() in a situation where the snippet will occasionally or frequently be read by a human (If the output never needs to be read by a human, return unserialize(' . var_export(serialize($data), true) . '); can be used)
As far as I know I have never had any reason to generate code and then
eval() it, and can't think of a situation where I ever would. If I
wanted a machine-readable output from a variable, I would use
serialize() or json_encode().
That's not to say that there aren't cases where those requirements do
happen, but I think it is a very niche use case to dedicate two
different built-in functions to.
- You are writing unit tests for applications supporting PHP 8.1+ (or a var_representation polyfill) that test the exact string representation of the output (e.g. phpt tests of php-src and PECL extensions)
Since test output doesn't need to be executable, I would have thought
var_dump would be more appropriate than var_export here.
Checking on php-src master, this does seem to be the case in the
majority of tests:
- var_export appears 703 times in 136 different *.phpt files (0.8% of files)
- print_r appears 827 times in 342 different *.phpt files (2.1% of files)
- var_dump appears 33503 times in 9599 different *.phpt files (59.7% of
files)
So if we want to improve anything for that use case, we need to improve
or replace var_dump, not var_export.
- You need to copy the output into a codebase that's following a modern coding style guideline such as PSR-2. It also saves time if you don't have to remove array keys of lists and convert array() to [].
Trying to match any particular coding style seems rather outside the
remit of a built-in function - do we need flags for tabs vs spaces,
trailing commas, etc, etc? Surely it's simpler for users to take the
existing var_export format and use their IDE or dev scripts to re-format
it to taste.
Regards,
--
Rowan Tommins
[IMSoP]
Hi Rowan Tommins,
var_representation may be useful to a user when any of the following apply:
- You are generating a snippet of code to eval() in a situation where the snippet will occasionally or frequently be read by a human (If the output never needs to be read by a human, return unserialize(' . var_export(serialize($data), true) . '); can be used)
As far as I know I have never had any reason to generate code and then
eval() it, and can't think of a situation where I ever would. If I
wanted a machine-readable output from a variable, I would use
serialize() or json_encode().
That's not to say that there aren't cases where those requirements do
happen, but I think it is a very niche use case to dedicate two
different built-in functions to.
Even if a developer such as yourself doesn't need to generate code and doesn't expect to directly use it,
some of the applications and libraries they do use every day would need to generate human and machine readable code.
That output is then shown to users of those libraries/applications,
or saved to files that would need to be looked at by users submitting bug reports or trying to understand the issue,
e.g. trying to understand why a unit test mock isn't doing what they'd expect.
For example, some composer autoload files are generated using var_export,
and composer uses var_export for generating some exception messages,
and it may be useful to have $e->getMessage() be a single line.
https://github.com/composer/composer/blob/master/src/Composer/Repository/FilesystemRepository.php#L205-L208
(if users were trying to diagnose composer not autoloading the class, having these files be more readable would be useful
many years from now if composer's minimum version became php 8.1)
And a subset of the uses of var_export in the dependencies of a project I'm using:
vendor/sebastian/global-state/src/CodeExporter.php
// in protected function recursiveExport
67: return \var_export($variable, true);
70: return 'unserialize(' . \var_export(\serialize($variable), true) . ')';
vendor/phpunit/php-code-coverage/src/Report/PHP.php
16: * Uses `var_export()` to write a SebastianBergmann\CodeCoverage\CodeCoverage object to a file.
37: \var_export($coverage->getData(true), true),
vendor/phpspec/prophecy/src/Prophecy/Doubler/Generator/ClassCodeGenerator.php
104: $php .= ' = '.var_export($argument->getDefault(), true);
vendor/symfony/console/Descriptor/MarkdownDescriptor.php
62: .'* Default: `'.str_replace("\n", '', var_export($argument->getDefault(), true)).'`'
If you want to distinguish between array, stdClass, and MyClass, json_encode isn't adequate.
- You are writing unit tests for applications supporting PHP 8.1+ (or a var_representation polyfill) that test the exact string representation of the output (e.g. phpt tests of php-src and PECL extensions)
Since test output doesn't need to be executable, I would have thought
var_dump would be more appropriate than var_export here.
https://www.php.net/var_dump does not have an option to save to a string - it outputs to stdout.
If __debugInfo or an error handler echoes anything, that would also get captured by output buffering and interfere with the test.
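Editorial illustration of the point above: a minimal sketch of what capturing var_dump() output as a string looks like today. The helper name dump_to_string is hypothetical. Anything else echoed while the buffer is open (for example by an error handler) would end up in the same string.

```php
<?php
// Hypothetical helper name; var_dump() itself has no "return as string" mode.
function dump_to_string(mixed $value): string
{
    ob_start();              // start capturing everything written to the output
    var_dump($value);
    return ob_get_clean();   // return the captured text and end the buffer
}

$captured = dump_to_string("test");

// var_export() can return its result directly, without output buffering.
$exported = var_export("test", true);
```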
Additionally, humans need to update the test expectations and to read the test output when it fails.
If that output contains control characters or unexpectedly mixes line endings, it is inconvenient to work with files using the raw output of var_export/var_dump.
As I mentioned before, var_export suffers from many shortcomings such as the fact
that it can have more lines of output than var_dump for complex datastructures,
and doesn't escape control characters.
Checking on php-src master, this does seem to be the case in the
majority of tests:
- var_export appears 703 times in 136 different *.phpt files (0.8% of files)
- print_r appears 827 times in 342 different *.phpt files (2.1% of files)
- var_dump appears 33503 times in 9599 different *.phpt files (59.7% of files)
So if we want to improve anything for that use case, we need to improve
or replace var_dump, not var_export.
php-src phpt tests are for tests of php itself, which puts strict and atypical limitations on the test framework.
php-src's phpt test framework may represent needs of php-src and some pecl maintainers, not userland.
In a userland project, I could easily choose to add a whole lot of utility functions/methods such as
function dump_repr($value) { echo var_representation($value), "\n"; }
and use that to replace var_dump.
I consider adding those helper methods to php-src itself impractical because tests of php-src should be self-contained.
If you encounter an issue with the engine, opcache or JIT, it's much, much harder to diagnose if dozens of userland helper functions
were loaded and invoked before the snippet in question was invoked.
So I do use var_dump in phpt tests myself, mainly because:
- Currently, control characters such as \r are not escaped, so I'm more confident string(4) "test" has no control characters
- var_export does not append a newline, so it's more convenient to copy the .out file into the --OUT-- section
- I'm avoiding adding reusable helpers in a self-contained test case
- var_export output makes the overall test output longer for arrays of arrays
Still, I would prefer using var_representation over var_dump in phpt for many use cases, especially with VAR_REPRESENTATION_SINGLE_LINE available.
- You need to copy the output into a codebase that's following a modern coding style guideline such as PSR-2. It also saves time if you don't have to remove array keys of lists and convert array() to [].
Trying to match any particular coding style seems rather outside the
remit of a built-in function - do we need flags for tabs vs spaces,
trailing commas, etc, etc? Surely it's simpler for users to take the
existing var_export format and use their IDE or dev scripts to re-format
it to taste.
Again, the IDE may have issues with the control characters in strings or unexpectedly remove or add them
(e.g. windows vs unix newlines)
For tabs vs spaces, Sara Golemon suggested an 'indent' option
where users could choose what string prefix to use as spaces/tabs,
but I feel this would increase the initial scope of the RFC too much.
For a new developer, they may not have those features or plugins installed in their IDE,
or may not be aware of the existence of the shortcut/command to invoke those features on a range.
(phpcbf, or for reindenting Ctrl+space in eclipse, = in vim, etc.)
Additionally, scripts to reindent may not remove the 0 =>, 1 =>, etc,
either not containing a php parser or assuming the user deliberately added those keys.
It would be much easier to reindent the output of var_representation than to rewrite+reindent the output of var_export.
Regards,
Tyson
Even if a developer such as yourself doesn't need to generate code and doesn't expect to directly use it,
some of the applications and libraries they do use every day would need to generate human and machine readable code.
Sure, I believe you that there are use cases for a function that exports
PHP code. I remain unconvinced that there is a need to have two
functions for that purpose, which differ only in a few cosmetic details.
php-src phpt tests are for tests of php itself, which puts strict and atypical limitations on the test framework.
php-src's phpt test framework may represent needs of php-src and some pecl maintainers, not userland.
I'm not sure what point you're trying to make here. You were the one
that mentioned php-src tests as a use case, so I looked to see what they
currently use; the answer is overwhelmingly var_dump.
Still, I would prefer using var_representation over var_dump in phpt for many use cases, especially with VAR_REPRESENTATION_SINGLE_LINE available.
In the previous thread, you agreed with Sara that var_dump should be
"left out of this". Are you now saying that the new function should
replace some uses of var_dump?
If that is an aim, should we look at what other differences there are
between var_dump and var_export, and how we can make this new function
cover more use cases?
For a new developer, they may not have those features or plugins installed in their IDE,
or may not be aware of the existence of the shortcut/command to invoke those features on a range.
I contend that a new developer would very rarely have any need to use
this functionality at all.
Additionally, scripts to reindent may not remove the 0 =>, 1 =>, etc,
either not containing a php parser or assuming the user deliberately added those keys.
And maybe they'd be right not to - that's an unavoidable problem with
any formatter, it won't be universal.
And that is why I'm sceptical of this function: it seems to be mostly a
different set of arbitrary formatting decisions, which some people will
prefer and some won't.
The only part that feels like fundamental value is escaping control
characters, which could be added to var_dump and/or var_export and only
affect that minority of a minority of usages which are relying on the
exact output in cases where control characters are present.
Regards,
--
Rowan Tommins
[IMSoP]
Hi Rowan Tommins,
php-src phpt tests are for tests of php itself, which puts strict and atypical limitations on the test framework.
php-src's phpt test framework may represent needs of php-src and some pecl maintainers, not userland.
I'm not sure what point you're trying to make here. You were the one
that mentioned php-src tests as a use case, so I looked to see what they
currently use; the answer is overwhelmingly var_dump.
I mentioned php-src test cases as a use case for something that fixes the problems with var_export
and would make it possible to write shorter phpt tests (input+output) that continue to unambiguously represent output.
Still, I would prefer using var_representation over var_dump in phpt for many use cases, especially with VAR_REPRESENTATION_SINGLE_LINE available.
In the previous thread, you agreed with Sara that var_dump should be
"left out of this". Are you now saying that the new function should
replace some uses of var_dump?
I meant that changing the output format of print_r and var_dump should be "left out of
this" RFC discussion.
Yes, I would agree that new or refactored phpt test cases could start using var_representation instead of var_dump
if this was added to the language and authors preferred it.
If that is an aim, should we look at what other differences there are
between var_dump and var_export, and how we can make this new function
cover more use cases?
The use case of "generate a (short, readable, escaped) representation of a variable that can be evaluated"
is something I'd consider useful enough on its own, a useful primitive in a modern programming language,
and it does not need to cover every use case.
I don't plan to change the direction of this RFC.
The main features I remember unique to var_dump are as follows (I don't plan to add any of those):
- var_dump includes object ids, this is useful for telling that objects are different - if you need to know if objects are equivalent use var_dump or debug_zval_dump or serialize.
- String lengths - this is less of a concern with ascii control characters escaped in var_representation.
- Prefixing scalars with types - only useful for tutorials introducing the type system or debugging
- References and recursion - I don't think there's much demand for a predictable/readable/efficient way to represent that in a readable string. unserialize() covers that.
For a new developer, they may not have those features or plugins installed in their IDE,
or may not be aware of the existence of the shortcut/command to invoke those features on a range.
I contend that a new developer would very rarely have any need to use
this functionality at all.
If a php developer is learning the language through php's official manual, official language reference, or through an online tutorial,
that would have sections showing the different scalar types, examples involving objects and/or arrays, etc.
Those use var_export/var_dump
https://www.php.net/manual/en/functions.anonymous.php
https://www.php.net/manual/en/language.types.string.php
https://www.php.net/manual/en/language.types.boolean.php
https://www.php.net/manual/en/language.types.object.php
<?php
var_dump((bool) ""); // bool(false)
var_dump((bool) 1); // bool(true)
var_dump((bool) -2); // bool(true)
<?php
$obj = (object) array('1' => 'foo');
var_dump(isset($obj->{'1'})); // outputs 'bool(true)' as of PHP 7.2.0; 'bool(false)' previously
var_dump(key($obj)); // outputs 'string(1) "1"' as of PHP 7.2.0; 'int(1)' previously
Additionally, scripts to reindent may not remove the 0 =>, 1 =>, etc,
either not containing a php parser or assuming the user deliberately added those keys.
And maybe they'd be right not to - that's an unavoidable problem with
any formatter, it won't be universal.
And that is why I'm sceptical of this function: it seems to be mostly a
different set of arbitrary formatting decisions, which some people will
prefer and some won't.
And people who don't prefer that formatting are free to continue using var_export in their projects.
If people may not prefer that formatting, that would be a reason for introducing a new function instead of modifying var_export.
Imagining that var_export was never added to the language with those arbitrary formatting decisions
in PHP 4.2.0 and an RFC was being discussed to add the first built-in way to convert a string to a machine-readable language in PHP 8.1:
Would we really choose the indentation var_export used?
Would we advocate putting values on different lines from keys being aware of common coding standards?
Would we use array ( with a space instead of [?
The only part that feels like fundamental value is escaping control
characters, which could be added to var_dump and/or var_export and only
affect that minority of a minority of usages which are relying on the
exact output in cases where control characters are present.
Minor changes to var_export and var_dump have been proposed in the past
and met with some hesitance or opposition to any output format changes,
e.g. the question of whether changing the default output format is
worth affecting existing libraries/applications/tests.
https://externals.io/message/106674#106684
https://externals.io/message/101883
I would be in favor of adding string escaping to var_dump,
but that is a discussion that should be "left out of this" RFC
as var_dump has a different purpose.
Regards,
Tyson
If a php developer is learning the language through php's official manual, official language reference, or through an online tutorial,
that would have sections showing the different scalar types, examples involving objects and/or arrays, etc.
Those use var_export/var_dump
https://www.php.net/manual/en/functions.anonymous.php
https://www.php.net/manual/en/language.types.string.php
https://www.php.net/manual/en/language.types.boolean.php
https://www.php.net/manual/en/language.types.object.php
Every single one of those uses var_dump, and I see no reason why they
should use anything different.
The use case of "generate a (short, readable, escaped) representation of a variable that can be evaluated"
is something I'd consider useful enough on its own, a useful primitive in a modern programming language
I guess I just don't agree. I think the use case of "generate a
representation that can be evaluated" is adequately covered by
var_export, the "short" is mostly irrelevant, and the "readable" mostly
opinion-based. The "escaped" sounds useful, but more as an enhancement
to the current function (with a flag, if compatibility is really an
issue) than a ground-up re-design.
Imagining that var_export was never added to the language with those arbitrary formatting decisions
in PHP 4.2.0 and an RFC was being discussed to add the first built-in way to convert a string to a machine-readable language in PHP 8.1:
Would we really choose the indentation var_export used?
Would we advocate putting values on different lines from keys being aware of common coding standards?
Would we use array ( with a space instead of [?
No, I agree with all of those things, if PHP had no such function right now.
But it does, and the changes don't seem significant enough to me to
bother duplicating it.
I would be in favor of adding string escaping to var_dump,
but that is a discussion that should be "left out of this" RFC
as var_dump has a different purpose.
Sorry, I'm really confused here. On the one hand, most of your examples
are of things that currently use var_dump; on the other hand, you insist
that improving var_dump is a completely unrelated topic. Surely if
var_dump was improved, that would influence the choice of function in
those use cases?
Regards,
--
Rowan Tommins
[IMSoP]
Hi Rowan,
If a php developer is learning the language through php's official manual, official language reference, or through an online tutorial,
that would have sections showing the different scalar types, examples involving objects and/or arrays, etc.
Those use var_export/var_dump
https://www.php.net/manual/en/functions.anonymous.php
https://www.php.net/manual/en/language.types.string.php
https://www.php.net/manual/en/language.types.boolean.php
https://www.php.net/manual/en/language.types.object.php
Every single one of those uses var_dump, and I see no reason why they
should use anything different.
Sorry, I misread your comment as being about the family of debug output functions as a whole.
I'd agree var_dump is the most appropriate for these.
If the manual depended on printf or var_export(); echo "\n"; then a developer new to programming
might have to read the printf documentation or echo documentation or concatenation documentation first to better understand the examples.
If php were to add a puts() or println() function that would automatically append a newline
then that might also work better in tutorials longer-term, e.g. puts(var_representation($value)).
The fact that php was originally for "personal home pages"
may explain why this isn't currently part of the standard library,
since newlines are mostly treated like whitespace in most html elements except <textarea>, etc.
The use case of "generate a (short, readable, escaped) representation of a variable that can be evaluated"
is something I'd consider useful enough on its own, a useful primitive in a modern programming language
I guess I just don't agree. I think the use case of "generate a
representation that can be evaluated" is adequately covered by
var_export, the "short" is mostly irrelevant, and the "readable" mostly
opinion-based. The "escaped" sounds useful, but more as an enhancement
to the current function (with a flag, if compatibility is really an
issue) than a ground-up re-design.
I had considered that option, but had expected more pushback because of https://github.com/php/php-src/pull/6619#issuecomment-765809096
and the discussion on the prior RFC email threads on improving var_export.
- Passing too many arguments to var_export would cause an ArgumentCountError or warning before php 8.1 if flags were added to var_export
- If that approach was used, I'm concerned PHP may indefinitely keep the default behavior of array (, NULL, etc. in var_export. (changing the default behaviour of var_export significantly may discourage some users from upgrading php due to libraries/applications that depend on it)
- It is practical to polyfill var_representation for php 8.0 and older and allow applications to use it immediately after 8.1 is released, but less so for var_export.
I would be in favor of adding string escaping to var_dump,
but that is a discussion that should be "left out of this" RFC
as var_dump has a different purpose.
Sorry, I'm really confused here. On the one hand, most of your examples
are of things that currently use var_dump; on the other hand, you insist
that improving var_dump is a completely unrelated topic. Surely if
var_dump was improved, that would influence the choice of function in
those use cases?
You have proposed no mechanism other than string escaping by which var_dump could or should be improved.
var_dump accepts a variadic list of values, so $flags or $options can't be added to it even if you wanted opt-in enhancements like single line representation.
As I said earlier, var_dump always prints to stdout, which may get mixed with error handler output or other output if ob_start() is used.
I don't expect var_dump to be changed in a way that would make the output possible to eval();
eval()able output representation is what I'm working on.
Regards,
Tyson
I'm concerned PHP may indefinitely keep the default behavior of array (, NULL, etc. in var_export.
I agree that that's likely, but I don't find it concerning, unless
"array()" is deprecated or "null" becomes case sensitive.
You have proposed no mechanism other than string escaping by which var_dump could or should be improved.
Indeed, because out of all the improvements proposed in this thread,
that's the only one I consider to be anything more than picking a new
colour for the bikeshed.
It is also the one which I suspect would have least impact if it was
simply added to both var_dump() and var_export(), because the
intersection of "relying on exact output" and "processing strings
containing control characters" is probably a rather small one, precisely
because the current output doesn't represent them well.
Regards,
--
Rowan Tommins
[IMSoP]
As far as I know I have never had any reason to generate code and then eval() it, and can't think of a situation where I ever would. If I wanted a machine-readable output from a variable, I would use serialize() or json_encode().
That's not to say that there aren't cases where those requirements do happen, but I think it is a very niche use case to dedicate two different built-in functions to.
There's an interesting edge case where this does become useful. If an evaluable representation of a data structure is written to a file (wrapped with <?php return $x;) and subsequently loaded using require(), the file can be stored in opcache. This is dramatically faster than unserialize() or json_decode(), especially for large structures.
What'd be particularly useful for this use case would be a function which behaved similarly to var_export(), but which didn't pretty-print its output, so as to yield a smaller file.
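Editorial illustration of the pattern described above: a minimal sketch using var_export(). The $config data and the temporary file path are invented for the example, and whether the file is actually served from opcache depends on the server configuration.

```php
<?php
// Sample data (invented for this sketch).
$config = ['debug' => false, 'hosts' => ['a.example', 'b.example']];

// A throwaway path for the demo; a real cache would use a stable location.
$path = sys_get_temp_dir() . '/cfg_' . uniqid() . '.php';

// Write the structure as a PHP file that returns it.
file_put_contents($path, '<?php return ' . var_export($config, true) . ';');

// Loading is a plain require; with opcache enabled, later requests can
// reuse the compiled file instead of re-parsing or unserializing it.
$loaded = require $path;
var_dump($loaded === $config);

unlink($path);
```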
Hi Dusk,
There's an interesting edge case where this does become useful. If an evaluable representation of a data structure is written to a file (wrapped with <?php return $x;) and subsequently loaded using require(), the file can be stored in opcache. This is dramatically faster than unserialize() or json_decode(), especially for large structures.
What'd be particularly useful for this use case would be a function which behaved similarly to var_export(), but which didn't pretty-print its output, so as to yield a smaller file.
The larger benefit is readability and that you could reindent the output or change the line endings in the output (automatically or manually through your IDE) without worrying about accidentally changing the representation.
// or other classes
const SER = array (
0 => 'O:8:"stdClass":1:{s:7:"headers";s:20:"Content-Length: 10
";}',
);
The size on disk would be smaller and I expect the encoding and saving a file to disk could be a tiny bit faster,
but the size in opcache would be unchanged because the resulting array is the same.
For the question of size, you'd still benefit in cases where representation size (e.g. megabytes of large/nested configuration or generated data) is a concern
and may or may not get a small speedup when loading the file for the first time after clearing opcache (e.g. server restart).
(haven't benchmarked it)
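Editorial illustration of the line-ending concern above: a small sketch showing that var_export() leaves \r\n unescaped, so an editor that normalizes line endings silently changes the exported value. The str_replace() call stands in for what such an editor might do; that is an assumption of the sketch, not a claim about any particular IDE.

```php
<?php
$value = ["Content-Length: 10\r\n"];
$code = var_export($value, true);

// The CR and LF bytes appear unescaped inside the quoted string.
var_dump(str_contains($code, "\r\n"));

// Simulate an editor converting Windows line endings to Unix ones.
$normalized = str_replace("\r\n", "\n", $code);

// The edited code still parses, but the value is no longer the same.
$roundTripped = eval('return ' . $normalized . ';');
var_dump($roundTripped === $value);
var_dump(strlen($value[0]), strlen($roundTripped[0]));
```

The original string is 20 bytes long and the round-tripped one is 19, because the carriage return was dropped without any visible change in most editors.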
Cheers,
- Tyson
Hi internals,
I've created https://wiki.php.net/rfc/readable_var_representation based on
my original proposal in https://externals.io/message/112924
This RFC proposes adding a new function
var_representation(mixed $value, int $flags=0): string
with the following differences from var_export:
- var_representation() unconditionally returns a string
- Use null instead of NULL - lowercase is recommended by more coding guidelines (https://www.php-fig.org/psr/psr-2/).
- Change the way indentation is done for arrays/objects. See ext/standard/tests/general_functions/short_var_export1.phpt (e.g. always add 2 spaces, never 3 in objects, and put the array start on the same line as the key)
- Render lists as "['item1']" rather than "array(\n 0 => 'item1',\n)". Always render empty lists on a single line, render multiline by default when there are 1 or more elements
- Prepend \ to class names so that generated code snippets can be used in namespaces without any issues.
- Support VAR_REPRESENTATION_SINGLE_LINE in $flags. This will use a single-line representation for arrays/objects, though strings with embedded newlines will still cause newlines in the output.
- If a string contains control characters ("\x00"-"\x1f" and "\x7f" (backspace)), then represent the entire string as a double quoted string escaping \r, \n, \t, \$, \\, and \", in addition to escaping remaining control characters with hexadecimal encoding (\x00, \x7f, etc)
This is different from my original proposal in two ways:
- The function signature and name changed from my previous proposal. It now always returns a string.
- Backspace control characters (\x7f) are now also escaped.
A reminder that voting on the var_representation RFC starts in a day.
This RFC proposes adding a new function var_representation(mixed $value, int $flags=0): string
with multiple improvements on var_export().
Any other feedback?
Thanks,
- Tyson
On Thu, Feb 4, 2021 at 3:36 PM tyson andre tysonandre775@hotmail.com
wrote:
Hi internals,
I've created https://wiki.php.net/rfc/readable_var_representation based on
my original proposal in https://externals.io/message/112924
[...]
A reminder that voting on the var_representation RFC starts in a day.
This RFC proposes adding a new function var_representation(mixed $value, int $flags=0): string
with multiple improvements on var_export().
Any other feedback?
Hi,
I think the "though strings with embedded newlines will still cause
newlines in the output" part is obsolete (since \r and \n are escaped now).
Apart from that, since var_export (and var_dump) can't really be "fixed"
for BC reasons, I'm +1 for the new function.
Thanks
--
Guilliam Xavier
I think the "though strings with embedded newlines will still cause newlines in the output" part is obsolete (since \r and \n are escaped now).
Apart from that, since var_export (and var_dump) can't really be "fixed" for BC reasons, I'm +1 for the new function.
Thanks, that's indeed obsolete - I removed it from the RFC.
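To illustrate why escaping \r and \n removes raw newlines from the output, here is a userland approximation of the string rule from the RFC announcement. It is a sketch only, not the RFC's implementation, and `escape_string_repr()` is a name made up for this example.

```php
<?php
// Hypothetical helper approximating the announced rule: strings containing
// control characters become double-quoted literals with escapes; all other
// strings keep var_export()'s single-quoted form.
function escape_string_repr(string $s): string
{
    if (preg_match('/[\x00-\x1f\x7f]/', $s) !== 1) {
        return var_export($s, true);
    }
    // Characters with a dedicated escape sequence inside double quotes.
    $named = ["\r" => '\r', "\n" => '\n', "\t" => '\t', '$' => '\$', '\\' => '\\\\', '"' => '\"'];
    $out = '';
    foreach (str_split($s) as $byte) {
        $code = ord($byte);
        if (isset($named[$byte])) {
            $out .= $named[$byte];
        } elseif ($code < 0x20 || $code === 0x7f) {
            // Remaining control characters use two-digit hexadecimal escapes.
            $out .= sprintf('\x%02x', $code);
        } else {
            $out .= $byte;
        }
    }
    return '"' . $out . '"';
}

$value = "line1\r\nline2\t\$price \\ \"quoted\" \x00\x7f";
$repr = escape_string_repr($value);
$evaluated = eval('return ' . $repr . ';');
```

The resulting literal stays on one line and evaluates back to the original string.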
-Tyson
On Thu, Feb 4, 2021 at 3:36 PM tyson andre tysonandre775@hotmail.com
wrote:
Hi internals,
I've created https://wiki.php.net/rfc/readable_var_representation based on
my original proposal in https://externals.io/message/112924
[...]
A reminder that voting on the var_representation RFC starts in a day.
This RFC proposes adding a new function var_representation(mixed $value, int $flags=0): string
with multiple improvements on var_export().
Any other feedback?
Thanks,
- Tyson
Given the recent discussion in the interactive shell thread, I think you
should consider whether the new function could also be expanded to serve
that use case. I think that if we're going to add one more dumping function
to the 4 we already have, it better cover all the use-cases we have. The
"limited size dump" doesn't really fit in with "dump is executable PHP
code", but if I understand correctly, executable PHP code is not the whole
goal of the proposal.
Regards,
Nikita
Hi internals,
I've created https://wiki.php.net/rfc/readable_var_representation based on
my original proposal in https://externals.io/message/112924
[...]
A reminder that voting on the var_representation RFC starts in a day.
This RFC proposes adding a new function var_representation(mixed $value, int $flags=0): string
with multiple improvements on var_export().
Any other feedback?
Given the recent discussion in the interactive shell thread,
I think you should consider whether the new function could also be expanded to serve that use case.
I think that if we're going to add one more dumping function to the 4 we already have,
it better cover all the use-cases we have.
The "limited size dump" doesn't really fit in with "dump is executable PHP code",
but if I understand correctly, executable PHP code is not the whole goal of the proposal.
I suppose that technically could be done by adding a VAR_REPRESENTATION_DEBUG_DUMP flag to a $flags bitmask to generate var_dump output,
and allow that to be combined with VAR_REPRESENTATION_SINGLE_LINE and other style flags.
I should at least mention it as an option - that possibly combines unrelated functionality (Debug vs evaluable code) in flags,
but at least it cuts down on the number of different functions.
I don't plan to include that in this RFC.
I have considered it but think that readable executable PHP code and debug representations are largely incompatible -
having a function to generate executable PHP code that generates a truncated representation of a value doesn't seem as useful.
If you're generating code to eval() - you're usually generating all of it.
For example, for this
php > $x = (object)[]; $v = [$x, $x]; echo var_representation($v);
[
  (object) [],
  (object) [],
]
In an application where object identity or the use of references didn't matter (e.g. the value is read but not modified), that might be the best representation.
A hypothetical function could emit '(static function () { $t1 = (object)[]; return [$t1, $t1]; })();'
if it detected object duplication or references (and so on),
and likely be faster at generating output than a userland implementation such as https://github.com/brick/varexporter
(and avoid limitations of ReflectionReference only being able to check individual pairs at a time),
but I'm concerned that there's not much interest in that, and the response would be "this should go in a PECL instead"
due to the complexity of an implementation, edge cases that unserialize solves but not the hypothetical function, and due to the ongoing requirement to maintain it.
- e.g. some internal classes forbid newInstanceWithoutConstructor()
- If an application needed all of the functionality of serialize, it's already possible to generate a call to unserialize instead.
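The identity difference described above can be observed today with var_export() and serialize(). This sketch assumes PHP 7.3 or later, where var_export() emits stdClass objects as `(object) array(...)`; the closure is the hand-written form from this message, not the output of any existing function.

```php
<?php
$x = (object)[];
$v = [$x, $x]; // both elements are the same object

// Evaluating var_export() output creates two separate objects.
$copy = eval('return ' . var_export($v, true) . ';');

// serialize()/unserialize() keep track of the shared object.
$restored = unserialize(serialize($v));

// The closure form a hypothetical exporter could emit to keep identity.
$viaClosure = (static function () { $t1 = (object)[]; return [$t1, $t1]; })();
```

Here `$copy[0]` and `$copy[1]` are distinct instances, while `$restored` and `$viaClosure` each hold the same instance twice.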
I feel like trying to do everything at once would increase the scope to the point
where the RFC and implementation would be hard to implement and review.
With only a few people responding so far, and with mixed feedback, I think it's too early to do that.
There are other orthogonal improvements that could be made,
e.g. internal objects don't implement __set_state
(ArrayObject::__set_state, SplObjectStorage::__set_state, GMP::__set_state, etc. currently do not exist);
alternatively, there could be a way to specify a constructor that classes support instead of dumping __set_state.
Thanks,
Tyson