I am following the RFC guideline for the first time
(https://wiki.php.net/rfc/howto).
As suggested there, I am here to get a feeling from you regarding the
following RFC for PHP.
Change (draft):
New function in PHP, called:
is_json(string $string): bool
Description
Parameters
string $string -> the string to check for JSON validity
Return
Returns a bool. The function determines whether the passed string
is valid JSON (true) or not (false).
Why this function?
At the moment, the only way to determine whether a JSON string is valid is
to execute the json_decode() function.
The drawback is that json_decode() builds an object/array in memory
(depending on parameters) while parsing the string; this leads
to memory usage that is not needed (because we use memory for creating
the object/array) and can also cause an error by reaching the memory limit
of the PHP process.
Sometimes we just need to know whether the string is valid JSON or not, and
nothing else.
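For reference, the json_decode()-based check described above is usually written along these lines. This is only a sketch: the helper name is made up for illustration and is not the proposed is_json().

```php
<?php
// Hypothetical userland helper (illustrative name, not part of the proposal).
// It validates by fully decoding, so the decoded array/object is built in
// memory and then thrown away.
function json_is_valid_via_decode(string $json): bool
{
    json_decode($json);
    // json_decode() returns null both for the literal "null" and on failure,
    // so the error state is what separates valid input from invalid input.
    return json_last_error() === JSON_ERROR_NONE;
}

var_dump(json_is_valid_via_decode('{"foo": "bar"}')); // bool(true)
var_dump(json_is_valid_via_decode('{"foo": '));       // bool(false)
```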
Do we need something like this? "If I check that a string is valid JSON,
then surely I will have to use it in my code as either an object or an
array."
Well, that is not true. There are plenty of cases where you just need to
check whether a string is valid JSON, and that is it. Just looking at
Stack Overflow will give you an idea of how many people are looking for
an efficient way to do this.
Who would develop the RFC?
I would develop the RFC, with the help of the community if needed. I already
have a first version of the is_json() function tested and ready for review.
Thanks in advance, I am looking forward to hearing your opinion on this.
Kind Regards,
Juan Carlos Morales.
Hi Juan,
Personally I'd vote NO.
Cheers,
Michał Marcin Brzuchalski
I was about to say NO, but after reading your argument completely, the idea
makes sense.
Initially I thought about using json_decode() with error capture, but the
idea that it would overload memory makes perfect sense, compared to a
simple structure analysis, if that is indeed the user's intention. The
performance would also be absurdly better.
What worries me above all is misuse of the function, like checking if
is_json() === true and calling json_decode() right after. However, I believe
this can be easily optimized by the engine itself.
My vote is YES.
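The double-parse pattern mentioned above (validate, then decode the same string) can usually be avoided in userland when the decoded value is needed anyway. A minimal sketch, assuming PHP 7.3 or later for JSON_THROW_ON_ERROR; the helper name is hypothetical:

```php
<?php
// Sketch only: decode_or_null() is a hypothetical helper name.
// Parsing once and catching the failure avoids walking the string twice.
function decode_or_null(string $json): ?array
{
    try {
        $data = json_decode($json, true, 512, JSON_THROW_ON_ERROR);
    } catch (JsonException $e) {
        return null; // invalid JSON
    }
    // Scalars such as "42" are valid JSON too; this sketch only accepts
    // documents that decode to an array.
    return is_array($data) ? $data : null;
}

var_dump(decode_or_null('{"a": 1}') === ['a' => 1]); // bool(true)
var_dump(decode_or_null('{"a": '));                  // NULL
```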
Regards,
David Rodrigues
Thanks for the feedback.
*** To: Michał Marcin Brzuchalski
Thanks for the opinion, but it would be useful for me to know if there is a
reason behind the "no" that could help me improve the proposal.
*** To: David Rodrigues
Thanks. Regarding "performance", I did some benchmarks. I tested is_json()
vs json_decode() with a JSON string with 1 million key-value pairs; is_json()
was 100% faster than json_decode(), plus memory usage with is_json() was flat!
Yes, flat. memory_get_usage() before and after is_json() returned the
same value; with json_decode() I had to change the memory limit setting,
otherwise memory reached the maximum allowed before parsing finished.
Regarding use cases, I can say that this idea came up because, in my current
company, our product heavily uses json_decode() to know whether a string is
valid JSON, which compromises memory usage once in a while, and
performance decreases; that is the reason for this RFC idea. The good thing
is that I already have something for this.
Also, honestly, take a look at Stack Overflow and check how many people
need this.
Unit tests could heavily benefit from such functionality.
A quick look on GitHub for PHP code with a function called "is_json()"
exposes tons of projects where developers are writing their own
"is_json()" function, relying on json_decode() to make it work.
Etc.
:)
Stay in contact, David, and once again, thanks!
On Fri, Jul 29, 2022 at 7:27 AM juan carlos morales <
dev.juan.morales@gmail.com> wrote:
> Why this function?
> At the moment, the only way to determine whether a JSON string is valid is
> to execute the json_decode() function.
> The drawback is that json_decode() builds an object/array in memory
> (depending on parameters) while parsing the string; this leads
> to memory usage that is not needed (because we use memory for creating
> the object/array) and can also cause an error by reaching the memory limit
> of the PHP process.
> Sometimes we just need to know whether the string is valid JSON or not, and
> nothing else.
You say that you have a first-pass at the implementation done. I'd be
curious to see it. My initial thought was that in order to validate the
string, you likely need to allocate extra memory as part of the validation
that depends on the string size. You'd definitely save the overhead of a
ZVAL, but for such an object that overhead is likely negligible.
So I guess my question would be: in the actual implementation that lands,
how much memory would this actually save compared to json_decode()? This
seems like it would make the RFC tricky, as the purported benefit of the
RFC depends very tightly on the implementation that lands.
Jordan
Just want to clarify that when I mentioned memory usage, I was referring to
the function "memory_get_usage()", which basically gives us the memory
handled by PHP that is related to the memory_limit INI setting.
Now I will provide a quick benchmark I have done:
Code used (I already have the implementation of is_json() done)
<?php
// Make sure you set your memory limit to -1 before running this code.
// Here we create a very, very big JSON string.
$limit = 1000000;
$jsonString = '{ "test": { "foo": "bar" },';
for ($i = 0; $i < $limit; $i++) {
    // Inner double quotes are escaped so that $i is still interpolated.
    $jsonString .= " \"test$i\": { \"foo\": { \"test\" : { \"foo\" : { \"test\" : { \"foo\" : \"bar\" }}}}},";
}
$jsonString .= ' "testXXXX": { "foo": "replaceme" } }';
//{ "test" : { "foo" : "bar" }}}
$memoryBefore = memory_get_usage(true) / 1024 / 1024;
echo "Megas used before call: " . $memoryBefore . PHP_EOL;
$start = microtime(true);
// Un/comment one of the next two lines to show results for json_decode() or is_json().
json_decode($jsonString, null, $limit, 0);
//is_json($jsonString);
$memoryAfter = memory_get_usage(true) / 1024 / 1024;
echo "Megas used after call: " . $memoryAfter . PHP_EOL;
echo "Difference: " . ($memoryAfter - $memoryBefore) . PHP_EOL;
echo "Time: " . (microtime(true) - $start) . " seconds" . PHP_EOL;
return;
Results
json_decode()
Megas used before call: 79.23828125
Megas used after call: 3269.23828125
Difference: 3190
Time: 12.091101884842 seconds
is_json()
Megas used before call: 79.23828125
Megas used after call: 79.23828125
Difference: 0
Time: 5.4537169933319 seconds
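Taking the timings and memory figures above at face value, the relative difference can be worked out as follows. This is a small sketch; the variable names are made up for illustration, and the numbers are simply copied from the output above.

```php
<?php
// Figures copied from the benchmark output above.
$decodeSeconds   = 12.091101884842;
$validateSeconds = 5.4537169933319;
// Memory growth reported for json_decode(), in megabytes.
$decodeMemoryMb  = 3269.23828125 - 79.23828125;

// How many times longer json_decode() took than is_json() in this run.
$speedup = $decodeSeconds / $validateSeconds;
// Share of the json_decode() wall time that is_json() saved.
$timeSavedPercent = (1 - $validateSeconds / $decodeSeconds) * 100;

printf("Speedup: %.2fx\n", $speedup);
printf("Time saved: %.1f%%\n", $timeSavedPercent);
printf("Memory growth avoided: %.0f MB\n", $decodeMemoryMb);
```

For this single run that works out to roughly 2.2 times faster, or about 55% less wall time.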
And yes, I am open to sharing the implementation, but after I write the RFC.
Thanks for taking the time to give me feedback.
> $memoryAfter = memory_get_usage(true) / 1024 / 1024;
I see that you used memory_get_usage, which shows memory usage at the time
of the function call.
As far as I understand, your function does not return any value,
so I suspect it is only natural that the memory usage after the function
call is the same as before.
But what is the actual memory usage during the function call?
Can you run the same benchmark but with the memory_get_peak_usage function to
see how your function uses memory?
$memoryAfter = memory_get_peak_usage(true) / 1024 / 1024;
Also, I wonder whether it would be better to name the function is_json_valid?
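The difference between the two measurements can be seen with a small, self-contained sketch. The function below is a hypothetical stand-in, not is_json(): it needs a large temporary buffer while it runs but holds nothing once it returns.

```php
<?php
// Sketch: a function that needs a large temporary buffer but returns nothing.
function allocate_and_release(): void
{
    $tmp = str_repeat('x', 8 * 1024 * 1024); // roughly 8 MiB, freed before return
    unset($tmp);
}

$before = memory_get_usage();
allocate_and_release();
$after = memory_get_usage();
$peak  = memory_get_peak_usage();

// Usage before and after is nearly identical, while the peak still records
// the temporary buffer.
printf("Usage delta after call: %d bytes\n", $after - $before);
printf("Peak above starting usage: %d bytes\n", $peak - $before);
```

A flat before/after reading therefore says little about what happened during the call; the peak figure is the one that relates to hitting memory_limit.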
On Sat, 30 Jul 2022 at 11:58, Oleksii Bulba (oleksii.bulba@gmail.com) wrote:
> > $memoryAfter = memory_get_usage(true) / 1024 / 1024;
> I see that you used memory_get_usage that shows memory usage at the time
> of the function call.
> As far as I understand, your function does not return any value,
> so I suspect that it is obvious that the memory usage after the function
> call is the same as before.
> But what is the actual memory usage during the function call?
> Can you run the same benchmark but with memory_get_peak_usage function
> to see how your function uses memory?
> $memoryAfter = memory_get_peak_usage(true) / 1024 / 1024;
> Also, I'm concerned if it would be better to name the function
> is_json_valid?
Hello Jordan, thanks for the feedback.
I think the benchmark speaks for itself (also regarding the memory-saving
question), as does the fact that in order to run it with json_decode() the
memory limit needs to be super high or -1 (no limit; not a good idea in
production, right?).
The advantage here is being able to parse huge strings without reaching the
memory limit set in the INI settings.
I take it as an "IF THIS IS AS GOOD AS IT SEEMS THEN YES" :D
Once again... Thanks
> is_json(string $string): bool
json_decode() has a $depth argument; I think is_json() probably also
should. And I'm not sure about the JSON_INVALID_UTF8_IGNORE flag.
My point is that if you use these with json_decode(), you might also need
to use them with is_json().
On the other hand, maybe the function should be as simple as possible,
but I'd like it to be considered.
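To illustrate why these two arguments matter for a validator, here is a sketch that uses json_decode() as a stand-in. The helper name and signature are hypothetical and only show how $depth and a flag such as JSON_INVALID_UTF8_IGNORE change the verdict for the same input.

```php
<?php
// Hypothetical stand-in for a validator that accepts the same $depth and
// $flags as json_decode(). It still decodes internally; it only illustrates
// how those arguments change the result.
function json_is_valid_with_options(string $json, int $depth = 512, int $flags = 0): bool
{
    // Strip JSON_THROW_ON_ERROR so the result is always a bool.
    json_decode($json, true, $depth, $flags & ~JSON_THROW_ON_ERROR);
    return json_last_error() === JSON_ERROR_NONE;
}

$nested  = '[[1]]';          // two levels of nesting
$badUtf8 = "\"a\xB1b\"";     // JSON string containing an invalid UTF-8 byte

var_dump(json_is_valid_with_options($nested, 2)); // bool(true)
var_dump(json_is_valid_with_options($nested, 1)); // bool(false): depth exceeded
var_dump(json_is_valid_with_options($badUtf8));   // bool(false): malformed UTF-8
var_dump(json_is_valid_with_options($badUtf8, 512, JSON_INVALID_UTF8_IGNORE)); // bool(true)
```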
--
Aleksander Machniak
Kolab Groupware Developer [https://kolab.org]
Roundcube Webmail Developer [https://roundcube.net]
PGP: 19359DC1 # Blog: https://kolabian.wordpress.com
> json_decode() has $depth argument, I think is_json() probably also should. And I'm not sure about JSON_INVALID_UTF8_IGNORE flag.
> My point is that if you use these with json_decode() you might also need to use these with is_json()
How about if validation were a flag to json_decode() -- perhaps JSON_VALIDATE_ONLY -- which made it return a placeholder value like true instead of the decoded structure? (It can't return null on success; that's used for errors if JSON_THROW_ON_ERROR isn't set.)
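The point about null can be checked with existing behaviour alone. A short sketch using only json_decode() and json_last_error() (the variable names are illustrative):

```php
<?php
// json_decode() returns null for the valid document "null" and also for a
// syntax error, so the return value alone cannot signal success.
$validNull      = json_decode('null');
$validNullError = json_last_error();

$broken      = json_decode('{"a": ');
$brokenError = json_last_error();

var_dump($validNull === $broken);              // bool(true): both are null
var_dump($validNullError === JSON_ERROR_NONE); // bool(true)
var_dump($brokenError === JSON_ERROR_SYNTAX);  // bool(true)
```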
On Sat, 30 Jul 2022 at 8:30, Dusk (dusk@woofle.net) wrote:
> > json_decode() has $depth argument, I think is_json() probably also
> > should. And I'm not sure about JSON_INVALID_UTF8_IGNORE flag.
> > My point is that if you use these with json_decode() you might also need
> > to use these with is_json()
> How about if validation were a flag to json_decode() -- perhaps
> JSON_VALIDATE_ONLY -- which made it return a placeholder value like true
> instead of the decoded structure? (It can't return null on success; that's
> used for errors if JSON_THROW_ON_ERROR isn't set.)
I thought about that too, and yes, it can also be a valid approach that would
not interfere with my development.
What I personally don't like about that approach is that a developer is able
to specify flags that relate exclusively to result generation,
and mixing them with JSON_VALIDATE_ONLY would not make sense. For
example:
json_decode($jsonStringHere, false, 512, JSON_BIGINT_AS_STRING | JSON_INVALID_UTF8_SUBSTITUTE | JSON_OBJECT_AS_ARRAY | JSON_THROW_ON_ERROR | JSON_VALIDATE_ONLY);
So the end result is not clear from reading the line itself; I mean, the
mechanism (having another constant in json_decode()) is not enough to
make the function self-expressive.
Could you please give some specific examples where the proposed
functionality would be useful?
Regards,
Nikita
In the last 15 years, the only time I've ever needed to know if a string is
valid JSON is if I'm about to decode or otherwise parse it as JSON. If I'm
decoding what I expect to be a large JSON blob, such that memory usage
might be a concern, personally I use
https://github.com/salsify/jsonstreamingparser but the point is userland
solutions are possible.
What I'm asking is what's the practical use for this proposed function?
Where are you likely to need to know if a string is valid JSON but not have
to (try to, with error handling) parse it almost immediately afterwards
anyway? Unless there is some fairly commonplace use case for this I'm not
thinking of, you're going to be using that extra memory, or using a
streaming parser, at some point in your script regardless. If there is
genuine demand for it, I'd be in favour (I'm not a voting member so kind of
moot but...), otherwise I'm generally against introducing new core
functions which are either edge-case or can be perfectly well dealt with
via userland code.
Cheers
-Dave
Thanks, everybody, for the feedback.
I think I have reached the point where I need to write another e-mail with
all my arguments and use cases for this.
That is why I kindly ask you all for patience; I will write again soon.
It will be a long e-mail for sure, and when it comes, I ask you all to
take the time to read it.
Thanks in advance.
> What I'm asking is what's the practical use for this proposed function?
> Where are you likely to need to know if a string is valid JSON but not have
> to (try to, with error handling) parse it almost immediately afterwards
> anyway?
I'm still on the fence over the general idea, but I thought I could at
least address this question in particular.
I can definitely see its usefulness in a public HTTP API ingesting data
(especially large data) where, if the payload is valid JSON, it gets stored
and processed by a background queue, so that the HTTP layer can
either reject the request with a 4xx status or accept it and only truly
decode it in a separate process that may even be hosted on a separate
server with more memory.
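The ingest flow described above might look roughly like this. Everything in the sketch is hypothetical: accept_payload() and the $enqueue callback stand in for real HTTP and queue layers, and the validity check uses json_decode() only because is_json() is still a proposal.

```php
<?php
// Sketch of "validate at the edge, decode later". Names are illustrative.
function accept_payload(string $body, callable $enqueue): int
{
    json_decode($body);
    if (json_last_error() !== JSON_ERROR_NONE) {
        return 400; // reject malformed JSON at the HTTP layer
    }
    $enqueue($body); // store the raw string; a worker decodes it later
    return 202;      // accepted for background processing
}

$queue = [];
$enqueue = function (string $raw) use (&$queue): void {
    $queue[] = $raw;
};

var_dump(accept_payload('{"event": "signup"}', $enqueue)); // int(202)
var_dump(accept_payload('{"event": ', $enqueue));          // int(400)
var_dump(count($queue));                                   // int(1)
```

With a validate-only function, the json_decode() call in the HTTP layer is the part that would be replaced, which is where the memory saving would apply.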
On Sun, 31 Jul 2022 at 0:56, Deleu (deleugyn@gmail.com) wrote:
> > What I'm asking is what's the practical use for this proposed function?
> > Where are you likely to need to know if a string is valid JSON but not
> > have to (try to, with error handling) parse it almost immediately
> > afterwards anyway?
> I'm still on the fence over the general idea, but I thought I could at
> least address this question in particular.
> I can definitely see it's usefulness on public HTTP API ingesting data
> (specially large data) where if the payload is a valid JSON, it gets stored
> and processed by a background queue, making it so that the HTTP layer can
> either reject the request with a 4xx status or accept it and only truly
> decode it on a separate process that may even be hosted by a separate
> server with larger memory size.
Before starting, I want to thank you all for taking the time to
give me feedback. I sincerely respect that, so... thanks!
Sorry for the long message, but I have the feeling that this is it: it is
now or it will not be, at least not now, so here is my best effort.
Even so, I don't rule out the possibility that I might have a
clouded view of my own proposal, so it is possible that I am
not seeing things right... you know, I am human after all.
So, enough of the prologue; let's start.
Why would I have code that checks whether a JSON string is valid JSON if I
will not use the content inside it? (or something like that)
- Some of you asked me this question.
- That depends on the infinite number of use cases that the human brain can
imagine, which is why I say that this is not the right approach to
discussing this topic.
Why this change then?
- It is web related, ergo, it is PHP related.
- It does not add complexity to PHP. This functionality uses the JSON
parser that already exists in PHP; the only thing we do is create an
interface between "userland" and the parser itself, without the memory
that json_decode() needs in order to check whether a string is valid
JSON.
Proposed functionality (proposed, subject to change for sure, as right now I
have only a working but dirty prototype)
function is_json(string $json, int $flags): bool {}
Returns:
TRUE if the string is valid JSON, otherwise FALSE.
Exceptions:
Optionally enabled, using the same exceptions as json_decode() or a subset
of them, such as the one for a syntax error.
That is it so far, subject to change for sure.
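For readers who want to try the draft signature today, a userland approximation might look like the sketch below. The name `is_json_polyfill()`, the default value for `$flags`, and the way exceptions are surfaced are assumptions based only on the draft above. Unlike the proposal, this version still builds the decoded value in memory.

```php
<?php
// Userland approximation of the draft is_json(); hypothetical name and
// behaviour. It does NOT save memory: json_decode() still builds the value.
function is_json_polyfill(string $json, int $flags = 0): bool
{
    // With JSON_THROW_ON_ERROR in $flags, json_decode() throws a
    // JsonException on invalid input, and the exception propagates to
    // the caller.
    json_decode($json, null, 512, $flags);

    if ($flags & JSON_THROW_ON_ERROR) {
        // Reaching this line means no exception was thrown.
        return true;
    }

    return json_last_error() === JSON_ERROR_NONE;
}

$a = is_json_polyfill('{"a":[1,2,3]}'); // well-formed document
$b = is_json_polyfill('{"a":');         // truncated document
$c = is_json_polyfill('null');          // valid JSON that decodes to null

$threw = false;
try {
    is_json_polyfill('{', JSON_THROW_ON_ERROR);
} catch (JsonException $e) {
    $threw = true;
}
```

Checking `json_last_error()` rather than the decoded value matters for inputs such as `null`, which is valid JSON but decodes to PHP `null`.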
Real open-source projects on GitHub that use json_decode() just to check
whether a string is valid JSON, and nothing else.
- I provide here use cases from major projects that need code to check
whether a string is valid JSON and nothing else.
- Please check the link to see the full code.
- I provide here some small snippets from the link I provide.
- We are not discussing here how they implemented the code; what I want to
show is that there is a need to check whether a string is valid JSON
without creating an object/array out of it.
- Also, we are not discussing whether they are major projects or not. They are
listed on GitHub as major projects, with stars and a big community of users
and developers maintaining them.
- On some snippets I added notes too, because for some of them it is not
obvious how a function like the one I propose could be useful.
Symfony Framework
class JsonValidator extends ConstraintValidator
...
...
...
Laravel Framework
    public function validateJson($attribute, $value)
    {
        if (is_array($value)) {
            return false;
        }
        if (! is_scalar($value) && ! is_null($value) && ! method_exists($value, '__toString')) {
            return false;
        }
        json_decode($value);
        return json_last_error() === JSON_ERROR_NONE;
    }

    public static function isJson($value)
    {
Magento
    private function getJSONString($input)
    {
        $output = json_decode($input);
        return $output ? $this->_jsonEncoder->encode($output) : '{}';
    }

    protected function isValidJsonValue($value)
    {
        if (in_array($value, ['null', 'false', '0', '""', '[]'])
            || (json_decode($value) !== null && json_last_error() === JSON_ERROR_NONE)
        ) {
            return true;
        }
        //JSON last error reset
        json_encode([]);
        return false;
    }

    public function isValid($string)
    {
        if ($string !== false && $string !== null && $string !== '') {
            json_decode($string);
            if (json_last_error() === JSON_ERROR_NONE) {
                return true;
            }
        }
        return false;
    }
getgrav
    public static function validateJson($value, $params)
    {
        return (bool) (@json_decode($value));
    }
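As a side note, the snippets quoted so far use two different checking styles, and they do not agree on every input. The sketch below reduces both styles to free functions with hypothetical names so the difference can be observed. It is not meant as a judgment of the projects, only as an illustration of why a single built-in answer could be convenient.

```php
<?php
// Two checking styles from the quoted snippets, reduced to free functions.
// The function names are hypothetical.

// Style 1: treat a truthy decode result as "valid".
function valid_by_truthiness(string $value): bool
{
    return (bool) @json_decode($value);
}

// Style 2: ignore the decoded value and ask json_last_error().
function valid_by_last_error(string $value): bool
{
    json_decode($value);
    return json_last_error() === JSON_ERROR_NONE;
}

$samples = ['{"a":1}', '[]', 'false', 'null', '{"a":'];
$results = [];
foreach ($samples as $sample) {
    $results[$sample] = [
        valid_by_truthiness($sample),
        valid_by_last_error($sample),
    ];
}

// Print one row per sample: input, truthiness verdict, last-error verdict.
foreach ($results as $sample => [$truthy, $lastError]) {
    printf("%-10s %-6s %s\n", $sample, var_export($truthy, true), var_export($lastError, true));
}
```

Valid documents that decode to a falsy PHP value (`[]`, `false`, `null`) are rejected by the truthiness style and accepted by the `json_last_error()` style.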
Symfony / http-kernel
    public function getPrettyJson()
    {
        // <------ here they decode just to check whether the string is valid
        // JSON; that is the reason for this line.
        $decoded = json_decode($this->getContent());
        return \JSON_ERROR_NONE === json_last_error() ? json_encode($decoded, \JSON_PRETTY_PRINT) : null;
    }
Respect / Validation
    final class Json extends AbstractRule
    {
        /**
         * {@inheritDoc}
         */
        public function validate($input): bool
        {
            if (!is_string($input) || $input === '') {
                return false;
            }
            json_decode($input);
            return json_last_error() === JSON_ERROR_NONE;
        }
    }
humhub
    public function actionIndex()
    {
        Yii::$app->response->statusCode = 204;
        if(!SecuritySettings::isReportingEnabled()) {
            return;
        }
        $json_data = file_get_contents('php://input');
        // <----- here they json_decode() just to check whether it is a valid
        // JSON string, am I right?
        if ($json_data = json_decode($json_data)) {
            $json_data = json_encode($json_data, JSON_PRETTY_PRINT | JSON_UNESCAPED_SLASHES);
            $json_data = preg_replace('/\'nonce-[^\']*\'/', "'nonce-xxxxxxxxxxxxxxxxxxxxxxxx'", $json_data);
            Yii::error($json_data, 'web.security');
        }
    }
Prestashop
    public static function isJson($string)
    {
        json_decode($string);
        return json_last_error() == JSON_ERROR_NONE;
    }
Wordpress CLI
https://github.com/wp-cli/wp-cli/blob/f3e4b0785aa3d3132ee73be30aedca8838a8fa06/php/utils.php
    function is_json( $argument, $ignore_scalars = true ) {
        if ( ! is_string( $argument ) || '' === $argument ) {
            return false;
        }
        if ( $ignore_scalars && ! in_array( $argument[0], [ '{', '[' ], true ) ) {
            return false;
        }
        json_decode( $argument, $assoc = true );
        return json_last_error() === JSON_ERROR_NONE;
    }
JOOMLA CMS
    if (\is_string($value)) {
        json_decode($value); // <---------------------------------------------------------- HERE
        // Check if value is a valid JSON string.
        if ($value !== '' && json_last_error() !== JSON_ERROR_NONE) {
            /**
             * If the value is not empty and is not a valid JSON string,
             * it is most likely a custom field created in Joomla 3 and
             * the value is a string that contains the file name.
             */
            if (is_file(JPATH_ROOT . '/' . $value)) {
                $value = '{"imagefile":"' . $value . '","alt_text":""}';
            } else {
                $value = '';
            }
        }
Stack Overflow questions related to this
In PHP, this is one of the highest-ranked questions related to JSON and
PHP on Stack Overflow: "Fastest way to check if a string is JSON in PHP?"
The question
https://stackoverflow.com/questions/6041741/fastest-way-to-check-if-a-string-is-json-in-php
Viewed 484k times
The ranking
https://stackoverflow.com/questions/tagged/php%20json?sort=MostVotes&edited=true
A person asking how to do exactly this, also providing a real use case;
even though it is about Python, the programming language is not important.
https://stackoverflow.com/questions/5508509/how-do-i-check-if-a-string-is-valid-json-in-python
Someone is also doing exactly this in Java
https://stackoverflow.com/questions/3679479/check-if-file-is-json-java
So the core argument, it seems, is "there's lots of user-space implementations already, hence demand, and it would be better/faster/stronger/we-have-the-technology to do it in C."
Thus another, arguably more important benchmark would be a C implementation compared to a userspace implementation of the same algorithm. Presumably your C code is doing some kind of stream-based validation with braces/quotes matching rather than a naive "try and parse and see if it breaks." We would need to see benchmarks of the same stream-based validation in C vs PHP, as that's the real distinction. That a stream validator would be more memory efficient than a full parser is not at all surprising, but that's also not a fair comparison.
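To make the "validate without building anything" idea concrete, here is a rough userland sketch of a validator that walks the string and never creates arrays or objects. The thread does not show the C prototype, so this is not that code; every function name here is hypothetical. It also skips the UTF-8 check that json_decode() performs, and its depth limit is only a recursion guard, so treat it as a starting point for a benchmark, not a drop-in replacement.

```php
<?php
// Hypothetical userland validator: checks JSON syntax without building values.

function json_is_well_formed(string $s): bool
{
    $i = 0;
    $ok = json_skip_value($s, $i, 0);
    json_skip_ws($s, $i);
    return $ok && $i === strlen($s); // nothing may follow the top-level value
}

function json_skip_ws(string $s, int &$i): void
{
    $i += strspn($s, " \t\n\r", $i);
}

function json_skip_value(string $s, int &$i, int $depth): bool
{
    if ($depth > 512) {
        return false; // recursion guard
    }
    json_skip_ws($s, $i);
    $c = $s[$i] ?? '';
    if ($c === '{' || $c === '[') {
        return json_skip_container($s, $i, $depth, $c === '{');
    }
    // Scalars: a string, a number, or a literal, anchored at offset $i by \G.
    $string = '"(?:[^"\\\\\x00-\x1f]|\\\\(?:["\\\\/bfnrt]|u[0-9a-fA-F]{4}))*+"';
    $number = '-?(?:0|[1-9][0-9]*)(?:\.[0-9]+)?(?:[eE][+-]?[0-9]+)?';
    $pattern = '~\G(?:' . $string . '|' . $number . '|true|false|null)~';
    if (preg_match($pattern, $s, $m, 0, $i) !== 1) {
        return false;
    }
    $i += strlen($m[0]);
    return true;
}

function json_skip_container(string $s, int &$i, int $depth, bool $isObject): bool
{
    $close = $isObject ? '}' : ']';
    $i++; // consume the opening brace or bracket
    json_skip_ws($s, $i);
    if (($s[$i] ?? '') === $close) {
        $i++;
        return true; // empty container
    }
    while (true) {
        if ($isObject) {
            // Object members are "key": value, and the key must be a string.
            json_skip_ws($s, $i);
            if (($s[$i] ?? '') !== '"' || !json_skip_value($s, $i, $depth + 1)) {
                return false;
            }
            json_skip_ws($s, $i);
            if (($s[$i] ?? '') !== ':') {
                return false;
            }
            $i++;
        }
        if (!json_skip_value($s, $i, $depth + 1)) {
            return false;
        }
        json_skip_ws($s, $i);
        $c = $s[$i] ?? '';
        $i++;
        if ($c === $close) {
            return true;
        }
        if ($c !== ',') {
            return false;
        }
    }
}
```

A function like this is the userland side of the comparison being asked for: the same no-allocation strategy, written in PHP, measured against the C version.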
As for the benchmarks themselves, do not use memory_get_usage(); as noted, it shows the memory usage at that time, not ever. What you want is memory_get_peak_usage(), which gets the highest the memory usage has gotten in that script run. Or, even better, use PHPBench with separate sample methods to compare various different implementations. It will handle all the "run many times and average the results and throw out outliers" and such for you. It's quite a flexible tool once you get the hang of it.
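A minimal way to see the difference between the two functions is sketched below. The payload and its size are arbitrary assumptions, and a single run like this is no substitute for a proper benchmark harness.

```php
<?php
// Sketch: how much does json_decode() raise the peak for one payload?
// The payload is an arbitrary array of 200001 small integers.
$payload = '[' . str_repeat('1,', 200000) . '1]';

// memory_reset_peak_usage() exists from PHP 8.2. Without it the peak is
// cumulative for the whole script, so keep to one measurement per run.
if (function_exists('memory_reset_peak_usage')) {
    memory_reset_peak_usage();
}
$peakBefore = memory_get_peak_usage();

json_decode($payload); // result discarded: this is the validation-only case
$valid = json_last_error() === JSON_ERROR_NONE;

$peakAfter    = memory_get_peak_usage(); // highest point reached so far
$currentAfter = memory_get_usage();      // usage right now, after the value was freed

printf("valid: %s\n", $valid ? 'yes' : 'no');
printf("peak growth during decode: %d bytes\n", $peakAfter - $peakBefore);
printf("current usage afterwards:  %d bytes\n", $currentAfter);
```

Because the decoded value is freed as soon as json_decode() returns, the current usage afterwards says little about the cost of the call; the peak is the figure that counts against the memory limit.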
I'll also note that it would be to your benefit to share the working C code as a patch/PR already. If accepted it would be released open source anyway, so letting people see the proposed code now can only help your case; unless the code is awful, in which case showing it later would only waste your time and everyone else's discussing it in the abstract before the implementation could be reviewed.
--Larry Garfield
On Sun, Jul 31, 2022 at 4:41 PM Larry Garfield larry@garfieldtech.com
wrote:
So the core argument, it seems, is "there's lots of user-space
implementations already, hence demand, and it would be
better/faster/stronger/we-have-the-technology to do it in C."
There are innumerable features implemented in userland which would be
faster/stronger/better done in C. I'm not convinced this alone is a
sufficient basis to introduce a new core function. There are also userland
JSON streaming parsers which are memory efficient; I've used the one I
linked to parse JSON files over 1GB with no problem.
And I'm not saying I'm against this proposal, I'm just playing devil's
advocate here. While I can accept that the number of userland
implementations of "validate string as JSON" out there clearly shows some
demand / use cases for doing this, are there equally numerous examples of
issues raised on these product repositories demonstrating that userland
implementations have commonly been insufficient, hit OOM errors or
otherwise caused problems? This might be an RFC to fix a problem very, very
few people have.