Newsgroups: php.internals
Xref: news.php.net php.internals:118327
Date: Sun, 31 Jul 2022 10:40:26 -0500
To: "php internals"
Content-Type: text/plain
Subject: Re: [PHP-DEV] RFC Idea - is_json - looking for feedback
From: larry@garfieldtech.com ("Larry Garfield")

On Sun, Jul 31, 2022, at 8:14 AM, juan carlos morales wrote:
> Before starting, I want to thank all for taking time from your time, to
> give me a feedback, I sincerely respect that, so ... thanks!
>
> Sorry for the long message, but I have the feeling that ... this is it, is
> now ... or it will not be, at least not now, so ... here my best effort;
> even though, I don't put aside the possibility that I might have an
> obfuscated view/opinion regarding my proposal, so is possible that I might
> not see the things right ... you know .. I am human after all.
>
> So, enough of prologue and let's start ....
>
> # Why would I have code that checks if a json-string is a valid json, if I
> will not use the content inside it? (or something like that)
>
> - Some of you asked me this question.
> - That depends on the infinite amount of use cases that the human brain can
> imagine, that is why I say that is not the right approach to discuss this
> topic.
>
> # Why this change then?
>
> - Is web related, ergo, is PHP related.
> - It does not add complexity into PHP, with this functionality we are using
> the existing JSON parser that exists in PHP at the moment, the only thing
> we do is to create an interface between "userland" and the parser itself,
> without the need to use memory as json_decode() is using to check if a
> string is a valid json-string or not.
>
>
> ## Proposed functionality (proposed, subject to changes for sure, as now I
> have only a working and dirty prototype for this)
>
> function is_json(string $json, int $flags): bool {}
>
> Returns:
> TRUE if json-string is valid json, otherwise returns FALSE.
>
> Exceptions:
> Optionally set, for the same or subset of exceptions of the actual
> json_decode(), like the one Syntax Error for example.
>
>
> So far this is it, subject to changes for sure.
>
>
> # Real open-source projects on GitHub using json_decode() just to check if
> a json-string is valid ... and nothing else than that.
>
> - I provide here use cases from major projects, where they need to have code
> that has to be able to check if a string is a valid JSON and nothing else
> than that.
> - Please check the link to see full code
> - I provide here some small snippets from the link I provide
> - We are not discussing here how they implemented the code, what I want to
> show here is that there is a need to check if a string is a valid
> JSON-string without the need to create an object/array out of it.
> - Also, we are not discussing if they are major projects or not. They are
> listed in github as major projects, with stars and a big community of users
> and developers maintaining them.
> - On some snippets I wrote some notes too, because on some of them is not
> obvious how a function like the one I propose could be useful.
>
>
> ## Symfony Framework
>
> https://github.com/symfony/symfony/blob/870eeb975feb1abb4b8a1722e1fd57beeab2b230/src/Symfony/Component/Validator/Constraints/JsonValidator.php
>
> ```
> class JsonValidator extends ConstraintValidator
> ...
> ...
> ...
> ```
>
> ## Laravel Framework
>
> https://github.com/laravel/framework/blob/302a579f00ebcb2573f481054cbeadad9c970605/src/Illuminate/Validation/Concerns/ValidatesAttributes.php
>
> ```
> public function validateJson($attribute, $value)
> {
>     if (is_array($value)) {
>         return false;
>     }
>
>     if (! is_scalar($value) && ! is_null($value) && !
>         method_exists($value, '__toString')) {
>         return false;
>     }
>
>     json_decode($value);
>
>     return json_last_error() === JSON_ERROR_NONE;
> }
> ```
>
> https://github.com/laravel/framework/blob/61eac9cae4717699ecb3941b16c3d775820d4ca2/src/Illuminate/Support/Str.php
>
> ```
> public static function isJson($value)
> {
> ```
>
>
> ## Magento
>
> https://github.com/magento/magento2/blob/7c6b6365a3c099509d6f6e6c306cb1821910aab0/app/code/Magento/User/Block/Role/Grid/User.php
>
> ```
> private function getJSONString($input)
> {
>     $output = json_decode($input);
>     return $output ? $this->_jsonEncoder->encode($output) : '{}';
> }
> ```
>
> https://github.com/magento/magento2/blob/7c6b6365a3c099509d6f6e6c306cb1821910aab0/lib/internal/Magento/Framework/DB/DataConverter/SerializedToJson.php
>
> ```
> protected function isValidJsonValue($value)
> {
>     if (in_array($value, ['null', 'false', '0', '""', '[]'])
>         || (json_decode($value) !== null && json_last_error() === JSON_ERROR_NONE)
>     ) {
>         return true;
>     }
>     //JSON last error reset
>     json_encode([]);
>     return false;
> }
> ```
>
> https://github.com/magento/magento2/blob/7c6b6365a3c099509d6f6e6c306cb1821910aab0/lib/internal/Magento/Framework/Serialize/JsonValidator.php
>
> ```
> public function isValid($string)
> {
>     if ($string !== false && $string !== null && $string !== '') {
>         json_decode($string);
>         if (json_last_error() === JSON_ERROR_NONE) {
>             return true;
>         }
>     }
>     return false;
> }
> ```
>
> ## getgrav
>
> https://github.com/getgrav/grav/blob/3e7f67f589267e61f823d19824f3ee1b9a8a38ff/system/src/Grav/Common/Data/Validation.php
>
> ```
> public static function validateJson($value, $params)
> {
>     return (bool) (@json_decode($value));
> }
> ```
>
>
> ## Symfony / http-kernel
>
> https://github.com/symfony/http-kernel/blob/94986633e4c3e7facb7defbd094a2e1170486ab5/DataCollector/RequestDataCollector.php
>
> ```
> public function getPrettyJson()
> {
>     $decoded = json_decode($this->getContent());
>     //<------ here they decode, just
>     // to check if it is a valid json-string or not, that is the reason for this line.
>
>     return \JSON_ERROR_NONE === json_last_error() ? json_encode($decoded, \JSON_PRETTY_PRINT) : null;
> }
> ```
>
> ## Respect / Validation
>
> https://github.com/Respect/Validation/blob/3dcd859d986f1b586b5539ea19962723ab7352ed/library/Rules/Json.php
>
> ```
> final class Json extends AbstractRule
> {
>     /**
>      * {@inheritDoc}
>      */
>     public function validate($input): bool
>     {
>         if (!is_string($input) || $input === '') {
>             return false;
>         }
>
>         json_decode($input);
>
>         return json_last_error() === JSON_ERROR_NONE;
>     }
> }
> ```
>
> ## humhub
>
> https://github.com/humhub/humhub/blob/26d7e2667a9317057abe335a056ac8e8f4d675fb/protected/humhub/modules/web/security/controllers/ReportController.php
>
>
> ```
> public function actionIndex()
> {
>     Yii::$app->response->statusCode = 204;
>
>     if(!SecuritySettings::isReportingEnabled()) {
>         return;
>     }
>
>     $json_data = file_get_contents('php://input');
>     if ($json_data = json_decode($json_data)) {
>         //<----- here they json_decode() just to check if is valid json-string only, am I right?
>         $json_data = json_encode($json_data, JSON_PRETTY_PRINT | JSON_UNESCAPED_SLASHES);
>         $json_data = preg_replace('/\'nonce-[^\']*\'/', "'nonce-xxxxxxxxxxxxxxxxxxxxxxxx'", $json_data);
>         Yii::error($json_data, 'web.security');
>     }
> }
> ```
>
> ## Prestashop
>
> https://github.com/PrestaShop/PrestaShop/blob/24f9e510ecb0cb002ac3f4834f3210e8d9359899/classes/Validate.php
>
> ```
> public static function isJson($string)
> {
>     json_decode($string);
>
>     return json_last_error() == JSON_ERROR_NONE;
> }
> ```
>
>
> ## Wordpress CLI
>
> https://github.com/wp-cli/wp-cli/blob/f3e4b0785aa3d3132ee73be30aedca8838a8fa06/php/utils.php
>
> ```
> function is_json( $argument, $ignore_scalars = true ) {
>     if ( ! is_string( $argument ) || '' === $argument ) {
>         return false;
>     }
>
>     if ( $ignore_scalars && ! in_array( $argument[0], [ '{', '[' ], true ) ) {
>         return false;
>     }
>
>     json_decode( $argument, $assoc = true );
>
>     return json_last_error() === JSON_ERROR_NONE;
> }
> ```
>
>
> ## JOOMLA CMS
>
> https://github.com/joomla/joomla-cms/blob/09d14c65f25f9bc76f2698e69c4d7b35f43bc848/libraries/src/Form/Field/AccessiblemediaField.php
>
> ```
> if (\is_string($value)) {
>     json_decode($value);
>     //<---------------------------------------------------------- HERE
>
>     // Check if value is a valid JSON string.
>     if ($value !== '' && json_last_error() !== JSON_ERROR_NONE) {
>         /**
>          * If the value is not empty and is not a valid JSON string,
>          * it is most likely a custom field created in Joomla 3 and
>          * the value is a string that contains the file name.
>          */
>         if (is_file(JPATH_ROOT . '/' . $value)) {
>             $value = '{"imagefile":"' . $value . '","alt_text":""}';
>         } else {
>             $value = '';
>         }
>     }
>
> ```
>
>
>
> # Stackoverflow questions related to this
>
> ## In PHP, this question is one of the most high ranked questions related
> to json && php in stackoverflow, "Fastest way to check if a string is JSON
> in PHP?"
>
> ### The question
>
> https://stackoverflow.com/questions/6041741/fastest-way-to-check-if-a-string-is-json-in-php
> Viewed 484k times
>
> ### The ranking
>
> https://stackoverflow.com/questions/tagged/php%20json?sort=MostVotes&edited=true
>
> ## Person asking how to do exactly this, also providing a real use case;
> even though in python, the programming language is not important.
>
> https://stackoverflow.com/questions/5508509/how-do-i-check-if-a-string-is-valid-json-in-python
>
> ## Someone is also doing exactly this, in JAVA
>
> https://stackoverflow.com/questions/3679479/check-if-file-is-json-java

So the core argument, it seems, is "there's lots of user-space implementations already, hence demand, and it would be better/faster/stronger/we-have-the-technology to do it in C."

Thus another, arguably more important benchmark would be a C implementation compared to a userspace implementation of the same algorithm. Presumably your C code is doing some kind of stream-based validation with braces/quotes matching rather than a naive "try and parse and see if it breaks." We would need to see benchmarks of the same stream-based validation in C vs PHP, as that's the real distinction. That a stream validator would be more memory efficient than a full parser is not at all surprising, but that's also not a fair comparison.

As for the benchmarks themselves, do not use memory_get_usage(); as noted, it shows the memory usage at the moment it is called, not the highest it has ever been. What you want is memory_get_peak_usage(), which gets the highest the memory usage has gotten in that script run. Or, even better, use PHPBench with separate sample methods to compare various different implementations. It will handle all the "run many times and average the results and throw out outliers" and such for you. It's quite a flexible tool once you get the hang of it.

I'll also note that it would be to your benefit to share the working C code as a patch/PR already.
If accepted it would be released open source anyway, so letting people see the proposed code now can only help your case; unless the code is awful, in which case showing it later would only waste your time and everyone else's discussing it in the abstract before the implementation could be reviewed.

--Larry Garfield
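
To illustrate the memory_get_usage() versus memory_get_peak_usage() point made above, here is a minimal sketch. It assumes a decode-based validator like the ones quoted from the listed projects; the helper name is_json_via_decode() and the test payload are hypothetical and not taken from the proposal or its prototype.

```php
<?php
declare(strict_types=1);

// Hypothetical helper: validates by fully decoding, as the quoted projects do.
function is_json_via_decode(string $json): bool
{
    json_decode($json);
    return json_last_error() === JSON_ERROR_NONE;
}

// Hypothetical payload: a valid JSON document large enough for the decode cost to show.
$payload = json_encode(array_fill(0, 50000, ['id' => 1, 'name' => 'example']));

$before     = memory_get_usage();
$peakBefore = memory_get_peak_usage();

$valid = is_json_via_decode($payload);

$after     = memory_get_usage();
$peakAfter = memory_get_peak_usage();

printf("valid: %s\n", $valid ? 'yes' : 'no');
// The decoded structure is discarded inside the helper, so this delta
// does not capture the memory that json_decode() needed while it ran.
printf("memory_get_usage() delta:      %d bytes\n", $after - $before);
// The peak value keeps the high-water mark reached during the call.
printf("memory_get_peak_usage() delta: %d bytes\n", $peakAfter - $peakBefore);
```

Because the peak is a high-water mark for the whole script run, each implementation being compared is probably best measured in its own process, or with a tool such as PHPBench as suggested above.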