Newsgroups: php.internals
Date: Thu, 31 Dec 2020 02:30:35 +0100
To: Jakub Zelenka
Cc: Remi Collet, PHP internals list
Subject: Re: [PHP-DEV] [RFC] Bundling ext/simdjson into core
From: kocsismate90@gmail.com (Máté Kocsis)

Hi Remi and Jakub,

> I agree it's too early as the library is young and won't be available in
> many distros. The PECL path is better in this case IMO as it will allow
> some time.

In my opinion, this is a case where making an exception is worth
considering. If the simdjson library were written in C, I'd propose
adding the new API and parser to ext/json directly, since ext/simdjson
is just a very small wrapper around the parser, and not a complex piece
of code in itself (compared to other parts of php-src).
Also, I think the performance benefit of using the simdjson parser is so
significant that it would be a pity if people had to wait for years until
the functionality became generally available as a core extension. As
json_encode() and json_decode() are very easy to use, my guess is that a
3rd party JSON-related extension would never see wide enough adoption,
because only people who have really hit the limitations of ext/json
would install it.

By the way, it has just occurred to me that our company is also affected
by these limitations. Sometimes we have to parse very large JSON
documents, and in some cases these can end up being truncated.
Fortunately we only need a specific part of the data, so someone wrote a
partial "parser" (that is a euphemism) tailored to the schema in
question. Rather than having to use custom hackery, it would be so much
better if PHP offered partial parsing out of the box, like what the
proposed JsonParser::getKeyValue() does.

That said, the cost-benefit ratio of having simdjson in core seems
favorable to me.

> Was thinking that it would be good to consider some kind of plugable
> decoder where another extension could register a parsing callback.
> Something similar to what we have for parser but instead for the whole
> decoding. That would allow to still use current parser in json_decode but
> if simdjson available / configured in ini, then it would used instead and
> would be just faster. Not sure if all options are supported though - for
> example don't see any note about UTF8 substitution
> (JSON_INVALID_UTF8_SUBSTITUTE).

This is a very interesting approach, and it reminds me of the hashing
registry RFC to a certain extent. And you are right: as far as I saw,
these flags are not supported by ext/simdjson. But to be honest, I
haven't yet analyzed how difficult it would be to reach reasonably full
compatibility with ext/json.

Cheers,
Máté