Newsgroups: php.internals
From: bukka@php.net (Jakub Zelenka)
Date: Wed, 30 Dec 2020 22:11:41 +0000
Subject: Re: [PHP-DEV] [RFC] Bundling ext/simdjson into core
To: Remi Collet, kocsismate90@gmail.com
Cc: PHP internals list
In-Reply-To: <8c1f2d5a-6888-0787-06ad-095a06dd4e7a@php.net>

On Wed, Dec 30, 2020 at 6:52 AM Remi Collet wrote:

> On 29/12/2020 at 17:57, Máté Kocsis wrote:
> > Hi Internals,
> >
> > I think this will be my last proposal for quite some while :)
> > But this time, I'd like to propose bundling the
> > https://github.com/crazyxman/simdjson_php extension
> > with some major modifications.
> >
> > The proposed OO API is included in the description of the
> > PR that I've just created: https://github.com/php/php-src/pull/6551
> >
> > The main motivation behind this RFC is two-fold:
> > - the underlying simdjson library (https://github.com/simdjson/simdjson)
> > which is used by ext/simdjson provides huge performance gains
> > compared to ext/json (see some benchmark results in the PR)
> > - we can support new use-cases, most notably the so called "on-demand"
> > parsing: https://github.com/simdjson/simdjson/blob/master/doc/ondemand.md
> > (This is not implemented currently)
> >
> > Originally, I planned to include the new API in ext/json, but unfortunately,
> > simdjson is written in C++, so it would make C++ a hard dependency,
> > which was not the case so far.
> > That's why I opted for creating ext/simdjson.
> >
> > Please let me know if you have any feedback.
>
> the library seems young and very active, and so probably not available
> in most distributions.
>
> I would prefer to use the common way
>
> - propose as pecl extension
> - wait for maturity (extension AND library)
> - propose for merge in php-src
>
> IMHO: too early for this one.

I agree it's too early, as the library is too young and won't be available in many distros. The PECL path is better in this case IMO, as it will allow some time for the extension and the library to mature.

That said, the simdjson lib looks pretty impressive. It definitely looks more advanced than the ext/json parser that I wrote more than 6 years ago. In my defense, I had no grant for that. :) Anyway, it's really cool to see what can be done and how much it's possible to speed things up.

I was thinking that it would be good to consider some kind of pluggable decoder where another extension could register a parsing callback. Something similar to what we have for the parser, but for the whole decoding instead. That would still allow using the current parser in json_decode, but if simdjson is available / configured in ini, it would be used instead and would simply be faster. I'm not sure if all options are supported though; for example, I don't see any note about UTF-8 substitution (JSON_INVALID_UTF8_SUBSTITUTE).

Cheers

Jakub
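
A rough sketch of the pluggable decoder idea described above, written in Python purely for illustration (a real hook would live in C inside ext/json). The names register_decoder, default_decoder, decode and the use_plugin switch are hypothetical and not part of any existing API; use_plugin only stands in for an ini setting.

```python
import json
from typing import Any, Callable, Optional

# Hypothetical registry slot: at most one alternative decoder can be plugged in.
Decoder = Callable[[str], Any]
_plugin_decoder: Optional[Decoder] = None


def register_decoder(decoder: Decoder) -> None:
    """Called by another extension (e.g. a simdjson binding) to offer its decoder."""
    global _plugin_decoder
    _plugin_decoder = decoder


def default_decoder(text: str) -> Any:
    """Stand-in for the current built-in parser."""
    return json.loads(text)


def decode(text: str, use_plugin: bool = True) -> Any:
    """Single entry point, playing the role of json_decode.

    When use_plugin is on and a decoder has been registered, that decoder
    handles the whole decoding; otherwise the built-in parser is used.
    """
    if use_plugin and _plugin_decoder is not None:
        return _plugin_decoder(text)
    return default_decoder(text)
```

Callers keep using the one entry point either way. The sketch does not model options: a flag the plugged-in decoder cannot honour (JSON_INVALID_UTF8_SUBSTITUTE being the open question here) would presumably need a fallback to the built-in parser.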