Date: Tue, 18 Mar 2014 15:48:58 +0000
To: "Ivan Enderlin @ Hoa"
Cc: PHP internals list
Subject: Re: [PHP-DEV] New JSON parser
From: bukka@php.net (Jakub Zelenka)

On Tue, Mar 18, 2014 at 9:09 AM, Ivan Enderlin @ Hoa
<ivan.enderlin@hoa-project.net> wrote:

> On 13/03/2014 20:48, Jakub Zelenka wrote:
>
>> Hi,
>
> Hi Jakub,
>
>> I have created a new JSON parser using conditional re2c and a pure pull
>> Bison parser. It's a native UTF-8 parser licensed under the PHP license
>> (it can be used for Evil though :) ). The extension is available at
>>
>> https://github.com/bukka/php-jsond
>
> Very nice work, thanks!
>
>> [snip]
>>
>> I need to do more testing before creating an RFC for replacing the
>> current parser. There is still space for further improvements. If anyone
>> has any ideas, please let me know. Or if you could test it, that would
>> be great too! ;)
>
> For my PhD thesis in the automatic testing domain, I have created some
> grammar-based testing algorithms, based on our dedicated LL(k) compiler
> compiler (with its dedicated grammar description language called PP).
> Please see the article [1] (along with the presentation [2] and all the
> details [3] about the article and the conference) and also the tool [4]
> (called Hoa\Compiler). In this article, my experiment consisted of
> generating a lot of JSON strings (based on the JSON grammar [5] written
> in PP) and comparing them against the JSON parsers of Gecko and PHP. Now
> I have re-run this experiment, this time comparing all the generated
> data with ext/json and ext/jsond to check for potential regressions.
> Also, I tested it with a bounded exhaustive algorithm: it means we
> generate all possible JSON strings up to a given size (the unit is the
> number of tokens in a sequence, so `{`, `true` or `foo` are tokens).
> Note that we have two other algorithms: uniform random generation and
> coverage-based generation.
>
> I have created a little repository to share my work [6]. I have
> generated all sequences up to 15 tokens, which amounts to 356'327 data
> items, and none of them has failed. Congrats!
>
> Just for the record, a good test is a test that fails. Here, I have
> detected no regression, and because I have previously compared ext/json
> with the JSON parser from Gecko, we can consider your implementation as
> "safe".
>
> This is my little contribution of the morning :-). You can also use this
> work to generate data in a static file and use it to compare the memory
> and CPU usage of ext/json and ext/jsond.
>
> Best regards.
>
> [1] http://hoa-project.net/Literature/Research/Amost12.html
> [2] http://keynote.hoa-project.net/Amost12/EDGB12.pdf
> [3] http://hoa-project.net/Event/Amost12.html
> [4] https://github.com/hoaproject/Compiler
> [5] https://github.com/hoaproject/Json/blob/master/Grammar.pp
> [6] https://github.com/Hywan/jsond-test

Hi!

That looks really nice! I've actually been looking for something like
this. It's cool that I can have a bunch of JSON sub-grammars for
generating test data for benchmarks. I need to take a proper look at what
it can do, but I think it will be really useful...

Thanks
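
To make the bounded exhaustive idea from the quoted message more concrete, here is a minimal Python sketch. It does not use Hoa\Compiler or the PP grammar: the tiny token set (ATOMS, KEYS) and every function name below are assumptions made up for this illustration, and the two parsers being compared are plain callables standing in for ext/json and ext/jsond.

```python
import json
from functools import lru_cache

# Hypothetical toy token set; a real grammar would cover far more
# strings and numbers than these few representatives.
ATOMS = ("true", "false", "null", "0", '"a"')
KEYS = ('"k"',)


@lru_cache(maxsize=None)
def values(n):
    """Return every JSON value built from exactly n tokens."""
    if n < 1:
        return ()
    out = []
    if n == 1:
        out.extend(ATOMS)
    if n >= 2:
        # Two tokens go to the brackets, the rest to the body.
        out.extend("[" + body + "]" for body in sequences(n - 2, values))
        out.extend("{" + body + "}" for body in sequences(n - 2, members))
    return tuple(out)


@lru_cache(maxsize=None)
def members(n):
    """Return every object member (key, colon, value) of exactly n tokens."""
    if n < 3:
        return ()
    return tuple(k + ":" + v for k in KEYS for v in values(n - 2))


@lru_cache(maxsize=None)
def sequences(n, item):
    """Return every comma-separated list of `item` using exactly n tokens."""
    if n == 0:
        return ("",)  # empty body, as in [] or {}
    out = list(item(n))  # a single item that uses all n tokens
    for i in range(1, n - 1):  # first item: i tokens, then one comma
        for head in item(i):
            for tail in sequences(n - i - 1, item):
                out.append(head + "," + tail)
    return tuple(out)


def outcome(parse, text):
    """Normalise a parser call into a comparable (status, value) pair."""
    try:
        return ("ok", parse(text))
    except ValueError:  # json.JSONDecodeError is a ValueError
        return ("error", None)


def disagreements(parse_a, parse_b, max_tokens):
    """List every generated input on which the two parsers differ."""
    bad = []
    for n in range(1, max_tokens + 1):
        for text in values(n):
            if outcome(parse_a, text) != outcome(parse_b, text):
                bad.append(text)
    return bad


if __name__ == "__main__":
    total = sum(len(values(n)) for n in range(1, 8))
    # Both callables are stand-ins; swap in the parsers under test.
    diff = disagreements(json.loads, json.JSONDecoder().decode, 7)
    print(total, "inputs generated,", len(diff), "disagreements")
```

Counting size in tokens rather than characters keeps the enumeration finite and grammar-driven, which is what makes "all sequences up to N tokens" a well-defined test set. The generated strings could equally be written to a static file and replayed against each parser for memory and CPU comparisons.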