Newsgroups: php.internals
Date: Thu, 13 Mar 2014 21:04:21 +0100
To: Jakub Zelenka
Cc: PHP internals list
Subject: Re: [PHP-DEV] New JSON parser
From: nikita.ppv@gmail.com (Nikita Popov)

On Thu, Mar 13, 2014 at 8:48 PM, Jakub Zelenka wrote:

> Hi,
>
> I have created a new JSON parser using conditional re2c and a pure pull
> Bison parser. It's a native UTF-8 parser licensed under the PHP license
> (it can be used for Evil though :) ). The extension is available at
>
> https://github.com/bukka/php-jsond
>
> The encoder is taken from the current ext/json but the decoder has been
> completely rewritten.
>
> I have done some basic benchmarks (the results are on the README.md page).
> For short strings the results are almost the same (the new parser is
> usually slightly faster) but there is a big difference for longer strings.
> For example, for a string with 800 characters the new parser is 4 times
> faster. If the string is longer, the results are even better. It's also
> much faster for big arrays. For example, decoding json_encode($_SERVER) is
> twice as fast with the new parser.
>
> In addition to the speed improvements there is also a big memory saving.
> The old parser converts every serialized string to UTF-16, which requires
> additional memory. Although the memory is freed after parsing, it can
> be problematic if memory_limit is set. The new parser parses the supplied
> string directly and allocates no extra memory for conversion.
>
> I need to do more testing before creating an RFC for replacing the current
> parser. There is still space for further improvements. If anyone has any
> ideas, please let me know. Or if you could test it, that would be great
> too! ;)
>
> I haven't finished the build config (it's working but it doesn't rebuild
> automatically when you change *.y or *.re files). Need to figure out how
> to add Makefile.frag for bison and re2c.
> It's not working when I add
> PHP_ADD_MAKEFILE_FRAGMENT($abs_srcdir/Makefile.frag)
> $ make
> make: *** No rule to make target `jsond.lo', needed by `jsond.la'. Stop.
> Has anyone got any idea? :)
>
> Also there is no build for Win atm as I don't have Windows... :( Hopefully
> someone will do it for me... :)
>

This sounds great! We solve the licensing issue *and* get a performance benefit at the same time :)

One quick note, while looking over the code:

    FLOAT = INT "." UINT

should be

    FLOAT = INT "." DIGIT+

otherwise you'll disallow numbers with leading zeros in the fractional part, e.g. 1.001. The same applies to

    ( INT | FLOAT ) [eE] [+-]? UINT

where the exponent should also allow leading zeros ;)

Nikita
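To illustrate the leading-zero point, here is a rough sketch that mirrors the two rule variants as Python regular expressions. The definitions of DIGIT, UINT and INT are assumptions on my part (UINT taken to be either "0" or a nonzero digit followed by more digits, INT an optionally negated UINT); they are not quoted from the jsond scanner, and the helper name accepts() is made up for this example.

```python
import re

# Assumed token definitions, NOT copied from the jsond .re file:
DIGIT = r"[0-9]"
UINT = r"(?:0|[1-9][0-9]*)"   # no leading zeros allowed
INT = rf"-?{UINT}"

# Fractional part written as UINT (the rule as quoted) vs. DIGIT+ (the suggestion).
FLOAT_OLD = rf"{INT}\.{UINT}"
FLOAT_NEW = rf"{INT}\.{DIGIT}+"

# Exponent written as UINT vs. DIGIT+.
EXP_OLD = rf"(?:{INT}|{FLOAT_NEW})[eE][+-]?{UINT}"
EXP_NEW = rf"(?:{INT}|{FLOAT_NEW})[eE][+-]?{DIGIT}+"


def accepts(pattern: str, text: str) -> bool:
    """Return True if the whole text matches the pattern (hypothetical helper)."""
    return re.fullmatch(pattern, text) is not None


if __name__ == "__main__":
    for rule_name, rule in [("FLOAT_OLD", FLOAT_OLD), ("FLOAT_NEW", FLOAT_NEW)]:
        print(rule_name, "1.001", accepts(rule, "1.001"))
    for rule_name, rule in [("EXP_OLD", EXP_OLD), ("EXP_NEW", EXP_NEW)]:
        print(rule_name, "1e05", accepts(rule, "1e05"))
```

Under these assumptions the UINT-based rules reject 1.001 and 1e05, both of which are valid JSON numbers, while the DIGIT+ variants accept them. The integer part still goes through INT, so a leading zero there (e.g. 01.5) stays rejected, as JSON requires.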