Newsgroups: php.internals
Xref: news.php.net php.internals:59310
Date: Sun, 1 Apr 2012 23:10:53 +0200
In-Reply-To: <4F7847CA.2090307@anderiasch.de>
References: <4F7847CA.2090307@anderiasch.de>
To:
Florian Anderiasch
Cc: internals@lists.php.net
Content-Type: text/plain; charset=ISO-8859-1
Subject: Re: [PHP-DEV] Question about parser implementation details
From: pierre.php@gmail.com (Pierre Joye)

hi,

On Sun, Apr 1, 2012 at 2:19 PM, Florian Anderiasch wrote:
> due to the widespread acceptance of binary number format (0b1010101) and
> the growing demand for backwards compatibility I've started to work on
> support for Roman Numerals (I, II, III, ...)

I am really not sure we want that as part of PHP scripts, the way hex or
binary literals are. (Read: I do not want to have that :).

> As you might know, this format cannot be strictly parsed from left to
> right or right to left, as several number values need a look-ahead
> before being able to compute them (like IV), so my naive first
> implementation splits the string into tokens (like in 1990 = MCMXC =>
> M,CM,XC => 1000,900,90) then simplifying those 3 on their own, then
> adding the results, but I'm not sure this could kill performance if
> calculated inside zend_language_scanner.l.
>
> I'd appreciate any hints on how to tackle this serious concern.

Btw, it may be possible to parse Roman numerals using ICU, via its number
parsing API. It is also possible to generate Roman numerals with ICU,
using numberFormat and some extra rules. The docs have some examples.

Cheers,
--
Pierre

@pierrejoye | http://blog.thepimp.net | http://www.libgd.org
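
As an illustration of the look-ahead concern in the quoted message: a Roman
numeral can be evaluated in one left-to-right pass with a single character of
look-ahead, without splitting it into tokens first. The C sketch below is a
minimal example under stated assumptions, not scanner code: roman_to_int and
roman_digit are hypothetical names that do not exist in
zend_language_scanner.l, only upper-case digits are handled, and non-canonical
forms such as "IIII" or "IC" are accepted rather than rejected.

```c
#include <assert.h>
#include <stddef.h>

/* Value of a single Roman digit, or 0 if the character is not one. */
static int roman_digit(char c)
{
    switch (c) {
    case 'I': return 1;
    case 'V': return 5;
    case 'X': return 10;
    case 'L': return 50;
    case 'C': return 100;
    case 'D': return 500;
    case 'M': return 1000;
    default:  return 0;
    }
}

/*
 * Hypothetical helper, not part of the Zend scanner.
 * Converts an upper-case Roman numeral to an int in one left-to-right pass.
 * Returns -1 for an empty string or an unknown character.
 * It does not validate canonical form ("IIII" and "IC" are accepted).
 */
int roman_to_int(const char *s)
{
    int total = 0;
    size_t i;

    assert(s != NULL); /* precondition: caller passes a valid string */
    if (s[0] == '\0')
        return -1;

    for (i = 0; s[i] != '\0'; i++) {
        int cur = roman_digit(s[i]);
        int next = roman_digit(s[i + 1]); /* 0 at the terminating NUL */

        if (cur == 0)
            return -1;
        /* A smaller digit before a larger one is subtracted (IV, CM, XC). */
        if (cur < next)
            total -= cur;
        else
            total += cur;
    }
    return total;
}
```

With this sketch, roman_to_int("MCMXC") evaluates to 1990, matching the
M,CM,XC => 1000,900,90 breakdown in the quoted message. The work is one
switch and one comparison per character, so the cost grows linearly with the
length of the literal.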