Newsgroups: php.internals
Message-ID: <4BBC22FC.8050307@divbyzero.net>
In-Reply-To: <4BBB756E.6000905@lerdorf.com>
Date: Wed, 07 Apr 2010 08:15:24 +0200
From: martin@divbyzero.net (Martin Jansen)
To: Rasmus Lerdorf
CC: internals@lists.php.net
Subject: Re: [PHP-DEV] What gruntwork needs to be done

On 6.4.2010 19:54, Rasmus Lerdorf wrote:
> On 04/06/2010 10:47 AM, Scott MacVicar wrote:
>> http://whisky.macvicar.net/patches/utf8-string.diff.txt
>
> My only issue with this is that it essentially duplicates the utf8 part
> of get_next_char() from html.c. I'd like to see cs parsing in one place
> instead of spread out all over the code tree. The get_next_char()
> function also supports other charsets, so we could have a more generic
> cs_validate() function along with utf8_validate().

Shouldn't this be str_cs_validate() then? Or has the plan been abandoned
to group functions into logical groups by their prefix?

- Martin