Newsgroups: php.internals
From: scott@macvicar.net (Scott MacVicar)
To: Rasmus Lerdorf
Cc: Justin Dearing, internals@lists.php.net, scottmac@php.net
Date: Tue, 6 Apr 2010 11:03:41 -0700
Subject: Re: [PHP-DEV] What gruntwork needs to be done

On Apr 6, 2010, at 10:54 AM, Rasmus Lerdorf wrote:

> On 04/06/2010 10:47 AM, Scott MacVicar wrote:
>> On Apr 6, 2010, at 10:34 AM, Rasmus Lerdorf wrote:
>>
>>> On 04/06/2010 10:08 AM, Justin Dearing
wrote:
>>>> So pending review and acceptance by Dmitry, I've written my first patch for
>>>> PHP. While there is a good chance I will need to make further revisions to
>>>> the test or code, I don't know yet what those will be.
>>>>
>>>> However, I've got some free time at the moment, and I'd like to make use of
>>>> some of the sunk costs of figuring out how to hack PHP. So I know that in
>>>> general there is a lot of work to be done. I also know that there are
>>>> plenty of open bugs, tests to be written, etc. What I am looking for is
>>>> someone to say "here are the next 10 bugs I will work on, can you write me
>>>> tests" or "I wrote this patch on Linux, I need someone to make it work on
>>>> Windows too" or "Party X complains of this but refuses to fill out a proper
>>>> bug report."
>>>
>>> Here is a straightforward (but not easy) one:
>>>
>>> http://bugs.php.net/bug.php?id=47435
>>>
>>> The php_filter_validate_ip() function in ext/filter/logical_filters.c
>>> needs those reserved IPv6 ranges added to the FORMAT_IPV6 case in the
>>> switch statement there when FILTER_FLAG_NO_RES_RANGE is set. I say it
>>> isn't super easy because we don't have much in the way of IPv6 parsing
>>> in PHP yet, so it will probably involve finding some decent code that
>>> can expand an IPv6 notation into something we can logically separate.
>>> That might also mean a rewrite of the _php_filter_validate_ipv6()
>>> function in the same file.
>>>
>>> Another one, if you are interested in encoding issues:
>>>
>>> http://bugs.php.net/bug.php?id=49687
>>>
>>> I don't necessarily agree with Scott that it is wrong to expect
>>> addslashes() to validate the input string. It could call
>>> get_next_char() the same way php_escape_html_entities_ex() in
>>> ext/standard/html.c does. And we need that utf8_decode() fix mentioned
>>> in the report reviewed/committed if it hasn't been already.
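
For the IPv6 item quoted above, here is a rough sketch of what "expand the
notation into something we can logically separate" might look like. It is
not code from ext/filter: it leans on POSIX inet_pton() to expand the
textual address into 16 bytes (an assumption, since a portable version would
need its own parser or a Windows equivalent), and the names ipv6_prefix,
prefix_match() and ipv6_is_reserved() are made up for illustration. The
ranges in the table are only a few well-known ones; the list the bug report
asks for may differ.

```c
#include <sys/socket.h>  /* AF_INET6 */
#include <netinet/in.h>
#include <arpa/inet.h>   /* inet_pton (POSIX) */
#include <stdint.h>
#include <string.h>

/* Hypothetical table entry: a prefix and its length in bits. */
struct ipv6_prefix {
    uint8_t addr[16];
    unsigned bits;
};

/* Illustrative ranges only; the bug report's list may differ. */
static const struct ipv6_prefix reserved[] = {
    { {0}, 128 },                                /* ::/128 unspecified */
    { {0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1}, 128 },  /* ::1/128 loopback */
    { {0xfe, 0x80}, 10 },                        /* fe80::/10 link-local */
    { {0xfc, 0x00}, 7 },                         /* fc00::/7 unique local */
    { {0x20, 0x01, 0x0d, 0xb8}, 32 },            /* 2001:db8::/32 documentation */
};

/* Compare the first p->bits bits of addr against the prefix. */
static int prefix_match(const uint8_t *addr, const struct ipv6_prefix *p)
{
    unsigned full = p->bits / 8;  /* whole bytes to compare */
    unsigned rest = p->bits % 8;  /* leftover high bits in the next byte */

    if (memcmp(addr, p->addr, full) != 0) {
        return 0;
    }
    if (rest != 0) {
        uint8_t mask = (uint8_t)(0xff << (8 - rest));
        if ((addr[full] & mask) != (p->addr[full] & mask)) {
            return 0;
        }
    }
    return 1;
}

/* Returns 1 if reserved, 0 if not, -1 if str is not a valid IPv6 address. */
int ipv6_is_reserved(const char *str)
{
    uint8_t addr[16];
    size_t i;

    if (inet_pton(AF_INET6, str, addr) != 1) {
        return -1;
    }
    for (i = 0; i < sizeof(reserved) / sizeof(reserved[0]); i++) {
        if (prefix_match(addr, &reserved[i])) {
            return 1;
        }
    }
    return 0;
}
```

Once the address is in binary form, the reserved-range check is just a
prefix comparison, so adding or removing ranges only touches the table.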
>>>
>>
>> I fixed utf8_decode and I had a patch for adding utf8_validate which is
>> probably suitable for 5.4.
>>
>> http://whisky.macvicar.net/patches/utf8-string.diff.txt
>>
>> It's not quite done, I had intentions of adding support for using
>> truncate, simple true / false for valid or the unicode replacement
>> character.
>
> My only issue with this is that it essentially duplicates the utf8 part
> of get_next_char() from html.c. I'd like to see cs parsing in one place
> instead of spread out all over the code tree. The get_next_char()
> function also supports other charsets, so we could have a more generic
> cs_validate() function along with utf8_validate().
>

I missed this function last year, abstracting that and making it PHPAPI
would be awesome.

Scott
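
To make the "simple true / false for valid" mode concrete, here is a minimal
standalone sketch of a UTF-8 well-formedness check. It is neither the
utf8_validate() from the patch linked above nor get_next_char() from html.c;
the name sketch_utf8_is_valid() is hypothetical, and the truncate and
replacement-character modes are left out. It rejects stray continuation
bytes, truncated sequences, overlong encodings, surrogates and anything
above U+10FFFF.

```c
#include <stddef.h>

/* Hypothetical helper, not the utf8_validate() from the patch.
 * Returns 1 if the len bytes at s are well-formed UTF-8, else 0. */
int sketch_utf8_is_valid(const unsigned char *s, size_t len)
{
    size_t i = 0;

    while (i < len) {
        unsigned char c = s[i];
        unsigned long cp;   /* decoded code point */
        unsigned long min;  /* smallest code point allowed at this length */
        size_t need, k;     /* need = continuation bytes expected */

        if (c < 0x80) {                   /* 0xxxxxxx: ASCII */
            i++;
            continue;
        } else if ((c & 0xe0) == 0xc0) {  /* 110xxxxx: 2-byte sequence */
            need = 1; cp = c & 0x1f; min = 0x80;
        } else if ((c & 0xf0) == 0xe0) {  /* 1110xxxx: 3-byte sequence */
            need = 2; cp = c & 0x0f; min = 0x800;
        } else if ((c & 0xf8) == 0xf0) {  /* 11110xxx: 4-byte sequence */
            need = 3; cp = c & 0x07; min = 0x10000;
        } else {
            return 0;  /* stray continuation byte or invalid lead byte */
        }

        if (len - i - 1 < need) {
            return 0;  /* sequence runs past the end of the input */
        }
        for (k = 1; k <= need; k++) {
            if ((s[i + k] & 0xc0) != 0x80) {
                return 0;  /* expected a 10xxxxxx continuation byte */
            }
            cp = (cp << 6) | (s[i + k] & 0x3f);
        }
        if (cp < min || cp > 0x10ffff || (cp >= 0xd800 && cp <= 0xdfff)) {
            return 0;  /* overlong form, out of range, or surrogate */
        }
        i += need + 1;
    }
    return 1;
}
```

A shared cs_validate() along the lines Rasmus describes would presumably
dispatch on charset and reuse one decoder rather than carry a second copy
of this loop.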