Newsgroups: php.internals
From: andrei@gravitonic.com (Andrei Zmievski)
To: Rob Richards
Cc: "internals@lists.php.net"
Date: Thu, 20 Jul 2006 11:02:32 -0700
Subject: Re: [PHP-DEV] unicode and xml extensions
In-Reply-To: <44BE7C90.9030302@ctindustries.net>

Hey Rob,

Looks good. Have you tested the filesystem (filename) related functions with non-ASCII filenames? Try making a file called "informaçon.xml", for example, set unicode.filesystem_encoding=utf-8 (or whatever encoding your filesystem uses), and see if you can read it.

-Andrei

On Jul 19, 2006, at 11:40 AM, Rob Richards wrote:

> Andrei Zmievski wrote:
>> Rob,
>>
>> I have not tested the patch, but it looks good to me on cursory
>> overview. I assume it passes your tests?
>> The only comment I have is regarding the usage of 't' and 'T'
>> specifiers. Since you always have to pass binary UTF-8 strings to
>> libxml, we should always use 's' specifier and let PHP downconvert
>> Unicode strings based on the runtime encoding (which you set to
>> UTF-8).
> Updated the code with your suggestion. I first attempted to eliminate
> having to change converters when running with unicode off for all the
> "t" parameters (save a few extra instructions there), but code is much
> more manageable now than converting them manually.
>
> Would like some feedback, though, on the changes made to xmlreader
> before moving on to any of the other extensions (seeing the changes
> are going to be pretty much the same).
>
> Rob
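
The filename check suggested at the top of this message can be tried one level below PHP first, to confirm which bytes the filesystem actually accepts for "informaçon.xml". What follows is only a minimal sketch under stated assumptions: it assumes a POSIX-style filesystem that treats names as raw bytes, and `round_trip`, `NAME_UTF8` and `NAME_LATIN1` are hypothetical names made up for illustration, not part of PHP or libxml.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* "informaçon.xml" spelled as raw bytes in two encodings (illustrative macros).
 * U+00E7 (ç) is the two bytes C3 A7 in UTF-8 and the single byte E7 in ISO-8859-1. */
#define NAME_UTF8   "informa" "\xC3\xA7" "on.xml"
#define NAME_LATIN1 "informa" "\xE7" "on.xml"

/* Hypothetical helper: write payload to a file called `name`, read it back
 * into buf (NUL-terminated), then delete the file.
 * Returns the number of bytes read back, or -1 on failure. */
long round_trip(const char *name, const char *payload, char *buf, size_t buflen)
{
    assert(name != NULL && payload != NULL && buf != NULL && buflen > 0);

    FILE *fp = fopen(name, "wb");
    if (fp == NULL) {
        return -1;
    }
    size_t len = strlen(payload);
    size_t written = fwrite(payload, 1, len, fp);
    if (fclose(fp) != 0 || written != len) {
        remove(name);
        return -1;
    }

    fp = fopen(name, "rb");
    if (fp == NULL) {
        remove(name);
        return -1;
    }
    size_t got = fread(buf, 1, buflen - 1, fp);
    buf[got] = '\0';
    fclose(fp);

    remove(name); /* clean up the scratch file */
    return (long)got;
}
```

Calling `round_trip(NAME_UTF8, "<doc/>", buf, sizeof buf)` exercises the UTF-8 spelling of the name. If that succeeds while the PHP-level read fails, the mismatch is more likely in the filename conversion than in the filesystem itself.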