Newsgroups: php.internals
Message-ID: <3d6fe7502ae17ed621fe251baeb4403b@gravitonic.com>
From: andrei@gravitonic.com (Andrei Zmievski)
Date: Thu, 25 Jan 2007 09:33:58 -0800
To: "Nuno Lopes"
Cc: "Ilia Alshanetsky", "Pierre"
Subject: Re: [PHP-DEV] Re: PHP 5.2.1RC3 Released

I've been thinking about how to not force UTF-8 in PCRE for PHP 6, and
it's not that simple. This is mainly due to preg_replace(), because it
allows array() parameters that can contain mixed IS_UNICODE and
IS_STRING values.

I hope you realize, though, that in UTF-8 mode PCRE does not care about
POSIX locales, even in PHP 5.

By the way, I think the ICU regexp extension, when implemented, will let
you match Portuguese characters in UTF-8 strings.

-Andrei

On Jan 25, 2007, at 9:07 AM, Nuno Lopes wrote:

> But how do I match only portuguese letters? you'll (always) need posix
> locales..
> I don't think forcing preg_* function to utf-8 is a good idea, but
> anyway I haven't looked enough to PHP 6 (yet) to produce a strong
> opinion.
>
> Nuno
>
>> Because with UTF-8, PCRE already knows the uppercase and lowercase
>> equivalents, without having to rely on the POSIX locales.
>>
>> -Andrei
>>
>> On Jan 25, 2007, at 7:24 AM, Pierre wrote:
>>
>>> On 1/25/07, Ilia Alshanetsky wrote:
>>>> PCRE should operate in UTF-8 mode.
>>>
>>> How does UTF-8 help to make it locale compliant?
>>>
>>> --Pierre
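
A minimal sketch of the point about case equivalents in UTF-8 mode, not code from the thread. The helper name caseless_utf8_match() is hypothetical. It assumes the script file is saved as UTF-8 and that PHP's PCRE was built with Unicode property support, which caseless matching of non-ASCII characters depends on. No setlocale() call is made, so the POSIX locale plays no part in the match.

```php
<?php
// Hypothetical helper for illustration only (not part of PHP or the thread).
// $pattern is a regex fragment, so it is deliberately not passed to preg_quote().
// The /u modifier puts PCRE into UTF-8 mode; /i then relies on PCRE's own
// Unicode case tables instead of the tables derived from the POSIX locale.
function caseless_utf8_match($pattern, $subject)
{
    return preg_match('/^' . $pattern . '$/iu', $subject) === 1;
}

// Lowercase pattern against an uppercase Portuguese word.
var_dump(caseless_utf8_match('ação', 'AÇÃO'));

// "ç" and "c" are different letters, so this is not a case-only difference.
var_dump(caseless_utf8_match('ação', 'acao'));
```

Restricting a match to the letters of one language is a separate question: the pattern itself has to spell out that set (for example as a character class), since UTF-8 mode only settles how case is folded.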