Newsgroups: php.internals
From: philip@roshambo.org (Philip Olson)
Date: Mon, 5 Nov 2012 20:54:55 -0800
To: Rasmus Lerdorf
Cc: Jean-Sébastien Hedde, internals
In-Reply-To: <5097EF8A.1000809@lerdorf.com>
Subject: Re: [PHP-DEV] Incomprehension with preg_match and utf8

On Nov 5, 2012, at 8:55 AM, Rasmus Lerdorf wrote:

> On 11/05/2012 08:41 AM, Jean-Sébastien Hedde wrote:
>> On Mon, 05 Nov 2012 08:04:06 -0800, Rasmus Lerdorf wrote:
>>>
>>> I think the documentation is wrong on that. In Unicode mode [[:alnum:]]
>>> actually becomes \p{Xan} which should match Unicode chars as well, but
>>> only if PCRE was compiled with Unicode support. So I suspect you don't
>>> actually have a Unicode-capable PCRE build in some cases there.
>>>
>>> -Rasmus
>>
>> I will report the bug to the package maintainers (remi, debian too...).
>>
>> Is there anyway for us to avoid those "wrong" builds ?
>
> I don't see how.

Hi geeks,

Does anyone have a suggestion on how the documentation should be updated?
The quote is from here:

  http://php.net/manual/en/regexp.reference.character-classes.php

With the quote being:

  "In UTF-8 mode, characters with values greater than 128 do
  not match any of the POSIX character classes."
A few simple/related facts:

  - PCRE_UCP exists as of PCRE 8.10
  - Gustavo mentioned the related PHP change on Oct 3, 2010 (not sure
    what PHP version, and googling for "87a237342" turns up empty,
    and I miss SVN version numbers)

Anyway, how should this be documented?

Regards,
Philip
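
For anyone wanting to check what a given build actually does, here is a minimal sketch, not a definitive test. It assumes only preg_match() and the /u modifier; probe_pcre_unicode() is a hypothetical helper name, not something PHP ships. It probes three things: whether UTF-8 mode compiles at all, whether Unicode properties such as \p{L} are available, and whether a POSIX class matches a non-ASCII letter under /u (the behaviour Rasmus describes above).

```php
<?php
// Hypothetical helper (not part of PHP): reports how the PCRE library
// behind this PHP build treats a non-ASCII letter under the /u modifier.
function probe_pcre_unicode()
{
    // "é" encoded as UTF-8: one character, two bytes (0xC3 0xA9).
    $subject = "\xC3\xA9";

    // @ silences the compile warning raised when the PCRE build lacks
    // UTF-8 or Unicode property support; preg_match() then returns false.
    $utf8  = @preg_match('/^.$/u', $subject);            // one char in UTF-8 mode
    $prop  = @preg_match('/^\p{L}$/u', $subject);        // Unicode property class
    $posix = @preg_match('/^[[:alpha:]]$/u', $subject);  // POSIX class under /u

    return array(
        'utf8_mode'                     => $utf8 === 1,
        'unicode_properties'            => $prop === 1,
        'posix_class_matches_non_ascii' => $posix === 1,
    );
}

var_dump(probe_pcre_unicode());
```

The result depends on the PCRE library PHP was built against, so no expected output is given here. If the third value is false while the first two are true, that would be consistent with the manual's current wording; if it is true, the build behaves the way Rasmus describes.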