Newsgroups: php.internals
Date: Tue, 6 Nov 2012 01:58:53 -0800
From: ww.galen@gmail.com (Galen Wright-Watson)
To: Philip Olson
Cc: internals
Subject: Re: [PHP-DEV] Incomprehension with preg_match and utf8

On Mon, Nov 5, 2012 at 8:54 PM, Philip Olson wrote:

> [...]
> A few simple/related facts:
>
> [...]
>  - Gustavo mentioned the related PHP change on Oct 3, 2010 (not sure
>    what PHP version, and googling for "87a237342" turns up empty,
>    and I miss SVN version numbers)

For reference: php_version.h in commit
87a237342282fe036bb90486fdd6cdc392e16ac7 lists the version as
5.3.99-dev. The commit adds the PCRE_UCP option when PCRE_UCP is
defined and the "u" modifier is used. The commit message is:

- Fixed bug #52971 (PCRE-Meta-Characters not working with utf-8)
> # In PCRE, by default, \d, \D, \s, \S, \w, and \W recognize only ASCII
> # characters, even in UTF-8 mode. However, this can be changed by
> # setting the PCRE_UCP option.

The PHP changelog lists version 5.3.4 as containing the fix for bug
#52971.
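
The distinction described in the commit message (shorthand classes such as \d and \w matching only ASCII versus matching by Unicode property) can be seen in a small runnable sketch. This is an analogy only: it uses Python's re module, where the re.ASCII flag selects the ASCII-only behaviour, and it does not call PHP or PCRE. The helper name `matches` and the sample characters are made up for illustration. In PHP the comparable switch would be the "u" modifier on a build where PCRE_UCP is available, per the commit above.

```python
import re

# Sample characters chosen for illustration (not taken from the thread).
WORD = "\u00e9"   # LATIN SMALL LETTER E WITH ACUTE
DIGIT = "\u0663"  # ARABIC-INDIC DIGIT THREE


def matches(pattern: str, text: str, ascii_only: bool) -> bool:
    """Return True if the whole of `text` matches `pattern`.

    ascii_only=True  -> \\d, \\w, \\s match ASCII characters only
                        (comparable to PCRE without PCRE_UCP).
    ascii_only=False -> the classes follow Unicode properties
                        (comparable to PCRE with PCRE_UCP set).
    """
    flags = re.ASCII if ascii_only else 0
    return re.fullmatch(pattern, text, flags) is not None


if __name__ == "__main__":
    for pattern, text in ((r"\w", WORD), (r"\d", DIGIT)):
        print(
            pattern,
            ascii(text),
            "ascii-only:", matches(pattern, text, True),
            "unicode-aware:", matches(pattern, text, False),
        )
```

In both cases the ASCII-only mode rejects the non-ASCII character and the Unicode-aware mode accepts it, which is the behaviour change bug #52971 asked for under the "u" modifier.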