Newsgroups: php.internals
From: cschneid@cschneid.com (Christian Schneider)
Date: Tue, 23 Feb 2016 15:34:04 +0100
To: Anatol Belski
Cc: Ángel González, PHP internals
Subject: Re: [PHP-DEV] PCRE jit bug with UTF-8 and lookbehind assertion

On 21.02.2016 at 11:42, Anatol Belski wrote:
>> -----Original Message-----
>> From: Ángel González [mailto:keisial@gmail.com]
>> Sent: Sunday, February 21, 2016 1:27 AM
>> To: Anatol Belski
>> Cc: 'Christian Schneider'; 'PHP internals'
>> Subject: Re: [PHP-DEV] PCRE jit bug with UTF-8 and lookbehind assertion
>>
>> On 19/02/16 09:20, Anatol Belski wrote:
>>> Could you please write back what the output difference between those
>>> two commands is? Thanks. Anatol
>> In the first case, it correctly outputs «x°z» (78 c2 b0 7a). With jit enabled it
>> produces «x z» (78 c2 7a). That is, it is only outputting the first byte (c2) of the
>> UTF-8 encoding of the U+00B0 character. Tested on PHP 7.0.3 using the system libpcre
>> 8.38.
>>
> Were you putting the snippets into a file or testing on the console? I
> had an issue while testing this on the console: some chars were
> partially swallowed by the terminal (which was a UTF-8 terminal). When
> putting them into a file, the output is the same for both: "x°z". Please
> see also the continued discussion in the original ticket
> https://bugs.exim.org/show_bug.cgi?id=1189 . The offsets delivered by
> PCRE also seem to be correct, and valgrind doesn't find anything. It
> would be great if you could confirm these insights.

I can reproduce it in a console and in a file.

PCRE Library Version => 8.38 2015-11-23

I also reproduced it with a C program directly using the system PCRE
library, no PHP involved.
I attached the C source to https://bugs.exim.org/show_bug.cgi?id=1189

Regards,
- Chris
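
Since part of the thread turns on whether the terminal swallows bytes, the two byte sequences quoted above (78 c2 b0 7a and 78 c2 7a) can be compared without relying on how a terminal renders them. The following is only a minimal sketch of that check: it does not call PCRE or PHP, and the helper names `hex_bytes` and `is_valid_utf8` are made up for illustration.

```python
def hex_bytes(data: bytes) -> str:
    """Return the bytes as space-separated lowercase hex, e.g. '78 c2 b0 7a'."""
    return " ".join(f"{b:02x}" for b in data)


def is_valid_utf8(data: bytes) -> bool:
    """Report whether the data decodes cleanly as UTF-8."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


# Byte sequences as quoted in the thread (not produced by running PCRE here).
expected = bytes.fromhex("78c2b07a")  # "x°z": U+00B0 is encoded as c2 b0
with_jit = bytes.fromhex("78c27a")    # the continuation byte b0 is missing

for label, data in (("expected", expected), ("with jit", with_jit)):
    print(f"{label}: {hex_bytes(data)} valid_utf8={is_valid_utf8(data)}")
```

Dumping the raw bytes this way (or redirecting output to a file and inspecting it with a hex viewer) sidesteps the terminal entirely: the truncated sequence is not valid UTF-8, which may explain why a UTF-8 terminal displays it inconsistently.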