Newsgroups: php.internals
From: david.zuelke@bitextender.com (David Zülke)
To: Rasmus Lerdorf
Cc: "William A. Rowe, Jr.", Stanislav Malyshev, "'PHP Internals'"
Subject: Re: [PHP-DEV] bug #43941
Date: Thu, 21 Aug 2008 19:08:40 +0200

On 21.08.2008 at 19:03, Rasmus Lerdorf wrote:

> David Zülke wrote:
>> Interesting. I assume that was a weakness in the respective
>> implementation, right? Since
>>
>> 0xE0 " >
>>
>> should never be regarded a valid sequence since neither " nor > are in
>> the range above 0x7F...
>
> But that's what we are talking about. What to do with invalid
> sequences. The E0 says that the following 2 bytes are part of the
> UTF-8 character. So this is a 3-byte sequence. Together these 3
> bytes are not valid, so Microsoft chose to replace those 3 with some
> other character. And yes, Microsoft is notoriously bad at reading
> specs, but I don't think it is completely clear what to do here, but
> we do know that we shouldn't do that.

Well, to me the invalid part would be just 0xE0, since it is incomplete
(0x7F and below are never part of multi-byte sequences, so they don't
count towards the sequence here). So 0xE0 would be replaced by
0xEF 0xBF 0xBD (U+FFFD encoded as UTF-8), and then you don't have an XSS,
unless I'm mistaken :)

If Microsoft regards 0x7F and below as valid sequence members, then
that is unfortunate, but that shouldn't stop PHP from doing it
properly, as we all know better now, don't we :)

I mean, what does the patch do in that case? Strip it all? Then it's
the same problem. Strip 0xE0 only? Then we could just as well insert
U+FFFD instead. No difference.

David
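
To make the "replace only 0xE0" idea concrete, here is a minimal sketch of the behaviour described above. It is written in Python rather than PHP purely for illustration, it is not what the patch for bug #43941 does, and the names decode_lenient, expected_length and try_decode are made up for this example. It is also simplified: it emits one U+FFFD per offending byte, which is safe but not the only reasonable policy.

```python
REPLACEMENT = "\ufffd"  # U+FFFD, which encodes to 0xEF 0xBF 0xBD in UTF-8


def expected_length(lead: int) -> int:
    """Sequence length announced by a lead byte, or 0 if it cannot start one."""
    if lead <= 0x7F:
        return 1
    if 0xC2 <= lead <= 0xDF:
        return 2
    if 0xE0 <= lead <= 0xEF:
        return 3
    if 0xF0 <= lead <= 0xF4:
        return 4
    return 0  # stray continuation byte, or a byte that is never valid


def try_decode(chunk: bytes):
    """Strictly decode one candidate sequence; None if it is ill-formed."""
    try:
        return chunk.decode("utf-8")
    except UnicodeDecodeError:
        return None


def decode_lenient(data: bytes) -> str:
    """Decode UTF-8, replacing only the offending byte with U+FFFD.

    Bytes 0x00-0x7F are never swallowed into a broken multi-byte
    sequence: after a bad lead byte, decoding resumes at the next byte.
    """
    out = []
    i = 0
    while i < len(data):
        n = expected_length(data[i])
        text = try_decode(data[i:i + n]) if n else None
        if text is not None:
            out.append(text)
            i += n
        else:
            out.append(REPLACEMENT)
            i += 1  # skip the bad byte only, then re-examine what follows
    return "".join(out)
```

With the input from this thread, decode_lenient(b'\xe0">') gives U+FFFD followed by the untouched " and >, so any HTML escaping applied afterwards still sees both characters.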