Newsgroups: php.internals
To: Rasmus Lerdorf, Stas Malyshev
CC: Yasuo Ohgaki, "internals@lists.php.net"
Date: Wed, 11 Apr 2012 13:38:58 -0400
Thread-Topic: [PHP-DEV] Re: Disabling PHP tags by php.ini and CLI options
References: <4F850D06.10701@sugarcrm.com> <4F8515AF.8060706@sugarcrm.com> <4F851FE4.7000706@sugarcrm.com> <4F8539E0.1090701@sugarcrm.com> <4F859063.1010401@lerdorf.com>
In-Reply-To: <4F859063.1010401@lerdorf.com>
Subject: RE: [PHP-DEV] Re: Disabling PHP tags by php.ini and CLI options
From: johncrenshaw@priacta.com (John Crenshaw)

From: Rasmus Lerdorf [mailto:rasmus@lerdorf.com]
> I guess he is saying that it prevents:
>
> Random bytes
>
> More random bytes
>
> Where random bytes might be an image file so finfo_file() might identify it as a valid image

Right, but anyone can trivially construct a fully valid bitmap with a starting byte sequence of `42 4D 3B 2F 2A`, which resolves to `BM;/*`. PHP will decide that BM meant 'BM', effectively skipping it, then the open comment will slide the PHP interpreter past any remaining header stuff. You can close the comment and place the actual code payload anywhere in the image data. The early bytes in other image formats are similarly exploitable. As far as I can tell there is really no security win here.

> 4. Only protecting against mid-script injections and not top-of-script injections is a somewhat subtle concept when the real problem is the vulnerable include $_GET['filename'] hole. If this really is a prevalent problem, maybe instead of trying to mitigate the symptoms, why don't we try to attack the actual cause of the problem. I would love to hear some ideas along those lines that don't fundamentally change the nature of PHP for somewhat cloudy benefits.
>
> -Rasmus

It's disturbingly common. Probably 90% of the automated attacks I see in the 404 error logs are trying to exploit various inclusion vulnerabilities.

One idea that comes to mind immediately is the old taint RFC: https://wiki.php.net/rfc/taint. This doesn't actually prevent LFI, but it (optionally) warns the developer that they did something very bad, regardless of whether it actually caused a problem with the specific input data.
I'd really love to see that one finalized and implemented.

Another wild alternative could be to have a non-trivial string format internally, where PHP strings are actually a set of distinct blocks which each contain encoding information. This would make it possible to concatenate strings just as always, but since the attributes of each block are known, the entire string contents could be converted to an arbitrary final encoding (or rejected as impossible to safely convert) when the string is actually used. In the include case this isn't really very different from taint, because safe conversion is impossible, but for things like XSS and SQL injection it could actually *fix* the otherwise vulnerable code. A simplified example of how this might work:

    http://example.com?name=%3Cscript%3Exss()%3B%3C%2Fscript%3E

    // $_GET['name'] === [text&user&utf8('<script>xss();</script>')];
    $name = $_GET['name'];
    $welcome = html("Welcome $name!");
    // $welcome === [html('Welcome '), text&user&utf8('<script>xss();</script>'), html('!')];
    echo $welcome;
    // assuming the current output format is text/html, the output will be
    // "Welcome &lt;script&gt;xss();&lt;/script&gt;!"

Obviously this second idea is probably a prohibitively large change, there is some BC break (especially where an input was known to be HTML but secured via something like HTMLPurifier), and there are huge open questions (like how to handle string comparison). Still, I think it is interesting because it actually divines the real meaning. The intent of the above code is obvious to a developer, and something like this could bring that understanding to the final result. This specific concept has issues, but maybe it gives someone else a more practical idea.

John Crenshaw
Priacta, Inc.
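
For anyone who wants to poke at the block-string idea above, here is a rough userland sketch of it. This is only an illustration under loud assumptions: the Segment and SegmentedString classes, the userText() helper, and this html() function are all made up for the example, nothing like them exists in PHP itself, and a real implementation would have to live inside the engine's string type rather than in a class.

```php
<?php
// Hypothetical sketch: none of these names exist in PHP itself.

final class Segment
{
    public string $bytes;
    public string $kind;    // 'html' = trusted markup, 'text' = plain text
    public bool $fromUser;  // true when the bytes came from request input

    public function __construct(string $bytes, string $kind, bool $fromUser)
    {
        $this->bytes = $bytes;
        $this->kind = $kind;
        $this->fromUser = $fromUser;
    }
}

final class SegmentedString
{
    /** @var Segment[] */
    private array $segments;

    public function __construct(Segment ...$segments)
    {
        $this->segments = $segments;
    }

    // Concatenation keeps every block's attributes instead of flattening them.
    public function concat(SegmentedString $other): SegmentedString
    {
        return new SegmentedString(...$this->segments, ...$other->segments);
    }

    // Conversion happens only at the point of use, one block at a time.
    public function toHtml(): string
    {
        $out = '';
        foreach ($this->segments as $segment) {
            $out .= $segment->kind === 'html'
                ? $segment->bytes
                : htmlspecialchars($segment->bytes, ENT_QUOTES, 'UTF-8');
        }
        return $out;
    }

    // For include paths no safe conversion exists, so user data is rejected.
    public function toIncludePath(): string
    {
        $path = '';
        foreach ($this->segments as $segment) {
            if ($segment->fromUser) {
                throw new RuntimeException('user-supplied data in include path');
            }
            $path .= $segment->bytes;
        }
        return $path;
    }
}

// Trusted markup written by the developer.
function html(string $s): SegmentedString
{
    return new SegmentedString(new Segment($s, 'html', false));
}

// Plain text that arrived from the request, e.g. $_GET['name'].
function userText(string $s): SegmentedString
{
    return new SegmentedString(new Segment($s, 'text', true));
}

$name = userText('<script>xss();</script>');
$welcome = html('Welcome ')->concat($name)->concat(html('!'));

echo $welcome->toHtml(), "\n";
// prints: Welcome &lt;script&gt;xss();&lt;/script&gt;!
```

The point of the sketch is that escaping is decided per block at output time, so the developer-written markup passes through untouched while the user-supplied block is encoded. Calling toIncludePath() on anything containing a user block throws, which mirrors the "rejected as impossible to safely convert" case and is about as far as taint would get you too.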