Newsgroups: php.internals Path: news.php.net Xref: news.php.net php.internals:87399 Return-Path: Mailing-List: contact internals-help@lists.php.net; run by ezmlm Delivered-To: mailing list internals@lists.php.net Received: (qmail 11036 invoked from network); 30 Jul 2015 13:12:42 -0000 Received: from unknown (HELO lists.php.net) (127.0.0.1) by localhost with SMTP; 30 Jul 2015 13:12:42 -0000 Authentication-Results: pb1.pair.com header.from=rrichards@cdatazone.org; sender-id=softfail Authentication-Results: pb1.pair.com smtp.mail=rrichards@cdatazone.org; spf=softfail; sender-id=softfail Received-SPF: softfail (pb1.pair.com: domain cdatazone.org does not designate 216.22.18.221 as permitted sender) X-PHP-List-Original-Sender: rrichards@cdatazone.org X-Host-Fingerprint: 216.22.18.221 b221.a.smtp2go.com Linux 2.6 Received: from [216.22.18.221] ([216.22.18.221:54647] helo=b221.a.smtp2go.com) by pb1.pair.com (ecelerity 2.1.1.9-wez r(12769M)) with ESMTP id EB/B2-31830-9C22AB55 for ; Thu, 30 Jul 2015 09:12:42 -0400 To: Stanislav Malyshev , Anthony Ferrara , "internals@lists.php.net" References: <55B94D57.4070509@gmail.com> Message-ID: <55BA22C5.2020104@cdatazone.org> Date: Thu, 30 Jul 2015 09:12:37 -0400 User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.10; rv:38.0) Gecko/20100101 Thunderbird/38.1.0 MIME-Version: 1.0 In-Reply-To: <55B94D57.4070509@gmail.com> Content-Type: text/plain; charset=utf-8; format=flowed Content-Transfer-Encoding: 7bit Subject: Re: [PHP-DEV] Disabling External Entities in libxml By Default From: rrichards@cdatazone.org (Rob Richards) On 7/29/15 6:01 PM, Stanislav Malyshev wrote: > Hi! > >> Currently, PHP by default is vulnerable to XXE attacks: >> https://www.owasp.org/index.php/XML_External_Entity_(XXE)_Processing >> >> To bypass this, you need to turn off external entity loading: >> >> libxml_disable_entity_loader(true); > AFAIR right now, due to how it is implemented, this blocks loading XML > content from files with something like XMLReader::open() - due to the > use of the same code path by both. It may have changes since last time I > looked, but it definitely was a major reason why default stayed that > way. What people did is something like that: > > libxml_disable_entity_loader( false ); > $reader->open( $filename ); > libxml_disable_entity_loader( true ); > > I imagine we could do better. But we need to be careful - if we just set > it as disabled, we could break a lot of unsuspecting apps that do > nothing more that reading XML files. > It hasn't changed since then. I've gone back and forth on this in the past. I've looked at both allowing the initial file to be read as well as potentially adding support for file resources. The latter requires app changes anyways so no big advantage there, tho would get the job done. I'm still going back and forth on it as a few things come to mind. If you are already working with a trusted document then you should safely be able to disable the entity loader. If you aren't then wouldn't you want to do some sort of checking (especially if you dont have an XML gateway fronting the system) for other malicious things before even opening the document regardless if it has external entities or not. The intial doc is really an external at that point. In our systems the entity loader is disabled right off the bat so if any developer needs to work with xml anywhere else in the system, they have to explicitly enable it, load and then disable it. Requires a conscious effort on their part as well as makes it easier to audit their usage of it and the trustworthiness of their data. imo it might be better to leave the entity loader function as is and introduce a new function which can be enabled by default yet allow for the initial data to be read yet not allow any external loading from there. This way you are not going to have to relax the security aspect of the current function (where the current function always overrides the ability to read the initial data if the entity loader is disabled). One thing to be careful of tho is the breakage of unsuspecting apps. For anyone who relies on external DTD, schemas, xslt includes, etc.. even within their own environment, if they are not currently using the entity loader function now, they are certain to break. Whichever direction the consensus is for this I'm more than willing to help implement. Anyways just my 2 cents. Rob