Newsgroups: php.internals
From: rrichards@cdatazone.org (Rob Richards)
To: Rowan Collins, internals@lists.php.net
Date: Thu, 30 Jul 2015 16:35:01 -0400
Subject: Re: [PHP-DEV] Disabling External Entities in libxml By Default
Message-ID: <55BA8A75.30208@cdatazone.org>
In-Reply-To: <55BA34EF.6030209@gmail.com>

On 7/30/15 10:30 AM, Rowan Collins wrote:
> Rob Richards wrote on 30/07/2015 14:12:
>> If you are already working with a trusted document then you should
>> safely be able to disable the entity loader.
>> If you aren't then wouldn't you want to do some sort of checking
>> (especially if you don't have an XML gateway fronting the system)
>> for other malicious things before even opening the document
>> regardless if it has external entities or not.
>
> Can you give any pointers to what kind of checking this would be, and
> how it would be carried out without parsing the XML document in the
> first place?
>
> According to the bug report, one of the affected uses is the
> SoapClient, which by definition is dealing with remote data. I can see
> how that could be considered "untrusted", but I can't think of any
> particular action that would make it more trusted (quite apart from
> the lack of an obvious point to intercept the data before it is parsed).
>
> Would it not make more sense for the parser to operate in an
> "untrusted" mode - disabling external entities, maybe different limits
> on stack depth, etc?
>
> Regards,

It all depends on what you are trying to accomplish, as this covers tree, streaming, different types of schemas, XSL, etc. For example, you can easily check whether there is a DTD, imports/includes, or specific XSLT functionality (the list goes on and on) without ever having to load the document. There really is no one-size-fits-all IMO, so what one person considers untrusted someone else would consider trusted.
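
As a rough illustration of the kind of pre-check mentioned above, here is a minimal sketch that streams through a document and reports whether it has a DTD or contains XSLT/XSD import/include elements, without building a tree. It is written in Python with the standard expat binding purely so it can be run as-is; the function name `precheck_xml`, the shape of the returned dict, and the choice to stop at the first DOCTYPE are assumptions made for this example, not anything proposed in the thread. The same idea could presumably be expressed with a streaming reader in PHP.

```python
import xml.parsers.expat

# Namespaces whose import/include elements pull in other resources.
XSLT_NS = "http://www.w3.org/1999/XSL/Transform"
XSD_NS = "http://www.w3.org/2001/XMLSchema"
REFERENCING_ELEMENTS = {
    (XSLT_NS, "import"), (XSLT_NS, "include"),
    (XSD_NS, "import"), (XSD_NS, "include"),
}


class _DoctypeFound(Exception):
    """Raised from a handler to abort parsing as soon as a DOCTYPE appears."""


def precheck_xml(data: bytes) -> dict:
    """Hypothetical pre-check: scan `data` in one streaming pass.

    Malformed input raises xml.parsers.expat.ExpatError to the caller.
    """
    findings = {"has_doctype": False, "references": []}
    # A separator makes expat report element names as "<namespace-uri> <local-name>".
    parser = xml.parsers.expat.ParserCreate(namespace_separator=" ")

    def on_doctype(name, system_id, public_id, has_internal_subset):
        findings["has_doctype"] = True
        # Stop here so no entity declared in the DTD is ever expanded.
        raise _DoctypeFound()

    def on_start_element(name, attrs):
        uri, _, local = name.rpartition(" ")
        if (uri, local) in REFERENCING_ELEMENTS:
            target = attrs.get("href") or attrs.get("schemaLocation")
            findings["references"].append((local, target))

    parser.StartDoctypeDeclHandler = on_doctype
    parser.StartElementHandler = on_start_element
    try:
        parser.Parse(data, True)
    except _DoctypeFound:
        pass
    return findings
```

What a caller does with the findings (reject, log, or go on to parse with the entity loader disabled) is exactly the policy decision that differs from one application to another.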