Newsgroups: php.internals Path: news.php.net Xref: news.php.net php.internals:72815 Return-Path: Mailing-List: contact internals-help@lists.php.net; run by ezmlm Delivered-To: mailing list internals@lists.php.net Received: (qmail 47038 invoked from network); 26 Feb 2014 06:15:47 -0000 Received: from unknown (HELO lists.php.net) (127.0.0.1) by localhost with SMTP; 26 Feb 2014 06:15:47 -0000 Authentication-Results: pb1.pair.com header.from=yohgaki@gmail.com; sender-id=pass Authentication-Results: pb1.pair.com smtp.mail=yohgaki@gmail.com; spf=pass; sender-id=pass Received-SPF: pass (pb1.pair.com: domain gmail.com designates 209.85.215.46 as permitted sender) X-PHP-List-Original-Sender: yohgaki@gmail.com X-Host-Fingerprint: 209.85.215.46 mail-la0-f46.google.com Received: from [209.85.215.46] ([209.85.215.46:42605] helo=mail-la0-f46.google.com) by pb1.pair.com (ecelerity 2.1.1.9-wez r(12769M)) with ESMTP id 27/C9-28538-0968D035 for ; Wed, 26 Feb 2014 01:15:46 -0500 Received: by mail-la0-f46.google.com with SMTP id hr17so289482lab.33 for ; Tue, 25 Feb 2014 22:15:42 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:sender:in-reply-to:references:from:date:message-id :subject:to:cc:content-type; bh=spTGq8X6mbBZxMXKraH52N+3S4LZp4LmxO1Aiw/fKJY=; b=IYsaq6xopNODABSlmZJLNlBxERaM63rGOVq37E64g4dx8MOR6kTkPw+px1IYU9JSOn Kf4W5yjXXjH6o3Vt95R+IInx91/J0rJaUsWGPNF+2nKfoSgAhTBqHHEARTzSYDMwg1XL EA0p0bB8ZTeB+ZzdxtpLHgctQ5u4L+ItUolPYDcJuhhf2dJmcqk6c3xkJLjhPYg2B7zM 1bZdZwaB97oYViEfFZnobPqmpFtZkEExup1trpl+iEeYx0Y5q7BMtLz5l6l/eQd2Hwuk 4IJYi+evEJmXZbRaS0K4Dm/lwx0nkVcbdluEPZzQ2ckzrDNc28zMzYec7/qV+7yZJcqe 7pMw== X-Received: by 10.112.164.5 with SMTP id ym5mr339533lbb.48.1393395341961; Tue, 25 Feb 2014 22:15:41 -0800 (PST) MIME-Version: 1.0 Sender: yohgaki@gmail.com Received: by 10.112.199.37 with HTTP; Tue, 25 Feb 2014 22:15:01 -0800 (PST) In-Reply-To: <1393328380.5233.45.camel@guybrush> References: <530C3C7B.8080907@sugarcrm.com> <530C77F8.2060809@sugarcrm.com> <1393328380.5233.45.camel@guybrush> Date: Wed, 26 Feb 2014 15:15:01 +0900 X-Google-Sender-Auth: zc4oM6OrovBZsM0ZX5Dzug1RfNQ Message-ID: To: =?UTF-8?Q?Johannes_Schl=C3=BCter?= Cc: Stas Malyshev , "internals@lists.php.net" Content-Type: multipart/alternative; boundary=001a11c2681863add704f3492276 Subject: Re: [PHP-DEV] Resolution for ver_export()/addslashes() encoding based script execution attack? From: yohgaki@ohgaki.net (Yasuo Ohgaki) --001a11c2681863add704f3492276 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: quoted-printable Hi Johannes, On Tue, Feb 25, 2014 at 8:39 PM, Johannes Schl=C3=BCter wrote: > On Tue, 2014-02-25 at 13:01 +0200, Stas Malyshev wrote: > > Hi! > > > > > Engine supports SJIS/other encoding script execution. Therefore, > > > > I'm not sure what you mean by "other encoding script execution". Engine > > execution is the same regardless of the data you use, nowhere in the > > opcodes there's any encoding that is important. > > I think Yasuo refers to > php -d zend.script_encoding=3DSJIS > Yes, it is. There are users that uses SJIS and SJIS like encoding for PHP scripts. This is nice feature for Windows users in east Asia. > > addslashes()/var_export() behavior is security vulnerability just like > > > encoding based SQL/JavaScript injections. > > > > I don't see how it is "therefore". If there's SQL injection or JS > > injection that is not the result of wrong code, then let's outline them > > and consider them. Specifically for var_export, which behavior is > > broken? I see no mention of the specific examples in the RFC. > > i think Yasuo assumes that results from those operations with a > script_encoding setting would be handled "correctly". > > I don't think we can do that, though. zend.script_encoding is a hardly > documented feature which should be used with care. > > The documentation of addslashes() refers to "characters". I don't think > the behavior should depend on PHP settings but like all "basic" > functions assume on ASCII compatible single byte strings. Adding magic > there makes it more confusing and harder to use. As said in a previous > discussion on this topic rather than using addslashes users should use > context-aware escaping functions. Using addslashes is almost always a > bad idea. > I think many users realize this thanks to magic_quotes_gpc ;) Therefore, many users know there would be issue with addslashes(), but not for var_export(), I suppose. The situation around var_export() is a bit more complicated. > var_export() is used to create application configuration, cache data > etc. so one might expect the PHP which created that to be able to read > that, again. Doing this isn't easy, though, as it makes the generated > file non-portable. > It would be portable as long as user uses correct encoding. The default would be UTF-8 for 5.6 and ASCII for others in case if we modify addslashes()/var_export(). (If mb_*(), internal_encoding setting is used) Users who are not affected by this issue can use the default. For affected users, all we have to do is ask users to specify proper encoding so that php_addslashes() would not break structure of var_export() data for certain encoding. As you know, all databases' escaping functions have encoding parameter. PostgreSQL uses encoding parameter stored in db connection structure. This is the reason why pg_escape_string() has optional database base connection parameter for escaping. Adding "magic" into probably security relevant features is problematic > and unless the engine is truly encoding-aware I'd abstain from such > changes. I agree. For example, pgsq/mysql/mysqli module does some magic by using and assuming default connection's encoding parameter may be used to escape string properly. If users are using multiple connections with different encoding, it may cause problem. It wouldn't cause problem almost always because chances are rare that users are using multiple encoding at once, though. Anyway, if we add mb_*() or modify existing functions, user has to specify correct encoding when it differs from default encoding ("default_charset" for 5.6 and up, mb_internal_encoding for less) Although there would be code duplication, adding mb_*() seems to be the best choice. Duplication may be removed with a little changes in API. This cannot happen in released versions, but it is possible for 5.6. I should write this in the RFC. If this is fixed in all released versions, it's nicer for users and good for us, but I'm not against fix this issue only in 5.6 and warn affected users hoping nobody will be attacked. Regards, -- Yasuo Ohgaki yohgaki@ohgaki.net --001a11c2681863add704f3492276--