Newsgroups: php.internals
Date: Wed, 17 Aug 2016 20:43:10 +0900
From: yohgaki@ohgaki.net (Yasuo Ohgaki)
To: Stanislav Malyshev
Cc: Marco Pivetta, Dan Ackroyd, PHP Internals List
Subject: Re: [PHP-DEV] Re: [RFC][VOTE] Add validation functions to filter module

Hi Stas,

On Wed, Aug
17, 2016 at 5:33 PM, Stanislav Malyshev wrote:
>> Let's say your app validate user written/chosen "Date" on client side by
>> JavaScript. Then browser must send whatever "Date" format you impose
>> to client. It may be "YYYYMMDD", for example.
>
> I'm not sure what Javascript has to do with it. Many apps don't have any
> client-side and have little to do with Javascript. Assuming that whole
> world is browser applications running Javascript (controlled by you)
> would be a big mistake.

I assume you wrote your JavaScript code to impose a certain format for
"date", "phone", "zip", etc. It is not my JavaScript code but yours that
defines what the browser sends to your PHP web apps.

>
>> Then programer should not accept "Date" format other than "YYYYMMDD"
>> because other format is invalid. Accepting format other than "YYYYMMDD"
>> does only bad and increase risks of program malfunctioning. i.e. All kinds
>> of injections like JavaScript, SQL, Null char, Newline, etc.
>
> What you mean by "accept" here? I think you are under impression (please

"Accept" means allowing the program to process the input data, i.e. to
continue execution.

> correct me if I'm wrong) that there are only two ways for application to
> work - either treating all inputs equally, or bailing out immediately
> when incorrect input is detected. However, this is not the case, there
> are many other ways for application to handle the situation of invalid
> input - while knowing it is invalid - and exact manner of this handling
> is application-dependent.

If your JavaScript date picker uses the "YYYYMMDD" format (a date like
20160817), anything other than "YYYYMMDD" is attacker-tampered input.
"Valid input" can be taken to mean the input expected from _legitimate_
users. Anything else should not be accepted, because it comes from
non-legitimate users, i.e. attackers.
- Broken encoding
- CNTRL chars
- Bad format (YYYYMMDD is the format in this case)
- Too long or too short (exactly 8 chars is the length in this case)
- and so on

are examples of invalid inputs.

>
>> "Input validation" should reject all of them and does not have to inform users
>> (attackers) to "there is invalid input". If you need to tell legitimate users
>
> I think we disagree here. I think not doing this makes my work as a
> developer much much harder.

It may increase your work, but you get less risk in return. It is all
about avoiding or mitigating possible risks at some additional cost.

I know you have fixed many vulnerabilities in PHP. What is the best way
to avoid broken character encoding attacks against libraries? Validating
the string's character encoding is the best way, because nobody can
guarantee correct behavior with broken encoding across a whole system;
there is a lot of code that misbehaves with broken encodings.

Software changes continuously. Even if code were 100% free of broken
encoding attacks at some point, it could become vulnerable after a
software version upgrade.

>
>> "There is invalid input", then it should be treated by "Business logic", not by
>> "Input validation".
>
> Wait, input validation happens before business logic has a chance to
> run, so if input validation bails, how business logic can treat anything?

Input validation only rejects invalid input. If you use a plain text
input for "date", then you should treat any valid UTF-8 string without
CNTRL chars, up to 100 chars or so, as valid input, rather than requiring
"YYYYMMDD" (assuming UTF-8 is the encoding).

>> "User name" and "Password" shouldn't have CNTRL chars or invalid char
>> encoding. Even when fields are plain text inputs, there shouldn't be 500 chars
>> long inputs for them.
>
> So your proposal seems to be having two input checking procedures
> instead of one. I don't think people would find it very useful to have
> two separate input checking procedures.

If you blindly follow the best practice "control/validate all inputs",
then inputs like the previously mentioned browser request headers

- Invalid REFERER that contains illegal/CNTRL chars and/or too many chars.
- Invalid ACCEPT-CHARSET that contains illegal/CNTRL chars and/or too many chars.
- Invalid ACCEPT-ENCODING that contains illegal/CNTRL chars and/or too many chars.
- Invalid ACCEPT-LANGUAGE that contains illegal/CNTRL chars and/or too many chars.
- and so on.

must be validated. I don't think that is a business logic job, since it
is unrelated to most business logic.

Attackers try to tamper with any input that the software accepts.
Therefore, developers need to reduce the possible attack paths as much as
possible, i.e. close doors, and don't open doors needlessly.

The code may not use these inputs at all, but nobody can be sure they
will never be used when a group of people is developing the software.

Input validation is better done as early as possible. The exception is
canonicalization. Otherwise, SSRF could be possible via the Controller in
an MVC architecture, for instance.

You may be ignoring unused inputs and/or accepting inputs, but there are
applications that require rock-solid security.

This input validation is a mitigation for "Oops!" moments, and people do
have them on occasion, as we do. It is a very strong mitigation. That is
why input validation is listed as the #1 item.

Software design is up to developers. There is a lot of software that does
not follow best practices. Nobody is forced to use the validator the way
I describe. It is fine with me if it is used only by the users who need
it. As I mentioned, ISO 27000/ISMS requires this kind of validation, so
quite a few users may need it.

Regards,

--
Yasuo Ohgaki
yohgaki@ohgaki.net
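
The strict check described in this message for a "YYYYMMDD" date field (valid encoding, no control characters, exact length, exact format) could be sketched roughly as follows. This is only an illustration in Python, not the validator proposed in the RFC and not PHP's filter API; the function name `validate_yyyymmdd` and the extra calendar check are assumptions made for the example.

```python
import re
from datetime import datetime

# Hypothetical helper for illustration only; not part of the RFC
# or of PHP's filter module.
_DATE_RE = re.compile(r"[0-9]{8}")


def validate_yyyymmdd(raw: bytes) -> str:
    """Return the date string if it is a valid YYYYMMDD value, else raise ValueError."""
    # Length: exactly 8 bytes are expected for this field.
    if len(raw) != 8:
        raise ValueError("bad length")
    # Encoding: broken UTF-8 is rejected outright.
    try:
        text = raw.decode("utf-8")
    except UnicodeDecodeError:
        raise ValueError("broken encoding") from None
    # Format: eight ASCII digits only, which also excludes control characters.
    if _DATE_RE.fullmatch(text) is None:
        raise ValueError("bad format")
    # Calendar check (an added assumption): reject values such as 20160231.
    datetime.strptime(text, "%Y%m%d")
    return text
```

A caller following the approach argued for above would presumably catch `ValueError` and stop processing the request, without reporting which check failed.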