Newsgroups: php.internals
Path: news.php.net
Xref: news.php.net php.internals:95261
Mailing-List: contact internals-help@lists.php.net; run by ezmlm
Received-SPF: pass (pb1.pair.com: domain ohgaki.net designates 180.42.98.130 as permitted sender)
MIME-Version: 1.0
In-Reply-To: <fe3ccc4c-e81e-34fc-3257-19bd943e4abe@gmail.com>
References: <CAGa2bXarw2jTj0yhXpywWNQOJLR2pdhfaqSdwj_btT2giEL15Q@mail.gmail.com>
 <CAGa2bXadVagTwBzEym9PZROmXu3SKxg=huCAQNK3k1-jZeG_pg@mail.gmail.com>
 <CAGa2bXY-vmEgEFrE0OxQPSx=FKHeHdCayrDXxtTkHUVaJbV+-A@mail.gmail.com>
 <CA+kxMuRiOBQpmTeKqNyV8rX0GKCLrYixi--y5TcYUkdqpT746w@mail.gmail.com>
 <CAGa2bXa78fkZw8gtepmBDu+VDwHW_WiDDv65=zzgLTFqYxF+DA@mail.gmail.com>
 <CADyq6s+Y-OeKEyMus5s7xp9MqY7Fq7udqd5BRwtyGXbYCY1Uag@mail.gmail.com>
 <CADyq6sJuHCrLOoL22MWHv+cfrQQx9z=avswCpDVmnuyBCAchsA@mail.gmail.com>
 <CADyq6sKs=e_1Pc_CCLuThsU-qsvRseQcpx=sWKB_uhWsg=aRmQ@mail.gmail.com>
 <7795ca21-bd70-fe65-9519-af95fdfee33f@gmail.com> <CAGa2bXZMrO__dVd=qVoKjuPmcvVJ-q4xa8XKc0+aj1kH7Xd2Fw@mail.gmail.com>
 <fe3ccc4c-e81e-34fc-3257-19bd943e4abe@gmail.com>
Date: Wed, 17 Aug 2016 20:43:10 +0900
Message-ID: <CAGa2bXYy_tNiRfmy=jTsy-GALN4t8SDfxZ5vZocVCMw_7mMO6Q@mail.gmail.com>
To: Stanislav Malyshev <smalyshev@gmail.com>
Cc: Marco Pivetta <ocramius@gmail.com>, Dan Ackroyd <danack@basereality.com>, 
	PHP Internals List <internals@lists.php.net>
Content-Type: text/plain; charset=UTF-8
Subject: Re: [PHP-DEV] Re: [RFC][VOTE] Add validation functions to filter module
From: yohgaki@ohgaki.net (Yasuo Ohgaki)

Hi Stas,

On Wed, Aug 17, 2016 at 5:33 PM, Stanislav Malyshev <smalyshev@gmail.com> wrote:
>> Let's say your app validate user written/chosen "Date" on client side by
>> JavaScript. Then browser must send whatever "Date" format you impose
>> to client. It may be "YYYYMMDD", for example.
>
> I'm not sure what Javascript has to do with it. Many apps don't have any
> client-side and have little to do with Javascript. Assuming that whole
> world is browser applications running Javascript (controlled by you)
> would be a big mistake.

I assume you wrote your JavaScript code to impose a certain format for
"date", "phone", "zip", etc. It is your JavaScript code, not mine, that
defines what the browser sends to your PHP web apps.

>
>> Then programer should not accept "Date" format other than "YYYYMMDD"
>> because other format is invalid. Accepting format other than "YYYYMMDD"
>> does only bad and increase risks of program malfunctioning. i.e. All kinds
>> of injections like JavaScript, SQL, Null char, Newline, etc.
>
> What you mean by "accept" here? I think you are under impression (please

"Accept" means allowing the program to process the input data, i.e. to
continue execution.

> correct me if I'm wrong) that there are only two ways for application to
> work - either treating all inputs equally, or bailing out immediately
> when incorrect input is detected. However, this is not the case, there
> are many other ways for application to handle the situation of invalid
> input - while knowing it is invalid - and exact manner of this handling
> is application-dependent.

If your JavaScript date picker uses the "YYYYMMDD" format (a date like
20160817), anything other than "YYYYMMDD" is input that an attacker has
tampered with.

"Valid input" can be defined as the input expected from _legitimate_
users. Anything other than "valid input" should not be accepted because
it comes from non-legitimate users, i.e. attackers.

 - Broken encoding
 - CNTRL chars
 - Bad format ( YYYYMMDD is the format for this case )
 - Too long or short ( Exactly 8 chars is the length for this case )
 - and so on

are examples of invalid inputs.
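
The checks in the list above can be sketched roughly as follows. This is
only an illustration, not the proposed filter API: it is written in
Python as neutral notation, the function name is hypothetical, and the
final calendar check (rejecting e.g. February 30) is an extra assumption
beyond the "YYYYMMDD" format itself.

```python
import re
from datetime import date

# Hypothetical helper; name and exact rules are assumptions for illustration.
_EIGHT_ASCII_DIGITS = re.compile(r"[0-9]{8}")


def is_valid_yyyymmdd(raw: bytes) -> bool:
    """Return True only for a well-formed 8-character YYYYMMDD value."""
    try:
        text = raw.decode("utf-8")  # broken encoding is rejected here
    except UnicodeDecodeError:
        return False
    if len(text) != 8:  # too long or too short
        return False
    if _EIGHT_ASCII_DIGITS.fullmatch(text) is None:
        return False  # bad format; also excludes control characters
    try:
        # Assumption: the value must also be a real calendar date.
        date(int(text[:4]), int(text[4:6]), int(text[6:]))
    except ValueError:
        return False
    return True


print(is_valid_yyyymmdd(b"20160817"))    # True
print(is_valid_yyyymmdd(b"2016-08-17"))  # False
```

A caller would stop processing the request when this returns False,
without explaining to the sender which check failed.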

>
>> "Input validation" should reject all of them and does not have to inform users
>> (attackers) to "there is invalid input". If you need to tell  legitimate users
>
> I think we disagree here. I think not doing this makes my work as a
> developer much much harder.

It may increase your work, but you get less risk in return.
It's all about avoiding or mitigating possible risks at some additional
cost.

I know you've fixed many vulnerabilities in PHP. What's the best way to
avoid broken char encoding attacks in some libraries? Validating the
string's char encoding is the best way, as nobody can guarantee correct
behavior with broken char encoding everywhere in a system.

i.e. There is a lot of code that misbehaves with broken encodings.
Software changes continuously. Even if code were 100% free of broken
encoding attacks at a certain point, it could become vulnerable after
software version upgrades.


>
>> "There is invalid input", then it should be treated by "Business logic", not by
>> "Input validation".
>
> Wait, input validation happens before business logic has a chance to
> run, so if input validation bails, how business logic can treat anything?

Input validation only rejects invalid input.

If you use a plain <input> for "date", then you should treat any valid
UTF-8 string without CNTRL chars, up to 100 chars or so, as valid input,
rather than only "YYYYMMDD". (Assuming UTF-8 is the encoding)
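
For the plain <input> case, a minimal sketch of such a check might look
like this. Again Python is used only as neutral notation; the function
name is hypothetical, the limit of 100 is taken from "100 chars or so",
and treating Unicode category "Cc" as the set of CNTRL chars is an
assumption.

```python
import unicodedata

MAX_CHARS = 100  # assumption: "100 chars or so" from the text above


def is_valid_free_text(raw: bytes, max_chars: int = MAX_CHARS) -> bool:
    """Accept any valid UTF-8 string without control characters, up to a limit."""
    try:
        text = raw.decode("utf-8")  # rejects broken encoding
    except UnicodeDecodeError:
        return False
    if len(text) > max_chars:  # length counted in code points, not bytes
        return False
    # Category "Cc" covers the C0/C1 control characters (NUL, CR, LF, DEL, ...).
    return not any(unicodedata.category(ch) == "Cc" for ch in text)


print(is_valid_free_text("17 Aug 2016".encode("utf-8")))  # True
print(is_valid_free_text(b"2016\x00"))                    # False
```

Whether "17 Aug 2016" is a usable date is then a question for the
business logic; this check only decides whether the request is
processed at all.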

>> "User name" and "Password" shouldn't have CNTRL chars or invalid char
>> encoding. Even when fields are plain <input>, there shouldn't be 500 chars
>> long inputs for them.
>
> So your proposal seems to be having two input checking procedures
> instead of one. I don't think people would find it very useful to have
> two separate input checking procedures.

If you simply follow the best practice "Control/validate all inputs",
then the previously mentioned inputs such as browser request headers
must be validated too. Examples of invalid headers:

 - REFERER containing illegal/CNTRL chars and/or too many chars.
 - ACCEPT-CHARSET containing illegal/CNTRL chars and/or too many chars.
 - ACCEPT-ENCODING containing illegal/CNTRL chars and/or too many chars.
 - ACCEPT-LANGUAGE containing illegal/CNTRL chars and/or too many chars.
 - and so on.

I don't think this is a job for business logic, since these headers are
unrelated to most business logic.
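
A rough sketch of such a header check, done before any business logic
runs, could be the following. Python is used only as neutral notation;
the function name, the 512-char limit, and the "printable ASCII only"
rule are all assumptions for illustration (the rule is stricter than
what HTTP itself permits).

```python
from typing import Dict, List

MAX_HEADER_LEN = 512  # assumption: an arbitrary upper bound for this sketch

# Headers named in the list above, in their usual canonical spelling.
CHECKED_HEADERS = (
    "Referer",
    "Accept-Charset",
    "Accept-Encoding",
    "Accept-Language",
)


def invalid_headers(headers: Dict[str, str],
                    max_len: int = MAX_HEADER_LEN) -> List[str]:
    """Return the names of checked headers whose values look invalid."""
    bad = []
    for name in CHECKED_HEADERS:
        value = headers.get(name)
        if value is None:
            continue  # an absent header is not treated as invalid here
        # Assumption: only printable ASCII (0x20-0x7E) is allowed.
        printable = all(0x20 <= ord(ch) <= 0x7E for ch in value)
        if len(value) > max_len or not printable:
            bad.append(name)
    return bad


print(invalid_headers({"Referer": "https://example.com/\r\nX-Evil: 1"}))
# ['Referer']
```

If the returned list is not empty, the request would be rejected before
reaching the controller.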

Attackers try to tamper with any input that software accepts.
Therefore, developers need to reduce possible attack paths as much as
possible, i.e. close doors and don't open doors needlessly.
The code may not use these inputs at all, but nobody can guarantee they
will never be used when a group of people is developing the software.

Input validation is best done as early as possible. The exception is
canonicalization. Otherwise, SSRF could be possible via the Controller
in an MVC architecture, for instance.

You may be fine with ignoring unused inputs and/or accepting them, but
there are applications that require rock-solid security.

This input validation is a mitigation for "Oops!" mistakes, and people
make "Oops!" mistakes on occasion, just as we do. It is a very strong
mitigation for them. Therefore, input validation is listed as the #1
item.

Software design is up to developers. There is a lot of software that
does not follow best practices. Nobody is forced to use the validator
the way I explain. It's okay with me if it is used only by the users who
need it. As I mentioned, ISO 27000/ISMS requires this kind of
validation, so quite a few users may need it.

Regards,

--
Yasuo Ohgaki
yohgaki@ohgaki.net