Newsgroups: php.internals
Xref: news.php.net php.internals:100554
In-Reply-To: <15572C6C-4C44-4E76-9B1D-394E663FDBCD@koalephant.com>
References: <2a4491b4-e6f5-4297-beec-363f373a93e6@lsces.co.uk> <3f8be7b1-0e59-21c6-4fe8-8299b2c05645@rhsoft.net> <6ba62d62-f1ab-9e7b-93f0-a1a9238c47a6@lsces.co.uk> <0db9cfa3-2b31-ee41-713c-889b7cc06406@lsces.co.uk>
 <3C.DD.10715.4E501B95@pb1.pair.com> <93.85.10715.AB3B3B95@pb1.pair.com> <049578E9-4C9A-42D8-B206-8ABAF070E151@koalephant.com> <05A8DB1C-4683-4934-A7DA-C7CD71E6CCB6@koalephant.com> <15572C6C-4C44-4E76-9B1D-394E663FDBCD@koalephant.com>
Date: Wed, 13 Sep 2017 07:10:40 +0900
To: Stephen Reay
Cc: Tony Marston, "internals@lists.php.net"
Subject: Re: [PHP-DEV] A validator module for PHP7
From: yohgaki@ohgaki.net (Yasuo Ohgaki)

On Tue, Sep 12, 2017 at 1:04 PM, Stephen Reay wrote:

>
> On 12 Sep 2017, at 04:07, Yasuo Ohgaki wrote:
>
> Stephen,
>
> On Tue, Sep 12, 2017 at 12:22 AM, Stephen Reay wrote:
>
>>
>> On 11 Sep 2017, at 17:41, Yasuo Ohgaki wrote:
>>
>> Hi Stephen,
>>
>> On Mon, Sep 11, 2017 at 6:37 PM, Stephen Reay wrote:
>>
>> On 11 Sep 2017, at 15:42, Yasuo Ohgaki wrote:
>>
>> It seems you haven't tried to use the filter module seriously.
>> It simply does not have enough features for input validation.
>> e.g. You cannot validate "strings".
>>
>>
>> Yasuo,
>>
>> I’ve asked previously what your proposal actually offers over the filter
>> functions, and got no response, so please elaborate on this?
>>
>>
>>
>> Can you show a concrete example that cannot be validated in user land
>> currently, using the filter functions as a base?
>>
>>
>> FILTER_VALIDATE_REGEXP is simply not good enough.
>> PCRE is still known to be vulnerable to regex DoS (as is Oniguruma).
>> Users should also avoid regex validation whenever possible, to avoid
>> various risks.
>>
>> In addition, the current filter module does not provide nested array
>> validation, array key validation, etc. It's not true validation either.
>> It does not provide simple length or min/max validation. It does
>> implicit conversions (i.e.
>> trim), etc.
>> Length and min/max validation are mandatory if you would like to
>> follow the ISO 27000 requirements.
>>
>> Regards,
>>
>> --
>> Yasuo Ohgaki
>> yohgaki@ohgaki.net
>>
>>
>>
>> So, you still didn’t actually provide an example. I *guess* you’re
>> talking about character class validation or something else equally
>> “simple”, because I can’t imagine what else would be a common enough case
>> that you’d want to have built-in rules for, and that you wouldn’t
>> internally use RegExp to test anyway.
>>
>
> Your request is like a "devil's proof". Example code for something that
> cannot be done with the existing API cannot exist in any meaningful form.
> It can be explained why it cannot be done, though. Try what the "validate"
> string validator can do, then you'll see.
>
> $input = [
>   'defined_but_should_not_exist' => 'Developer should not allow unwanted value',
>   '_invalid_utf8_key_should_not_be_allowed_' => 'Developer should validate key value as well',
>   'utf8_text' => 'Validator should be able to allow UTF-8 and validate its validity at least',
>   'default_must_be_safe' => 'Crackers send all kinds of chars. CNTRL chars must not be allowed by default',
>   'array' => [
>     'complex' => 1,
>     'nested' => 'any validation rule should be able to be applied',
>     'array' => 1,
>     'key_should_be_validated_also' => 1,
>     'array' => [
>       'any_num_of_nesting' => 'is allowed',
>     ],
>   ],
>   'array_num_elements_must_be_validated' => [
>     "a", "b", "c", "d", "e", "f", "and so on", "values must be able to be validated as user wants",
>   ],
> ];
>
> There is currently no STRING validation filter. From this fact alone,
> it could be said that "filter cannot do string validation currently".
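As a rough illustration of the string checks the example input above calls for (well-formed UTF-8, no control characters, bounded length), a userland sketch might look like the following. The function name `is_plain_utf8_text` and its parameters are hypothetical; they are not part of the filter extension or of the proposed "validate" module, and the mbstring extension is assumed to be loaded.

```php
<?php

// Hypothetical helper (not a filter or "validate" module API).
// Assumes ext/mbstring is available.
function is_plain_utf8_text(string $value, int $minLen, int $maxLen): bool
{
    // Reject byte sequences that are not well-formed UTF-8.
    if (!mb_check_encoding($value, 'UTF-8')) {
        return false;
    }

    // Length is counted in characters, not bytes.
    $len = mb_strlen($value, 'UTF-8');
    if ($len < $minLen || $len > $maxLen) {
        return false;
    }

    // Reject ASCII control characters (0x00-0x1F and 0x7F) without a regex.
    // In well-formed UTF-8 these byte values never occur inside a
    // multibyte sequence, so a plain byte scan is sufficient.
    $bytes = strlen($value);
    for ($i = 0; $i < $bytes; $i++) {
        $byte = ord($value[$i]);
        if ($byte < 0x20 || $byte === 0x7F) {
            return false;
        }
    }

    return true;
}
```

This only covers the ASCII control range; whether tabs or newlines should be allowed for a given field is a per-application decision that the sketch does not try to make.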
>
> List of problems in the current validation filter:
> - there is currently no STRING validator
> - it allows any input by default
> - it does not allow multiple rules, which complex validation of strings
>   requires
> - it does not have a callback validator
> - it does not have key validation (note: PHP array keys can be binary)
> - it does not validate the number of elements in an array
> - it cannot forbid unwanted elements in an array
> - it cannot validate "char encoding"
> - it does not enforce whitelisting
> - and so on
>
> These are the things that "filter" cannot do.
>
>> Ok so we can’t use filter_var() rules to validate that a string field is
>> an Alpha or AlphaNum, between 4 and 8 characters long (technically you
>> could pass mb_strlen() to the INT filter with {min,max}_range options set
>> to get the length validation, but I’ll grant you that *is* kind of a crappy
>> workaround right now)
>>
>> Why not stop trying to re-invent every single feature already present in
>> PHP (yes, I’ve been paying attention to all your other proposals), and just
>> *add* the functionality that’s missing:
>>
>
> https://wiki.php.net/rfc/add_validate_functions_to_filter
> It was _declined_. You should have supported this RFC if you would like to
> add features to filter.
> (I'm glad there is a new RFC supporter, regardless of the occasion.)
>
> I don't mind this result much.
> Adding features to "filter" leaves some of the shortcomings mentioned above,
> even with my proposal.
>
>> A `FILTER_VALIDATE_STRING` filter, with “Options” of `min` => ?int, `max`
>> => ?int and “Flags” of FILTER_FLAG_ALPHA, FILTER_FLAG_NUMERIC (possibly a
>> built in bit mask “FILTER_FLAG_ALPHANUMERIC” ?)
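The workaround Stephen describes (length via the INT filter's range options, plus a character-class check) could be sketched roughly as below. The helper name `is_alnum_4_to_8` is made up for illustration; `filter_var()`, `FILTER_VALIDATE_INT` with `min_range`/`max_range`, `mb_strlen()` and `ctype_alnum()` are existing PHP functions and constants, and the mbstring and ctype extensions are assumed to be loaded.

```php
<?php

// Hypothetical helper illustrating the workaround; not an existing API.
function is_alnum_4_to_8(string $value): bool
{
    // Length check: feed the character count to the INT filter and let
    // its min_range/max_range options do the bounds check.
    $length = filter_var(mb_strlen($value, 'UTF-8'), FILTER_VALIDATE_INT, [
        'options' => ['min_range' => 4, 'max_range' => 8],
    ]);
    if ($length === false) {
        return false;
    }

    // Character-class check without a regex. In the default "C" locale
    // ctype_alnum() accepts only ASCII letters and digits.
    return ctype_alnum($value);
}
```

For example, `is_alnum_4_to_8('abc123')` passes, while `'abc'` (too short) and `'ab-cd'` (non-alphanumeric character) are rejected.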
>>
>
> Simply adding these wouldn't work well as a validator because
>
> - Filter is designed for blacklisting
>
> As you may know, all security standards/guidelines require
>
> - Whitelisting for validation
>
> We may change "filter", but that requires a BC break.
>
>
>>
>> Lastly: it may not be the format you personally want, but the filter
>> extension *does* have the `filter_{input,var}_array` functions. Claiming
>> something doesn’t exist because it doesn’t work exactly how you would like
>> it to, makes you seem immature and petty, IMO.
>>
>
> The discussion is confusing because you ignore the result of this RFC.
> https://wiki.php.net/rfc/add_validate_functions_to_filter
> This RFC proposes a filter module improvement while keeping compatibility.
>
> I understand your point. This is exactly the reason why I proposed an
> "improvement" at first, not a new extension.
>
> I don't understand why you keep insisting on an attempt that has already
> failed.
>
> Would you like me to propose the previous RFC again,
> and implement "true validation" with filter?
> I don't mind implementing it if you would like to update the RFC and it
> passes.
> I must use a "white list" as much as possible.
>
> Regards,
>
> P.S. The "filter" module is a blacklisting module. "Validate" is a
> whitelisting module.
> Even with a BC break, mixing them would result in confusing FLAGs and code.
> Code may be cleaned up later, but FLAGs cannot.
> We should consider this also.
>
> --
> Yasuo Ohgaki
> yohgaki@ohgaki.net
>
>
>
> I was going to give a lot of detailed replies inline, but I’ve come to the
> realisation it’s pointless with you. You don’t really respond to what people
> say, you just use their comments as jumping off points to re-post your same
> little rant, ad nauseam.
>

Maybe I shouldn't reply when a reply indicates that previous mails haven't
been read. I usually reply to all regardless. As a result, I reply with
basically the same thing.

Since someone mentioned the hash_hkdf() mess on this thread, a short note on
this.
It's clearly Nikita and Andrey's fault. They have not read the Internet RFC
fully. I have no idea why they acted as if ignorant and kept insisting on a
ridiculous/insecure API that clearly violates the RFC, i.e. salt as the last
optional parameter. Wrong is wrong. I cannot stop pointing out problems in a
security feature (key derivation) until it is fixed. If you are curious,
read RFC 5869; then you'll see why the API is so ridiculous/insecure without
salt.

> So here’s the summary. Don’t bother replying, because I won’t be reading it.
>
> - I never asked for a working code example that is impossible with the
> current extension. I asked for a simple example of what you wanted to
> achieve.
>

OK. My apologies for the misunderstanding.
The unit tests do not cover all features yet, but you can see examples in
the working "validate" module's *.phpt files.

>
> - More than half the “issues” you claim with the filter extension, are
> only “valid” if you agree that it needs to do complex array structure
> validation. I do not agree with this. Userland can iterate an array of
> rules/input and validate quite easily.
>

I totally agree that "validate" and "filter" are similar.
I also totally agree that it's very easily done by scripts.
Many years ago I thought "proper application level validation" would be
common sense since it is easy, but it is not. As you can see from this
discussion, there are many people who think "database" and/or "model" level
validation is good enough for apps.
This is one of my motivations; another is performance. Validation should
always be done, so module functions are suitable for both performance and
documentation purposes.

For example, developers provide escape APIs for security purposes even when
escaping is trivial with string functions. That is good for documentation
as well as performance. It's the same here.

> - The *actual* issues with the filter extension could be solved by
> improving/adding filters.
>

The largest issue with the filter module as a validator is that "filter is
made for filtering" and that its "blacklisting nature comes from the
filtering architecture".

Of course, I was aware of the following issues with my filter module
improvement RFC. My approach back then was "it's better than nothing".
There are many issues; I have picked the two most important.

- Although it can be used for whitelisting, that is optional by design.
  It does not enforce whitelisting by default. Enforced whitelisting
  achieves much better results. Therefore, security-related features
  should use whitelisting, e.g. MAC (Mandatory Access Control), SELinux.
- It applies an extremely dangerous default filter, which does nothing,
  in case the user sets an invalid filter/validator, i.e. it simply passes
  inputs through and lets the code use them. No security check at all.

Making filter's validation a true whitelist validator is possible.
However, there are issues. Making filter a true validator requires a lot of
BC breaks. We may have a "strict option switch", but making filter a
whitelist validator isn't as simple as it may seem. A "strict option switch"
will add many branches, and the code might become unmaintainable. Even
without a "strict option switch", the code is based on
"filtering"/"blacklisting" and requires large refactoring.

I've already tried both "filter validate improvement" and "new validation
module". From this experience, a MySQL to MySQLi like transition is the best
choice, IMO. We can use whatever API/interface (e.g. spec array, flags) is
easy to understand/maintain/expand.

But again, I don't mind implementing filter's validation improvement if
anyone would like to spend time on an RFC. Even if it would be far from true
validation, it's still better than nothing. If you would like to create a
filter improvement RFC and it passes, I'll write the code for it.

Regards,

--
Yasuo Ohgaki
yohgaki@ohgaki.net
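For reference, the userland approach discussed in this thread (iterating an array of rules over the input, while rejecting keys that are not whitelisted) might be sketched as follows. Everything here is an assumption made for illustration: `validate_against_spec` and the shape of `$spec` are hypothetical and are neither the filter extension's API nor the proposed "validate" module's.

```php
<?php

// Hypothetical userland validator. $spec maps each allowed key either to a
// closure returning bool or to a nested spec array. Array-style callables
// are not supported in this sketch, since arrays are read as nested specs.
function validate_against_spec(array $input, array $spec): bool
{
    // Whitelist keys: anything not named in the spec is rejected.
    if (array_diff_key($input, $spec) !== []) {
        return false;
    }

    foreach ($spec as $key => $rule) {
        // Every key in the spec is treated as required in this sketch.
        if (!array_key_exists($key, $input)) {
            return false;
        }
        $value = $input[$key];

        if (is_array($rule)) {
            // Nested spec: the value must be an array that validates too.
            if (!is_array($value) || !validate_against_spec($value, $rule)) {
                return false;
            }
        } elseif ($rule($value) !== true) {
            return false;
        }
    }

    return true;
}

// Example spec: scalar rules, an element-count rule, and one nested level.
$spec = [
    'name' => function ($v) { return is_string($v) && ctype_alnum($v); },
    'age'  => function ($v) {
        return filter_var($v, FILTER_VALIDATE_INT, [
            'options' => ['min_range' => 0, 'max_range' => 150],
        ]) !== false;
    },
    'tags' => function ($v) { return is_array($v) && count($v) <= 3; },
    'address' => [
        'zip' => function ($v) { return is_string($v) && ctype_digit($v); },
    ],
];
```

The sketch returns a single boolean; it does not report which rule failed, and it makes no claim about performance relative to an extension, which is one of the points under dispute above.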