Newsgroups: php.internals Path: news.php.net Xref: news.php.net php.internals:95834 Return-Path: Mailing-List: contact internals-help@lists.php.net; run by ezmlm Delivered-To: mailing list internals@lists.php.net Received: (qmail 97029 invoked from network); 9 Sep 2016 09:00:18 -0000 Received: from unknown (HELO lists.php.net) (127.0.0.1) by localhost with SMTP; 9 Sep 2016 09:00:18 -0000 Authentication-Results: pb1.pair.com smtp.mail=php-lists@koalephant.com; spf=pass; sender-id=pass Authentication-Results: pb1.pair.com header.from=php-lists@koalephant.com; sender-id=pass Received-SPF: pass (pb1.pair.com: domain koalephant.com designates 206.123.115.54 as permitted sender) X-PHP-List-Original-Sender: php-lists@koalephant.com X-Host-Fingerprint: 206.123.115.54 mail1.25mail.st Received: from [206.123.115.54] ([206.123.115.54:42470] helo=mail1.25mail.st) by pb1.pair.com (ecelerity 2.1.1.9-wez r(12769M)) with ESMTP id 8D/6F-61313-A1A72D75 for ; Fri, 09 Sep 2016 05:00:12 -0400 Received: from [10.0.1.23] (unknown [183.89.43.165]) by mail1.25mail.st (Postfix) with ESMTPSA id B487660D7C; Fri, 9 Sep 2016 08:59:58 +0000 (UTC) Content-Type: text/plain; charset=utf-8 Mime-Version: 1.0 (Mac OS X Mail 9.3 \(3124\)) In-Reply-To: Date: Fri, 9 Sep 2016 15:59:52 +0700 Cc: "internals@lists.php.net" Content-Transfer-Encoding: quoted-printable Message-ID: References: <232F1604-2211-4351-B830-EDC958A25D6D@strojny.net> <2de35db0-9974-cc96-83dd-3d2dbd48f7f8@lsces.co.uk> <5b72e9da-068a-bc79-82c2-f36f723f42bb@gmail.com> <8E15A146-F5B6-41AA-8177-4DAA5ACB35C4@koalephant.com> To: Yasuo Ohgaki X-Mailer: Apple Mail (2.3124) Subject: Re: [PHP-DEV] [RFC][VOTE] Add validation functions to filter module From: php-lists@koalephant.com (Stephen Reay) Hi Yasuo, > On 9 Sep 2016, at 06:12, Yasuo Ohgaki wrote: >=20 > On Thu, Sep 8, 2016 at 9:13 PM, Stephen Reay = wrote: >>> On 8 Sep 2016, at 17:49, Yasuo Ohgaki wrote: >>>=20 >>> Hi Stephen, >>>=20 >>> On Thu, Sep 8, 2016 at 7:34 PM, Stephen Reay = wrote: >>>> Adding a bunch of new functions is IMO the wrong approach to this = type of thing. >>>> The existing filter_var/filter_input infrastructure works well, if = you want to define more rules I would definitely encourage building = on/improving that system not adding a bunch of extra functions. >>>=20 >>> Do you really think filter module works well as optimal validator? >>=20 >> It=E2=80=99s not perfect, but nothing is. As I said, I believe the = issues can largely be resolved by building on the existing = functionality. >>=20 >>> It cannot enforce even whitelisting well=E2=80=A6 >>=20 >> VALIDATE_INT already accepts $max and $min options. Those options = could be applied to VALIDATE_FLOAT, and $charset, $accepted_chars, = $max_len, $min_len could be implemented on a new VALIDATE_STRING filter. >=20 > But it trims input because it is filter based. For example, int input = like > ' 1234 = ' > must be invalid with whitelisting approach, but it's allowed with > current implementation. >=20 >>=20 >> I understand the use-case for multiple validation per input, and for = validating multiple inputs, but frankly the way this implements that is = both confusing to use, and has a less than ideal error-mode. >>=20 >=20 > I agree. > As filter, it works well mostly. As validator, it's unuseable due to > filter like behavior/non whitelisting behaviours. As you mentioned, there isn=E2=80=99t *much* space left for new flags = (about 8 slots left by my count?) but surely a = FILTER_FLAG_ALLOW_WHITESPACE could be utilised by multiple filters to = achieve the same thing, while still allowing for legacy behaviour. >> The =E2=80=9Cfilter spec=E2=80=9D input is an array of arrays of = arrays, most of which will also contain an array for =E2=80=98options=E2=80= =99. To me that=E2=80=99s getting dangerously close to JavaScript=E2=80=99= s callback hell for impossible to read code. >>=20 > I can understand your concern. Issue would be callback validator, but > callback nesting would not be needed unlike JS callback hell. Since > it's simple array, content/rule can be viewed easily also. >=20 Sorry, maybe I wasn=E2=80=99t clear. My issue wasn=E2=80=99t with the = callback filter specifically. Callback hell in Javascript is the issue = where you have a lot of nested callbacks, which can make the code quite = difficult to read, even with indenting, matching brace highlighting etc. My concern here is that you=E2=80=99d have a very deeply nested data = structure essentially to avoid a user-space loop. >> The error mode is also not ideal in a real world use case in my = opinion. If I am validating a dozen input fields, I do *not* want to = know just the first one that failed. Can you imagine using a web form = that made you submit 12 times to tell you each time you got a field = wrong, rather than trying to validate them ALL and telling you ALL the = errors at once? >=20 > You can omit validation where you would like. So your concern wouldn't > be problem. (I strongly suggest to validate all inputs for all entry > points, though) Again, I apologise, maybe I wasn=E2=80=99t clear. Let=E2=80=99s say I expect to get 5 posted parameters (from a form, a = direct http api call, etc). 3 are required, of those 1 must be an email, = and the other two must match (e.g. password/confirm) and have a custom = callback to match (e.g. to prevent =E2=80=98aaaaaa=E2=80=99). the other = two are optional, but have a maximum length of 45 characters each. Now, I want to validate all those fields, and give meaningful error = messages back to the user. With your proposal, if they have made provided somehow invalid data in = all five fields, they will have to make at least 6 http requests = (assuming each time they get an error response they manage to fix that = field on their first re-try).=20 I can definitely see benefit in allowing exceptions when validation = fails. I use a validation system that does something similar internally, = but it includes catches so effectively, validating an request with 5 = inputs will potentially give you a single exception, which then gives = access to the error encountered for each input (which is also expressed = as an exception) >=20 >>=20 >> Personally I think a better approach is: >> 1. improve/adding to the filters available, and if desired, add extra = flags/options e.g, to throw an exception on failure (which, btw was = requested via bugs.php.net 6 years ago), to set min/max values for = FILTER_VALIDATE_FLOAT, etc. >>=20 >> 2a. Leave the multiple rules per input to userland (e.g. dev uses = foreach, array_walk, etc on a rules array or what have you) >> 2b. *maybe* add an alternative to filter_(input/var)_array where = it=E2=80=99s 1 input and multiple rules, e.g. = filter_(input|var)_multiple >>=20 >> If you wanted to follow 2b, I=E2=80=99d suggest perhaps tackling it = as a separate RFC - improving *what* can be validated isn=E2=80=99t = necessarily tied to *how* you define what you want validated. >=20 > Thank you for the suggestion. There are issues this approach. >=20 > 1. Existing validator does not perform strict validations. We need new > validator rules. e.g. >=20 > FILTER_VALIDATE_STRICT_FLOAT > - Do not allow white space prefix/postfix, raise exception. > FILTER_VALIDATE_STRICT_INT > - Do not allow white space prefix/postfix, raise exception. >=20 > Having > FILTER_VALIDATE_STRICT_FLOAT > and > FILTER_VALIDATE_FLOAT > would be problematic. As I mentioned above, I would imagine a single new flag could be used to = make =E2=80=9Call=E2=80=9D filters have the strict behaviour. Presumably = FILTER_VALIDATE_BOOLEAN should treat =E2=80=98 true =E2=80=98 as = an error too? >=20 > Current filter functions fallback to FILTER_UNSAFE_RAW when something > goes wrong which is unacceptable for validator. I=E2=80=99m not sure I understand this. What do you mean by =E2=80=9Cwhen = something goes wrong=E2=80=9D. Can you elaborate on that? >=20 > 2a. Letting user to use foreach will results in the same, at least > similar, rule definition array. Please take a look at the last reply > to Rowan's post. >=20 > 2b. This sounds good. I should think about implementation. 2a issue > that result in the same or similar array definition remains though. Not necessarily. While they *could* be in the format you specified, they = could also be defined individually or defined inline. I really believe that combining multiple fields *and* multiple rule = definitions complicates things, a *lot*. Making improvements to the filter system but only for this new = =E2=80=9Ccomplicated=E2=80=9D function feels like it goes against your = intent of trying to make it easier for developers to make safe(r) = applications. Cheers Stephen >=20 > Thank you for suggestions! >=20 > Regards, >=20 > -- > Yasuo Ohgaki > yohgaki@ohgaki.net >=20 > -- > PHP Internals - PHP Runtime Development Mailing List > To unsubscribe, visit: http://www.php.net/unsub.php >=20