Newsgroups: php.internals
From: smalyshev@gmail.com (Stanislav Malyshev)
To: Yasuo Ohgaki
Cc: Marco Pivetta, Dan Ackroyd, PHP Internals List
Date: Wed, 17 Aug 2016 23:54:57 -0700
Subject: Re: [PHP-DEV] Re: [RFC][VOTE] Add validation functions to filter module

Hi!

> Even when there is no JavaScript nor HTML5 forms, input validations
> can be done. It's matter of definition of "valid inputs" for <input
> type="text" name="var" />. If page encoding is UTF-8, web browsers
> must return response by UTF-8 encoding. (Unless other encoding is

I think you're still missing my point. The point is that it is
absolutely irrelevant what a browser might or might not do, since PHP
does not have any means to know if browsers even exist. PHP doesn't talk
to a browser; it talks to an HTTP channel (provided we're in a webserver
scenario), and what's on the other end is unknown and irrelevant.
So there's no point discussing browsers.

> We recently added number of
> php_error_docref(E_ERROR, "Cannot process too large data");
> in PHP core to avoid possible memory destruction attacks.

We added it because we didn't have a choice. PHP does not have a generic
error mechanism that allows an arbitrary function to fail and still
continue execution. That's because PHP is highly complex C code, and C is
not the most friendly language out there. Your app is not in C, so it
can do it differently. If you talk about such situations, fine, but it's
not input validation - it's a limitation of the environment (since PHP
can't support arbitrary-length strings). If your application has such
limitations - fine, but they would be application-defined and would not
apply to most cases of input validation.

> Broken char encoding shouldn't came from legitimate users. Text
> contains CNTRL chars from shouldn't
> come from legitimate users. 1MB data from <input type="text"
> name="var" /> shouldn't come from legitimate users. Numeric database
> record ID that is set by app shouldn't contain anything other than
> digits. And so on.

I think you are mixing abnormal situations due to physical limitations
of software (like memory limits, etc.) with business logic. Numeric
format validation and size limits are clearly business logic. Encoding
may or may not be, depending on what the input is and what it is used
for.

> Broken char encoding (Accept only valid encoding)
> NUL, etc control chars in string. (Accept only chars allowed)
> Too long or too short string. e.g. JS validated values and values set
> by server programs like /etc, 100 chars for
> username, 1000 chars for password, empty ID for a database record,
> etc. (Accept only strings within range)

These are all fine filters/validators, and may be very useful in many
situations. What I still don't understand is the insistence on the
application dropping everything and exiting when one of them fails.
We already have sanitization/filtering infrastructure, and we can add
new filters and flags. What I don't understand is why we need a parallel
infrastructure whose only difference seems to be the unhelpful feature
of crashing each time it sees something unexpected. Am I missing
something?

> How to deal with bad inputs.
> - You seem you would like to treat as normal input.

No, you didn't understand. I would like to treat it as erroneous input,
but instead of stopping the application immediately, return an error
status to the business logic and let it sort things out.

> When plain is used, users may type in any valid UTF-8 char by mistake.
> For example, this wouldn't happen for date field, but autocomplete may
> fill my name "大垣靖男" to name field that supposed to contain alphabets
> only.

If the software is properly internationalized (like my email client),
there's absolutely nothing wrong with this string. If it is not, it
should check that the text matches its expectations - that's part of
business logic.

> If developers try to validate "all inputs", validation in MVC model is
> not efficient nor reasonable. It does not make sense to validate
> browser request headers in db model, for example. Ideally, input
> validation is better to be done as fast as possible to maximize the
> mitigation effect.

If you use browser headers, you validate them. If you don't use them,
there's no point validating them, of course, since they are not your
inputs.

-- 
Stas Malyshev
smalyshev@gmail.com
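
P.S. To make "return an error status to the business logic" concrete,
here is a rough sketch of what I have in mind. It is only an
illustration under assumptions: validate_request(), the field names
"id" and "name", and the 100-byte limit are made up for the example and
are not part of ext/filter or of the RFC. It relies only on the
existing filter_var() and preg_match(), both of which report failure
through their return value instead of aborting the script.

```php
<?php
// Hypothetical helper (name and fields are illustrative assumptions).
// It collects validation failures and hands them back to the caller
// instead of terminating the request.
function validate_request(array $input): array
{
    $errors = [];

    // filter_var() returns false on failure; nothing is thrown or aborted.
    $id = filter_var(
        $input['id'] ?? '',
        FILTER_VALIDATE_INT,
        ['options' => ['min_range' => 1]]
    );
    if ($id === false) {
        $errors['id'] = 'must be a positive integer';
    }

    $name = (string) ($input['name'] ?? '');
    if (preg_match('//u', $name) !== 1) {
        // With the /u modifier, preg_match() fails on malformed UTF-8.
        $errors['name'] = 'not valid UTF-8';
    } elseif (preg_match('/[\x00-\x1F\x7F]/', $name) === 1) {
        $errors['name'] = 'contains control characters';
    } elseif ($name === '' || strlen($name) > 100) {
        // Assumed application-defined limit, measured in bytes.
        $errors['name'] = 'must be 1 to 100 bytes long';
    }

    return [
        'ok'     => $errors === [],
        'errors' => $errors,
        'values' => ['id' => $id, 'name' => $name],
    ];
}
```

The caller then decides what a failure means: re-display the form, log
the request, return a 400, or, if that really is the policy of the
application, stop. That decision stays in the business logic.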