Newsgroups: php.internals
From: michael.vostrikov@gmail.com (Michael Vostrikov)
To: PHP Internals
Date: Sat, 16 Jul 2016 20:33:29 +0500
Subject: [RFC] New operator for context-dependent escaping

Hello.

I have created an RFC about a context-dependent escaping operator:
https://wiki.php.net/rfc/escaping_operator

The initial discussion was here:
http://marc.info/?t=146619199100001

At first, I wanted to add a call to a special function like escaper_call($str, $context), which performs HTML escaping by default and can be replaced by a separate extension for extended work with contexts. But then I figured out a better variant.

Main idea.

The operator has the following form:

Both expressions can be of any type that can be converted to a string. The second expression is optional.

I changed the '~' sign because it is not present on keyboard layouts for some European languages. Also, '~' does not produce any error on previous versions of PHP with short tags enabled, because there it is parsed as a bitwise operation.

The operator is compiled into the following AST:

echo PHPEscaper::escape(first_argument, second_argument);

Don't forget that we already have a special operator for a single function: backticks and shell_exec(). The new operator is compiled in a very similar way.

There is a default implementation of the class 'PHPEscaper'.
It has 4 static methods:

PHPEscaper::escape($string, $context = 'html');
PHPEscaper::registerHandler($context, $escaper_function);
PHPEscaper::unregisterHandler($context);
PHPEscaper::getHandlers();

The method PHPEscaper::escape($string, $context) splits $context by the '|' delimiter, trims every part, and then calls the registered handler for each context in the chain. 'html' is the default value for the context, and it has special handling: if there is no handler registered for the 'html' context, it calls htmlspecialchars($string, ENT_QUOTES | ENT_SUBSTITUTE);

We can use it like this:

And even more. In the AST, 'PHPEscaper' is registered as a not fully qualified name (ZEND_NAME_NOT_FQ). This allows us to use namespaces and autoloading:

MyEscaper::escape($str, 'js | html') will be called.

In this way we can have autoloading, multiple contexts, HTML escaping by default, and full control and customization. This is not an operator for a single function; there is simply one default implementation.

My first goal is to draw attention to the problem of security and HTML escaping. The exact implementation is a secondary matter. This small change can really improve security and make development easier in many applications.

What do you think: would it also be good to create an official poll about this feature to learn the community's opinion about it?
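
For readers who want to try the behaviour of PHPEscaper::escape() described above, here is a minimal userland sketch. It is not the RFC's built-in implementation, only an approximation based on this description. Several points are assumptions: the parts of the chain are applied left to right, a handler is a callable that takes a string and returns a string, and an unknown context throws an exception (the message above does not say what happens in that case).

```php
<?php
// Userland approximation of the proposed PHPEscaper class (sketch only).
final class PHPEscaper
{
    /** @var array<string, callable> context name => handler */
    private static $handlers = [];

    public static function registerHandler($context, callable $handler)
    {
        self::$handlers[$context] = $handler;
    }

    public static function unregisterHandler($context)
    {
        unset(self::$handlers[$context]);
    }

    public static function getHandlers()
    {
        return self::$handlers;
    }

    public static function escape($string, $context = 'html')
    {
        $string = (string) $string;

        // Split the context by '|', trim each part, apply handlers in order
        // (left-to-right order is an assumption).
        foreach (explode('|', (string) $context) as $name) {
            $name = trim($name);

            if (isset(self::$handlers[$name])) {
                $string = (string) call_user_func(self::$handlers[$name], $string);
            } elseif ($name === 'html') {
                // Special handling: default HTML escaping when no handler is registered.
                $string = htmlspecialchars($string, ENT_QUOTES | ENT_SUBSTITUTE);
            } else {
                // Assumption: the described behaviour for unknown contexts is not stated.
                throw new InvalidArgumentException("No handler for context '$name'");
            }
        }

        return $string;
    }
}

// Hypothetical 'js' handler, registered only for illustration.
PHPEscaper::registerHandler('js', function ($s) {
    return json_encode($s, JSON_HEX_TAG | JSON_HEX_AMP | JSON_HEX_APOS | JSON_HEX_QUOT);
});

echo PHPEscaper::escape('<b>bold</b>'), "\n";       // default 'html' context
echo PHPEscaper::escape("it's", 'js | html'), "\n"; // 'js' first, then 'html'
```

With such a class, the proposed operator would be equivalent to writing echo PHPEscaper::escape($str, 'js | html'); by hand.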