Newsgroups: php.internals
Date: Mon, 21 Mar 2016 13:53:15 +0900
From: yohgaki@ohgaki.net (Yasuo Ohgaki)
To: Daniel Beardsley
Cc: "internals@lists.php.net"
Subject: Re: [PHP-DEV] RFC about automatic template escaping

Hi Daniel,

On Mon, Mar 21, 2016 at 7:11 AM, Daniel Beardsley wrote:
> I'd like to submit an RFC (with a pull request) for adding auto-escaping to
> the php language.
>
> We at iFixit.com have used PHP for nearly a decade to run our website.
> Several years ago, we abandoned the Smarty templating engine and used php
> files directly as templates. This worked, but was a bit unsafe and made it
> too easy to leave user submitted content unescaped. Several years ago we
> switched to using a modified version of PHP that included auto-escaping and
> it has been working great. In the process of preparing to use php 7, I've
> re-implemented the changes against the master branch.
>
> I'd like to gauge interest in this before I formally submit an RFC. Here's
> a somewhat better description that I've attached to a pull request on our
> internal fork of php.
>
> Pull request on internal fork: https://github.com/iFixit/php-src/pull/14
>
> Background
> ==========
> PHP doesn't have any mechanism to inject logic between templating
> and final output.
> There is no way to filter or alter the content
> that comes from code in templates like:
>
> To use php as a robust templating language, we must inject *some*
> logic between templates and their output. We have chosen to make
> all trip through the internal function php_escape_html_entities.
>
> The functionality can be toggled with `ini_set('__auto_escape')`
> and configured with `__auto_escape_flags` and
> `__auto_escape_exempt_class` (see commit
> https://github.com/iFixit/php-src/commit/2dae5d16436ce37856f6e00ca2a1b3009bb1f7ed
> for info about the class name based auto-escaping exemption).
>
> Methodology
> ===========
> T_ECHO (echo, ZEND_AST_ECHO_ESCAPE node in the syntax tree.
>
> That's compiled to a function which emits a ZEND_ECHO_ESCAPE op code.
>
> The op code interpretation is a dupe of ZEND_ECHO except with some
> if() statements that switch the underlying function from `zend_write`
> to `zend_write_escape` based on the ini settings.
>
> zend_write_escape is a new function pointer that points to
> php_escape_write.
>
> php_escape_write is a new function that passes its string argument
> through php_escape_html_entities() (with __auto_escape_flags) before
> calling the underlying php_output_write.
>
> Use
> ===
> This functionality allows us to safely use php straight as a
> templating language with no template compilation step (as many
> other templating libraries have).
>
> See the included tests for more usage information.
>
> Exempt Class
> ============
> It is useful to allow some utility functions and helpers to produce
> html and have it passed straight through in the template (without
> being double-encoded). We accomplish this by *tagging* strings
> as being HTML.
>
>     class HtmlString implements JsonSerializable {
>        protected $html = '';
>
>        public function __construct($html) {
>           $this->html = $html;
>        }
>
>        public function __toString() {
>           return (string)$this->html;
>        }
>
>        public function jsonSerialize() {
>           return $this->html;
>        }
>     }
>
> The auto-escaping system can be configured with an:
>     __auto_escape_exempt_class="HtmlString"
>
> Which allows instances of `HtmlString` to pass straight through a
> template without being modified (skipping the html_entities call).
> Helper functions can now return html safely and consumers don't have
> to care if it is HTML or not because the auto-escaping system knows
> what to do.
>
> Thanks for your consideration!
> Daniel Beardsley

The issue is that escaping is always done for a specific context.

I understand your proposal is focused on HTML escaping. However, setting
names like __auto_escape_exempt_class are not a good choice. It should be
__auto_html_escape_exempt_class at least, because it is for HTML escaping.

In addition, HTML consists of multiple contexts:

 - HTML context, which requires HTML escaping
 - URI context, which requires URI escaping
 - JavaScript context (embedded JavaScript strings, for example), which
   requires JavaScript string escaping, etc.
   e.g. http://blog.ohgaki.net/javascript-string-escape
   (Sorry, it's my blog and it is written in Japanese. You may try a
   translation service, or you should be able to understand the PHP code
   at least.)
 - CSS context, which requires CSS escaping.
   e.g. https://developer.mozilla.org/ja/docs/Web/API/CSS/escape
 - And so on

Dealing with the HTML context only would be problematic, even if it works
for many cases. Escaping must be done according to the context, and
multiple contexts may apply at the same time. Escaping for the HTML context
only would not work well. Applying proper escapes to variables in HTML is a
very complex task.

Regards,

--
Yasuo Ohgaki
yohgaki@ohgaki.net
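
To make the point about contexts concrete, here is a minimal userland sketch of per-context escaping. It is only an illustration, not the proposed engine patch: the helper names (esc_html, esc_url_component, esc_js_string) are hypothetical, and the HtmlString class is a cut-down copy of the one quoted above. The CSS context is left out because PHP has no built-in function for it.

```php
<?php
// All helper names below are hypothetical; none come from the RFC or from PHP itself.

// Marker class, reduced from the HtmlString example quoted in the thread.
final class HtmlString
{
    private $html;

    public function __construct(string $html)
    {
        $this->html = $html;
    }

    public function __toString(): string
    {
        return $this->html;
    }
}

// HTML text and quoted-attribute context.
function esc_html($value): string
{
    if ($value instanceof HtmlString) {
        return (string) $value; // already tagged as HTML, pass through unchanged
    }
    return htmlspecialchars((string) $value, ENT_QUOTES | ENT_SUBSTITUTE, 'UTF-8');
}

// URI component context (a query value or a single path segment).
function esc_url_component(string $value): string
{
    return rawurlencode($value);
}

// JavaScript string literal placed inside an inline <script> block.
// The JSON_HEX_* flags keep <, >, &, ' and " out of the emitted literal.
function esc_js_string(string $value): string
{
    $json = json_encode(
        $value,
        JSON_HEX_TAG | JSON_HEX_AMP | JSON_HEX_APOS | JSON_HEX_QUOT
    );
    if ($json === false) {
        throw new InvalidArgumentException('Value cannot be encoded: ' . json_last_error_msg());
    }
    return $json;
}

// The same untrusted value needs a different treatment in each context.
$name = '</script><b>Tom & "Jerry"</b>';

echo '<p>', esc_html($name), "</p>\n";
// Two contexts at once: a URI component inside an HTML attribute.
echo '<a href="/search?q=', esc_html(esc_url_component($name)), '">search</a>', "\n";
echo '<script>var name = ', esc_js_string($name), ";</script>\n";
// A tagged HtmlString is emitted as-is.
echo '<p>', esc_html(new HtmlString('<em>trusted</em>')), "</p>\n";
```

The second echo is the "multiple contexts" case: the value is first escaped as a URI component and the result is then escaped again for the HTML attribute it sits in. An engine-level HTML-only auto-escape would handle the first and last lines, but not the URI or the script case.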