Newsgroups: php.internals
Xref: news.php.net php.internals:94530
Date: Sun, 17 Jul 2016 13:33:39 +0200
To: Michael Vostrikov
Cc: PHP Internals
Subject: Re: [PHP-DEV] [RFC] New operator for context-dependent escaping
From: rasmus@mindplay.dk (Rasmus Schultz)

I've read your RFC, and I think this is a strange feature.

All it is, really, is a registry for functions, and syntactic sugar for calling those functions - it's unnecessary, it's more global state you have to manage, and it's the kind of superficial convenience that will end up breeding more complexity.

What's also strange is that the ability to call functions in this registry hinges on syntax. What if I want to call the registered functions from within code?

ob_start(); ?><* $text *>

> Both variants and work good.

This is so true - and the whole syntactic convenience line of thinking really should end with that.

> Also there is a problem with function autoloading.

I maintain that this is the real problem, and perhaps the only problem - all this RFC does is provide a stop-gap solution. What we should really be talking about is implementing the RFC that addresses the existing gap in the existing feature of the language.

Your arguments don't make sense to me.
It's somehow easier to choose between two different characters * and ? versus electing to call a function or not? I don't see how - it still requires an active choice, and I don't believe there's any (sound) way around that. All this RFC changes is the syntax - not the problem.

Addition of a feature like this will affect even those who don't use it - we all collaborate in teams, and most of us contribute to open source projects... a feature like this will bring global state, side-effects and many other interesting problems even to those who don't elect to use it, when they inherit or consume code that does.

The poll doesn't make a whole lot of sense either, because you're asking specifically about the proposed feature, rather than asking in general about the problem. This doesn't prompt people to think about the problem - it prompts them to consider the proposed solution. It's easy enough to look at this on the surface and think "sure, that solves it" - reasoning about the impact on the language, or about deeper problems that aren't visible on the surface, requires more involvement than just a quick click on a radio button.

> More than 90% of output data - is data from DB and must be HTML-encoded

Yet, you argue we need a function registry for all kinds of other escape operations to address the other 10%. I can't follow this line of thinking.

If the 90% use case is HTML escaping (with UTF-8 encoding, as is likely true) then maybe I could accept the addition of syntax just for that. *Maybe*. I would still be *much* more concerned about the limited usefulness of functions in general, which could be more generally addressed by solving autoloading.

I view this RFC as a huge distraction - if it's implemented, then with that one use-case for functions (templates) addressed, we're more likely to put off the deeper issues for even longer.

Please, let's focus on improving the language in general - rather than improving one isolated use-case.

On Sat, Jul 16, 2016 at 5:33 PM, Michael Vostrikov <michael.vostrikov@gmail.com> wrote:

> Hello.
> I have created RFC about context-dependent escaping operator.
> https://wiki.php.net/rfc/escaping_operator
>
> Initial discussion was here: http://marc.info/?t=146619199100001
>
>
> At first, I wanted to add a call of special function like
> escaper_call($str, $context), which performs html-escaping by default and
> can be replaced with a separate extension for extended work with contexts.
> But then I figured out better variant.
>
>
> Main idea.
>
> Operator has the following form:
>
>
>
>
>
> Both expressions can be any type which can be converted to string. Second
> expression is optional.
>
> I changed '~' sign because it is not present on keyboard layouts for some
> european languages. And also it does not give any error on previous
> versions of PHP with short tags enabled, because this is recognized as
> bitwise operation.
>
>
> Operator is compiled into the following AST:
>
> echo PHPEscaper::escape(first_argument, second_argument);
>
> Don't you forget that we already have special operator for one function?
> Backticks and shell_exec(). New operator is compiled very similar to it.
>
>
> There is a default implementation of the class 'PHPEscaper'. It has 4
> static methods:
>
> PHPEscaper::escape($string, $context = 'html');
> PHPEscaper::registerHandler($context, $escaper_function);
> PHPEscaper::unregisterHandler($context);
> PHPEscaper::getHandlers();
>
> Method PHPEscaper::escape($string, $context) splits $context by '|'
> delimiter, all parts are trimmed, and then calls registered handler for
> every context in a chain.
> 'html' is default value for context, and it has special handling.
> If there is no handler for 'html' context, it calls
> htmlspecialchars($string, ENT_QUOTES | ENT_SUBSTITUTE);
>
>
> We can use it like this:
>
> // anywhere in application
> PHPEscaper::registerHandler('html', [MyEscaper, 'escapeHtml']);
> PHPEscaper::registerHandler('js', function($str) { return
> json_encode($str); });
> ?>
>
>
>
> And even more.
> In the AST, 'PHPEscaper' is registered as not fully qualified name
> (ZEND_NAME_NOT_FQ).
> This allows us to use namespaces and autoloading:
>
>
>
>
> MyEscaper::escape($str, 'js | html') will be called.
>
>
> In this way we can have autoloading, multiple contexts, HTML escaping by
> default, and full control and customization.
> This is not an operator for one function, just there is one default
> implementation.
>
> My first goal is to draw the attention on the problem with a security and
> HTML escaping. Exact implementation is secondary thing.
>
> This small change can really improve a security and make development easier
> in many applications.
>
>
> How do you think, maybe also it would be good to create some official poll
> about this feature and to know community opinion about it?
>
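
For reference, the handler chain described in the quoted summary (split the context on '|', trim each part, call the registered handler for each part in order, fall back to htmlspecialchars() for 'html') could be sketched in userland roughly like this. It is only an illustration under stated assumptions, not the RFC's implementation: the class name SketchEscaper is hypothetical, and throwing on an unregistered context is an assumption, since the summary does not say what happens in that case.

```php
<?php
// Hypothetical userland stand-in for the 'PHPEscaper' class described in the
// quoted summary. The name SketchEscaper is not part of the RFC.
final class SketchEscaper
{
    /** @var array<string, callable> registered handlers, keyed by context name */
    private static $handlers = [];

    public static function registerHandler($context, callable $handler)
    {
        self::$handlers[$context] = $handler;
    }

    public static function unregisterHandler($context)
    {
        unset(self::$handlers[$context]);
    }

    public static function getHandlers()
    {
        return self::$handlers;
    }

    public static function escape($string, $context = 'html')
    {
        $result = (string) $string;

        // Split the context on '|', trim each part, apply handlers left to right.
        foreach (explode('|', $context) as $name) {
            $name = trim($name);

            if (isset(self::$handlers[$name])) {
                $result = call_user_func(self::$handlers[$name], $result);
            } elseif ($name === 'html') {
                // Default for 'html' when no handler is registered, as quoted.
                $result = htmlspecialchars($result, ENT_QUOTES | ENT_SUBSTITUTE);
            } else {
                // Assumption: the quoted summary does not define this case.
                throw new InvalidArgumentException("No handler for context '$name'");
            }
        }

        return $result;
    }
}

// Same 'js' handler as in the quoted example.
SketchEscaper::registerHandler('js', function ($str) {
    return json_encode($str);
});

// JSON-encode first, then HTML-escape the result.
echo SketchEscaper::escape('<b>"hi"</b>', 'js | html'), "\n";
```

The order of the parts matters: 'js | html' encodes as JSON first and then HTML-escapes that output, which is the chaining the summary describes.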