Newsgroups: php.internals Path: news.php.net Xref: news.php.net php.internals:109194 Return-Path: Delivered-To: mailing list internals@lists.php.net Received: (qmail 24745 invoked from network); 22 Mar 2020 00:32:02 -0000 Received: from unknown (HELO php-smtp4.php.net) (45.112.84.5) by pb1.pair.com with SMTP; 22 Mar 2020 00:32:02 -0000 Received: from php-smtp4.php.net (localhost [127.0.0.1]) by php-smtp4.php.net (Postfix) with ESMTP id E90B01804E2 for ; Sat, 21 Mar 2020 15:55:47 -0700 (PDT) X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on php-smtp4.php.net X-Spam-Level: X-Spam-Status: No, score=-2.1 required=5.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,HTML_MESSAGE,RCVD_IN_DNSWL_NONE, RCVD_IN_MSPIKE_H2,SPF_HELO_NONE,SPF_PASS autolearn=no autolearn_force=no version=3.4.2 X-Spam-ASN: AS15169 209.85.128.0/17 X-Spam-Virus: No X-Envelope-From: Received: from mail-wr1-f54.google.com (mail-wr1-f54.google.com [209.85.221.54]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange ECDHE (P-256) server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) by php-smtp4.php.net (Postfix) with ESMTPS for ; Sat, 21 Mar 2020 15:55:47 -0700 (PDT) Received: by mail-wr1-f54.google.com with SMTP id v11so11942342wrm.9 for ; Sat, 21 Mar 2020 15:55:47 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=craigfrancis.co.uk; s=default; h=mime-version:references:in-reply-to:from:date:message-id:subject:to :cc; bh=KW6rXRcWcw/wfKcOH03ZpVqf2YhFZ1PZKcCgEv1/Yc0=; b=Rs2o0Cl8UyhChQuOLpwO+ZK8F4ypzmev0xbItBq38dAHxP7OyW8gVjjVSqTCLfQMjM oLteasZxDdx2nd7+/JKUL5e9xfTdleg5rOC1yalB/koKo/C4t7ZEoau9zf7JNX0cX/0y 170afKY1022/G0TL674z0Zbr5fw1PHqyiwpgM= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc; bh=KW6rXRcWcw/wfKcOH03ZpVqf2YhFZ1PZKcCgEv1/Yc0=; b=eM7YnnaOil7Ea+PzrrjJcbUUMoXRJQSjIRA4prA7bmuSFvGshz7UMqs7bnsXlaO49r yXnXb0qXj59t3inqOcyLGXTMB/bietcKGBrKPMzSzYCS4L3T87+kry2YwvFTajOStZom 9EegqtUBAoNGzkpo4ap95ZPgkYWPvvlKXzZm1cBsF/WeMdx0dROtYx0voZTjkv9s9VSV dBSvf6REKNaDHZC0Eo+3ay9Bac2OItfQWjMaL0UNNVgRQA/nbGX9aIoruCDeGbJJr7Pd Pxao/kD6W03jNsCHaUd9cTiJvfitZCYOs/cQorGVC+xV5FLxfi8GR5TawlL7cNej77JC Pz9g== X-Gm-Message-State: ANhLgQ0/WV7TWvDpXbO+7fA8FLrDDJo63eLtUkKZRnhAw7G9TEBPR+9/ rpVHYfETlEwqBRIoyJlmIGeWsmDFB595GRHMGMxCDQ== X-Google-Smtp-Source: ADFU+vuOguXhJpbv5ObZmJiz8GJE6ml3LNljQpIuC6xFSSeG8jnwGFge9y3gr7lcDEWnZzZ6p7+TPfkZI8/hL2bIEgA= X-Received: by 2002:adf:f68a:: with SMTP id v10mr15298478wrp.80.1584831344405; Sat, 21 Mar 2020 15:55:44 -0700 (PDT) MIME-Version: 1.0 References: In-Reply-To: Date: Sat, 21 Mar 2020 22:55:33 +0000 Message-ID: To: tyson andre Cc: PHP internals Content-Type: multipart/alternative; boundary="00000000000050117f05a1654d15" Subject: Re: [PHP-DEV] [RFC] is_literal() From: craig@craigfrancis.co.uk (Craig Francis) --00000000000050117f05a1654d15 Content-Type: text/plain; charset="UTF-8" Hi Tyson, Thanks for your email, I really appreciate it. I'm not a C developer, but if I was to have a go at implementing `is_literal()`, I probably would start with the existing taint extension. But you're right, that approach can cause performance issues, and doesn't play well with XDebug (much to my annoyance). I'm hoping that someone with more C and PHP Internals experience might have a better solution, as there was some mention of the existing engine knowing about literals being semi-special, and there might be (I'm really showing my ignorance here) a better way to work with the compiler. As to using PECL; my hope is that we can use this to help educate developers about the dangers of mixing variables with commands. While they won't think to use this function themselves, it can be used by frameworks to ensure mistakes aren't being made (and who knows, maybe one day functions like mysqli_query() could use this as well). As to the related projects / static analysis; while I do appreciate them, similar tools have been available for years (admittedly some with huge price tags), and I don't think they have really solved these problems. That said, I have added a note in the RFC. Thanks again, Craig On Sat, 21 Mar 2020 at 21:59, tyson andre wrote: > Hi Craig, > > https://github.com/laruence/taint#taint notes that > "Please note that do not enable this extension in product(ion) env, since > it will slowdown your app." > > - That repo already provides is_tainted() http://docs.php.net/is_tainted > > A fork of that repo would theoretically allow implementing is_literal() > as described in the RFC - is that the implementation plan? > - The slowdown would be a large problem if this feature was always on. > > And if it can be implemented as a PECL module, that would be more > preferable to me than a core module of php. > If it was in core, having to support that feature may limit > optimizations or implementation changes that could be done in the future. > > If it's implemented in the same way as taint (i.e. cannot be used in > combination with XDebug, blackfire, newrelic, etc), > that would also be a problem for including it in core. > If it wasn't, then it'd slow down concatenation, calls, etc. even when the > application didn't need is_literal. > > I also imagine that whether or not opcache was enabled is likely to affect > whether or not > something ends up being a literal or not > (e.g. opcache can evaluate functions such as str_repeat() for literals at > compile time) > https://github.com/laruence/taint/blob/master/taint.c seems to already > support a whitelist (php_taint_override_func), > so that isn't insurmountable for functions, > but it's possible `if ($local === 'literal') { process($local); }` would > only satisfy is_literal() with opcache enabled. > > Related projects (static analysis instead of runtime checks, though): > > It's also worth noting that `vimeo/psalm` had an in progress way to detect > some ways in which tainted strings may be used by applications, based on a > paper by Facebook. > ( > https://cacm.acm.org/magazines/2019/8/238344-scaling-static-analyses-at-facebook/fulltext > (for HHVM, though)) > https://github.com/vimeo/psalm/issues/611#issuecomment-515153305 - but it > isn't completed or usable yet, as far as I can tell. > > Wikimedia also created > https://gerrit.wikimedia.org/g/mediawiki/tools/phan/SecurityCheckPlugin/ > , but that's currently beta. > Both would have ways they fail to catch every way an argument could be > passed to a function (e.g. unanalyzable dynamic/framework calls) > > - Tyson > --00000000000050117f05a1654d15--