Newsgroups: php.internals Path: news.php.net Xref: news.php.net php.internals:93026 Return-Path: Mailing-List: contact internals-help@lists.php.net; run by ezmlm Delivered-To: mailing list internals@lists.php.net Received: (qmail 84273 invoked from network); 2 May 2016 01:58:47 -0000 Received: from unknown (HELO lists.php.net) (127.0.0.1) by localhost with SMTP; 2 May 2016 01:58:47 -0000 Authentication-Results: pb1.pair.com smtp.mail=jesseschalken@gmail.com; spf=pass; sender-id=pass Authentication-Results: pb1.pair.com header.from=jesseschalken@gmail.com; sender-id=pass Received-SPF: pass (pb1.pair.com: domain gmail.com designates 209.85.213.182 as permitted sender) X-PHP-List-Original-Sender: jesseschalken@gmail.com X-Host-Fingerprint: 209.85.213.182 mail-ig0-f182.google.com Received: from [209.85.213.182] ([209.85.213.182:35168] helo=mail-ig0-f182.google.com) by pb1.pair.com (ecelerity 2.1.1.9-wez r(12769M)) with ESMTP id DD/C6-03860-454B6275 for ; Sun, 01 May 2016 21:58:45 -0400 Received: by mail-ig0-f182.google.com with SMTP id bi2so72777971igb.0 for ; Sun, 01 May 2016 18:58:44 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:sender:in-reply-to:references:date:message-id:subject :from:to:cc; bh=9uMAXzObM5VTXweoA/5DAKSXdxmO/iHhfLNwwFXbWaQ=; b=VuUNXxUxzexfZaKQgrxDxkvQAZKYMWwmSezSzojON3JXXtysfBjmOASjAbznTWUKbN /ickg6fx4lzSUqq6UuGk94ywAlMvETVyYOvD7JnfJW1cRNxZdAST7nh+JAuRZ1B2sg2f wk7GiRJVABi92KwtZhRIAmVwMUPUh0zRm0V1gqCQg0PhswZe5lmZVZYKPe2qDfsRaC9X Sz2fhcIlg654fVTAr78I7ty3BNsLgfqM2nTuMajGEp9vMIcYFhW5LY59aHTv7oxH+PbJ ycMppScKLoEX8cacS0iYew9dSx0yjvVMxbICmVRC5551gquiudMmq32rmJalI24dLP7M aNYQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:mime-version:sender:in-reply-to:references:date :message-id:subject:from:to:cc; bh=9uMAXzObM5VTXweoA/5DAKSXdxmO/iHhfLNwwFXbWaQ=; b=k0qWcCWshWfFn4mehD7tRH4d82Siry2/4oB/UqToSd9ICTCxgqBgQshtKOk0G0sN0K 9XQYvrPAbzhw5x1K/JaIqFJ0rrAZ1g9Nb3SM9gHkVC13ooiH4KIazhYxARKPqEzdftdt LNt8cqz0twC65/vtvUYEvLDMD759x3sOUbi2l3178agthm07MFlxPtHLwybb8CEV+HbP yL56ADlFd3G5zRrkqFNuYXHmFs4KL7GFMclUyy/0IQeQ8qs8clfvqORwFGjPUY8QI/hU P3otJqvpknsQcxA2VM09jGkk5bYCjVOVM73UnVWDo3KnalzywecG5GQfrJplnwjpWXXI DbVw== X-Gm-Message-State: AOPr4FU/c9+Jc7amT8jRplkUZ1+6McyCCmaYEjTx9teRc0+xHKf+lPxSneOVkyLFOIdNslC4mcLmYTs0wI+3jw== MIME-Version: 1.0 X-Received: by 10.50.28.113 with SMTP id a17mr16947210igh.44.1462154322088; Sun, 01 May 2016 18:58:42 -0700 (PDT) Sender: jesseschalken@gmail.com Received: by 10.79.139.133 with HTTP; Sun, 1 May 2016 18:58:41 -0700 (PDT) In-Reply-To: References: Date: Mon, 2 May 2016 11:58:41 +1000 X-Google-Sender-Auth: GYkp6LXIfu2Rd1yrF-by90cY3qo Message-ID: To: Nikita Popov Cc: PHP internals Content-Type: multipart/alternative; boundary=089e01538ac8f99dcf0531d25411 Subject: Re: [PHP-DEV] Forbid dynamic calls to scope introspection/modification functions From: me@jesseschalken.com (Jesse Schalken) --089e01538ac8f99dcf0531d25411 Content-Type: text/plain; charset=UTF-8 I'm all in favour of this. It makes the language much more predictable and amenable to optimisation and static analysis overall, at the cost of what IMO is a very, very minor BC break. To think that all this time, functions I've written which accept a callable could be passed something like "extract" and depended on whatever behaviour results. Eugh. On Fri, Apr 29, 2016 at 7:48 PM, Nikita Popov wrote: > Hi internals! > > Welcome to another edition of "crazy PHP edge-cases you don't want to know > about". > > I'd like to introduce a restriction that forbids performing dynamic calls > to scope introspection and modification functions. "Dynamic calls" here > means things like $fn(), call_user_func($fn) and array_map($fn, ...). > "Scope introspection functions" refers to the following functions that in > one way or another inspect or modify parent stack frames: > > * extract() > * compact() > * get_defined_vars() > * parse_str() with one arg > * mb_parse_str() with one arg > * func_get_args() > * func_get_arg() > * func_num_args() > > I'd like to introduce this restriction for a number of reasons, which will > be outlined in the following. There are essentially two core problems: A) > It is not clear how these functions should behave in this situation -- > indeed I will show examples of behavior differing due to inconsequential > changes, differing between different PHP runtimes or versions and just > generally behaving crazily. B) These calls pose a stability problem (yay > segfaults) and violate assumptions of existing optimizations (yay more > segfaults). > > A) The primary issue is that dynamic calls to these functions have unclear > behavior and may lead to some very odd behavior. They all fundamentally > work by inspecting higher stack frames, but don't agree on which frame they > should operate on. > > Example #1: > > namespace { > function test($a, $b, $c) { > var_dump(call_user_func('func_num_args')); > } > test(1, 2, 3); > } > > namespace Foo { > function test($a, $b, $c) { > var_dump(call_user_func('func_num_args')); > } > test(1, 2, 3); > } > > This code will print int(3) int(1) on PHP 7 and HHVM (and two times int(1) > on PHP 5). The reason is that in the non-namespaced case the number of > arguments of the test() frame is returned, while in the namespaced case the > number of arguments of the call_user_func() frame is returned, because of > internal differences in stack frame management. > > Example #2: > > function test() { > array_map('extract', [['a' => 42]]); > var_dump($a); > } > test(); > > This will print int(42) on PHP 5+7, but result in an undefined variable on > HHVM. The reason is that HHVM will extract ['a' => 42] into the scope of > the array_map() frame, rather than the test() frame. It does this because > HHVM implements array_map as a userland (HHAS) function, rather than an > internal function. > > One might write this off as a bug in the HHVM implementation, but really > this illustrates a dichotomy between internal and userland functions with > regard to dynamic calls of these functions. Namely, if you were to > reimplement the array_map function in userland > > function array_map($fn, $array) { > $result = []; > foreach ($array as $k => $v) { > $result[$k] = $fn($v); > } > return $result; > } > > and then try the same snippet as Example #2, it would indeed extract the > array into the scope of array_map() and not the calling test() function. I > hope this helps to further illustrate why calling these functions > dynamically is a problem: They will generally be executed in a different > scope than the one where the callback is defined. This means you can > actually arbitrarily modify the scope of functions that accept callbacks, > even though they were certainly not designed for this use. E.g. you can > switch the $fn callback in the middle of the array_map execution using > something like: > > array_map('extract', [['fn' => ...]]); > > Just imagine the possibilities of this newly discovered feature! But this > is only where it starts. PHP has a number of magic callbacks that may be > implicitly executed in all kinds of contexts. For example, what happens if > we use one of these in spl_autoload_register? > > Example #3: > > spl_autoload_register('parse_str'); > function test() { > $FooBar = 1; > class_exists('FooBar'); > var_dump($FooBar); // string(0) "" > } > test(); > > Now any invocation of the autoloader (here using class_exists, but can be > generalized to new or anything else) will create a variable for the class > name in the local scope (with value ""). Of course there's more fun to be > had here (tick functions!), but let's leave it at that and get to the next > point. > > B) The second issue is stability. As might be expected, nobody has bothered > testing edge-cases of dynamic calls to these functions previously. Recently > two segfaults relating to this were found, see bug #71220 and bug #72102. > However, these are "just bugs". The more important issue is that these > dynamic calls to scope modifying functions go against assumptions in the > current optimizer. For example the following very simple script currently > crashes if opcache is enabled, because $i is incorrectly determined to be > an integer: > > function test() { > $i = 1; > array_map('extract', [['i' => new stdClass]]); > $i += 1; > var_dump($i); > } > test(); > > This is, of course, a bug in the optimizer and not in PHP. However, if we > try to catch this kind of situation in the optimizer we will have to do > very pessimistic assumptions (especially if you consider things like the > spl_autoload_register example), for a "feature" nobody needs and that > doesn't work particularly well anyway (see A). > > Due to all the aforementioned issues, I'd like to forbid these specific > dynamic calls in PHP 7.1. I highly doubt that this will have any BC impact. > A patch is available here: https://github.com/php/php-src/pull/1886 > > Does anybody see a problem with this change? > > Thanks, > Nikita > --089e01538ac8f99dcf0531d25411--