Newsgroups: php.internals
Date: Mon, 13 Jan 2020 12:31:47 -0600
From: larry@garfieldtech.com ("Larry Garfield")
To: "php internals"
Subject: Re: [PHP-DEV] Introducing compile time code execution to PHP preloading

On Sun, Jan 12, 2020, at 6:45 PM, Mike Schinkel wrote:
> > On Jan 12, 2020, at 1:57 PM, Larry Garfield wrote:
> >
> > Most notably, *not all code will be run in a preload context*.
>
> Can you give some concrete examples here?
>
> > Language features that only sometimes work scare me greatly.
>
> Do you have some examples of language features, from PHP or another
> language, that only work sometimes and that are known to be
> problematic, and why they are problematic?

To use the example from the OP:

    static_run {
        $link = mysqli_connect("127.0.0.1", "my_user", "my_password", "my_db");
        $res = mysqli_query($link, 'select * from sometable');
        $array = [];
        while ($row = mysqli_fetch_assoc($res)) {
            $array[] = $row;
        }
        define('CONST_ARRAY', $array);
    }

I can see the use of that, sure. Now, what happens when the code is not preloaded?
Does that block not get run, and thus CONST_ARRAY is not defined? Does it run on all requests if not preloaded? How does that interact with a file that gets read multiple times? What happens if the code does more than set a constant? Can it define new functions? What happens to those functions in a non-preload situation?

To use the other example:

    static_run {
        on_file_load(function($file_ast) {
            // Do something with the AST of the file
            return $file_ast;
        });
    }

AST manipulation from user-space opens up a lot of possibilities for optimization. However, it's also a huge foot-gun. When you start messing with the AST I can't imagine it's hard to end up introducing subtle behavioral changes without intending to. Or, maybe you are intending to. So then what happens if the code runs in a context when that doesn't happen? Does the AST then get re-manipulated on every request instead? What's the performance impact of that? Net negative?

I don't have answers to these questions. It's possible that we could come up with a set of answers that would address the core issue, but I am skeptical.

My core point here is that I am fully in favor of leveraging preloading to improve performance, BUT ensuring that there is zero behavioral difference between preloaded and non-preloaded code, only performance differences, is paramount, and IMO is more important than any flexibility, power, or performance benefits it could offer. We should consider exposing that to user space *only* if we can be pretty damned sure that it's not going to introduce weird-and-subtle behavioral bugs that end up making preloaded and non-preloaded code behave differently.

As an example, preloading seems like a great place to do something like tail recursion flattening. That's a logically safe thing to do, as long as the call is properly tail-recursive, and would make writing tail-recursive algorithms more practical. (They're often easier to read and maintain but performance makes them less practical.) However!
Doing so means the preloaded version doesn't have an issue with blowing out the stack. The non-preload version does. That means the non-preload version has a built-in limit on how long of a list it can operate on (100 by default, minus however many stack calls have already been made) while the preloaded version doesn't. That can have ugly implications if you're running code that was working in preload in a non-preload context, and suddenly your 105 element array is causing a fatal error when it didn't before.

That's the sort of subtlety that, frankly, I am a lot more confident in Engine developers remembering to think about than user-land developers. Myself included. Not because they're less capable developers but because 99% of the time PHP doesn't force you to think about such questions, so most developers won't think to think about them. And 99% of the time that's a good thing. This is the other 1%. :-)

What I very much want to avoid, for as long as possible at least, is "this library only works if preloaded" type situations. That's how we end up with a division in the language; not just between people who own their own servers and those that don't, but it ties the hands of admins and framework authors in deciding what to preload. What a "good" preload strategy is depends on context, and we've only had a month or two of experience with it to even know what to recommend to people.

And that's in addition to the development challenges of developing such code in the first place:

> > "I changed one character and now I have to restart my webserver to see if it did anything" is a bad place for PHP to be.
>
> As I envision it preloaded code of this nature would not be handled on
> server reboot, but when the files have had their time stamps updated.
> If I am not mistaken, PHP already does this (but I could be mistaken as
> I don't have expertise in PHP OpCodes.)

The opcache does that, yes.
The preloader, however, is a one-shot deal and requires restarting FPM to have it re-run.

Thinking about it, I suspect there would be far more benefit in practice not from allowing AST manipulation but from being able to "checkpoint" a running script; that is, allow it to not just pre-load code (which we can do now in 7.4) but set up variables that are already initialized from one request to the next. I'm thinking here of things like bootstrapping a dependency injection container, declaring closed functions, and other semi-global stuff that right now makes a PHP application's bootstrap process more expensive than most other languages. (In the area of milliseconds, sure, but still slower.) Allowing that sort of execution to happen once and get persisted would reduce the need to do all the precompiling and such that many frameworks do today at the cost of a great deal of complexity.

That may be just as much of a pipe dream to do safely, I don't know, but in practice that seems like a more promising direction for userland developers to leverage themselves.

--Larry Garfield
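
For readers who want the tail-recursion concern above in concrete form, here is a minimal sketch. The function names sumRecursive and sumFlattened are hypothetical, and the second function is only a hand-written stand-in for what a flattening pass might conceptually produce; no such pass exists in the preloader as described in this thread.

```php
<?php
// Hypothetical illustration only: neither function comes from the thread.

// Tail-recursive form: every element costs one extra stack frame,
// so the call depth grows with the length of the list.
function sumRecursive(array $items, int $index = 0, int $total = 0): int
{
    if ($index >= count($items)) {
        return $total;
    }
    // The recursive call is the last thing the function does (a tail call).
    return sumRecursive($items, $index + 1, $total + $items[$index]);
}

// Hand-flattened equivalent: the tail call becomes a loop,
// so the call depth stays constant regardless of list length.
function sumFlattened(array $items, int $index = 0, int $total = 0): int
{
    while ($index < count($items)) {
        $total += $items[$index];
        $index++;
    }
    return $total;
}

$items = range(1, 105);

// Both forms agree on the result; they differ only in stack usage.
var_dump(sumRecursive($items) === sumFlattened($items));
```

The two functions return the same value, which is the point: wherever a call-depth limit is in force, only the recursive form can run into it. If one form were used when preloaded and the other when not, the same input could succeed in one context and fail in the other.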