Newsgroups: php.internals
Xref: news.php.net php.internals:108099
Date: Sat, 11 Jan 2020 12:40:08 +0000
To: PHP internals, Mike Schinkel
Subject: Introducing compile time code execution to PHP preloading
From: robehickman@gmail.com (Robert Hickman)

With PHP having recently introduced preloading, I have been thinking about the possibility of adding a system whereby arbitrary PHP code can run during this step. Essentially, this would serve the same function as 'compile time execution' in many programming languages. It should be noted that my thoughts below are mostly inspired by the in-development language JAI, demos of which are included at the end of this email.

While PHP is an interpreted language, code is first parsed, which generates an AST, and this AST is then used to generate bytecode that is stored in opcache. With preloading, the generation of this bytecode is done only once on server startup. Compile time code would run during this stage as a 'shim' between parsing and bytecode generation, allowing arbitrary modifications to the AST.

I can think of numerous examples of ways this could be advantageous.
For one, frameworks often want to store configuration data in a database or some other external source, and accessing it on every request is needless overhead, given that this data rarely changes in production. So you could do something like the following, which runs once during preload and caches the constant in opcache.

------------
static_run {
    // Runs once at preload time, not on every request.
    $link = mysqli_connect("127.0.0.1", "my_user", "my_password", "my_db");
    $res = mysqli_query($link, 'select * from sometable');

    // Collect every row into a plain array.
    $array = [];
    while ($row = mysqli_fetch_assoc($res)) {
        $array[] = $row;
    }

    // Expose the result as a constant that gets cached in opcache.
    define('CONST_ARRAY', $array);
}
------------

static_run being a new keyword that allows an expression to be evaluated at compile time.

I foresee this being able to do far more than simply define constants though. In my opinion, it should allow arbitrary modifications to the AST, and arbitrary programmatic code generation. For example, static code could register a callback which receives the AST of a file during import:

------------
static_run {
    on_file_load(function ($file_ast) {
        // Do something with the AST of the file
        return $file_ast;
    });
}
------------

As noted above, I can think of numerous things that this could do, and as a flexible and far-reaching facility, I am sure many more things are possible that I have not considered. To give a few examples:

* Choose a database interface once instead of during every request.
* Check that the types defined in an ORM actually match the database.
* Inverting the above, programmatically generate types from a database table.
* Compile templating languages like Twig into PHP statically, eliminating runtime overhead.
* Convert syntactically pretty code into a more optimised form.
* Statically generate efficient code for mapping URLs to handler functions.
* Validate the usage of callback systems such as WordPress 'shortcodes'.
* Arbitrary code validation, such as to implement corporate programming standards.

==== Why not a preprocessor?
While things like this can be implemented as a preprocessor, I can see considerable advantages to implementing it as a native feature of the language itself. A big one is that it would be aware of the semantics of the language, like namespaces and scope; the lack of this awareness is a big downside of rudimentary preprocessors like the one in C/C++. Implementing it in the language runtime also eliminates the need for a build step, and means that everyone using the language has access to the same tools.

I also think that, given that these data structures already exist during compilation to bytecode, why not just give programmers access to them? This concept is not that unusual, and Python, for example, allows Python code to modify the AST of files as they are being loaded.

However, directly modifying the AST won't be very user friendly. Due to this, syntax could be created which allows the more common operations to be done more easily. Rust has a macro system that is based on this kind of idea, and JAI has recently introduced something comparable.

While it should be obvious from the above, I am not talking about macros in the C sense. These should be 'hygienic macros'.

==== How it runs

On the web, compile time code is run during preloading. When running PHP code at the CLI, compile time code could just be run every time, before run time code. Caching the opcodes in a file, and automatically detecting changes and recompiling as Python does, could be a worthwhile optimisation.

==== Inspirations

The general idea was inspired by the in-development programming language JAI, which has full compile time execution. Literally, the entire programming language can be run at compile time with very few restrictions.
See the following two videos for a demonstration of what it can do:

https://www.youtube.com/watch?v=UTqZNujQOlA
https://www.youtube.com/watch?v=59lKAlb6cRg&list=PLmV5I2fxaiCKfxMBrNsU1kgKJXD3PkyxO&index=20&t=0s

There is also a programming language called 'zig' that is based on similar ideas to JAI, and also has compile time execution. Unlike JAI, it has been released and is available to try today. My suggested syntax for static_run was inspired by zig.
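
==== Appendix: a rough sketch of the AST callback idea

To make the on_file_load idea a little more concrete, here is a minimal sketch of the same pattern. It is written in Python rather than PHP only because, as mentioned earlier, Python already lets code modify the AST of a file as it is loaded, so the sketch can actually be run. Nothing here is a proposed PHP API: the names on_file_load, load_module and FoldConstantAdds are made up for illustration, and the constant folding pass is just a stand-in for "convert syntactically pretty code into a more optimised form".

```python
import ast


class FoldConstantAdds(ast.NodeTransformer):
    """Hypothetical 'compile time' pass: fold `<int> + <int>` into one constant."""

    def visit_BinOp(self, node):
        # Transform the operands first, so nested additions fold bottom-up.
        self.generic_visit(node)
        if (
            isinstance(node.op, ast.Add)
            and isinstance(node.left, ast.Constant)
            and isinstance(node.right, ast.Constant)
            and isinstance(node.left.value, int)
            and isinstance(node.right.value, int)
        ):
            folded = ast.Constant(node.left.value + node.right.value)
            return ast.copy_location(folded, node)
        return node


def on_file_load(source, filename="<example>"):
    """Stand-in for the callback in the email: parse, rewrite, return the AST."""
    tree = ast.parse(source, filename)
    tree = FoldConstantAdds().visit(tree)
    ast.fix_missing_locations(tree)
    return tree


def load_module(source):
    """Compile the rewritten AST to bytecode and run it in a fresh namespace."""
    tree = on_file_load(source)
    namespace = {}
    exec(compile(tree, "<example>", "exec"), namespace)
    return tree, namespace


# The addition is folded before bytecode is generated, so the compiled
# code only ever sees a single constant.
tree, ns = load_module("TIMEOUT = 30 + 15 + 15\n")
print(type(tree.body[0].value).__name__, ns["TIMEOUT"])
```

The point of the sketch is only the shape of the pipeline: parse, hand the tree to user code, then compile whatever comes back. In the PHP proposal, the equivalent of load_module would happen once during preloading, with the resulting bytecode stored in opcache.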