Newsgroups: php.internals
Xref: news.php.net php.internals:109211
Date: Sun, 22 Mar 2020 16:36:48 -0400
From: mike@newclarity.net (Mike Schinkel)
To: Ben Ramsey
Cc: PHP internals
Subject: Re: [PHP-DEV] Are PECL modules preferable?
Message-ID: <2EF1EEB5-FF1A-4494-BD01-A7E90153A50F@newclarity.net>

> On Mar 21, 2020, at 7:15 PM, Ben Ramsey wrote:
>
>> On Mar 21, 2020, at 17:52, Mike Schinkel wrote:
>>
>>> On Mar 21, 2020, at 5:59 PM, tyson andre wrote:
>>> FROM: Re: [PHP-DEV] [RFC] is_literal()
>>>
>>> And if it can be implemented as a PECL module, that would be more preferable to me than a core module of php.
>>> If it was in core, having to support that feature may limit optimizations or implementation changes that could be done in the future.
>>
>> Just wanted to address this comment which was made on another thread (I did not want to hijack that thread.)
>>
>> A large number of PHP users have no control over the platform they run on, so the option to use PECL modules is a non-starter for them.
>>
>> Here are several of those managed hosting platforms I speak of. Collectively they host a large number of WordPress sites, and Pantheon also host Drupal sites:
>>
>> https://pagely.com/
>> https://wpvip.com/
>> https://wpengine.com/
>> https://kinsta.com/
>> https://pantheon.io/
>>
>> Given that, if there is an option between a useful feature being added to core or left in PECL, I would vote 100% of the time for core, since working with WordPress on a corporate site I can rarely ever use PECL extensions.
>>
>> #fwiw
>>
>> -Mike
>
>
> If at all possible, I advocate for implementing in userland.

I disagree with this in many cases; more on that below.

> Of course, the specific is_literal/taint feature is special in this regard -- it can’t be implemented in userland.

Well, I broke off this thread on its own to decouple it from the is_literal/taint feature, although I agree that feature is not a userland option.

> IMO, PECL is an antiquated system that needs a successor, in much the same way Composer is the successor to PEAR. I think there are folks working on a solution for this, but I’m not sure where they are in their efforts. If we could make extensions as easy to package, distribute, and install (and load without root privileges) as Composer packages are, then I think we could say that PECL extensions are preferable.

Totally agree with all of that.

> Maybe FFI can help in this regard?

I think that is not a viable solution because FFI can run unsafe code and can be disabled in php.ini.
Given the former, most (all?) managed hosts will disable FFI because, as PHP.net says on https://www.php.net/manual/en/intro.ffi.php:

"Caution: FFI is dangerous, since it allows to interface with the system on a very low level. The FFI extension should only be used by developers having a working knowledge of C and the used C APIs. To minimize the risk, the FFI API usage may be restricted with the ffi.enable php.ini directive."

I would love to see forward movement on this. I think what is needed for a binary extension that could be loaded in userland is some language or tool that can generate guaranteed-safe extensions, and one that will provide a significant performance gain.

So what are the potential options? These are the ones that come to mind:

1. Create a safe language similar to PHP but that uses LLVM to compile down to a binary form that could be loaded in userland. There are numerous existing languages currently in development that might be roped into becoming this language. Jai, which Robert Hickman mentioned, comes to mind: https://inductive.no/jai/
2. Create a safe language similar to PHP but that compiles down to C source code that can then be compiled to create a binary form that could be loaded in userland. Zephir comes to mind as a language that might be co-opted for this (zephir-lang.com), and you, Ben, were the one to mention this to me.
3. Explore Rust to see if there are ways to leverage its safety by limiting it to only safe features, compiled to create a binary form that can be loaded in userland (I have no clue if this is even possible.)
4. Explore GoLang to see how to leverage it to create a binary form that can be loaded in userland.
5. Look for another language that can be co-opted to use for creating safe binaries. Maybe a less well-known language where the language's team would be motivated to implement the specific features PHP would need to make this a reality (Julia?)
6. Embrace WebAssembly (WASM) as the extensibility mechanism for PHP. If we do, then people can use many compiled languages to create WASM binary files, such as C, C++, D, Rust, GoLang, Julia, etc. And there is already an extension to handle WASM in PHP: https://github.com/wasmerio/php-ext-wasm
7. And finally, I am sure there are other solutions that did not come to mind that someone else has considered?

BTW, I don't think we can offer a "safe" extensibility method that has the ability to manipulate PHP on the low level, unless of course PHP offered safe lower-level APIs. So is_literal()/is_taint() might still be off the table here for one of these extension mechanisms. But I could be wrong here, and if so I hope someone will explain.

------

That said, of all the options that came to mind, using WASM as a replacement for PECL seems like it would be the best solution, because:

1. WASM is a W3C Recommendation
2. WASM is designed to be safe: https://webassembly.org/docs/security/
3. WASM has a package manager (wapm.io), so there are/will be a lot of existing WASM packages for use in PHP
4. WASM compiles to a binary, so could be uploaded to a server and loaded as a binary by PHP
5. As noted above, many languages can create WASM: github.com/appcypher/awesome-wasm-langs

As for performance, WASM is probably faster than PHP, but we'll need benchmarks to know how much faster.

So what do others think? Should we consider adding WASM support to PHP?

Maybe this would be a good example of implementing a proof-of-concept container to let people try it out and see what they think?

> In the meantime, I agree with you that general-use language features that cannot be implemented in userland can serve the community best in the core, rather than in PECL, but their general utility will need to be weighed against their impact to the engine (i.e., if a feature slows down the engine, we can’t put it into core).

Agreed.
If the feature slows down the engine, then yes, it is problematic to put it into core.

Of course, we should guard against a small subset of people who oppose a useful feature pointing to a naive implementation that slows down core as an argument against the feature, when it is possible that an intelligent implementation can exist w/o affecting performance.

------

And finally, as promised above, I disagree that everything should be pushed to userland *whenever possible*.

I believe there is a great benefit to standardization of functionality. The benefits of said standardization include:

1. Avoids reinventing the wheel. While developers generally adopt core language features, they often reinvent the wheel in userland, for reasons such as lack of awareness of existing solutions, not-invented-here syndrome, dislike of the API, or company dictates to use only approved external code. This reinvention causes fragmentation and can result in a massive duplication of effort.
2. Minimizes training/learning required. A developer learns the core feature and they are done. In userland they will have to be trained on or learn the feature anew every time they use a different codebase or package that implements it differently.
3. Allows commonality in articles, tutorials and documentation. If a standard feature exists, then it can be used in any article, tutorial or documentation without necessarily having to show and explain a userland implementation. This helps to level up the skill of the average PHP developer simply because more and better learning material will naturally become available.
4. Minimizes dependency hell and/or duplicated code to maintain. If a feature is in core, then any PHP package can use it w/o having to bring in yet another dependency and/or maintain yet another userland implementation.
5. Empowers future developers to solve greater problems.
Standardized functionality empowers future developers to solve new problems rather than continue to recreate solutions and/or manage the dependencies where others solved them. Think of the biblical Tower of Babel and how it stalled progress once everyone was speaking a different language.
6. The nature of programming languages. Programming languages are tools that embrace and simplify well-known patterns so we can solve harder and harder problems. If everyone in the past had held the opinion that code should be implemented in userland if at all possible, then we would all still be coding everything in assembly language. So when we identify a common pattern, I believe it should be moved out of userland and into the core.

In closing, I agree that the 80+% of functionality that is used < 20% of the time can and should stay in userland. But everything else (the ~20% of functionality that is used 80+% of the time, such as str_contains()) should be moved out of userland and into PHP core. IMO.

#jmtcw #fwiw

>
> Cheers,
> Ben
>

-Mike