Newsgroups: php.internals
Xref: news.php.net php.internals:119906
Date: Tue, 11 Apr 2023 12:16:18 +0100
From: george.banyard@gmail.com ("G. P. B.")
To: Rowan Tommins
Cc: internals@lists.php.net
Subject: Re: [PHP-DEV] [RFC] New core autoloading mechanism with support for function autoloading
In-Reply-To: <6FA5AA43-9738-4400-8D83-38FC2D464B82@gmail.com>

On Tue, 11 Apr 2023 at 09:48, Rowan Tommins wrote:

> > As a result of the pinning, function_exists() will return true for
> > functions in namespaces that have been pinned to a global function.
>
> The lookup pinning seems like the right way to optimise the
> implementation, but I don't think it should be visible to user code like
> this - a lookup for "strlen" in namespace "Foo" is not the same as defining
> the function "Foo\strlen". Consider this code:
>
> namespace Bar {
>     if ( function_exists('Foo\strlen') ) {
>         \Foo\strlen('hello');
>     }
> }
> namespace Foo {
>     strlen('hello'); // triggers name pinning
> }
> namespace Bar {
>     if ( function_exists('Foo\strlen') ) {
>         \Foo\strlen('hello');
>     }
> }
>
> If I'm reading the RFC correctly, the second function_exists will return
> true. I'm less clear if the call to \Foo\strlen will actually succeed - if
> it gives "undefined function", then function_exists is clearly broken; if
> it calls the global strlen(), that's a very surprising side effect.

It *should* indeed call the global strlen(), but I hadn't actually created
such a test as ... I hadn't thought of doing that.
However, we *already* do function pinning, which can result in this
behaviour via the function cache. See the following bug, which defines a
new function via eval():
https://bugs.php.net/bug.php?id=64346

I am not sure that calling the global strlen() is that surprising, as it
is basically aliasing the function \Foo\strlen() to \strlen().

> For bonus points, the call to strlen that triggers pinning could be inside
> an autoloader, making even the first function_exists call return true.

If the autoloader is in the Foo namespace, then yes.
However, if the autoloader is in Bar, then only Bar\strlen() is being
aliased to \strlen().

Ilija mentioned something off-list that I hadn't considered: this could
lead to a large increase in the number of symbols defined in the function
symbol table, as every unqualified call (that is, one that is neither
imported with a "use" statement nor written as a fully qualified name)
will get aliased to a global function and take an entry in the symbol
table.
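
For reference, here is a minimal sketch of the baseline that the pinning discussion above is measured against: how current PHP, without the RFC, treats an unqualified call in a namespace. The variable names are made up for illustration, and nothing here exercises the RFC's pinning itself.

```php
<?php
namespace Foo;

// Unqualified call: no Foo\strlen() is defined, so PHP falls back
// to the global \strlen() at runtime.
$length = strlen('hello');

namespace Bar;

// Without the RFC, that fallback does not define Foo\strlen,
// so only the global function is reported as existing.
$existsInFoo = \function_exists('Foo\strlen');
$existsGlobally = \function_exists('strlen');
```

Under the RFC as described above, the first check would instead be expected to report true once the call in Foo has been pinned.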

> Similarly, I think it should be possible to "unpin" a function lookup with
> a later definition, even if no autoloading would be triggered. That is,
> this should not be a duplicate definition error:
>
> namespace Foo;
> if ( strlen('magic') != 42 ) {
>     function strlen($string) { /* ... */ }
> }

There are some larger technical issues at play, as mentioned in the bug
linked above. The function cache will pin the call and there is no way of
unpinning it.
I tried looking into fixing this, but it turns out to be too complicated
(/impossible?).
Moreover, disabling the function cache incurs a massive performance
penalty.
As such, the RFC follows the current de facto behaviour.

> > The use of the word class in the API is currently accurate
>
> This isn't actually true: classes, interfaces, traits, and enums all share
> a symbol table, and thus an autoloader. I don't know of a good name for
> this symbol table, though.

They do indeed share a symbol table, but "class" is probably the least
confusing name for it.

> Regarding the API, would it be possible to take advantage of nearly all
> autoloaders only being interested in particular namespace prefixes?
>
> Currently, every registered autoloader is run for every lookup, and most
> immediately check the input for one or two prefixes, and return early if
> not matched. I suspect this design is largely because autoloading came
> before namespaces, so the definition of "prefix" wasn't well-defined, but
> going in and out of userland callbacks like this is presumably rather
> inefficient.
>
> Perhaps the "register" functions should take an optional list of namespace
> prefixes, so that the core implementation can do the string comparison, and
> only despatch to the userland code if the requested class/function name
> matches.

That is actually interesting; I hadn't thought about taking an array of
prefixes.
And yes, every callback call requires a VM re-entry, which is expensive.
Should the prefix be with or without the trailing backslash?
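
To make the prefix idea and the trailing-backslash question concrete, here is a rough userland sketch. PrefixDispatcher and its methods are hypothetical names invented for this illustration; they are not part of the RFC or of PHP, and a core implementation would do the comparison in C rather than in userland. It assumes PHP 8.0 or later for str_starts_with().

```php
<?php
declare(strict_types=1);

/** Hypothetical stand-in for prefix-aware registration; not an RFC API. */
final class PrefixDispatcher
{
    /** @var list<array{prefixes: list<string>, loader: callable}> */
    private array $entries = [];

    /** Register a loader that only wants names under the given prefixes. */
    public function register(callable $loader, array $prefixes): void
    {
        $this->entries[] = ['prefixes' => $prefixes, 'loader' => $loader];
    }

    /** Call only the loaders with a matching prefix; return how many ran. */
    public function dispatch(string $name): int
    {
        $calls = 0;
        foreach ($this->entries as $entry) {
            foreach ($entry['prefixes'] as $prefix) {
                if (str_starts_with($name, $prefix)) {
                    ($entry['loader'])($name);
                    $calls++;
                    break; // one matching prefix per loader is enough
                }
            }
        }
        return $calls;
    }
}

$seen = [];
$dispatcher = new PrefixDispatcher();

// Prefix with a trailing backslash: matches only names inside Foo.
$dispatcher->register(function (string $name) use (&$seen): void {
    $seen[] = "strict:$name";
}, ['Foo\\']);

// Prefix without it: also matches sibling namespaces such as Foobar.
$dispatcher->register(function (string $name) use (&$seen): void {
    $seen[] = "loose:$name";
}, ['Foo']);

$dispatcher->dispatch('Foo\\Bar');     // both loaders run
$dispatcher->dispatch('Foobar\\Baz');  // only the "loose" loader runs
$dispatcher->dispatch('Other\\Thing'); // no loader runs
```

With a plain string comparison like this one, a prefix without the trailing backslash also catches unrelated namespaces that merely start with the same characters, which is one argument for either requiring the backslash or having the implementation append it.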

Best regards,

George P. Banyard