Newsgroups: php.internals
Xref: news.php.net php.internals:108561
Date: Fri, 14 Feb 2020 09:48:17 +0100
From: nikita.ppv@gmail.com (Nikita Popov)
To: Larry Garfield
Cc: php internals
Subject: Re: [PHP-DEV] [RFC] token_get_all() TOKEN_AS_OBJECT mode
References: <466bb718-4513-4a87-81e9-295ad3983443@www.fastmail.com>
In-Reply-To: <466bb718-4513-4a87-81e9-295ad3983443@www.fastmail.com>
MIME-Version: 1.0
Content-Type: multipart/alternative; boundary="0000000000003f9ae0059e854573"

--0000000000003f9ae0059e854573
Content-Type: text/plain; charset="UTF-8"

On Thu, Feb 13, 2020 at 6:06 PM Larry Garfield wrote:

> On Thu, Feb 13, 2020, at 3:47 AM, Nikita Popov wrote:
> > Hi internals,
> >
> > This has been discussed a while ago already, now as a proper proposal:
> > https://wiki.php.net/rfc/token_as_object
> >
> > tl;dr is that it allows you to get token_get_all() output as an array of
> > PhpToken objects. This reduces memory usage, improves performance, makes
> > code more uniform and readable... What's not to like?
> >
> > An open question is whether (at least to start with) PhpToken should be
> > just a data container, or whether we want to add some helper methods to
> > it.
> > If this generates too much bikeshed, I'll drop methods from the proposal.
> >
> > Regards,
> > Nikita
>
> I love everything about this.
>
> 1) I would agree with Nicolas that a static constructor would be better.
> I don't know about polyfilling it, but it's definitely more
> self-descriptive.
>
> 2) I'm skeptical about the methods. I can see them being useful, but also
> being bikeshed material. For instance, if you're doing annotation parsing
> then docblocks are not ignorable. They're what you're actually looking for.
>
> Two possible additions, feel free to ignore if they're too complicated:
>
> 1) Should it return an array of token objects, or a lazy iterable? If I'm
> only interested in certain types (eg, doc strings, classes, etc.) then a
> lazy iterable would allow me to string some filter and map operations on to
> it and use even less memory overall, since the whole tree is not in memory
> at once.
>

I'm going to take you up on your offer and ignore this one :P Returning
tokens as an iterator is inefficient because it requires full lexer state
backups and restores for each token. Could be optimized, but I wouldn't
bother with it for this feature. I also personally have no use-case for a
lazy token stream. (It's technically sufficient for parsing, but if you
want to preserve formatting, you're going to be preserving all the tokens
anyway.)

> 2) Rather than provide bikesheddable methods, would it be feasible to take
> a cue from PDO and let users specify a subclass of PhpToken to fetch
> into? That way the properties are always there, but a user can attach
> whatever methods make sense for them.
>

It would be technically feasible. If we go with a static method for
construction, then one might even say that there's reasonable expectation
that PhpToken::getAll(...) is going to return PhpToken[] and
MyPhpTokenExtension::getAll() is going to return MyPhpTokenExtension[].
I'm a bit apprehensive about this though, specifically because you mention
PDO...
which, I think, isn't exactly a success story when it comes to this. If we
do this, then the behavior would be that the object gets created, the
properties populated, and *no constructor gets called*. The last part is
important -- when you start calling constructors and magic methods, that's
where the mess starts and you get PDO.

Regards,
Nikita

--0000000000003f9ae0059e854573--
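
A minimal sketch of the "fetch into a subclass" idea discussed above, not
code from the thread. It assumes the API as it later shipped in PHP 8.0,
where the static method is named PhpToken::tokenize() rather than the
getAll() used in this message. The subclass name DocCommentAwareToken and
its isDocComment() helper are hypothetical, chosen only to illustrate the
annotation-parsing case where docblocks are the tokens of interest.

```php
<?php
// Requires PHP 8.0+ with the tokenizer extension.

// Hypothetical subclass: the name and the helper are illustrative assumptions.
class DocCommentAwareToken extends PhpToken
{
    // User-defined helper: for annotation parsing, doc comments are the
    // target rather than something to skip.
    public function isDocComment(): bool
    {
        return $this->id === T_DOC_COMMENT;
    }
}

$code = "<?php\n/** @Route(\"/\") */\nfunction index() {}\n";

// Calling the static method on the subclass yields subclass instances,
// so the inherited properties (id, text, line, pos) are always present
// and the user-defined helper is available on every token.
$tokens = DocCommentAwareToken::tokenize($code);

// Keep only the doc comments; array_values() reindexes the result.
$docComments = array_values(array_filter(
    $tokens,
    fn (DocCommentAwareToken $t): bool => $t->isDocComment()
));

foreach ($docComments as $token) {
    echo $token->line, ': ', $token->text, PHP_EOL;
}
```

The subclass adds behavior only and declares no constructor, which keeps it
in line with the concern raised above about constructors and magic methods
running during object creation.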