Newsgroups: php.internals
From: nikita.ppv@gmail.com (Nikita Popov)
Date: Fri, 14 Feb 2020 17:32:58 +0100
Subject: Re: [PHP-DEV] [RFC] token_get_all() TOKEN_AS_OBJECT mode
To: Sara Golemon
Cc: PHP internals

On Fri, Feb 14, 2020 at 2:38 PM Sara Golemon wrote:

> On Thu, Feb 13, 2020 at 3:48 AM Nikita Popov wrote:
>
>> This has been discussed a while ago already, now as a proper proposal:
>> https://wiki.php.net/rfc/token_as_object
>>
>> tl;dr is that it allows you to get token_get_all() output as an array of
>> PhpToken objects. This reduces memory usage, improves performance, makes
>> code more uniform and readable... What's not to like?
>>
>> An open question is whether (at least to start with) PhpToken should be
>> just a data container, or whether we want to add some helper methods to
>> it.
>> If this generates too much bikeshed, I'll drop methods from the proposal.
>
> Yep.
> I remember bringing this up in 2016 and there was generally good
> reception to it from you, Andrea, Stas, and Stig at the very least. Why
> isn't it in? It got derailed by some bike shed colorizing and a little bit
> of workplace drama then dropped on the floor.
>
> Thanks for picking it up, and I agree with your response to Larry. As
> nice as it would be to lazy iterate, the scanner is just in NO shape to
> tolerate reentering userspace and potentially reinvoking the scanner before
> the first run through is done.
>
> I also agree that being able to subclass the token would be great, but
> that PDO made some mistakes here and we can do that as a separate addition
> later on if there's not consensus now.
>
> I'm +1 for NOT overloading the token_get_all() function, but rather
> putting a static factory method on the PhpToken class (or whatever you call
> it). When we add subclassability, we can always add additional statics if
> our initial signature doesn't work out.

As there seems to be a pretty clear consensus on this point at least, I've
updated the proposal to add the static factory method.

> I'm not clear why you want to final the base constructor. IMO we populate
> the fields on object creation, then invoke constructor (which is a no-op in
> the base class). Later uses of subclassing can deal with the properties
> (or not) at that time, in their own constructor, delegating (or not) to the
> base.

Populating the fields on creation and then calling the constructor is the
whole problem with PDO, because that's not how PHP usually does
construction. Normally there is either no constructor and properties are
explicitly populated, or there is a constructor, in which case only the
constructor is called. Populating properties first and then calling a
constructor is something that essentially only PDO does.
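
The difference in ordering can be made concrete with a small sketch. It does
not use PDO itself. It assumes a hypothetical class Row and imitates the
"populate first, construct second" order (what PDO::FETCH_CLASS does unless
PDO::FETCH_PROPS_LATE is given) using ReflectionClass, so that it runs without
a database:

```php
<?php
// Hypothetical class, used only for illustration; not part of PDO or the RFC.
class Row
{
    public $id = 0;
    public $seenInConstructor;

    public function __construct()
    {
        // Record the value of $id that the constructor can observe.
        $this->seenInConstructor = $this->id;
    }
}

// Usual PHP semantics: the constructor runs on a fresh object,
// and properties are assigned afterwards.
$usual = new Row();
$usual->id = 42;

// PDO-style hydration, simulated: create the object without running the
// constructor, populate the properties, and only then call the constructor.
$class = new ReflectionClass(Row::class);
$pdoStyle = $class->newInstanceWithoutConstructor();
$pdoStyle->id = 42;
$pdoStyle->__construct();

var_dump($usual->seenInConstructor);    // int(0)
var_dump($pdoStyle->seenInConstructor); // int(42)
```

In the second case the constructor sees already-populated state, which is the
surprise a subclass author would have to account for.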

To clarify a bit what I meant with the final constructor: As part of the
last update, I've added the following constructor:

    class PhpToken {
        public function __construct(int $id, string $text, int $line = -1, int $pos = -1);
    }

This constructor will initialize the corresponding properties. Now, the
behavior that would make most sense to me (if extension of the class is
allowed) is that MyPhpToken::getAll() is going to create the new tokens
based on "new MyPhpToken($id, $text, $line, $pos)".

If we mark the constructor final, then we could hardcode the construction
behavior of the base class without introducing any kind of weird rules, it
would be just the usual language semantics. If we don't make the
constructor final, then we would have to actually call it (if it is
overridden -- otherwise we can use more optimized initialization code). We
can do that (I believe calling user code here should be perfectly safe --
the lexer is reentrant after all), it's just extra complexity and I'm not
sure it's actually useful.

The final constructor would still allow a) adding methods in the child
class and b) adding properties with default values in the child class,
which seems like it should cover most of the usefulness.

So I think the options here are:

a) Make PhpToken final and simply don't support this.
b) Make PhpToken::__construct() final and (very easily) support basic
   extension.
c) Make PhpToken::__construct() non-final and support full extension, with
   some extra effort.

Regards,
Nikita
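
For illustration, option (b) might look roughly like the following in
userland. This is a sketch under assumptions, not the RFC's implementation:
SketchToken stands in for PhpToken, fromTuples() is a hypothetical factory
that takes pre-lexed data instead of running the lexer, and the DEMO_* ids are
made up. Only the constructor signature is taken from the message above.

```php
<?php
// Made-up token ids so the sketch does not depend on the tokenizer extension.
const DEMO_OPEN_TAG = 1;
const DEMO_ECHO = 2;

// Stand-in for the proposed PhpToken class (hypothetical name).
class SketchToken
{
    public $id;
    public $text;
    public $line;
    public $pos;

    // Final: child classes cannot change how tokens are constructed.
    final public function __construct(int $id, string $text, int $line = -1, int $pos = -1)
    {
        $this->id = $id;
        $this->text = $text;
        $this->line = $line;
        $this->pos = $pos;
    }

    // Hypothetical factory. "new static" creates instances of the class the
    // method was called on, using the ordinary constructor call.
    public static function fromTuples(array $tuples): array
    {
        $tokens = [];
        foreach ($tuples as [$id, $text, $line, $pos]) {
            $tokens[] = new static($id, $text, $line, $pos);
        }
        return $tokens;
    }
}

// A child class can still add methods and properties with default values.
class MyToken extends SketchToken
{
    public $visited = false;

    public function isOpenTag(): bool
    {
        return $this->id === DEMO_OPEN_TAG;
    }
}

$tokens = MyToken::fromTuples([
    [DEMO_OPEN_TAG, '<?php ', 1, 0],
    [DEMO_ECHO, 'echo', 1, 6],
]);

var_dump($tokens[0] instanceof MyToken); // bool(true)
var_dump($tokens[0]->isOpenTag());       // bool(true)
var_dump($tokens[1]->visited);           // bool(false)
```

Because the constructor is final, the factory can rely on every subclass
accepting the same four arguments, which is what would let an internal
implementation skip the call and initialize the properties directly.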