Newsgroups: php.internals
Date: Tue, 3 Mar 2015 16:12:25 +0000
To: Alexander Lisachenko
Cc: Sara Golemon, Nikita Popov,
  PHP internals list
Content-Type: text/plain; charset=UTF-8
Subject: Re: [PHP-DEV] [RFC][Discussion] Parser extension API
From: leight@gmail.com (Leigh)

On 3 March 2015 at 11:56, Alexander Lisachenko wrote:
> Good morning!
>
> I have cleaned https://wiki.php.net/rfc/parser-extension-api and restricted
> its scope only to the parsing API. Extension API can be implemented later

+1

> on top of
> https://github.com/php/php-src/commit/1010b0ea4f4b9f96ae744f04c1191ac228580e48
> and current implementation, because it requires a lot of discussion and can
> not be implemented right now.

I had no idea that zend_ast_process was such a recent addition, and part of
your proposal. I've already started using it, completely independently, in
one of my extensions!

> 1. Should each node type be represented as personal class?
> There are two possible ways: single node class for everything (current
> proposal) and separate class for every node kind. I have made a quick
> research of AST API in several languages and noticed, that most of them
> generate AST class nodes automatically. E.g. UnaryOperationNode,
> StatementNode... This way is cool, but it requires a lot of classes to be
> loaded and available as API. Pros: SRP of each node class, AST validators
> (can be described with XML and DTD), more clear code analysis (checks with
> $node instanceof StatementNode), typehints for concrete nodes for visitors.
> However, PHP is dynamic language and we can use only one class for
> everything, adjusting `kind` property for each node. Pros: single class,
> easier to maintain and to use. Cons: bad UX (how to validate nodes, how to
> determine number of available children, how to check is node using flags or
> not, etc)

I think we need to at least represent all of the current node structures:
a common base class, and then classes for lists, zvals and decls that
extend from this base.

> 2. Where metadata should be stored (flags, names of kind nodes, relation
> between node types)? This information will be needed later for validation
> of AST
>
> Nikita has some thoughts for the future :) So he asked about the storage
> of metadata to validate an AST and to perform some analysis on it. Metadata
> should include the following: name of each node kind (this can be just a
> class name of node or constants in the class), node restrictions (which
> kind of node types can be used as children for concrete node; number of
> children nodes), node flag names (it's PUBLIC, PROTECTED, PRIVATE, etc)

Thinking for the future is fine, but do we need this metadata for the
current proposal? Is the AST returned by the parser read-only, or can users
create their own nodes + children and get pretty-printed output? If it's
the latter, then we obviously need to know the restrictions.

I think we need a mechanism that keeps names/numbers in sync automatically.
Maybe we can use some macros to generate the enums and the userland-facing
details at the same time, so we don't have to keep several places in sync
if/when new AST nodes are added.
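
To make the macro idea a bit more concrete, here is a minimal sketch of one way it could look, using the "X-macro" pattern: a single list of node kinds is expanded once into the enum and once into a name/child-count table. Everything here is hypothetical and for illustration only: the kind names, the child counts, and the helpers ast_kind_name and ast_kind_table_check are made up, and none of it is taken from php-src or from the RFC.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical node-kind list: one line per kind (identifier, child count).
 * The names and counts are illustrative only, not php-src's real values. */
#define AST_KINDS(X)     \
    X(AST_ZVAL,      0)  \
    X(AST_UNARY_OP,  1)  \
    X(AST_BINARY_OP, 2)  \
    X(AST_CLASS,     3)

/* Expansion 1: the internal enum. AST_KIND_COUNT ends up as the number of kinds. */
#define AS_ENUM(name, children) name,
typedef enum { AST_KINDS(AS_ENUM) AST_KIND_COUNT } ast_kind;

/* Expansion 2: the metadata that a userland-facing API could expose. */
typedef struct {
    const char *name;      /* stringified enum identifier */
    int num_children;      /* fixed number of child nodes */
} ast_kind_info;

#define AS_INFO(name, children) [name] = { #name, children },
static const ast_kind_info ast_kind_table[AST_KIND_COUNT] = { AST_KINDS(AS_INFO) };

/* Look up the name for a kind; out-of-range values map to "unknown". */
static const char *ast_kind_name(ast_kind kind)
{
    return ((unsigned)kind < (unsigned)AST_KIND_COUNT)
        ? ast_kind_table[kind].name
        : "unknown";
}

/* Sanity check: every enum value has a table entry with a name. */
static void ast_kind_table_check(void)
{
    for (int k = 0; k < AST_KIND_COUNT; k++) {
        assert(ast_kind_table[k].name != NULL);
        assert(ast_kind_table[k].num_children >= 0);
    }
}
```

With this layout, adding a node kind means adding one line to AST_KINDS; the enum value, its name and its child count all follow from that line, so there is only one place to edit.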