Newsgroups: php.internals
Date: Thu, 19 Feb 2015 00:18:19 +0400
To: Nikita Popov
Cc: Alexander Lisachenko, Nikita Popov, PHP internals list
Subject: Re: [PHP-DEV] [RFC][Discussion] Parser extension API
From: dmitry@zend.com (Dmitry Stogov)

On Wed, Feb 18, 2015 at 11:00 PM, Nikita Popov wrote:

> On Wed, Feb 18, 2015 at 8:22 PM, Dmitry Stogov wrote:
>
>> I think the AST API shouldn't use "public" properties.
>> Using them, we will have to construct the whole tree of objects,
>> duplicating information from the AST.
>> I would propose the SimpleXML approach instead - construct an object only
>> for the node(s) we currently access.
>>
>> So at first we will construct just a single object referring to the AST
>> root. Then, traversing it, we will create and destroy objects for the
>> necessary nodes.
>>
>> To access children, I would propose implementing the ArrayAccess interface.
>>
>>
>> $ast = \php\ast\parse($string);
>> foreach ($ast as $child) {
>>     echo "\t" . $child->getKindName() . "\n";
>>     foreach ($child as $grandchild) {
>>         echo "\t\t" . $grandchild->getKindName() . "\n";
>>     }
>> }
>>
>> Thanks. Dmitry.
>>
>
> I've considered using this approach, but decided against it because it
> introduces a lot of "magic" for unclear gains. Lazily creating the objects
> means we have to keep around the entire AST arena while at least one node
> referencing it is alive. So if you're analyzing a large project with a few
> thousand files and keep around a few nodes in some cases, you end up not
> just with the memory usage of those nodes, but with the memory usage of the
> entire ASTs. Furthermore, in many cases you'll just traverse the entire AST,
> in which case you'll need to instantiate all nodes anyway and the lazy
> instantiation will only hurt.
>
> In any case, performance of constructing a full AST is pretty good - I
> don't remember the exact numbers, but ast\parse_code() is only slightly
> slower than doing token_get_all() on the same code.
>

I think the overall memory usage should be lower, because at each point you'll
most probably keep only a few AST objects, but I understand your arguments.
They may make sense, but it's hard to say which is better without comparing
implementations.

Thanks. Dmitry.

>
> Nikita
>
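
For readers following the thread, the lazy-wrapper idea discussed above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the RFC's or any extension's actual API: the class name LazyNode, the nested-array "arena", and the kind labels are all hypothetical, and the sketch assumes PHP 8.0 or later. It adds IteratorAggregate next to ArrayAccess because foreach needs a Traversable. Note that every wrapper holds the whole arena, which is the memory concern raised in the reply.

```php
<?php
// Hypothetical sketch: none of these names come from the RFC or from any
// real extension. A nested array stands in for the engine's internal AST.

final class LazyNode implements ArrayAccess, IteratorAggregate, Countable
{
    /** Number of wrapper objects created so far (illustration only). */
    public static int $created = 0;

    private array $arena; // the whole tree, shared by every wrapper
    private array $path;  // child indexes leading from the root to this node

    public function __construct(array $arena, array $path = [])
    {
        $this->arena = $arena;
        $this->path = $path;
        self::$created++;
    }

    /** Walk the arena from the root to the raw node this wrapper refers to. */
    private function raw(): array
    {
        $node = $this->arena;
        foreach ($this->path as $index) {
            $node = $node['children'][$index];
        }
        return $node;
    }

    public function getKindName(): string
    {
        return $this->raw()['kind'];
    }

    public function count(): int
    {
        return count($this->raw()['children']);
    }

    public function offsetExists(mixed $offset): bool
    {
        $children = $this->raw()['children'];
        return isset($children[$offset]);
    }

    /** A child wrapper is created only at the moment it is accessed. */
    public function offsetGet(mixed $offset): mixed
    {
        if (!$this->offsetExists($offset)) {
            throw new OutOfBoundsException("No child at offset $offset");
        }
        return new self($this->arena, array_merge($this->path, [$offset]));
    }

    public function offsetSet(mixed $offset, mixed $value): void
    {
        throw new LogicException('This sketch is read-only');
    }

    public function offsetUnset(mixed $offset): void
    {
        throw new LogicException('This sketch is read-only');
    }

    public function getIterator(): Generator
    {
        foreach (array_keys($this->raw()['children']) as $index) {
            yield $index => $this[$index];
        }
    }
}

// Made-up tree; the kind labels are illustrative only.
$arena = [
    'kind' => 'STMT_LIST',
    'children' => [
        ['kind' => 'ECHO', 'children' => [
            ['kind' => 'CONST', 'children' => []],
        ]],
        ['kind' => 'RETURN', 'children' => []],
    ],
];

$ast = new LazyNode($arena);
foreach ($ast as $child) {
    echo "\t" . $child->getKindName() . "\n";
    foreach ($child as $grandchild) {
        echo "\t\t" . $grandchild->getKindName() . "\n";
    }
}
```

In this sketch a wrapper exists only for nodes that are actually visited, but keeping any single wrapper alive also keeps the full $arena array alive, so it illustrates both sides of the trade-off rather than settling it.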