Newsgroups: php.internals
Xref: news.php.net php.internals:70237
Message-ID: <528CE64A.1020303@gmail.com>
Date: Wed, 20 Nov 2013 16:41:46 +0000
To: PHP Internals
Subject: Comments on non-unique naming convention for closures
From: ellison.terry@gmail.com (Terry Ellison)

The following bugs relate to this discussion:

  #64291 Indeterminate GC of evaled lambda function resources
  #65915 Inconsistent results with require return value

I don't want to discuss these or the specific fixes here, since I can
work up a fix and discuss them in the bugrep. However, my one-sentence
Q is "should we replace the naming convention for closures with a truly
unique one?"

What I would like is some feedback / guidance / discussion on the
general architectural issue which underlies these bugs.

* The associated closure functions are compiled during the
  INCLUDE_OR_EVAL opcode that included / eval'ed the originating PHP
  source, and are named according to the convention
  "\0{closure}$name$addr", where the name is the resolved filename of
  the source where the function was defined and the addr is the
  absolute address of the function definition within the
  memory-resident copy of the source during compilation. The
  compilation creates an entry in the CG(function_table) for this
  function.

* Closure objects are instantiated at runtime during execution of the
  ZEND_DECLARE_LAMBDA_FUNCTION opcode. This handler invokes the
  built-in CTOR for the closure to bind the corresponding function.

* Closure objects are destroyed by the closure DTOR in line with the
  usual PHP scoping rules.
* There is no specific deletion function for the CG(function_table)
  entries other than normal request rundown, because the compiler and
  RTS can't safely determine whether a closure function will be needed
  again for a further closure, e.g.

      foreach ($array as $i) { $f = function($x) { ... }; ... }

* The "\0{closure}$name$addr" scheme does provide a sort of dirty reuse:
  if another function is compiled with the same filename and at the
  same source address (which can happen for logically different
  variables due to storage reuse), then the current implementation
  simply overwrites the earlier function definition. Though the
  rationale for this isn't documented, I assume that this is because
  it is _usually_ safe to assume that the earlier copy is no longer
  required. However, it is possible to construct cases where this
  assumption is false, especially in the case of OPcached functions,
  where the name can persist for the life of the SMA rather than just
  one request.

What I am suggesting is that this convention should be changed to
something like "\0{closure}$name$ctime$index" or even just
"\0{closure}$uuid" using a standard UUID algorithm.

Reactions? Comments? The only side effect that I can see is that in
circumstances where the overwrite *does* replace stale functions, the
function DTOR will no longer run, leading to growth in the function
table with the associated memory overhead.

Thanks for any feedback
Terry
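
PS: to make the collision concrete, here is a minimal C sketch of the two
key shapes discussed above. It is an illustration only: the helper names
(closure_key_addr, closure_key_indexed, keys_equal), the "0x%lx" and
"$%lu" formatting, and the process-local counter are all assumptions of
mine and do not reproduce the actual Zend key-building code.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper mimicking the shape "\0{closure}$name$addr".
 * The key starts with a NUL byte, so its length is returned explicitly
 * instead of relying on strlen(). */
size_t closure_key_addr(char *buf, size_t cap,
                        const char *filename, unsigned long addr)
{
    int n;
    assert(cap > 1);
    buf[0] = '\0';
    n = snprintf(buf + 1, cap - 1, "{closure}%s0x%lx", filename, addr);
    assert(n >= 0 && (size_t)n < cap - 1);   /* no truncation */
    return (size_t)n + 1;
}

/* Assumed process-local counter standing in for "$index". */
unsigned long closure_counter = 0;

/* Hypothetical helper for the proposed "\0{closure}$name$ctime$index"
 * shape: every call consumes a fresh index, so two compilations never
 * share a key within this process. */
size_t closure_key_indexed(char *buf, size_t cap,
                           const char *filename, unsigned long ctime_val)
{
    int n;
    assert(cap > 1);
    buf[0] = '\0';
    n = snprintf(buf + 1, cap - 1, "{closure}%s$%lu$%lu",
                 filename, ctime_val, ++closure_counter);
    assert(n >= 0 && (size_t)n < cap - 1);   /* no truncation */
    return (size_t)n + 1;
}

/* Binary-safe comparison, needed because of the leading NUL byte. */
int keys_equal(const char *a, size_t alen, const char *b, size_t blen)
{
    return alen == blen && memcmp(a, b, alen) == 0;
}
```

Calling closure_key_addr() twice with the same filename and address
yields byte-identical keys, which is the overwrite case described above,
whereas two calls to closure_key_indexed() always differ. Note that a
per-process counter like this one would not by itself be unique across
processes sharing an SMA; that is one reason the "$uuid" variant may be
the safer choice.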