Newsgroups: php.internals
Xref: news.php.net php.internals:36384
Date: Sun, 23 Mar 2008 15:41:51 +0100
From: helly@php.net (Marcus Boerger)
Reply-To: Marcus Boerger
To: Christian Seiler
CC: internals@lists.php.net
Subject: Re: [PHP-DEV] PATCH: Implementing closures in PHP (was: anonymous functions in PHP)
Message-ID: <1431660902.20080323154151@marcus-boerger.de>
In-Reply-To: <476D2854.5070803@gmx.net>
References: <98b8086f0712150818n40056cedyf0aae7a5a08a27b7@mail.gmail.com>
 <476582E6.7020808@zend.com>
 <200712172130.08216.larry@garfieldtech.com>
 <4FADC266-873E-4FD2-BEC8-28EA9D833297@procata.com>
 <476D2854.5070803@gmx.net>

Hello Christian,

I have put your proposal as a link to a PHP
GSoC 2008 idea here: http://wiki.php.net/gsoc/2008

Feel invited to add to this idea in whatever way you want :-)

marcus

Saturday, December 22, 2007, 4:08:04 PM, you wrote:

> Hi,

> I was following this thread and came upon Jeff's posting on how closures
> could be implemented in PHP.

> Since I would find the feature to be EXTREMELY useful, I decided to
> actually implement it more or less the way Jeff proposed. So, here's the
> patch (against PHP_5_3, I can write one against HEAD if you wish):

> http://www.christian-seiler.de/temp/closures-php-5-3.patch

> I started with Wez's patch for adding anonymous functions that aren't
> closures. I changed it to make sure no shift/reduce or reduce/reduce
> errors occur in the grammar. Then I started implementing the actual
> closure stuff. It was fun because I learned quite a lot about how PHP
> actually works.

> I had the following main goals while developing the patch:

> 1. Don't reinvent the wheel.
> 2. Don't break anything unless absolutely necessary.
> 3. Keep it simple.

> Jeff proposed a new type of zval that holds additional information about
> the function that is to be called. Adding a new type of zval would need
> changes throughout the ENTIRE PHP source and probably also throughout
> quite a few scripts. But fortunately, PHP already possesses a zval that
> supports the storage of arbitrary data while being very lightweight:
> Resources. So I simply added a new resource type that stores zend
> functions. Writing $var = function () {}; will now make $var a resource
> (of the type "anonymous function").

> Anonymous functions are ALWAYS defined at compile time, no matter where
> they occur. They are simply named __compiled_lambda_1234 and added to the
> global function table. But instead of simply returning the string
> '__compiled_lambda_1234', I introduced a new opcode that will create
> references to the correct local variables that are referenced inside the
> function.

> For example, if you have:

> $func = function () {
>   echo "Hello World\n";
> };

> This will result in an anonymous function called '__compiled_lambda_0'
> that is added to the function table at compile time. The opcodes for the
> assignment to $func will be something like:

> 1  ZEND_DECLARE_ANON_FUNC  ~0  '__compiled_lambda_0'
> 2  ASSIGN  !0, ~0

> The ZEND_DECLARE_ANON_FUNC opcode handler does the following:

> It creates a new zend_function, copies the contents of the entire
> structure of the function table entry corresponding to
> '__compiled_lambda_0' into that new structure, increments the refcount,
> registers it as a resource and returns that resource so it can be
> assigned to the variable.

> Now, have a look at a real closure:

> $string = "Hello World!\n";
> $func = function () {
>   lexical $string;
>   echo $string;
> };

> This will result in the same opcodes as above. But here, three additional
> things happen:

> 1. The compiler sees the keyword 'lexical' and stores the information
>    that a variable called 'string' should be used inside the closure.
> 2. The opcode handler sees that a variable named 'string' is marked as
>    lexical in the function definition. Therefore it creates a reference to
>    it in a HashTable of the COPIED zend_function (that will be stored in
>    the resource).
> 3. The 'lexical $string;' translates into a FETCH opcode that will work
>    in exactly the same way as 'static' or 'global' - only fetching it from
>    the additional HashTable in the zend_function structure.

> The resource destructor makes sure that the HashTable containing the
> references to the lexical variables is correctly destroyed upon
> destruction of the resource. It does NOT destroy other parts of the
> function structure because they will be freed when the function is
> removed from the global function table.

> With these changes, closures work in PHP.
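
For anyone trying the idea on a current PHP release: the 'lexical' keyword from the patch is not available there. The closures that eventually shipped in PHP 5.3 bind outer variables with a 'use' clause instead, and 'use (&$var)' binds by reference, which is the closest match to the reference that the new opcode stores in the copied zend_function. The sketch below restates the $string example under that assumption. It only illustrates the behaviour described above and is not the syntax or the implementation of the patch.

```php
<?php
// Sketch only: `use (&$string)` stands in for the proposed `lexical $string;`.
// Both bind the closure to the enclosing variable by reference.
$string = "Hello World!\n";

$func = function () use (&$string) {
    return $string;
};

echo $func();   // Hello World!

// Because the binding is a reference, a later change in the enclosing
// scope is visible inside the closure.
$string = "Goodbye!\n";
echo $func();   // Goodbye!

// In released PHP the closure value is an object of class Closure,
// whereas the patch discussed here stores it in a resource.
var_dump($func instanceof Closure);   // bool(true)
```

Dropping the ampersand ('use ($string)') would copy the value at the moment the closure is created, so the second call would still print "Hello World!".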

> Some caveats / bugs / todo:

>  * Calling anonymous functions by name directly is problematic if there
>    are lexical variables that need to be assigned. I added checks to
>    make sure this case does not happen.

>  * In the opcode handler, error handling needs to be added.

>  * If somebody removes the function from the global function table
>    (e.g. with runkit), the new opcode will return NULL instead of
>    a resource (error handling is missing). Since I do increment the
>    refcount of the zend_function, it SHOULD not cause segfaults or
>    memory leaks, but I haven't tested it.

>  * $this is kind of a problem, because all the fetch handlers in PHP
>    make sure $this is a special kind of variable. For the first version
>    of the patch I chose not to care about this because what still works
>    is e.g. the following:

>      $object = $this;
>      $func = function () {
>        lexical $object;
>        // do something
>      };

>    Also, inside the closures, the class context is not preserved, so
>    accessing private / protected members is not possible.

>    I'm not sure this actually represents a problem because you can
>    always use normal local variables to pass values between closure
>    and calling method and make the calling method change the properties
>    itself.

>  * I've had some problems with eval(), have a look at the following
>    code:

>      $func = eval ('return function () { echo "Hello World!\n"; };');
>      $func();

>    With plain PHP, this seems to work; with the VLD extension loaded
>    (that shows the opcodes), it crashes. I don't know if that's a
>    problem with eval() or just with VLD and I didn't have time to
>    investigate it further.

>  * Oh, yes, 'lexical' is now a keyword. Although I really don't think
>    that TOO many people use that as an identifier, so it probably won't
>    hurt THAT much.

> Except for the points above, it really works, even with complex stuff. Let
> me show you some examples:

> 1. Customized array_filter:

> function filter_larger ($array, $min = 42) {
>   $filter = function ($value) {
>     lexical $min;
>     return ($value >= $min);
>   };
>   return array_filter ($array, $filter);
> }

> $arr = array (41, 43);
> var_dump (filter_larger ($arr)); // 43
> var_dump (filter_larger ($arr, 40)); // 41, 43
> var_dump (filter_larger ($arr, 44)); // empty

> 2. Jeff's example:

> function getAdder($x) {
>   return function ($y) {
>     lexical $x;
>     return $x + $y;
>   };
> }

> $plusFive = getAdder(5);
> $plusTen = getAdder(10);

> echo $plusFive(4)."\n"; // 9
> echo $plusTen(7)."\n"; // 17

> 3. Nested closures

> $outer = function ($value) {
>   return function () {
>     lexical $value;
>     return $value * 2;
>   };
> };

> $duplicator = $outer (4);
> echo $duplicator ()."\n"; // 8
> $duplicator = $outer (8);
> echo $duplicator ()."\n"; // 16

> [Ok, yeah, that example is quite stupid and should NOT be used as an
> example for good code. ;-) But it's simple and demonstrates the
> possibilities.]

> It would be great if somebody could review the patch because I'm sure
> some parts can still be cleaned up or improved. And it would be even
> better if this feature would make it into PHP. ;-)

> Regards,
> Christian

> PS: I'd like to thank Derick Rethans for his GREAT Vulcan Logic
> Disassembler - without it, development would have been a LOT more painful.

> PPS: Oh, yeah, if it should be legally necessary, I grant the right to
> anybody to use this patch under any OSI certified license you may want
> to choose.

Best regards,
 Marcus