17 years ago by Christian Seiler

Hi,

Lukas Kahwe Smith asked me to put my proposal into the PHP wiki, which I
have done:

http://wiki.php.net/rfc/closures

I also have incorporated the last comment by troels knak-nielsen about
JS behaving the same way as my patch and thus not being that much of a
WTF at all (I somehow had a different behaviour for JS in mind, I
probably got confused with yet another language).

Anyway, feel free to comment.

Regards,
Christian

17 years ago by Larry Garfield

Hi,

Lukas Kahwe Smith asked me to put my proposal into the PHP wiki, which I
have done:

http://wiki.php.net/rfc/closures

I also have incorporated the last comment by troels knak-nielsen about
JS behaving the same way as my patch and thus not being that much of a
WTF at all (I somehow had a different behaviour for JS in mind, I
probably got confused with yet another language).

Anyway, feel free to comment.

Should comments from user-space folk be posted here or as comments at the
bottom of the wiki page?

--
Larry Garfield AIM: LOLG42
larry@garfieldtech.com ICQ: 6817012

"If nature has made any one thing less susceptible than all others of
exclusive property, it is the action of the thinking power called an idea,
which an individual may exclusively possess as long as he keeps it to
himself; but the moment it is divulged, it forces itself into the possession
of every one, and the receiver cannot dispossess himself of it." -- Thomas
Jefferson

17 years ago by Philip Olson

Anyway, feel free to comment.

Should comments from user-space folk be posted here or as comments at the
bottom of the wiki page?

We're all users... so here.

Regards,
Philip

17 years ago by Larry Garfield

Thoughts from a user-land denizen:

Conceptually, I really like it. Lambdas and closures are a feature I've been
jealous of JavaScript for having since I learned how JavaScript "really"
worked. :-)

I recall earlier discussion pondering whether we should be using the
keyword "function" to describe this new construct. It is a function, but not
as we know it. (Thank you, Mr. Spock.) Should we consider using the
keyword "lambda" instead, or do we think re-using "function" is acceptable?
(I'm cool either way; I just wanted to make sure it was a conscious
decision.)

I don't think the lexical keyword is a WTF at all, especially given that it
works essentially like global. It simplifies both the internal code and the
user-space code, because it becomes more self-documenting. The semicolon is
a slight WTF, but since it is the same as JavaScript I think most people will
get used to it. I have no objection to it.

I am a little confused about the OOP interaction. How does a function
become a public method of the class?

class Example {
    private $a = 2;

    function myMethod($b) {
        $lambda = function() {
            lexical $b;
            return $this->a * $b; // This part I get
        };
        return $lambda;
    }
}

$e = new Example();
$lambda = $e->myMethod();
$e->$lambda(5);

That doesn't seem right at all, but that's how I interpret "Essentially,
closures inside methods are added as public methods to the class that
contains the original method." Can you give an example of what that actually
means?

Related to that, would it then be possible to add methods to a class at
runtime using lambda functions as the added methods? If so, how? If not, is
that something that could reasonably be added here without hosing performance
(or at least doing so less than stacking __call() and call_user_func_array()
does)?

I guess I just need some code examples to see how this behaves in relation to
objects, because I'm not really following it as is.

My minuscule knowledge of the engine internals looks at the description
there and doesn't see anything to be scared about, but I am far from an
authority in that subject. It does seem to side-step the "can't compile at
compile time" problem that was raised back in December, which is good.

Overall, big +1 pending clarification of how lambdas and classes interact,
which I still don't grok.

--
Larry Garfield AIM: LOLG42
larry@garfieldtech.com ICQ: 6817012

17 years ago by Alexey Zakhlestin

I am a little confused about the OOP interaction. How does a function
become a public method of the class?

class Example {
    private $a = 2;

    function myMethod($b) {
        $lambda = function() {
            lexical $b;
            return $this->a * $b; // This part I get
        };
        return $lambda;
    }
}

$e = new Example();
$lambda = $e->myMethod();
$e->$lambda(5);

That doesn't seem right at all, but that's how I interpret "Essentially,
closures inside methods are added as public methods to the class that
contains the original method." Can you give an example of what that actually
means?

As far as I understand, it means the following:

class Example
{
    private $a = 1;

    private function b()
    {
        return 2;
    }

    public function getLambda($param)
    {
        $lambda = function($lparam) {
            lexical $param;

            return $this->a + $this->b() + $param + $lparam;
        };

        return $lambda;
    }
}

$obj = new Example();
$lambda = $obj->getLambda(3);

$result = $lambda(4); // 1+2+3+4 => 10

--
Alexey Zakhlestin
http://blog.milkfarmsoft.com/

17 years ago by Nathan Nobbe

On Mon, Jun 16, 2008 at 9:57 PM, Larry Garfield larry@garfieldtech.com
wrote:

Thoughts from a user-land denizen:

Related to that, would it then be possible to add methods to a class at
runtime using lambda functions as the added methods? If so, how? If not, is
that something that could reasonably be added here without hosing performance
(or at least doing so less than stacking __call() and call_user_func_array()
does)?

I'm curious about this as well, and I would welcome such functionality.
Adding methods to class instances from within the definition of the class
seems intuitive; the $this keyword should be available. But when adding
methods via a variable that holds a handle to a class instance, I wonder:
would the $this keyword be available within those anonymous function
definitions?

class Dynamic {
    private $someVar = 5;
    /// adding a function to instances from within the class
    public function addMethodAtRuntime() {
        $this->dynamicFunc1 = function() {
            return $this->someVar; // expected to work
        };
    }
}

/// adding a function to an instance externally
$dynamic = new Dynamic();
$anotherVar = 6;
$dynamic->dynamicFunc2 = function() {
    lexical $anotherVar; // expected to work
    return $this->someVar + $anotherVar; // would this work ?
};

/// invoking dynamically added methods
/// (anticipated behavior given above definitions)
$dynamic->addMethodAtRuntime();
echo $dynamic->dynamicFunc1(); // 5
echo $dynamic->dynamicFunc2(); // 11

-nathan

17 years ago by Christian Seiler

Hi!

class Dynamic {
    private $someVar = 5;
    /// adding a function to instances from within the class
    public function addMethodAtRuntime() {
        $this->dynamicFunc1 = function() {
            return $this->someVar; // expected to work
        };
    }
}

/// invoking dynamically added methods
/// (anticipated behavior given above definitions)
$dynamic->addMethodAtRuntime();
echo $dynamic->dynamicFunc1(); // 5

This will not work - for the same reason as this does not work:

class SpecialChars {
    public $process = 'htmlspecialchars';
}

$sc = new SpecialChars ();
var_dump ($sc->process ('<>')); // call to undefined ...

On the other hand, the following will work:

$sc = new SpecialChars ();
$processor = $sc->process;
var_dump ($processor ('<>')); // string(8) "&lt;&gt;"

The same with closures:

echo $dynamic->dynamicFunc1(); // call to undefined ...
$func = $dynamic->dynamicFunc1;
echo $func (); // 5
echo call_user_func ($dynamic->dynamicFunc1); // 5

As I said in my other mail: I don't see closures as a method for somehow
making it possible to extend classes dynamically - if you want that, use
runkit.
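
The difference between a callable stored in a property and a real method can be tried out directly. The following is only a minimal sketch, not part of the patch: the class name Holder and its properties are made up for illustration, and it assumes a PHP version that ships anonymous functions (5.3 or later).

```php
<?php
// Hypothetical class for illustration; not taken from the patch or the RFC.
class Holder {
    public $process = 'htmlspecialchars';   // name of a function, stored as a string
    public $twice;                          // will hold an anonymous function
}

$h = new Holder();
$h->twice = function ($n) { return 2 * $n; };

// $h->process('<>') would look for a METHOD named process() and fail,
// so fetch the property first and call the value it holds.
$processor = $h->process;
$escaped = $processor('<>');                       // "&lt;&gt;"

$twice = $h->twice;
$viaVariable = $twice(21);                         // 42
$viaCallUserFunc = call_user_func($h->twice, 21);  // 42

// Storing a closure in a property does not create a method of that name.
$isMethod = method_exists($h, 'twice');            // false
```

The property is read first and the resulting value is called, which is the same two-step pattern as in the examples above.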

Regards,
Christian

17 years ago by Christian Seiler

Hi,

I am a little confused about the OOP interaction. How does a function
become a public method of the class?

To clarify: the "public method" is just the internal representation of
the lambda function and has nothing to do with the semantics of
calling the lambda itself. The "method" only means that the lambda
function defined inside another method can access the class members and
"public" only means that the lambda function can still be called from
outside the class.

class Example {
    private $a = 2;

    function myMethod($b) {
        $lambda = function() {
            lexical $b;
            return $this->a * $b; // This part I get
        };
        return $lambda;
    }
}

$e = new Example();
$lambda = $e->myMethod();
$e->$lambda(5);

No, that's not what my patch does. My patch does:

class Example {
    private $a = 2;

    public function myMethod ($b) {
        return function () {
            lexical $b;
            return $this->a * $b;
        };
    }
}

$e = new Example ();
$lambda = $e->myMethod (4);
var_dump ($lambda ()); // int(8)
$lambda2 = $e->myMethod (6);
var_dump ($lambda2 ()); // int(12)

So essentially, it does not matter whether you define a lambda function
inside a method or a function (or in global scope, for that matter), you
always use it the same way. The in-class-method lambda function only has
the additional advantage of being able to access the private and
protected class members since internally it is treated like a public
class method.
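
For readers who want to run this today: the 'lexical' statement is the syntax of this patch only. In PHP as eventually released, variables are imported with a use (...) clause on the anonymous function, and $this inside closures defined in methods is available from PHP 5.4 on. Under those assumptions, a sketch equivalent to the example above could look like this (note that use imports by value unless written with &, unlike the by-reference 'lexical'):

```php
<?php
class Example {
    private $a = 2;

    public function myMethod($b) {
        // use ($b) copies $b into the closure; $this is bound automatically (PHP 5.4+).
        return function () use ($b) {
            return $this->a * $b;   // the private member is reachable from the closure
        };
    }
}

$e = new Example();
$lambda = $e->myMethod(4);
$lambda2 = $e->myMethod(6);

$first = $lambda();     // 8
$second = $lambda2();   // 12
```

Each call to myMethod() yields an independent closure with its own bound value of $b.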

Related to that, would it then be possible to add methods to a class at
runtime using lambda functions as the added methods?

No.

If not, is that something that could reasonably be added here without
hosing performance (or at least doing so less than stacking __call() and
call_user_func_array() does)?

If you want to add methods dynamically to classes, why not use the
runkit extension? I really don't see a point in making lambda functions
and closures something they are not.

Regards,
Christian

17 years ago by Larry Garfield

Hi,

I am a little confused about the OOP interaction. How does a function
become a public method of the class?

To clarify: the "public method" is just the internal representation of
the lambda function and has nothing to do with the semantics of
calling the lambda itself. The "method" only means that the lambda
function defined inside another method can access the class members and
"public" only means that the lambda function can still be called from
outside the class.

If one knew how to access it, which it seems is not possible/feasible for
user-space code.

class Example {
    private $a = 2;

    function myMethod($b) {
        $lambda = function() {
            lexical $b;
            return $this->a * $b; // This part I get
        };
        return $lambda;
    }
}

$e = new Example();
$lambda = $e->myMethod();
$e->$lambda(5);

No, that's not what my patch does. My patch does:

class Example {
    private $a = 2;

    public function myMethod ($b) {
        return function () {
            lexical $b;
            return $this->a * $b;
        };
    }
}

$e = new Example ();
$lambda = $e->myMethod (4);
var_dump ($lambda ()); // int(8)
$lambda2 = $e->myMethod (6);
var_dump ($lambda2 ()); // int(12)

So essentially, it does not matter whether you define a lambda function
inside a method or a function (or in global scope, for that matter), you
always use it the same way. The in-class-method lambda function only has
the additional advantage of being able to access the private and
protected class members since internally it is treated like a public
class method.

I see. It would be great if you could update the RFC with this information so
that it's clearer.

If you want to add methods dynamically to classes, why not use the
runkit extension? I really don't see a point in making lambda functions
and closures something they are not.

I was asking if they could be used for that, not to make them into a different
animal. As for using runkit, do I really need to answer that? :-)

Two other questions that just occurred to me:

What is the interaction with namespaces, if any? Are lambdas as
implemented here ignorant of namespace, or do they take the namespace where
they are lexically defined?

What happens with the following code?

class Foo {
    private $a;
}

$f = new Foo();

$b = 5;

$f->myfunc = function($c) {
    lexical $b;
    print $a; // This generates an error, no?
    print $b; // This prints 5, right?
    print $c; // Should print whatever $c is.
}

$f->myfunc(3);

Or is the above a parse error entirely?

--
Larry Garfield AIM: LOLG42
larry@garfieldtech.com ICQ: 6817012

17 years ago by Christian Seiler

Hi!

I am a little confused about the OOP interaction. How does a function
become a public method of the class?

To clarify: the "public method" is just the internal representation of
the lambda function and has nothing to do with the semantics of
calling the lambda itself. The "method" only means that the lambda
function defined inside another method can access the class members and
"public" only means that the lambda function can still be called from
outside the class.

If one knew how to access it, which it seems is not possible/feasible for
user-space code.

No, that's not what I meant. The engine uses the following internal trick:

a) Upon compilation, my patch simply adds the lambdas as normal
   functions to the function table with an automatically generated
   unique (!) name. If it happens to be defined within a class method,
   the function will be added as a public final method to that class.

b) That added function is not directly callable due to checks of a flag
   in the internal structure of that function.

c) At the place of the function definition the compiler leaves an
   opcode "grab function $generatedname and make a closure out of it".
   This opcode then looks up the generated lambda function, copies the
   function structure, saves the bound variables in that structure and
   returns the copied structure as a resource.

d) Normally, when a function is called, the name is looked up in the
   function table. The function structure that is retrieved from there
   is then used to execute the function. Since a lambda resource is
   already a function structure, there is no necessity to look up
   anything in the function table but the function structure can be
   directly passed on to the executor.

Please note step d): The closure functionality only changes the lookup
of the function - so instead of getting the function structure from a
hash table lookup I get the function structure by retrieving it from the
resource. But after the lookup of a class method there are checks for
the access protection of that method. So these access protection checks
also apply to closures that were called. If a lambda function was not
declared public, it could not be used outside of the class it was
defined in. Perhaps this makes it clearer?
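
Steps a) to d) can be mimicked in userland to make the flow easier to follow. This is a toy model only: plain arrays stand in for the engine's function table and function structures, every function name in it is invented for the sketch, and bound variables are stored by value here although the patch stores references.

```php
<?php
// Toy model only; none of these names exist in the engine or in the patch.
$functionTable = array();

// (a) "Compile time": store the lambda under a generated, unique name.
function compile_lambda(array &$table, $file, $n, $body) {
    $name = "\0__compiled_lambda_{$file}_{$n}";
    $table[$name] = array('body' => $body, 'is_lambda' => true, 'bound' => array());
    return $name;
}

// (b) A call by name is refused when the entry is flagged as a lambda.
function call_by_name(array $table, $name, array $args) {
    if ($table[$name]['is_lambda']) {
        throw new RuntimeException('lambda functions cannot be called by name');
    }
    return call_user_func_array($table[$name]['body'], $args);
}

// (c) The "make a closure" step: copy the structure, save the bound variables.
function make_closure(array $table, $name, array $bound) {
    $copy = $table[$name];          // PHP arrays are copied by value
    $copy['bound'] = $bound;
    return $copy;
}

// (d) Calling the closure needs no table lookup; the structure is used directly.
function call_closure(array $closure, array $args) {
    return call_user_func($closure['body'], $closure['bound'], $args);
}

$name = compile_lambda($functionTable, 'demo.php', 0, function (array $bound, array $args) {
    return $bound['x'] + $args[0];
});

$addThree = make_closure($functionTable, $name, array('x' => 3));
$sum = call_closure($addThree, array(4));     // 7

try {
    call_by_name($functionTable, $name, array(4));
    $refused = false;
} catch (RuntimeException $e) {
    $refused = true;                          // the direct call by name is rejected
}
```

The point of the model is step d): once the copied structure exists, the call path never consults the table again.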

I see. It would be great if you could update the RFC with this information so
that it's clearer.

Done: http://wiki.php.net/rfc/closures

Two other questions that just occurred to me:

What is the interaction with namespaces, if any? Are lambdas as
implemented here ignorant of namespace, or do they take the namespace where
they are lexically defined?

My patch itself is namespace-ignorant, but the net result is not:

a) The generated internal function names do not contain the current
   namespace name, but namespace names in function names are only
   used for lookup when you want to call the function. Calling
   lambdas by name (!) directly doesn't work anyway (and is not supposed
   to work), so this poses no problem.

b) The code inside the closure is namespace-aware because the
   information about which namespace is used is added at compile time.
   Either the name lookup is done entirely at compile time or the
   current compiler namespace is automatically added to all runtime
   lookup calls (this is already the case with current code). So
   the information about which namespace a function resides in is
   currently irrelevant at runtime when calling other functions.

For (b) let me give two examples:

Suppose you have the following code:

namespace Foo;

function baz () {
    return "Hello World!\n";
}

function bar () {
    return function () {
        echo baz ();
    };
}

and in another file:

$lambda = Foo::bar ();
$lambda ();

This will - as expected - print "Hello World!\n".

The reason is that the compiler upon arriving at the baz() function call
inside the closure already looks up the function in the function table
directly (it knows the current namespace) - and simply creates a series
of opcodes that will call the function with the name "Foo::baz" (the
lookup is already done at compile time).

Consider this other code:

foo-bar.php:

namespace Foo;

function bar () {
    return function () {
        echo baz ();
    };
}

foo-baz.php:

namespace Foo;

function baz () {
    return "Hello World!\n";
}

baz.php:

function baz () {
    return "PHP!\n";
}

test1.php:

require 'foo-bar.php';
require 'foo-baz.php';

$lambda = Foo::bar ();
$lambda ();

test2.php:

require 'foo-bar.php';
require 'baz.php';

$lambda = Foo::bar ();
$lambda ();

Running test1.php yields "Hello World!" whereas running test2.php yields
"PHP!". Why is this? Because when the compiler reaches the baz ()
function call in the closure, it cannot find the function so it cannot
determine whether it's a function in global or in namespace scope. So it
will simply add a series of opcodes that say "try Foo::baz and if that
does not exist try baz". Here again, Foo:: is added by the compiler to
the opcode because the compiler has the necessary information about which
namespace the call is currently in.

The runtime execution engine NEVER looks at the function as to which
namespace the function belongs to, it ONLY looks at the function calling
opcodes that the compiler already correctly (!) generates.

Please note that I have only described what the PHP engine does with
namespaces until now ANYWAY, the closures DO NOT CHANGE ANYTHING related
to this. That's why my patch doesn't have to care about namespaces at
all but the net-result closures will.
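
Both resolution paths can be seen in a single file. This is a hedged sketch rather than the multi-file setup above: it assumes released PHP (5.3 or later), where the namespace separator is a backslash instead of the :: used in this mail and closures import variables differently, and it uses the built-in strtoupper() as the function that exists only in global scope.

```php
<?php
namespace Foo;

function baz() {
    return "Hello World!\n";
}

function bar() {
    return function () {
        // baz() is found as Foo\baz. There is no Foo\strtoupper(), so that
        // unqualified call falls back to the global built-in.
        return strtoupper(baz());
    };
}

$lambda = bar();        // from global code this would be written \Foo\bar()
$output = $lambda();    // "HELLO WORLD!\n"
```

Nothing in the closure itself records a namespace; the unqualified calls inside it are resolved the same way as in any other function compiled in namespace Foo.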

What happens with the following code?

class Foo {
    private $a;
}

$f = new Foo();

$b = 5;

$f->myfunc = function($c) {
    lexical $b;
    print $a; // This generates an error, no?
    print $b; // This prints 5, right?
    print $c; // Should print whatever $c is.
}

$f->myfunc(3);

Or is the above a parse error entirely?

Well, first of all, it's a parse error because you forgot the ; after
the } of the closure. ;-)

Then: Closures only have access to the scope in which they are defined.
Whether you assign that closure to a class property or not does not
change the semantics of the closure itself. The closure definition is
simply the following part: function ($c) { ... }. You then use the part
to store the closure as a class property. The closure you define is
entirely oblivious of the class since it is not defined inside a class
method. Consider this code:

function foo () {
    echo $a;
}

$f->myfunc = 'foo';

Well, of course, first of all $a will not work anyway because $a has
never been the object member access method of PHP. But even if you use
$this->a, that won't work with foo either. For the exact same reasons it
will not work with closures.

Your comments inside the closure are correct though: $a gives an error
(notice: undefined variable), $b gives 5 (unless $b is modified later
on) and $c gives the current value of the parameter $c.

But: As I already said last time, $f->myfunc(3) won't work. That will
give "no such method". For the same reason as $f->myfunc = 'foo';
$f->myfunc(); will not work - because -> can only be used to call
METHODS, not dynamic functions stored in properties.

Regards,
Christian

17 years ago by Edward Z. Yang

Larry Garfield wrote:

$e = new Example();
$lambda = $e->myMethod();
$e->$lambda(5);

That doesn't seem right at all, but that's how I interpret "Essentially,
closures inside methods are added as public methods to the class that
contains the original method." Can you give an example of what that actually
means?

At first blush, this sets off warning bells in my head, I suppose
because my notion of a lambda says that the lambda should not carry any
baggage about the context it was created in.

However, with further thought, I believe that binding the lambda's
lexical scope to the place it was defined is:

1. Conducive to good coding (you will always be able to look outside the
   lambda to find out where the lexical variables are coming from)
2. Adds functionality, since anything you want to pass to the function
   via the callee's context can be passed via a parameter

What would be neat, however, is the ability to rebind the lambda to
another context. Also, I don't know how other languages do it (Python?
Lisp?).
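
On rebinding: released PHP later gained exactly this ability through Closure::bindTo() and Closure::bind() (PHP 5.4 and later); neither is part of the patch discussed in this thread. A minimal sketch, with the class Counter made up for illustration:

```php
<?php
// Hypothetical class for illustration only.
class Counter {
    private $count = 5;
}

// Defined outside any class, so the closure has no $this of its own.
$peek = function () {
    return $this->count;
};

// bindTo() returns a copy of the closure bound to the given object and
// class scope; the original $peek is left unchanged.
$bound = $peek->bindTo(new Counter(), 'Counter');
$value = $bound();   // 5
```

The second argument sets the class scope, which is what allows the bound copy to read the private property.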

--
Edward Z. Yang GnuPG: 0x869C48DA
HTML Purifier http://htmlpurifier.org Anti-XSS Filter
[[ 3FA8 E9A9 7385 B691 A6FC B3CB A933 BE7D 869C 48DA ]]

17 years ago by Marcus Boerger

Hello Christian,

very nice work. I think we should really add this to 5.3. The only thing I
don't like is the function naming (“\0__compiled_lambda_FILENAME_N”). Can
we drop the \0? For methods inside classes, do we have to provide real
private methods or do we support visibility fully? Or did you use the \0
prefix to prevent direct invocations? If so, it doesn't help, as the user
can simply create the function call with the \0.

I think the best option would be to force lambda functions to always be
public. The last question is about changing visibility and overriding
functions: do we want that, or should we mark lambdas as final? Note that
preceding the function names with a \0 does not help. In fact it might
confuse reflection. Actually how does it integrate with reflection?

Comments about the implementation:

ZendEngine2/zend_compile.c:

Why provide a forward declaration for free_filename()? Simply putting the
new function above its first user avoids it and does the same with
increased maintainability.

s/static void add_lexical_var (/static void add_lexical_var(/

Other than that it all looks fine.

marcus

Monday, June 16, 2008, 7:39:19 PM, you wrote:

Hi,

As a followup to the discussion in January, I'd like to post a revised patch to
this list that implements closures and anonymous functions in PHP.

INTRODUCTION

Closures and lambda functions can make programming much easier in
several ways:

Lambda functions allow the quick definition of throw-away functions
that are not used elsewhere. Imagine for example a piece of code that
needs to call preg_replace_callback(). Currently, there are three
possibilities to achieve this:

  a. Define the callback function elsewhere. This distributes code that
     belongs together throughout the file and decreases readability.

  b. Define the callback function in-place (but with a name). In that case
     one has to use function_exists() to make sure the function is only
     defined once. Example code:

       <?php
          function replace_spaces ($text) {
            if (!function_exists ('replace_spaces_helper')) {
              function replace_spaces_helper ($matches) {
                // turn each space of the run into &nbsp;, keep one real space
                return str_replace (' ', '&nbsp;', $matches[1]).' ';
              }
            }
            return preg_replace_callback ('/( +) /',
                                          'replace_spaces_helper',
                                          $text);
          }
       ?>

     Here, the additional if() around the function definition makes the
     source code difficult to read.

  c. Use the present `create_function()` in order to create a function at
     runtime. This approach has several disadvantages: First of all, syntax
     highlighting does not work because a string is passed to the function.
     It also compiles the function at run time and not at compile time so
     opcode caches can't cache the function.

Closures provide a very useful tool in order to make lambda functions even
more useful. Just imagine you want to replace 'hello' with 'goodbye' in
all elements of an array. PHP provides the array_map() function which
accepts a callback. If you don't want to hard-code 'hello' and 'goodbye'
into your source code, you have only four choices:

  a. Use `create_function()`. But then you may only pass literal values
     (strings, integers, floats) into the function, objects at best as
     clones (if `var_export()` allows for it) and resources not at all. And
     you have to worry about escaping everything correctly. Especially when
     handling user input this can lead to all sorts of security issues.

  b. Write a function that uses global variables. This is ugly,
     non-reentrant and bad style.

  c. Create an entire class, instantiate it and pass the member function
     as a callback. This is perhaps the cleanest solution for this problem
     with current PHP but just think about it: Creating an entire class for
     this extremely simple purpose and nothing else seems overkill.

  d. Don't use `array_map()` but simply do it manually (foreach). In this
     simple case it may not be that much of an issue (because one simply
     wants to iterate over an array) but there are cases where doing
     something manually that a function with a callback as parameter does
     for you is quite tedious.

 [Yes, I know that str_replace also accepts arrays as a third parameter so
  this example may be a bit useless. But imagine you want to do a more
  complex operation than simple search and replace.]

PROPOSED PATCH

I now propose a patch that implements compile-time lambda functions and
closures for PHP while keeping the patch as simple as possible. The patch is
based on a previous patch of mine which was based on ideas discussed here
end of December / start of January.

Userland perspective

The patch adds the following syntax as a valid expression:

function & (parameters) { body }

(The & is optional and indicates - just as with normal functions - that the
anonymous function returns a reference instead of a value)

Example usage:

$lambda = function () { echo "Hello World!\n"; };

The variable $lambda then contains a callable resource that may be called
through different means:

$lambda ();
call_user_func ($lambda);
call_user_func_array ($lambda, array ());
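
The three call styles can be checked side by side. A small sketch, assuming released PHP (5.3 or later), where the value is an object of the built-in Closure class rather than the resource this patch produces; the lambda returns its string instead of echoing it so the results can be compared:

```php
<?php
$lambda = function () { return "Hello World!\n"; };

$direct   = $lambda();
$viaFunc  = call_user_func($lambda);
$viaArray = call_user_func_array($lambda, array());

// In released PHP the anonymous function is a Closure object, not a resource.
$isClosure = $lambda instanceof Closure;   // true
```

All three variables end up holding the same string.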

This allows for simple lambda functions, for example:

function replace_spaces ($text) {
    $replacement = function ($matches) {
        // turn each space of the run into &nbsp;, keep one real space
        return str_replace (' ', '&nbsp;', $matches[1]).' ';
    };
    return preg_replace_callback ('/( +) /', $replacement, $text);
}

The patch implements closures by defining an additional keyword 'lexical'
that allows a lambda function (and only a lambda function) to import
a variable from the "parent scope" to the lambda function scope. Example:

function replace_in_array ($search, $replacement, $array) {
    $map = function ($text) {
        lexical $search, $replacement;
        if (strpos ($text, $search) > 50) {
            return str_replace ($search, $replacement, $text);
        } else {
            return $text;
        }
    };
    return array_map ($map, $array);
}

The variables $search and $replacement are variables in the scope of the
function replace_in_array() and the lexical keyword imports these variables
into the scope of the closure. The variables are imported as a reference,
so any change in the closure will result in a change in the variable of the
function itself.
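
For comparison, here is how the same idea reads in PHP as eventually released, where a use (...) clause took the place of the 'lexical' statement proposed here. This is a simplified sketch (the strpos() condition is dropped), and it assumes PHP 5.3 or later. Note one difference from the patch: use imports by value, and the by-reference behaviour described above has to be requested with &.

```php
<?php
function replace_in_array($search, $replacement, $array) {
    // use (...) plays the role of the proposed 'lexical' statement.
    $map = function ($text) use ($search, $replacement) {
        return str_replace($search, $replacement, $text);
    };
    return array_map($map, $array);
}

$replaced = replace_in_array('hello', 'goodbye', array('hello world', 'oh hello'));
// array('goodbye world', 'oh goodbye')

// Importing by reference: changes inside the closure reach the outer variable.
$total = 0;
$add = function ($n) use (&$total) { $total += $n; };
$add(2);
$add(3);
// $total is now 5 in the enclosing scope
```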

If a closure is defined inside an object, the closure has full access
to the current object through $this (without the need to use 'lexical' to
import it separately) and all private and protected methods of that class.
This also applies to nested closures. Essentially, closures inside methods are
added as public methods to the class that contains the original method.

Closures may live longer than the methods that declared them. It is perfectly
possible to have something like this:

function getAdder($x) {
    return function ($y) {
        lexical $x;
        return $x + $y;
    };
}
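
A runnable counterpart of getAdder(), again assuming the use (...) syntax of released PHP (5.3 or later) instead of the 'lexical' statement from this patch; the variable names in the usage lines are made up:

```php
<?php
function getAdder($x) {
    return function ($y) use ($x) {
        return $x + $y;
    };
}

$addFive = getAdder(5);    // getAdder() has returned, but its $x lives on
$addTen  = getAdder(10);

$a = $addFive(1);   // 6
$b = $addTen(1);    // 11
```

Each returned closure keeps its own $x even though the call that created it has long finished.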

Zend internal perspective

The patch basically changes the following in the Zend engine:

When the compiler reaches a lambda function, it creates a unique name for that
function ("\0__compiled_lambda_FILENAME_N" where FILENAME is the name of the
file currently processed and N is a per-file counter). The use of the filename
in the function name ensures compatibility with opcode caches. The lambda
function is then immediately added to the function table (either the global
function table or that of the current class if declared inside a class method).
Instead of a normal ZEND_DECLARE_FUNCTION opcode the new
ZEND_DECLARE_LAMBDA_FUNC is used as an opcode at this point. The op_array
of the new function is initialized with is_lambda = 1 and is_closure = 0.
When parsing a 'lexical' declaration inside an anonymous function the parser
saves the name of the variable that is to be imported in an array stored
as a member of the op_array structure (lexical_names).

The opcode handler for ZEND_DECLARE_LAMBDA_FUNC does the following: First of
all it creates a new op_array and copies the entire memory structure of the
lambda function into it (the opcodes themselves are not copied since they
are only referenced in the op_array structure). Then it sets is_closure = 1
on the new op_array, and for each lexical variable name that the compiler
added to the original op_array it creates a reference to that variable from
the current scope into a HashTable member in the new op_array. It also saves
the current object pointer ($this) as a member of the op_array in order to
allow for the closure to access $this. Finally it registers the new op_array
as a resource and returns that resource.

The opcode handler of the 'lexical' construct simply fetches the variable
from that HashTable and imports it into local scope of the inner function
(just like with 'global' only with a different hash table).

Some hooks were added that allow the 'lambda function' resource to be called.
Also, there are several checks in place that make sure the lambda function
is not called directly, i.e. if someone explicitly tries to use the internal
function name instead of using the resource return value of the declaration.

The patch

The patch is available here:
http://www.christian-seiler.de/temp/closures-php-5.3-2008-06-16-1.diff

Please note that I did NOT include the contents of zend_language_scanner.c
in the patch since that can easily be regenerated and just takes up enormous
amounts of space.

The patch itself applies against the 5.3 branch of PHP.

If I understand the discussion regarding PHP6 on this list correctly, some
people are currently undergoing the task of removing the unicode_semantics
switch and if (UG(unicode)). As soon as this task is finished I will also
provide a patch for CVS HEAD (it doesn't make much sense adopting the patch
now and then having to change it again completely afterwards).

BC BREAKS

Introduction of a new keyword 'lexical'. Since it is very improbable that
someone should use it as a function, method, class or property name, I
think this is an acceptable break.

Other than that, I can find no BC breaks of my patch.

CAVEATS / POSSIBLE WTFS

On writing $func = function () { }; there is a semicolon necessary. If left
out it will produce a compile error. Since any attempt to remove that
necessity would unnecessarily bloat the grammar, I suggest we simply keep
it the way it is. Also, Lukas Kahwe Smith pointed out that a single
trailing semicolon after a closing brace already exists: do { } while ();

The fact that 'lexical' creates references may cause certain WTFs:

  for ($i = 0; $i < 10; $i++) {
    $arr[$i] = function () { lexical $i; return $i; };
  }

This will not work as expected since $i is a reference and thus all
created closures would reference the same variable. In order to get this
right one has to do:

  for ($i = 0; $i < 10; $i++) {
    $loopIndex = $i;
    $arr[$i] = function () { lexical $loopIndex; return $loopIndex; };
    unset ($loopIndex);
  }

This can be a WTF for people that don't expect lexical to create an
actual reference, especially since other languages such as JavaScript
don't do it. On the other hand, global and static both DO create
references so that behaviour is consistent with current PHP.
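
A runnable sketch of this pitfall, with one loud assumption: the 'lexical'
keyword exists only in this patch, so the code below substitutes the use
clause of PHP 5.3 and later. There, use (&$i) imports by reference (as
'lexical' does) and use ($i) copies the value. The names $byRef and $byVal
are made up for illustration.

```php
<?php
// Assumption: 'use (&$i)' stands in for the patch's 'lexical $i'
// (both bind the outer variable by reference).
$byRef = array();
$byVal = array();

for ($i = 0; $i < 3; $i++) {
    // All of these closures share the single loop variable $i.
    $byRef[$i] = function () use (&$i) { return $i; };
    // Each of these closures keeps its own copy of $i's value.
    $byVal[$i] = function () use ($i) { return $i; };
}

// The loop has finished, so the shared $i is now 3.
$refResults = array($byRef[0](), $byRef[1](), $byRef[2]());
$valResults = array($byVal[0](), $byVal[1](), $byVal[2]());

echo implode(',', $refResults), "\n"; // prints "3,3,3"
echo implode(',', $valResults), "\n"; // prints "0,1,2"
```

The by-value variant has the same effect as the $loopIndex workaround shown
above, without the temporary variable.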

But complex constructions such as this will probably not be used by
beginners so maintaining a good documentation should solve this.

The fact that 'lexical' is needed at all may cause WTFs. Other languages
such as JavaScript implicitly have the entire scope visible to child
functions. But since PHP does the same thing with global variables, I
find a keyword like 'lexical' much more consistent than importing the
entire scope (and always importing the entire scope costs unnecessary
performance).

FINAL THOUGHTS

The patch I now propose addresses the two main problems of my previous patch:
support for closures in objects (with access to $this) and opcode caches. My
patch applies against PHP_5_3 and does not break any tests. It adds a valuable
new language feature which I'd like to see in PHP.

Regards,
Christian

Best regards,
Marcus

17 years ago by Christian Seiler

Hi Marcus,

very nice work.

Thanks!

The only thing I don't like is the function naming
(“\0__compiled_lambda_FILENAME_N”). Can we drop the \0?

I used \0 because it is already used in two other places:

create_function (run-time lambda functions) uses \0__lambda_N
build_runtime_defined_function_key uses \0 to start function names.

I can drop it if you like; personally, I don't care for either solution:
it's an internal name that may leak to userspace in some
circumstances but is never really useful for userspace anyway.

A minor side-note here: I oriented myself at
build_runtime_defined_function_key at the time of writing but I have
noticed a slight discrepancy between function names generated by
build_runtime_defined_function_key and the normal function names: When
stored in the corresponding function_table hash, for runtime defined
function keys it is opline->op1.u.constant.value.str.len, whereas for
normal function names it is »Z_STRLEN_P(...) + 1« and thus including the
trailing (not preceding!) \0 in the hash key for normal function names
but not including it for runtime defined function keys. Any idea why
that is the case? [For the record: I'm referring to the code that is
already used in PHP, not to my patch!]

Or did you use the \0 prefix to prevent direct invocations?

No, direct invocations are prevented by the is_lambda == 1 &&
is_closure == 0 check.

I think the best option would be to force lambda functions being public
always.

They are. If you look at my modified version of
zend_do_begin_function_declaration, you will see that:

  if (is_lambda && CG(active_class_entry)) {
    is_method = 1;
    fn_flags = ZEND_ACC_PUBLIC;
    if (CG(active_op_array)->fn_flags & ZEND_ACC_STATIC) {
      fn_flags |= ZEND_ACC_STATIC;
    }
  } else if (is_lambda) {
    fn_flags = 0;
  }

The only attribute that is "inherited" from the parent function is
whether that function is static or not.

The last question is about changing visibility and overriding
functions, do we want that, or should we mark lambdas as final?

Internal representations of lambda functions should never be overridden,
so yes, ZEND_ACC_FINAL would probably be a good idea. Overriding them
won't work anyway since the new opcode that "instantiates" a closure
will always use the class in which the closure was defined to look it up.

I'll add that.

Actually how does it integrate with reflection?

Good question, I will investigate that and come back to you.

Comments about the implementation:

ZendEngine2/zend_compile.c:

Why provide forward declaration for free_filename(), simply putting the
new function above the first user avoids it and does the same with
increased maintainability.

s/static void add_lexical_var (/static void add_lexical_var(/

Ok, I will fix that.

Regards,
Christian

17 years ago by Christian Seiler

Hi Marcus,

I now have revised my patch to include your suggestions:

http://www.christian-seiler.de/temp/closures-php-5.3-2008-06-17-2.diff

The changes to the previous version:

- \0 at the start of the compiled lambda function name is dropped.
- lambdas which are class members are now marked as final
- the generated name of the lambda is now also stored within the
  op_array (before op_array->function_name was simply "lambda")
- your suggestions for code cleanups in zend_compile.c

Actually how does it integrate with reflection?

Consider the following class:

  class Example {
    private $x = 0;

    public function getIncer () {
      return function () {
        $this->x++;
      };
    }

    public function show () {
      $this->reallyShow ();
    }

    protected function reallyShow () {
      echo "{$this->x}\n";
    }
  }

Running

Reflection::export(new ReflectionClass('Example'));

will yield (among other things):

  Methods [4] {
    Method [ <user> public method getIncer ] {
      @@ /home/christian/dev/php5.3/c-tests/reflection.php 6 - 10
    }

    Method [ <user> final public method _compiled_lambda/home/christian/dev/php5.3/c-tests/reflection.php_0 ] {
      @@ /home/christian/dev/php5.3/c-tests/reflection.php 7 - 9
    }

    Method [ <user> public method show ] {
      @@ /home/christian/dev/php5.3/c-tests/reflection.php 12 - 14
    }

    Method [ <user> protected method reallyShow ] {
      @@ /home/christian/dev/php5.3/c-tests/reflection.php 16 - 18
    }
  }

So lambda functions appear simply as additional methods in the class,
with their generated name.

Of course, the ReflectionMethod / ReflectionFunction classes could be
extended so that they provide an additional method isLambda() in order
to determine whether a function actually is a lambda function. But I'd
rather do that in a separate step and a separate patch.
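
For comparison, a small sketch of how such a check looks in PHP as it was
eventually released (5.3 and later), which is an assumption relative to this
patch: there, lambdas are Closure objects rather than generated methods, and
the reflection check is called isClosure() instead of the isLambda()
suggested here. The function name named_helper is made up for illustration.

```php
<?php
// Hypothetical named function, used only as a contrast to the lambda.
function named_helper() { return 1; }

$lambda = function () { return 2; };

// ReflectionFunction accepts either a function name or a Closure object.
$lambdaInfo = new ReflectionFunction($lambda);
$namedInfo  = new ReflectionFunction('named_helper');

$lambdaIsClosure = $lambdaInfo->isClosure(); // true
$namedIsClosure  = $namedInfo->isClosure();  // false

var_dump($lambdaIsClosure, $namedIsClosure);
```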

Regards,
Christian

17 years ago by Chris Stockton

Hello,

Great patch and a much needed feature. One thing I do not agree with is the
point in the lexical key word, seems it should be natural to inherit the
outer scope. I guess the choice of adding lexical and going slightly against
the grain of typical closure implementations like scheme or ecmascript is
that is not really consistent with php so i can understand disagreement and
your note you made on performance. Seems like the right choice to force
manual inheritance of outer scope. But great work on this, hope it gets
added and none of the core developers say it is not the php way or is only
useful in brainless languages.

-Chris

17 years ago by Marcus Boerger

Hello Christian, Johannes,

Tuesday, June 17, 2008, 2:24:01 PM, you wrote:

Of course, the ReflectionMethod / ReflectionFunction classes could be
extended so that they provide an additional method isLambda() in order
to determine whether a function actually is a lambda function. But I'd
rather do that in a separate step and a separate patch.

Yep, Reflection details can and should be addressed separately.

I think the next step is confirming with the rest of the core team that we
all want this. IMO it has often enough been requested and your patch
solves every tiny piece that was missing so far. And if people agree then
unfortunately you need to provide a patch for HEAD first. Actually you
could do so just now.

Johannes, what's your take on this one for 5.3?

Best regards,
Marcus

17 years ago by Stanislav Malyshev

Hi!

Johannes, what's your take on this one for 5.3?

I'm not Johannes and I didn't review the proposal in detail yet, but I
think we have enough for 5.3 right now. I'd think we better concentrate
on tying the loose ends and rolling beta out and then moving towards the
release than adding more and more features and never releasing it. 5.3
is not the final release until the end of times, there will be 5.4 etc.
and 6, so there will be ample opportunity to add stuff. And 5.3 has
enough stuff to be released, there's no rush to add more new things,
especially radically new ones. My opinion is that we better take some
time with it and not tie it to 5.3.

Stanislav Malyshev, Zend Software Architect
stas@zend.com http://www.zend.com/
(408)253-8829 MSN: stas@zend.com

17 years ago by Steph Fox

Amen to that.

Steph

17 years ago by Alexey Zakhlestin

although I really want to have this functionality right now, I agree with Stas.
5.3 has a lot of new things already. this should go to HEAD, and,
hopefully, it will attract more people to actually try php-6

--
Alexey Zakhlestin
http://blog.milkfarmsoft.com/

17 years ago by Marcus Boerger

Hello Stanislav,

nicely put but not in agreement with the PHP world. First we cannot add
a new feature like this in a mini release as it comes with an API change.
And second PHP is not anywhere close so we'd have to do it in a PHP 5.4
and personally I would like to avoid it.

marcus

Tuesday, June 17, 2008, 9:19:56 PM, you wrote:

Hi!

Johannes, what's your take on this one for 5.3?

I'm not Johannes and I didn't review the proposal in detail yet, but I
think we have enough for 5.3 right now. I'd think we better concentrate
on tying the loose ends and rolling beta out and then moving towards the
release than adding more and more features and never releasing it. 5.3
is not the final release until the end of times, there will be 5.4 etc.
and 6, so there will be ample opportunity to add stuff. And 5.3 has
enough stuff to be released, there's no rush to add more new things,
especially radically new ones. My opinion is that we better take some
time with it and not tie it to 5.3.

Stanislav Malyshev, Zend Software Architect
stas@zend.com http://www.zend.com/
(408)253-8829 MSN: stas@zend.com

Best regards,
Marcus

17 years ago by Stanislav Malyshev

Hi!

nicely put but not in agreement with the PHP world. First we cannot add
a new feature like this in a mini release as it comes with an API change.
And second PHP is not anywhere close so we'd have to do it in a PHP 5.4
and personally I would like to avoid it.

You meant "PHP 6 not anywhere close"? well, if we keep adding completely
new stuff to 5.3 then 5.3 would not be anywhere close so what we earn by
that? I understand that we all want new and cool stuff in PHP, but
endlessly delaying 5.3 is not the answer.
And this feature is a big thing, it's not adding yet another module or
function - it should be thoroughly reviewed and seen how it affect all
other things. It's probably very cool addition, but yet more reason to
consider all the side effects and surprises on the way.
Pushing it in now would mean we'd either have to delay 5.3 yet another
half-year or we'd release buggy 5.3 and then we'd be bound by API
compatibility and couldn't fix things that we might want to fix. I think
as much as we want to add yet another cool feature - we have to release
versions for the users to actually be able to enjoy features we already
added, from time to time. And time for 5.3 is very much due, and we
still have a bunch of work to do on it.

Stanislav Malyshev, Zend Software Architect
stas@zend.com http://www.zend.com/
(408)253-8829 MSN: stas@zend.com

17 years ago by Andrei Zmievski

Yes, I would rather put it in 5.4 (or whatever the next version) is and
make sure that along with lambdas/closures we have a way of referring to
functions/methods as first-class objects.

-Andrei

17 years ago by Stanislav Malyshev

Hi!

Yes, I would rather put it in 5.4 (or whatever the next version) is and
make sure that along with lambdas/closures we have a way of referring to
functions/methods as first-class objects.

Maybe we could make some object handler so that $object($foo) would work
and treat object as "functional object" called on $foo and then have
both reflection and closure object implement it?
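
As a hedged illustration of what such a handler could look like from
userland: PHP 5.3 and later expose exactly this kind of hook as the
__invoke() magic method, which is an assumption relative to the state of the
engine at the time of this thread. The class name Multiplier is made up for
the sketch.

```php
<?php
// Hypothetical class: an object that can be called like a function.
class Multiplier {
    private $factor;

    public function __construct($factor) {
        $this->factor = $factor;
    }

    // __invoke() runs when the object is used as $object($foo).
    public function __invoke($value) {
        return $value * $this->factor;
    }
}

$double = new Multiplier(2);
$result = $double(21);                          // 42
$mapped = array_map($double, array(1, 2, 3));   // array(2, 4, 6)

echo $result, "\n";
```

Because the object is accepted anywhere a callback is expected, the same
calling convention can serve closures and other "functional objects" alike.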

Stanislav Malyshev, Zend Software Architect
stas@zend.com http://www.zend.com/
(408)253-8829 MSN: stas@zend.com

17 years ago by Lars Strojny

Hi Markus, hi Stas,

On Tuesday, 17.06.2008, at 12:19 -0700, Stanislav Malyshev wrote:
[...]

I'm not Johannes and I didn't review the proposal in detail yet, but I
think we have enough for 5.3 right now. I'd think we better concentrate
on tying the loose ends and rolling beta out and then moving towards the
release than adding more and more features and never releasing it. 5.3
is not the final release until the end of times, there will be 5.4 etc.
and 6, so there will be ample opportunity to add stuff. And 5.3 has
enough stuff to be released, there's no rush to add more new things,
especially radically new ones. My opinion is that we better take some
time with it and not tie it to 5.3.

I would like to see 5.3 released first before we add really cool
features like this. I'm really +1 for closures but I please for 5.4 and

cu, Lars

17 years ago by Christian Seiler

Hi!

I would like to see 5.3 released first before we add really cool
features like this. I'm really +1 for closures but I please for 5.4 and

If I may add my own personal (and biased ;-)) opinion (which may not
count much but I'd like to present the arguments): I'd like to see it in
PHP 5.3. Mainly because of two reasons:

First: My patch is quite non-intrusive, it only adds things in a few
places (new opcode, a few checks). If you only look at the non-generated
files (i.e. excluding files generated by re2c or zend_vm_gen.php), the
patch is actually not even that long:
http://www.christian-seiler.de/temp/closures-php-5.3-2008-06-17-3-redux.diff
Except for the introduction of a 'lexical' keyword I carefully designed
the patch not to have any impact at all on PHP's other behaviour. I'd
be genuinely surprised if any code breaks with my patch. I also don't
see how this would delay 5.3 - of course things have to be tested but at
least as far as I can tell the major showstoppers currently are class
inheritance rules and namespaces which still cause quite a few headaches
of their own.

Second: If closures are not supported in PHP 5.3, even with the release
of PHP 6 backwards compatibility will be a hindrance in using them. Since
PHP 6 will have Unicode support and thus quite a few semantic changes,
this will of course not matter much for the actual PHP applications
since these will have to change anyway. But think of class libraries:
There are many things that can be implemented in class libraries where
unicode support doesn't matter at all - such as for example advanced
date and time calculations (beyond timelib), financial calculations etc.
Such libraries will probably want to maintain compatibility with PHP 5.3
as long as possible. But these libraries may profit from closures.

If you still decide to include closures only from post PHP 5.3, I
suggest to at least declare 'lexical' a reserved keyword in PHP 5.3.

Just my 2 cents...

As to the patch for HEAD: I thought it best to wait for
unicode.semantics to go away along with the if (UG(unicode)) checks
before implementing it (everything else would be a waste of time - since
if I'm not mistaken someone is actually currently removing those). If I
really am mistaken in my interpretation of the discussions here on this
topic and they are not going away (at least not in the short term), I
can of course provide one now (meaning the next few days).

Regards,
Christian

17 years ago by Stanislav Malyshev

Hi!

First: My patch is quite non-intrusive, it only adds things in a few
places (new opcode, a few checks). If you only look at the non-generated

I think it falls into "famous last words" category. While I did not have
time yet to look into the patch in the detail, I have hard time to
believe patch creating wholly new concept in PHP, new opcodes, etc.
would have zero impact. You have to consider at least the following:
tests, documentation, how lexical interacts with other references
(global? static? just variable passed by-ref?), how closure interacts
with various reflection capabilities, how it works with bytecode caches,
what happens with lifetimes of the variables saved in closures -
especially implicit ones like $this, etc., etc. I know these questions
can be answered, and maybe even easily answered, but I think they have
to be answered without pressure of 5.3 release and commitment to the
fixed API hanging over us.

I understand your urge to have it inside ASAP - if you didn't want it,
you'd not gone through this effort to create it :) However, I still
think we better not make 5.3 dependent on yet another new feature.
As for adoption - I think it would take a long time for off-the-shelf
libraries and mainstream users to use this anyway, and for the hackers
among us it will be available in development version pretty soon after
5.3. I think if we would decide that every new feature anybody can think
about should enter into 5.3 because it will be harder to adopt it
otherwise, we'd never release 5.3 at all - look at the RFCs, we have a
bunch of ideas already, and I'm sure there will be more. We need to
release some time - what happened to that "release often" thing?

Please do not consider this to be opinion about (or against) the patch -
I think the idea is good and from preliminary glance the implementation
is very nice too, but IMHO we just can not have everything in one release.

Stanislav Malyshev, Zend Software Architect
stas@zend.com http://www.zend.com/
(408)253-8829 MSN: stas@zend.com

17 years ago by Marcus Boerger

Hello Stanislav,

Wednesday, June 18, 2008, 5:00:09 PM, you wrote:

Hi!

First: My patch is quite non-intrusive, it only adds things in a few
places (new opcode, a few checks). If you only look at the non-generated

I think it falls into "famous last words" category. While I did not have
time yet to look into the patch in the detail, I have hard time to

You have time to answer every little mail. It would only be fair if you
showed respect by at least looking into patches that people provide
because they tried to address long outstanding issues and even address
every little comment we all including you made in the past. Given the wiki
he even clearly showed that he understands what he is doing and that he
did care about a hell of detail.

Best regards,
Marcus

17 years ago by Christopher Jones

Christian Seiler wrote:

As a followup to the discussion in January, I'd like to post a revised
patch to this list that implements closures and anonymous functions
in PHP.

Did I miss seeing its phpt tests?
A really thorough test suite might make the case for inclusion in PHP 5.3.

Chris

--
Christopher Jones, Oracle
Email: christopher.jones@oracle.com Tel: +1 650 506 8630
Blog: http://blogs.oracle.com/opal/ Free PHP Book: http://tinyurl.com/f8jad

17 years ago by Andi Gutmans

Hi Christian,

This is a very nice piece of work. Definitely addresses a lot of the issues we have raised in the past.
I would like to see such a solution make its way into PHP (see below re: timing).

There are some things I'd like to consider:

I am not sure that the current semantics of the "lexical" keyword is great in all cases. Is the reason why you don't allow by-value binding so that we don't have to manage more than one lambda instance per declaration?
[minor curiosity - do we want to consider reusing "parent" instead of "lexical"? I guess that could be confusing but it's not the first time we reuse a keyword when it's clear that the usage is in two different places (this is minor and I don't mind much either way although lexical doesn't mean too much to me).]
I am concerned about binding to classes. First of all we need to look into more detail what the implications are for bytecode caches when changing class entries at run-time. We may want to also consider an option where the lambda binds to the object and only has public access although I realize that may be considered by some as too limiting. We'll review these two things in the coming days.

Re: timing, I think the biggest issue we have right now with PHP 5.3 is that we are not making a clear cut on features. There's always pressure on release managers to include more (I went through the same with 5.0) but at some point you just have to stop at some place or things will never go out as there are always good ideas flowing in. Unfortunately with 5.3 that cut isn't happening and it seems to drag out longer than needed. I prefer having this discussion in the context of a hard date for a beta release after which we'll be especially strict with accepting new features. Each new feature will drag out the beta/RC cycle as they need enough time for testing/feedback/tweaks.

Andi

-----Original Message-----
From: Christian Seiler [mailto:chris_se@gmx.net]
Sent: Monday, June 16, 2008 10:39 AM
To: php-dev List
Subject: [PHP-DEV] [PATCH] [RFC] Closures and lambda functions in PHP

Hi,

As a followup to the discussion in January, I'd like to post a revised patch to
this list that implements closures and anonymous functions in PHP.

INTRODUCTION

Closures and lambda functions can make programming much easier in
several ways:

Lambda functions allow the quick definition of throw-away functions
that are not used elsewhere. Imagine for example a piece of code that
needs to call preg_replace_callback(). Currently, there are three
possibilities to achieve this:

  a. Define the callback function elsewhere. This distributes code that
     belongs together throughout the file and decreases readability.

  b. Define the callback function in-place (but with a name). In that case
     one has to use function_exists() to make sure the function is only
     defined once. Example code:

       <?php
         function replace_spaces ($text) {
           if (!function_exists ('replace_spaces_helper')) {
             function replace_spaces_helper ($matches) {
               return str_replace (' ', '&nbsp;', $matches[1]).' ';
             }
           }
           return preg_replace_callback ('/( +) /', 'replace_spaces_helper',
                                         $text);
         }
       ?>

     Here, the additional if() around the function definition makes the
     source code difficult to read.

  c. Use the present create_function() in order to create a function at
     runtime. This approach has several disadvantages: First of all, syntax
     highlighting does not work because a string is passed to the function.
     It also compiles the function at run time and not at compile time so
     opcode caches can't cache the function.

Closures provide a very useful tool in order to make lambda functions even
more useful. Just imagine you want to replace 'hello' with 'goodbye' in
all elements of an array. PHP provides the array_map() function which
accepts a callback. If you don't want to hard-code 'hello' and 'goodbye'
into your source code, you have only four choices:

  a. Use create_function(). But then you may only pass literal values
     (strings, integers, floats) into the function, objects at best as
     clones (if var_export() allows for it) and resources not at all. And
     you have to worry about escaping everything correctly. Especially when
     handling user input this can lead to all sorts of security issues.

  b. Write a function that uses global variables. This is ugly,
     non-reentrant and bad style.

  c. Create an entire class, instantiate it and pass the member function
     as a callback. This is perhaps the cleanest solution for this problem
     with current PHP but just think about it: Creating an entire class for
     this extremely simple purpose and nothing else seems overkill.

  d. Don't use array_map() but simply do it manually (foreach). In this
     simple case it may not be that much of an issue (because one simply
     wants to iterate over an array) but there are cases where doing
     something manually that a function with a callback as parameter does
     for you is quite tedious.

[Yes, I know that str_replace also accepts arrays as a third parameter so
this example may be a bit useless. But imagine you want to do a more
complex operation than simple search and replace.]

PROPOSED PATCH

I now propose a patch that implements compile-time lambda functions and
closures for PHP while keeping the patch as simple as possible. The patch is
based on a previous patch of mine which was based on ideas discussed here
at the end of December / start of January.

Userland perspective

The patch adds the following syntax as a valid expression:

function & (parameters) { body }

(The & is optional and indicates - just as with normal functions - that the
anonymous function returns a reference instead of a value)

Example usage:

$lambda = function () { echo "Hello World!\n"; };

The variable $lambda then contains a callable resource that may be called
through different means:

$lambda ();
call_user_func ($lambda);
call_user_func_array ($lambda, array ());
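
The three calling styles can be tried out with the sketch below. One
assumption to flag: in PHP as eventually released (5.3 and later) the
variable holds a Closure object, whereas this patch returns a resource; the
calling syntax is the same. The variable names are made up for illustration.

```php
<?php
// A lambda stored in a variable (a Closure object in released PHP).
$lambda = function ($name) { return "Hello $name!"; };

// Direct call through the variable.
$direct  = $lambda('World');
// Call through the generic callback helpers.
$viaFunc = call_user_func($lambda, 'World');
$viaArr  = call_user_func_array($lambda, array('World'));

echo $direct, "\n"; // prints "Hello World!"
```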

This allows for simple lambda functions, for example:

  function replace_spaces ($text) {
    $replacement = function ($matches) {
      return str_replace (' ', '&nbsp;', $matches[1]).' ';
    };
    return preg_replace_callback ('/( +) /', $replacement, $text);
  }

The patch implements closures by defining an additional keyword 'lexical'
that allows a lambda function (and only a lambda function) to import
a variable from the "parent scope" to the lambda function scope. Example:

function replace_in_array ($search, $replacement, $array) {
    $map = function ($text) {
        lexical $search, $replacement;
        if (strpos ($text, $search) > 50) {
            return str_replace ($search, $replacement, $text);
        } else {
            return $text;
        }
    };
    return array_map ($map, $array);
}

The variables $search and $replacement are variables in the scope of the
function replace_in_array() and the lexical keyword imports these variables
into the scope of the closure. The variables are imported as a reference,
so any change in the closure will result in a change in the variable of the
function itself.

If a closure is defined inside an object, the closure has full access
to the current object through $this (without the need to use 'lexical' to
import it separately) and all private and protected methods of that class.
This also applies to nested closures. Essentially, closures inside methods
are added as public methods to the class that contains the original method.

Closures may live longer than the methods that declared them. It is
perfectly possible to have something like this:

function getAdder($x) {
    return function ($y) {
        lexical $x;
        return $x + $y;
    };
}
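
A minimal usage sketch of the adder example above. It assumes the `use ($x)` capture syntax that PHP 5.3 eventually shipped, not the `lexical` keyword proposed in this patch; the variable name `$addFive` is only illustrative.

```php
<?php
// Capture $x from the creating scope; the closure outlives getAdder().
function getAdder($x) {
    return function ($y) use ($x) {
        return $x + $y;
    };
}

$addFive = getAdder(5);   // getAdder() has already returned at this point
echo $addFive(3), "\n";   // 8
echo $addFive(10), "\n";  // 15
```

The captured value stays available to the closure even though the function that created it has finished, which is the property the paragraph above describes.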

Zend internal perspective

The patch basically changes the following in the Zend engine:

When the compiler reaches a lambda function, it creates a unique name for that
function ("\0__compiled_lambda_FILENAME_N" where FILENAME is the name of the
file currently processed and N is a per-file counter). The use of the filename
in the function name ensures compatibility with opcode caches. The lambda
function is then immediately added to the function table (either the global
function table or that of the current class if declared inside a class method).
Instead of a normal ZEND_DECLARE_FUNCTION opcode the new
ZEND_DECLARE_LAMBDA_FUNC is used as an opcode at this point. The op_array
of the new function is initialized with is_lambda = 1 and is_closure = 0.

When parsing a 'lexical' declaration inside an anonymous function the parser
saves the name of the variable that is to be imported in an array stored
as a member of the op_array structure (lexical_names).

The opcode handler for ZEND_DECLARE_LAMBDA_FUNC does the following: First of
all it creates a new op_array and copies the entire memory structure of the
lambda function into it (the opcodes themselves are not copied since they
are only referenced in the op_array structure). Then it sets is_closure = 1
on the new op_array, and for each lexical variable name that the compiler
added to the original op_array it creates a reference to that variable from
the current scope into a HashTable member in the new op_array. It also saves
the current object pointer ($this) as a member of the op_array in order to
allow for the closure to access $this. Finally it registers the new op_array
as a resource and returns that resource.

The opcode handler of the 'lexical' construct simply fetches the variable
from that HashTable and imports it into local scope of the inner function
(just like with 'global' only with a different hash table).

Some hooks were added that allow the 'lambda function' resource to be called.
Also, there are several checks in place that make sure the lambda function
is not called directly, i.e. if someone explicitly tries to use the internal
function name instead of using the resource return value of the declaration.

The patch

The patch is available here:
http://www.christian-seiler.de/temp/closures-php-5.3-2008-06-16-1.diff

Please note that I did NOT include the contents of zend_language_scanner.c
in the patch since that can easily be regenerated and just takes up enormous
amounts of space.

The patch itself applies against the 5.3 branch of PHP.

If I understand the discussion regarding PHP6 on this list correctly, some
people are currently undergoing the task of removing the unicode_semantics
switch and if (UG(unicode)). As soon as this task is finished I will also
provide a patch for CVS HEAD (it doesn't make much sense adopting the patch
now and then having to change it again completely afterwards).

BC BREAKS

Introduction of a new keyword 'lexical'. Since it is very improbable that
someone should use it as a function, method, class or property name, I
think this is an acceptable break.

Other than that, I can find no BC breaks in my patch.

CAVEATS / POSSIBLE WTFS

On writing $func = function () { }; there is a semicolon necessary. If left
out it will produce a compile error. Since any attempt to remove that
necessity would unnecessarily bloat the grammar, I suggest we simply keep
it the way it is. Also, Lukas Kahwe Smith pointed out that a single
trailing semicolon after a closing brace already exists: do { } while ();

The fact that 'lexical' creates references may cause certain WTFs:

for ($i = 0; $i < 10; $i++) {
    $arr[$i] = function () { lexical $i; return $i; };
}

This will not work as expected since $i is a reference and thus all
created closures would reference the same variable. In order to get this
right one has to do:

for ($i = 0; $i < 10; $i++) {
    $loopIndex = $i;
    $arr[$i] = function () { lexical $loopIndex; return $loopIndex; };
    unset ($loopIndex);
}

This can be a WTF for people that don't expect lexical to create an
actual reference, especially since other languages such as JavaScript
don't do it. On the other hand, global and static both DO create
references so that behaviour is consistent with current PHP.
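
The loop pitfall can be reproduced with a short sketch. It assumes the capture syntax that later shipped in PHP 5.3 (`use (&$i)` for by-reference, `use ($i)` for by-value) in place of the proposed `lexical` keyword; the array names `$byRef` and `$byVal` are illustrative only.

```php
<?php
$byRef = array();
$byVal = array();

for ($i = 0; $i < 3; $i++) {
    // By reference: every closure shares the single loop variable $i.
    $byRef[$i] = function () use (&$i) { return $i; };
    // By value: each closure keeps the value $i had when it was created.
    $byVal[$i] = function () use ($i) { return $i; };
}

// After the loop $i is 3, so every by-reference closure sees 3.
echo $byRef[0](), ' ', $byRef[2](), "\n";  // 3 3
echo $byVal[0](), ' ', $byVal[2](), "\n";  // 0 2
```

The by-reference half shows the same effect as the 'lexical' example above; the by-value half behaves like the $loopIndex workaround.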

But complex constructions such as this will probably not be used by
beginners so maintaining a good documentation should solve this.

The fact that 'lexical' is needed at all may cause WTFs. Other languages
such as JavaScript implicitly have the entire scope visible to child
functions. But since PHP does the same thing with global variables, I
find a keyword like 'lexical' much more consistent than importing the
entire scope (and always importing the entire scope costs unnecessary
performance).

FINAL THOUGHTS

The patch I am now proposing addresses the two main problems of my previous
patch: support for closures in objects (with access to $this) and opcode
caches. My patch applies against PHP_5_3 and does not break any tests. It
adds a valuable new language feature which I'd like to see in PHP.

Regards,
Christian

17 years ago by Alexey Zakhlestin

I am not sure that the current semantics of the "lexical" keyword is great in all cases. Is the reason why you don't allow by-value binding so that we don't have to manage more than one lambda instance per declaration?

by-reference binding is much closer to other languages' semantics. I
guess that was the main reason Christian chose it.
"by-value" may still exist, if people find, that they need it, but
only in addition, please.

lambda has to reflect changing state of context, to be truly useful

--
Alexey Zakhlestin
http://blog.milkfarmsoft.com/

17 years ago by Gwynne Raskind

I am not sure that the current semantics of the "lexical" keyword is
great in all cases. Is the reason why you don't allow by-value binding
so that we don't have to manage more than one lambda instance per
declaration?

by-reference binding is much closer to other languages' semantics. I
guess that was the main reason Christian chose it.
"by-value" may still exist, if people find, that they need it, but
only in addition, please.

lambda has to reflect changing state of context, to be truly useful

In Lua, the language in which I've seen the most of closures and
lambda, lexical scoping is handled this way:

someVariable1 = "asdf";
someVariable2 = "jkl;";
SomeFunction = function()
    local someVariable2 = "1234";

    print(someVariable1.." "..someVariable2);
end
print(type(SomeFunction));
SomeFunction();
someVariable1 = "qwer";
someVariable2 = "0987";
SomeFunction();

The resulting output of this code fragment would be:
function
asdf 1234
qwer 1234

The Lua interpreter handles this by resolving variable references as
they're made; "someVariable1" is looked up in the closure's scope and
not found, so the interpreter steps out one scope and looks for it
there, repeat as necessary. Once found outside the closure's scope,
something similar to the proposed "lexical" keyword happens. Closures
and lexical variables can be nested this way, to the point where a
single variable in a sixth-level closure could still have been
originally found in the global scope.

I'm not sure this would work for PHP, I'm curious what others think.

Of course, that fragment does a very poor job of showing off the
extreme flexibility of Lua with regards to functions and scoping, but
hopefully it illustrates the concept.

-- Gwynne, Daughter of the Code
"This whole world is an asylum for the incurable."

17 years ago by Richard Quadling

2008/6/18 Gwynne Raskind gwynne@wanderingknights.org:

I am not sure that the current semantics of the "lexical" keyword is
great in all cases. Is the reason why you don't allow by-value binding so
that we don't have to manage more than one lambda instance per declaration?

by-reference binding is much closer to other languages' semantics. I
guess that was the main reason Christian chose it.
"by-value" may still exist, if people find, that they need it, but
only in addition, please.

lambda has to reflect changing state of context, to be truly useful

In Lua, the language in which I've seen the most of closures and lambda,
lexical scoping is handled this way:

someVariable1 = "asdf";
someVariable2 = "jkl;";
SomeFunction = function()
    local someVariable2 = "1234";
    print(someVariable1.." "..someVariable2);
end
print(type(SomeFunction));
SomeFunction();
someVariable1 = "qwer";
someVariable2 = "0987";
SomeFunction();

The resulting output of this code fragment would be:
function
asdf 1234
qwer 1234

The Lua interpreter handles this by resolving variable references as
they're made; "someVariable1" is looked up in the closure's scope and not
found, so the interpreter steps out one scope and looks for it there, repeat
as necessary. Once found outside the closure's scope, something similar to
the proposed "lexical" keyword happens. Closures and lexical variables can
be nested this way, to the point where a single variable in a sixth-level
closure could still have been originally found in the global scope.

I'm not sure this would work for PHP, I'm curious what others think.

Of course, that fragment does a very poor job of showing off the extreme
flexibility of Lua with regards to functions and scoping, but hopefully it
illustrates the concept.

-- Gwynne, Daughter of the Code
"This whole world is an asylum for the incurable."

--

Is "nested scope" just the same as "namespace" in this regard?

--

Richard Quadling
Zend Certified Engineer : http://zend.com/zce.php?c=ZEND002498&r=213474731
"Standing on the shoulders of some very clever giants!"

17 years ago by Stanislav Malyshev

Hi!

The Lua interpreter handles this by resolving variable references as
they're made; "someVariable1" is looked up in the closure's scope and
not found, so the interpreter steps out one scope and looks for it

You may get into a problem here - creator's scope may not exist when you
execute the closure, and using caller's scope would be very unexpected -
usually closures are intended to capture part of creating environment,
not calling environment. It would also impose serious penalty if you
just use undefined variable - you'd have to go through whole stack up to
the top.

there, repeat as necessary. Once found outside the closure's scope,
something similar to the proposed "lexical" keyword happens. Closures

lexical in the proposal binds to creator's scope, not caller's scope, as
I understood. Anyway, binding to caller's immediate scope doesn't seem
that useful since you could just pass it as a parameter when calling.

Stanislav Malyshev, Zend Software Architect
stas@zend.com http://www.zend.com/
(408)253-8829 MSN: stas@zend.com

17 years ago by Gwynne Raskind

The Lua interpreter handles this by resolving variable references
as they're made; "someVariable1" is looked up in the closure's
scope and not found, so the interpreter steps out one scope and
looks for it

You may get into a problem here - creator's scope may not exist when
you execute the closure, and using caller's scope would be very
unexpected - usually closures are intended to capture part of
creating environment, not calling environment. It would also impose
serious penalty if you just use undefined variable - you'd have to
go through whole stack up to the top.

This lookup happens at the time the closure is first declared, and the
value is stored for later use by the closure; the calling scope
doesn't need to exist anymore. The problem with going to the top of
the stack is an issue, though; the Lua interpreter's idea of "scope"
is rather different from PHP's, and it's not nearly the same penalty
there.

-- Gwynne, Daughter of the Code
"This whole world is an asylum for the incurable."

17 years ago by Stanislav Malyshev

Hi!

This lookup happens at the time the closure is first declared, and the
value is stored for later use by the closure; the calling scope doesn't

This would work for $var, but what about $$var and various other ways of
indirect variable access?

Stanislav Malyshev, Zend Software Architect
stas@zend.com http://www.zend.com/
(408)253-8829 MSN: stas@zend.com

17 years ago by Christian Seiler

Hi,

[I'm going to collect here a bit:]

Stanislav Malyshev wrote:

lexical in the proposal binds to creator's scope, not caller's scope, as
I understood. Anyway, binding to caller's immediate scope doesn't seem
that useful since you could just pass it as a parameter when calling.

Correct and I completely agree.

Chris Stockton wrote:

I am curious if is_callable will be able to detect these?

Yes, as is call_user_func able to call them. (But as I was verifying
that I saw that there was a tiny bug in the code that makes sure the
internal names are not used directly, I will fix that.)
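
A small sketch of how such a check looks from userland. It assumes the form in which closures were eventually released in PHP 5.3 (objects of class `Closure` rather than the resources used in this version of the patch); the variable name `$greet` is illustrative.

```php
<?php
$greet = function ($name) {
    return "Hello $name";
};

// A closure is accepted wherever a callback is expected.
var_dump(is_callable($greet));               // bool(true)
var_dump($greet instanceof Closure);         // bool(true)
echo call_user_func($greet, 'World'), "\n";  // Hello World
```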

Richard Quadling wrote:

[JS Example]
I'm not sure I would say that there is a reference used there.

I'm no expert, but x, f() and n are all working like normal variables with x
having to look outside of f().

Unless I'm getting confused with "pass by reference".

Yes, sure, OK, it's not a reference in the classical sense BUT the
effect is the same: A change INSIDE the closure changes the variable
outside. The only useful way of doing that in PHP without rewriting the
complete engine is using references - and such things are already done
via references - namely global variables: If you import a variable via
global, in reality you are creating a reference to the actual global
variable - global $foo is actually more or less the same as $foo =&
$GLOBALS['foo'].
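
The equivalence between `global $foo` and a reference into `$GLOBALS` can be checked with a short sketch. The function names and the variable `$counter` are made up for illustration, and the script is assumed to run at the top level of a file so that `$counter` is a true global.

```php
<?php
$counter = 0;

function incrementViaGlobal() {
    global $counter;                  // local name bound by reference to the global
    $counter++;
}

function incrementViaGlobalsArray() {
    $counter =& $GLOBALS['counter'];  // roughly what 'global $counter' does
    $counter++;
}

incrementViaGlobal();
incrementViaGlobalsArray();
echo $GLOBALS['counter'], "\n";       // 2
```

Both functions change the same global variable, because in each case the local name is only a reference to it.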

Regards,
Christian

17 years ago by Christian Seiler

Hi!

I am not sure that the current semantics of the "lexical" keyword
is great in all cases. Is the reason why you don't allow by-value
binding so that we don't have to manage more than one lambda instance
per declaration?

First of all: global and static are also used to create references to
other variables (OK, with static they are not visible to the outside,
but nevertheless...) and second because other languages do the same. As
someone corrected me a while ago, even JS uses references, test the
following JavaScript code:

function foo () {
    var x = 5;
    var f = function (n) {
        x += n;
    };
    alert (x);
    f (2);
    alert (x);
}

foo ();

That will yield first 5 and then 7.

[minor curiosity - do we want to consider reusing "parent" instead
of "lexical"? I guess that could be confusing but it's not the first
time we reuse a keyword when it's clear that the usage is in two
different places (this is minor and I don't mind much either way
although lexical doesn't mean too much to me).]

Consider this code:

class A {
    public function printSomething ($var) {
        echo "$var\n";
    }
}

class B extends A {
    public function printSomething ($var) {
        $printer = function () {
            parent $var;
            parent::printSomething ('I print: ' . $var);
        };
        $printer ();
    }
}

Yeah, of course, my example is extremely stupid since it could be done
entirely without closures but I really dread the perspective of having
to explain someone the difference between those two lines...

I am concerned about binding to classes. First of all we need to
look into more detail what the implications are for bytecode caches
when changing class entries at run-time.

Well, that's the thing: My patch does NOT change classes at runtime, so
that is totally a non-issue. :-)

When creating a lambda function inside a class method, it adds a new
class method for the lambda function at compile time (!). This
compile-time added method has a dynamic name (__compiled_lambda_F_N
where F is the filename and N is a per-file counter). To an opcode cache
processing this class this added method will appear no different than a
normal class method - it can be cached just the same.

Now, upon execution of the code containing the closure, the new opcode
just copies the zend_function structure into a copy, registers that copy
as a resource and returns that resource. As soon as the resource is
garbage collected (or explicitly unset), the op_array copy is destroyed.
No modification of the actual class is done at all - the cache remains
happy.

Just for clarity I have posted a sample output of PHP with my Patch and
VLD active (<153> is the new ZEND_DECLARE_LAMBDA_FUNC opcode that VLD
does not yet know about):

http://www.christian-seiler.de/temp/php-closure-opcodes.txt

Perhaps this helps to understand better how my patch works?

We may want to also consider an option where the lambda binds to the
object and only has public access although I realize that may be
considered by some as too limiting. We'll review these two things in
the coming days.

What do you mean with "binds to the object"?

But if you only want to grant access to public object members: If I
declare a closure inside a class method, from a programmer's point of
view I am still within that class - why shouldn't I be able to access
all class properties there? I would find such a limitation quite odd
(and technically unnecessary).

Regards,
Christian

17 years ago by Chris Stockton

Hello,

I am curious if is_callable will be able to detect these? Or do we need an
is_lambda, or is_function or something. You may have mentioned it but reading
through I did not notice. I am only curious how to know when someone passed
me one of these. Maybe a type hint would be nice too but that is a different
conversation I guess.

-Chris

17 years ago by Andi Gutmans

Hi Christian,

Thanks for the clarifications. This helped a lot and makes me feel very confident about this implementation. I think this is a very strong proposal.

A few additional things I thought about while taking a closer look:

You mention "global" and "static" as examples of how we do things today. They are actually not good examples because the binding by reference which they do has been a real pain over the years. This is why we introduced the $GLOBALS[] array so that you could also assign by reference ($GLOBALS["foo"] =& $var). Now that I think of this example I'd actually prefer to see $LEXICALS[] or something similar to access variables than go with the broken global/static behavior. This will bite us and people will complain... In general, I always recommend to people to keep away from "global" and go with "$GLOBALS[]".
Minor implementation suggestion: I am not sure we need those flags for closures and have those if() statements before function calls. We took the same approach with other obfuscated functions/methods/variables. If the developer really wants to cheat the engine and assemble an obfuscated name then he can. It's like doing the following in C: ((fun(*)()) 0x454544)(). I say, be my guest. It just simplifies implementation a bit. No biggy but consistent with the rest of PHP.
Please check eval(). I assume it will bind to global scope but let's just make sure what happens esp. when it's called from within a method...
In PHP 5, object storage is resources done right. I don't think we should be using the resource infrastructure for this implementation and would prefer to use the object one. It's better. I suggest to take a look at it.

Will also look into byte code cache implementation issues incl. performance pieces but it looks like there shouldn't be any show stoppers here but I want to verify.

Thanks again for your hard work!

Andi

17 years ago by Stanislav Malyshev

Hi!

by reference ($GLOBALS["foo"] =& $var). Now that I think of this
example I'd actually prefer to see $LEXICALS[] or something similar

The problem here might be that if we do something like $LEX[$foo] in
runtime, we don't know which parts of parent's scope we need to
preserve. While I like the syntax, it may not work this way.

Which brings me to another point - how bad would it be if closure's
lifetime would be limited to parent's lifetime? Of course, this would
limit some tricks, but would allow some other - like this direct access
to parent's scope.

Minor implementation suggestion: I am not sure we need those flags
for closures and have those if() statements before function calls. We

In any case I think we don't need to waste 2 bytes (or more with
alignment) on something that's essentially 2 bits. I know it's
nitpicking, but every little bit helps :) Of course, if we drop the
flags the point is moot.

Stanislav Malyshev, Zend Software Architect
stas@zend.com http://www.zend.com/
(408)253-8829 MSN: stas@zend.com

17 years ago by Alexey Zakhlestin

Which brings me to another point - how bad would it be if closure's
lifetime would be limited to parent's lifetime? Of course, this would limit
some tricks, but would allow some other - like this direct access to
parent's scope.

that would be seriously bad, because it will eliminate possibility of
lambda-generating functions

--
Alexey Zakhlestin
http://blog.milkfarmsoft.com/

17 years ago by Christian Seiler

Hi Andi, Hi Stanislav,

You mention "global" and "static" as examples of how we do things
today. They are actually not good examples because the binding by
reference which they do has been a real pain over the years. This
is why we introduced the $GLOBALS[] array so that you could also
assign by reference ($GLOBALS["foo"] =& $var). Now that I think of
this example I'd actually prefer to see $LEXICALS[] or something
similar to access variables than go with the broken global/static
behavior. This will bite us and people will complain... In general,
I always recommend to people to keep away from "global" and go with
"$GLOBALS[]".

The problem here might be that if we do something like $LEX[$foo] in
runtime, we don't know which parts of parent's scope we need to
preserve. While I like the syntax, it may not work this way.

Yes, that's the point. 'lexical $foo' does two things (instead of global
simply doing one thing):

a) At compile time (!) remember the name of the variable specified
in an internal list assigned to the function. Example:

 function () {
   lexical $data, $i;
 }

 This will cause op_array->lexical_names to be the list "data", "i".

b) At run time, the lexical keyword creates a reference to the
lexical variables that are stored in op_array->lexical_variables
(just as global does with global scope)

The op_array->lexical_variables itself is filled in the new opcode which
is executed upon assignment (read: creation) of the closure. It's
essentially a for loop that goes through op_array->lexical_names and
adds a reference from the current symbol table to the
op_array->lexical_variables table.
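
The binding loop described above can be modelled in userland with plain arrays and references. This is only a hypothetical illustration: `bindLexicals()`, `$symbolTable` and `$scope` are invented names, and the real work happens in C inside the opcode handler, not in PHP code.

```php
<?php
// Hypothetical userland model of the binding step; not engine code.
// $lexicalNames plays the role of op_array->lexical_names,
// the returned array plays the role of op_array->lexical_variables.
function bindLexicals(array $lexicalNames, array &$symbolTable) {
    $lexicalVariables = array();
    foreach ($lexicalNames as $name) {
        // Store a reference to the variable in the current scope.
        $lexicalVariables[$name] =& $symbolTable[$name];
    }
    return $lexicalVariables;
}

// Stand-in for the symbol table of the scope that creates the closure.
$scope = array('data' => 'foo', 'i' => 4, 'func' => null);
$bound = bindLexicals(array('data', 'i'), $scope);

$scope['i'] = 5;            // a change in the creating scope ...
echo $bound['i'], "\n";     // 5 (... is visible through the binding)

$bound['data'] = 'bar';     // and a change through the binding ...
echo $scope['data'], "\n";  // bar (... reaches the creating scope)
```

Only the names listed at compile time are bound, which is why the full parent scope does not have to be kept alive.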

So, to make an example (with line numbers for reference):

1 $data = "foo";
2 $i = 4;
3 $func = function () {
4     lexical $data, $i;
5     return array ($data, $i);
6 };
7
8 $func ();

Step 1 (Line 4 at compile time): op_array->lexical_names is set to
"data", "i".

Step 2 (Line 3 at run time): The ZEND_DECLARE_LAMBDA_FUNC opcode is
executed, it creates a copy of the op_array to store in the return
value, in the copy it initializes the hash table
op_array->lexical_variables and then creates two new variables in
op_array->lexical_variables which are references to the current scope
variables $data and $i:

+---------------+                  +-------------------------+
| lex_variables |                  | EG(active_symbol_table) |
+---------------+       ref        +-------------------------+
| data    ------|------------------|-> data                  |
| i       ------|------------------|-> i                     |
|               |                  |   func                  |
|               |                  |   ...                   |
+---------------+                  +-------------------------+

Step 3 (Line 8 at run time): The closure is executed.

Step 4 (Line 4 at run time): The lexical keyword retrieves the $data and
$i variables from op_array->lexical_variables and adds a reference to
them:

+-------------------------+      +---------------+      +-------------+
| EG(active_symbol_table) |      | lex_variables |      | parent s.t. |
+-------------------------+      +---------------+      +-------------+
| data            --------|------|-> data     ---|------|-> data      |
| i               --------|------|-> i        ---|------|-> i         |
|                         |      |               |      |   func      |
|                         |      |               |      |   ...       |
+-------------------------+      +---------------+      +-------------+

Btw: The grammar for lexical_variable contains only T_VARIABLE (and not
${...} etc.) on purpose - to be sure the name can be extracted at
compile time.

(Just as a clarification how the patch internally works.)

Frankly, I don't really see a problem with using references. It fits
into what's already there in PHP and it assures that closures have the
necessary properties to make them useful.

Which brings me to another point - how bad would it be if closure's
lifetime would be limited to parent's lifetime? Of course, this would
limit some tricks, but would allow some other - like this direct
access to parent's scope.

That "trick" would actually completely destroy the concept of closures:
The idea behind closures is that the lexical scope during creation of
the closure is saved. If you say "I want direct access via $LEXICALS"
then the lexical scope during the execution of the closure will be
used (yeah sure, the scope will be the scope during the creation of the
closure but the state of that scope will be the scope during execution
and not creation - "unbinding" variables after defining the closure (and
therefore for example loops) will not be possible at that point).

Furthermore, the idea that the closure lives longer than the scope in
which it was declared is one of the other most basic ideas behind
closures. Also, consider this code:

function foo () {
    $lambda = function () { echo "Hello World!\n"; };
    $lambda ();
    return $lambda;
}

$lambda = foo ();
$lambda (); # What should happen here?

That would be a major WTF in my eyes... You return something that was
perfectly valid inside the function WITHOUT CHANGE TO IT and just
because you leave the function it becomes invalid?

Personally, I don't like the idea of dumping two essential concepts of
closures just because using variable references may seem a bit of a pain
in some way.

Minor implementation suggestion: I am not sure we need those
flags for closures and have those if() statements before function
calls. We took the same approach with other obfuscated
functions/methods/variables. If the developer really wants to
cheat the engine and assemble an obfuscated name then he can. It's
like doing the following in C: ((fun(*)()) 0x454544)(). I say, be
my guest. It just simplifies implementation a bit. No biggy but
consistent with the rest of PHP.

Personally, I do like to catch all possible errors, even if they don't
matter that much. But if you think this is superfluous, I can remove it.

Granted, if called directly without being a closure, all lexical
variables will be NULL, so it doesn't really represent a problem.

Please check eval(). I assume it will bind to global scope but
let's just make sure what happens esp. when it's called from within
a method...

Hmm, closures inside eval() will bind variables to the scope in which
eval() was called. But closures defined inside eval will NOT be class
methods, even if eval() is called within a class.

But I do find that behaviour consistent with what PHP currently does
with normal functions and variables: If code is eval()'d or include()'d
inside a function, the "global scope" of eval() or the included file will
actually be the local function scope, whereas functions defined inside
will automatically become global functions.

Of course, this behaviour should be documented but I don't see a reason
to try and change it.
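
The existing behaviour for ordinary variables and functions, which closures are compared to here, can be seen in a short sketch. The names `runEval()`, `$local` and `defined_in_eval()` are illustrative only.

```php
<?php
function runEval() {
    $local = 'set before eval';
    // Variables inside eval() live in the calling function's local scope,
    // but a function declared inside eval() becomes a global function.
    eval('$local = "changed by eval"; function defined_in_eval() { return "global function"; }');
    return $local;
}

$result = runEval();
echo $result, "\n";                            // changed by eval
var_dump(function_exists('defined_in_eval'));  // bool(true)
echo defined_in_eval(), "\n";                  // global function
```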

In PHP 5, object storage is resources done right. I don't think
we should be using the resource infrastructure for this
implementation and would prefer to use the object one. It's better.
I suggest to take a look at it.

Hmm, seems like a good idea. If nobody objects in the next few days,
I'll rewrite my patch to use objects instead of resources. What class
name do you suggest?

Regards,
Christian

PS: Somebody made me aware of a segfault in my code when destroying the
closure variable while still inside the closure. I'll fix that.

17 years ago by Andi Gutmans

See below:

-----Original Message-----
From: Christian Seiler [mailto:chris_se@gmx.net]
Sent: Wednesday, June 18, 2008 1:14 PM
To: php-dev List
Subject: Re: [PHP-DEV] [PATCH] [RFC] Closures and lambda functions in PHP

Frankly, I don't really see a problem with using references. It fits
into what's already there in PHP and it assures that closures have the
necessary properties to make them useful.

I think you are right that there isn't really a good alternative as the "parent" scope does not necessarily exist anymore. Your solution is likely the best.

Please check eval(). I assume it will bind to global scope but
let's just make sure what happens esp. when it's called from within
a method...

Hmm, closures inside eval() will bind variables to the scope in which
eval() was called. But closures defined inside eval will NOT be class
methods, even if eval() is called within a class.

But I do find that behaviour consistent with what PHP currently does
with normal functions and variables: If code is eval()'d or include()'d
inside a function, the "global scope" of eval() or the included file will
actually be the local function scope, whereas functions defined inside
will automatically become global functions.

Of course, this behaviour should be documented but I don't see a reason
to try and change it.

I agree. It behaves as I would expect I just wanted to make sure you verify that because I didn't have the opportunity to do so. You'd actually have to work very hard for it not to behave in that way :) Let's just make sure we have unit tests for both cases just so we have a good regression on this one.

In PHP 5, object storage is resources done right. I don't think
we should be using the resource infrastructure for this
implementation and would prefer to use the object one. It's better.
I suggest to take a look at it.

Hmm, seems like a good idea. If nobody objects in the next few days,
I'll rewrite my patch to use objects instead of resources. What class
name do you suggest?

Great. I think Closure is probably a good name.
[Btw, if we want to get fancy we could even have a __toString() method on those which would print out information about the Closure. But this is not a must, just something which eventually could be nice for debugging purposes...]

PS: Somebody made me aware of a segfault in my code when destroying the
closure variable while still inside the closure. I'll fix that.

:)

Thanks,
Andi

17 years ago by troels knak-nielsen

In PHP 5, object storage is resources done right. I don't think
we should be using the resource infrastructure for this
implementation and would prefer to use the object one. It's better.
I suggest to take a look at it.

Hmm, seems like a good idea. If nobody objects in the next few days,
I'll rewrite my patch to use objects instead of resources. What class
name do you suggest?

Great. I think Closure is probably a good name.
[Btw, if we want to get fancy we could even have a __toString() method on those which would print out information about the Closure. But this is not a must, just something which eventually could be nice for debugging purposes...]

Using objects instead of resources is an excellent idea. Would it be
possible to introduce a general __invoke (or whatever name is more
fitting) magic method, so that any object implementing that method
is callable with call_user_func (and directly through
variable-function syntax)? E.g.:

class Foo {
    function __invoke($thing) {
        echo "Foo: " . $thing;
    }
}

$foo = new Foo();
$foo("bar"); // echoes "Foo: bar"

I'm not sure how this would play together with lexical scope?
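
For illustration, here is a minimal sketch of the callable-object idea. It assumes the __invoke() magic method in the form that later shipped with PHP 5.3, which may differ from what was being discussed at this point in the thread. Greeter is a hypothetical class; note that its state lives in a property rather than in a captured lexical variable.

```php
<?php
// Sketch only: relies on __invoke() as released in PHP 5.3+.
// Greeter is a hypothetical class used purely for illustration.
class Greeter
{
    private $prefix;

    public function __construct($prefix)
    {
        $this->prefix = $prefix;
    }

    // Runs when the object is used with function-call syntax.
    public function __invoke($thing)
    {
        return $this->prefix . ': ' . $thing;
    }
}

$foo = new Greeter('Foo');

$direct   = $foo('bar');                       // variable-function syntax
$indirect = call_user_func($foo, 'bar');       // callback API
$mapped   = array_map($foo, array('a', 'b'));  // anywhere a callback is accepted
```

Returning the string instead of echoing it keeps the sketch easy to check; the call styles are the same either way.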

--
troels

17 years ago by Marcus Boerger — view source

Hello Andi,

Thursday, June 19, 2008, 8:44:07 AM, you wrote:

See below:

-----Original Message-----
From: Christian Seiler [mailto:chris_se@gmx.net]
Sent: Wednesday, June 18, 2008 1:14 PM
To: php-dev List
Subject: Re: [PHP-DEV] [PATCH] [RFC] Closures and lambda functions in PHP

Frankly, I don't really see a problem with using references. It fits
into what's already there in PHP and it assures that closures have the
necessary properties to make them useful.

I think you are right that there isn't really a good alternative as the
"parent" scope does not necessarily exist anymore. Your solution is likely the best.

I thought we were speaking of PHP here? And all I remember is that PHP has
reference counting. So it doesn't matter whether we do reference or value
binding. We simply have to increase the internal reference counter - done.

Please check eval(). I assume it will bind to global scope but
let's just make sure what happens esp. when it's called from within
a method...

Hmm, closures inside eval() will bind variables to the scope in which
eval() was called. But closures defined inside eval will NOT be class
methods, even if eval() is called within a class.

But I do find that behaviour consistent with what PHP currently does
with normal functions and variables: If eval()'d or include()'d inside a
function, variables in the "global scope" of eval() or the included
file will actually be in the local function scope, whereas functions
defined inside will automatically become global functions.

Of course, this behaviour should be documented but I don't see a reason
to try and change it.

I agree. It behaves as I would expect; I just wanted to make sure you
verify that because I didn't have the opportunity to do so. You'd
actually have to work very hard for it not to behave in that way :) Let's
just make sure we have unit tests for both cases so we have a good
regression test for this one.

In PHP 5, object storage is resources done right. I don't think
we should be using the resource infrastructure for this
implementation and would prefer to use the object one. It's better.
I suggest to take a look at it.

Hmm, seems like a good idea. If nobody objects in the next few days,
I'll rewrite my patch to use objects instead of resources. What class
name do you suggest?

Great. I think Closure is probably a good name.
[Btw, if we want to get fancy we could even have a __toString() method
on those which would print out information about the Closure. But this is
not a must, just something which eventually could be nice for debugging purposes...]

PS: Somebody made me aware of a segfault in my code when destroying the
closure variable while still inside the closure. I'll fix that.

:)

Thanks,
Andi

Best regards,
Marcus

17 years ago by Kalle Sommer Nielsen — view source

Hey

I must say that the lexical keyword makes a lot more sense to me; it keeps
the syntax readable without making it too cryptic for inexperienced or new
PHP developers.

I think we could introduce both the lexical keyword and, as Andi proposed, a
$LEXICAL as a counterpart to global / $GLOBALS.

Both ways have potential, but I would personally go with the lexical keyword.

Regards, Kalle

17 years ago by Alexander Wagner — view source

First, a comment from haskell-land:
http://www.haskell.org/pipermail/haskell-cafe/2008-June/044533.html
http://www.haskell.org/pipermail/haskell-cafe/2008-June/thread.html#44379

Frankly, I don't really see a problem with using references. It fits
into what's already there in PHP and it assures that closures have the
necessary properties to make them useful.

References are necessary, but an easy way to obtain copies of variables from
the lexical context would be really nice.

I have been introduced to functional programming through Haskell, where values
are immutable, so a reference is basically the same as a copy. I like this
behaviour because it makes closures distinctly non-dangerous by default.

Getting the same behaviour out of PHP should not be as difficult as this:

for ($i = 0; $i < 10; $i++) {
    $loopIndex = $i;
    $arr[$i] = function () { lexical $loopIndex; return $loopIndex; };
    unset ($loopIndex);
}

This is not only quite a hassle (making beer much cheaper than water, so to
speak), I also believe it to be error-prone. A lot of programmers are going
to forget that unset().

I would prefer something like this:

for ($i = 0; $i < 10; $i++) {
    $arr[$i] = function () { lexical_copy $i; return $i; };
}

An alternative would be to let lexical behave like function parameters:

copies by default:
    lexical $x;
objects referenced by default:
    lexical $obj;
other references optional:
    lexical &$y;

Of course this would make lexical behave quite differently from global in this
regard, decreasing consistency, but f*ck global, nobody should use that
anyway. Better to have nice lexical closures.
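
To make the copy-versus-reference difference concrete, here is a small sketch. It is not written in the `lexical` / `lexical_copy` syntax discussed here; it assumes the `use` clause that eventually shipped in PHP 5.3, where `use ($i)` copies and `use (&$i)` binds a reference.

```php
<?php
// Sketch only: uses the `use` clause of released PHP 5.3+, not the
// `lexical` / `lexical_copy` keywords discussed in this thread.
$byValue = array();
$byRef   = array();

for ($i = 0; $i < 3; $i++) {
    // use ($i) copies the current value of $i into the closure.
    $byValue[$i] = function () use ($i) { return $i; };

    // use (&$i) binds a reference, so every closure sees the same $i.
    $byRef[$i] = function () use (&$i) { return $i; };
}

$copied = array();
$shared = array();
foreach (array(0, 1, 2) as $k) {
    $copied[] = $byValue[$k]();
    $shared[] = $byRef[$k]();
}
// $copied is array(0, 1, 2); $shared is array(3, 3, 3), because the loop
// left $i at 3 and all by-reference closures share that one variable.
```

With copying available as the short form, the $loopIndex / unset() workaround is not needed.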

Gesundheit
Wag

--
Be careful about reading health books. You may die of a misprint.

Mark Twain

17 years ago by Stanislav Malyshev — view source

Hi!

First, a comment from haskell-land:
http://www.haskell.org/pipermail/haskell-cafe/2008-June/044533.html
http://www.haskell.org/pipermail/haskell-cafe/2008-June/thread.html#44379

Thanks for the links, very interesting. Even a couple of comments in the
thread going beyond "PHP sucks" and really discussing the matter. :)
Best account is this:

A closure must only keep alive the variables it references, not the
whole pad on which they are allocated
[Check]
A closure must be able to call itself recursively (via a
higher-order function typically)
[Check, since you can use variable you assigned closure to inside the
closure, if I understand correctly]
Multiple references to the same body of code with different bindings
must be able to exist at the same time
[Check]
Closures must be nestable.
[Dunno - does the patch allow nesting and foo(1)(2)?]
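
The recursion, multiple-binding and nesting items of this checklist can be sketched as follows. This is only an illustration under the assumption of the `use` syntax released in PHP 5.3, not a test of the patch itself; $compose, $inc and $double are hypothetical names. Direct chaining in the style of foo(1)(2) only became valid syntax in PHP 7.0, so the sketch goes through intermediate variables.

```php
<?php
// Sketch only: written against the `use` clause of released PHP 5.3+,
// not the `lexical` keyword from the patch under discussion.

// Recursion: the closure captures, by reference, the variable it is
// assigned to, so it can call itself.
$factorial = function ($n) use (&$factorial) {
    return $n <= 1 ? 1 : $n * $factorial($n - 1);
};

// Nesting: three levels of closures, each capturing from its parent.
$compose = function ($f) {
    return function ($g) use ($f) {
        return function ($x) use ($f, $g) {
            return $f($g($x));
        };
    };
};

$inc    = function ($n) { return $n + 1; };
$double = function ($n) { return $n * 2; };

// Multiple bindings of the same code body exist at the same time.
$incAfter    = $compose($inc);
$doubleAfter = $compose($double);

$incAfterDouble = $incAfter($double);   // x -> inc(double(x))
$doubleAfterInc = $doubleAfter($inc);   // x -> double(inc(x))

$eleven = $incAfterDouble(5);
$twelve = $doubleAfterInc(5);
```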

Getting the same behaviour out of PHP should not be as difficult as this:

Well, I don't see any other way if you use references. Variables are
mutable in PHP. You could, of course, use copies, but then you'd lose
ability to update. Maybe if we drop "lexical" and use Dmitry's proposal of

$arr[$i] = function () ($i) { return $i; };

where ($i) would be copy, (&$i) would be by-ref, then it'd be easier to
control it. I know function()() is weird, but not everybody likes
lexical either :) Maybe we can do lexical &$y, but that looks weird too...

Of course this would make lexical behave quite differently from global in this

I wouldn't spend too much thought on making lexical work like global.
global is for different purpose (and with $GLOBALS is obsolete anyway :)

Stanislav Malyshev, Zend Software Architect
stas@zend.com http://www.zend.com/
(408)253-8829 MSN: stas@zend.com

17 years ago by Alexander Wagner — view source

A closure must be able to call itself recursively (via a
higher-order function typically)
[Check, since you can use variable you assigned closure to inside the
closure, if I understand correctly]

This is a matter of implementation rather than design, so it should be
resolved by testing rather than by reading the spec ;-)

Well, I don't see any other way if you use references. Variables are
mutable in PHP.

They are also copied by default (passed by value). So if lexical used copies
by default (and passed objects by reference), it would be consistent with all
of php except for global. Let global be the outcast and be consistent with
everything else. As long as references are easily available, I think this is
the much better trade-off. And it makes water slightly cheaper than beer.

I know function()() is weird

And would become weirder if foo(1)(2) is implemented. +1 to that by the way,
allowing dereferencing for methods ( $obj->method1()->method2(); ) but not
for functions is kinda mean.

Maybe function( ) [ ] { } instead of function( ) ( ) { }
That way the different parts actually look different. Also, confusion with
arrays should be pretty much impossible here, both for the parser and human
readers.

I prefer "lexical", though. Functional programming is not the default paradigm
in PHP, so rather err on the side of explicitness.

Gesundheit
Wag

--
Remember, growing older is mandatory. Growing up is optional.

17 years ago by Stanislav Malyshev — view source

Hi!

Hmm, seems like a good idea. If nobody objects in the next few days,
I'll rewrite my patch to use objects instead of resources. What class
name do you suggest?

While we are at it, maybe we could even have a special standard handler
(invoke?) that could also be used by objects created by reflection and
maybe later for some other purposes. I.e. if we do $foo($bar, $baz) and
$foo is an object and it defines invoke, then we call it (in which
case if $foo is Closure it does its thing); otherwise we get an error
"object $foo is not callable". Of course, this goes also for
is_callable, etc.
What do you think?
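
As a rough sketch of how such a handler interacts with is_callable(), the following assumes the behaviour of released PHP 5.3+, where Closure is a class and __invoke() is the handler name. Invokable and Plain are hypothetical classes; whether the patch at this stage behaved the same way is not something this sketch can confirm.

```php
<?php
// Sketch only: shows released PHP 5.3+ behaviour.
// Invokable and Plain are hypothetical classes for illustration.
class Invokable
{
    public function __invoke($a, $b)
    {
        return $a + $b;
    }
}

class Plain
{
}

$closure = function ($a, $b) { return $a * $b; };

// Objects are callable only if they are closures or define __invoke().
$checks = array(
    'closure'   => is_callable($closure),
    'invokable' => is_callable(new Invokable()),
    'plain'     => is_callable(new Plain()),
);

$isObject = $closure instanceof Closure;
$sum      = call_user_func(new Invokable(), 2, 3);
$product  = $closure(2, 3);
```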

Stanislav Malyshev, Zend Software Architect
stas@zend.com http://www.zend.com/
(408)253-8829 MSN: stas@zend.com

17 years ago by Lars Strojny — view source

Hi Stas,

Am Montag, den 23.06.2008, 10:56 -0700 schrieb Stanislav Malyshev:

What do you think?

I really love that idea. Real Functors¹ in PHP, great!

¹ http://en.wikipedia.org/wiki/Function_object

cu, Lars

17 years ago by Kalle Sommer Nielsen — view source

Hey

How are we going to deal with closures in class properties, like:

class PHP
{
    public $closure;

    public function __construct($callback)
    {
        $this->closure = $callback;
    }

    public function closure()
    {
        echo 'PHP::closure';
    }
}

$closure = function()
{
    echo 'PHP::$closure';
};

$test = new PHP($closure);

Now, when calling $test->closure(), what is going to happen? I would assume it
executes the method first, and if there isn't a method with that name it
will execute the $closure property?
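
For reference, a sketch of how released PHP (5.3 and later) ended up handling this: method-call syntax resolves to the method only and does not fall back to a property holding a closure. Holder is a hypothetical stand-in for the class above, and the sketch returns strings instead of echoing so the two paths can be compared.

```php
<?php
// Sketch only: behaviour of released PHP 5.3+; Holder is a hypothetical
// stand-in for the PHP class in the message above.
class Holder
{
    public $closure;

    public function __construct($callback)
    {
        $this->closure = $callback;
    }

    public function closure()
    {
        return 'Holder::closure';
    }
}

$test = new Holder(function () {
    return 'Holder::$closure';
});

// Method-call syntax always resolves to the method, never the property.
$fromMethod = $test->closure();

// The property has to be fetched first and then invoked.
$fromProperty = call_user_func($test->closure);
```

From PHP 7.0 on, `($test->closure)()` is an equivalent way to invoke the property.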

Another subject I would like to see, now that closures have been brought up
again, is adding type hinting in method/function prototypes:

function call(function $callback)
{
    $callback();
}

call(function(){ echo 'Hello'; });

We could reuse the function keyword in the prototypes, which saves us
from adding another keyword to the language.

Cheers,
Kalle

17 years ago by troels knak-nielsen — view source

Another subject I would like to see now the closures has been brought up
again is, how about adding type hinting in method/function prototypes:

function call(function $callback)
{
    $callback();
}

Good point. If we implement closures as objects, as already suggested,
then it's simply a matter of typehinting to the classname we pick.
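
A minimal sketch of that approach, assuming the class name Closure that was suggested earlier in the thread and that released PHP 5.3 adopted. The call() function mirrors the example above; the other variable names are illustrative.

```php
<?php
// Sketch only: type-hinting against the Closure class of released PHP 5.3+.
function call(Closure $callback)
{
    return $callback();
}

$greeting = call(function () { return 'Hello'; });

$lambda = function () {};
$name   = 'strlen';

// Only real closures satisfy the hint; a function name is just a string.
$lambdaIsClosure = $lambda instanceof Closure;
$nameIsClosure   = $name instanceof Closure;
```

PHP 5.4 later added the broader `callable` type hint, which also accepts function names and array callbacks.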

--
troels

17 years ago by Lukas Kahwe Smith — view source

Now, upon execution of the code containing the closure, the new opcode
just copies the zend_function structure into a copy, registers that copy
as a resource and returns that resource. As soon as the resource is
garbage collected (or explicitly unset), the op_array copy is destroyed.
No modification of the actual class is done at all - the cache remains
happy.

So since a reference is stored, it means that the destructor of the
enclosing object is only called once not only the variable holding the
object, but also all lambda functions that were created inside of the
class, have been freed?

regards,
Lukas Kahwe Smith
mls@pooteeweet.org

17 years ago by Gwynne Raskind — view source

Now, upon execution of the code containing the closure, the new opcode
just copies the zend_function structure into a copy, registers that copy
as a resource and returns that resource. As soon as the resource is
garbage collected (or explicitly unset), the op_array copy is destroyed.
No modification of the actual class is done at all - the cache remains
happy.

So since a reference is stored, it means that the destructor of the
enclosing object is only called once not only the variable holding
the object, but also all lambda functions that were created inside
of the class have been free'ed?

I'm not up to date on the operation of the current patches to
implement closures, but typically this is how it'd work, retaining
references to what's needed as long as the closures exist.
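
The lifetime question can be sketched like this. The example assumes released PHP 5.4+, where a closure created inside a method holds a reference to $this; it says nothing about how the patch under discussion behaved. Tracked and its static $log are hypothetical names.

```php
<?php
// Sketch only: behaviour of released PHP 5.4+. Tracked and its static
// $log property are hypothetical names used to record the event order.
class Tracked
{
    public static $log = array();

    public function makeCallback()
    {
        // A non-static closure created in a method keeps $this alive.
        return function () { return get_class($this); };
    }

    public function __destruct()
    {
        self::$log[] = 'destructed';
    }
}

$object   = new Tracked();
$callback = $object->makeCallback();

unset($object);     // the closure still references the object
Tracked::$log[] = 'object variable unset';

unset($callback);   // last reference gone, so the destructor runs here
Tracked::$log[] = 'closure unset';
```

The destructor entry lands between the two markers, i.e. the object outlives its own variable for as long as the closure exists.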

-- Gwynne, Daughter of the Code
"This whole world is an asylum for the incurable."

17 years ago by Marcus Boerger — view source

Hello Andi,

Wednesday, June 18, 2008, 8:01:34 AM, you wrote:

Hi Christian,

This is a very nice piece of work. Definitely addresses a lot of the issues we have raised in the past.
I would like to see such a solution make its way into PHP (see below re: timing).

There are some things I'd like to consider:

I am not sure that the current semantics of the "lexical" keyword is
great in all cases. Is the reason why you don't allow by-value binding so
that we don't have to manage more than one lambda instance per declaration?

[minor curiosity - do we want to consider reusing "parent" instead of
"lexical"? I guess that could be confusing but it's not the first time we
reuse a keyword when it's clear that the usage is in two different places
(this is minor and I don't mind much either way although lexical doesn't mean too much to me).]

This is a really good idea. A keyword at a place where you define scoping
already (global, static) is a kind of straightforward solution and easy
enough to learn, understand and read. The originally proposed keyword,
however, is pretty much misleading. And comparing the keyword solution with
other solutions like Dmitry's '|', the keyword clearly wins because you can
see it. The difference between '|' and ',' on the other hand is far too
small to spot when reading code and thus leads to confusing, unmaintainable
code.

I am concerned about binding to classes. First of all we need to look
into more detail what the implications are for bytecode caches when
changing class entries at run-time. We may want to also consider an
option where the lambda binds to the object and only has public access
although I realize that may be considered by some as too limiting. We'll
review these two things in the coming days.

Re: timing, I think the biggest issue we have right now with PHP 5.3 is
that we are not making a clear cut on features. There's always pressure
on release managers to include more (I went through the same with 5.0)
but at some point you just have to stop at some place or things will
never go out as there are always good ideas flowing in. Unfortunately
with 5.3 that cut isn't happening and it seems to drag out longer than
needed. I prefer having this discussion in the context of a hard date for
a beta release after which we'll be especially strict with accepting new
features. Each new feature will drag out the beta/RC cycle as they need
enough time for testing/feedback/tweaks.

Andi

-----Original Message-----
From: Christian Seiler [mailto:chris_se@gmx.net]
Sent: Monday, June 16, 2008 10:39 AM
To: php-dev List
Subject: [PHP-DEV] [PATCH] [RFC] Closures and lambda functions in PHP

Hi,

As a followup to the discussion in January, I'd like to post a revised patch to
this list that implements closures and anonymous functions in PHP.

INTRODUCTION

Closures and lambda functions can make programming much easier in
several ways:

Lambda functions allow the quick definition of throw-away functions
that are not used elsewhere. Imagine for example a piece of code that
needs to call preg_replace_callback(). Currently, there are three
possibilities to achieve this:

a. Define the callback function elsewhere. This distributes code that
   belongs together throughout the file and decreases readability.

b. Define the callback function in-place (but with a name). In that case
   one has to use function_exists() to make sure the function is only
   defined once. Example code:

     <?php
        function replace_spaces ($text) {
          if (!function_exists ('replace_spaces_helper')) {
            function replace_spaces_helper ($matches) {
              return str_replace (' ', '&nbsp;', $matches[1]).' ';
            }
          }
          return preg_replace_callback ('/( +) /', 'replace_spaces_helper',
                                        $text);
        }
     ?>

   Here, the additional if() around the function definition makes the
   source code difficult to read.

c. Use the present create_function() in order to create a function at
   runtime. This approach has several disadvantages: First of all, syntax
   highlighting does not work because a string is passed to the function.
   It also compiles the function at run time and not at compile time so
   opcode caches can't cache the function.

Closures provide a very useful tool in order to make lambda functions even
more useful. Just imagine you want to replace 'hello' with 'goodbye' in
all elements of an array. PHP provides the array_map() function which
accepts a callback. If you don't want to hard-code 'hello' and 'goodbye'
into your source code, you have only four choices:

a. Use create_function(). But then you may only pass literal values
   (strings, integers, floats) into the function, objects at best as
   clones (if var_export() allows for it) and resources not at all. And
   you have to worry about escaping everything correctly. Especially when
   handling user input this can lead to all sorts of security issues.

b. Write a function that uses global variables. This is ugly,
   non-reentrant and bad style.

c. Create an entire class, instantiate it and pass the member function
   as a callback. This is perhaps the cleanest solution for this problem
   with current PHP but just think about it: Creating an entire class for
   this extremely simple purpose and nothing else seems overkill.

d. Don't use array_map() but simply do it manually (foreach). In this
   simple case it may not be that much of an issue (because one simply
   wants to iterate over an array) but there are cases where doing
   something manually that a function with a callback as parameter does
   for you is quite tedious.

[Yes, I know that str_replace also accepts arrays as a third parameter so
this example may be a bit useless. But imagine you want to do a more
complex operation than simple search and replace.]
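
To show what a fifth choice could look like, here is a sketch of the 'hello' to 'goodbye' scenario solved with a closure. It assumes the `use` clause that shipped in PHP 5.3 rather than the syntax of this patch, and replace_everywhere() is a hypothetical helper name.

```php
<?php
// Sketch only: the array_map() scenario written with the `use` clause
// of released PHP 5.3+. replace_everywhere() is a hypothetical helper.
function replace_everywhere($search, $replacement, array $array)
{
    // $search and $replacement are captured from the enclosing scope,
    // so nothing has to be hard-coded, escaped or made global.
    $map = function ($text) use ($search, $replacement) {
        return str_replace($search, $replacement, $text);
    };

    return array_map($map, $array);
}

$result = replace_everywhere('hello', 'goodbye', array('hello world', 'oh hello'));
```

Unlike the create_function() route, the captured values are ordinary variables, so objects and resources could be passed the same way.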

PROPOSED PATCH

I now propose a patch that implements compile-time lambda functions and
closures for PHP while keeping the patch as simple as possible. The patch is
based on a previous patch of mine which was based on ideas discussed here
end of December / start of January.

Userland perspective

The patch adds the following syntax as a valid expression:

function & (parameters) { body }

(The & is optional and indicates - just as with normal functions - that the
anonymous function returns a reference instead of a value)

Example usage:

$lambda = function () { echo "Hello World!\n"; };

The variable $lambda then contains a callable resource that may be called
through different means:

$lambda ();
call_user_func ($lambda);
call_user_func_array ($lambda, array ());
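
As a quick check of the three call styles, here is a sketch that gives the lambda a return value and an optional parameter so the results can be compared. It assumes released PHP 5.3+, where the value is a Closure object rather than the resource described in this draft.

```php
<?php
// Sketch only: in released PHP (5.3+) $lambda is a Closure object, not a
// resource as in this draft; the call styles are the same.
$lambda = function ($name = 'World') { return "Hello $name!"; };

$a = $lambda();                                      // direct call
$b = call_user_func($lambda);                        // callback API
$c = call_user_func_array($lambda, array('PHP'));    // with an argument list
```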

This allows for simple lambda functions, for example:

function replace_spaces ($text) {
  $replacement = function ($matches) {
    return str_replace (' ', '&nbsp;', $matches[1]).' ';
  };
  return preg_replace_callback ('/( +) /', $replacement, $text);
}

The patch implements closures by defining an additional keyword 'lexical'
that allows a lambda function (and only a lambda function) to import
a variable from the "parent scope" to the lambda function scope. Example:

function replace_in_array ($search, $replacement, $array) {
  $map = function ($text) {
    lexical $search, $replacement;
    if (strpos ($text, $search) > 50) {
      return str_replace ($search, $replacement, $text);
    } else {
      return $text;
    }
  };
  return array_map ($map, $array);
}

The variables $search and $replacement are variables in the scope of the
function replace_in_array() and the lexical keyword imports these variables
into the scope of the closure. The variables are imported as a reference,
so any change in the closure will result in a change in the variable of the
function itself.
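
The effect of importing by reference can be sketched as follows. The example assumes the released PHP 5.3 spelling `use (&$var)` as a stand-in for what this draft calls `lexical $var`; count_matches() is a hypothetical function.

```php
<?php
// Sketch only: by-reference capture written as use (&$var), the released
// PHP 5.3+ spelling; count_matches() is a hypothetical function.
function count_matches($needle, array $haystack)
{
    $count = 0;

    // $count is imported by reference, so the increment inside the
    // closure changes the variable of the enclosing function.
    $visit = function ($item) use ($needle, &$count) {
        if ($item === $needle) {
            $count++;
        }
    };

    array_map($visit, $haystack);

    return $count;
}

$hits = count_matches('a', array('a', 'b', 'a'));
```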

If a closure is defined inside an object, the closure has full access
to the current object through $this (without the need to use 'lexical' to
import it separately) and all private and protected methods of that class.
This also applies to nested closures. Essentially, closures inside methods are
added as public methods to the class that contains the original method.

Closures may live longer than the methods that declared them. It is perfectly
possible to have something like this:

function getAdder($x) {
  return function ($y) {
    lexical $x;
    return $x + $y;
  };
}
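
A runnable variant of getAdder(), assuming the `use` clause of released PHP 5.3+ in place of the draft's `lexical` keyword:

```php
<?php
// Sketch only: getAdder() from the text, rewritten with the `use` clause
// of released PHP 5.3+ instead of the draft's `lexical` keyword.
function getAdder($x)
{
    return function ($y) use ($x) {
        return $x + $y;
    };
}

// getAdder() has already returned, yet each closure still sees its own $x.
$addFive = getAdder(5);
$addTen  = getAdder(10);

$eight  = $addFive(3);
$eleven = $addTen(1);
```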

Zend internal perspective

The patch basically changes the following in the Zend engine:

When the compiler reaches a lambda function, it creates a unique name for that
function ("\0__compiled_lambda_FILENAME_N" where FILENAME is the name of the
file currently processed and N is a per-file counter). The use of the filename
in the function name ensures compatibility with opcode caches. The lambda
function is then immediately added to the function table (either the global
function table or that of the current class if declared inside a class method).
Instead of a normal ZEND_DECLARE_FUNCTION opcode the new
ZEND_DECLARE_LAMBDA_FUNC is used as an opcode at this point. The op_array
of the new function is initialized with is_lambda = 1 and is_closure = 0.

When parsing a 'lexical' declaration inside an anonymous function the parser
saves the name of the variable that is to be imported in an array stored
as a member of the op_array structure (lexical_names).

The opcode handler for ZEND_DECLARE_LAMBDA_FUNC does the following: First of
all it creates a new op_array and copies the entire memory structure of the
lambda function into it (the opcodes themselves are not copied since they
are only referenced in the op_array structure). Then it sets is_closure = 1
on the new op_array, and for each lexical variable name that the compiler
added to the original op_array it creates a reference to that variable from
the current scope into a HashTable member in the new op_array. It also saves
the current object pointer ($this) as a member of the op_array in order to
allow for the closure to access $this. Finally it registers the new op_array
as a resource and returns that resource.

The opcode handler of the 'lexical' construct simply fetches the variable
from that HashTable and imports it into local scope of the inner function
(just like with 'global' only with a different hash table).

Some hooks were added that allow the 'lambda function' resource to be called.
Also, there are several checks in place that make sure the lambda function
is not called directly, i.e. if someone explicitly tries to use the internal
function name instead of using the resource return value of the declaration.

The patch

The patch is available here:
http://www.christian-seiler.de/temp/closures-php-5.3-2008-06-16-1.diff

Please note that I did NOT include the contents of zend_language_scanner.c
in the patch since that can easily be regenerated and just takes up enormous
amounts of space.

The patch itself applies against the 5.3 branch of PHP.

If I understand the discussion regarding PHP6 on this list correctly, some
people are currently undergoing the task of removing the unicode_semantics
switch and if (UG(unicode)). As soon as this task is finished I will also
provide a patch for CVS HEAD (it doesn't make much sense adopting the patch
now and then having to change it again completely afterwards).

BC BREAKS

Introduction of a new keyword 'lexical'. Since it is very improbable that
someone should use it as a function, method, class or property name, I
think this is an acceptable break.

Other than that, I can find no BC breaks of my patch.

CAVEATS / POSSIBLE WTFS

On writing $func = function () { }; there is a semicolon necessary. If left
out it will produce a compile error. Since any attempt to remove that
necessity would unnecessarily bloat the grammar, I suggest we simply keep
it the way it is. Also, Lukas Kahwe Smith pointed out that a single
trailing semicolon after a closing brace already exists: do { } while ();

The fact that 'lexical' creates references may cause certain WTFs:

for ($i = 0; $i < 10; $i++) {
  $arr[$i] = function () { lexical $i; return $i; };
}

This will not work as expected since $i is a reference and thus all
created closures would reference the same variable. In order to get this
right one has to do:

for ($i = 0; $i < 10; $i++) {
  $loopIndex = $i;
  $arr[$i] = function () { lexical $loopIndex; return $loopIndex; };
  unset ($loopIndex);
}

This can be a WTF for people that don't expect lexical to create an
actual reference, especially since other languages such as JavaScript
don't do it. On the other hand, global and static both DO create
references so that behaviour is consistent with current PHP.

But complex constructions such as this will probably not be used by
beginners so maintaining a good documentation should solve this.

The fact that 'lexical' is needed at all may cause WTFs. Other languages
such as JavaScript implicitly have the entire scope visible to child
functions. But since PHP does the same thing with global variables, I
find a keyword like 'lexical' much more consistent than importing the
entire scope (and always importing the entire scope costs unnecessary
performance).

FINAL THOUGHTS

My now proposed patch addresses the two main problems of my previous patch:
Support for closures in objects (with access to $this) and opcode caches. My
patch applies against PHP_5_3 and does not break any tests. It adds a valuable
new language feature which I'd like to see in PHP.

Regards,
Christian

--

Best regards,
Marcus

17 years ago by Dmitry Stogov — view source

Hi Christian,

I took a look into your patch and found it too difficult.
So I implemented another patch (attached) which is based on your ideas.

From the user's point of view it does exactly the same, except for the
definition of "lexical" variables. I don't use any new reserved word because
every new reserved word is going to break some user code. I use a special
syntax for the lambda function definition instead, which looks much clearer
to me. The following code creates a lambda function with arguments $x,
$y and lexical variables $a, $b, $c.

$a = function($x, $y | $a, $b, $c) {};

The patch shouldn't affect opcode caches and other extensions as it
doesn't change any structures. It uses the op_array->static_variables
for lexical variables.

The patch also fixes several small issues and adds some missing
functionality, the lack of which didn't allow preg_replace_callback() (and
maybe others) to work with lambda functions. Now the following example works
fine.

<?php
class X {
  private function foo($x) {
    echo $x;
  }
  function bar($s) {
    return function ($x | $s) {
      static $n = 0;
      $n++;
      $s = $n.':'.$s;
      $this->foo($x[0].':'.$s);
    };
  }
}

$x = new X;
$x = $x->bar("bye\n");
$s = 'abc';
preg_replace_callback('/[abc]/', $x, $s);
?>

It prints:

a:1:bye
b:2:1:bye
c:3:2:1:bye
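
For comparison, here is a sketch of the same example in the syntax of released PHP (5.4+ for $this inside closures). It assumes `use (&$s)` is the closest equivalent of `| $s` here, since the printed output implies that the lexical variable keeps its modified value between calls. The output is captured in a buffer only so it can be inspected, and the callback returns an empty string to give preg_replace_callback() a defined replacement.

```php
<?php
// Sketch only: the example above translated to released PHP (5.4+).
// `use (&$s)` is an assumed stand-in for the `| $s` syntax of this patch.
class X
{
    private function foo($x)
    {
        echo $x;
    }

    public function bar($s)
    {
        return function ($x) use (&$s) {
            static $n = 0;
            $n++;
            $s = $n . ':' . $s;               // persists across calls
            $this->foo($x[0] . ':' . $s);     // private method is reachable
            return '';
        };
    }
}

$x = new X;
$callback = $x->bar("bye\n");

ob_start();
preg_replace_callback('/[abc]/', $callback, 'abc');
$output = ob_get_clean();
// $output holds the same three lines as shown above.
```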

Of course the patch doesn't break any existing tests.

Please review.

Thanks. Dmitry.

Christian Seiler wrote:

Hi,

As a followup to the discussion in January, I'd like to post a revised patch to
this list that implements closures and anonymous functions in PHP.

INTRODUCTION

Closures and lambda functions can make programming much easier in
several ways:

Lambda functions allow the quick definition of throw-away functions
that are not used elsewhere. Imagine for example a piece of code that
needs to call preg_replace_callback(). Currently, there are three
possibilities to achieve this:

a. Define the callback function elsewhere. This distributes code that
   belongs together throughout the file and decreases readability.

b. Define the callback function in-place (but with a name). In that case
   one has to use function_exists() to make sure the function is only
   defined once. Example code:

     <?php
        function replace_spaces ($text) {
          if (!function_exists ('replace_spaces_helper')) {
            function replace_spaces_helper ($matches) {
              return str_replace (' ', '&nbsp;', $matches[1]).' ';
            }
          }
          return preg_replace_callback ('/( +) /', 'replace_spaces_helper',
                                        $text);
        }
     ?>

   Here, the additional if() around the function definition makes the
   source code difficult to read.

c. Use the present create_function() in order to create a function at
   runtime. This approach has several disadvantages: First of all, syntax
   highlighting does not work because a string is passed to the function.
   It also compiles the function at run time and not at compile time so
   opcode caches can't cache the function.

Closures provide a very useful tool in order to make lambda functions even
more useful. Just imagine you want to replace 'hello' with 'goodbye' in
all elements of an array. PHP provides the array_map() function which
accepts a callback. If you don't want to hard-code 'hello' and 'goodbye'
into your source code, you have only four choices:

a. Use create_function(). But then you may only pass literal values
   (strings, integers, floats) into the function, objects at best as
   clones (if var_export() allows for it) and resources not at all. And
   you have to worry about escaping everything correctly. Especially when
   handling user input this can lead to all sorts of security issues.

b. Write a function that uses global variables. This is ugly,
   non-reentrant and bad style.

c. Create an entire class, instantiate it and pass the member function
   as a callback. This is perhaps the cleanest solution for this problem
   with current PHP but just think about it: Creating an entire class for
   this extremely simple purpose and nothing else seems overkill.

d. Don't use array_map() but simply do it manually (foreach). In this
   simple case it may not be that much of an issue (because one simply
   wants to iterate over an array) but there are cases where doing
   something manually that a function with a callback as parameter does
   for you is quite tedious.

[Yes, I know that str_replace also accepts arrays as a third parameter so
this example may be a bit useless. But imagine you want to do a more
complex operation than simple search and replace.]

PROPOSED PATCH

I now propose a patch that implements compile-time lambda functions and
closures for PHP while keeping the patch as simple as possible. The patch is
based on a previous patch of mine which was based on ideas discussed here
end of December / start of January.

Userland perspective

The patch adds the following syntax as a valid expression:

function & (parameters) { body }

(The & is optional and indicates - just as with normal functions - that the
anonymous function returns a reference instead of a value)

Example usage:

$lambda = function () { echo "Hello World!\n"; };

The variable $lambda then contains a callable resource that may be called
through different means:

$lambda ();
call_user_func ($lambda);
call_user_func_array ($lambda, array ());

This allows for simple lambda functions, for example:

function replace_spaces ($text) {
    $replacement = function ($matches) {
        // str_replace() takes (search, replace, subject)
        return str_replace (' ', '&nbsp;', $matches[1]).' ';
    };
    return preg_replace_callback ('/( +) /', $replacement, $text);
}

The patch implements closures by defining an additional keyword 'lexical'
that allows a lambda function (and only a lambda function) to import
a variable from the "parent scope" to the lambda function scope. Example:

function replace_in_array ($search, $replacement, $array) {
    $map = function ($text) {
        lexical $search, $replacement;
        if (strpos ($text, $search) > 50) {
            return str_replace ($search, $replacement, $text);
        } else {
            return $text;
        }
    };
    return array_map ($map, $array);
}

The variables $search and $replacement are variables in the scope of the
function replace_in_array() and the lexical keyword imports these variables
into the scope of the closure. The variables are imported as a reference,
so any change in the closure will result in a change in the variable of the
function itself.
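
The by-reference behaviour can be sketched on a released PHP as follows. Note the assumption: the 'lexical' keyword of this patch does not exist in released PHP, so the sketch uses the use (&$var) form, which gives the same reference semantics. The function name count_calls() is made up for the example.

```php
<?php
function count_calls()
{
    $count = 0;

    // &$count imports the variable by reference, which is what
    // 'lexical' does in the patch described here.
    $increment = function () use (&$count) {
        $count++;
    };

    $increment();
    $increment();

    // The changes made inside the closure are visible out here.
    return $count;
}

$calls = count_calls(); // 2
```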

If a closure is defined inside an object, the closure has full access
to the current object through $this (without the need to use 'lexical' to
import it separately) and all private and protected methods of that class.
This also applies to nested closures. Essentially, closures inside methods
are added as public methods to the class that contains the original method.

Closures may live longer than the methods that declared them. It is
perfectly possible to have something like this:

function getAdder($x) {
    return function ($y) {
        lexical $x;
        return $x + $y;
    };
}
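
A runnable sketch of the same idea, assuming the use ($x) syntax of released PHP in place of the 'lexical' statement proposed here. The variable names $addFive and $sum are illustrative.

```php
<?php
function getAdder($x)
{
    // $x is captured when the closure is created and stays
    // available after getAdder() has returned.
    return function ($y) use ($x) {
        return $x + $y;
    };
}

$addFive = getAdder(5);
$sum = $addFive(3); // 8
```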

Zend internal perspective

The patch basically changes the following in the Zend engine:

When the compiler reaches a lambda function, it creates a unique name for
that function ("\0__compiled_lambda_FILENAME_N" where FILENAME is the name
of the file currently processed and N is a per-file counter). The use of the
filename in the function name ensures compatibility with opcode caches. The
lambda function is then immediately added to the function table (either the
global function table or that of the current class if declared inside a
class method). Instead of a normal ZEND_DECLARE_FUNCTION opcode the new
ZEND_DECLARE_LAMBDA_FUNC is used as an opcode at this point. The op_array
of the new function is initialized with is_lambda = 1 and is_closure = 0.

When parsing a 'lexical' declaration inside an anonymous function the parser
saves the name of the variable that is to be imported in an array stored
as a member of the op_array structure (lexical_names).

The opcode handler for ZEND_DECLARE_LAMBDA_FUNC does the following: First of
all it creates a new op_array and copies the entire memory structure of the
lambda function into it (the opcodes themselves are not copied since they
are only referenced in the op_array structure). Then it sets is_closure = 1
on the new op_array, and for each lexical variable name that the compiler
added to the original op_array it creates a reference to that variable from
the current scope into a HashTable member in the new op_array. It also saves
the current object pointer ($this) as a member of the op_array in order to
allow for the closure to access $this. Finally it registers the new op_array
as a resource and returns that resource.

The opcode handler of the 'lexical' construct simply fetches the variable
from that HashTable and imports it into local scope of the inner function
(just like with 'global' only with a different hash table).

Some hooks were added that allow the 'lambda function' resource to be called.
Also, there are several checks in place that make sure the lambda function
is not called directly, i.e. if someone explicitly tries to use the internal
function name instead of using the resource return value of the declaration.

The patch

The patch is available here:
http://www.christian-seiler.de/temp/closures-php-5.3-2008-06-16-1.diff

Please note that I did NOT include the contents of zend_language_scanner.c
in the patch since that can easily be regenerated and just takes up enormous
amounts of space.

The patch itself applies against the 5.3 branch of PHP.

If I understand the discussion regarding PHP6 on this list correctly, some
people are currently undergoing the task of removing the unicode_semantics
switch and if (UG(unicode)). As soon as this task is finished I will also
provide a patch for CVS HEAD (it doesn't make much sense adapting the patch
now and then having to change it again completely afterwards).

BC BREAKS

Introduction of a new keyword 'lexical'. Since it is very improbable that
someone should use it as a function, method, class or property name, I
think this is an acceptable break.

Other than that, I can find no BC breaks in my patch.

CAVEATS / POSSIBLE WTFS

On writing $func = function () { }; there is a semicolon necessary. If left
out it will produce a compile error. Since any attempt to remove that
necessity would unnecessarily bloat the grammar, I suggest we simply keep
it the way it is. Also, Lukas Kahwe Smith pointed out that a single
trailing semicolon after a closing brace already exists: do { } while ();

The fact that 'lexical' creates references may cause certain WTFs:

for ($i = 0; $i < 10; $i++) {
    $arr[$i] = function () { lexical $i; return $i; };
}

This will not work as expected since $i is a reference and thus all
created closures would reference the same variable. In order to get this
right one has to do:

for ($i = 0; $i < 10; $i++) {
    $loopIndex = $i;
    $arr[$i] = function () { lexical $loopIndex; return $loopIndex; };
    unset ($loopIndex);
}

This can be a WTF for people that don't expect lexical to create an
actual reference, especially since other languages such as JavaScript
don't do it. On the other hand, global and static both DO create
references so that behaviour is consistent with current PHP.

But complex constructions such as this will probably not be used by
beginners so maintaining a good documentation should solve this.
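
The loop caveat can be reproduced on a released PHP with the following sketch. It assumes the use (...) syntax instead of 'lexical': use (&$i) corresponds to the reference import described above, while use ($i) copies the value at the moment the closure is created. The array names are illustrative.

```php
<?php
$byRef = array();
$byVal = array();

for ($i = 0; $i < 3; $i++) {
    // Reference import: every closure shares the one loop variable.
    $byRef[$i] = function () use (&$i) { return $i; };
    // Value import: each closure gets its own snapshot of $i.
    $byVal[$i] = function () use ($i) { return $i; };
}

// The loop has finished, so the shared $i now holds 3.
$refResults = array($byRef[0](), $byRef[1](), $byRef[2]()); // array(3, 3, 3)
$valResults = array($byVal[0](), $byVal[1](), $byVal[2]()); // array(0, 1, 2)
```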

The fact that 'lexical' is needed at all may cause WTFs. Other languages
such as JavaScript implicitly have the entire scope visible to child
functions. But since PHP does the same thing with global variables, I
find a keyword like 'lexical' much more consistent than importing the
entire scope (and always importing the entire scope costs unnecessary
performance).

FINAL THOUGHTS

My now proposed patch addresses the two main problems of my previous patch:
Support for closures in objects (with access to $this) and opcode caches. My
patch applies against PHP_5_3 and does not break any tests. It adds a valuable
new language feature which I'd like to see in PHP.

Regards,
Christian

17 years ago by Federico Lebron

Hi Dmitry,

As a lowly userspace developer, the | syntax is a bit confusing. If I
see $x, $y | $a, $b, $c, my brain parses it as ($x, ($y | $a), $b, $c),
since , has lower precedence than |. I'd think "syntax error", then
"bitwise OR", but never "this refers to the variables I want imported
into the closure".

Also, I'd like "lexical" a bit more for the same reasons discussed in
the short array syntax ([1,2]) topic: a user faced with function($x, $y
| $a, $b, $c) has nowhere to search for what | means.

I do, however, see the benefit of not changing the scanner and not
breaking opcode caches. Would reusing "parent" be too much of a wtf?

Having little idea of how the internals work, would it be too
complicated to hook "->" so if you say $obj->var(), and var holds a
lambda function, for that function to be called instead of throwing a
syntax error?
I know it seems hackish to add methods at runtime, but this would be to
runkit's method addition what lambdas are to create_function.
IMO it would seem a bit more logical, if $obj->f = function(){echo
"foo";};, to be able to do $obj->f() instead of $f = $obj->f; $f();, and
knowing that $f() won't have access to $this (or at least, I wouldn't
suppose it would in the second case).
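
To make the question concrete, here is a sketch of how a closure stored in a property can be called in released PHP. It assumes PHP 7.0 or later (for the parenthesised call and for the Error exception) and uses a plain stdClass object; the variable names are illustrative.

```php
<?php
$obj = new stdClass();
$obj->f = function () { return "foo"; };

// $obj->f() looks for a *method* named f, not a property, so it fails.
$methodCallFailed = false;
try {
    $obj->f();
} catch (Error $e) {
    $methodCallFailed = true;
}

// Ways to call the closure held in the property:
$viaParens = ($obj->f)();              // parenthesised property fetch
$viaHelper = call_user_func($obj->f);  // callback API
$f = $obj->f;
$viaTemp = $f();                       // temporary variable, as in the mail above
```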

I also agree that shipping it with 5.3 would be a bit too rushed, since
this, like any other feature, needs to be debugged thoroughly if it's
going into production (and going to change the API). 5.4 and 6.0 don't
seem so bad, though.

Federico Lebron

Dmitry Stogov wrote:

Hi Christian,

I took a look into your patch and found it too difficult.
So I implemented another patch (attached) which is based on your ideas.

From the user's point of view it does exactly the same except for the
definition of "lexical" variables. I don't use any new reserved word because
every new reserved word is going to break some user code. I use a special
syntax for lambda function definition instead, which looks much clearer
to me. The following code creates a lambda function with arguments $x,
$y and lexical variables $a, $b, $c.

$a = function($x, $y | $a, $b, $c) {};

The patch shouldn't affect opcode caches and other extensions as it
doesn't change any structures. It uses the op_array->static_variables
for lexical variables.

The patch also fixes several small issues and adds some missing
functionality whose absence prevented preg_replace_callback() (and maybe
others) from working with lambda functions. Now the following example works fine.

<?php
class X {
    private function foo($x) {
        echo $x;
    }
    function bar($s) {
        return function ($x | $s) {
            static $n = 0;
            $n++;
            $s = $n.':'.$s;
            $this->foo($x[0].':'.$s);
        };
    }
}

$x = new X;
$x = $x->bar("bye\n");
$s = 'abc';
preg_replace_callback('/[abc]/', $x, $s);
?>

It prints:

a:1:bye
b:2:1:bye
c:3:2:1:bye
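
For readers who want to run this, the following sketch expresses the same example in the syntax of released PHP (5.4 or later, for $this inside closures). Two things are assumptions rather than part of the patch: the | separator is replaced by use (&$s), with the reference chosen so that $s keeps accumulating between calls as the output above implies, and the callback returns an empty string so that preg_replace_callback() gets a string back. The output is captured in $output instead of being printed.

```php
<?php
class X
{
    private function foo($x)
    {
        echo $x;
    }

    public function bar($s)
    {
        // &$s: one shared $s that survives between calls.
        return function ($x) use (&$s) {
            static $n = 0;   // per-closure call counter
            $n++;
            $s = $n . ':' . $s;
            $this->foo($x[0] . ':' . $s);  // private method reachable via $this
            return '';
        };
    }
}

$x = new X;
$callback = $x->bar("bye\n");

ob_start();
preg_replace_callback('/[abc]/', $callback, 'abc');
$output = ob_get_clean();
// $output is "a:1:bye\nb:2:1:bye\nc:3:2:1:bye\n"
```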

Of course the patch doesn't break any existing tests.

Please review.

Thanks. Dmitry.

17 years ago by Dmitry Stogov

I don't like the "lexical" keyword, because it can be used anywhere in a
function (e.g. inside an "if" or loop statement), whereas lexical variables
must be part of the lambda function definition.

We can think about some better syntax, like

function ($x, $y) ($a, $b, $c) {};
function ($x, $y) [$a, $b, $c] {};

I like the "|" separator more, but the syntax of the definition is not so
important for me. It just must be clean, and the "lexical" keyword
doesn't provide a clean definition.

I don't like the idea of adding methods at runtime, as it can break shared
data structures in a multi-threaded environment.

Thanks. Dmitry.

17 years ago by troels knak-nielsen

I don't like "lexical" keyword, because it can be used anywhere in function
(e.g. inside "if" or loop statement), however lexical variables must be the

That does sound wtf-y, indeed. Is that allowed with the global
keyword? Even if it is, I think it would be a sane limitation to put
on lexical, that it must come at the beginning of a function body
(Perhaps allowing global and static to precede it).

--
troels

17 years ago by Rodrigo Saboya

Dmitry Stogov wrote:

I don't like "lexical" keyword, because it can be used anywhere in
function (e.g. inside "if" or loop statement), however lexical variables
must be the part of lambda function definition.

I agree with Dmitry: Lexical variables belong to lambda function
definition. It makes more sense to me. But I also agree that the
proposed syntax might be a little misleading.

We can think about some better syntax, like

function ($x, $y) ($a, $b, $c) {};

This looks better

function ($x, $y) [$a, $b, $c] {};

Array confusion is to be expected with this syntax, I don't like it.

FWIW, I'd like to see this on 5.3.

regards
Rodrigo Saboya

17 years ago by lenar@city.ee

Hi,

Rodrigo Saboya wrote:

function ($x, $y) ($a, $b, $c) {};

This looks better

function ($x, $y) [$a, $b, $c] {};

I think this looks even better:

function ($x, $y) use ($a, $b, &$c) {};

(one could use this syntax even for traditional functions to
use variable copies/references from global scope - just an idea).

my 2,
L.

17 years ago by Larry Garfield

Hi,

Rodrigo Saboya wrote:

function ($x, $y) ($a, $b, $c) {};

This looks better

function ($x, $y) [$a, $b, $c] {};

I think this looks even better:

function ($x, $y) use ($a, $b, &$c) {};

(one could use this syntax even for traditional functions to
use variable copies/references from global scope - just an idea).

my 2,
L.

I am not sure if "use" is the clearest word to use there (wouldn't lexical
there make more sense?), but I think the latter is a good trade-off. It
makes it explicit whether you're using by-ref or by-val passing semantics,
and the semantics and syntax are the same as for function parameters so
there's a very low wtf factor. I still am not sure if re-using
the "function" keyword is going to cause confusion, though, especially if
what is being implemented becomes (as it seems like it may) effectively an
alternate object syntax.
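
The by-value versus by-reference distinction in the suggested use (...) list can be sketched like this. The example assumes the parenthesised use syntax as it exists in released PHP; variable names are illustrative.

```php
<?php
$s = 'first';

// Without &, the closure keeps a copy of $s taken at definition time.
$byValue = function () use ($s) { return $s; };

// With &, the closure shares the variable with the defining scope.
$byRef = function () use (&$s) { return $s; };

$s = 'second';

$seenByValue = $byValue(); // 'first'
$seenByRef   = $byRef();   // 'second'
```

This mirrors how & already works in parameter lists, which is the low-WTF argument made above.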

As one of the Haskell list denizens commented, is there a potential for memory
leakage if lambdas implicitly import $this when defined within an object
method? Javascript makes it very easy to create memory leaks via closures if
you're not very careful; I would be fine with requiring an explicit
declaration of $this if it helped avoid memory leaks.

(Even if not many people will use closures at first, I anticipate that they
will become more widely used over time by which point arguments such as "they
won't be used often enough for the memory issue to matter" will be false but
it will be too late to fix. I don't think anyone has made that argument yet,
but I'm trying to head it off before someone does. <g>)

--
Larry Garfield AIM: LOLG42
larry@garfieldtech.com ICQ: 6817012

"If nature has made any one thing less susceptible than all others of
exclusive property, it is the action of the thinking power called an idea,
which an individual may exclusively possess as long as he keeps it to
himself; but the moment it is divulged, it forces itself into the possession
of every one, and the receiver cannot dispossess himself of it." -- Thomas
Jefferson

17 years ago by Alexander Wagner

function ($x, $y) use ($a, $b, &$c) {};

I am not sure if "use" is the clearest word to use there (wouldn't lexical
there make more sense?)

I agree. "use" for both namespaces and closures may not be a good idea.
Otherwise +1 to this syntax for its low WTF-factor.
Look like parameters. Behave like parameters.

Also, allowing this for regular function definitions might be a nice long-term
replacement for global.

I would be fine with requiring an explicit declaration of $this if it helped
avoid memory leaks.

I would propose to always require explicit declaration of $this, even if there
is no memory-leak problem. This would make it easier to distinguish plain
lambdas from closures and would prevent closures from being created by
accident.
As functional programming is foreign to most PHP-developers, better to err on
the side of being explicit.

Gesundheit
Wag

--
Her vocabulary was as bad as, like, whatever.

17 years ago by Christian Seiler

Hi,

I would be fine with requiring an explicit declaration of $this if it helped
avoid memory leaks.

I would propose to always require explicit declaration of $this, even if there
is no memory-leak problem. This would make it easier to distinguish plain
lambdas from closures and would prevent closures from being created by
accident.

I believe that is not necessary: $this is not a normal variable in PHP
(see for example http://bugs.php.net/bug.php?id=33652 or
http://bugs.php.net/bug.php?id=43163) but a language construct. So it
is possible to determine at compile time whether $this is used (if $this is
used inside the function, op_array->this_var != -1 if I'm not mistaken)
and therefore to import it automatically (it is automatically
available in the local scope of class methods anyway). The only possible
clash could be nested lambdas, but there you could simply enforce the
creation of $this in all functions containing a lambda which uses $this.

To sum it up: I believe it is possible to optimize it in such a way that
$this is only stored within the closure if the closure actually needs it,
but without the need for declaring it explicitly.

Regards,
Christian

17 years ago by Chris Stockton

Hello,

Does no one at all think that:

function foo($x, $y) use $a, $b, $c {
}

looks awkward and a little out of place when compared to:

function foo($x, $y) {
    lexical $a, $b, $c;
}

Although the fact we have to import variables from the parent scope kinda
stinks and is not typical in closure implementations, we should at least
import into the scope in a way consistent with how we do it already with
GLOBAL right? Just seems a lot cleaner IMO.

-Chris

17 years ago by Alexander Wagner

No one at all thinks:
function foo($x, $y) use $a, $b, $c {
}

Looks awkward and a little out of place

It certainly is new and different in PHP, but I don't see a reason why this
should be hard to get used to.
Also, it works for Java exceptions.

void foo ()
throws IOException {
}

we should at least import into the scope in a way consistent with how we do
it already with GLOBAL right? Just seems a lot cleaner IMO.

At first I also thought that it would be nice to be consistent with global.
Then I realized that global is inconsistent with everything else in PHP (by
using references instead of copies by default), which is probably why its
use is discouraged in favor of $GLOBALS.

I think it makes much more sense to make lexically (and globally) scoped
variables look like declared exceptions in Java and make them part of the
interface of the function, than to make them look like local variable
declarations in C.
Why allow the declaration of a reference to a global or lexical variable in
the middle of a loop?

Gesundheit
Wag

--
One hundred little bugs in the code
One hundred little bugs.
Fix a bug, link the fix in,
One hundred little bugs in the code.

17 years ago by Lars Strojny

Hi Alex, hi Larry,

On Friday, 20.06.2008, at 16:33 +0200, Alexander Wagner wrote:
[...]

I agree. "use" for both namespaces and closures may not be a good idea.
Otherwise +1 to this syntax for its low WTF-factor.
Look like parameters. Behave like parameters.

Probably "reuse" as in "reuse from outer scope" would work?

cu, Lars

17 years ago by Stanislav Malyshev

Hi!

As one of the Haskell list denizens commented, is there a potential for memory
leakage if lambdas implicitly import $this when defined within an object

Not really leakage (if refcounts are done right) but lifetimes extending
beyond what is expected - i.e. if some instance of closure generated by
the object is alive then the object is alive. If that's an issue, it can
be improved by storing $this only for closures that actually use it
(those messing with $$var will be in trouble).
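
The lifetime effect can be observed with a destructor. The sketch below assumes the behaviour of released PHP (5.4 or later), where a non-static closure created in an instance method holds on to $this even if its body never mentions it, and where static function creates a closure without $this. The class name Tracker and its log are hypothetical, introduced only for this illustration.

```php
<?php
class Tracker
{
    public static $log = array();
    private $name;

    public function __construct($name) { $this->name = $name; }
    public function __destruct() { self::$log[] = 'destroyed ' . $this->name; }

    // Non-static closure: keeps a reference to the creating object.
    public function bound() { return function () { return 1; }; }

    // Static closure: no $this, so it does not extend the object's life.
    public function unbound() { return static function () { return 1; }; }
}

$a = new Tracker('a');
$keepsA = $a->bound();
unset($a);
$afterUnsetA = Tracker::$log;        // still empty: the closure keeps 'a' alive

$b = new Tracker('b');
$freesB = $b->unbound();
unset($b);
$afterUnsetB = Tracker::$log;        // array('destroyed b')

unset($keepsA);
$afterUnsetClosure = Tracker::$log;  // array('destroyed b', 'destroyed a')
```

So the object is not leaked, but it lives exactly as long as the last closure that refers to it.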

Stanislav Malyshev, Zend Software Architect
stas@zend.com http://www.zend.com/
(408)253-8829 MSN: stas@zend.com

17 years ago by Marcus Boerger

Hello Stanislav,

Friday, June 20, 2008, 7:44:10 PM, you wrote:

Hi!

As one of the Haskell list denizens commented, is there a potential for memory
leakage if lambdas implicitly import $this when defined within an object

Not really leakage (if refcounts done right) but lifetimes extending
beyond what is expected - i.e. if some instance of closure generated by
the object is alive then the object is alive. If that's an issue, it can
be improved by storing $this only for closures that actually use it
(those messing with $$var will be in trouble).

Your point being? You describe nothing new here. Just admit the fact once
again that refcounted variables are a) hard to implement correctly and b)
never work 100% the way you want. A GC simply isn't an artificial
intelligence that knows what you would like it to do when reference counting
gets more complex. We know this already and we have lived with this in PHP
for a long time. And it is never a real problem in the domain we focus on,
that being short-lived scripts that serve internet requests.

Best regards,
Marcus

17 years ago by Christian Seiler

Hi Dmitry,

First of all: Your patch does really simplify things internally quite a
bit - I like it. I have a few issues though:

The patch shouldn't affect opcode caches and other extensions as it
doesn't change any structures.

I don't see a problem in changing structures for either extensions or
opcode caches, as long as only entries are added. Binary compatibility
with PHP 5.2 is not provided anyway (by neither 5.3 nor 6) and source
compatibility is not affected as long as the old members are not touched
and their semantics do not change.

It uses the op_array->static_variables for lexical variables.

That's a point I don't like. Although you use IS_CONSTANT to cleverly
mask lexical variables, I really think a separate hash table would be a
far better idea, especially for code maintainability.

The patch also fixes several small issues and adds some missing
functionality which didn't allow preg_replace_callback() (and may be
others) to work with lambda functions.

Oh yes, I somehow missed that, thanks!

Please review.

I (personally) have some smaller issues with the patch and one big
issue:

Smaller issues:

A separate hash table for the lexical variables would be much cleaner
in my eyes.
The segfault that occurs with my patch still occurs with yours (see
below for an example)

But the one big issue is the syntax: ($foo | $bar) is just extremely
painful in my eyes. I wouldn't want to use it - and it would be quite
confusing (which side are the normal parameters, which side are the
lexical vars?). I do see your point that the 'lexical' keyword inside
the function body to actually have an effect on the function semantics
is not optimal and that the list of lexical variables is probably better
placed in the function definition. I therefore propose the following syntax:

function (parameters) { } // no closure, simply lambda
function (parameters) KEYWORD (lexical) { } // closure with lexical vars

KEYWORD could be for example 'use'. That probably describes best what
the function does: Use/import those variables from the current scope.
Example:

return function ($x) use ($s) {
    static $n = 0;
    $n++;
    $s = $n.':'.$s;
    $this->foo($x[0].':'.$s);
};

As for simply omitting the keyword, e.g. function () () - as already
suggested: I don't like that syntax either. Although I'm not a fan of
too much language verbosity (that's why I don't like Fortran, Basic and
Pascal), I think in this case, a little more verbosity wouldn't hurt -
and typing 'use' is just 3 additional characters.

Now for the examples for the smaller issues:

Segfault:

<?php

$a = function () {
    $GLOBALS['a'] = NULL;
    echo "destroyed closure\n";
};

var_dump ($a);
$a ();
?>

This crashes - due to the fact that the currently used op_array is
destroyed upon destruction of the variable. This could get even more
interesting if the closure called itself recursively. My proposal is to
create a copy (but not a reference, just do a normal copy, for resources
or objects that will just do the trick) of the variable internally in
zend_call_function and zend_do_fcall_common_helper into a dummy zval and
destroy that zval after the function call ended. That way, the GC won't
kick in until after the execution of the closure. In zend_call_function
that's easy - in zend_do_fcall_common_helper we have the problem that
the variable containing the closure is no longer available. An idea
could be that the INIT_FCALL functions always additionally push the
lambda zval to the argument stack (inside the function it will be
ignored) and the fcall_common_helper will remove that zval from the
stack prior to returning (and free it). If a non-closure is called, NULL
(or an empty zval or whatever) could be pushed to the stack instead.
Hmm, perhaps I'll have a better idea tomorrow.
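
As a point of reference, the same script shape is safe on released PHP versions, where the engine keeps the running closure alive until the call returns. The sketch below assumes it is run at the top level of a script, so that $a is a global variable; the closure returns its message instead of echoing it.

```php
<?php
$a = function () {
    // Drop the only named reference to this closure while it is running.
    $GLOBALS['a'] = null;
    return "destroyed closure";
};

$result = $a();
// $result is "destroyed closure"; at top level, $a is now null.
```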

Anyway, since Andi suggested to use objects instead of resources, I'd
like to use your patch as a starting point, if there are no objections.

Regards,
Christian

17 years ago by Dmitry Stogov

Hi Christian,

I'm fine with your suggestion for lexical variables syntax, but I don't
know if we really need brackets around them. For now I changed syntax in
the following way.

$func = function ($x, $y) use $a, $b, $c {
}

As for the segfault, I added a check that emits a fatal error.

I don't like to use a separate HashTable for lexical variables, because

- it takes memory (however it won't be used for all regular op_arrays)
- it requires a new special modifier for the FETCH opcode
- opcode caches must check for this additional table and copy it

My idea with usage of static_variables doesn't require any opcode cache
modification at all.

I'm fine if you'll improve my patch (It's mainly yours :)

Thanks. Dmitry.

17 years ago by Lars Strojny

Hi Dmitry, hi Christian,

On Friday, 20.06.2008, at 15:12 +0400, Dmitry Stogov wrote:

$func = function ($x, $y) use $a, $b, $c {
}

Will lexical scoping work with normal ("named") functions too?

function foo($x, $y) use $a, $b, $c {
}

cu, Lars

17 years ago by Dmitry Stogov — view source

No it won't.

Dmitry.

Lars Strojny wrote:

Hi Dmitry, hi Christian,

On Friday, 20.06.2008, at 15:12 +0400, Dmitry Stogov wrote:

$func = function ($x, $y) use $a, $b, $c {
}

Will lexical scoping work with normal ("named") functions too?

function foo($x, $y) use $a, $b, $c {
}

cu, Lars

17 years ago by Lars Strojny — view source

Hi Dmitry,

On Friday, 20.06.2008, at 16:19 +0400, Dmitry Stogov wrote:

No it won't.

While I don't want to use it myself, it might be really confusing to our users
that it works differently from closures (because the declarations of
functions and closures look similar). Are there any internal
limitations that prevent it?

cu, Lars

17 years ago by Dmitry Stogov — view source

It is possible to do it, but I don't see any reason to invest time into
it. PHP scripts hardly ever use nested functions, and you can always
access global variables through "global" or $GLOBALS. I don't see why
we need another way to do the same thing.

Thanks. Dmitry.

Lars Strojny wrote:

Hi Dmitry,

On Friday, 20.06.2008, at 16:19 +0400, Dmitry Stogov wrote:

No it won't.

While I don't want to use it myself, it might be really confusing to our users
that it works differently from closures (because the declarations of
functions and closures look similar). Are there any internal
limitations that prevent it?

cu, Lars

17 years ago by Alexey Zakhlestin — view source

It is possible to do it, but I don't see any reason to invest time into
it. PHP scripts hardly ever use nested functions, and you can always
access global variables through "global" or $GLOBALS. I don't see why
we need another way to do the same thing.

just to clarify:

PHP currently does not have "nested functions", but it does allow
declaring ordinary, global-scoped functions from inside other functions.
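A minimal sketch of that distinction, with hypothetical function names outer() and inner(): the inner declaration becomes an ordinary global function once outer() has run, and it cannot see the locals of the function that declared it.

```php
<?php
// Sketch: a function declared inside another function is global-scoped,
// not lexically nested. outer() and inner() are illustrative names.
function outer()
{
    $secret = 42; // local to outer() only

    function inner()
    {
        // $secret is not visible here: inner() has its own, empty scope.
        return isset($secret) ? 'visible' : 'not visible';
    }

    return inner();
}

$result        = outer();                  // declares inner() as a side effect
$innerIsGlobal = function_exists('inner'); // callable from anywhere now
```

Note that calling outer() a second time would try to redeclare inner() and fail, which is another sign that this is not true nesting.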

--
Alexey Zakhlestin
http://blog.milkfarmsoft.com/

17 years ago by Marcus Boerger — view source

Hello Dmitry,

Friday, June 20, 2008, 2:19:46 PM, you wrote:

No it won't.

Dmitry.

Lars Strojny wrote:

Hi Dmitry, hi Christian,

On Friday, 20.06.2008, at 15:12 +0400, Dmitry Stogov wrote:

$func = function ($x, $y) use $a, $b, $c {
}

I really like your style here :-)

We could discuss this over and over, but what are we missing at this point?
Should the patch just go into HEAD, and we deal with tweaking the wording
as we move on with trying it in real life?

Personally I think 'parent' would work as well, and it also doesn't matter to
me whether the keyword has to be in front of or after the opening curly brace.

thanks for your idea anyway.

marcus

Will lexical scoping work with normal ("named") functions too?

function foo($x, $y) use $a, $b, $c {
}

Best regards,
Marcus

17 years ago by Christian Seiler — view source

Hi Dmitry,

I'm fine if you'll improve my patch (It's mainly yours :)

I updated my closures RFC: http://wiki.php.net/rfc/closures

I have based my new version of the patch on yours (Dmitry), but I made
some changes to that:

Objects instead of resources are used; two new files,
zend_closures.[ch], are added where the new Closure class
is defined. Currently, it contains a dummy __toString method
that may in future be extended to provide enhanced debugging info;
further cool stuff could also be added to such a
class later on. But I prefer to only add the basic closure
functionality at first - you can always extend it once it's there.

I have not added any __invoke() magic to normal objects. This is
mainly due to the simple reason that adding that would not help
a closure implementation at all. Closures need some engine-internal
magic (use a dynamically created op_array instead of looking one up,
setting the correct class scope and setting the correct EG(this)). And
as I said: I want to stick with the closure basics for now.

That said, I do like the possibility of invoking objects directly, so
I suggest someone create an additional proposal for that?

I've added a patch for PHP HEAD (PHP 6.0). This is due to the fact
that Dmitry's variant of my patch has far fewer intersections with
the unicode functionality than my original patch, so it was quite
straightforward to do so.

Lexical vars are now copied instead of referenced by default. By putting
& in front of the var, the behaviour may be changed. I added that in
order to demonstrate that both were possible and that a simple change
of grammar suffices. In my eyes this is the main issue where a
discussion has to take place (i.e. copy or reference by default?
possibility to change default via syntax? which lexical syntax?)
before the proposal can be accepted.

I provided patches for both lexical $var and use ($var) syntaxes.

I provided a patch variant that only stores $this if $this is
explicitly used inside a closure (or a nested closure of that
closure). This works since it is possible to detect whether $this
is used at compile time. For this, I have added a this_used flag
to the op_array structure.

I added tests (Zend/tests/closures_*.phpt) that ensure the correct
behaviour of closures.
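The copy-versus-reference choice described in the list above can be sketched with the use (...) form. This assumes the syntax as it exists in current PHP releases (copy by default, & for a reference); the variable names are illustrative only.

```php
<?php
// Sketch: capture by value takes a snapshot, capture by reference stays linked.
// Assumption: the use (...) syntax as available in current PHP releases.
$greeting = 'hello';

$byValue     = function () use ($greeting)  { return $greeting; };
$byReference = function () use (&$greeting) { return $greeting; };

$greeting = 'goodbye'; // changed after both closures were created

$v = $byValue();     // sees the value at creation time
$r = $byReference(); // sees the current value in the parent scope
```

With the explicit & the reader can tell at the declaration which behaviour applies, which is the clarity argument made for the mixed approach.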

[Note that I created my own local SVN repos for developing these patches
because I was fed up with CVS's inability to do local diffs and locally
mark files as added to include them in the diffs. Just to explain the
format of the patch.]

Anyway, feel free to discuss.

In my eyes, the following questions should be answered:

Do you want closures in PHP?

I have not seen a single negative reaction to my proposal, so I
assume the answer to that is yes. ;-)

Which syntax should be used for lexical variables? Should references
or copies be created by default?

This is far trickier.

First of all: There must always be the possibility to create
references, else you can't call it closures.

Second: I prefer the 'lexical' keyword, but I could live with the
use solution [function () use ($...)].

Third: There are several arguments against default referencing:
- (use syntax) As they look similar to parameters and normal
  parameters aren't passed by ref, this could be quite
  odd.
- Loop index WTFs (see proposal)
- Speed? (You always read that refs are slower in PHP than normal
  variable copies. Is that actually true?)
- While it is possible to simply add an & in front of the variable
  name to switch to refs in "no refs default" mode, there is no
  obvious syntax to use copies in "refs default" mode other than
  unsetting the variable in the parent scope immediately after
  closure creation.

Fourth: There are several arguments for default referencing:
- (lexical syntax) global also creates a reference, why shouldn't
  lexical?
- Other languages appear to exhibit a very similar behaviour to
  PHP if PHP created references. This is due to the fact that
  other languages have a different concept of scope than PHP
  does.

Although the list of arguments against appears to be longer, I do
prefer using references by default nevertheless. But that's just
my personal opinion.

Are you OK with the change that $this is only stored when needed?

I don't see a problem. Dmitry seems to be very touchy (;-)) about
changing op_arrays but in this case it's only a flag so I don't
see a problem for opcode caches (in contrast to a HashTable where
the opcode cache must actually add code to duplicate that table).

Do you want closures in PHP 5.3?

Since the majority of core developers appear to be against it, I
presume the answer is no.
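The "loop index" argument against default referencing can be illustrated as follows. This is a sketch using the use (...) syntax of current PHP releases, not the patch itself; the array names are made up for the example.

```php
<?php
// Sketch of the loop-index surprise: closures created in a loop that capture
// the index by reference all observe its final value.
$byRef = [];
$byVal = [];

for ($i = 0; $i < 3; $i++) {
    $byRef[] = function () use (&$i) { return $i; }; // shares the one $i
    $byVal[] = function () use ($i)  { return $i; }; // snapshot per iteration
}

$call       = function ($f) { return $f(); };
$refResults = array_map($call, $byRef); // every closure sees $i after the loop
$valResults = array_map($call, $byVal); // each closure kept its own value
```

If references were the default, the first form is what an unmarked capture would produce, which is the behaviour the RFC calls a WTF.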

I will provide a revised patch that incorporates the results of the
following discussion for 5_3 and HEAD once consensus or at least a
majority regarding the remaining issues is reached. I will also rewrite
the proposal to reflect the discussion results and adjust the tests.
After that, I hope that someone will commit at least the HEAD version.

Regards,
Christian

17 years ago by Marcus Boerger — view source

Hello Christian,

Thursday, June 26, 2008, 6:23:53 PM, you wrote:

Hi Dmitry,

I'm fine if you'll improve my patch (It's mainly yours :)

I updated my closures RFC: http://wiki.php.net/rfc/closures

I have based my new version of the patch on yours (Dmitry), but I made
some changes to that:

Objects instead of resources are used, two new files
zend_closures.[ch] are added where the new Closure class
is defined. Currently, it contains a dummy __toString method
that in future may be extended to provide enhanced debugging info,
also further additional cool stuff could be added to such a
class later on. But I prefer to only add the basic closure
functionality at first - you can always extend it once it's there.

I have not added any __invoke() magic to normal objects. This is
mainly due to the simple reason that adding that would not help
a closure implementation at all. Closures need some engine internal
magic (use a dynamically created op_array instead of looking one up,
setting the correct class scope and setting the correct EG(this)). And
as I said: I want to stick with the closure basics for now.

That said, I do like the possibility of invoking objects directly, so
I suggest someone created an additional proposal for that?

I've added a patch for PHP HEAD (PHP 6.0). This is due to the fact
that Dmitry's variant of my patch has far less intersections with
the unicode functionality than my original patch, so it was quite
straight-forward to do so.

Lexical vars are now copied instead of referenced by default. Using
& in front of the var, the behaviour may be changed. I added that in
order to demonstrate that both were possible and that a simple change
of grammar suffices. In my eyes this is the main issue where a
discussion has to take place (i.e. copy or reference by default?
possibility to change default via syntax? which lexical syntax?)
before the proposal can be accepted.

I provided patches for both lexical $var and use ($var) syntaxes.

I provided a patch variant that only stores $this if $this is
explicitly used inside a closure (or a nested closure of that
closure). This works since it is possible to detect whether $this
is used at compile time. For this, I have added a this_used flag
to the op_array structure.

I added tests (Zend/tests/closures_*.phpt) that ensure the correct
behaviour of closures.

[Note that I created my own local SVN repos for developing these patches
because I was fed up with CVS's inability to do local diffs and locally
mark files as added to include them in the diffs. Just to explain the
format of the patch.]

Anyway, feel free to discuss.

In my eyes, the following questions should be answered:

Do you want closures in PHP?

I have not seen a single negative reaction to my proposal, so I
assume the answer to that is yes. ;-)

yes

Which syntax should be used for lexical variables? Should references
or copies be created by default?

This is far trickier.

First of all: There must *always* be the _possibility_ to create
references, else you can't call it closures.

Second: I prefer the 'lexical' keyword, but I could live with the
use solution [function () use ($...)].

'use', because no new keyword has to be introduced, which would break stuff
no matter what the keyword would be.

Third: There are several arguments against default referencing:

  * (use syntax) As they look similar to parameters and normal
                 parameters aren't passed by ref, this could be quite
                 odd.
  * Loop index WTFs (see proposal)
  * Speed? (You always read that refs are slower in PHP than normal
    variable copies. Is that actually true?)
  * While it is possible to simply add an & in front of the variable
    name to switch to refs in "no refs default" mode, there is no
    obvious syntax to use copies in "refs default" mode other than
    unsetting the variable in the parent scope immediately after
    closure creation.

I like the new ability to reference if wanted. But then I don't like
references at all. Along with the fact that in PHP objects are always
references I slightly tend to not want reference functionality. Since it
is handled in the parser you could submit with either version and check
during the evaluation period if people disagree with your choice - or the
list's choice, if that's what decides.

Fourth: There are several arguments for default referencing:

  * (lexical syntax) global also creates a reference, why shouldn't
                     lexical?
  * Other languages *appear* to exhibit a very similar behaviour to
    PHP if PHP created references. This is due to the fact that
    other languages have a different concept of scope than PHP
    does.

Although the list of against arguments appears to be longer, I do
prefer using references by default nevertheless. But that's just
my personal opinion.

Are you OK with the change that $this is only stored when needed?

I don't see a problem. Dmitry seems to be very touchy (;-)) about
changing op_arrays but in this case it's only a flag so I don't
see a problem for opcode caches (in contrast to a HashTable where
the opcode cache must actually add code to duplicate that table).

I see it as dangerous... eval comes to mind. And also, why create something
special when the normal way of doing things that is done everywhere else
would work too?

Do you want closures in PHP 5.3?

Since the majority of core developers appear to be against it, I
presume the answer is no.

I am still in favor of it and against a 5.4.

I will provide a revised patch that incorporates the results of the
following discussion for 5_3 and HEAD once consensus or at least a
majority regarding the remaining issues is reached. I will also rewrite
the proposal to reflect the discussion results and adjust the tests.
After that, I hope that someone will commit at least the HEAD version.

Comments on the first patch version:

coooool, I am the listed author: Zend/zend_closures.c/h
you shouldn't be having a __destruct. Can you prevent that?
please drop __toString, with the new behavior only stuff that has
something to say in a string context should have a __toString
a tiny optimization:
+ZEND_API zend_closure *zend_get_closure(zval *obj TSRMLS_DC) /* {{{ */
+{
+    zend_class_entry *ce = Z_OBJCE_P(obj);
+    if (instanceof_function(ce, zend_ce_closure TSRMLS_CC)) {
+        zend_closure *closure = (zend_closure *)zend_object_store_get_object(obj);
+        if (closure->initialized) return closure;
+    }
+    return NULL;
+}

a faster way would be to:
a) add a new type (probably not so good though)
b) add a new ce_flag
+#define ZEND_ACC_CLOSURE ...
+ZEND_API zend_closure *zend_get_closure(zval *obj TSRMLS_DC) /* {{{ */
+{
+    zend_class_entry *ce = Z_OBJCE_P(obj);
+    if ((ce->ce_flags & ZEND_ACC_CLOSURE) != 0) {
+        zend_closure *closure = (zend_closure *)zend_object_store_get_object(obj);
+        if (closure->initialized) return closure;
+    }
+    return NULL;
+}

maybe even inline this function?
very high quality work! thanks a lot! very well thought out and nicely
tested even

Best regards,
Marcus

17 years ago by Christian Seiler — view source

Hi Marcus,

I like the new ability to reference if wanted. But then I don't like
references at all.

As I said: Without reference support, you can't call it closures.
Closures must per definition have the possibility to change the values
of used variables in the parent scope - and the only sensible PHP way to
do that is references.

If people here say: We want copy as default and references optional,
that's fine with me since real closures can still be achieved and -
let's face it - since PHP wasn't designed with closures in mind, the
syntax will never be 100% perfect, regardless of how we do it.
But I'm strongly against removing reference support entirely.

Are you OK with the change that $this is only stored when needed?

I don't see a problem. Dmitry seems to be very touchy (;-)) about
changing op_arrays but in this case it's only a flag so I don't
see a problem for opcode caches (in contrast to a HashTable where
the opcode cache must actually add code to duplicate that table).

I see it dangerous.... eval comes to mind.

Certain things don't work anyway as of now, see bug
http://bugs.php.net/bug.php?id=43163 for example... And that's intended
behaviour.

Ok, eval() inside a closure still is a possible problem but we still
could say "ok, as soon as eval() appears, just assume $this is used".
That will benefit those people who use neither $this nor eval() inside a
closure with an optimization ($this will be GCed and not copied). And as
far as I can see, eval() is the only thing you can do inside a normal
class method that will allow you to access $this without the compiler
knowing about it. But thanks for pointing eval out, I hadn't thought of
that.
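The eval() concern can be sketched like this. The example assumes a current PHP release, where a closure created inside a method has $this available; the class name Greeter is hypothetical.

```php
<?php
// Sketch: eval() inside a closure can reach $this without the literal token
// "$this" appearing in the closure's compiled body.
// Assumption: a current PHP release that binds $this for closures in methods.
class Greeter
{
    private $name = 'world';

    public function make()
    {
        return function () {
            // The compiler only sees a string here, not a use of $this.
            return eval('return $this->name;');
        };
    }
}

$f   = (new Greeter())->make();
$out = $f();
```

A compile-time "is $this used?" flag would therefore have to treat any eval() in the body as a potential use, as suggested above.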

And also, why create something
special when the normal way of doing things that is done everywhere else
would work too.

What do you mean by that?

Comments on the first patch version:

coooool, I am the listed author: Zend/zend_closures.c/h

Oh, yeah, I just copied some header I found elsewhere. Should I put my
name there or what's the policy for contributions?

you shouldn't be having a __destruct. Can you prevent that?

But if I don't have a destructor, how do I garbage collect the op_array?

please drop __toString, with the new behavior only stuff that has
something to say in a string context should have a __toString

Somebody on the internals list suggested adding it to print out useful
debugging info. But I'm fine with dropping it until there is actually a
useful implementation.

a tiny optimization:
+ZEND_API zend_closure *zend_get_closure(zval *obj TSRMLS_DC) /* {{{ */
+{
+    zend_class_entry *ce = Z_OBJCE_P(obj);
+    if (instanceof_function(ce, zend_ce_closure TSRMLS_CC)) {
+        zend_closure *closure = (zend_closure *)zend_object_store_get_object(obj);
+        if (closure->initialized) return closure;
+    }
+    return NULL;
+}

Yes, good idea.

a faster way would be to:
a) add a new type (probably not so good though)
b) add a new ce_flag
+#define ZEND_ACC_CLOSURE ...

I'm quite indifferent to that change. On the one hand, it's a
performance optimization, on the other hand, it digs deeper into current
Zend code.

maybe even inline this function?

Yes, good idea.

As stated before, I'll keep all changes in mind and post an updated
patch as soon as the discussion is done.

Regards,
Christian

17 years ago by Alexander Wagner — view source

I added tests (Zend/tests/closures_*.phpt) that ensure the correct
behaviour of closures.

I'd like to propose an additional test to ensure closures can call themselves:
<?php
$i = 3;

$lambda = function ($lambda) use ($i) {
if ($i==0) return;
echo $i--."\n";
$lambda($lambda);
};

$lambda($lambda);
echo "$i\n";
?>
Expected output:
3
2
1
3

I see exactly one problem with the patch, which is that the above script
shouldn't work without "use (&$i)".
I find it counterintuitive that the creation of the lambda creates a copy of
$i, but all invocations of $lambda use a reference to the same $i.
For n calls to $lambda, there are only 2 copies of $i (one global, one static
in $lambda) where I would expect n+1 copies.
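For comparison, here is a sketch of the same self-calling pattern with the counter captured explicitly by reference. It assumes the use (&$i) behaviour of current PHP releases, where a by-value capture would instead start every invocation from the captured value again; $seen is a helper added only to record what happens.

```php
<?php
// Sketch: a closure that recurses by receiving itself as an argument.
// Assumption: current PHP semantics, where use (&$i) shares the parent's $i.
$i    = 3;
$seen = [];

$lambda = function ($self) use (&$i, &$seen) {
    if ($i == 0) {
        return;
    }
    $seen[] = $i--; // record the current value, then decrement the shared $i
    $self($self);
};

$lambda($lambda);
```

Because the reference is shared, the parent's $i is also changed once the recursion ends, unlike the expected output of the proposed test, which relies on the parent keeping its original value.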

Gesundheit
Wag

--
Sieh dich, nimm Sorge.

17 years ago by Christian Seiler — view source

Hi!

I see exactly one problem with the patch, which is that the above script
shouldn't work without "use (&$i)".
I find it counterintuitive that the creation of the lambda creates a copy of
$i, but all invocations of $lambda use a reference to the same $i.
For n calls to $lambda, there are only 2 copies of $i (one global, one static
in $lambda) where I would expect n+1 copies.

Yes, you're right. My solution for this would be:

-------------- zend_compile.h -------------------------
-void zend_do_fetch_static_variable(znode *varname, znode
*static_assignment, int fetch_type TSRMLS_DC);
+void zend_do_fetch_static_variable(znode *varname, znode
*static_assignment, int fetch_type, int as_ref TSRMLS_DC);
-------------- zend_compile.h -------------------------

-------------- zend_compile.c -------------------------

in zend_do_fetch_lexical_variable:

-zend_do_fetch_static_variable(varname, &value, ZEND_FETCH_STATIC
TSRMLS_CC);
+zend_do_fetch_static_variable(varname, &value, ZEND_FETCH_STATIC,
is_ref TSRMLS_CC);

...
-void zend_do_fetch_static_variable(znode *varname, znode
*static_assignment, int fetch_type TSRMLS_DC)
+void zend_do_fetch_static_variable(znode *varname, znode
*static_assignment, int fetch_type, int as_ref TSRMLS_DC)
...

-    zend_do_assign_ref(NULL, &lval, &result TSRMLS_CC);
+    if (as_ref) {
+        zend_do_assign_ref(NULL, &lval, &result TSRMLS_CC);
+    } else {
+        zend_do_assign(NULL, &lval, &result TSRMLS_CC);
+    }

and make sure zend_do_assign can live with NULL for first param

-------------- zend_compile.c -------------------------

-------------- zend_language_parser.y -------------------------
static_var_list:
-        static_var_list ',' T_VARIABLE { zend_do_fetch_static_variable(&$3, NULL, ZEND_FETCH_STATIC TSRMLS_CC); }
-    |   static_var_list ',' T_VARIABLE '=' static_scalar { zend_do_fetch_static_variable(&$3, &$5, ZEND_FETCH_STATIC TSRMLS_CC); }
-    |   T_VARIABLE { zend_do_fetch_static_variable(&$1, NULL, ZEND_FETCH_STATIC TSRMLS_CC); }
-    |   T_VARIABLE '=' static_scalar { zend_do_fetch_static_variable(&$1, &$3, ZEND_FETCH_STATIC TSRMLS_CC); }
+        static_var_list ',' T_VARIABLE { zend_do_fetch_static_variable(&$3, NULL, ZEND_FETCH_STATIC, 1 TSRMLS_CC); }
+    |   static_var_list ',' T_VARIABLE '=' static_scalar { zend_do_fetch_static_variable(&$3, &$5, ZEND_FETCH_STATIC, 1 TSRMLS_CC); }
+    |   T_VARIABLE { zend_do_fetch_static_variable(&$1, NULL, ZEND_FETCH_STATIC, 1 TSRMLS_CC); }
+    |   T_VARIABLE '=' static_scalar { zend_do_fetch_static_variable(&$1, &$3, ZEND_FETCH_STATIC, 1 TSRMLS_CC); }
;
-------------- zend_language_parser.y -------------------------

Any objections (in case copies are wanted at all)?

Regards,
Christian

17 years ago by Alexander Wagner — view source

Yes, you're right. My solution for this would be:

I can't get this to work, it segfaults for me now when I try to use closures.
Maybe I screwed something up, this is my first Zend-Engine-hackery.

As you agree that the current behaviour is kinda weird, just put the fix in
the next wave of patches.
Unless somebody else has an objection, of course.

Gesundheit
Wag

--
John and Mary had never met. They were like two hummingbirds who had also
never met.

17 years ago by Andi Gutmans — view source

Hey Christian,

Nice job!

More below:

-----Original Message-----
From: Christian Seiler [mailto:chris_se@gmx.net]
Sent: Thursday, June 26, 2008 9:24 AM
To: Dmitry Stogov
Cc: php-dev List; Andi Gutmans; Stas Malyshev
Subject: Re: [PHP-DEV] [PATCH] [RFC] Closures and lambda functions in PHP

-snip-

Lexical vars are now copied instead of referenced by default. Using
& in front of the var, the behaviour may be changed. I added that in
order to demonstrate that both were possible and that a simple change
of grammar suffices. In my eyes this is the main issue where a
discussion has to take place (i.e. copy or reference by default?
possibility to change default via syntax? which lexical syntax?)
before the proposal can be accepted.

I think doing this mixed approach is the right one, i.e. $var1, &$var2. It delivers choice and adds clarity as it's explicit about value vs. reference.

I provided patches for both lexical $var and use ($var) syntaxes.

I lean towards the use(...) syntax.

I provided a patch variant that only stores $this if $this is
explicitly used inside a closure (or a nested closure of that
closure). This works since it is possible to detect whether $this
is used at compile time. For this, I have added a this_used flag
to the op_array structure.

Safest not to take shortcuts. You get yourself into trouble with things which will stop working. -1 on this optimization.

I added tests (Zend/tests/closures_*.phpt) that ensure the correct
behaviour of closures.

Excellent.

Anyway, feel free to discuss.

In my eyes, the following questions should be answered:

Do you want closures in PHP?

I think most people here feel it's useful or are at least indifferent.

Do you want closures in PHP 5.3?

I really think we need to get the release process for 5.3 going so I suggest to do it for the following minor version and commit it to PHP 6.

Andi

17 years ago by Alexander Wagner — view source

I lean towards the use(...) syntax.

Me too.

I provided a patch variant that only stores $this if $this is
explicitly used inside a closure [..]

Safest not to take shortcuts. You get yourself into trouble with things
which will stop working. -1 on this optimization.

I believe that always implicitly referencing $this is a mistake.
Not only does it turn every lambda that is created inside an object into a
heavier closure, it also makes it impossible for the closure to outlive the
object, which can, in some rare cases, dramatically increase memory
consumption, in which case things will also stop working.
Also, given how many people put all their code into classes, this may cause
problems more often than we might think, although the wasted memory will
usually go undetected, making this whole thing a source for nasty heisenbugs.

If the optimization of only referencing $this when it is actually used is a
dangerous shortcut, the alternative should be to not implicitly reference
$this at all and require it to be imported through "use ($this)".
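The lifetime effect described here can be observed with a destructor counter. This sketch assumes a current PHP release, where closures created in a method hold $this automatically and the "static function" form opts out; that differs from the use ($this) idea proposed in this message, and the class name Holder is hypothetical.

```php
<?php
// Sketch: a closure that holds $this keeps its object alive; one that does
// not lets the object be freed immediately.
// Assumption: current PHP, where "static function" means "no $this".
class Holder
{
    public static $destroyed = 0;

    public function __destruct()
    {
        self::$destroyed++;
    }

    public function bound()
    {
        return function () { return 1; };        // implicitly holds $this
    }

    public function unbound()
    {
        return static function () { return 1; }; // holds no object
    }
}

$f           = (new Holder())->unbound();
$afterStatic = Holder::$destroyed; // temporary object already destroyed

$g          = (new Holder())->bound();
$afterBound = Holder::$destroyed;  // second object kept alive by $g

unset($g);
$afterUnset = Holder::$destroyed;  // freed together with the closure
```

Even though neither closure body mentions $this, only the bound one extends the object's lifetime, which is the hidden memory cost raised above.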

Do you want closures in PHP?

I think most people here feel it's useful or are at least indifferent.

There are those who want them and those who don't know that they want them.

Gesundheit
Wag

--
The army is launching a military theme park in Virginia with high-tech
simulator rides.
The Project is expected to cost 900 million dollars and none of the rides will
ever end.

Studio 60

17 years ago by Dmitry Stogov — view source

I thought about use($this) too. :)
I'll try to implement it in the next version of the patch.

Thanks. Dmitry.

Alexander Wagner wrote:

I lean towards the use(...) syntax.

Me too.

I provided a patch variant that only stores $this if $this is
explicitly used inside a closure [..]
Safest not to take shortcuts. You get yourself into trouble with things
which will stop working. -1 on this optimization.

I believe that always implicitly referencing $this is a mistake.
Not only does it turn every lambda that is created inside an object into a
heavier closure, it also makes it impossible for the closure to outlive the
object, which can, in some rare cases, dramatically increase memory
consumption, in which case things will also stop working.
Also, given how many people put all their code into classes, this may cause
problems more often than we might think, although the wasted memory will
usually go undetected, making this whole thing a source for nasty heisenbugs.

If the optimization of only referencing $this when it is actually used is a
dangerous shortcut, the alternative should be to not implicitly reference
$this at all and require it to be imported through "use ($this)".

Do you want closures in PHP?
I think most people here feel it's useful or are at least indifferent.

There are those who want them and those who don't know that they want them.

Gesundheit
Wag

17 years ago by Stanislav Malyshev — view source

Hi!

I thought about use($this) too. :)
I'll try to implement it in the next version of the patch.

I think implicitly using $this when it's referred to is much better.
$this is very special variable so it deserves special treatment. If we'd
need to spare a couple of bytes for that - that's not too much to pay.

Stanislav Malyshev, Zend Software Architect
stas@zend.com http://www.zend.com/
(408)253-8829 MSN: stas@zend.com

17 years ago by Larry Garfield — view source

On Thursday 26 June 2008 11:23:53 am Christian Seiler wrote:

Hi Dmitry,

I'm fine if you'll improve my patch (It's mainly yours :)

I updated my closures RFC: http://wiki.php.net/rfc/closures
In my eyes, the following questions should be answered:

Do you want closures in PHP?

I have not seen a single negative reaction to my proposal, so I
assume the answer to that is yes. ;-)

Yea. :-)

Which syntax should be used for lexical variables? Should references
or copies be created by default?

This is far trickier.

First of all: There must always be the possibility to create
references, else you can't call it closures.

Second: I prefer the 'lexical' keyword, but I could live with the
use solution [function () use ($...)].

Third: There are several arguments against default referencing:

(use syntax) As they look similar to parameters and normal
parameters aren't passed by ref, this could be quite
odd.

Loop index WTFs (see proposal)

Speed? (You always read that refs are slower in PHP than normal
variable copies. Is that actually true?)

While it is possible to simply add an & in front of the variable
name to switch to refs in "no refs default" mode, there is no
obvious syntax to use copies in "refs default" mode other than
unsetting the variable in the parent scope immediately after
closure creation.

Fourth: There are several arguments for default referencing:

(lexical syntax) global also creates a reference, why shouldn't
lexical?

Other languages appear to exhibit a very similar behaviour to
PHP if PHP created references. This is due to the fact that
other languages have a different concept of scope than PHP
does.

Although the list of against arguments appears to be longer, I do
prefer using references by default nevertheless. But that's just
my personal opinion.

I see these two issues as related, actually. Consider:

$foo = function($a, &$b) {
global $c;
lexical $d;
// ...
}

Since they look the same, you'd expect them to behave the same. However,
global will import by reference and lexical by value. Hilarity ensues, and
not the good kind. Naturally changing the behavior of global in this case is
out of the question, and as you point out defaulting to reference and having
an extra flag (a la &) to force it to value is unprecedented. I think most
seem to agree that being able to pass by value or by reference at the
developer's discretion is necessary.

However, something in the function signature itself would naturally follow the
behavior of function arguments:

$foo = function($a, &$b) lexical ($d, &$e) {
global $c;
// ...
}

Here, $d and $e behave "as you'd expect them to", with the same visual parsing
semantics as $a and $b. That to me is a much lower wtf factor than having
global and lexical keywords that look alike but behave differently. I would
therefore favor the signature-based syntax.
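The difference argued above can be sketched with the signature-based form. The example uses the use (...) spelling found in current PHP releases rather than the lexical (...) spelling from the message, so treat the keyword as an assumption; the behaviour of global is standard.

```php
<?php
// Sketch: "global" always binds by reference, while signature-style captures
// follow parameter conventions (copy unless marked with &).
$GLOBALS['c'] = 1; // set through $GLOBALS so the sketch also works inside a function
$d = 1;
$e = 1;

$foo = function () use ($d, &$e) {
    global $c; // reference to the global $c
    $c++;
    $d++;      // changes only the closure's private copy
    $e++;      // changes the parent's $e through the reference
};

$foo();
$globalC = $GLOBALS['c'];
```

Reading the declaration alone tells you that $d is a copy and $e is shared, in the same way $a and &$b would read in a parameter list.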

As an aside, I did use "lexical" in the second example above deliberately.
Personally I don't think re-using "use" here is wise, as that will make it
seem namespace related when in fact it is not. I suspect the instances of
function or constant names of "lexical" will be pretty minimal (although I
admit to having no evidence to back up that suspicion.)

(I would much much rather have a closures implementation that used "use" than
not one at all, mind you; I will still jump for joy if it lands using "use",
I just think it would be better using "lexical".)

Are you OK with the change that $this is only stored when needed?

Ignoring the compiler-level concerns, about which I know nothing useful,
doesn't this introduce a bit of wtf? "All lexically-imported variables must
be explicit or they don't exist... oh yeah, except for $this because it's
special."

To which the question is: If you are able to magically determine if $this is
used and optimize accordingly, why can't you for anything else?

To which the response is, I think: "Performance". That's still something of
an inconsistency, however. Would it make the engine code any cleaner/messier
if $this was required to be declared explicitly like everything else?

Do you want closures in PHP 5.3?

As a PHP developer I'd love to have closures in 2 years when I'm able to use
PHP 5.3 instead of 5 years when I'm able to use PHP 5.4 or PHP 6. :-) I do
understand the need to draw a line somewhere and justbloodyshipit(tm),
however, so if that's the decision I can accept that.

Regards,
Christian

You so rock. :-)

--
Larry Garfield
larry@garfieldtech.com

17 years ago by Dmitry Stogov

Hi Christian,

I reworked your patch a little bit.

  * Fixed ref/noref issues
  * Explicit $this passing:
    $a = function foo() use ($this) {}
  * Some code reorganization to encapsulate closure implementation.

Thanks. Dmitry.

Christian Seiler wrote:

Hi Dmitry,

I'm fine if you'll improve my patch (It's mainly yours :)

I updated my closures RFC: http://wiki.php.net/rfc/closures

I have based my new version of the patch on yours (Dmitry), but I made
some changes to that:

Objects instead of resources are used, two new files
zend_closures.[ch] are added where the new Closure class
is defined. Currently, it contains a dummy __toString method
that in future may be extended to provide enhanced debugging info,
also further additional cool stuff could be added to such a
class later on. But I prefer to only add the basic closure
functionality at first - you can always extend it once it's there.

I have not added any __invoke() magic to normal objects. This is
mainly due to the simple reason that adding that would not help
a closure implementation at all. Closures need some engine internal
magic (use a dynamically created op_array instead of looking one up,
setting the correct class scope and setting the correct EG(this)). And
as I said: I want to stick with the closure basics for now.

That said, I do like the possibility of invoking objects directly, so
I suggest someone create an additional proposal for that?

I've added a patch for PHP HEAD (PHP 6.0). This is due to the fact
that Dmitry's variant of my patch has far fewer intersections with
the unicode functionality than my original patch, so it was quite
straightforward to do so.

Lexical vars are now copied instead of referenced by default. Using
& in front of the var, the behaviour may be changed. I added that in
order to demonstrate that both are possible and that a simple change
of grammar suffices. In my eyes this is the main issue where a
discussion has to take place (i.e. copy or reference by default?
possibility to change default via syntax? which lexical syntax?)
before the proposal can be accepted.

I provided patches for both lexical $var and use ($var) syntaxes.

I provided a patch variant that only stores $this if $this is
explicitly used inside a closure (or a nested closure of that
closure). This works since it is possible to detect whether $this
is used at compile time. For this, I have added a this_used flag
to the op_array structure.

I added tests (Zend/tests/closures_*.phpt) that ensure the correct
behaviour of closures.

[Note that I created my own local SVN repos for developing these patches
because I was fed up with CVS's inability to do local diffs and to locally
mark files as added to include them in the diffs. Just to explain the
format of the patch.]

Anyway, feel free to discuss.

In my eyes, the following questions should be answered:

Do you want closures in PHP?

I have not seen a single negative reaction to my proposal, so I
assume the answer to that is yes. ;-)

Which syntax should be used for lexical variables? Should references
or copies be created by default?

This is far trickier.

First of all: There must always be the possibility to create
references, else you can't call it closures.

Second: I prefer the 'lexical' keyword, but I could live with the
use solution [function () use ($...)].

Third: There are several arguments against default referencing:

(use syntax) As they look similar to parameters and normal
parameters aren't passed by ref, this could be quite
odd.

Loop index WTFs (see proposal)

Speed? (You always read that refs are slower in PHP than normal
variable copies. Is that actually true?)

While it is possible to simply add an & in front of the variable
name to switch to refs in "no refs default" mode, there is no
obvious syntax to use copies in "refs default" mode other than
unsetting the variable in the parent scope immediately after
closure creation.

Fourth: There are several arguments for default referencing:

(lexical syntax) global also creates a reference, why shouldn't
lexical?

Other languages appear to exhibit a very similar behaviour to
PHP if PHP created references. This is due to the fact that
other languages have a different concept of scope than PHP
does.

Although the list of arguments against appears to be longer, I do
prefer using references by default nevertheless. But that's just
my personal opinion.
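
The loop index concern listed above can be sketched as follows. This is an illustration only, written with the use ($i) and use (&$i) spellings that current PHP accepts; the example in the proposal itself may be worded differently.

```php
<?php
$byReference = [];
$byValue = [];

for ($i = 0; $i < 3; $i++) {
    // All three closures share the one loop variable.
    $byReference[] = function () use (&$i) { return $i; };
    // Each closure keeps the value $i had when the closure was created.
    $byValue[] = function () use ($i) { return $i; };
}

// Call every stored closure once the loop has finished.
$call = function ($f) { return $f(); };
$seenByReference = array_map($call, $byReference); // [3, 3, 3]
$seenByValue = array_map($call, $byValue);         // [0, 1, 2]
```

With references, every closure reports the final value of the loop variable, which is the surprise; with copies, each closure reports the index it was created with.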

Are you OK with the change that $this is only stored when needed?

I don't see a problem. Dmitry seems to be very touchy (;-)) about
changing op_arrays but in this case it's only a flag so I don't
see a problem for opcode caches (in contrast to a HashTable where
the opcode cache must actually add code to duplicate that table).

Do you want closures in PHP 5.3?

Since the majority of core developers appear to be against it, I
presume the answer is no.

I will provide a revised patch that incorporates the results of the
following discussion for 5_3 and HEAD once consensus or at least a
majority regarding the remaining issues is reached. I will also rewrite
the proposal to reflect the discussion results and adjust the tests.
After that, I hope that someone will commit at least the HEAD version.

Regards,
Christian

17 years ago by Andi Gutmans

I am not sure I like the idea of explicit $this.
Frankly, I really doubt we will have serious resource issues as a result of holding a reference to $this for too long. I think we're looking to solve a non-issue.

I'd prefer to take the approach which is easier to use and cleaner from a user perspective. If we ever discover this is a huge issue then we can always add support for something like "static function() {}" which does not hold a $this reference.
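
A rough sketch of what the "static function() {}" idea looks like from userland, assuming the static prefix on closures as current PHP implements it. The Counter class is hypothetical, and whether the patch under discussion would behave the same way is an assumption.

```php
<?php
final class Counter
{
    public $count = 0;

    // Ordinary closure: created in an instance method, so it carries $this.
    public function incrementer()
    {
        return function () { return ++$this->count; };
    }

    // Static closure: no $this is stored with it.
    public function adder()
    {
        return static function ($x, $y) { return $x + $y; };
    }
}

$counter = new Counter();

// ReflectionFunction::getClosureThis() reports the object a closure is bound to, if any.
$boundThis = (new ReflectionFunction($counter->incrementer()))->getClosureThis();
$staticThis = (new ReflectionFunction($counter->adder()))->getClosureThis();
```

Here $boundThis is the Counter instance and $staticThis is null, so only the ordinary closure keeps the object reachable.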

Andi

-----Original Message-----
From: Dmitry Stogov
Sent: Friday, June 27, 2008 10:02 AM
To: Christian Seiler
Cc: php-dev List; Andi Gutmans; Stas Malyshev
Subject: Re: [PHP-DEV] [PATCH] [RFC] Closures and lambda functions in PHP

17 years ago by Alexander Wagner

I am not sure I like the idea of explicit $this.
[..] If we ever discover this is a huge issue

Implicit unoptimized $this is never going to be a "huge issue", because it is
not badly broken, only subtly.
My crystal ball tells me that the following is going to happen:

Many people who use closures will experience a slight increase in memory
consumption due to closures. They won't notice though, unless they run on a
memory_limit with little margin for error.

Developers with objects that use a lot of memory (e.g. because they contain
large strings or hold references to many other objects) will experience a
significant increase in memory consumption (constant or linear) that may be
enough to cause noticeable performance degradation and pop quite a few memory
limits. This is going to be relatively rare, but it will happen regularly.
PHP has a lot of users using shared hosting services.

A very small number of developers will manage to implement an algorithm
whose space complexity is changed from O(1) to O(n) or worse, which can
easily cause a catastrophic increase in memory consumption, even when PHP is
operating without memory limit.

Also, most of the developers to whom this happens will either not notice at
all or be unable to give accurate feedback, so if this does become a
significant problem, you may never find out.
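
The retention effect described above can be observed in current PHP with a destructor counter. This is a minimal sketch; the Tracked class and its counter are hypothetical, and the bound closure deliberately uses $this so that the binding does not depend on any engine optimization.

```php
<?php
final class Tracked
{
    public static $destroyed = 0;
    public $id = 42;

    public function __destruct()
    {
        self::$destroyed++;
    }

    // Uses $this, so the closure keeps a reference to the object.
    public function bound()
    {
        return function () { return $this->id; };
    }

    // Static closure: holds no reference to the object.
    public function unbound()
    {
        return static function () { return 42; };
    }
}

$first = new Tracked();
$static = $first->unbound();
unset($first);                       // nothing else refers to the object: it is destroyed now
$afterStatic = Tracked::$destroyed;

$second = new Tracked();
$closure = $second->bound();
unset($second);                      // the closure still holds $this: the object survives
$whileHeld = Tracked::$destroyed;

unset($closure);                     // releasing the closure finally releases the object
$afterRelease = Tracked::$destroyed;
```

The counter stays at 1 while the bound closure is alive and only reaches 2 once the closure itself is released, which is how a long-lived closure can pin a large object in memory.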

we can always add support for something like "static function() {}"

That kind of implies that the lambda-function is part of the class because it
was created inside the class. I don't like this notion. Membership in the
class should be reserved for actual members.

You could start with explicit $this, which is inconvenient but safe.
If enough developers complain about the inconvenience, you have a lot of time
to think about how to implement an optimized implicit $this. I don't see any
BC-problems here.

"Start safe, optimize later" seems sounder than "Start subtly broken, fix
later".

Gesundheit
Wag

--
The animals of Australia can be divided into three categories: Poisonous, Odd,
and Sheep.

Douglas Adams

17 years ago by Andi Gutmans

See below:

-----Original Message-----
From: Alexander Wagner [mailto:waqner@gmx.net]
Sent: Friday, June 27, 2008 12:31 PM
To: internals@lists.php.net
Cc: Andi Gutmans; Dmitry Stogov; Christian Seiler; Stas Malyshev
Subject: Re: [PHP-DEV] [PATCH] [RFC] Closures and lambda functions in PHP

Implicit unoptimized $this is never going to be a "huge issue", because it is
not badly broken, only subtly.
My crystal ball tells me that the following is going to happen:

Many people who use closures will experience a slight increase in memory
consumption due to closures. They won't notice though, unless they run on a
memory_limit with little margin for error.

Developers with objects that use a lot of memory (e.g. because they contain
large strings or hold references to many other objects) will experience a
significant increase in memory consumption (constant or linear) that may be
enough to cause noticeable performance degradation and pop quite a few memory
limits. This is going to be relatively rare, but it will happen regularly.
PHP has a lot of users using shared hosting services.

A very small number of developers will manage to implement an algorithm
whose space complexity is changed from O(1) to O(n) or worse, which can
easily cause a catastrophic increase in memory consumption, even when PHP is
operating without memory limit.

Also, most of the developers to whom this happens will either not notice at
all or be unable to give accurate feedback, so if this does become a
significant problem, you may never find out.

we can always add support for something like "static function() {}"

That kind of implies that the lambda-function is part of the class because it
was created inside the class. I don't like this notion. Membership in the
class should be reserved for actual members.

Uhm, but we are already discussing closures which belong to the class. In fact, what I suggest is not different from what the current proposal is. I actually think this is much cleaner and more straightforward than any constructs which rely on some explicit $this to be passed.

You could start with explicit $this, which is inconvenient but safe.
If enough developers complain about the inconvenience, you have a lot of time
to think about how to implement an optimized implicit $this. I don't see any
BC-problems here.

"Start safe, optimize later" seems sounder than "Start subtly broken, fix
later".

I don't really consider it broken and I don't think that this additional syntax is what I'd call start safe. I already made a suggestion for how to fix this down the road. And if there's a preference to do it today I don't mind having the "static function" syntax today. I think it's extremely consistent with what PHP already does.

Cheers,

Andi

17 years ago by Dmitry Stogov

Thanks for "static function ()" idea, it's much better and consistent
than "function () use ($this)". I think we should go this way.

Do you see any other issues with the patch?

Thanks. Dmitry.

17 years ago by Alexander Wagner

Fixed ref/noref issues

Works for me. See attached test 11.

Gesundheit
Wag

--
For sale: baby shoes, never worn.

flash fiction by Hemingway

17 years ago by Lars Strojny

Hi Christian,

thanks again for your (and Dmitry's) great work on making closures a
part of PHP.

Am Donnerstag, den 26.06.2008, 18:23 +0200 schrieb Christian Seiler:

I have not added any __invoke() magic to normal objects. This is
mainly due to the simple reason that adding that would not help
a closure implementation at all. Closures need some engine internal
magic (use a dynamically created op_array instead of looking one up,
setting the correct class scope and setting the correct EG(this)). And
as I said: I want to stick with the closure basics for now.

I understand that you want to keep your proposal basic. However I have
the feeling that we need a complete implementation to make closures
really beneficial for our users. I would consider the following features
to be central for a feature complete implementation:

  * Class::__invoke() to allow functors[1]. The class "Closure" in
    your proposal should also implement that method to make
    `method_exists()` and ext/reflection behave.
  * Change the behaviour how method calls are resolved. Method calls
    on invokable objects (closures or functors) should work.

cu, Lars

[1] http://en.wikipedia.org/wiki/Functor
[2] Example for a closure assigned to a property:
class View
{
    public $escape;
}

$view = new View();
$view->escape = function($string) {
    return htmlentities($string, ENT_QUOTES, 'UTF-8');
};

// Relies on the proposed method-call resolution for invokable properties
echo $view->escape("<script>alert(1)</script>");
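
For comparison, a minimal functor sketch using the __invoke() magic method as current PHP implements it. HtmlEscaper is a hypothetical class, and the parenthesised call is today's spelling for calling an invokable property, not the method-call resolution proposed above.

```php
<?php
// Hypothetical functor: an object that can be called like a function.
final class HtmlEscaper
{
    public function __invoke($string)
    {
        return htmlentities($string, ENT_QUOTES, 'UTF-8');
    }
}

class View
{
    public $escape;
}

$view = new View();
$view->escape = new HtmlEscaper();

// Calling the functor directly.
$direct = (new HtmlEscaper())('<b>');

// The parentheses make PHP read the property first, then invoke the object.
$viaProperty = ($view->escape)('<b>');

// Functors satisfy is_callable(), so they fit anywhere a callback is accepted.
$isCallable = is_callable($view->escape);
```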

17 years ago by Stanislav Malyshev

Hi!

  * Class::__invoke() to allow functors[1]. The class "Closure" in
    your proposal should also implement that method to make
    `method_exists()` and ext/reflection behave.
  * Change the behaviour how method calls are resolved. Method calls
    on invokable objects (closures or functors) should work.

And to close the circle, add __invoke to ReflectionFunctionAbstract and
implement it in ReflectionFunction and ReflectionMethod (here we might
have trouble with specifying object, so need to do some thinking on it -
maybe we'll need another class or augment ReflectionMethod somehow?)

echo $view->escape("<script>alert(1)</script>");

If we use this syntax, and $view->escape is not defined, should we call
__call or __get?

Stanislav Malyshev, Zend Software Architect
stas@zend.com http://www.zend.com/
(408)253-8829 MSN: stas@zend.com

17 years ago by Lars Strojny

Hi Stas,

Am Sonntag, den 29.06.2008, 15:20 -0700 schrieb Stanislav Malyshev:
[...]

If we use this syntax, and $view->escape is not defined, should we
call __call or __get?

That's indeed a good question. Calling __get() after resolving
$view->escape as a property would break BC. Maybe we could do the
following:
a) method exists?
b) invokable property exists?
c) __get() exists and returns invokable object?
d) __call() exists?
e) trigger error

The important thing with c) is that we fall back to __call() if __get()
returned something that is not invokable, to make sure currently working
objects are still working in the future.
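
The lookup order a) to e) can be emulated in userland to see how it would behave. This is only a sketch under stated assumptions: call_resolved() and is_invokable() are hypothetical helpers, not engine functions, and the real change would live in the method-call handler rather than in PHP code.

```php
<?php
// Hypothetical helper: true for closures and objects implementing __invoke().
function is_invokable($value)
{
    return is_object($value) && is_callable($value);
}

// Hypothetical helper emulating the proposed lookup order a) to e).
function call_resolved($object, $name, array $args = [])
{
    // a) a real method wins
    if (method_exists($object, $name)) {
        return call_user_func_array([$object, $name], $args);
    }
    // b) an existing property holding an invokable object
    if (property_exists($object, $name) && is_invokable($object->$name)) {
        return call_user_func_array($object->$name, $args);
    }
    // c) __get() exists and returns an invokable object
    if (method_exists($object, '__get')) {
        $candidate = $object->__get($name);
        if (is_invokable($candidate)) {
            return call_user_func_array($candidate, $args);
        }
        // anything else falls through to d), keeping old objects working
    }
    // d) __call() exists
    if (method_exists($object, '__call')) {
        return $object->__call($name, $args);
    }
    // e) nothing matched
    throw new BadMethodCallException('Call to undefined method ' . get_class($object) . '::' . $name . '()');
}

class WithProperty
{
    public $escape;
}

class Legacy
{
    public function __get($name) { return 'not invokable'; }
    public function __call($name, $args) { return '__call(' . $name . ')'; }
}

$withProperty = new WithProperty();
$withProperty->escape = function ($s) { return strtoupper($s); };

$fromProperty = call_resolved($withProperty, 'escape', ['abc']); // step b)
$fromCall = call_resolved(new Legacy(), 'escape', ['abc']);      // c) fails, d) answers
```

The Legacy case is the BC scenario: its __get() returns something that cannot be invoked, so the call still ends up in __call() as it does today.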

cu, Lars

17 years ago by troels knak-nielsen

Hi Stas,

Am Sonntag, den 29.06.2008, 15:20 -0700 schrieb Stanislav Malyshev:
[...]

If we use this syntax, and $view->escape is not defined, should we
call __call or __get?

That's indeed a good question. Calling __get() after resolving
$view->escape as a property would break BC. Maybe we would do the

I really think the only sane thing to do is to invoke __call. Since
lambdas are first-class, it would make sense to get rid of __call
entirely, but as it's already there, I would say that we should
preserve BC. With the current behaviour (invoke __call), it's still
possible to delegate to a lambda from within the __call method.
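
A short sketch of that delegation, assuming the View class from the earlier example; the __call() body is an illustration rather than anything taken from the patch.

```php
<?php
class View
{
    public $escape;

    // Invoked because no method named $name exists on the class.
    public function __call($name, $args)
    {
        // Delegate to a closure stored in a property of the same name.
        if (isset($this->$name) && $this->$name instanceof Closure) {
            return call_user_func_array($this->$name, $args);
        }
        throw new BadMethodCallException('Call to undefined method ' . __CLASS__ . '::' . $name . '()');
    }
}

$view = new View();
$view->escape = function ($string) {
    return htmlentities($string, ENT_QUOTES, 'UTF-8');
};

$escaped = $view->escape('<script>alert(1)</script>');
```

The engine keeps its existing behaviour, and each class decides for itself whether properties holding lambdas should be callable as methods.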

--
troels

17 years ago by Marcus Boerger

Hello Stanislav,

Monday, June 30, 2008, 12:20:15 AM, you wrote:

Hi!

  * Class::__invoke() to allow functors[1]. The class "Closure" in
    your proposal should also implement that method to make
    `method_exists()` and ext/reflection behave.
  * Change the behaviour how method calls are resolved. Method calls
    on invokable objects (closures or functors) should work.

And to close the circle, add __invoke to ReflectionFunctionAbstract and
implement it in ReflectionFunction and ReflectionMethod (here we might
have trouble with specifying object, so need to do some thinking on it -
maybe we'll need another class or augment ReflectionMethod somehow?)

Actually a pretty good idea :-) Callable comes to mind if we really need
more names. But a ReflectionMethod could be a static method as well as a
static closure. So I think it should throw an exception in case an
instance is missing, just as it would do for non static methods.
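
The exception behaviour referred to here can be seen with the existing ReflectionMethod::invoke() method. This is a sketch using the Reflection API as it exists in current PHP, not the __invoke addition being proposed; the Greeter class is hypothetical.

```php
<?php
class Greeter
{
    public function hello($name) { return 'Hello, ' . $name; }
    public static function shout($name) { return strtoupper($name); }
}

$instanceMethod = new ReflectionMethod('Greeter', 'hello');
$staticMethod = new ReflectionMethod('Greeter', 'shout');

// Non-static methods need an instance as the first argument.
$withInstance = $instanceMethod->invoke(new Greeter(), 'Stas');

// Static methods are invoked with null in place of the instance.
$fromStatic = $staticMethod->invoke(null, 'stas');

// A missing instance for a non-static method raises a ReflectionException.
$missingInstanceThrows = false;
try {
    $instanceMethod->invoke(null, 'Stas');
} catch (ReflectionException $e) {
    $missingInstanceThrows = true;
}
```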

marcus

echo $view->escape("<script>alert(1)</script>");

If we use this syntax, and $view->escape is not defined, should we call
__call or __get?

Stanislav Malyshev, Zend Software Architect
stas@zend.com http://www.zend.com/
(408)253-8829 MSN: stas@zend.com

Best regards,
Marcus

17 years ago by Markus Fischer

Since some raised issues with the word "lexical", what do people think to just
re-use the (afaik deprecated) "var" keyword, so we won't need a new keyword in
the chain.

cheers,

Markus

17 years ago by Christian Seiler

Hi!

Since some raised issues with the word "lexical", what do people think
to just re-use the (afaik deprecated) "var" keyword, so we won't need a
new keyword in the chain.

That would be quite confusing IMHO, since JavaScript uses 'var' for the
exact opposite - to declare variables that are local and thus not
taken from the parent scope.

Regards,
Christian

17 years ago by troels knak-nielsen

Since some raised issues with the word "lexical", what do people think to
just re-use the (afaik deprecated) "var" keyword, so we won't need a new
keyword in the chain.

What exactly is the problem with "lexical"? I find it quite
descriptive, and I don't see how it could be confused with any other
keywords?

17 years ago by Wez Furlong

Just to chime in on this thread; I like your implementation and (after
reading through all the other comments so far), prefer the lexical
keyword to import variables.

As I've said before, the closure aspect of this is the hardest to gel
into PHP, which deliberately avoids inheriting scopes. Since everyone
has grown up explicitly managing this via the global keyword, I think
it makes a lot of sense to use similar syntax for getting at those
lexical values.

I'm +1 for inclusion of this into the next release of PHP (post 5.3),
and like Andrei, would love there to be a first class callable type
for dynamic invocation of "regular" functions and methods.
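
As a rough illustration of what such a callable type can look like, current PHP offers Closure::fromCallable(), which wraps a regular function or method in a Closure object. This is a sketch of that existing API, not of anything in the patch; the Formatter class is hypothetical.

```php
<?php
class Formatter
{
    public function wrap($s) { return '[' . $s . ']'; }
}

// A regular function and a regular method, each turned into a Closure object.
$upper = Closure::fromCallable('strtoupper');
$wrap = Closure::fromCallable([new Formatter(), 'wrap']);

// Both can now be passed around and invoked like any other closure.
$result = $wrap($upper('abc'));
$bothAreClosures = ($upper instanceof Closure) && ($wrap instanceof Closure);
```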

--Wez.

17 years ago by Lukas Kahwe Smith

I'm +1 for inclusion of this into the next release of PHP (post 5.3),
and like Andrei, would love there to be a first class callable type
for dynamic invocation of "regular" functions and methods.

Just a side note about the timing. I know some people have argued that
this should go into 5.3 since it could be the last PHP 5 minor
release. I am very sure we will see a 5.4. There is also the traits
patch that is more or less ready (well, Andi/Marcus still need to make
up their minds on how, if at all, the current implementation needs to be
expanded).

regards,
Lukas Kahwe Smith
mls@pooteeweet.org

17 years ago by Sebastian Bergmann

Lukas Kahwe Smith wrote:

I am very sure we will see a 5.4. There is also the traits patch
that is more or less ready

And the switch from bison to lemon is also on the agenda for PHP 5.4.

--
Sebastian Bergmann http://sebastian-bergmann.de/
GnuPG Key: 0xB85B5D69 / 27A7 2B14 09E4 98CD 6277 0E5B 6867 C514 B85B 5D69

17 years ago by Marcus Boerger

Hello Sebastian,

ok, even though I just wrote differently. It appears to me, after reading
the rest of the thread, that I should maybe give in and favor a 5.4
instead.

marcus

Wednesday, June 25, 2008, 10:42:50 AM, you wrote:

Lukas Kahwe Smith wrote:

I am very sure we will see a 5.4. There is also the traits patch
that is more or less ready

And the switch from bison to lemon is also on the agenda for PHP 5.4.

--
Sebastian Bergmann http://sebastian-bergmann.de/
GnuPG Key: 0xB85B5D69 / 27A7 2B14 09E4 98CD 6277 0E5B 6867 C514 B85B 5D69

Best regards,
Marcus

[PATCH] [RFC] Closures and lambda functions in PHP

INTRODUCTION

PROPOSED PATCH

Userland perspective

Zend internal perspective

The patch

BC BREAKS

CAVEATS / POSSIBLE WTFS

FINAL THOUGHTS

Maybe we could make some object handler so that $object($foo) would work and treat object as "functional object" called on $foo and then have both reflection and closure object implement it?

Please do not consider this to be opinion about (or against) the patch - I think the idea is good and from preliminary glance the implementation is very nice too, but IMHO we just can not have everything in one release.

lexical in the proposal binds to creator's scope, not caller's scope, as I understood. Anyway, binding to caller's immediate scope doesn't seem that useful since you could just pass it as a parameter when calling.

This would work for $var, but what about $$var and various other ways of indirect variable access?

In any case I think we don't need to waste 2 bytes (or more with alignment) on something that's essentially 2 bits. I know it's nitpicking, but every little bit helps :) Of course, if we drop the flags the point is moot.

I wouldn't spend too much thought on making lexical work like global. global is for different purpose (and with $GLOBALS is obsolete anyway :)

in zend_do_fetch_lexical_variable:

and make sure zend_do_assign can live with NULL for first param

I think implicitly using $this when it's referred to is much better. $this is very special variable so it deserves special treatment. If we'd need to spare a couple of bytes for that - that's not too much to pay.

If we use this syntax, and $view->escape is not defined, should we call __call or __get?
