[Discussion] Callable types via Interfaces

2 years ago by Larry Garfield

Hi folks. This is a pre-discussion, in a sense, before a formal RFC. Nicolas Grekas and I have been kicking around some ideas for how to address the desire for typed callables, and have several overlapping concepts to consider. Before going down the rabbit hole on any of them we want to gauge the general feeling about the approaches to see what is worth pursuing.

We have three "brain dump" RFCs on this topic, although these are all still in super-duper early stages so don't sweat the details in them at this point. We just want to discuss the basic concepts, which I have laid out below.

https://wiki.php.net/rfc/allow_casting_closures_into_single-method_interface_implementations
https://wiki.php.net/rfc/allow-closures-to-declare-interfaces-they-implement
https://wiki.php.net/rfc/structural-typing-for-closures

The problem

function takeTwo(callable $c): int
{
return $c(1, 2);
}

Right now, we have no way to statically enforce that $c is a callable that takes 2 ints and returns an int. We can document it, but that's it.

There is one loophole, in that an interface may require an __invoke() method:

interface TwoInts
{
public function __invoke(int $a, int $b): int;
}

And then a class may implement TwoInts, and takeTwo() can type against TwoInts. However, that works only for classes, which are naturally considerably more verbose than a simple closure and represent only a subset of the possible callable types.
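
For concreteness, here is a minimal sketch of that loophole in today's PHP. The Adder class is a made-up name for illustration; the point is only that a class is accepted while a closure with an identical signature is not.

```php
<?php
declare(strict_types=1);

interface TwoInts
{
    public function __invoke(int $a, int $b): int;
}

// Hypothetical example class; any class with a matching __invoke() would do.
final class Adder implements TwoInts
{
    public function __invoke(int $a, int $b): int
    {
        return $a + $b;
    }
}

function takeTwo(TwoInts $c): int
{
    return $c(1, 2);
}

$viaClass = takeTwo(new Adder());

// A plain closure with the same signature is rejected today,
// because \Closure does not implement TwoInts.
try {
    takeTwo(fn(int $a, int $b): int => $a + $b);
    $closureAccepted = true;
} catch (\TypeError $e) {
    $closureAccepted = false;
}
```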

The usual discussion has involved a way to specify a callable type's signature, like so:

function takeTwo(callable(int $a, int $b): int $c)
{
return $c(1, 2);
}

But that runs quickly into the problem of verbosity, reusability, and type aliases, and the discussion usually dies there.

The alternative

What we propose is to instead lean into the interface approach. Specifically, recall that all closures in PHP are actually implemented as classes in the engine. That is:

$f = fn(int $x, int $y): int => $x + $y;

actually turns into (approximately) this in the engine:

$f = new class extends \Closure
{
public function __invoke(int $x, int $y): int
{
return $x + $y;
}
}

(It doesn't do syntax translation but that's effectively what the engine does.)

So all that's really missing is a way for arbitrary closures to denote that they implement an interface, and then they can be used wherever an interface is required. That neatly sidesteps the verbosity and reusability issues, and since interfaces are already well-understood there's no need to wait for type aliases.

It would not support the old-style funky callables like a function string or [$obj, 'method'], but with the advent of first-class-callables those are no longer recommended anyway so not supporting them is probably a good thing.
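
As a rough illustration of the difference (Greeter is a hypothetical class), old-style callables are plain strings or arrays, while first-class callable syntax (PHP 8.1+) and Closure::fromCallable() both give real \Closure objects, which is what an interface-based approach would operate on:

```php
<?php
declare(strict_types=1);

// Hypothetical class, used only to have a method to point at.
final class Greeter
{
    public function greet(string $name): string
    {
        return "Hello, $name";
    }
}

$obj = new Greeter();

// Old-style callables: a function-name string and an [object, method] array.
$old1 = 'strlen';
$old2 = [$obj, 'greet'];

// First-class callable syntax yields \Closure objects.
$new1 = strlen(...);
$new2 = $obj->greet(...);

// Legacy callables can still be converted explicitly when needed.
$converted = \Closure::fromCallable($old2);
```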

The same would also work for property types, which can easily type against an interface. That would mostly sidestep the current limitation of typing a property as callable, since you could provide a more-specific type instead for a double-win.
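
A small sketch of the property case as it works today, assuming the TwoInts interface from earlier; Calculator is a hypothetical class. Under any of the proposals, the anonymous class here could in principle be replaced by a closure.

```php
<?php
declare(strict_types=1);

interface TwoInts
{
    public function __invoke(int $a, int $b): int;
}

// Hypothetical class for illustration.
final class Calculator
{
    // "private callable $op" would be a compile-time error today;
    // an interface type is allowed and is more specific.
    public function __construct(private TwoInts $op) {}

    public function run(int $a, int $b): int
    {
        // Parentheses are required to invoke a callable held in a property.
        return ($this->op)($a, $b);
    }
}

$calc = new Calculator(new class implements TwoInts {
    public function __invoke(int $a, int $b): int
    {
        return $a * $b;
    }
});

$product = $calc->run(3, 4);
```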

The options

There's three ways we've come up with that this design could be implemented. In concept they're not mutually exclusive, so we could do one, two, or three of these. Figuring out which approach would get the most support is the purpose of this thread.

castTo

The first is to add a castTo() method to Closure. That would produce a new object that has the same logic as the closure, but explicitly implements the interface.

That is, this:

$fn2 = $fn->castTo(TwoInts::class);

Is roughly logically equivalent to:

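$fn2 = new class($fn) implements TwoInts {
// "callable" is not a valid property type, so the wrapper holds a \Closure.
public function __construct(private \Closure $fn) {}

public function __invoke(int $a, int $b): int
{
    // Parentheses are needed to call the property rather than a method named fn().
    return ($this->fn)(...func_get_args());
}

};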

(Whether that's what the implementation actually does or if it's smarter about it is an open question.)

In theory, this would also support any single-method interface, not just those using __invoke(). The other options below would not support that.

This does have a number of open edge cases, like what to do with a closure that is already bound to an object.
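
To make the idea tangible in current PHP, here is a userland stand-in. castToTwoInts() is a hypothetical helper, not the proposed API, and it is hard-wired to one interface; unlike a real castTo(), it does not validate the closure's signature up front, so a mismatch would only surface when the wrapper is called.

```php
<?php
declare(strict_types=1);

interface TwoInts
{
    public function __invoke(int $a, int $b): int;
}

// Hypothetical userland stand-in for the proposed $fn->castTo(TwoInts::class).
function castToTwoInts(\Closure $fn): TwoInts
{
    return new class($fn) implements TwoInts {
        public function __construct(private \Closure $fn) {}

        public function __invoke(int $a, int $b): int
        {
            return ($this->fn)($a, $b);
        }
    };
}

function takeTwo(TwoInts $c): int
{
    return $c(1, 2);
}

// Works for user-defined closures...
$sum = takeTwo(castToTwoInts(fn(int $x, int $y): int => $x + $y));

// ...and for first-class callables of built-in functions.
$quotient = castToTwoInts(intdiv(...))(7, 2);
```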

Function interfaces

The second option is to allow closures to declare up front what interfaces they implement. So:

$f = fn(int $x, int $y): int implements TwoInts => $x + $y;

This has the advantage of being more statically analyzable (both visually and for parsers). It may also be more performant (in theory), as it could translate almost trivially to:

$f = new class extends \Closure implements TwoInts
{
public function __invoke(int $x, int $y): int
{
return $x + $y;
}
}

The downside is that it only works for user-defined closures that declare their support up-front, statically. Something like strlen(...) or strtr(...) wouldn't work. It's also a bit verbose, though using castTo() directly on the closure is of similar length:

$f = (fn(int $x, int $y): int => $x + $y)->castTo(TwoInts::class);

Structural typing for closures

The third option would necessitate having similar logic in the engine to the first. In this case, we take a "structural typing" approach to closures; that is, "if the types match at runtime, it must be OK." This is probably closest to the earlier proposals for a callable(int $x, int $y): int syntax (which would by necessity have to be structural), but essentially uses interfaces as the type alias.

function takeTwo(TwoInts $c): int
{
return $c(1, 2);
}

$result = takeTwo(fn(int $x, int $y): int => $x + $y);

In this approach, no up-front work is needed. A callable/closure that conforms to an interface with __invoke() "just works" when it's used. Essentially this would involve detecting that the argument is a callable and the parameter is an interface with __invoke(), then trying to castTo() that interface. If it works, pass the result. If not, fail in some way.

This approach would support any arbitrary closure, including strtr(...) style FCC closures. A closure would not need to pre-declare its support ("nominal typing"), which makes it more flexible. Callables "just work."

The downside here is complexity. Currently, class type conformance is determined ahead of time, and a type check is just a lookup on a list on the class. This would necessitate loading the interface (possibly autoloading it) within the function call action, attempting the cast operation, and handling the potential fault. It also means it would only happen at the function boundary or property assignment; a closure would never work with instanceof, class_implements, etc., because to those it would still be "just" a \Closure.
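
To give a feel for the kind of work the engine would have to do at the call boundary, here is a deliberately naive userland approximation using Reflection. conformsTo() is a hypothetical helper; it compares type declarations as strings and ignores variance, default values, and variadics, all of which a real implementation would need to handle.

```php
<?php
declare(strict_types=1);

interface TwoInts
{
    public function __invoke(int $a, int $b): int;
}

// Hypothetical helper: a simplified structural check, not the proposed algorithm.
function conformsTo(\Closure $fn, string $interface): bool
{
    $want = new \ReflectionMethod($interface, '__invoke');
    $have = new \ReflectionFunction($fn);

    if ($have->getNumberOfParameters() !== $want->getNumberOfParameters()) {
        return false;
    }

    $wantParams = $want->getParameters();
    foreach ($have->getParameters() as $i => $param) {
        // A missing type declaration casts to the empty string.
        if ((string) $param->getType() !== (string) $wantParams[$i]->getType()) {
            return false;
        }
    }

    return (string) $have->getReturnType() === (string) $want->getReturnType();
}

$typed      = conformsTo(fn(int $x, int $y): int => $x + $y, TwoInts::class);
$untyped    = conformsTo(fn($x, $y) => $x + $y, TwoInts::class);
$wrongArity = conformsTo(strlen(...), TwoInts::class);
```

Even this toy version has to load the interface and walk both signatures on every check, which hints at where the runtime cost would come from.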

A callable syntax literal (as previously discussed) would have most of the same challenges.

The discussion

So those are the options. We feel that the interface-based approach is strong, and a good way forward for getting typed callables without a bunch of dependent features needed first. These three ways of getting there are all potentially viable (give or take implementation details and edge cases in all cases, as always), all have their own pros and cons, and in concept we could very easily adopt more than one, or combine them into a single RFC.

Before we dig into any of those edge cases, however, we want to throw the question out: Is this general approach even acceptable? Are there implementation challenges to any of them we're not seeing? Would you vote for any or all of these proposals or oppose on principle? Are you interested in helping us implement any of them :-) ?

Please discuss, so we can decide how to proceed toward a real concrete proposal.

--
Larry Garfield
larry@garfieldtech.com

2 years ago by Deleu

On Thu, Apr 20, 2023 at 2:25 PM Larry Garfield larry@garfieldtech.com
wrote:

But that runs quickly into the problem of verbosity, reusability, and type
aliases, and the discussion usually dies there.

Just out of curiosity, was there ever a discussion thread dedicated for
Type aliases? I couldn't find it on externals.io and I was curious to know
what are the challenges there since a lot of time and effort seem to have
been put on trying to sidestep it.

--
Marco Deleu

2 years ago by Larry Garfield

On Thu, Apr 20, 2023 at 2:25 PM Larry Garfield larry@garfieldtech.com
wrote:

But that runs quickly into the problem of verbosity, reusability, and type
aliases, and the discussion usually dies there.

Just out of curiosity, was there ever a discussion thread dedicated for
Type aliases? I couldn't find it on externals.io and I was curious to know
what are the challenges there since a lot of time and effort seem to have
been put on trying to sidestep it.

Not that I recall, on list. There has been discussion on and off in various chat rooms over the last 3-ish years. The general pattern seems to be "that would be cool and useful! But how do we define them, and how does autoloading work for something that's a one liner?" And since no one has a good answer for that, it kinda dies there.

I wouldn't expect a formal discussion thread on it until someone has a solid proposal that addresses those questions that they want to run up the flag pole. (Not that I think that's a good process, but that's how things work right now, mostly.)

--Larry Garfield

2 years ago by Dan Ackroyd

Hi folks. This is a pre-discussion, in a sense, before a formal RFC.

Hi Larry,

The "Allow casting closures into single-method interface
implementations" one seems a complete non-starter, as that seems
really hard to work with. You'd have to do lots of "wiring up" to do
any significant amount of programming.

"Allow Closures to Declare Interfaces they Implement"

That sounds bad as it doesn't allow arbitrary functions to be used as callables.

We feel that the interface-based approach is strong

All three of them are using interfaces...?

But if you mean the "Structural Typing for Closures" one, then I'd
probably agree. But as currently proposed it seems like a hack, that
would be predictably regrettable in a couple of years.

and a good way forward for getting typed callables
without a bunch of dependent features needed first.

Maybe list what you think the dependent features are, so that there
isn't confusion about them, but I suspect that we're going to not
agree on how languages should be designed and evolve.

Although I really want to see typed callables, and other forms of type
aliasing, as they would be huge improvements in being able to write
code that is easy to reason about and maintain, I don't want to see
them as soon as possible, having taken short-cuts against good
language design.

"No is temporary, yes is forever".

cheers
Dan
Ack

2 years ago by Larry Garfield

Hi folks. This is a pre-discussion, in a sense, before a formal RFC.

Hi Larry,

The "Allow casting closures into single-method interface
implementations" one seems a complete non-starter, as that seems
really hard to work with. You'd have to do lots of "wiring up" to do
any significant amount of programming.

That's my concern as well, and part of what led me to suggest the structural typing approach.

"Allow Closures to Declare Interfaces they Implement"

That sounds bad as it doesn't allow arbitrary functions to be used as callables.

Yes. Which is why I think it would make the most sense when combined with one of the other two options, so it's a sort of performance optimization for the general case.

We feel that the interface-based approach is strong

All three of them are using interfaces...?

I'm referring to the general idea of this thread, which is "an interface with __invoke is how you define a callable type." Everything else here is a variation on that basic premise.

But if you mean the "Structural Typing for Closures" one, then I'd
probably agree. But as currently proposed it seems like a hack, that
would be predictably regrettable in a couple of years.

In what way?

and a good way forward for getting typed callables
without a bunch of dependent features needed first.

Maybe list what you think the dependent features are, so that there
isn't confusion about them, but I suspect that we're going to not
agree on how languages should be designed and evolve.

Mainly the type alias question. Every time I see callable types discussed, it immediately sidetracks into "how do we make that less fugly to write, because callable types are naturally very verbose?" That leads directly to typedefs/type aliases, which take one of two forms:

type TwoInts = callable(int $x, int $y): int
type LinkedResponse = ResponseInterface&LinkCollectionInterface

which raises autoloading questions and means a dependency on a type defined in another package, in many cases. Or:

use callable(int $x, int $y): int as TwoInts
use ResponseInterface&LinkCollectionInterface as LinkedResponse

Which would be file-local, much like class "use" statements are. That avoids the autoload and dependency problem, at the cost of having to retype that frickin' thing in every file where it's relevant. In many cases, that could be dozens or hundreds of files repeating that line.

And that's where the discussion usually dies off.

As noted, this is still a form of structural typing, which means the function call process necessarily gets more complex (for callables).

Using existing interfaces for callable definitions side steps the implementation challenges of type aliases/typedefs, since once you have an object tagged with an interface (via any of the mechanisms described), its behavior is already very well-defined and predictable.

Although I really want to see typed callables, and other forms of type
aliasing, as they would be huge improvements in being able to write
code that is easy to reason about and maintain, I don't want to see
them as soon as possible, having taken short-cuts against good
language design.

"No is temporary, yes is forever".

I'm happy to see forward motion on any front. If the result of this thread is that someone gets incentivized to finally figure out callable types for realsies without an interface, I'd sleep happy with that result. But this gives us something concrete to chew on, which we have so far lacked.

--Larry Garfield

2 years ago by michal.brzuchalski@gmail.com

There is one loophole, in that an interface may require an __invoke()
method:

interface TwoInts
{
public function __invoke(int $a, int $b): int;
}

I was playing around with the code and parser for this in 2020 but my idea
was to introduce a new syntax that is inspired by C# - Delegates [1]

delegate Reducer (?int $sum, int $item = 0): int;

class Foo implements Reducer {
public function __invoke(?int $sum, int $item = 0): int { }
}
function reduce(Reducer $reducer) {
var_dump($reducer(0, 5));
}
reduce(new Foo());
reduce(fn(?int $sum, int $item = 0): int => 8);

At the same time, I assumed structural typing for closures would be used.
I assumed the delegate will resolve into
interface Reducer {
public function __invoke(?int $sum, int $item = 0): int;
}

I also noticed that once checked closure doesn't have to be checked against
the argument types and return type because it won't change which gives some
possibility to cache this type check.
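
A rough userland sketch of that caching idea, under the assumption that results can be keyed by closure object and interface name. ConformanceCache and its parameter-count-only checker are hypothetical placeholders; a WeakMap is used so cached entries do not keep closures alive.

```php
<?php
declare(strict_types=1);

interface TwoInts
{
    public function __invoke(int $a, int $b): int;
}

// Hypothetical cache: closure object => [interface name => bool].
final class ConformanceCache
{
    private \WeakMap $results;
    public int $misses = 0;

    public function __construct(private \Closure $checker)
    {
        $this->results = new \WeakMap();
    }

    public function check(\Closure $fn, string $interface): bool
    {
        $known = isset($this->results[$fn]) ? $this->results[$fn] : [];
        if (!array_key_exists($interface, $known)) {
            // First time this closure/interface pair is seen: run the real check.
            $this->misses++;
            $known[$interface] = ($this->checker)($fn, $interface);
            $this->results[$fn] = $known;
        }
        return $known[$interface];
    }
}

$cache = new ConformanceCache(
    // Placeholder check that only compares parameter counts.
    fn(\Closure $f, string $i): bool =>
        (new \ReflectionFunction($f))->getNumberOfParameters()
        === (new \ReflectionMethod($i, '__invoke'))->getNumberOfParameters()
);

$add = fn(int $x, int $y): int => $x + $y;
$first  = $cache->check($add, TwoInts::class);
$second = $cache->check($add, TwoInts::class);
```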

The usual discussion has involved a way to specify a callable type's
signature, like so:

function takeTwo(callable(int $a, int $b): int $c)
{
return $c(1, 2);
}

But that runs quickly into the problem of verbosity, reusability, and type
aliases, and the discussion usually dies there.

This is why initially I thought about Delegates as in C# there are not type
aliases.
The delegate essentially resolves to an interface with __invoke(?int $sum, int $item = 0): int method.

Structural typing for closures

The third option would necessitate having similar logic in the engine to
the first. In this case, we take a "structural typing" approach to
closures; that is, "if the types match at runtime, it must be OK." This is
probably closest to the earlier proposals for a callable(int $x, int $y): int syntax (which would by necessity have to be structural), but
essentially uses interfaces as the type alias.

function takeTwo(TwoInts $c): int
{
return $c(1, 2);
}

$result = takeTwo(fn(int $x, int $y): int => $x + $y);

I'd love to see this happening.

[1]
https://learn.microsoft.com/en-us/dotnet/csharp/programming-guide/delegates/

Cheers,
Michał Marcin Brzuchalski

2 years ago by Larry Garfield

Hi

There is one loophole, in that an interface may require an __invoke()
method:

interface TwoInts
{
public function __invoke(int $a, int $b): int;
}

I was playing around with the code and parser for this in 2020 but my idea
was to introduce a new syntax that is inspired by C# - Delegates [1]

delegate Reducer (?int $sum, int $item = 0): int;

class Foo implements Reducer {
public function __invoke(?int $sum, int $item = 0): int { }
}
function reduce(Reducer $reducer) {
var_dump($reducer(0, 5));
}
reduce(new Foo());
reduce(fn(?int $sum, int $item = 0): int => 8);

At the same time, I assumed structural typing for closures would be used.
I assumed the delegate will resolve into
interface Reducer {
public function __invoke(?int $sum, int $item = 0): int;
}

This is effectively the same as the "typedef" version of callable types from my email to Dan a moment ago. See there for the challenges.

--Larry Garfield

2 years ago by David Gebler

On Thu, Apr 20, 2023 at 6:25 PM Larry Garfield larry@garfieldtech.com
wrote:

The options

There's three ways we've come up with that this design could be
implemented. In concept they're not mutually exclusive, so we could do
one, two, or three of these. Figuring out which approach would get the
most support is the purpose of this thread.

My initial feelings based on the options laid out is that anything which
can't support FCCs in the manner of strlen(...) is probably a non-starter
in terms of language design. Changes like this are fundamentally about
making things simpler, more concise and more convenient for users, not
drip-feeding a stream of "and here's yet another way of working with..."
features across releases.

Structural typing option seems like the easiest to implement in the engine
(correct me if I'm wrong?) and probably the best syntax for the user within
the interface approach. But then do we really want to introduce new runtime
checks and complexity when the general trend of the language has been in
the opposite direction? I imagine probably not.

So out of the three, I lean towards adding castTo() to Closure and it maybe
raises a to-be-determined Throwable if the closure is already bound? It's
not as friendly for the user as the other options but it seems like the
most workable, it delivers value and it most closely fits within the
existing way of working with all types of closure today.

-Dave

2 years ago by Larry Garfield

On Thu, Apr 20, 2023 at 6:25 PM Larry Garfield larry@garfieldtech.com
wrote:

The options

There's three ways we've come up with that this design could be
implemented. In concept they're not mutually exclusive, so we could do
one, two, or three of these. Figuring out which approach would get the
most support is the purpose of this thread.

My initial feelings based on the options laid out is that anything which
can't support FCCs in the manner of strlen(...) is probably a non-starter
in terms of language design. Changes like this are fundamentally about
making things simpler, more concise and more convenient for users, not
drip-feeding a stream of "and here's yet another way of working with..."
features across releases.

Structural typing option seems like the easiest to implement in the engine
(correct me if I'm wrong?) and probably the best syntax for the user within
the interface approach. But then do we really want to introduce new runtime
checks and complexity when the general trend of the language has been in
the opposite direction? I imagine probably not.

Our assumption is the opposite: Structural typing would be the hardest to implement, and have the largest performance risk, but be the nicest/most convenient for developers.

--Larry Garfield

2 years ago by Levi Morrison via internals

But that runs quickly into the problem of verbosity, reusability, and type aliases, and the discussion usually dies there.

I'm going to stop here. Two big things:

1. I think reducing verbosity can be left for the future. We don't
have to solve that right now.
2. I think there's another more important reason previous attempts
failed: they will inevitably burden the programmer without type
inference.

For the moment, let's assume this signature:

function takeTwo(callable(int $x, int $y): int $c);

What happens if I pass a short-closure?

takeTwo(fn ($x, $y) => $x + $y);

I would be annoyed if I had to write the type info, but particularly
the return type.

Today, if I just used a static analysis tool, there's no problem:

/** @param callable(int $x, int $y): int $c */
function takeTwo(callable $c);
takeTwo(fn ($x, $y) => $x + $y);

And another reason they failed: callables are going to want
generic types pretty commonly. Think array_filter, array_map, etc.

So, I think these brain dump RFCs are all focusing on the wrong problems.

2 years ago by Deleu

On Thu, Apr 20, 2023 at 8:23 PM Levi Morrison via internals <
internals@lists.php.net> wrote:

I'm going to stop here. Two big things:

I think reducing verbosity can be left for the future. We don't
have to solve that right now.

What happens if I pass a short-closure?

takeTwo(fn ($x, $y) => $x + $y);
I would be annoyed if I had to write the type info, but particularly
the return type.

Sorry for the unhelpful email, but does anybody else see the irony here?
It's just too funny to not be mentioned 😂😂😂😂

--
Marco Deleu

2 years ago by Levi Morrison via internals

I'm going to stop here. Two big things:

I think reducing verbosity can be left for the future. We don't
have to solve that right now.
What happens if I pass a short-closure?
takeTwo(fn ($x, $y) => $x + $y);
I would be annoyed if I had to write the type info, but particularly
the return type.
Sorry for the unhelpful email, but does anybody else see the irony here? It's just too funny to not be mentioned 😂😂😂😂

--
Marco Deleu

Sure, I get that ^_^ But the difference is that there are quite a few
ways we can solve the first kind of verbosity (allowing fn instead of
callable, allowing type aliases which could also be useful for
unions, etc), and only hard-looking options for solving the second one
(static type inference? delayed type checks?)

2 years ago by Ilija Tovilo

Hi Larry and Nicolas!

https://wiki.php.net/rfc/allow_casting_closures_into_single-method_interface_implementations
https://wiki.php.net/rfc/allow-closures-to-declare-interfaces-they-implement
https://wiki.php.net/rfc/structural-typing-for-closures

What we propose is to instead lean into the interface approach. Specifically, recall that all closures in PHP are actually implemented as classes in the engine. That is:

$f = fn(int $x, int $y): int => $x + $y;

actually turns into (approximately) this in the engine:

$f = new class extends \Closure
{
public function __invoke(int $x, int $y): int
{
return $x + $y;
}
}

Just to comment on the technical aspect, I don't think this is
accurate. Closures are indeed objects, but they are all instances of
the same \Closure class. From what Nikita said in the enum RFC,
objects are optimized for size, classes are not. Having different
closures implement different interfaces does mean they probably all
need their own class, or type checks need to account for closures in
some alternative way.
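
This is easy to observe from userland. A quick sketch: every closure, however it is created, reports the same class, and that class is final, so the "new class extends \Closure" desugaring above cannot be written literally in PHP code.

```php
<?php
declare(strict_types=1);

$add    = fn(int $x, int $y): int => $x + $y;
$upper  = strtoupper(...);
$legacy = function (): void {};

// Collect the distinct class names of three unrelated closures.
$classes = array_unique(array_map(
    fn(object $o): string => $o::class,
    [$add, $upper, $legacy]
));

// \Closure is declared final, so userland cannot subclass it.
$isFinal = (new \ReflectionClass(\Closure::class))->isFinal();
```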

Ilija

2 years ago by Nicolas Grekas

Hi all,

https://wiki.php.net/rfc/allow_casting_closures_into_single-method_interface_implementations

https://wiki.php.net/rfc/allow-closures-to-declare-interfaces-they-implement
https://wiki.php.net/rfc/structural-typing-for-closures

Thanks Larry for the nice introduction to those ideas.

Personally, I feel like going with adding Closure::castTo() might provide
the most immediate benefit. I expanded the rationale on the corresponding
RFC and added more examples. I'd appreciate it if all of you reading could
have another look to see if that helps to better understand the proposal.

strtr(...)->castTo(TranslatableInterface::class) is one example of RFC #1
function ($message, $parameters) implements TranslatableInterface is RFC #2

Both RFCs nicely combine together to cover many cases of typed callables.

Then RFC#3 is a bit more adventurous (according to our understanding) but
still desirable as it's essentially about allowing the engine to
tentatively call castTo() from RFC#1 when a closure is passed as argument
while an interface is expected.

We're now wondering if we should start spending time on prototype
implementations for #1 and/or #2, and #3 in this order. Should we consider
a preliminary vote on the topic?

Nicolas

2 years ago by michal.brzuchalski@gmail.com

Hi Nicolas,

wt., 25 kwi 2023, 19:00 użytkownik Nicolas Grekas <
nicolas.grekas+php@gmail.com> napisał:

Hi all,

https://wiki.php.net/rfc/allow_casting_closures_into_single-method_interface_implementations

https://wiki.php.net/rfc/allow-closures-to-declare-interfaces-they-implement

https://wiki.php.net/rfc/structural-typing-for-closures

Thanks Larry for the nice introduction to those ideas.

Personally, I feel like going with adding Closure::castTo() might provide
the most immediate benefit. I expanded the rationale on the corresponding
RFC and added more examples. I'd appreciate it if all of you reading could
have another look to see if that helps to better understand the proposal.

strtr(...)->castTo(TranslatableInterface::class) is one example of RFC #1
function ($message, $parameters) implements TranslatableInterface is RFC #2

Both RFCs nicely combine together to cover many cases of typed callables.

Then RFC#3 is a bit more adventurous (according to our understanding) but
still desirable as it's essentially about allowing the engine to
tentatively call castTo() from RFC#1 when a closure is passed as argument
while an interface is expected.

Personally I don't like this way of shaping callable types. Given examples
are really confusing me. Call of a castTo() with argument representing an
interface with a method is confusing as the method magically appears on a
closure without explicit binding to it! What if an interface has more than
one method? What if I wanna choose which one?

For me personally this goes into wrong direction.

Cheers,
Michał Marcin Brzuchalski

2 years ago by Larry Garfield

Personally I don't like this way of shaping callable types. Given examples
are really confusing me. Call of a castTo() with argument representing an
interface with a method is confusing as the method magically appears on a
closure without explicit binding to it! What if an interface has more than
one method? What if I wanna choose which one?

Multi-method interfaces would be explicitly disallowed and trigger an Error. If providing only a single "method", then there's no logical way for it to satisfy a multi-method interface. But there are ample single-method interfaces around PHPlandia. And a single-method interface has no ambiguity about which method the closure corresponds to.
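
As a sketch of what that guard might look like if written in userland with Reflection (assertSingleMethodInterface() and the Repository interface are hypothetical names, and the exact error type and message are assumptions):

```php
<?php
declare(strict_types=1);

interface TwoInts
{
    public function __invoke(int $a, int $b): int;
}

// Hypothetical multi-method interface for illustration.
interface Repository
{
    public function find(int $id): ?object;
    public function save(object $entity): void;
}

// Hypothetical guard mirroring the rule described above.
function assertSingleMethodInterface(string $interface): \ReflectionMethod
{
    $ref = new \ReflectionClass($interface);
    if (!$ref->isInterface()) {
        throw new \Error("$interface is not an interface");
    }

    $methods = $ref->getMethods();
    if (count($methods) !== 1) {
        throw new \Error("$interface must declare exactly one method");
    }

    // With exactly one method there is no ambiguity about the target.
    return $methods[0];
}

$method = assertSingleMethodInterface(TwoInts::class)->getName();

try {
    assertSingleMethodInterface(Repository::class);
    $rejected = false;
} catch (\Error $e) {
    $rejected = true;
}
```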

--Larry Garfield

2 years ago by Derick Rethans

Before we dig into any of those edge cases, however, we want to throw
the question out: Is this general approach even acceptable?

I think I would vote against all three options. It is hard to put a
pulse on, but IMO it looks too complex. I'd say: wait for type aliases.

cheers,
Derick