[RFC][Discussion] Return Type Variance Checking

10 years ago by Levi Morrison

Dear Internals,

As you may remember, I closed the voting on the return types RFC
because of a design flaw that was found during the voting process.
The purpose of this
discussion thread is to explain the various options for checking
return type compatibility with parent methods that I think are viable,
to show some of the benefits and drawbacks of each option, and to
solicit feedback on which options you prefer and why. As such, it's
much longer than a normal message I would send.

The following code demonstrates the design flaw mentioned above:

class A {
    function foo(): B { return new B; }
}

class B extends A {
    function foo(): C { return new C; }
}

class C extends B {}

$b = new B;
$c = $b->foo();

I've also used it because it can adequately show the differences in
how each of the following options work:

1. Do covariant return types; check them at definition time
2. Do covariant return types; check them at runtime
3. Do invariant return types; check them at definition time

Option 1: Covariant return types with definition time checking

This is the option that is currently implemented in my pull request.
This option catches all return type variance issues whether code is
used or not; if it is included, it is checked. This means return type
variance issues can't bite you later.

When class B is defined, the engine needs to check that the method
B::foo has a compatible return type with A::foo, in this case that
equates to checking that C is compatible with type B. This would
trigger autoloading for C but it will fail in this case because C is
defined in the same file. Note that there are ways we could fix this
issue, but not reliably because of conditionally defined classes
(defining classes in if blocks, for example) and other such dynamic
behavior. Even if we could fix this issue, there is still an issue if
A, B and C were defined in separate files and then included. You
couldn't require them in any order for it to work; you would have to
autoload.

Option 2: Covariant return types with runtime checking

This option would do the variance check when the method is used (or
potentially when the class is instantiated, or whichever comes first).
Regardless of the exact details of how this method is implemented, the
above code would work because the first time class B is used in any
way occurs after C is defined. This would also work if you separated
them out into different files.

This option would be slower than option 1, but I cannot quantify it
because it is not yet implemented. I suspect there would be a way to
cache the result of the variance check to not have to do it on every
instantiation or invocation, so this may be negligible.

This option has the drawback that inheritance problems can exist in
the code and won't be discovered until the code is run. Let me repeat
that to make sure that everyone understands it: if you have return type
variance errors in your code, they would not be detected until you try
to use the class or method.
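
To make the "check on first use, then cache" idea concrete, here is a minimal userland sketch. It is only a model of the behaviour described for Option 2, not the engine implementation: the class LazyVarianceChecker, its method names, and the stand-in classes TypeA, TypeB and TypeC are all assumptions made up for illustration.

```php
<?php
// Hypothetical model of Option 2: the variance check is deferred until
// first use, and the verdict is cached so it is not repeated.
final class LazyVarianceChecker
{
    /** @var array<string, bool> verdicts keyed by "child<:parent" */
    private $cache = [];

    /** @var int number of real (uncached) checks, exposed only for illustration */
    public $checksPerformed = 0;

    public function isCovariant(string $childReturn, string $parentReturn): bool
    {
        // Class names are case-insensitive in PHP, so normalise the key.
        $key = strtolower($childReturn . '<:' . $parentReturn);
        if (!isset($this->cache[$key])) {
            $this->checksPerformed++;
            // is_a() with $allow_string = true compares class names and
            // may invoke the autoloader for classes not yet loaded.
            $this->cache[$key] = is_a($childReturn, $parentReturn, true);
        }
        return $this->cache[$key];
    }
}

// Stand-ins for the return types in the example above (names are assumptions).
class TypeA {}
class TypeB extends TypeA {}
class TypeC extends TypeB {}

$checker = new LazyVarianceChecker();
var_dump($checker->isCovariant('TypeC', 'TypeB')); // first use: real check
var_dump($checker->isCovariant('TypeC', 'TypeB')); // second use: served from the cache
var_dump($checker->isCovariant('TypeA', 'TypeB')); // a supertype is not a covariant return
```

The sketch also shows the drawback stated above: nothing is checked for a pair of types until isCovariant() is actually called for it.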

Option 3: Invariant return types with definition time checking

This means that the declared types must exactly match after
resolving aliases such as parent and self. The advantage of this
option is that the inheritance check can be done at definition time in
all cases without triggering an autoload. As such it is a bit simpler
to implement and slightly faster. The obvious downside is
that it can't support covariant return types.
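
As a rough illustration of why no autoload is needed here, the invariant check can be modelled as a pure name comparison. This is a sketch under assumptions: the helper names resolveAlias and isInvariantReturn are hypothetical, namespaces and leading backslashes are ignored, and the real engine check may differ in detail.

```php
<?php
// Hypothetical model of Option 3: resolve "self" and "parent" to concrete
// class names, then require the two declared names to match exactly.
function resolveAlias(string $type, string $class, ?string $parent): string
{
    switch (strtolower($type)) {
        case 'self':
            return $class;
        case 'parent':
            if ($parent === null) {
                throw new LogicException("$class has no parent to resolve 'parent' against");
            }
            return $parent;
        default:
            return $type;
    }
}

function isInvariantReturn(
    string $childType,
    string $childClass,
    string $parentType,
    string $parentClass,
    ?string $grandparentClass = null
): bool {
    $child  = resolveAlias($childType, $childClass, $parentClass);
    $parent = resolveAlias($parentType, $parentClass, $grandparentClass);
    // Only the names are compared; no class has to be loaded for this.
    return strcasecmp($child, $parent) === 0;
}

var_dump(isInvariantReturn('B', 'B', 'B', 'A'));      // identical declared names
var_dump(isInvariantReturn('parent', 'B', 'A', 'A')); // "parent" inside B resolves to A
var_dump(isInvariantReturn('C', 'B', 'B', 'A'));      // a subtype is rejected under invariance
```

The last call corresponds to the B::foo(): C declaration from the example at the top, which Option 3 would refuse.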

Note that C++ and Java (and most modern languages) support covariant
return types, but C# supports only invariant return types.

Which of these methods do you prefer the most, and which do you prefer
the least, and why?

10 years ago by Robert Stoll

-----Original Message-----
From: morrison.levi@gmail.com [mailto:morrison.levi@gmail.com] On Behalf Of Levi Morrison
Sent: Tuesday, 25 November 2014 18:09
To: internals
Subject: [PHP-DEV] [RFC][Discussion] Return Type Variance Checking

Option 3: Invariant return types with definition time checking

This means that the declared types must exactly match after resolving aliases such as parent and self. The advantage of
this option is that the inheritance check can be done at definition time in all cases without triggering an autoload. As such it
is a bit simpler to implement and is slightly faster to do so. The obvious downside is that it can't support covariant return
types.

Note that C++ and Java (and most modern languages) support covariant return types, but C# supports only invariant return
types.

Which of these methods do you prefer the most, and which do you prefer the least, and why?

--

It should be mentioned for Option 3 that it is only at the definition level that covariant return types are not supported. Even with invariant return types, the following is perfectly fine:
class A {
    function foo(): B { return new B; }
}

class B extends A {
    function foo(): B { return new C; }
}
Needless to say, IDEs using the type hints will not be able to support the user properly with such a construct (without an additional type annotation in PHPDoc or similar, at which point the annotation in the code becomes useless).

I am not sure what suits PHP best, but right now I would say invariant, for the following three reasons:

1. Parameters are invariant as well -> same behaviour
2. Invariant can be extended to covariant support later on without a BC break; the other way round is not possible
3. The simplicity hopefully makes it easier for this RFC to be accepted

As a side note, C# allows covariant return types to be defined using generics and corresponding constraints.

10 years ago by Levi Morrison

It should be mentioned for Option 3 that covariant return types are not supported just at the definition level . Even with invariant return types the following is perfectly fine:
class A {
function foo(): B { return new B; }
}

class B extends A {
function foo(): B { return new C; }
}

Yes, this is correct; thank you for the clarification. The phrase
"covariant return types" in the RFC and in my message above refers to
the declared type in the function signature, not the type returned by
the method. The value that is returned can always be a subtype of the
declared type.

10 years ago by Stanislav Malyshev

Hi!

I've also used it because it can adequately show the differences in how each of the following options work:

Do covariant return types; check them at definition time

Do covariant return types; check them at runtime

Do invariant return types; check them at definition time

My opinion is that on some level it doesn't matter too much. Type
errors by themselves are quite rare among the errors one can make in a
program, and are usually found and fixed rather easily [1]. Errors in return
types - i.e. a mismatch between declaration and code in your own code
located compactly on the screen - would be even rarer. Errors where the
difference is between covariant and invariant classes - and thus use
cases where such usage is helpful to prevent them - in my opinion, would
be so vanishingly rare that most people would never really need it, and
probably would spend much more time figuring out the right types to set
on the return values than it will ever save them on debugging. I suspect
that, except for the simplest cases, these would be used more for
documentation/"feel good" purposes than for anything else.

Thus, taken practically, I think the option that has minimal impact on
the existing code, its speed and complexity should be taken. If it
requires reducing the expressiveness to option 3, I don't think it
is a huge loss.

I'm not sure though what is involved in "runtime checking" and what
would be the consequences - i.e., what will be checked at runtime and
which runtime will it be? In PHP, mostly everything is "runtime"
strictly speaking, so some clarification here would help.

[1] E.g. check out this one: http://vimeo.com/74354480
There is a lively debate about many of the conclusions drawn there, but I
think the empirical evidence is worth considering whatever your stance on
the conclusions.

Stas Malyshev
smalyshev@gmail.com

10 years ago by Marc Bennewitz

I think the type check has to be done at runtime (Option 2), because
one of the use cases for return type hints is factories, and these often
instantiate classes based on unknown string values:

class MyFactory {
    public static function factory($name) : AdapterInterface {
        $class = 'MyNamespace\Adapter' . $name;
        // instantiate the class named by the string
        return new $class();
    }
}

Marc

10 years ago by Levi Morrison

I think it's required to do the type check on runtime (Option 2) because
one of the use cases for return type-hint are factories and such often do
instantiation in base of unknown string values:

class MyFactory {
public static function factory($name) : AdapterInterface {
$class = 'MyNamespace\Adapter' . $name;
return $class();
}
}

It seems that I did not explain this clearly enough; I apologize. The
variance has to do with the declared type in the function signature
when inheritance is involved, not the type of the value returned by
the function.

For instance, under any of the three options this code will work just fine:

class Foo {}
class Goo extends Foo {}

class FooFactory {
function create(): Foo { return new Goo(); }
}

As long as FooFactory::create returns a Foo or a subtype of Foo
(such as Goo), it will work.

The variance that is under discussion in this thread is about the
declared return type in the signature:

class GooFactory extends FooFactory {
    function create(): Goo { return new Goo(); }
}

In this case, GooFactory::create() declares a return type of Goo,
which is a subtype of Foo [the return type of the inherited method
FooFactory::create()]. This is a covariant return type.

If we choose option 3, the only possible return type for
GooFactory::create is Foo.

Hopefully this clarifies the issue.
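
For readers who want to try the distinction, the two snippets above can be combined into one runnable sketch. It assumes PHP 7.4 or later, where a narrowed return type in a child method is accepted; the method bodies and the use of reflection to print the declared type are additions for illustration only.

```php
<?php
class Foo {}
class Goo extends Foo {}

class FooFactory
{
    // The returned value may be any subtype of the declared type;
    // this is allowed under all three options.
    public function create(): Foo
    {
        return new Goo();
    }
}

class GooFactory extends FooFactory
{
    // Covariant declaration: the declared return type is narrowed
    // from Foo to Goo. This is the part the options disagree on.
    public function create(): Goo
    {
        return new Goo();
    }
}

// Show the declared (not the runtime) return type of the child method.
$declared = (new ReflectionMethod('GooFactory', 'create'))->getReturnType();
echo $declared->getName(), PHP_EOL;
```

Under option 3 the GooFactory declaration would have to be written with Foo as its return type, even though its body could still return a Goo.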

10 years ago by Marc Bennewitz

On 25.11.2014 at 22:43, Levi Morrison wrote:

If we choose option 3, the only possible return type for
GooFactory::create is Foo.

Hopefully this clarifies the issue.

Yes it does - thank you for the explanation - my mistake :/

Option 3 is a no-go, not only from an OOP perspective but also from a
consistency point of view, as we already allow this for parameter type hints:

class FooFactory {
function create(Foo $foo): Foo { return $foo; }
}

class GooFactory extends FooFactory {
function create(Goo $goo): Goo { return $goo; }
}

10 years ago by Marc Bennewitz

On 25.11.2014 at 23:13, Marc Bennewitz wrote:

Option 3 is a no go not from OOP perspective and from consistency pov
as we already allow this in type-hint:

class FooFactory {
function create(Foo $foo): Foo { return $foo; }
}

class GooFactory extends FooFactory {
function create(Goo $goo): Goo { return $goo; }
}

OK, HHVM allows it. We also allow it, but trigger an E_STRICT error;
see http://3v4l.org/UhtOb

10 years ago by Stanislav Malyshev

Hi!

class FooFactory {
function create(Foo $foo): Foo { return $foo; }
}

class GooFactory extends FooFactory {
function create(Goo $goo): Goo { return $goo; }
}
OK HHVM allows it - we also allow it but trigger an E_STRICT error
@see http://3v4l.org/UhtOb

This is because this code has an LSP violation - if you have an object
that you know is typed as FooFactory, you should be able to
call it with any Foo object. But if this object is a GooFactory instead,
then not every Foo will do, only a subset of them - namely, Goo.
This clearly violates the principle "everything good for the parent must
be good for the child". Since PHP is a kind and nurturing language, we
only produce E_STRICT; some other languages would refuse to accept such
a thing or would interpret it as two different methods.
See also:
https://en.wikipedia.org/wiki/Covariance_and_contravariance_(computer_science)#Covariant_method_argument_type
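
The substitution problem can be reproduced with a small sketch. To keep it loadable on any PHP version, the child keeps the parent's signature and models the narrowing to Goo with a runtime guard instead of a narrowed type hint; that guard, the roundTrip() helper and the exception message are assumptions added for illustration.

```php
<?php
class Foo {}
class Goo extends Foo {}

class FooFactory
{
    public function create(Foo $foo): Foo
    {
        return $foo;
    }
}

class GooFactory extends FooFactory
{
    // Same signature as the parent, but the guard models what a
    // narrowed parameter type (Goo $goo) would enforce.
    public function create(Foo $foo): Foo
    {
        if (!$foo instanceof Goo) {
            throw new InvalidArgumentException('GooFactory only accepts Goo');
        }
        return $foo;
    }
}

// A caller written against the parent type only.
function roundTrip(FooFactory $factory): Foo
{
    return $factory->create(new Foo());
}

roundTrip(new FooFactory()); // fine: any Foo is accepted
try {
    roundTrip(new GooFactory()); // the child rejects what the parent promised to accept
} catch (InvalidArgumentException $e) {
    echo 'substitution failed: ', $e->getMessage(), PHP_EOL;
}
```

The caller did nothing wrong by the parent's contract, which is exactly the "good for the parent, not good for the child" situation described above.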

--
Stas Malyshev
smalyshev@gmail.com

10 years ago by Nikita Popov

Option 3 is a no go not from OOP perspective and from consistency pov as
we already allow this in type-hint:

class FooFactory {
function create(Foo $foo): Foo { return $foo; }
}

class GooFactory extends FooFactory {
function create(Goo $goo): Goo { return $goo; }
}

This is not correct. Parameter typehints in PHP are invariant, so you are
not allowed to change them during inheritance. However, LSP violations
during inheritance of non-abstract methods currently use a very low
error level (E_STRICT), so you probably didn't notice. If you try the same
thing with an interface method or an explicitly abstract method, you will
receive a fatal error:

interface I1 {
    function foo(A $a);
}
class C1 implements I1 {
    function foo(B $b) { /* ... */ }
}

This code snippet will result in a fatal error, because it violates type
invariance.

Nikita

10 years ago by Simon Schick

Hi, all

Regarding compatibility, I would suggest the following:

A return type should be the same class, or a class extending
the required class. This way, you have all the methods and properties
you would expect.

But as for the parameters, the type should be (if we allow anything
different from the expected class/interface) a class or an interface that
the parent one extends or implements. I know that this sounds strange
and I can't come up with a practical use case. Let me put it into an
example:

class Bar {}
class Foo extends Bar {}
class Goo extends Foo {}

interface ITester {
function create(Goo $foo) : Bar;
}

class BarTester implements ITester {
function create(Bar $foo) : Bar { return $foo; }
}

class FooTester extends BarTester {
function create(Foo $foo) : Foo { return $foo; }
}

class GooTester extends FooTester {
function create(Goo $goo) : Goo { return $goo; }
}

$testers = array( new BarTester(), new FooTester(), new GooTester() );

/** @var $testers ITester[] */
foreach($testers as $tester) {
    $res = $tester->create(new Goo());
    /** @var $res Bar (or some instance extending it) */
}

If you have an instance of ITester, you expect it to accept an
instance of Goo. As FooTester is implemented, it will always accept
a Goo instance here and, even more, it also accepts Foo instances
as a parameter.

And since I expect the create method to return an instance of Foo (so
that I can use any method Foo has), the method could also return an
instance of Goo, because it has all the functionality Foo has, and
even more.

In this example it may sound weird and confusing, but if you take
these two things apart, each actually makes sense on its own.

Is there something I didn't think about, or that would get it to crash?

Bye
Simon

10 years ago by Lazare Inepologlou

2014-11-25 23:42 GMT+01:00 Nikita Popov nikita.ppv@gmail.com:

This is not correct. Parameter typehints in PHP are invariant, so you are
not allowed to change them during inheritance. However LSP violations
during inheritance of non-abstract methods currently uses a very low
error level (E_STRICT), so you probably didn't notice. If you try the same
thing with an interface method or an explicitly abstract method, you will
receive a fatal error:

interface I1 {
function foo(A $a);
}
class C1 implements I1 {
function foo(B $b) { ... }
}

This code snippet will result in a fatal error, because it violates type
invariance.

Let's not compare these two:

1. Parameter types are contravariant, otherwise they are not type sound.
   Yet, contravariance in general is of little interest (I cannot think of any
   practical example), so invariance is a good compromise.
2. Return types are covariant. There are many useful examples already
   mentioned in this mailing list.

http://en.wikipedia.org/wiki/Covariance_and_contravariance_(computer_science)#Covariant_method_return_type

Lazare INEPOLOGLOU
Software Engineer

10 years ago by Rowan Collins

http://en.wikipedia.org/wiki/Covariance_and_contravariance_(computer_science)#Covariant_method_return_type

Can I just recommend that everyone interested in this discussion read that whole article (at least until it gets into the guts of generics, which gets more and more complex). It explains the concepts extremely clearly, both in their theoretical basis and their practical implementation and limitations.

I often worry that PHP is designed too much around examples and use cases, rather than more formal theoretical foundations, so I'm pleased this conversation has led me to learn those concepts. Obviously, as that article points out, there is sometimes value in ignoring the theoretically pure in favour of the practical, but the adage applies that you should first understand the rules before deciding to break them.

Regards,

Rowan Collins
[IMSoP]

10 years ago by Rowan Collins

Lazare Inepologlou wrote on 26/11/2014 10:21:

Parameter types are contravariant, otherwise they are not type sound.
Yet, contravariance in general is of little interest (I cannot think of any
practical example), so invariance is a good compromise.

After reading the Wikipedia article, I've been thinking of some
practical examples of contravariance in PHP. One involves the Iterator
and Traversable interfaces: imagine we have this class:

class Foo { public function iterate(Iterator $i) { /* ... */ } }

Now we make a sub-class which implements its iterate method using a
foreach() loop, so can transparently accept any Traversable:

class Bar extends Foo { public function iterate(Traversable $t) { /* ... */ } }

This requires contravariance, because the child class accepts everything
the parent class would, but doesn't have an identical type hint because
it also accepts more.
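
A runnable version of the Foo/Bar example might look like the following. The method bodies, the array return type and the sample inputs are assumptions added for illustration. The declaration is only accepted on PHP 7.4 or later, which allows a child method to widen a parameter type.

```php
<?php
class Foo {
    // The parent drives the Iterator protocol by hand, so it needs an Iterator.
    public function iterate(Iterator $i): array {
        $out = [];
        for ($i->rewind(); $i->valid(); $i->next()) {
            $out[] = $i->current();
        }
        return $out;
    }
}

class Bar extends Foo {
    // Contravariant: foreach works on any Traversable, so the child can
    // accept everything the parent accepts, plus IteratorAggregate objects.
    public function iterate(Traversable $t): array {
        $out = [];
        foreach ($t as $value) {
            $out[] = $value;
        }
        return $out;
    }
}

$bar = new Bar;
// ArrayIterator is an Iterator, so the parent's contract is still honoured.
$fromIterator = $bar->iterate(new ArrayIterator([1, 2, 3]));
// ArrayObject is an IteratorAggregate: Traversable, but not an Iterator.
$fromAggregate = $bar->iterate(new ArrayObject([4, 5]));
```

Any caller that passes an Iterator to a Foo keeps working when handed a Bar, which is why widening the parameter type is safe.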

A more involved example would be this:

// Two types of user, both extending a base class
abstract class User { abstract function getDisplayName(); /* ... */ }
class VisitingUser extends User {
    function getDisplayName() { return 'Anonymous Coward'; }
    /* ... */
}
class RegisteredUser extends User {
    function getDisplayName() { /* ... */ }
    function getUserID() { /* ... */ }
    /* ... */
}

// An interface for injecting observers of social events carried out by
// registered users
interface SocialEventListener {
    function handleUserEvent( RegisteredUser $u, Event $e );
}
// An implementing class can safely use methods only present for
// registered users without additional type checks
class UserTimelineWriter implements SocialEventListener {
    function handleUserEvent( RegisteredUser $u, Event $e ) {
        $user_id = $u->getUserID();
        /* ... */
    }
}

// However, a more general observer might not use the user ID, and so
// could be reused for events with any kind of user
interface UserActionListener {
    function handleUserEvent( User $u, Event $e );
}
class EventLogger implements UserActionListener, SocialEventListener {
    function handleUserEvent( User $u, Event $e ) {
        /* code relying only on $u->getDisplayName() ... */
    }
}

Without parameter contravariance, there is no way to achieve this. Since
PHP doesn't have method overloading, you can't add a second version of
handleUserEvent which accepts only RegisteredUser arguments, so you have
to either change the name of the method in one of the interfaces, or
create an entire adapter class just to change the type hint.

Obviously, the exact details here are contrived to make the point, but
they don't seem all that far-fetched to me.

Regards,

Rowan Collins
[IMSoP]

10 years ago by Rowan Collins — view source

Levi Morrison wrote on 25/11/2014 17:08:

Do covariant return types; check them at definition time

Do covariant return types; check them at runtime

Do invariant return types; check them at definition time

I guess there's also option 4 - do "weak invariance", as we do with
parameters: any combination of types is actually allowed, but any
variance raises an E_STRICT notice. (This is only true with class
inheritance; interface implementation is strictly invariant, raising a
fatal error if the declaration is not typehinted identically.)

I think my preference would be to implement return types with strict
invariance (option 3) initially, in order to keep the implementation and
discussion simple, and get the syntax baked into the language.

Then immediately look into solutions for covariant return types, and
possibly also contravariant parameter types (relaxing the fatals for
interface implementation, and maybe raising the remaining cases above
E_STRICT for class inheritance).

If a good solution and implementation can be found in time for 7.0, then
all the better, but if not, it can be added in 7.1, and no code written
for the hints added in 7.0 would fail.

Regards,

Rowan Collins
[IMSoP]

10 years ago by Dmitry Stogov — view source

I prefer option (3) - invariant return types.
Actually, return type compatibility check should follow all the rules for
parameter type compatibility check (may be even reuse or share the code).
This solution may be implemented efficiently and consistently.
It should also be enough for 99% of use cases.

Thanks. Dmitry.

On Thu, Nov 27, 2014 at 1:46 PM, Rowan Collins rowan.collins@gmail.com
wrote:

Levi Morrison wrote on 25/11/2014 17:08:

Do covariant return types; check them at definition time

Do covariant return types; check them at runtime

Do invariant return types; check them at definition time

I guess there's also option 4 - do "weak invariance", as we do with
parameters: any combination of types is actually allowed, but any variance
raises an E_STRICT notice. (This is only true with class inheritance;
interface implementation is strictly invariant, raising a fatal error if
the declaration is not typehinted identically.)

I think my preference would be to implement return types with strict
invariance (option 3) initially, in order to keep the implementation and
discussion simple, and get the syntax baked into the language.

Then immediately look into solutions for covariant return types, and
possibly also contravariant parameter types (relaxing the fatals for
interface implementation, and maybe raising the remaining cases above
E_STRICT for class inheritance).

If a good solution and implementation can be found in time for 7.0, then
all the better, but if not, it can be added in 7.1, and no code written for
the hints added in 7.0 would fail.

Regards,

Rowan Collins
[IMSoP]

10 years ago by Andrea Faulds — view source

I prefer option (3) - invariant return types.
Actually, return type compatibility check should follow all the rules for
parameter type compatibility check (may be even reuse or share the code).

No, it shouldn't match parameters, that'd break type safety. What's safe for parameters is the opposite of what's safe for return types. The exception is invariance, which is safe for both.

--
Andrea Faulds
http://ajf.me/

10 years ago by Rowan Collins — view source

Andrea Faulds wrote on 28/11/2014 10:57:

I prefer option (3) - invariant return types.
Actually, return type compatibility check should follow all the rules for
parameter type compatibility check (may be even reuse or share the code).

No, it shouldn't match parameters, that'd break type safety. What's safe for parameters is the opposite of what's safe for return types. The exception is invariance, which is safe for both.

I think you're both saying the same thing: the current implementation of
parameter checking is invariant, and the proposal is to make return type
checking invariant too. Thus, for now, they can share the implementation.

Later, covariant returns and/or contravariant parameters could be added,
at which point the checks would need to be split apart again.
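
The relationship between the two checks can be sketched in userland PHP. The function names below are hypothetical and this is not how the engine performs the check. It only illustrates that the invariant rule is identical for both positions, while the relaxed rules point in opposite directions. Classes A and B mirror the ones from the start of the thread.

```php
<?php
class A {}
class B extends A {}

// Hypothetical helper: true when $sub is $super itself or a subclass of it.
function isSubtype(string $sub, string $super): bool {
    return is_a($sub, $super, true);
}

// Invariant rule: one test serves both parameters and return types.
// Class names are case-insensitive in PHP, hence strcasecmp().
function invariantOk(string $parentType, string $childType): bool {
    return strcasecmp($parentType, $childType) === 0;
}

// Covariant return: the child may return a more specific type.
function covariantReturnOk(string $parentType, string $childType): bool {
    return isSubtype($childType, $parentType);
}

// Contravariant parameter: the child may accept a more general type.
function contravariantParamOk(string $parentType, string $childType): bool {
    return isSubtype($parentType, $childType);
}

// Parent declares A, child declares B (narrowing).
$narrowedReturn = covariantReturnOk('A', 'B');    // safe for a return type
$narrowedParam  = contravariantParamOk('A', 'B'); // unsafe for a parameter

// Parent declares B, child declares A (widening).
$widenedReturn = covariantReturnOk('B', 'A');     // unsafe for a return type
$widenedParam  = contravariantParamOk('B', 'A');  // safe for a parameter
```

The two relaxed checks are the same subtype test with the arguments swapped, which is why a single shared routine only works while both positions are invariant.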

10 years ago by Dmitry Stogov — view source

I didn't get what you mean.
Parameters are invariant; "invariance, which is safe for both" and "it
shouldn't match parameters" are contradictory.

Thanks. Dmitry.

I prefer option (3) - invariant return types.
Actually, return type compatibility check should follow all the rules for
parameter type compatibility check (may be even reuse or share the code).

No, it shouldn't match parameters, that'd break type safety. What's safe
for parameters is the opposite of what's safe for return types. The
exception is invariance, which is safe for both.

--
Andrea Faulds
http://ajf.me/

10 years ago by Andrea Faulds — view source

I didn't get what you mean.
Parameters are invariant; "invariance, which is safe for both" and "it shouldn't match parameters" are contradictory.

Well, that’s why I said “the exception is invariance”.

Andrea Faulds
http://ajf.me/

10 years ago by Levi Morrison — view source

I prefer option (3) - invariant return types.
Actually, return type compatibility check should follow all the rules for
parameter type compatibility check (may be even reuse or share the code).

Realistically there isn't much code to share, especially since the two
structures are incompatible. That is, they are incompatible until
zend_string * is used in parameters (which I'm hopeful will happen).

10 years ago by Dmitry Stogov — view source

Hi Levi,

If you remember, in my patch return_type was actually stored in
arg_info[-1].
It was mainly done for unification and to allow return type hinting for
internal functions (they already use arg_info[-1]).
So the structures may be compatible.
See https://gist.github.com/dstogov/8deb8b17e41c1a5abf88

I also thought about zend_string* usage in arg_info.
Please review the patch if you missed it.
https://gist.github.com/dstogov/aa452a47ec30a8cb6eb5

The patches may be outdated, but they worked.
I can take care of the final implementation if you agree with invariant
return types.

Thanks. Dmitry.

I prefer option (3) - invariant return types.
Actually, return type compatibility check should follow all the rules for
parameter type compatibility check (may be even reuse or share the code).

Realistically there isn't much code to share, especially since the two
structures are incompatible. That is, they are incompatible until
zend_string * is used in parameters (which I'm hopeful will happen).

[1] E.g. check out this one: http://vimeo.com/74354480
There is a vivid debate about many of the conclusions derived there, but I
think empirical evidence is worth considering whatever your stand on the
conclusions is.