[RFC] Return Type Declarations pre-vote follow-up

11 years ago by Levi Morrison

PHP Internals enthusiasts,

It has been two weeks since I opened the discussion about adding return
type declarations to PHP (for those who missed the prior discussion there
are links at the bottom of this message). So far there has been
overwhelmingly supportive feedback and a few minor issues were discovered
and addressed. I thank everyone who has helped to make this RFC as good as
possible.

As of yesterday I could have moved this RFC to vote; however, due to the
announcement of the phpng branch and a few other things I worry that not
everyone has had time to sufficiently review and comment on this RFC. As
such I will delay for another week before moving this RFC into voting
phase. If a major issue is discovered then it will be addressed before
moving to vote.

Again, I thank everyone who has participated so far and invite others to
participate if they have been holding back.

RFC: https://wiki.php.net/rfc/returntypehinting
Mail Archive for announcement: http://marc.info/?t=139835533700003&r=1&w=2

11 years ago by Josh Watzman

Hey Levi et al! For those who don't know me, I'm Josh Watzman and I work on the Hack team at Facebook. We're really excited to see PHP picking up some of the features of Hack and would love to work with you all to ensure compatibility as much as possible. Levi has been quite proactive in chatting with us about this one, which we appreciate. I only recently subscribed to internals, and apologize for my absence until now!

We looked over the RFC, as well as actually running the test suite against HHVM. This uncovered one important issue, two minor issues, and even one bug in HHVM! I apologize if these have been brought up before -- I looked through the archives and didn't see any discussion on the list about these, but could easily have missed something.

The important issue: I'd like to raise discussion on test rfc004.php (the "missing return type on override" example in https://wiki.php.net/rfc/returntypehinting#examples). HHVM and Hack do not consider this to be an error. In type system terms, we consider a missing annotation to be both a supertype and subtype of everything. This means that, not only can a superclass omit an annotation that a subclass specifies, but a subclass can omit an annotation that the superclass specifies. While this isn't the most principled methodology from a hardcore type system perspective, we believe it is very pragmatic and interoperates well with the dynamically typed world of PHP. It's been extremely important to us at Facebook that it was always 100% safe to remove a type annotation and revert to old-style, untyped PHP -- the type system could never get in your way if you wanted to tell it to go away once and for all. There is always an easy escape hatch. If you require subclasses to have an annotation if the superclass does, this principle is broken -- local removal of the return type might introduce a different error since the subclass no longer matches the superclass! So I urge you to reconsider this. Relaxing this doesn't seriously compromise the intent of the RFC, but it does relax the constraints on those who want to only partially type their code (which is extremely common at Facebook; we've spent a lot of effort on conversion and only about 40% of our code is type annotated even today).
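
The two override rules contrasted above can be sketched as a pair of predicates. This is only an illustrative model, not code from the RFC patch or from HHVM: the function names are hypothetical, `null` stands for "no return type declared", and two declared types are treated as compatible only when they are identical, which is a simplification.

```php
<?php
// Hypothetical model of the two override rules; null means "no return type".

/**
 * Rule as described for the RFC: a child may add a return type the parent
 * lacks, but may not drop one the parent declares.
 */
function rfcAllowsOverride(?string $parentType, ?string $childType): bool
{
    if ($parentType === null) {
        return true; // parent untyped: child may declare a type or not
    }
    // Simplification: declared types must match exactly.
    return $childType === $parentType;
}

/**
 * Rule as described for Hack/HHVM: a missing annotation is both a supertype
 * and a subtype of everything, so it is compatible in either direction.
 */
function hackAllowsOverride(?string $parentType, ?string $childType): bool
{
    if ($parentType === null || $childType === null) {
        return true;
    }
    // Same simplification as above.
    return $childType === $parentType;
}

var_dump(rfcAllowsOverride('Foo', null));  // bool(false)
var_dump(hackAllowsOverride('Foo', null)); // bool(true)
```

The only case where the two predicates disagree is the one under discussion: the parent declares a type and the child omits it.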

First minor issue: you have a return type "self". This is an LSB type, effectively the type of "new static". Levi, I remember discussing this with you in IRC, but not the result of that discussion. What is the reason for "self"? The Hack type system calls it "this", which I admit is a little confusing since it is the type of more things than just the literal variable "$this" ("new static" for example is also of type "this"). But the type "static" is much more consistent with the existing LSB language, i.e., "new static". So is there any reason to call it "self" over "static" or "this"?

Second minor issue: your reflection implementation is slightly different from HHVM's. I'm told we call the method "getReturnTypeText", which returns the hint text, or false if none exists. I don't know how strongly we feel about this, but at the very least, I wanted to make sure you were aware that you were definitely diverging from the existing implementation here, and that it was done intentionally. (Also, I don't think you have any test coverage of this feature?)

Thanks for putting this together -- the FB team is really looking forward to it.
Josh Watzman

11 years ago by Stas Malyshev

Hi!

everything. This means that, not only can a superclass omit an
annotation that a subclass specifies, but a subclass can omit an
annotation that the superclass specifies. While this isn't the most

I'm not sure it is a good idea. This means having this code:
class FooGetter {
    function getFoo(): Foo { return new Foo(); }
}

I can extend it as:
class TwoFaceFooGetter extends FooGetter {
    function getFoo() { return rand() % 2 ? new Bar() : new Foo(); }
}

and every function getting FooGetter and expecting getFoo() to return
Foo is now broken since instead it could just get Bar(). Since the type
return specification is meant, as far as I understand, exactly to avoid
cases like that, I don't see it as a good idea.
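
A runnable sketch of the breakage described above, under stated assumptions: the parent's return type is written only as a docblock to stand in for the relaxed rule (an unchecked override), the child always returns Bar so the result is deterministic, and `needsFoo()` and `useGetter()` are hypothetical callers that are not part of the original example.

```php
<?php
class Foo {}
class Bar {}

// Assumption: the return type is only documented here, standing in for the
// relaxed rule under which the child's override is not checked.
class FooGetter
{
    /** @return Foo */
    public function getFoo() { return new Foo(); }
}

class TwoFaceFooGetter extends FooGetter
{
    // Always returns Bar (instead of rand()) to keep the outcome deterministic.
    public function getFoo() { return new Bar(); }
}

// Hypothetical consumer that really does require a Foo.
function needsFoo(Foo $foo): string
{
    return 'got a Foo';
}

// Hypothetical caller that trusts FooGetter::getFoo() to return a Foo.
function useGetter(FooGetter $getter): string
{
    try {
        return needsFoo($getter->getFoo());
    } catch (TypeError $e) {
        return 'TypeError: caller received something that is not a Foo';
    }
}

echo useGetter(new FooGetter()), "\n";
echo useGetter(new TwoFaceFooGetter()), "\n";
```

The failure surfaces in the caller, far from the override that caused it, which is the situation an enforced return type on the override is meant to prevent.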

principled methodology from a hardcore type system perspective, we
believe it is very pragmatic and interoperates well with the
dynamically typed world of PHP. It's been extremely important to us

I don't see how having strict typing and then making exceptions for the
"dynamically typed world of PHP" makes much sense. If we want a dynamically
typed world, we should do what Ruby, Python, JavaScript and others do. If we
want to go in the strict typing direction instead, we should provide the
guarantees that strict typing is for. Having strict typing without
guarantees makes very little sense to me.

There is always an easy escape hatch. If you require subclasses to

The point of strict typing is not having escape hatches. If you can just
put anything into typing system and say "shut up and accept it", then
you have no guarantees that any value is of any known type. And then the
question is: why bother with strict typing at all? What advantage does it
give if not guarantees that values have known types?

discussion. What is the reason for "self"? The Hack type system calls
it "this", which I admit is a little confusing since it is the type

"this" is not a type in PHP. "self" is, which is the type of the class
it is mentioned in.

of more things than just the literal variable "$this" ("new static"
for example is also of type "this"). But the type "static" is much

"new static" in PHP is not of a known type statically. Dynamically, it
is of the type that the method was called on, which is signified,
somewhat confusingly, by the keyword "static" even though it is actually
not static but dynamic (yes, I know, welcome to the wonderful world of
keyword reuse).

more consistent with the existing LSB language, i.e., "new static".
So is there any reason to call it "self" over "static" or "this"?

"self" and "static" are two distinct type keywords. "this" is not a type
keyword at all, is there a need for yet another type keyword? I don't
think so.
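
The difference between the two keywords can be seen without any return types at all. A minimal sketch, where the class and method names are hypothetical:

```php
<?php
class Base
{
    // "self" is resolved to the class where the method is written.
    public static function makeSelf() { return new self(); }

    // "static" is resolved at call time to the class the method was called on.
    public static function makeStatic() { return new static(); }
}

class Child extends Base {}

echo get_class(Child::makeSelf()), "\n";   // Base
echo get_class(Child::makeStatic()), "\n"; // Child
```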

--
Stanislav Malyshev, Software Architect
SugarCRM: http://www.sugarcrm.com/
(408)454-6900 ext. 227

11 years ago by Josh Watzman

I'm not sure it is a good idea. This means having this code:
class FooGetter {
    function getFoo(): Foo { return new Foo(); }
}

I can extend it as:
class TwoFaceFooGetter extends FooGetter {
    function getFoo() { return rand() % 2 ? new Bar() : new Foo(); }
}

and every function getting FooGetter and expecting getFoo() to return
Foo is now broken since instead it could just get Bar(). Since the type
return specification is meant, as far as I understand, exactly to avoid
cases like that, I don't see it as a good idea.

I understand, and this is the same argument LeviM has been making. (We just discussed on IRC in #hhvm extensively.)

Our view is that this code was already wrong. The type annotations don't make it more or less wrong. What they do let you do is enforce, once the annotation is in place, that it's correct moving forward for the places that are annotated. (And if you want to go fix up and annotate TwoFaceFooGetter, go do that! Or if you don't right now, don't!) Doing it this way also means that you don't have to convert your entire hierarchy all at once -- gradual conversion of existing code was an important feature when we were adding the Hack type system onto PHP.

The type 'self' is not late bound, unlike 'static' and exists in the language elsewhere; I am not introducing a new keyword 'self' in this RFC.

We discussed this on IRC and came to consensus that "self" is just fine, though not necessarily terribly useful if you think further than just enforcing the return type, to how that information might be propagated and used at the callsite -- for example by the Hack static typechecker, or even by a human. The type "static" tends to be what you want. But this isn't a huge deal -- "self" as proposed isn't wrong.

All we did was copy the existing structure in Parameter type hints; this was done for consistency. Is there any particular reason Hack/HHVM chose to do something entirely different for return types in regards to reflection? I don't particularly like the way type information is handled in Reflection anyway.

I don't know much about this or feel strongly one way or the other, just playing messenger.

Josh Watzman

11 years ago by Levi Morrison

I'm not sure it is a good idea. This means having this code:
class FooGetter {
    function getFoo(): Foo { return new Foo(); }
}

I can extend it as:
class TwoFaceFooGetter extends FooGetter {
    function getFoo() { return rand() % 2 ? new Bar() : new Foo(); }
}

and every function getting FooGetter and expecting getFoo() to return
Foo is now broken since instead it could just get Bar(). Since the type
return specification is meant, as far as I understand, exactly to avoid
cases like that, I don't see it as a good idea.

Our view is that this code was already wrong. The type annotations don't
make it more or less wrong. What they do let you do is enforce, once the
annotation is in place, that it's correct moving forward for the places
that are annotated. (And if you want to go fix up and annotate
TwoFaceFooGetter, go do that! Or if you don't right now, don't!) Doing it
this way also means that you don't have to convert your entire hierarchy
all at once -- gradual conversion of existing code was an important feature
when we were adding the Hack type system onto PHP.

The behavior you are proposing is good for migration and adoption, but what
then? What about the years afterwards where you now have this ability to
break the return type? It's much harder to remove this behavior than it is
to add it in later.

11 years ago by Josh Watzman

On May 9, 2014, at 2:37 PM, Levi Morrison <morrison.levi@gmail.com> wrote:

I'm not sure it is a good idea. This means having this code:
class FooGetter {
    function getFoo(): Foo { return new Foo(); }
}

I can extend it as:
class TwoFaceFooGetter extends FooGetter {
    function getFoo() { return rand() % 2 ? new Bar() : new Foo(); }
}

and every function getting FooGetter and expecting getFoo() to return
Foo is now broken since instead it could just get Bar(). Since the type
return specification is meant, as far as I understand, exactly to avoid
cases like that, I don't see it as a good idea.

The behavior you are proposing is good for migration and adoption, but what then? What about the years afterwards where you now have this ability to break the return type? It's much harder to remove this behavior than it is to add it in later.

Hack has "strict mode" (http://docs.hhvm.com/manual/en/hack.modes.strict.php) which simply doesn't let you omit an annotation anywhere, which you can use to "lock in" a fully annotated file. We were also just discussing that it might make sense to enforce strict mode for an entire hierarchy, though we don't have anything like this right now. New Hack code can be all strict and thus all type-safe.

I'm not sure if you'd want to actually add some notion of per-file "modes" like this, but it's been important to our type system adoption. Having some way to gradually type things was critical for us. The slightly higher level of safety you get from the stricter enforcement doesn't matter if the vast majority of folks (existing code) can't use it ;)

But as you say, it's a lot easier to loosen in the future; since this proposal is stricter than HHVM, I've said my piece and will get out of your way ;) I wish you'd change your mind here, but I still think the RFC as it stands is both a good step for PHP in and of itself, as well as a good step closer to Hack.

Josh Watzman

11 years ago by Stas Malyshev

Hi!

Our view is that this code was already wrong. The type annotations
don't make it more or less wrong. What they do let you do is enforce,

Of course it is wrong. The whole point of having strict typing is to
catch wrong code. If everybody would write only right code, we wouldn't
need any type checks - everything would be ok anyway (not correct for
some compiled languages as there types also tell the compiler how to
convert values to bits, but true for languages like PHP).

once the annotation is in place, that it's correct moving forward for
the places that are annotated. (And if you want to go fix up and

The whole point is that it won't be correct. In your model, when you use
typed function, essentially you know nothing about its return type, as
somebody could have overridden it with function returning anything. So
the only thing typing is useful for in your model is to document our
wishes about types. We already have that with @return.

your entire hierarchy all at once -- gradual conversion of existing
code was an important feature when we were adding the Hack type
system onto PHP.

My opinion is that we should not add features that do not make sense
conceptually. I have my issues with strict typing in PHP in general,
which I have voiced at length here on the list already, but at least I can
see some internal logic in the concept, even though I still think it's
not the best fit for dynamic languages (neither Python nor Ruby nor
JavaScript, to take the three most popular ones, have it). But at least,
to repeat, the internal logic of knowing the variable type past the check
is there. What you are proposing is essentially documentation-only: you
can never know when you call a method what you are actually getting,
unless you explicitly check the object type for an exact match (which
the current strict typing syntax doesn't even allow). This lacks even
internal logic, so I don't see how it makes sense to have such a concept
in PHP.

typechecker, or even by a human. The type "static" tends to be what
you want. But this isn't a huge deal -- "self" as proposed isn't
wrong.

That depends on the method - I can see cases for both. Consider this:
class Toy {
    static function ToyFactory($toy_class): self {
        return new $toy_class;
    }
}
The one with static would have different semantics than the one with self.
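
A sketch of how a `self` return type on such a factory behaves in PHP 7 and later, where return types are enforced at runtime. The `Robot` and `Rock` classes and the `tryFactory()` helper are hypothetical additions for illustration.

```php
<?php
class Toy
{
    // "self" here means Toy: any Toy or subclass of Toy satisfies it.
    public static function toyFactory(string $toyClass): self
    {
        return new $toyClass();
    }
}

class Robot extends Toy {}
class Rock {} // unrelated to Toy

// Hypothetical helper: reports the created class, or 'TypeError' when the
// returned object does not satisfy the declared return type.
function tryFactory(string $toyClass): string
{
    try {
        return get_class(Toy::toyFactory($toyClass));
    } catch (TypeError $e) {
        return 'TypeError';
    }
}

echo tryFactory('Robot'), "\n"; // Robot
echo tryFactory('Rock'), "\n";  // TypeError
```

With a `static` return type, which PHP only gained in 8.0, the check would be made against the class the method was called on rather than against Toy, so a call such as `Robot::toyFactory('Toy')` would be expected to fail where the `self` version accepts it.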

--
Stanislav Malyshev, Software Architect
SugarCRM: http://www.sugarcrm.com/
(408)454-6900 ext. 227

11 years ago by Larry Garfield

Hi!

Our view is that this code was already wrong. The type annotations
don't make it more or less wrong. What they do let you do is enforce,
Of course it is wrong. The whole point of having strict typing is to
catch wrong code. If everybody would write only right code, we wouldn't
need any type checks - everything would be ok anyway (not correct for
some compiled languages as there types also tell the compiler how to
convert values to bits, but true for languages like PHP).

once the annotation is in place, that it's correct moving forward for
the places that are annotated. (And if you want to go fix up and
The whole point is that it won't be correct. In your model, when you use
typed function, essentially you know nothing about its return type, as
somebody could have overridden it with function returning anything. So
the only thing typing is useful for in your model is to document our
wishes about types. We already have that with @return.

I'm inclined to agree with Stas and Levi here. If we want
return-type-suggestions, that already exists in docblocks and any
self-respecting IDE already takes advantage of that. If I use an actual
return type, it means I want any such broken code to fatal and die
(just as it would for a parameter type "hint"). That's how I know it's
Doing It Wrong(tm) so I can fix it. If that means I can't actually
commit the return type code until I fix the other code, so be it.
That's my incentive to fix the already broken code. :-)

--Larry Garfield

11 years ago by Lazare Inepologlou

Hello,

2014-05-12 2:53 GMT+02:00 Larry Garfield larry@garfieldtech.com:

Hi!

Our view is that this code was already wrong. The type annotations

don't make it more or less wrong. What they do let you do is enforce,

Of course it is wrong. The whole point of having strict typing is to
catch wrong code. If everybody would write only right code, we wouldn't
need any type checks - everything would be ok anyway (not correct for
some compiled languages as there types also tell the compiler how to
convert values to bits, but true for languages like PHP).

once the annotation is in place, that it's correct moving forward for

the places that are annotated. (And if you want to go fix up and

The whole point is that it won't be correct. In your model, when you use
typed function, essentially you know nothing about its return type, as
somebody could have overridden it with function returning anything. So
the only thing typing is useful for in your model is to document our
wishes about types. We already have that with @return.

I'm inclined to agree with Stas and Levi here. If we want
return-type-suggestions, that already exists in docblocks and any
self-respecting IDE already takes advantage of that. If I use an actual
return type, it means I want any such broken code to fatal and die (just
as it would for a parameter type "hint"). That's how I know it's Doing It
Wrong(tm) so I can fix it. If that means I can't actually commit the
return type code until I fix the other code, so be it. That's my incentive
to fix the already broken code. :-)

Docblocks are not the solution. If you want to benefit from static
analysis without the penalty of runtime type checking, Hack has shown the
way with its "soft type hints".

Lazare INEPOLOGLOU
Ingénieur Logiciel

11 years ago by Stas Malyshev

Hi!

Docblocks are not the solution. If you want to benefit from static
analysis without the penalty of runtime type checking, Hack has shown the
way with its "soft type hints".

What prevents you from doing static analysis using types specified in
docblocks?

--
Stanislav Malyshev, Software Architect
SugarCRM: http://www.sugarcrm.com/
(408)454-6900 ext. 227

11 years ago by Josh Watzman

Hi!

Docblocks are not the solution. If you want to benefit from static
analysis without the penalty of runtime type checking, Hack has shown the
way with its "soft type hints".

What prevents you from doing static analysis using types specified in
docblocks?

For context/background. Hack has deliberately resisted doing this since we consider the types to be an integral part of the language. Runtime enforcement is very important for this. You need something to make sure the static analysis of the type system is not just self-consistent but consistent with reality. Until pretty close to the open source release of Hack, HHVM didn't actually enforce return types. Facebook had to do quite a bit of cleanup here, since the return type enforcement uncovered a bunch of places where the types were self-consistent but not grounded in reality.

So in our opinion putting things in docblocks just compounds this problem and can be pretty dangerous if nothing is enforcing them at runtime.
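
The gap between a documented type and an enforced one can be sketched in plain PHP 7 or later. This is an illustration of the general point, not of Hack's typechecker; the function names are hypothetical.

```php
<?php
/** @return int */
function documentedOnly()
{
    // The docblock claims int, but nothing checks it at runtime.
    return 'forty-two';
}

function enforced(): int
{
    // The declared return type is checked when the function returns.
    return 'forty-two';
}

// Hypothetical helper: reports the runtime type of the result, or
// 'TypeError' if the engine rejected the returned value.
function describeCall(callable $fn): string
{
    try {
        return gettype($fn());
    } catch (TypeError $e) {
        return 'TypeError';
    }
}

echo describeCall('documentedOnly'), "\n"; // string
echo describeCall('enforced'), "\n";       // TypeError
```

A tool that trusted the docblock would treat both results as integers. Only the enforced declaration brings the mismatch to light when the code runs.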

So this may or may not be a good idea for your use case (I've honestly kind of lost where this branch of the email thread was going), but that's why we haven't done it for Hack.

Josh Watzman

11 years ago by Stas Malyshev

Hi!

For context/background. Hack has deliberately resisted doing this
since we consider the types to be an integral part of the language.

I understand this, and this is a completely valid way of thinking. But I
think it's a way, not "the way".

So in our opinion putting things in docblocks just compounds this
problem and can be pretty dangerous if nothing is enforcing them at
runtime.

Static analysis and runtime enforcement are different things. Of course,
you can do both and they reinforce each other, but I was talking about
specifically the point that claimed that "Hack has shown the way". There
are other ways to do static analysis (including annotations and type
derivation), and as far as I understood from the previous discussion,
Hack doesn't enforce types at runtime consistently either, due to
integration/BC considerations. So, in my opinion, switching from
annotations in docblocks to non-binding annotations in the language
doesn't really change the equation with regard to static analysis
capabilities.

So this may or may not be a good idea for your use case (I've
honestly kind of lost where this branch of the email thread was
going), but that's why we haven't done it for Hack.

Thank you for explaining it.

--
Stanislav Malyshev, Software Architect
SugarCRM: http://www.sugarcrm.com/
(408)454-6900 ext. 227

11 years ago by Lazare Inepologlou

Hello,

2014-05-09 23:32 GMT+02:00 Josh Watzman jwatzman@fb.com:

I'm not sure it is a good idea. This means having this code:
class FooGetter {
    function getFoo(): Foo { return new Foo(); }
}

I can extend it as:
class TwoFaceFooGetter extends FooGetter {
    function getFoo() { return rand() % 2 ? new Bar() : new Foo(); }
}

and every function getting FooGetter and expecting getFoo() to return
Foo is now broken since instead it could just get Bar(). Since the type
return specification is meant, as far as I understand, exactly to avoid
cases like that, I don't see it as a good idea.

I understand, and this is the same argument LeviM has been making. (We
just discussed on IRC in #hhvm extensively.)

Our view is that this code was already wrong. The type annotations don't
make it more or less wrong. What they do let you do is enforce, once the
annotation is in place, that it's correct moving forward for the places
that are annotated. (And if you want to go fix up and annotate
TwoFaceFooGetter, go do that! Or if you don't right now, don't!) Doing it
this way also means that you don't have to convert your entire hierarchy
all at once -- gradual conversion of existing code was an important feature
when we were adding the Hack type system onto PHP.

The type 'self' is not late bound, unlike 'static' and exists in the
language elsewhere; I am not introducing a new keyword 'self' in this RFC.

We discussed this on IRC and came to consensus that "self" is just fine,
though not necessarily terribly useful if you think further than just
enforcing the return type, to how that information might be propagated and
used at the callsite -- for example by the Hack static typechecker, or even
by a human. The type "static" tends to be what you want. But this isn't a
huge deal -- "self" as proposed isn't wrong.

As far as I understand, the keyword "self" was just inherited from argument
type hinting. The keyword "static" is not used there, so it was not carried
over.

Maybe "static" should be introduced to both cases, but in a separate RFC.

Lazare INEPOLOGLOU
Ingénieur Logiciel

11 years ago by Levi Morrison

Minor issues first:

First minor issue: you have a return type "self". This is an LSB type,

effectively the type of "new static". Levi, I remember discussing this with
you in IRC, but not the result of that discussion. What is the reason for
"self"? The Hack type system calls it "this", which I admit is a little
confusing since it is the type of more things than just the literal
variable "$this" ("new static" for example is also of type "this"). But the
type "static" is much more consistent with the existing LSB language, i.e.,
"new static". So is there any reason to call it "self" over "static" or
"this"?

The type 'self' is not late bound, unlike 'static' and exists in the
language elsewhere; I am not introducing a new keyword 'self' in this RFC.

Second minor issue: your reflection implementation is slightly different
from HHVM's. I'm told we call the method "getReturnTypeText", which returns
the hint text, or false if none exists. I don't know how strongly we feel
about this, but at the very least, I wanted to make sure you were aware
that you were definitely diverging from the existing implementation here,
and that it was done intentionally. (Also, I don't think you have any test
coverage of this feature?)

All we did was copy the existing structure in Parameter type hints; this
was done for consistency. Is there any particular reason Hack/HHVM chose to
do something entirely different for return types in regards to reflection?
I don't particularly like the way type information is handled in Reflection
anyway.
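
For comparison, here is a sketch of the reflection API that later shipped in PHP 7 (`hasReturnType()` and `getReturnType()`, with `ReflectionNamedType::getName()` from 7.1); it may differ from what the RFC patch exposed at the time of this thread. The `returnTypeText()` helper is hypothetical and only imitates the "hint text, or false if none exists" behaviour described for HHVM above.

```php
<?php
function typed(): array { return []; }
function untyped() { return []; }

// Hypothetical helper: returns the declared return type as text, or false
// when the function declares no return type.
function returnTypeText(string $functionName)
{
    $function = new ReflectionFunction($functionName);
    if (!$function->hasReturnType()) {
        return false;
    }
    $type = $function->getReturnType();
    // getName() is available on ReflectionNamedType (PHP 7.1+); other
    // ReflectionType subclasses are converted via their string form.
    return $type instanceof ReflectionNamedType
        ? $type->getName()
        : (string) $type;
}

var_dump(returnTypeText('typed'));   // string(5) "array"
var_dump(returnTypeText('untyped')); // bool(false)
```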

11 years ago by Marc Bennewitz

First minor issue: you have a return type "self". This is an LSB type, effectively the type of "new static". Levi, I remember discussing this with you in IRC, but not the result of that discussion. What is the reason for "self"? The Hack type system calls it "this", which I admit is a little confusing since it is the type of more things than just the literal variable "$this" ("new static" for example is also of type "this"). But the type "static" is much more consistent with the existing LSB language, i.e., "new static". So is there any reason to call it "self" over "static" or "this"?

The keyword "self" should be a valid return type and simply alias the
current class.

The keyword "static" should be a valid return type and simply alias the
last class in the inheritance chain, i.e. the class the method was called on.

Example 1: (fluent interface)
class Fluent {
    protected $property;
    public function setProperty($val) : static {
        $this->property = $val;
        return $this;
    }
}

Example 2: (extendable singleton)
class Singleton {
    protected static $inst;
    public static function getInstance() : static {
        static::$inst = static::$inst ?: new static();
        return static::$inst;
    }
}

"this" isn't a keyword, and from a reader's perspective it points to the
variable $this, but $this is an object instance and not a type. The type
of the $this object is the same class that "static" points to, as shown
in the fluent interface example.

Thanks for putting this together -- the FB team is really looking forward to it.
Josh Watzman

Marc