Hi internals,
I'd like to start the discussion on union types again, with a new proposal:
Pull Request: https://github.com/php/php-rfcs/pull/1
Rendered Proposal:
https://github.com/nikic/php-rfcs/blob/union-types/rfcs/0000-union-types-v2.md
As an experiment, I'm submitting this RFC as a GitHub pull request, to
evaluate whether this might be a better medium for RFC proposals in the
future. It would be great if we could keep the discussion to the GitHub
pull request for the purpose of this experiment (keep in mind that you can
also create comments on specific lines in the proposal, not just the
overall discussion thread!) Of course, you can also reply to this mail
instead. The final vote will be held in the wiki as usual.
Relatively to the previous proposal by Bob&Levi (
https://wiki.php.net/rfc/union_types), I think the main differences in this
proposal are:
- Updated to specify interaction with new language features, like full
variance and property types. - Updated for the use of the ?Type syntax rather than the Type|null syntax.
- Only supports "false" as a pseudo-type, not "true".
- Slightly simplified semantics for the coercive typing mode.
Regards,
Nikita
Hey Nikita,
Hi internals,
I'd like to start the discussion on union types again, with a new proposal:
Pull Request: https://github.com/php/php-rfcs/pull/1
Rendered Proposal:https://github.com/nikic/php-rfcs/blob/union-types/rfcs/0000-union-types-v2.md
As an experiment, I'm submitting this RFC as a GitHub pull request, to
evaluate whether this might be a better medium for RFC proposals in the
future. It would be great if we could keep the discussion to the GitHub
pull request for the purpose of this experiment (keep in mind that you can
also create comments on specific lines in the proposal, not just the
overall discussion thread!) Of course, you can also reply to this mail
instead. The final vote will be held in the wiki as usual.
Huge huge huge +1! Reviewing line-by-line allowed me to go much more in
detail with the feedback :-)
Marco Pivetta
On the subject of using GitHub for this RFC:
Personally, I think GitHub is a much better platform than the mailing
list for this kind of discussion. Mailing list threads are just not very
accessible to the average PHP user. Reading them through externals.io is
an OK experience, but actually contributing is unnecessarily complicated.
I'd imagine most people aren't really interested in something like the
recent "Annotating internal function argument and return types" thread,
but would very much like to participate in a discussion (even if it's
just leaving a "+1") about something that affects their day-to-day work,
like union types.
Generally, I'd split things up like this:
Truly internal work on PHP itself, things only people working on PHP
(as opposed to with PHP) really care about -> internals.
Changes affecting regular PHP users, RFCs for adding or removing
features and things like that -> GitHub or some other platform where
everyone can easily contribute.
Hi internals,
I'd like to start the discussion on union types again, with a new proposal:
Pull Request: https://github.com/php/php-rfcs/pull/1
Rendered Proposal:
https://github.com/nikic/php-rfcs/blob/union-types/rfcs/0000-union-types-v2.mdAs an experiment, I'm submitting this RFC as a GitHub pull request, to
evaluate whether this might be a better medium for RFC proposals in the
future. It would be great if we could keep the discussion to the GitHub
pull request for the purpose of this experiment (keep in mind that you can
also create comments on specific lines in the proposal, not just the
overall discussion thread!) Of course, you can also reply to this mail
instead. The final vote will be held in the wiki as usual.Relatively to the previous proposal by Bob&Levi (
https://wiki.php.net/rfc/union_types), I think the main differences in this
proposal are:
- Updated to specify interaction with new language features, like full
variance and property types.- Updated for the use of the ?Type syntax rather than the Type|null syntax.
- Only supports "false" as a pseudo-type, not "true".
- Slightly simplified semantics for the coercive typing mode.
Regards,
Nikita
https://github.com/nikic/php-rfcs/blob/union-types/rfcs/0000-union-types-v2.md
Thank you Nikita, this would be a hugely welcomed step forward! Can't wait
to get rid of all those docblock annotations!
I've some additional suggestions that would greatly help remove more
docblocks and provide engine-enforced type-safety, maybe for the "future
scope" section. We use the all in Symfony:
- we miss a stringable union type, for
string|__toString
. This is
required when an API need lazyness regarding the generation of some
strings. Right now, we have no other option but using string|object. - we use "@return $this" quite often. On the implementation side, I'd
suggest enforcing this at compile time only (enforce all return points are
explicit, and written as "return $this", similarly to what is done for
nullable/void return types.) This makes sense only for return types. - we use "@return static" quite often too, typically when a method
returns a clone of the current instance. If anyone wonders, this is not the
same as "@return self" of methods overridden in child classes. This makes
sense only for return types.
About union types with void in them, we do use one in Symfony:
Command::execute() has "@return int|void". The reason is that if we were to
use "?int" instead, the engine would force the community to add "return
null;" where no return statement is needed at all most of the time. Right
now, we consider that the transition cost for the community is not worth
the extra boilerplate this requires. Note that there would be only one
friendly path forward: trigger a deprecation when null is returned, asking
ppl to add "return 0;". Not sure how this should impact the proposal, but
I thought it could be worth sharing.
Thanks again,
Nicolas
-
Agree on the usefulness of a stringable meta-type.
-
Hack supports an explicit “: this” return type (without dollar) when returning “new static(...)”. I think I might prefer that to “: static”.
-
From a type perspective, I don’t understand the “int|void” idea - it might make your users’ life easier, but doesn’t accord with how PHP works (which treats void as null to consumers).
-
if we’re adding to some future wish list, would love to have support for “: noreturn” when a function always throws or exits
https://github.com/nikic/php-rfcs/blob/union-types/rfcs/0000-union-types-v2.md
Thank you Nikita, this would be a hugely welcomed step forward! Can't wait
to get rid of all those docblock annotations!I've some additional suggestions that would greatly help remove more
docblocks and provide engine-enforced type-safety, maybe for the "future
scope" section. We use the all in Symfony:
- we miss a stringable union type, for
string|__toString
. This is
required when an API need lazyness regarding the generation of some
strings. Right now, we have no other option but using string|object.- we use "@return $this" quite often. On the implementation side, I'd
suggest enforcing this at compile time only (enforce all return points are
explicit, and written as "return $this", similarly to what is done for
nullable/void return types.) This makes sense only for return types.- we use "@return static" quite often too, typically when a method
returns a clone of the current instance. If anyone wonders, this is not the
same as "@return self" of methods overridden in child classes. This makes
sense only for return types.About union types with void in them, we do use one in Symfony:
Command::execute() has "@return int|void". The reason is that if we were to
use "?int" instead, the engine would force the community to add "return
null;" where no return statement is needed at all most of the time. Right
now, we consider that the transition cost for the community is not worth
the extra boilerplate this requires. Note that there would be only one
friendly path forward: trigger a deprecation when null is returned, asking
ppl to add "return 0;". Not sure how this should impact the proposal, but
I thought it could be worth sharing.Thanks again,
Nicolas
Instead of using __toString
as type maybe it's better to introduce a Stringable
interface, similar to how __wakeup
and __sleep
are already superseded by Serializable
. Of course __toString
would still continue to work (for bc), but isn't usable for type hinting.
Arnold Daniels - Chat @ Spike [5acb3]
On Wed, 4 Sep 2019 at 12:30, Arnold Daniels arnold.adaniels.nl@gmail.com
wrote:
Instead of using
__toString
as type maybe it's better to introduce a
Stringable
interface, similar to how__wakeup
and__sleep
are already
superseded bySerializable
.
I support that. I don't like the naming in string|__toString
.
string|Stringable
is more readable IMO.
Peter
On Wed, 4 Sep 2019 at 10:59, Nicolas Grekas
nicolas.grekas+php@gmail.com wrote:
we use "@return $this" quite often. On the implementation side, I'd
suggest enforcing this at compile time only (enforce all return points are
explicit, and written as "return $this"
That's doing a deeper level of thing than a type system normal does.
It's enforcing the internal behaviour of a function, rather than just
defining the type of the parameter that is returned.
if we were to use "?int" instead, the engine would force the
community to add "return null;"
That sounds like a bug to me. The fact that null is returned by any
function that lacks an explicit return value, is well-defined, and one
of the most underrated aspects of PHP imo.
Arnold Daniels wrote:
Instead of using
__toString
as type maybe it's better to introduce
aStringable
interface,
Although casting things to string is probably the most common use
case, if you're going to think about an RFC along those lines,
covering all of the scalars would probably be a good idea, as that
would allow people to use specific types for values, that can then be
passed easily to functions that expect a scalar.
function setRetryLimit(int $maxRetries) {...}
class ImageUploadRetryLimit extends int {...}
function processImage(ImageUploadRetryLimit $iurl, ....) {
...
setRetryLimit($iurl);
...
}
That type* of stuff is completely possible currently in PHP, it's just
that it's a bit painful to both write and read.
cheers
Dan
Ack
*intended
Le 4 sept. 2019 à 13:54, Dan Ackroyd danack@basereality.com a écrit :
if we were to use "?int" instead, the engine would force the
community to add "return null;"That sounds like a bug to me. The fact that null is returned by any
function that lacks an explicit return value, is well-defined, and one
of the most underrated aspects of PHP imo.
This is an aspect of the eternal debate between explicit vs. implicit. In another recent thread, another aspect, namely variable initialisation, was debated. There is no “correct” solution, as it pertains much to coding style.
—Claude
Le mercredi 4 septembre 2019, 10:26:09 CEST Nikita Popov a écrit :
As an experiment, I'm submitting this RFC as a GitHub pull request, to
evaluate whether this might be a better medium for RFC proposals in the
future. It would be great if we could keep the discussion to the GitHub
pull request for the purpose of this experiment (keep in mind that you can
also create comments on specific lines in the proposal, not just the
overall discussion thread!) Of course, you can also reply to this mail
instead. The final vote will be held in the wiki as usual.
Huge "no" from me on using github for discussing RFCs.
Huge "yes" on the RFC content.
- Only supports "false" as a pseudo-type, not "true".
I think it would make sense to support true as well as I’ve seen a lot of cases of functions returning either an error string or TRUE
on success.
So that would be string|true.
Côme
Hi Côme
Huge "no" from me on using github for discussing RFCs.
Care to elaborate why? The majority seems to like it. Though I am also curious about Nikita's experience with it, as he is the one having to process the feedback.
Kind regards
Brent
Le mercredi 4 septembre 2019, 10:26:09 CEST Nikita Popov a écrit :
As an experiment, I'm submitting this RFC as a GitHub pull request, to
evaluate whether this might be a better medium for RFC proposals in the
future. It would be great if we could keep the discussion to the GitHub
pull request for the purpose of this experiment (keep in mind that you can
also create comments on specific lines in the proposal, not just the
overall discussion thread!) Of course, you can also reply to this mail
instead. The final vote will be held in the wiki as usual.Huge "no" from me on using github for discussing RFCs.
Huge "yes" on the RFC content.
- Only supports "false" as a pseudo-type, not "true".
I think it would make sense to support true as well as I’ve seen a lot of cases of functions returning either an error string or
TRUE
on success.
So that would be string|true.Côme
Le jeudi 5 septembre 2019, 12:04:55 CEST Brent a écrit :
Huge "no" from me on using github for discussing RFCs.
Care to elaborate why? The majority seems to like it. Though I am also curious about Nikita's experience with it, as he is the one having to process the feedback.
Because the PHP project should avoid depending on a privately owned centralized service for its technical discussions, and should not encourage (some would say force) people to use such platforms.
PHP is already on github but it’s only a mirror, the main git repository is at git.php.net .
Côme
Because the PHP project should avoid depending on a privately owned
centralized service for its technical discussions
This is obviously your opinion, but you haven't actually told us why this
is the case, and it's not at all obvious.
should not encourage (some would say force) people to use such platforms.
In any case it's not a choice for the contributor, internals chooses the
medium and the contributor has to use it. Whether we force them to use a
mailing list from last century or something from this century makes no
difference with regard to choice for the contributor.
Cheers
Joe
Le jeudi 5 septembre 2019, 12:04:55 CEST Brent a écrit :
Huge "no" from me on using github for discussing RFCs.
Care to elaborate why? The majority seems to like it. Though I am also
curious about Nikita's experience with it, as he is the one having to
process the feedback.Because the PHP project should avoid depending on a privately owned
centralized service for its technical discussions, and should not encourage
(some would say force) people to use such platforms.PHP is already on github but it’s only a mirror, the main git repository
is at git.php.net .Côme
Le jeudi 5 septembre 2019, 12:50:48 CEST Joe Watkins a écrit :
Because the PHP project should avoid depending on a privately owned centralized service for its technical discussions
This is obviously your opinion, but you haven't actually told us why this
is the case, and it's not at all obvious.
I thought it was obvious, sorry.
So, the problem is PHP have no control over github, and cannot know how it will evolve.
GAFAM organizations have closed people accounts before, sometimes without giving reasons.
I think PHP project should not let other organisations choose who can or cannot participate in PHP development.
should not encourage (some would say force) people to use such platforms.
In any case it's not a choice for the contributor, internals chooses the
medium and the contributor has to use it. Whether we force them to use a
mailing list from last century or something from this century makes no
difference with regard to choice for the contributor.
With a mailing list the contributor can choose its email server and email client. He does not have to agree to term of service of a third party.
https://tosdr.org/#github
With github, internal chooses to delegate to github, and github (currently microsoft) chooses who can take part in the discussion and how the discussion works. (they can change the platform tomorrow and internal won’t be able to do anything about it).
Côme
PHP have no control over github, and cannot know how it will evolve.
(they can change the platform tomorrow and internal won’t be able to do anything about it).
Those are hypothetically problems. But they do not appear to be
currently problems.
I'm pretty sure that if new problems with a medium were encountered,
we could adapt to either work around them or move to a different
system.
And in case anyone says "some people might not be able to comment on
Github" the same is true for our email lists. The signup process was
apparently broken for ages, and I've seen multiple people ask for how
to persuade the system to accept their messages. Which probably means
there are more people who never contributed because they couldn't get
past that first barrier.
Actual problems I can see with having the discussion on github:
i) There is no off-topic space. For example, apparently some people
don't understand the RFC and could do with a brief explainer on type
systems. Doing that inline to the github comments would make the
on-topic discussion harder to read.
ii) It means that the discussion is harder to track. However.....that
is already a problem. When I was putting together the info for
https://github.com/danack/RfcCodex which attempts to document why
certain ideas that keep coming up haven't succeeded yet, it was a
massive pain trying to track email threads to the RFC.
Both of those things are not really technical problems. They are
documentation problems. They would be best solved (imo) by having a
paid member of staff on the PHP project writing lots of words.
cheers
Dan
Ack
One idea could be to use a self hosted GitLab instance,
I'm pretty sure there are multiple ways via OAuth to connect to an
independent GitLab instance.
This would allow to have PR like thread on PHP's own infrastructure (even
though it seems the project is pretty bad at keeping it's infrastructure up
to date) while keeping control over it.
Best regards
George Peter Banyard
Let's name a few examples of large OSS projects managed on GitHub:
- ECMA (https://github.com/tc39)
- Rust (https://github.com/rust-lang/rust)
- React (https://github.com/facebook/react)
- Node (https://github.com/nodejs/node)
Also, let's not forget the hunderd of thousands of PHP packages hosted on GitHub. While GitHub might in theory one day decide to stop, there is no way it will happen in practice, GitHub has proven itself as a reliable platform over the past ten years.
On the topic of self hosted GitLab: let's not reinvent the wheel. Managing your own platform will take more manpower and resources, which are better invested somewhere else — the development of PHP perhaps?
I believe GitHub is the way to go. Several large communities manage their OSS on it and have proven it works, PHP should simply do the same.
Kind regards
Brent
One idea could be to use a self hosted GitLab instance,
I'm pretty sure there are multiple ways via OAuth to connect to an
independent GitLab instance.This would allow to have PR like thread on PHP's own infrastructure (even
though it seems the project is pretty bad at keeping it's infrastructure up
to date) while keeping control over it.Best regards
George Peter Banyard
I believe GitHub is the way to go. Several large communities manage their
OSS on it and have proven it works, PHP should simply do the same.
I think this is just as simplistic as saying "never". What are these
communities using it for, and what would we want to use it for? Are our
requirements the same as theirs, and is GitHub the best solution for those
requirements?
For instance:
- Rust does not use GitHub as its primary co-ordination mechanism, it has
an online forum at https://internals.rust-lang.org/ - The ECMA TC39 committee has a specific membership structure and holds
regular in-person meetings - There are undoubtedly more open-source communities not using GitHub
than who are using it
To be clear, I'm not saying these are reasons against GitHub in themselves,
but it's a rather huge leap from "here are four repos I found" to "GitHub
is the way to go"; we should be making specific arguments for why it will
meet our needs, and evaluating it among all the alternatives.
Regards,
Rowan Tommins
[IMSoP]
I believe GitHub is the way to go. Several large communities manage their OSS on it and have proven it works, PHP should simply do the same.
At the risk of giving advice, you will find conversations are far more
productive if you ask why something can't be done, rather than just
stating it will be simple.
Not only will that elicit more useful information to you, it avoids
being subtly insulting, as you're implying that something will be easy
and people are being stupid for not doing it*.
In this particular case you could have asked "what would be the
problems with moving the build systems to github?", and the answers
would include:
-
PHP has karma (aka permissions) system which github could not
support. I don't know how that could be solved/avoided. -
There is very strong reluctance to be dependent on other people's
infrastructure for things that could take a long time to migrate. e.g.
we can move discussions from one medium to another, by just telling
people to go talk over there. But for actual software CI, it's a big
deal to move from one system to another.
btw it's not obvious that the projects you linked actually have their
build systems integrated with Github. I'm pretty sure you're making
assumptions about how simple their systems are.
cheers
Dan
Ack
- in pithier form: https://signalvnoise.com/posts/439-four-letter-words
Le jeudi 5 septembre 2019, 13:07:28 CEST Dan Ackroyd a écrit :
PHP have no control over github, and cannot know how it will evolve.
(they can change the platform tomorrow and internal won’t be able to do anything about it).
Those are hypothetically problems. But they do not appear to be
currently problems.
The fact that PHP has no control over github is current, this is not hypothetical.
And in case anyone says "some people might not be able to comment on
Github" the same is true for our email lists. The signup process was
apparently broken for ages, and I've seen multiple people ask for how
to persuade the system to accept their messages. Which probably means
there are more people who never contributed because they couldn't get
past that first barrier.
It’s not the same when the project can act to fix it and when the project is powerless.
If github blocks someone from commenting we cannot do anything about it.
PHP have no control over github, and cannot know how it will evolve.
(they can change the platform tomorrow and internal won’t be able to
do anything about it).Those are hypothetically problems. But they do not appear to be
currently problems.The fact that PHP has no control over github is current, this is not
hypothetical.
The idea that the platform will change overnight in a way that makes it
unusable by the project is hypothetical.
It’s not the same when the project can act to fix it and when the project
is powerless.
If github blocks someone from commenting we cannot do anything about it.
Are you aware of any heavy-handed moderation on github, or is this, again,
a hypothetical problem?
As you will see from my other responses on this thread, I'm not totally
sold on github in particular, but I can see pros and cons more generally:
- our own systems, fully in our control, but used by nobody else, and
managed by a handful of volunteers - or: a well-established third-party system, which could change in
unpredictable ways, but is widely used, and supported by hundreds of paid
staff
Even the mailing list relies on third-party software; I presume it gets
updated regularly, and those updates could include changes in functionality
we disagree with. There is a pragmatic decision to be made between building
absolutely everything from scratch, and trusting some third parties, with
contingency plans if that trust proves ill-founded.
Regards,
Rowan Tommins
[IMSoP]
On Tue, Sep 10, 2019 at 11:02 AM Rowan Tommins rowan.collins@gmail.com
wrote:
It’s not the same when the project can act to fix it and when the project
is powerless.
If github blocks someone from commenting we cannot do anything about it.Are you aware of any heavy-handed moderation on github, or is this, again,
a hypothetical problem?
As much as I like Github for these kind of things, we're forgetting about a
critical part here; The US trade restrictions. Github being a company in
the US, is required to block certain access to users from certain countries.
Regards,
Lynn van der Berg
Are you aware of any heavy-handed moderation on github, or is this, again,
a hypothetical problem?
As much as I like Github for these kind of things, we're forgetting about
a critical part here; The US trade restrictions. Github being a company in
the US, is required to block certain access to users from certain countries.
That's a reasonable point; that (and the inverse: governments blocking
access to github in response to some perceived offence) would be a
potential issue to weigh up against the risks of running our own
infrastructure.
Regards,
Rowan Tommins
[IMSoP]
Den tir. 10. sep. 2019 kl. 13.14 skrev Rowan Tommins rowan.collins@gmail.com:
That's a reasonable point; that (and the inverse: governments blocking
access to github in response to some perceived offence) would be a
potential issue to weigh up against the risks of running our own
infrastructure.
We are already under the US law when it comes to distribution of
software as our binaries (like Windows) which contains encryption
software for export since we potentially allow countries which the US
have an embargo with, such as Iran (or more recently Venezuela). If I
remember correctly its partly the reason why PHP is registered as an
entity in the US to comply with EAR. How this is dealt with under the
hood of the project and all of that I have no idea, but even as it
is, we are not currently operating in a noman's land and already live
with restrictions imposed by the governments of the world.
--
regards,
Kalle Sommer Nielsen
kalle@php.net
Because the PHP project should avoid depending on a privately owned
centralized service for its technical discussionsThis is obviously your opinion, but you haven't actually told us why this
is the case, and it's not at all obvious.should not encourage (some would say force) people to use such platforms.
In any case it's not a choice for the contributor, internals chooses the
medium and the contributor has to use it. Whether we force them to use a
mailing list from last century or something from this century makes no
difference with regard to choice for the contributor.
I am not worrying when one uses it for draft. However anything after that
should happen in the wiki and on our git as it is the correct process.
I really like github, or gitlab, however int he current context, almost all
contributors may lose access to github (or other US based) companies based
on the US government policies or directives. Without starting a political
discussion whether or not this is valid, it is definitely a big risk. A
risk I am not really willing to take for php itself, if I may say.
best,
Le jeudi 5 septembre 2019, 12:04:55 CEST Brent a écrit :
Huge "no" from me on using github for discussing RFCs.
Care to elaborate why? The majority seems to like it. Though I am also
curious about Nikita's experience with it, as he is the one having to
process the feedback.Because the PHP project should avoid depending on a privately owned
centralized service for its technical discussions, and should not encourage
(some would say force) people to use such platforms.PHP is already on github but it’s only a mirror, the main git repository
is at git.php.net .
The "privately owned" and "centralized" parts don't bother me particularly,
but there's potentially an issue in splitting the discussion between
multiple platforms, with different logins required. An example of this is
the discussion on this RFC about type aliases - Nikita requested it to be
split into a separate discussion, but the people involved may not be
subscribed to this list, and if they are, it's hard to maintain context
when jumping between different forums.
That conversation also highlighted a limitation of the particular platform:
inline comments on GitHub PRs show as threads, but comments on the whole PR
don't, so that interleaved discussions are hard to follow. Admittedly,
that's true on a lot of e-mail clients as well (thanks to GMail
popularising "conversations" rather than "threads"), but at least views
like externals.io and news.php.net can let you navigate the tree.
I wonder if a hybrid approach would work better - the RFC is a PR (perhaps
against the language spec repo, as Andrea suggested) but the main
discussion stays on the list. Suggestions to improve the RFC itself could
be made inline on the PR by anyone who wanted to, but non-inline PR
comments would be heavily discouraged so that wider comments on the
proposal would stay here.
Either way, I think it's interesting to experiment with different ways of
working, and maybe there are other platforms we should trial as well.
Regards,
Rowan Tommins
[IMSoP]
but at least views
like externals.io and news.php.net can let you navigate the tree.
The lack of a full tree-like structure isn't the worst thing in the
world. If only because it discourages certain types of individual from
wanting to reply to every single sub-branch individually to get the
final word.
Something I am finding hard on Github, and maybe it's just because I
haven't found the option yet, is finding new posts.
Mark Randall
Hello,
Le jeudi 5 septembre 2019, 12:04:55 CEST Brent a écrit :
Huge "no" from me on using github for discussing RFCs.
Care to elaborate why? The majority seems to like it. Though I am also curious about Nikita's experience with it, as he is the one having to process the feedback.
Because the PHP project should avoid depending on a privately owned centralized service for its technical discussions, and should not encourage (some would say force) people to use such platforms.
PHP is already on github but it’s only a mirror, the main git repository is at git.php.net .
Côme
If I may put few random thoughts here. GitHub usage is inevitable. The
interface is so good with clear discussion and review options that it
would be really worthy to check it out for all PHP RFCS in the future.
The main worry here is basically that one day GitHub will go offline
and that discussion will be lost. Repository however will stay in the
Git repo and will be "timeless". Very rarely one will want to look at
the old (several 10 years) discussion comments. This is not a problem
because even with having an email archive online very rarely someone
will return to such discussion. RFC content is there and will be there
for PHP to move "elsewhere" though if such hypothetical case comes.
Tank you.
--
Peter Kokot
GitHub usage is inevitable.
Did you use the wrong word here, or are you saying that, of all the
hundreds of different platforms we could investigate, there is no chance
that we would end up using something other than github?
The interface is so good with clear discussion and review options
As my previous message, and those of several other people, show, that is
far from an established consensus. The power of an e-mail list is that
different users can use different interfaces - I've yet to see a forum
suggested that I would find easier than Thunderbird's tree view. There are
certainly downsides to e-mail, and upsides to GitHub, but let's stay calm
and evaluate our options rather than jumping at the first thing we see.
Regards,
Rowan Tommins
[IMSoP]
GitHub usage is inevitable.
Did you use the wrong word here, or are you saying that, of all the
hundreds of different platforms we could investigate, there is no chance
that we would end up using something other than github?
Plastic analogy - adding "127.0.0.1 github.com" to your /etc/hosts
file shows that developer cannot bring most of the today's (PHP)
projects to any working state without using it. That's what is meant
by inevitable because everything open source today is either on GitHub
and some minor ones scattered around custom Git repos and other Git
hosting providers. PHP is already using GitHub. Is it moving to
something else? No, so let's not complicate things more with other
hosting providers now.
USA political issues and embargo on some countries are indeed a reason
I'm also willing to accept that PHP won't be using GitHub otherwise.
For other reasons presented here, none is convincing enough to me
honestly.
--
Peter Kokot
Plastic analogy - adding "127.0.0.1 github.com" to your /etc/hosts
file shows that developer cannot bring most of the today's (PHP)
projects to any working state without using it. That's what is meant
by inevitable because everything open source today is either on GitHub
and some minor ones scattered around custom Git repos and other Git
hosting providers.
Ah, I see. Yes, having some usage of GitHub is currently pretty much
inevitable in that sense. Of course, that may change eventually, just as
SourceForge fell out of favour, but that's not something we need to worry
about.
However, projects over a certain size generally don't use it as their
main or only discussion platform, which is what we're talking about here.
PHP is already using GitHub. Is it moving to
something else? No, so let's not complicate things more with other
hosting providers now.
The question is not "should PHP ban the use of GitHub for any kind of
activity?" it's "should PHP abandon the discussion processes it's been
using for most of its history and use GitHub as a discussion forum?".
As a code collaboration platform, GitHub is pretty good, but it's not built
as a discussion forum, and there are plenty of limitations to using it as
one.
Regards,
Rowan Tommins
[IMSoP]
As a code collaboration platform, GitHub is pretty good, but it's not built
as a discussion forum, and there are plenty of limitations to using it as
one.
Could we work on agreeing on a set of requirements for such a discussion
"forum" to replace mailing lists? That would make it easier for anyone
wanting to give it a shot, to come up with a first version that has a
chance to convince everyone that this is the direction we want to follow.
Using GitHub Issues as a starting point, what would you change?
— Benjamin
On Fri, 6 Sep 2019 at 14:14, Benjamin Morel benjamin.morel@gmail.com
wrote:
As a code collaboration platform, GitHub is pretty good, but it's not built
as a discussion forum, and there are plenty of limitations to using it as
one.Could we work on agreeing on a set of requirements for such a discussion
"forum" to replace mailing lists?
That could be an interesting exercise, yes. Ideally, we should consider RFC
drafting, voting, and bug tracking as well - not because we have to replace
all of them with one platform, but because we might want to divide things
up differently from how we do at the moment.
Using GitHub Issues as a starting point, what would you change?
That's pretty much the opposite of your previous question. For one thing,
it's unanswerable without knowing the scope - e.g. would it just be for
RFCs, or all discussions?
Besides that, if we're going to introduce an anchor that we compare
everything to, surely we should say "using the setup we have as a starting
point, what would you change?"
Until we know what we're looking for, I'm really not clear why GitHub
issues should have any starting advantage over Discourse, or PHPBB, or
Trac, or Phabricator, or Bugzilla, or probably hundreds of suggestions we
could evaluate.
Regards,
Rowan Tommins
[IMSoP]
That's pretty much the opposite of your previous question. For one thing,
it's unanswerable without knowing the scope - e.g. would it just be for
RFCs, or all discussions?
I'm thinking about a generic "forum" for all discussions that happen on the
mailing lists right now, something that could be used for internals but
also for other PHP mailing lists.
Then, its scope can be expanded specifically for internals, to better
discuss RFCs, etc., but that's not what I had in mind right now.
Until we know what we're looking for, I'm really not clear why GitHub
issues should have any starting advantage over Discourse, or PHPBB, or
Trac, or Phabricator, or Bugzilla, or probably hundreds of suggestions we
could evaluate.
I chose GitHub because it was mentioned several times in this thread,
because it's already used to discuss PRs, and because I suspect pretty much
everyone on this list either uses GitHub on a daily basis, or has at
least some
experience with GitHub issues (let's face it, I google stuff every day for
many open-source projects, and most of the discussions I stumble upon are
on GitHub issues/pulls), so at least we have a starting point that
everyone knows and has learned to love or hate. Now the whole point is, if
you think another software does things better, please share!
Now obviously, should you hate GitHub issues from start to finish, then
indeed I can understand you consider this starting point a poor choice, in
this case I'd be interested to know what you dislike so much!
— Benjamin
Den tor. 5. sep. 2019 kl. 13.22 skrev Côme Chilliet come@opensides.be:
Because the PHP project should avoid depending on a privately owned centralized service for its technical discussions, and should not encourage (some would say force) people to use such platforms.
PHP is already on github but it’s only a mirror, the main git repository is at git.php.net .
As an old timer around here, I feel very strongly about moving the
medium and I prefer to be on the PHP.net infrastructure. Clearly one
of our biggest issues with that as the PHP organization is that we
poorly maintain it, and I think it could be time to rather invest into
renewing that effort instead. It seems like many have an issue with
subscribing to internals (I know it was broken for the longest time by
using the webform), but that is something we can telegraph better on
the php.net website for one thing and try to put resources into
figuring this problem out to gain momentum for more developers to join
the effort of internals development.
Using Github for PRs and relevant discussions for that is perfectly
fine with me, but switching to Github for RFCs is a big -1 from me, it
is really difficult to read new comments if you are not email
subscribed and even then it still remains hard to follow. The
individual moderation required to also sort out irrelevant comments is
also one thing I personally would not want to deal with either.
--
regards,
Kalle Sommer Nielsen
kalle@php.net
Here are my own thoughts on how the pull request discussion for union types
went...
I think the main takeaway for me is that inline comments (on specific lines
in the RFC) were really invaluable. Each comment thread discussed a
specific issue and most of them have resulted in a direct improvement to
the RFC.
Generally there was a lot of discussion of specific technical details that
we very rarely see in RFC discussions. Current RFC discussions on the
mailing list tend to be rather high level (which is fine in itself), with
nobody ever discussing the details (which is very bad).
Thinking back to https://wiki.php.net/rfc/engine_warnings, I think that RFC
could have really benefited from this discussion medium. While the mailing
list discussion ended up talking circles around more or less one single
question (undefined variables), pretty much none of the other parts of the
RFC have seen so much as a comment. I'm sure that there would be a lot more
discussion regarding specific classifications if this went up as a pull
request.
Another nice thing is that it's possible to mark a comment thread as
resolved, once the RFC has been adjusted to address the comments. That way
you don't have to see issues that were already addressed (though you can if
you like).
Having thumbs-up and thumbs-down reactions to comments was also helpful to
judge whether some comment represents a minority opinion or not, something
that is notoriously hard with current mailing list discussions (which are
almost dominated by "negative" opinions which mysteriously don't show up in
voting results).
However, while the inline comments were pleasantly insightful, the same
cannot be said for the top-level comments on the pull request. The majority
of them was borderline off-topic. While some in principle interesting
discussion happened there, it simply didn't belong in the RFC thread for
union types. The top-level comments also suffered from a lack of threading
-- I would have been less bothered about tangential discussions if they
were threaded. (To be fair: I use gmail, so I don't get threading on the
mailing list either.)
If this kind of discussion behavior is representative, then I would suggest
a workflow alone the following lines...
- RFCs are submitted as PRs on GitHub, but must be announced on the mailing
list. - The PR description should have a fat warning that top-level comments
belong on the mailing list. We can mark all top-level comments on PRs as
"off-topic" as a matter of general policy. - Top-level commentary stays on the mailing list.
This is a shift from what I originally had in mind (completely moving the
RFC process to GitHub), towards providing a way for more detailed and
specific feedback on the RFC text.
Regarding GitHub as a 3rd party. I think there are a few things to
considered:
- We're already very heavily reliant on GitHub. Most of my day-to-day
interaction with PHP core development is via GitHub and most of the
day-to-day decisions also happen there. Only the major stuff everhits this
mailing list. - The RFC repo would of course be hosted on git.php.net as usual and only
be mirrored to GitHub. - GitHub would not be the exclusive venue for RFC discussion. All RFCs are
still announced on internals and the top-level discussion can and should
still happen here.
Disclaimer: I'm trying to draw conclusions here from an experiment with a
sample size of 1, which may not be representative. Union types are a pretty
significant proposal (and also the first one to be on GH), and other,
smaller proposals might well have different discussion dynamics.
Regards,
Nikita
Le jeudi 5 septembre 2019, 12:04:55 CEST Brent a écrit :
Huge "no" from me on using github for discussing RFCs.
Care to elaborate why? The majority seems to like it. Though I am also
curious about Nikita's experience with it, as he is the one having to
process the feedback.Because the PHP project should avoid depending on a privately owned
centralized service for its technical discussions, and should not encourage
(some would say force) people to use such platforms.PHP is already on github but it’s only a mirror, the main git repository
is at git.php.net .Côme
Happy to read your thoughts on this Nikita, I think you've drawn some good conclusions.
If I may add a thought or two:
I wouldn't make any final decisions based on one experiment, especially a big RFC as this one. I think the GitHub discussion got extra attention because it was the first one, and because of the scope of the RFC. I would try to have two or three more RFCs discussed on GitHub, maybe smaller ones?
Second, there are more things we can do in order to keep the main thread on topic. Three things come to mind:
- We could add community guidelines, clearly stating that RFC comments should stay on topic
- People could be appointed to moderate the comments, allowing contributors to focus on the code instead of community management
- Conversations on GitHub can be locked as a last measurement. Repository members can still comment.
I fear that separating the main discussion from the PR will cause unnecessary confusion: important, generals remarks could be made on the "main thread", and I think there's value in keeping these remarks together with everything else.
Kind regards
Brent
Here are my own thoughts on how the pull request discussion for union types
went...I think the main takeaway for me is that inline comments (on specific lines
in the RFC) were really invaluable. Each comment thread discussed a
specific issue and most of them have resulted in a direct improvement to
the RFC.Generally there was a lot of discussion of specific technical details that
we very rarely see in RFC discussions. Current RFC discussions on the
mailing list tend to be rather high level (which is fine in itself), with
nobody ever discussing the details (which is very bad).Thinking back to https://wiki.php.net/rfc/engine_warnings, I think that RFC
could have really benefited from this discussion medium. While the mailing
list discussion ended up talking circles around more or less one single
question (undefined variables), pretty much none of the other parts of the
RFC have seen so much as a comment. I'm sure that there would be a lot more
discussion regarding specific classifications if this went up as a pull
request.Another nice thing is that it's possible to mark a comment thread as
resolved, once the RFC has been adjusted to address the comments. That way
you don't have to see issues that were already addressed (though you can if
you like).Having thumbs-up and thumbs-down reactions to comments was also helpful to
judge whether some comment represents a minority opinion or not, something
that is notoriously hard with current mailing list discussions (which are
almost dominated by "negative" opinions which mysteriously don't show up in
voting results).However, while the inline comments were pleasantly insightful, the same
cannot be said for the top-level comments on the pull request. The majority
of them was borderline off-topic. While some in principle interesting
discussion happened there, it simply didn't belong in the RFC thread for
union types. The top-level comments also suffered from a lack of threading
-- I would have been less bothered about tangential discussions if they
were threaded. (To be fair: I use gmail, so I don't get threading on the
mailing list either.)If this kind of discussion behavior is representative, then I would suggest
a workflow alone the following lines...
- RFCs are submitted as PRs on GitHub, but must be announced on the mailing
list.- The PR description should have a fat warning that top-level comments
belong on the mailing list. We can mark all top-level comments on PRs as
"off-topic" as a matter of general policy.- Top-level commentary stays on the mailing list.
This is a shift from what I originally had in mind (completely moving the
RFC process to GitHub), towards providing a way for more detailed and
specific feedback on the RFC text.Regarding GitHub as a 3rd party. I think there are a few things to
considered:
- We're already very heavily reliant on GitHub. Most of my day-to-day
interaction with PHP core development is via GitHub and most of the
day-to-day decisions also happen there. Only the major stuff everhits this
mailing list.- The RFC repo would of course be hosted on git.php.net as usual and only
be mirrored to GitHub.- GitHub would not be the exclusive venue for RFC discussion. All RFCs are
still announced on internals and the top-level discussion can and should
still happen here.Disclaimer: I'm trying to draw conclusions here from an experiment with a
sample size of 1, which may not be representative. Union types are a pretty
significant proposal (and also the first one to be on GH), and other,
smaller proposals might well have different discussion dynamics.Regards,
NikitaLe jeudi 5 septembre 2019, 12:04:55 CEST Brent a écrit :
Huge "no" from me on using github for discussing RFCs.
Care to elaborate why? The majority seems to like it. Though I am also
curious about Nikita's experience with it, as he is the one having to
process the feedback.Because the PHP project should avoid depending on a privately owned
centralized service for its technical discussions, and should not encourage
(some would say force) people to use such platforms.PHP is already on github but it’s only a mirror, the main git repository
is at git.php.net .Côme
- We could add community guidelines, clearly stating that RFC comments
should stay on topic
- People could be appointed to moderate the comments, allowing
contributors to focus on the code instead of community management
- Conversations on GitHub can be locked as a last measurement.
Repository members can still comment.I fear that separating the main discussion from the PR will cause
unnecessary confusion: important, generals remarks could be made on the
"main thread", and I think there's value in keeping these remarks
together with everything else.
I'm sceptical of that as a solution for two reasons:
Firstly, the conversations weren't necessarily wrong, they were just a slight drift of topic. The problem is not removing them from the PR, it's encouraging them to move somewhere else. I fear that saying "sign up to the mailing list and repeat that point in a completely different format" will be taken up less than "make a new thread on this same list/forum".
Secondly, the problem is partly a technical one: GitHub PRs have very poor support for replies and sub-threads, so even on-topic discussions that don't relate to a specific part of the text are hard to follow.
I think Nikita's suggestion is a good one: use a PR for making targeted suggestions to the RFC text itself, but raise the general points on the main list. That might even include saying "I've added a handful of suggestions relating to X" and discussing the wider issue that links them.
I agree it would be interesting to experiment further, and I think this hybrid approach would be a good one to try next.
Regards,
--
Rowan Tommins
[IMSoP]
Le vendredi 6 septembre 2019, 16:47:52 CEST Nikita Popov a écrit :
- GitHub would not be the exclusive venue for RFC discussion. All RFCs are
still announced on internals and the top-level discussion can and should
still happen here.
My remark on the mailing list regarding string|true was unanswered and I had to go over to github to see that the concern was discussed there.
This is the kind of problems splitting the discussion will cause, people will have to check both places or miss things.
Regarding the github PR as a practical point of view, I find it hard to check if there are new messages since last time I visited the page. Is there any way to browse messages by time? (backwards ideally).
Côme
Hi internals,
I'd like to start the discussion on union types again, with a new proposal:
Pull Request: https://github.com/php/php-rfcs/pull/1
Rendered Proposal:
https://github.com/nikic/php-rfcs/blob/union-types/rfcs/0000-union-types-v2.mdAs an experiment, I'm submitting this RFC as a GitHub pull request, to
evaluate whether this might be a better medium for RFC proposals in the
future. It would be great if we could keep the discussion to the GitHub
pull request for the purpose of this experiment (keep in mind that you can
also create comments on specific lines in the proposal, not just the
overall discussion thread!) Of course, you can also reply to this mail
instead. The final vote will be held in the wiki as usual.Relatively to the previous proposal by Bob&Levi (
https://wiki.php.net/rfc/union_types), I think the main differences in
this proposal are:
- Updated to specify interaction with new language features, like full
variance and property types.- Updated for the use of the ?Type syntax rather than the Type|null
syntax.- Only supports "false" as a pseudo-type, not "true".
- Slightly simplified semantics for the coercive typing mode.
Regards,
Nikita
Heads up, two weeks have passed, so this may now go to voting...
I believe relative to my original draft the main change that has happened
as a result of the discussion is the use of T1|T2|null syntax instead of
?(T1|T2) syntax for nullable types. ?T becomes an alias for T|null. People
felt fairly strongly that while ?T is a nice shorthand for a common case,
unions should use the T1|T2|null syntax that both reads better and is
already well-established from phpdoc.
I figured I should mentioned this for people who haven't been following the
GitHub thread...
Nikita
Den 2019-09-18 kl. 15:33, skrev Nikita Popov:
Hi internals,
I'd like to start the discussion on union types again, with a new proposal:
Pull Request: https://github.com/php/php-rfcs/pull/1
Rendered Proposal:
https://github.com/nikic/php-rfcs/blob/union-types/rfcs/0000-union-types-v2.mdAs an experiment, I'm submitting this RFC as a GitHub pull request, to
evaluate whether this might be a better medium for RFC proposals in the
future. It would be great if we could keep the discussion to the GitHub
pull request for the purpose of this experiment (keep in mind that you can
also create comments on specific lines in the proposal, not just the
overall discussion thread!) Of course, you can also reply to this mail
instead. The final vote will be held in the wiki as usual.Relatively to the previous proposal by Bob&Levi (
https://wiki.php.net/rfc/union_types), I think the main differences in
this proposal are:
- Updated to specify interaction with new language features, like full
variance and property types.- Updated for the use of the ?Type syntax rather than the Type|null
syntax.- Only supports "false" as a pseudo-type, not "true".
- Slightly simplified semantics for the coercive typing mode.
Regards,
NikitaHeads up, two weeks have passed, so this may now go to voting...
I believe relative to my original draft the main change that has happened
as a result of the discussion is the use of T1|T2|null syntax instead of
?(T1|T2) syntax for nullable types. ?T becomes an alias for T|null. People
felt fairly strongly that while ?T is a nice shorthand for a common case,
unions should use the T1|T2|null syntax that both reads better and is
already well-established from phpdoc.I figured I should mentioned this for people who haven't been following the
GitHub thread...Nikita
Hi Nikita,
Planned to comment on ? vs null on Github, but here it goes. Advantage
with ? syntax was in my eyes that it's clear it's not a stand-alone type or
what I should call it, but rather a "change/variant" to an existing type.
Having it stated as T1|T2|null lead it a bit in the direction of being a
more
stand-alone type. Anyway, I'm quite happy with the proposal as it is. So
thanks for the excellent work!
r//Björn L
Hi Nikita,
Thank you for your work on this, I look forward to being able to use union
types!
The one thing that bothers me a bit is introducing yet another
inconsistency by supporting false and not true as a type. Sure, I do get
the need for false as a valid type for historical reasons, but at the same
time I thought it would be interesting to see if true was being used as
well in the wild, before closing the door on it.
Therefore I reviewed all Composer packages with > 1 million downloads (1211
packages total), and here are the results:
31 packages
https://gist.github.com/BenMorel/56dfff576b1d8a719cd71b14ef614d2e#file-packages-txt
use at least one @var, @param or @return containing true as a type.
-
there are many illegitimate use cases, such as @return true
https://github.com/drupal/core/blob/c447ebb6a89149ef86e0af06bc00fec63027ec0d/modules/views/src/ViewExecutable.php#L719
, @return bool|true
https://github.com/php-http/message/blob/144d13ba024d0c597830636907144bc7c23f50dd/src/Authentication/AutoBasicAuth.php#L23,
@return true|false
https://github.com/zendframework/zend-form/blob/ff4c9ec32d141e4ac622c2aa40a8c9eb77a40725/src/Annotation/AnnotationBuilder.php#L412,
or even @return bool|true|false
https://github.com/slevomat/coding-standard/blob/964a3ff08d7ee924510c3b705f826165adaa97fe/tests/Sniffs/Namespaces/data/fullyQualifiedClassNameInAnnotationNoErrors.php#L41 -
there are, however, quite a few potentially legitimate use cases. Among
others: -
@return true|WP_Error
https://github.com/johnpbloch/wordpress-core/blob/4a96ef28019f3f529465eac71c718afea2c88e8b/wp-includes/rest-api/class-wp-rest-request.php#L641
(johnpbloch/wordpress-core) : "True if the JSON data was passed or no
JSON data was provided, WP_Error if invalid JSON was passed." -
@return true|string
https://github.com/johnpbloch/wordpress-core/blob/bd27b2bc49483b796fdb228f9b0614f9880823ee/wp-includes/ms-load.php#L72
(johnpbloch/wordpress-core) : "Returns true on success, or drop-in file
to include." -
@return string|true
https://github.com/PHP-DI/PHP-DI/blob/b891248f04c594dae4c07650bc0815d0dc81bacb/src/Compiler/Compiler.php#L317
(php/di) : True if compilable, string explaining why if not -
@return string|true
https://github.com/zendframework/zend-authentication/blob/86d6553e1cfe5ed2bde53cbea9fd162c6875a9c8/src/Adapter/Ldap.php#L378
(zendframework/zend-authentication) : True if successfull, error message
if not -
@param int|true
https://github.com/yiisoft/yii2/blob/fdbe45af3a4f923a97b718d8694981e1e195403d/framework/db/Query.php#L1266
(yiisoft/yii2) : "Use 0 to indicate that..., Use a negative number to
indicate that..., Use booleantrue
to indicate that ..." -
@param callable|resource|string|true|null
https://github.com/symfony/symfony/blob/242f24427d34bd47ef0e3a027a5a4e979d713c6f/src/Symfony/Component/VarDumper/Dumper/AbstractDumper.php#L117
(symfony/symfony) : "A line dumper callable, an opened stream, an output
path or true to return the dump"
Full list here
https://gist.github.com/BenMorel/56dfff576b1d8a719cd71b14ef614d2e#file-types-txt
.
So although true as a type does not have the same historical background as
false, it does seem that it's being used by enough packages to be worth
considering; what do you think?
— Benjamin
To have any information content, a variable (or return value, or
parameter) needs to have more than one possible value.
If one of those values is TRUE, the most natural alternative value
would be FALSE.
But should this really be baked into the language?
Should really "try to be smart" here?
Isn't it better to aim for completeness and consistency, and provide a
"round" developer experience with the least amount of surprise?
The same could be asked about other restrictions, e.g. false|null.
Ideally we should have a type system with the properties of a
mathematical space of some sort.
Artificial restrictions would ruin this.
It won't be perfect, because the different types are not "symmetric".
But we can at least try to make it as consistent as possible.
Some use cases I can think of:
Use case: true[] assoc
I like to use true[] for map-like associative arrays.
So, $map[$key] = true;
For each key, the array either has the value TRUE, or no value at all.
No need to store false values.
An extended use case might be a (tree = true|tree[]), where each leaf
would have the value true.
Does this mean we need true for parameters and return types?
It will be rare. Especially in the first case of a simple true[],
there is no point returning the value in a variable, if we not also
allow FALSE
for "not found".
Use case: Generics
However, what if the code is written in a generic way? E.g. if PHP
gets native generics one day, or some userland generics library based
on code generation.
For an iterator over true[], the return value for ->current() would be
"true|null". (I just tried, it is null and not false, see
https://3v4l.org/PvWWl).
For a generic method receiving values from the array, the parameter
type would be "true".
It would be really disappointing if a tool that attempts to give us
generics would have to work around arbitrary limitations of the typing
system.
Use case: Variance
Imagine a base class or interface method which returns boolean for
success / failure.
We extend this method, and our implementation is always successful, so
we will always return true.
We can indicate this in the type hint.
This indicates that calling code does not need to check the return
value, it can simply assume that it was successful.
interface Operation {
function run(): bool
}
interface SafeOperation extends Operation {
function run(): true // always successful.
}
Use case: Code generation and reflection tools
I would say that for any automated tool, special cases of disallowed
types make things more complicated.
This could be tools for code generation, parsing, or some kind of
"type reflection" with type arithmetics.
Use case: Legacy code
Finally, in some cases it is irrelevant whether the combination of
types makes sense or not.
We simply want to type-harden legacy code.
Hi Nikita,
Thank you for your work on this, I look forward to being able to use union
types!The one thing that bothers me a bit is introducing yet another
inconsistency by supporting false and not true as a type. Sure, I do get
the need for false as a valid type for historical reasons, but at the same
time I thought it would be interesting to see if true was being used as
well in the wild, before closing the door on it.Therefore I reviewed all Composer packages with > 1 million downloads (1211
packages total), and here are the results:31 packages
https://gist.github.com/BenMorel/56dfff576b1d8a719cd71b14ef614d2e#file-packages-txt
use at least one @var, @param or @return containing true as a type.
there are many illegitimate use cases, such as @return true
https://github.com/drupal/core/blob/c447ebb6a89149ef86e0af06bc00fec63027ec0d/modules/views/src/ViewExecutable.php#L719
, @return bool|true
https://github.com/php-http/message/blob/144d13ba024d0c597830636907144bc7c23f50dd/src/Authentication/AutoBasicAuth.php#L23,
@return true|false
https://github.com/zendframework/zend-form/blob/ff4c9ec32d141e4ac622c2aa40a8c9eb77a40725/src/Annotation/AnnotationBuilder.php#L412,
or even @return bool|true|false
https://github.com/slevomat/coding-standard/blob/964a3ff08d7ee924510c3b705f826165adaa97fe/tests/Sniffs/Namespaces/data/fullyQualifiedClassNameInAnnotationNoErrors.php#L41there are, however, quite a few potentially legitimate use cases. Among
others:@return true|WP_Error
https://github.com/johnpbloch/wordpress-core/blob/4a96ef28019f3f529465eac71c718afea2c88e8b/wp-includes/rest-api/class-wp-rest-request.php#L641
(johnpbloch/wordpress-core) : "True if the JSON data was passed or no
JSON data was provided, WP_Error if invalid JSON was passed."@return true|string
https://github.com/johnpbloch/wordpress-core/blob/bd27b2bc49483b796fdb228f9b0614f9880823ee/wp-includes/ms-load.php#L72
(johnpbloch/wordpress-core) : "Returns true on success, or drop-in file
to include."@return string|true
https://github.com/PHP-DI/PHP-DI/blob/b891248f04c594dae4c07650bc0815d0dc81bacb/src/Compiler/Compiler.php#L317
(php/di) : True if compilable, string explaining why if not@return string|true
https://github.com/zendframework/zend-authentication/blob/86d6553e1cfe5ed2bde53cbea9fd162c6875a9c8/src/Adapter/Ldap.php#L378
(zendframework/zend-authentication) : True if successfull, error message
if not@param int|true
https://github.com/yiisoft/yii2/blob/fdbe45af3a4f923a97b718d8694981e1e195403d/framework/db/Query.php#L1266
(yiisoft/yii2) : "Use 0 to indicate that..., Use a negative number to
indicate that..., Use booleantrue
to indicate that ..."@param callable|resource|string|true|null
https://github.com/symfony/symfony/blob/242f24427d34bd47ef0e3a027a5a4e979d713c6f/src/Symfony/Component/VarDumper/Dumper/AbstractDumper.php#L117
(symfony/symfony) : "A line dumper callable, an opened stream, an output
path or true to return the dump"Full list here
https://gist.github.com/BenMorel/56dfff576b1d8a719cd71b14ef614d2e#file-types-txt
.So although true as a type does not have the same historical background as
false, it does seem that it's being used by enough packages to be worth
considering; what do you think?— Benjamin
Le 23 sept. 2019 à 22:14, Benjamin Morel benjamin.morel@gmail.com a écrit :
So although true as a type does not have the same historical background as
false, it does seem that it's being used by enough packages to be worth
considering; what do you think?
Considering to support true
will raise several questions. They can be resolved, of course, but that will put more design decisions over the shoulders of the Union Types RFC:
-
The first one, of course, is why the list of possible literal values is still restricted. Because now you’ll want to use other literal values, e.g.,
0
or"some-string"
. -
Another issue is the relation between
Foo | false | true
andFoo | bool
: whether the first form is allowed; whether they are equivalent w.r.t. variance rules; how casting totrue | false
works (or doesn’t work) underdeclare(strict_types=0)
. -
Also, if we support more than two literal values as types (and especially when not all of them indicate failure), we’ll want more strongly to have them to appear as standalone types, e.g.,
true | null
; and then, naturally, justtrue
and justnull
. At this point, the question will be raised whether it is desirable to have bothnull
andvoid
as return types.
The choice of supporting precisely the two literal values null
and false
is not arbitrary: They are the two values that are the most often used as sentinel values (for indicating failure or absence). It is true that true
is also sometimes used as sentinel value (more rarely and not among the internal functions), but the same can be said of other literal values (one of your examples includes 0
).
—Claude
On Tue, Sep 24, 2019 at 12:24 PM Claude Pache claude.pache@gmail.com
wrote:
The choice of supporting precisely the two literal values
null
and
false
is not arbitrary: They are the two values that are the most often used as
sentinel values (for indicating failure or absence). It is true that
true
is
also sometimes used as sentinel value (more rarely and not among the
internal functions), but the same can be said of other literal values
(one of your examples includes0
).
While I personally think false
makes sense as an allowed "type", I also
don't want to see the union types RFC get held up on such a tiny detail.
I would propose either of the following alternatives:
1/ Remove false
from the proposal. It can always be added at a later
time, but not taken away.
2/ Make this detail a sub-vote. I would suggest that this sub-vote should
also be subject to a 2/3 majority in order to pass.
-Sara
On Tue, Sep 24, 2019 at 12:24 PM Claude Pache claude.pache@gmail.com
wrote:The choice of supporting precisely the two literal values
null
and
false
is not arbitrary: They are the two values that are the most often used as
sentinel values (for indicating failure or absence). It is true that
true
is
also sometimes used as sentinel value (more rarely and not among the
internal functions), but the same can be said of other literal values
(one of your examples includes0
).While I personally think
false
makes sense as an allowed "type", I also
don't want to see the union types RFC get held up on such a tiny detail.I would propose either of the following alternatives:
1/ Removefalse
from the proposal. It can always be added at a later
time, but not taken away.
2/ Make this detail a sub-vote. I would suggest that this sub-vote should
also be subject to a 2/3 majority in order to pass.-Sara
I would tend to agree with Sara. That seems to be the only issue of contention and it is (AFAIK) reasonably straightforward to add "later" without blocking the rest of Union types. It would mean some functions wouldn't be able to get a fully accurate return type yet but... they've survived this long without them, they can wait a bit longer while this gets settled and/or some even more robust alternative is found.
--Larry Garfield
I would tend to agree with Sara. That seems to be the only issue of
contention and it is (AFAIK) reasonably straightforward to add "later"
without blocking the rest of Union types. It would mean some functions
wouldn't be able to get a fully accurate return type yet but... they've
survived this long without them, they can wait a bit longer while this gets
settled and/or some even more robust alternative is found.
Note that I'm personally alright with only bool as a type (i.e. strpos()
:
int|bool). I was just pointing out that introducing false alone is adding
yet another inconsistency to the language, and it may not need this right
now.
If everything is to be voted at once, I'd suggest a split vote as follows:
- vote 1: introduce union types (base proposal with bool only) : yes / no
- vote 2: add false/true types: false only / false and true / no
— Benjamin
On Tue, Sep 24, 2019 at 12:24 PM Claude Pache claude.pache@gmail.com
wrote:The choice of supporting precisely the two literal values
null
and
false
is not arbitrary: They are the two values that are the most often used as
sentinel values (for indicating failure or absence). It is true that
true
is
also sometimes used as sentinel value (more rarely and not among the
internal functions), but the same can be said of other literal values
(one of your examples includes0
).While I personally think
false
makes sense as an allowed "type", I also
don't want to see the union types RFC get held up on such a tiny detail.I would propose either of the following alternatives:
1/ Removefalse
from the proposal. It can always be added at a later
time, but not taken away.
2/ Make this detail a sub-vote. I would suggest that this sub-vote should
also be subject to a 2/3 majority in order to pass.-Sara
This RFC is currently held up by a lack of implementation. Once that is
done, the RFC will go forward as-is (barring any novel concerns). Because I
consider it an important part of the overall proposal (*), I will neither
remove the false type, nor split it into a separate vote. People may vote
against the whole RFC if they disagree with this aspect, or any other
aspect of the proposal.
Regards,
Nikita
(*) While certainly not the primary reason for why we should support union
types, the reason why I brought this proposal forward at this time
specifically, is that the lack of union types is a blocker for my pet
project of providing comprehensive type annotations for internal functions.
Supporting "false" is strictly necessary for this purpose, because it is
part of nearly all unions as far as internal functions are concerned.
This RFC is currently held up by a lack of implementation. Once that is
done, the RFC will go forward as-is (barring any novel concerns). Because I
consider it an important part of the overall proposal (*), I will neither
remove the false type, nor split it into a separate vote. People may vote
against the whole RFC if they disagree with this aspect, or any other
aspect of the proposal.Regards,
Nikita(*) While certainly not the primary reason for why we should support union
types, the reason why I brought this proposal forward at this time
specifically, is that the lack of union types is a blocker for my pet
project of providing comprehensive type annotations for internal functions.
Supporting "false" is strictly necessary for this purpose, because it is
part of nearly all unions as far as internal functions are concerned.
Hi,
What if a 'new' type is added specifically to identify this case? Instead
of giving it int|false
, it could be int|failed
, where failed
would
allow a single value only: false. I like it more than naming the return
type false
. I'm not sure this type should be declarable in user-land,
meaning it's limited to the core and a marker for legacy. failed
(or name
it whatever) and false
would still be identical, yet has a more distinct
identification and does not imply there could also a type called true
.
Figured I'd toss in here as an out-of-the-box idea.
Regards,
Lynn van der Berg
Le 26 sept. 2019 à 11:21, Lynn kjarli@gmail.com a écrit :
I'm not sure this type should be declarable in user-land, meaning it's limited to the core and a marker for legacy.
In general, we should avoid to give builtin code functionality that can’t be reproduced in userland, unless there is a good reason for it. For the particular case, people may want to reproduce in userland an improved or corrected version of a given internal function, and they need be able to give their function the same return type information than the original.
—Claude
This RFC is currently held up by a lack of implementation. Once that is
done, the RFC will go forward as-is (barring any novel concerns). Because I
consider it an important part of the overall proposal (*), I will neither
remove the false type, nor split it into a separate vote. People may vote
against the whole RFC if they disagree with this aspect, or any other
aspect of the proposal.(*) While certainly not the primary reason for why we should support union
types, the reason why I brought this proposal forward at this time
specifically, is that the lack of union types is a blocker for my pet
project of providing comprehensive type annotations for internal functions.
Supporting "false" is strictly necessary for this purpose, because it is
part of nearly all unions as far as internal functions are concerned.
Maybe an option would be to introduce "falsable" types by marking them
with an exclamation mark in front? While I'm generally not in favor of
having failure signalled by returning false (except for bool functions),
and it wouldn't solve the issue generally (some functions can return
multiple types in addition to false), it appears to fit to be able to
mark nullable types with a question mark.
By the way, I wouldn't call that a "pet project". In my opinion, that
is a really big improvement, and I hope that these stub files can become
the single source of truth in the not too distant future.
Regards,
Christoph
While certainly not the primary reason for why we should support union
types, the reason why I brought this proposal forward at this time
specifically, is that the lack of union types is a blocker for my pet
project of providing comprehensive type annotations for internal functions.
Supporting "false" is strictly necessary for this purpose, because it is
part of nearly all unions as far as internal functions are concerned.
How is that blocked by adding only "bool"?
— Benjamin
Den 2019-09-26 kl. 10:06, skrev Nikita Popov:
On Tue, Sep 24, 2019 at 12:24 PM Claude Pache claude.pache@gmail.com
wrote:The choice of supporting precisely the two literal values
null
and
false
is not arbitrary: They are the two values that are the most often used as
sentinel values (for indicating failure or absence). It is true that
true
is
also sometimes used as sentinel value (more rarely and not among the
internal functions), but the same can be said of other literal values
(one of your examples includes0
).While I personally think
false
makes sense as an allowed "type", I also
don't want to see the union types RFC get held up on such a tiny detail.I would propose either of the following alternatives:
1/ Removefalse
from the proposal. It can always be added at a later
time, but not taken away.
2/ Make this detail a sub-vote. I would suggest that this sub-vote should
also be subject to a 2/3 majority in order to pass.-Sara
This RFC is currently held up by a lack of implementation. Once that is
done, the RFC will go forward as-is (barring any novel concerns). Because I
consider it an important part of the overall proposal (*), I will neither
remove the false type, nor split it into a separate vote. People may vote
against the whole RFC if they disagree with this aspect, or any other
aspect of the proposal.Regards,
Nikita(*) While certainly not the primary reason for why we should support union
types, the reason why I brought this proposal forward at this time
specifically, is that the lack of union types is a blocker for my pet
project of providing comprehensive type annotations for internal functions.
Supporting "false" is strictly necessary for this purpose, because it is
part of nearly all unions as far as internal functions are concerned.
Hi Nikita,
Given the feedback on 23/9 from B Morel regarding
occurrence of true as a return value, would you then
consider adding true as a valid return type in unions?
r//Björn L
Hi internals,
I'd like to start the discussion on union types again, with a new proposal:
Pull Request: https://github.com/php/php-rfcs/pull/1
Rendered Proposal:
https://github.com/nikic/php-rfcs/blob/union-types/rfcs/0000-union-types-v2.mdAs an experiment, I'm submitting this RFC as a GitHub pull request, to
evaluate whether this might be a better medium for RFC proposals in the
future. It would be great if we could keep the discussion to the GitHub
pull request for the purpose of this experiment (keep in mind that you can
also create comments on specific lines in the proposal, not just the
overall discussion thread!) Of course, you can also reply to this mail
instead. The final vote will be held in the wiki as usual.Relatively to the previous proposal by Bob&Levi (
https://wiki.php.net/rfc/union_types), I think the main differences in
this proposal are:
- Updated to specify interaction with new language features, like full
variance and property types.- Updated for the use of the ?Type syntax rather than the Type|null
syntax.- Only supports "false" as a pseudo-type, not "true".
- Slightly simplified semantics for the coercive typing mode.
Regards,
Nikita
An implementation of this proposal is now available at
https://github.com/php/php-src/pull/4838. As the implementation didn't turn
up any new issues, I think it's time to move this RFC forward to voting.
Nikita
Hi Nikita,
Can you please give me one/two days, before starting the voting, for implementation review (at least until October 25),
Thanks. Dmitry.
Hi Nikita,
I checked the Union Type implementation, and it more or less good. I mean, it implements the RFC in almost the best way.
However, as I don't like the RFC itself. Especially, unions of multiple classes and interference with type variance, I'll vote against this.
Actually, I think PHP already took wrong direction implementing "typed references" and "type variance".
Introducing more "typing", we suggest using it, but this "typing" comes with a huge cost of run-time checks.
From 10% (scalar type hint of argument) to 3 times (typed reference assignment) performance degradation.
Anyone may rerun my benchmarks https://gist.github.com/dstogov/fb32023e8dd55e58312ae0e5029556a9
Thanks. Dmitry.
Hi Dmitry,
Actually, I think PHP already took wrong direction implementing "typed
references" and "type variance".
Introducing more "typing", we suggest using it, but this "typing" comes
with a huge cost of run-time checks.
From 10% (scalar type hint of argument) to 3 times (typed reference
assignment) performance degradation.
Anyone may rerun my benchmarks
https://gist.github.com/dstogov/fb32023e8dd55e58312ae0e5029556a9
I think this performance impact is a real concern; PHP is the only language
I know of which implements type checks entirely at run-time in production
code, and we should ask ourselves if that's definitely the right approach.
Would it be possible, at least in principle, to build a static analysis
tool which could prove that certain type checks would never fail, and prime
OpCache with code that leaves them out? As I understand it, this is how
Dart works: the compiler only emits run-time checks for assertions it can't
prove true by static analysis.
The simpler idea I had in this area was caching what type checks a value
had passed, either on each zval or perhaps just at the class level, so that
checking the same type again would be much faster, even if it was a complex
union with multiple interfaces.
Regards,
Rowan Tommins
[IMSoP]
The simpler idea I had in this area was caching what type checks a value
had passed, either on each zval or perhaps just at the class level, so that
checking the same type again would be much faster, even if it was a complex
union with multiple interfaces.
Last night I was playing about with caching the last passed interface as
part of the instanceof_function.
https://gist.github.com/marandall/38d7dba6600889d897f2c8fc57532f31
When used in the likes of loops which are performing the same check over
and over, it yields about a 18% increase when using a 3 layer deep
nested interface.
It currently just stores a cache of the last match in the class entry so
it would hit as many uses as possible.
For T_INSTANCEOF
it could potentially use an extended cache slot to
store a reference to it (would need expanding out into a separate struct
as the cache slot is already used for converting the static name to a ce).
I'm less clear about how we could handle it for everything else, like
parameter type validation. The underlying implementation already seems
pretty well optimized so I'm not sure where a performance penalty is to
be found.
--
Mark Randall
I think this performance impact is a real concern; PHP is the only language
I know of which implements type checks entirely at run-time in production
code, and we should ask ourselves if that's definitely the right approach.
As they are runtime checks, would an ini setting where they can be
completely disabled be feasible? So during development and in production
when the performance decrease doesn't matter, I can have the full runtime
type checking. But when absolutely needed, the checking can be disabled.
Peter
As they are runtime checks, would an ini setting where they can be
completely disabled be feasible? So during development and in production
when the performance decrease doesn't matter, I can have the full runtime
type checking. But when absolutely needed, the checking can be disabled.
Note that I would personally never disable these checks in production, as
they may prevent potential bugs down the road, that would not have
necessarily been caught in dev.
I would rather expect PHP to bring down the cost of these checks in
production to a negligible level, as mentioned in my previous message
(static analysis and JIT).
— Benjamin
On Fri, 25 Oct 2019 at 10:09, Peter Bowyer phpmailinglists@gmail.com
wrote:
On Thu, 24 Oct 2019 at 13:47, Rowan Tommins rowan.collins@gmail.com
wrote:I think this performance impact is a real concern; PHP is the only
language
I know of which implements type checks entirely at run-time in production
code, and we should ask ourselves if that's definitely the right
approach.As they are runtime checks, would an ini setting where they can be
completely disabled be feasible? So during development and in production
when the performance decrease doesn't matter, I can have the full runtime
type checking. But when absolutely needed, the checking can be disabled.Peter
I think this performance impact is a real concern; PHP is the only language
I know of which implements type checks entirely at run-time in production
code, and we should ask ourselves if that's definitely the right approach.
Would it be possible, at least in principle, to build a static analysis
tool which could prove that certain type checks would never fail, and prime
OpCache with code that leaves them out? As I understand it, this is how
Dart works: the compiler only emits run-time checks for assertions it can't
prove true by static analysis.
I was wondering the same thing, especially in the light of preloading,
where a lot of class/function definitions are known on startup.
For example, doing:
function x(): int {}
function y(int $foo) {}
y(x());
Should definitely not redundantly check the type of the y() parameter, if
possible.
Also, in the case of JIT, shouldn't type hints actually improve
performance?
— Benjamin
Hi Dmitry,
Actually, I think PHP already took wrong direction implementing "typed
references" and "type variance".
Introducing more "typing", we suggest using it, but this "typing" comes
with a huge cost of run-time checks.
From 10% (scalar type hint of argument) to 3 times (typed reference
assignment) performance degradation.
Anyone may rerun my benchmarks
https://gist.github.com/dstogov/fb32023e8dd55e58312ae0e5029556a9I think this performance impact is a real concern; PHP is the only language
I know of which implements type checks entirely at run-time in production
code, and we should ask ourselves if that's definitely the right approach.Would it be possible, at least in principle, to build a static analysis
tool which could prove that certain type checks would never fail, and prime
OpCache with code that leaves them out? As I understand it, this is how
Dart works: the compiler only emits run-time checks for assertions it can't
prove true by static analysis.The simpler idea I had in this area was caching what type checks a value
had passed, either on each zval or perhaps just at the class level, so that
checking the same type again would be much faster, even if it was a complex
union with multiple interfaces.Regards,
Rowan Tommins
[IMSoP]
Hi Nikita,
I checked the Union Type implementation, and it more or less good. I mean,
it implements the RFC in almost the best way.
However, as I don't like the RFC itself. Especially, unions of multiple
classes and interference with type variance, I'll vote against this.Actually, I think PHP already took wrong direction implementing "typed
references" and "type variance".
Introducing more "typing", we suggest using it, but this "typing" comes
with a huge cost of run-time checks.
From 10% (scalar type hint of argument) to 3 times (typed reference
assignment) performance degradation.
Anyone may rerun my benchmarks
https://gist.github.com/dstogov/fb32023e8dd55e58312ae0e5029556a9Thanks. Dmitry.
For reference, here are the results I get with/without JIT:
https://gist.github.com/nikic/2a2d363fffaa3aeb251da976f0edbc33
I think that union types don't really change much in terms of the
performance calculus of type checking, because type checking is equally
fast (or slow) for a single scalar type, and 5 different scalar types: they
are all handled in a single bit check. The only case that is indeed
somewhat slow is if multiple class types are used in the union, because we
actually have to check each one until we find a match. I do hope that
having unions with a large number of classes will not be common.
Generally, this area could use some more optimization from the
implementation side. I spent a bit of time yesterday optimizing how we
perform instanceof checks (the implementation was quite inefficient,
especially for interfaces), which had a large positive impact on class type
checks. There's more low hanging fruit like this, for example
verify_return_type has no JIT implementation yet (which is why with JIT
simple arg type checks have nearly zero overhead, while return type checks
have a large overhead). Assignments to typed properties are also badly
optimized, because everything is punted to the slow path, while we should
be able to handle the simple cases with just one extra bit check, without
going through the complex implementation that deals with things like weak
typing coercions.
I do think that performance considerations are important when it comes to
new typing features (which is why, for example, we have categorically
rejected a "typed arrays" implementation that has to check all array
elements), but don't see union types are particular problematic in that
regard, beyond what we already have.
Nikita
From: Dmitry Stogov dmitry@zend.com
Sent: Tuesday, October 22, 2019 15:38
To: Nikita Popov nikita.ppv@gmail.com; PHP internals <
internals@lists.php.net>
Subject: Re: [PHP-DEV] Re: [RFC] Union Types v2Hi Nikita,
Can you please give me one/two days, before starting the voting, for
implementation review (at least until October 25),Thanks. Dmitry.
From: Nikita Popov nikita.ppv@gmail.com
Sent: Tuesday, October 22, 2019 12:36
To: PHP internals internals@lists.php.net
Subject: [PHP-DEV] Re: [RFC] Union Types v2Hi internals,
I'd like to start the discussion on union types again, with a new
proposal:Pull Request: https://github.com/php/php-rfcs/pull/1
Rendered Proposal:https://github.com/nikic/php-rfcs/blob/union-types/rfcs/0000-union-types-v2.md
As an experiment, I'm submitting this RFC as a GitHub pull request, to
evaluate whether this might be a better medium for RFC proposals in the
future. It would be great if we could keep the discussion to the GitHub
pull request for the purpose of this experiment (keep in mind that you
can
also create comments on specific lines in the proposal, not just the
overall discussion thread!) Of course, you can also reply to this mail
instead. The final vote will be held in the wiki as usual.Relatively to the previous proposal by Bob&Levi (
https://wiki.php.net/rfc/union_types), I think the main differences in
this proposal are:
- Updated to specify interaction with new language features, like full
variance and property types.- Updated for the use of the ?Type syntax rather than the Type|null
syntax.- Only supports "false" as a pseudo-type, not "true".
- Slightly simplified semantics for the coercive typing mode.
Regards,
NikitaAn implementation of this proposal is now available at
https://github.com/php/php-src/pull/4838. As the implementation didn't
turn
up any new issues, I think it's time to move this RFC forward to voting.Nikita
CAUTION: This email originated from outside of the organization. Do not
click on links or open attachments unless you recognize the sender and know
the content is safe.
Hi Nikita,
in your results, assignment to typed reference is ~3 time slower without JIT and ~10 times slower with JIT.
But this is not the worst case. I can simple craft a script where single assignment will lead to hundreds type checks.
<?php
class A {
public ?A $ref;
function __construct(&$ref = null) {
$this->ref =& $ref;
}
}
for ($i = 0; $i < 100; $i++) {
$a[$i] = new A($ref);
}
$ref = new A; // <= 100 type checks on a single assignment
?>
In case we add union types, we can make 200, 300 check...
The worst thing, for me, is variance together with union of multiple class types.
Each such method compatibility check is going to be a puzzle solving, and I imagine people, proud of writing "complex pearls".
My main concern, I don't like to complicate the language with features that shouldn't be often used, and I don't like to complicate implementation and reduce performance without real need.
In my opinion, union of standard types and single class or null should be enough for most typing use cases.
Thanks. Dmitry.
From: Nikita Popov nikita.ppv@gmail.com
Sent: Friday, October 25, 2019 13:22
To: Dmitry Stogov dmitry@zend.com
Cc: PHP internals internals@lists.php.net
Subject: Re: [PHP-DEV] Re: [RFC] Union Types v2
Hi Nikita,
I checked the Union Type implementation, and it more or less good. I mean, it implements the RFC in almost the best way.
However, as I don't like the RFC itself. Especially, unions of multiple classes and interference with type variance, I'll vote against this.
Actually, I think PHP already took wrong direction implementing "typed references" and "type variance".
Introducing more "typing", we suggest using it, but this "typing" comes with a huge cost of run-time checks.
From 10% (scalar type hint of argument) to 3 times (typed reference assignment) performance degradation.
Anyone may rerun my benchmarks https://gist.github.com/dstogov/fb32023e8dd55e58312ae0e5029556a9
Thanks. Dmitry.
For reference, here are the results I get with/without JIT: https://gist.github.com/nikic/2a2d363fffaa3aeb251da976f0edbc33
I think that union types don't really change much in terms of the performance calculus of type checking, because type checking is equally fast (or slow) for a single scalar type, and 5 different scalar types: they are all handled in a single bit check. The only case that is indeed somewhat slow is if multiple class types are used in the union, because we actually have to check each one until we find a match. I do hope that having unions with a large number of classes will not be common.
Generally, this area could use some more optimization from the implementation side. I spent a bit of time yesterday optimizing how we perform instanceof checks (the implementation was quite inefficient, especially for interfaces), which had a large positive impact on class type checks. There's more low hanging fruit like this, for example verify_return_type has no JIT implementation yet (which is why with JIT simple arg type checks have nearly zero overhead, while return type checks have a large overhead). Assignments to typed properties are also badly optimized, because everything is punted to the slow path, while we should be able to handle the simple cases with just one extra bit check, without going through the complex implementation that deals with things like weak typing coercions.
I do think that performance considerations are important when it comes to new typing features (which is why, for example, we have categorically rejected a "typed arrays" implementation that has to check all array elements), but don't see union types are particular problematic in that regard, beyond what we already have.
Nikita
// <= 100 type checks on a single assignment
That code contains a reference that is re-used 100 times.
I personally prefer not to use references at all, but if people do
want to use them, we could put a note that references are bad for
performance when used with types. People can then choose between:
- using types, and not references for good performance.
- using references, and not types for good performance.
- using references and types, and getting slightly less good performance.
I don't like to complicate the language with features
that shouldn't be often used,
Can we start the discussion about deprecating references in PHP 8 then?
Very few people are writing code using them currently, and they seem
to cause quite a lot of confusion, judging by the bug reports about
them. If you're also saying that they are making it difficult to write
performant PHP, that seems to be a good argument for looking at what
would be needed to be done to remove them.
cheers
Dan
Ack
Removing references would be great for implementation ? , but this doesn't look realistic in context of PHP language.
HHVM already made steps into this direction with Hack.
Thanks. Dmitry.
I do think that performance considerations are important when it comes to
new typing features (which is why, for example, we have categorically
rejected a "typed arrays" implementation that has to check all array
elements), but don't see union types are particular problematic in that
regard, beyond what we already have.
While union types barely seem to add to the cost of type checking, I
also understand the general sentiment: Type checks can be expensive,
should we really add more?
Perhaps it might be an idea to wait a little before starting the vote to
show the cost going down as more optimization work is done. Just to
build confidence with those who are hesitant to give it their approval.
Regards,
Dik Takken
Perhaps it might be an idea to wait a little before starting the vote to
show the cost going down as more optimization work is done. Just to
build confidence with those who are hesitant to give it their approval.
I don't know if this post has been accidentally sitting in an outbox for
a bit too long and has just sent, but the vote has been underway for a
while now.
It currently has more than 90% in favour with good turnout.
--
Mark Randall
For reference, here are the results I get with/without JIT:
https://gist.github.com/nikic/2a2d363fffaa3aeb251da976f0edbc33
I toyed a bit with the benchmark script (union_bench.php) as well and
wanted to share some observations. First of all I noticed the benchmark
script has a typo on line 90 where it is calling the wrong function. It
should read:
func6(1, 2, 3, 4, 5);
When running the corrected script I see that adding 5 argument type
checks and a return type check cause almost 4x slowdown. My results
(with opcache / jit):
func($a,$b,$c,$d,$e) 0.680 0.583
func(int $a,$b,$c,$d,$e): int 2.106 2.009
However, this appears to be entirely due to the return type check
lacking a JIT implementation, as pointed out by Nikita. Adding one more
test to the benchmark shows this nicely:
func($a,$b,$c,$d,$e) 0.675 0.575
func(int $a,$b,$c,$d,$e) 0.574 0.475
func(int $a,$b,$c,$d,$e): int 2.106 2.009
Now we can see that the argument type hint actually improves
performance, I guess due to it narrowing down the number of possible
types that need to be considered for the function arguments.
Union types allow for more accurate type hinting as well as type hinting
in places where this is currently not possible. As a result union types
can be used to obtain performance gains. As an example, consider the
case where the return type hint matches the type information that
opcache has inferred about the variable that is returned. In that case,
the return type check is optimized away. Let us try and leverage union
types to make this happen. From the benchmark script we take func6:
function func6(int $a, int $b, int $c, int $d, int $e) : int {
return $a + $b + $c + $d + $e;
}
and adjust it to read:
function func6(int $a, int $b, int $c, int $d, int $e) : int|float {
return $a + $b + $c + $d + $e;
}
Now the return type hint matches what opcache infers the result of the
expression will be and the cost of return type checking disappears
completely:
func($a,$b,$c,$d,$e) 0.663 0.568
func(int $a,$b,$c,$d,$e) 0.574 0.475
func(int $a,$b,$c,$d,$e): int|float 0.561 0.466
Then, on to another observation. The SSA forms currently produced by
opcache show union types like string|int. This suggests that opcache
supports union types for type inference already. It explains why opcache
can nicely optimize type checks away even when union types are used.
This is not true for unions of classes though. A union type like int|Foo
copies into the SSA form just fine while Foo|Bar becomes 'object'. Code
like this:
class Foo {}
class Bar {}
function func(): Foo|Bar {
return new Foo();
}
func();
produces the following SSA form:
func: ; (lines=4, args=0, vars=0, tmps=1, ssa_vars=2, no_loops)
; (before dfa pass)
; /php-src/sapi/cli/test.php:6-8
; return [object]
BB0: start exit lines=[0-3]
; level=0
#0.V0 [object (Foo)] = NEW 0 string("Foo")
DO_FCALL
VERIFY_RETURN_TYPE #0.V0 [object (Foo)] -> #1.V0 [object]
RETURN #1.V0 [object]
which will still perform a return type check even though the return type
hint matches the actual type of the variable. Apparently the union type
support in opcache is present but incomplete.
So, while union types can incur higher type checking cost they also
provide more powerful means to help type inference and improve
performance. As opcache improves over time I think we can expect the
cost to decrease while the gain increases. Or am I too optimistic here?
Regards,
Dik Takken
Hi Dik,
For reference, here are the results I get with/without JIT:
https://gist.github.com/nikic/2a2d363fffaa3aeb251da976f0edbc33I toyed a bit with the benchmark script (union_bench.php) as well and
wanted to share some observations. First of all I noticed the benchmark
script has a typo on line 90 where it is calling the wrong function. It
should read:func6(1, 2, 3, 4, 5);
When running the corrected script I see that adding 5 argument type
checks and a return type check cause almost 4x slowdown. My results
(with opcache / jit):func($a,$b,$c,$d,$e) 0.680 0.583
func(int $a,$b,$c,$d,$e): int 2.106 2.009
Thanks for catching this. At least, now I see 2 times slowdown without
JIT, that I expected, but didn't see.
func($a,$b,$c,$d,$e) 1.746 1.555
func(int $a,$b,$c,$d,$e): int 3.647 3.455
JIT will able to eliminate type checks only if it exactly knows the
called function at caller site. Unfortunately, this is quite rare case,
because the functions may be declared in different files, OOP, type
variance, etc.
However, this appears to be entirely due to the return type check
lacking a JIT implementation, as pointed out by Nikita. Adding one more
test to the benchmark shows this nicely:func($a,$b,$c,$d,$e) 0.675 0.575
func(int $a,$b,$c,$d,$e) 0.574 0.475
func(int $a,$b,$c,$d,$e): int 2.106 2.009Now we can see that the argument type hint actually improves
performance, I guess due to it narrowing down the number of possible
types that need to be considered for the function arguments.Union types allow for more accurate type hinting as well as type hinting
in places where this is currently not possible. As a result union types
can be used to obtain performance gains. As an example, consider the
case where the return type hint matches the type information that
opcache has inferred about the variable that is returned. In that case,
the return type check is optimized away. Let us try and leverage union
types to make this happen. From the benchmark script we take func6:function func6(int $a, int $b, int $c, int $d, int $e) : int {
return $a + $b + $c + $d + $e;
}and adjust it to read:
function func6(int $a, int $b, int $c, int $d, int $e) : int|float {
return $a + $b + $c + $d + $e;
}Now the return type hint matches what opcache infers the result of the
expression will be and the cost of return type checking disappears
completely:func($a,$b,$c,$d,$e) 0.663 0.568
func(int $a,$b,$c,$d,$e) 0.574 0.475
func(int $a,$b,$c,$d,$e): int|float 0.561 0.466Then, on to another observation. The SSA forms currently produced by
opcache show union types like string|int. This suggests that opcache
supports union types for type inference already. It explains why opcache
can nicely optimize type checks away even when union types are used.This is not true for unions of classes though. A union type like int|Foo
copies into the SSA form just fine while Foo|Bar becomes 'object'. Code
like this:class Foo {}
class Bar {}function func(): Foo|Bar {
return new Foo();
}func();
produces the following SSA form:
func: ; (lines=4, args=0, vars=0, tmps=1, ssa_vars=2, no_loops)
; (before dfa pass)
; /php-src/sapi/cli/test.php:6-8
; return [object]
BB0: start exit lines=[0-3]
; level=0
#0.V0 [object (Foo)] = NEW 0 string("Foo")
DO_FCALL
VERIFY_RETURN_TYPE #0.V0 [object (Foo)] -> #1.V0 [object]
RETURN #1.V0 [object]which will still perform a return type check even though the return type
hint matches the actual type of the variable. Apparently the union type
support in opcache is present but incomplete.
So, while union types can incur higher type checking cost they also
provide more powerful means to help type inference and improve
performance. As opcache improves over time I think we can expect the
cost to decrease while the gain increases. Or am I too optimistic here?
In my experience, static optimizations are not able to eliminate most
type checks in PHP. Probably, if we developed more complete type-system
and used type declaration everywhere we could achieve better results.
Introducing more type checks and more complex rules will increase
run-time overhead.
I'm currently working on attempt of speculative optimizations based on
run-time feedback, and the results might change the whole picture a bit.
Anyway, I'm especially against of mixing multiple classes in unions, not
because of performance, but because of complex rules of method
inheritance compatibility checks in conjunction with type variance.
Thanks. Dmitry.
Regards,
Dik TakkenCAUTION: This email originated from outside of the organization. Do not click on links or open attachments unless you recognize the sender and know the content is safe.
I'm currently working on attempt of speculative optimizations based on
run-time feedback, and the results might change the whole picture a bit.
That looks like yet another impressive bit of work. It would be very
interesting to see some preliminary results when you're ready to share them.
Regards,
Dik Takken