Greetings, denizens of Internals!
Ilija Tovilo and I have been working for the last few months on adding support for enumerations and algebraic data types to PHP. This is a not-small task, so we've broken it up into several stages. The first stage, unit enumerations, is just about ready for public review and discussion.
The overarching plan (for context, NOT the thing to comment on right now) is here: https://wiki.php.net/rfc/adts
The first step, for unit enumerations, is here:
https://wiki.php.net/rfc/enumerations
There are still a few bits we're sorting out, and the implementation is mostly done but not 100% complete. Still, it's far enough along to start a discussion and get broader feedback on the outstanding nits.
I should note that while the design has been collaborative, credit for the implementation goes entirely to Ilija. Blame for any typos in the RFC itself goes entirely to me.
dons flame-retardant suit
--
Larry Garfield
larry@garfieldtech.com
I'd like to see the specific reasons for the restrictions listed in
Comparison to objects[1]. In general if something's value is even
debatable then the default position should be to remain consistent
with the rest of the language. It should take a strong argument to
introduce any artificial limitation and it's useful to have that in
the RFC.
[1] https://wiki.php.net/rfc/enumerations#comparison_to_objects
Dec 4, 2020 7:37:51 PM Paul Crovella paul.crovella@gmail.com:
I'd like to see the specific reasons for the restrictions listed in
Comparison to objects[1]. In general if something's value is even
debatable then the default position should be to remain consistent
with the rest of the language. It should take a strong argument to
introduce any artificial limitation and it's useful to have that in
the RFC.
[1] https://wiki.php.net/rfc/enumerations#comparison_to_objects
The reasoning generally comes down to one of 2 things:
- they involve state, and enum cases have no state. They may get reintroduced with tagged unions, but for now methods relating to state would just be confusing.
- we couldn't figure out what possible use they'd have (like static methods on cases, which without data are exactly the same as normal methods.)
--Larry Garfield
Thank you for updating the RFC.
enum cases have no state
Unless there's a bit left out from this RFC this is not completely
true, you've just limited them to annoying ways of working with data,
e.g. static variables.
static methods on cases, which without data are exactly the same as normal methods
This is an argument against instance methods, which you kept, rather
than static methods, which you didn't. How is this division anything
but arbitrary?
Constructors - Not relevant without data/state.
Destructors - Not relevant without data/state.
This doesn't explain the need to remove them. What problem do they cause?
Enum/Case constants - Not necessary as methods fulfill the same use case.
How is this specific to enums? Why is it necessary to add this
inconsistency to the language in this RFC? What happens to constants
inherited from interfaces?
Enum/Case properties - Not necessary as methods fulfill the same use case, plus we want to avoid state.
Dynamic properties - Avoid state. Plus, they're a bad idea on classes anyway.
Magic methods except for those specifically listed below - Most of the excluded ones involve state.
Again, what is it particular to enums that necessitates adding these
inconsistencies to the language? I get that you want to avoid state,
but how is that an enum thing rather than a you thing? What happens to
properties gained from traits?
enum cases have no state
Unless there's a bit left out from this RFC this is not completely
true, you've just limited them to annoying ways of working with data,
e.g. static variables.
I'm not sure what you mean here, but it sounds a bit like you're saying
there are ways to emulate mutable state, so why bother restricting it?
The point is that since enum cases are singleton objects, any instance
state would actually be global across the program, which is probably not
what people would expect.
static methods on cases, which without data are exactly the same as normal methods
This is an argument against instance methods, which you kept, rather
than static methods, which you didn't. How is this division anything
but arbitrary?
On a singleton object, instance methods and late static binding are
equivalent, but instance methods are probably more intuitive, since
people are more used to writing code like "$this->suit->shape();" than
"$this->suit::shape();"
I guess both could be supported, but other than more code style
decisions to make, I can't think of what they'd add.
Again, what is it particular to enums that necessitates adding these
inconsistencies to the language? I get that you want to avoid state,
but how is that an enum thing rather than a you thing?
"Inconsistency" is a straw man here, because these are a brand new
concept that already does things objects don't do. So let's flip it
around: do you have any use cases for directly storing state on an enum
case, remembering that such state would be unavoidably global?
Restricting them will likely make other desirable features easier - for
instance, how would serialization work:
$a = Suit::Hearts;
$a->setColour(Colour::Pink);
$serialized = serialize($a); // does this serialize the instance state?
$a->setColour(Colour::Purple);
$b = unserialize($serialized);
assert($a === $b); // as discussed elsewhere, this should be true
assert($a->getColour() === $b->getColour()); // are they both pink? both purple?
Note that Larry's longer term plan is for "algebraic data types",
including "tagged unions": https://wiki.php.net/rfc/adts Unlike
straight-forward enum cases, these are not singletons, and each instance
has its own associated state.
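To make the identity point concrete, here is a small sketch. It assumes the enum syntax that later shipped in PHP 8.1, which may differ from the draft under discussion in this thread, and it deliberately gives the cases no state at all. Under that assumption each case is one shared instance, so a serialize/unserialize round trip hands back the very same object:

```php
<?php
// Sketch only: assumes PHP 8.1+ enum syntax, not necessarily the RFC draft.
enum Suit
{
    case Hearts;
    case Spades;
}

$a = Suit::Hearts;

// Round-trip the case through serialization.
$b = unserialize(serialize($a));

// Each case is a single shared instance, so identity is preserved.
var_dump($a === $b);
var_dump($a === Suit::Hearts);
var_dump($a === Suit::Spades);
```

With no per-case state there is nothing for the serialized form to disagree with, which is the situation the pink/purple question above is trying to avoid.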
Regards,
--
Rowan Tommins (né Collins)
[IMSoP]
I was going to respond, but I think Rowan summed it up here. Stateful enums are called tagged unions, and that's a separate phase to keep the implementation and discussion focused for now. We hope to add those in the future, but for now unit and scalar enums have a ton of value unto themselves.
--Larry Garfield
I was going to respond, but I think Rowan summed it up here.
That's disappointing as he didn't even attempt to address the majority
of my concerns.
Stateful enums are called tagged unions, and that's a separate phase to keep the implementation and discussion focused for now.
I am only talking about the RFC you presented, you and Rowan are the
ones bringing up others.
We hope to add those in the future, but for now unit and scalar enums have a ton of value unto themselves.
--Larry Garfield
To be clear I want enums in PHP, I think most people do, which is why
I'd like to see this RFC improved.
enum cases have no state
Unless there's a bit left out from this RFC this is not completely
true, you've just limited them to annoying ways of working with data,
e.g. static variables.
I'm not sure what you mean here, but it sounds a bit like you're saying
there are ways to emulate mutable state, so why bother restricting it?
Static variables don't emulate mutable state, they are mutable state.
My question is why go out of the way to restrict things arbitrarily,
particularly when the result is ineffective and just makes things suck
more. It's irrelevant extra work to annoy people with no gain.
The point is that since enum cases are singleton objects, any instance
state would actually be global across the program, which is probably not
what people would expect.
Instance state being global is a well-known problem with singletons.
Maybe don't use singletons then. Or simply document them as was done
in the RFC. I'd prefer the former since singletons don't seem to buy
much here but problems, though maybe I'm missing something. In any
case why is static state being (kinda sorta) restricted along with it?
static methods on cases, which without data are exactly the same as normal methods
This is an argument against instance methods, which you kept, rather
than static methods, which you didn't. How is this division anything
but arbitrary?
On a singleton object, instance methods and late static binding are
equivalent, but instance methods are probably more intuitive, since
people are more used to writing code like "$this->suit->shape();" than
"$this->suit::shape();"
I guess both could be supported, but other than more code style
decisions to make, I can't think of what they'd add.
Consistency. Which reduces both the wtf factor when working with them
and edge cases for other features that interact with them.
Again, what is it particular to enums that necessitates adding these
inconsistencies to the language? I get that you want to avoid state,
but how is that an enum thing rather than a you thing?
"Inconsistency" is a straw man here, because these are a brand new
concept that already does things objects don't do.
Inconsistency is in no way a straw man. This RFC has enums being
implemented as "fancy objects", based on classes, backed by objects,
with the results being treated as classes and objects in every way
except for.. well mostly a bunch of artificial restrictions that
appear to have nothing to do with enums at all.
So let's flip it
around: do you have any use cases for directly storing state on an enum
case, remembering that such state would be unavoidably global?
No, that is a garbage standard that's been abused around here for too
long. This RFC is proposing to introduce inconsistencies into the
language, it's up to it to justify them. It is not up to me or anyone
else to come along and pass your personal code review just to ask for
that justification.
Restricting them will likely make other desirable features easier - for
instance, how would serialization work:
Like serialization already works with objects, including singletons.
$a = Suit::Hearts;
$a->setColour(Colour::Pink);
$serialized = serialize($a); // does this serialize the instance state?
$a->setColour(Colour::Purple);
$b = unserialize($serialized);
assert($a === $b); // as discussed elsewhere, this should be true
assert($a->getColour() === $b->getColour()); // are they both pink? both purple?
Note that Larry's longer term plan is for "algebraic data types",
including "tagged unions": https://wiki.php.net/rfc/adts Unlike
straight-forward enum cases, these are not singletons, and each instance
has its own associated state.
Longer term plans are irrelevant except to avoid inadvertently
shutting the door on something. This RFC is up for discussion, and
will be up for voting, in isolation. It has to be able to stand on its
own.
On 07/12/2020 at 02:00, Paul Crovella wrote:
Longer term plans are irrelevant except to avoid inadvertently
shutting the door on something. This RFC is up for discussion, and
will be up for voting, in isolation. It has to be able to stand on its
own.
I strongly disagree with you. Something as important as a new type-related
paradigm must be considered along with its full roadmap. This enum RFC tries
to prepare for the future; if it didn't, tagged unions and ADTs wouldn't
be possible at all. Since they are all sensible changes, having a
roadmap is what keeps everything consistent, so the RFC must be considered
and discussed in light of that roadmap.
You should probably consider the fact that enum cases use singleton classes
to be an implementation detail, not a high-level design choice.
If you start to mix up real classes with those enum cases, future low-level
engine or compiler improvements, for performance or consistency,
will no longer be possible, because people will be stuck using them
as classes, which conceptually they are not.
Any failure to consider the future will result in a tight, awkward
situation where improvements are no longer possible.
Regards,
Pierre
Instance state being global is a well-known problem with singletons.
Maybe don't use singletons then. Or simply document them as was done
in the RFC. I'd prefer the former since singletons don't seem to buy
much here but problems, though maybe I'm missing something.
Yes, I think you are missing something - or maybe I am, because I
honestly can't picture what it would look like for enums not to be
singletons.
Would Suit::Hearts be a constructor, producing a new instance each time,
each with its own state? Would we then overload ===, so that
Suit::Hearts === Suit::Hearts was still true somehow?
In any case why is static state being (kinda sorta) restricted along
with it?
On the face of it, I agree, static properties could be supported. But
looking at the details of the current proposal, it might actually take
some thought to make them feel natural. As I understand it, each case
acts like a sub-class, which is useful for over-riding instance methods,
but would mean a static property would be defined separately on each case:
enum Suit {
    static $data;
    case Hearts;
    case Spades;
    case Clubs;
    case Diamonds;
}
Suit::$data = 42;
$mySuit = Suit::Hearts;
var_dump($mySuit::$data); // will not print 42, because Suit::Hearts::$data is a different property
As Pierre says, the idea of backing enums onto objects is mostly an
implementation detail; their fundamental design is based on how enums
are generally used, and implemented in other languages.
Rather than "objects which are a bit enum-like", it might be useful to
frame them as "enums which are a bit object-like". The primary
consistency needs to be with what people will expect an enum to do.
Backing them onto objects makes it easy to add on any object-like
behaviour that feels useful, but once we've added it, it's much harder
to remove or change if we realise it's causing problems for users, or
getting in the way of other features.
That's why I was asking if you had use cases in mind, because I was
starting from that position: assume they have no features, and add the
ones we think are necessary and achievable.
Regards,
--
Rowan Tommins (né Collins)
[IMSoP]
Hi Larry and Ilija
It's great to see you're looking into enums, thanks! I have a few considerations from a userland point of view. I've been maintaining a userland enum implementation for a while now [1], so I think I can share a thing or two about my experience.
- Scalar enums are spot on, exactly what I'd expect.
- Support with match is awesome, and I think makes it so that array key support isn't necessary.
- Others already addressed that serialization and deserialization would be a nice feature. A common use case is to store enums in a datastore of some kind, and it would be nice not having to make dedicated factories for them.
- The "case" syntax feels quirky. I assume it's because PHP wouldn't allow something like this:
enum Suit: string {
    Hearts = 'H';
    Diamonds = 'D';
    Clubs = 'C';
    Spades = 'S';
}
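For comparison with the shorthand above, here is a minimal sketch of a string-backed enum written with the case keyword. It assumes the syntax that later shipped in PHP 8.1, which may differ from the draft discussed in this thread. It also sketches the datastore round trip mentioned in the previous bullet, using the built-in value property and the from() and tryFrom() methods of backed enums:

```php
<?php
// Sketch only: assumes PHP 8.1+ backed enums, not necessarily the RFC draft.
enum Suit: string
{
    case Hearts = 'H';
    case Diamonds = 'D';
    case Clubs = 'C';
    case Spades = 'S';
}

// Store the scalar value, e.g. in a database column.
$stored = Suit::Hearts->value;

// Restore the case later. from() throws a ValueError for an unknown value,
// while tryFrom() returns null instead.
$restored = Suit::from($stored);
$unknown = Suit::tryFrom('X');

var_dump($stored);
var_dump($restored === Suit::Hearts);
var_dump($unknown);
```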
Finally, I've got one (rather large) concern about object enums, specifically with methods implemented on a per-enum basis. I did the same [2] when I first implemented my userland package. From my research back then, I believe only Java [3] allowed this behaviour. If you've checked out that link, you've seen that value specific methods have been removed in v2. That's with good reason: they turned out to be rather cumbersome to maintain and even a bit useless. Here's why:
- Your example shows one method, the color one, which is still kind of manageable. If you allow enum methods though, you'll often end up with more than one method: label, color, index, name, id are a few that come to mind. In the end an enum grows very large and unmanageable, often with lots of repeated code.
- Enum value methods actually are the state pattern [4] in disguise. One difference being that enum objects can't manage their own internal state, so they become less useful in applying the state pattern compared to using classes.
I think enums shouldn't aim to solve the state pattern. It's out of scope for what enums should do and their way of solving the state pattern will be worse in practice compared to using classes. I'd say it would be good to keep the definition of enums in mind:
"an enumerated type […] is a data type consisting of a set of named values called elements, members, enumeral, or enumerators of the type. The enumerator names are usually identifiers that behave as constants in the language." [5]
"Named values" and "constants" being the keywords here, there's no "behaviour" implemented by enum values, which is why only a small number of languages allow this kind of functionality.
I realise enum objects might seem like a good idea to provide more value-specific functionality in a concise way, but let's compare per-value methods with a method on the base enum:
enum Suit implements Colorful {
    case Hearts {
        public function color(): string {
            return "Red";
        }
    }
    case Diamonds {
        public function color(): string {
            return "Red";
        }
    }
    case Clubs {
        public function color(): string {
            return "Black";
        }
    }
    case Spades {
        public function color(): string {
            return "Black";
        }
    }
    public function shape(): string {
        return "Rectangle";
    }
}
vs
enum Suit implements Colorful {
    case Hearts;
    case Diamonds;
    case Clubs;
    case Spades;
    public function color(): string {
        // One match on $this replaces the four per-case method bodies.
        return match ($this) {
            Suit::Hearts, Suit::Diamonds => "Red",
            Suit::Clubs, Suit::Spades => "Black",
        };
    }
}
In summary:
- If you'd use enum objects for "simple functionality", I'd say match will always be the more concise way.
- If you'd use enum objects for handling complex state, you're better off using classes and properly implementing the state pattern.
I don't think enum objects should be a blocker, if people really want it then fine. Based on my experience though, I'm rather sure that they won't be very useful, and would love to hear your opinion on the matter.
Kind regards
Brent
[1] https://github.com/spatie/enum
[2] https://github.com/spatie/enum/tree/v1#enum-specific-methods
[3] https://www.geeksforgeeks.org/enum-in-java/
[4] https://en.wikipedia.org/wiki/State_pattern
[5] https://en.wikipedia.org/wiki/Enumerated_type
Hi Paul. Although we're on hold for a bit while Ilija makes some changes in direction (see previous email), I wanted to respond to your concerns in particular because it sounds like you're misunderstanding the scope and roadmap of what we're trying to do.
Enumerations, as a general concept, are stateless. Or rather, the idea of state doesn't even apply to them. The integer 5 doesn't have any state of its own. All instances of the integer 5 are logically identical to all other instances of the integer 5.
Enumerations are a way to allow users to define that kind of "list of allowed values" where each value is always identical to itself. Depending on the language they often have the ability to define additional operations on that custom type, which are (from what I've found) nearly always implemented in a method-like syntax for simplicity. In most languages that do so it's straightforward because enums are built on top of objects for convenience, but for instance C# doesn't use objects for enums but still has a way to attach method-like functions to them. (See the survey review I did that's linked from the RFC.)
That the PHP RFC for enums uses objects internally is incidental. A few people suggested that we use an entirely new zval type for it, which could no doubt be done and achieve the same end syntax but would be a lot more work internally to do.
So if you're defining, say, an enum for cardinal directions (North, South, East, West), there are literally only 4 possible values for that type, and those values must always be equal to themselves. Attaching "5 miles" to "South" in one script and "2 km" to "South" in another script is nonsensical, just like attaching "feet" to the number 5 and later changing it to "fingers" just doesn't make any sense at all.
That's why, although we are building enums on top of objects, they cannot and must not have mutable state. They can have predefined methods on them that return a static value (such as a label() method), but that doesn't violate the "5 always === 5" rule. So anything that offers a way to violate South === South cannot be allowed. If you want that, then you don't want enums; you want classes. Don't get distracted by the word "singleton" here. That's an implementation detail, not something the user is ever exposed to.
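As a rough illustration of a predefined method that keeps "South === South" intact, here is a sketch assuming the enum syntax that later shipped in PHP 8.1 (the draft at this point in the thread may differ). The label() name comes from the paragraph above; the label strings themselves are made up for the example:

```php
<?php
// Sketch only: assumes PHP 8.1+ enum syntax, not necessarily the RFC draft.
enum Direction
{
    case North;
    case South;
    case East;
    case West;

    // Returns a fixed value per case; nothing here can change over time.
    public function label(): string
    {
        return match ($this) {
            Direction::North => 'Northbound',
            Direction::South => 'Southbound',
            Direction::East => 'Eastbound',
            Direction::West => 'Westbound',
        };
    }
}

$a = Direction::South;
$b = Direction::South;

var_dump($a === $b);
var_dump($a->label());
```

The method depends only on which case it is called on, so two references to South can never be told apart by calling it.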
It is true that, if you're attaching operations to enum cases that have no state, a method, a static method, and a constant are often interchangeable. A constant is strictly less capable since it cannot have logic, but a method and static method are functionally equivalent. The reason for forbidding static methods and constants is largely simplicity. Instance methods can cover all of those use cases, so it's only one mechanism to code and only one mechanism for users to have to think about. "Should I use mechanism X or Y when they are virtually identical?" is a question that is best avoided. Also, it's a lot easier to add features later if we find they would be useful than to take them away if we find they're problematic.
For example, the change I just discussed in my previous email about switching to a single class with instances as constants means that static methods can only be on the enum (Suit), because there is no per-case-class to put them on (Diamonds). The same is true of per-instance constants. If we'd promised that those things worked, we wouldn't be able to make this change.
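One way to picture the "single class with instances as constants" shape is the sketch below. It assumes the enum syntax that later shipped in PHP 8.1, where a static method and a constant can sit on the enum type (Suit) itself; whether this RFC ends up allowing them is not settled at this point in the thread. The names fromInitial and Trump are hypothetical and exist only for the example:

```php
<?php
// Sketch only: assumes PHP 8.1+ enum syntax, not necessarily the RFC draft.
enum Suit
{
    case Hearts;
    case Diamonds;
    case Clubs;
    case Spades;

    // A constant on the enum type itself, aliasing one of the cases
    // (hypothetical name).
    const Trump = self::Spades;

    // A static method lives on the enum (Suit), not on an individual case
    // (hypothetical helper). An unknown initial raises UnhandledMatchError.
    public static function fromInitial(string $initial): self
    {
        return match ($initial) {
            'H' => self::Hearts,
            'D' => self::Diamonds,
            'C' => self::Clubs,
            'S' => self::Spades,
        };
    }
}

var_dump(Suit::fromInitial('D') === Suit::Diamonds);
var_dump(Suit::Trump === Suit::Spades);
var_dump(Suit::Hearts instanceof Suit);
```

Neither addition gives a case any mutable state, so the identity guarantee is unaffected.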
If when the dust settles there's a good reason to expand the functionality further in ways that still don't violate "South === South," (such as adding constants to the enum type itself, perhaps) those can be considered at that time.
As both Rowan and I mentioned, the long-term plan includes future RFCs for other functionality such as tagged unions. If you want state, then what you want is not an enum but a tagged union. For whatever reason, most languages with tagged unions seem to build them into and onto enums. I'm not entirely sure why, and I'm honestly not 100% convinced that's the right approach in PHP, but that's a debate for a future RFC. Within the scope of this RFC, enums must always obey "South === South", and thus must be stateless. When we get to tagged unions we can and likely will debate how state works there.
Longer term plans are irrelevant except to avoid inadvertently
shutting the door on something. This RFC is up for discussion, and
will be up for voting, in isolation. It has to be able to stand on its
own.
This is true up to a point. Yes, each RFC should be able to stand on its own, but that doesn't mean it must be everything and the kitchen sink. Trying to add enums, tagged unions, pattern matching, overridable identity, and generics all at once (since all of those are related to each other) would result in a massive and unreviewable patch, a massive and unreviewable RFC, and people not able to follow all the moving parts well enough to comment on them intelligently. That would almost certainly fail.
Explicitly setting up a series of RFCs like we're doing is, as far as I am aware, novel for PHP. It's the correct way to handle a larger task like this, however, especially when there are clear division points that "chunk" the work in natural ways. Even if enums in this RFC are the only thing that passes, that's still a major win for PHP. It's not as big a win as tagged unions with robust pattern matching and generics on the side would be, but it's still a big step forward. By breaking it into chunks, we make it more likely that as much as can be done will get done, and approved, and into the language, even if the end result is the same feature set when 8.1.0 gets tagged.
--Larry Garfield
Hi Paul. Although we're on hold for a bit while Ilija makes some changes in direction (see previous email)
I'm looking forward to seeing the results.
Enumerations, as a general concept, are stateless. Or rather, the idea of state doesn't even apply to them.
They're (enumerated) lists of things. You need a way to make a list,
identify an item as being on the list, and differentiate it from other
items on the list.
The integer 5 doesn't have any state of its own. All instances of the integer 5 are logically identical to all other instances of the integer 5.
For simple types like integers, sure. Objects however can carry
additional data around without interfering with identifying or
differentiating them. It doesn't matter if object 5 has my address
book, is wearing a funny hat, or just cached the results from a bunch
of expensive method executions - you can still identify it and
distinguish it just fine.
That the PHP RFC for enums uses objects internally is incidental.
This is a complete reversal from what you've otherwise written in and
about the RFC.
So if you're defining, say, an enum for cardinal directions (North, South, East, West), there are literally only 4 possible values for that type, and those values must always be equal to themselves. Attaching "5 miles" to "South" in one script and "2 km" to "South" in another script is nonsensical, just like attaching "feet" to the number 5 and later changing it to "fingers" just doesn't make any sense at all.
Whether or not an object holding other data makes sense is irrelevant
to enums. Take it up in code review.
It is true that, if you're attaching operations to enum cases that have no state, a method, a static method, and a constant are often interchangeable. A constant is strictly less capable since it cannot have logic, but a method and static method are functionally equivalent. The reason for forbidding static methods and constants is largely simplicity.
You've significantly added to the WTF factor and edge cases when
working with them. For example enums can implement interfaces. Except
interfaces with constants. Or maybe you can implement interfaces with
constants, you just don't inherit the constants. Wait no that'd break
type declarations. Perhaps then you can inherit constants from
interfaces but you still can't declare them yourself. So the way to
use constants in enums is to declare them on an interface and
implement that to inherit them. Maybe.
This isn't simplicity. None of these options makes working with enums better.
Whether you prefer constants, static methods, or instance methods
isn't even relevant to enums. Again take it up in code review.
> For example, the change I just discussed in my previous email about switching to a single class with instances as constants means that static methods can only be on the enum (Suit), because there is no per-case-class to put them on (Diamonds). The same is true of per-instance constants. If we'd promised that those things worked, we wouldn't be able to make this change.
The thing never shipped, you can make any change you want. If you're
pretending it did ship then you couldn't make that change anyway
without breaking instance methods so your point is moot.
> If when the dust settles there's a good reason to expand the functionality further in ways that still don't violate "South === South," (such as adding constants to the enum type itself, perhaps) those can be considered at that time.
As you've said elsewhere:
> Because enums are based on classes, there's not really any added complexity. They inherit the ability to have methods from being classes. It would be more work to separate it.
The same holds for constants, static methods, properties, (traits? do
these even work? some of them?) This isn't about adding functionality,
it's about not arbitrarily removing it.
> As both Rowan and I mentioned, the long-term plan includes future RFCs for other functionality such as tagged unions. If you want state,
I want a coherent enum RFC that isn't complicated by unrelated
personal preferences being shoehorned into it. If you want to make an
RFC to deprecate class constants in favor of methods - fine, but at
least make the RFC about that. Trying to wedge it in here half-assed
only creates problems.
> then what you want is not an enum but a tagged union.
You're saying this as if they're necessarily different things.
> For whatever reason, most languages with tagged unions seem to build them into and onto enums.
Languages that implemented enums such that they function as tagged
unions saved themselves the effort of making the same thing twice.
They also saved their users from dealing with whatever subtle
differences would've been included to distinguish them.
> I'm not entirely sure why, and I'm honestly not 100% convinced that's the right approach in PHP, but that's a debate for a future RFC.
Any debate about enums is appropriate for this RFC.
> Within the scope of this RFC, enums must always obey "South === South", and thus must be stateless.
South === South can hold with or without state and isn't even required
for enums. All you need is to know it's on the list and for it to be
distinguishable from others on the list.
> > Longer term plans are irrelevant except to avoid inadvertently
> > shutting the door on something. This RFC is up for discussion, and
> > will be up for voting, in isolation. It has to be able to stand on its
> > own.
> This is true up to a point.
The point is nothing in the RFC is contingent on the others. If this
RFC passes and none of the others do, or none of the other work gets
done, that has to be okay. This means treating this RFC on its own.
As you said at the beginning:
> The overarching plan (for context, NOT the thing to comment on right now)
Yet you keep bringing it up...
> Explicitly setting up a series of RFCs like we're doing is, as far as I am aware, novel for PHP. It's the correct way to handle a larger task like this, however, especially when there are clear division points that "chunk" the work in natural ways. Even if enums in this RFC are the only thing that passes, that's still a major win for PHP. It's not as big a win as tagged unions with robust pattern matching and generics on the side would be, but it's still a big step forward. By breaking it into chunks, we make it more likely that as much as can be done will get done, and approved, and into the language, even if the end result is the same feature set when 8.1.0 gets tagged.
> --Larry Garfield
On 05/12/2020 at 00:24, Larry Garfield wrote:
> The first step, for unit enumerations, is here:
> https://wiki.php.net/rfc/enumerations
Hello!
I fully support this whole initiative. Here is a first question, regarding
the enum::cases() static method:
> If the enumeration has no primitive equivalent, the array will be
> packed (indexed sequentially starting from 0). If the enumeration has a
> primitive equivalent, the keys will be the corresponding primitive for
> each enumeration. If the enumeration is of type float, the keys will
> be rendered as strings. (So a primitive equivalent of 1.5 will result
> in a key of "1.5".)
Does this mean that an enum can't have two cases with the same primitive
value? I would very much like to be able to do so, for example, when you
change a name and want to keep the legacy name for backward compatibility.
I think that ::cases() should always return an array/iterable without
keys, and let userland write its own code to create hashmaps from
them when needed. I think that having one kind of enum (non-primitive
cases) that doesn't yield string keys and another (primitive cases)
that does may create confusion, because the behavior is
different.
If the behavior is different, this would mean you couldn't mix primitive and
non-primitive cases in the same enum; why couldn't we do that?
Regards,
Pierre
Hi,
> I think that ::cases() should always return an array/iterable without
> keys, and let userland write its own code to create hashmaps from
> them when needed. I think that having one kind of enum (non-primitive
> cases) that doesn't yield string keys and another (primitive cases)
> that does may create confusion, because the behavior is
> different.
> If the behavior is different, this would mean you couldn't mix primitive and
> non-primitive cases in the same enum; why couldn't we do that?
I've no strong feelings and I think your idea makes sense.
But I would argue, of course without having hard facts, that the
majority will have the use-case for a 1:1, aka array with key =>
value, mapping.
Thus I would argue that still having such functionality (assuming your
proposal is adopted) would be useful to a very wide audience,
without everyone having to reinvent a helper method to use the iterable.
Thanks for considering,
- Markus
On 05/12/2020 at 12:14, Markus Fischer wrote:
> But I would argue, of course without having hard facts, that the
> majority will have the use-case for a 1:1, aka array with key =>
> value, mapping.
>
> Thus I would argue that having (considering your proposal being added)
> still such functionality would be useful for a very wide-audience
> without everyone having to reinvent a helper method to use the iterable.
I agree with that last statement about wide-audience usefulness;
nevertheless, my opinion is that it's not about being useful to the most
people but about bringing consistency at the language level. If for
consistency I have to write an array_map() call whenever I need a choices
list, I'm good with that. Moreover, bringing enums is a thousand times more
useful than having a syntactic-sugar method for building hashmaps!
There are many use cases where you'd want a hashmap out of those enums, but
each use case will probably differ in subtle ways. I'd be more comfortable
with a coherent behavior that everyone will understand easily, with no
behavioral subtleties, and each user implementing their own needs on their side.
Why not, in that case, just add a method other than ::cases(), such as,
I don't know, ::to[Hash]Map() for example?
Regards,
Pierre
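To make the userland option concrete, here is a minimal sketch. It assumes the enum syntax that later shipped in PHP 8.1 (where cases() returns a plain list and backed cases expose ->value and ->name), which differs from the draft discussed in this thread. The helper name casesByValue() is hypothetical.

```php
<?php
enum Suit: string
{
    case Hearts = 'H';
    case Spades = 'S';
}

// Hypothetical userland helper: map each backing value to its case.
function casesByValue(string $enumClass): array
{
    $map = [];
    foreach ($enumClass::cases() as $case) {
        $map[$case->value] = $case;
    }
    return $map;
}

// A "choices list" built with array_map(), as described above.
$choices = array_map(fn (Suit $suit) => $suit->name, Suit::cases());
```

With a keyless cases(), each project can pick its own key (value, name, or something else) without the language having to choose one.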
Another question, about match() behavior:
> This usage requires no modification of match. It is a natural
> implication of the current functionality.
Maybe this could be the time to have a "strict match": using enums, we
can "statically" guess if branches are missing (of course, whenever you
use default it is valid to omit branches). One thing I'd love is for PHP to
throw a fatal error when compiling a match expression that is missing
branches, and not wait until runtime to fail.
Regards,
Pierre
I think that would make a great follow-up, but it's out of scope for now. Having match statically know what the available types are when the variable type isn't yet known (because it's in another file) is... I don't know how to do that. That's a broad problem across PHP, frankly. If we can figure out a way to do so, I'd support adding it in the future. (I can't speak for Ilija, but I suspect he would be on board as well.)
--Larry Garfield
On 05/12/2020 at 15:42, Larry Garfield wrote:
> Having match statically know what the available types are when the variable type isn't yet known (because it's in another file) is... I don't know how to do that. That's a broad problem across PHP, frankly.
Yes, I guess that's a broad problem that could only be fixed with
package handling at the language level, and autoloading done by the
compiler and not userland (simply put, it can only be fixed by killing
autoloading once and for all). But that's a much more complex,
bikeshed-prone topic. Let's consider that your RFC is good as it is (and
it is, honestly; I love it).
Regards,
Pierre
> > Another question, about match() behavior:
> >
> > Maybe this could be the time to have a "strict match": using enums, we
> > can "statically" guess if branches are missing (of course, whenever you
> > use default it is valid to omit branches). One thing I'd love is for PHP to
> > throw a fatal error when compiling a match expression that is missing
> > branches, and not wait until runtime to fail.
>
> I think that would make a great follow-up, but it's out of scope for now.
> Having match statically know what the available types are when the variable
> type isn't yet known (because it's in another file) is... I don't know how
> to do that. That's a broad problem across PHP, frankly. If we can figure
> out a way to do so, I'd support adding it in the future. (I can't speak
> for Ilija, but I suspect he would be on board as well.)
Static analysis can (should?) be taken care of by any of the tools
available for PHP. Wouldn't it be a waste of effort to try to include it in
the runtime parser as well?
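For reference, a small sketch of the runtime behaviour being discussed, assuming the enum syntax that later shipped in PHP 8.1. The colour() function is a hypothetical example. A match with a missing arm compiles fine, and the gap only surfaces as an UnhandledMatchError when the unmatched case actually arrives.

```php
<?php
enum Suit
{
    case Hearts;
    case Diamonds;
    case Clubs;
    case Spades;
}

function colour(Suit $suit): string
{
    // No default arm, and Suit::Spades is deliberately left out.
    return match ($suit) {
        Suit::Hearts, Suit::Diamonds => 'Red',
        Suit::Clubs => 'Black',
    };
}

echo colour(Suit::Hearts), "\n";

try {
    colour(Suit::Spades);
} catch (\UnhandledMatchError $e) {
    // The missing branch is only discovered at runtime.
    echo get_class($e), "\n";
}
```

A static analyser that knows the full list of cases can flag the missing arm before the code runs, which is the division of labour suggested above.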
Hi Larry,
thanks for your great initiative and hard work in this!
I'm the author of an emulated enumeration lib [1] and am really looking forward to seeing native enumeration support in PHP.
Here are some questions about your RFC:
- How can you access a defined case value?
  Something like Suit::Spades->value()
- Are the cases case-insensitive or case-sensitive?
  Suit::Spades vs. Suit::SPADES
- Are cases serializable?
  Suit::Spades === unserialize(serialize(Suit::Spades)) // true
- Do cases have a stable ordinal number / position and is that accessible?
  This would be very interesting for implementing an optimized EnumSet using a bit-array / bit-set internally.
- I often use metadata in enumerations and so I would be very interested in allowing constants.
  I do understand that they share the same naming table and that names need to be unique, but disabling constants altogether would limit the use cases in my opinion.
- Is it possible to detect / type-hint if something is any enum?
  Suit::Spades instanceof Enum
  This basically means that all enumerations would be based on a general enum.
  This would be very helpful for providing functionality specifically for enumerations; think of a Doctrine enumeration type or, again, an EnumSet / EnumMap.
Thanks Marc
[1] https://github.com/marc-mabe/php-enum
> - How can you access a defined case value?
>   Something like Suit::Spades->value()
At the moment it's only accessible via the ::cases() method. It may be appropriate to yoink the ->value() or ->value name to access it another way. We're debating that right now.
The primitive equivalent case is tricky, because other languages have about 14 different ways to think about it, all of them in the end incompatible. :-) We know we don't want to just have "fancy constants," but for those cases you do want/need a primitive instead (mainly writing to a DB or screen) it needs to be easy enough to get that. There's a number of competing priorities we're trying to balance here to make the DX as clean as possible.
To also respond to Pierre here, at present the primitives must be unique. We want to minimize the boilerplate that people have to write on every enum for common cases, and the cases() method is part of that. It won't capture everything, obviously, but if the common cases can be made trivial and the rare cases possible, that's a win.
> - Are the cases case-insensitive or case-sensitive?
>   Suit::Spades vs. Suit::SPADES
They end up as class names internally, so I believe that means they'd be case insensitive.
> - Are cases serializable?
>   Suit::Spades === unserialize(serialize(Suit::Spades)) // true
Right now they'd do the same as objects, so they'd serialize as an object. Unserializing like that, though... hm, that would probably NOT still be === due to the way PHP handles objects. That's probably undesirable, but I'm not sure at the moment the best way around that. I'll have to discuss with Ilija.
> - Do cases have a stable ordinal number / position and is that
>   accessible?
>   This would be very interesting on implementing an optimized EnumSet
>   using bit-array / bit-set internally
The order returned from cases() is stable as lexical order, but there's no ordinal number to access. If you want a bit set, you could do:
enum Permissions: int {
    case Read = 0x1;
    case Write = 0x10;
    case Exec = 0x100;
}
Right now there's no support for case ReadWrite = self::Read | self::Write. I don't know if that would be easy to add in the future, but I'd be OK with it if so. Mainly this runs into the same tricky questions around primitive equivalent handling (and what we can coax the lexer to do).
> - I often use metadata in enumerations and so I would be very
>   interested to allow constants.
>   I do understand that they share the same naming table and these needs
>   to be unique but disabling constants altogether would limit the
>   use-cases in my opinion.
That's what methods are for, or potentially __get. Allowing even
> - Is it possible to detect / type-hint if something is any enum?
>   Suit::Spades instanceof Enum
>   This basically means that all enumerations would to based on a
>   general enum.
>   This would be very helpful on providing functionalities especially
>   for enumerations thinking about a doctrine enumeration type or again an
>   EnumSet / EnumMap.
Not at the moment. We're discussing the implications of adding that.
What exactly are EnumSet and EnumMap, in your mind?
--Larry Garfield
> > - Are cases serializable?
> >   Suit::Spades === unserialize(serialize(Suit::Spades)) // true
> Right now they'd do the same as objects, so they'd serialize as an object. Unserializing like that, though... hm, that would probably NOT still be === due to the way PHP handles objects. That's probably undesirable, but I'm not sure at the moment the best way around that. I'll have to discuss with Ilija.
I guess what it comes down to is whether / how easily a class can return
an existing instance when asked to unserialize, rather than setting
properties on an existing instance. That is, given the string
"C:4:Suit:6:{Spades}" can the class definition return the appropriate
singleton for Suit::Spades rather than a newly constructed object?
If this proves tricky to implement, it would probably be better to
forbid serialization than using the default object format and breaking
the singleton-ness of the case objects.
Regards,
--
Rowan Tommins (né Collins)
[IMSoP]
Hi Larry Garfield,
> Right now they'd do the same as objects, so they'd serialize as an object.
> Unserializing like that, though... hm, that would probably NOT still be === due to the way PHP handles objects.
> That's probably undesirable, but I'm not sure at the moment the best way around that. I'll have to discuss with Ilija.
At a glance, it seems doable in PHP 8.1 internals to make the unserializer return a singleton if you override the C unserialize
callback of the class (this could introduce edge cases around destruction when unserialization fails, which I expect to be solvable).
This would require using the C: serialize() encoding (PHP uses that for classes implementing Serializable), not O:
(unserialize can be set to a generic C function that creates a value and puts it in *object*).
// Zend/zend.h
struct _zend_class_entry {
    /* .... serializer callbacks */
    int (*serialize)(zval *object, unsigned char **buffer, size_t *buf_len, zend_serialize_data *data);
    int (*unserialize)(zval *object, zend_class_entry *ce, const unsigned char *buf, size_t buf_len, zend_unserialize_data *data);
    /* ... */
};

// ext/standard/var_unserializer.c, in object_custom()
if (ce->unserialize == NULL) {
    zend_error(E_WARNING, "Class %s has no unserializer", ZSTR_VAL(ce->name));
    object_init_ex(rval, ce);
} else if (ce->unserialize(rval, ce, (const unsigned char*)*p, datalen, (zend_unserialize_data *)var_hash) != SUCCESS) {
    return 0;
}
(I'm one of the maintainers of https://pecl.php.net/package/igbinary , an alternative binary php serializer)
- Tyson
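For comparison, a userland check of the property under discussion. This sketch assumes enums as they later shipped in PHP 8.1, where a serialize/unserialize round trip yields the same case instance. It says nothing about how the draft implementation in this thread behaved.

```php
<?php
enum Suit
{
    case Hearts;
    case Spades;
}

// Round-trip a case through the serializer and compare by identity.
$copy = unserialize(serialize(Suit::Spades));

var_dump($copy === Suit::Spades);
```

If the round trip produced a fresh object instead, the identity comparison would fail, which is exactly the "South === South" breakage the messages above are worried about.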
On 05.12.20 at 16:08, "Larry Garfield" larry@garfieldtech.com wrote:
> > * How can you access a defined case value?
> >   Something like `Suit::Spades->value()`
> At the moment it's only accessible via the ::cases() method. It may be appropriate to yoink the ->value() or ->value name to access it another way. We're debating that right now.
> The primitive equivalent case is tricky, because other languages have about 14 different ways to think about it, all of them in the end incompatible. :-) We know we don't want to just have "fancy constants," but for those cases you do want/need a primitive instead (mainly writing to a DB or screen) it needs to be easy enough to get that. There's a number of competing priorities we're trying to balance here to make the DX as clean as possible.
For me it doesn't really matter how to get the primitive value as long as there is a simple way to do so.
As you wrote:
> Passing a Primitive Case to a primitive-typed parameter or return will produce the primitive value in weak-typing mode, and produce a TypeError in strict-typing mode.
This would already break simple function calls like the following use of strlen():
enum Suit: string {
    case Hearts = 'H';
    ...
    public function label(): string { return 'Label of ' . $this . ' with length ' . strlen($this); }
    // or, clearer in my opinion:
    public function label(): string { return 'Label of ' . $this->value . ' with length ' . strlen($this->value); }
}
> To also respond to Pierre here, at present the primitives must be unique. We want to minimize the boilerplate that people have to write on every enum for common cases, and the cases() method is part of that. It won't capture everything, obviously, but if the common cases can be made trivial and the rare cases possible, that's a win.
I agree here that the defined primitives need to be unique as well.
> > * Do cases have a stable ordinal number / position and is that
> >   accessible?
> >   This would be very interesting on implementing an optimized EnumSet
> >   using bit-array / bit-set internally
> The order returned from cases() is stable as lexical order, but there's no ordinal number to access. If you want a bit set, you could do:
>
> enum Permissions: int {
>     case Read = 0x1;
>     case Write = 0x10;
>     case Exec = 0x100;
> }
>
> Right now there's no support for case ReadWrite = self::Read | self::Write. I don't know if that would be easy to add in the future, but I'd be OK with it if so. Mainly this runs into the same tricky questions around primitive equivalent handling (and what we can coax the lexer to do).
> > * Is it possible to detect / type-hint if something is any enum?
> >   `Suit::Spades instanceof Enum`
> >
> >   This basically means that all enumerations would to based on a
> >   general enum.
> >   This would be very helpful on providing functionalities especially
> >   for enumerations thinking about a doctrine enumeration type or again an
> >   EnumSet / EnumMap.
> Not at the moment. We're discussing the implications of adding that.
> What exactly are EnumSet and EnumMap, in your mind?
An EnumSet is a set (unique cases) of the same enumeration type.
An EnumMap maps cases of the same enumeration type to another value.
The interesting part here is that this can be done in a very efficient way, without the need to iterate over it.
Say you have defined an enumeration of 200 countries:
enum Country: string {
    case USA = 'US';
    case Germany = 'DE';
    case Australia = 'AU';
    ...
}
$set1 = Country::USA | Country::Germany; // EnumSet<Country> of [Country::USA, Country::Germany]
$set2 = Country::USA | Country::Australia; // EnumSet<Country> of [Country::USA, Country::Australia]
$set3 = $set1 & $set2; // EnumSet<Country> of [Country::USA]
$set4 = $set1 & ~$set2; // EnumSet<Country> of [Country::Germany]
This could of course also be done without operator overloading, but it looks very clean with it.
All done internally using bit operations.
Greetings Marc
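A rough userland sketch of the EnumSet idea, without operator overloading. It assumes the enum syntax that later shipped in PHP 8.1. The EnumSet class and its method names are hypothetical, and the "ordinal" is simply the position of a case in cases(), which is only stable within one process. The single-integer bitmask limits this sketch to enums with at most PHP_INT_SIZE * 8 cases, so a 200-country enum would need several integers or a string-based bit array.

```php
<?php
declare(strict_types=1);

enum Country: string
{
    case USA = 'US';
    case Germany = 'DE';
    case Australia = 'AU';
}

// Hypothetical helper, not part of the RFC: a set of cases from one enum,
// stored as a single integer bitmask.
final class EnumSet
{
    private function __construct(
        private string $enumClass,
        private int $bits = 0,
    ) {}

    public static function of(string $enumClass, UnitEnum ...$cases): self
    {
        $bits = 0;
        foreach ($cases as $case) {
            // Ordinal = position in declaration order, as returned by cases().
            $ordinal = array_search($case, $enumClass::cases(), true);
            if ($ordinal === false) {
                throw new InvalidArgumentException('Case does not belong to ' . $enumClass);
            }
            $bits |= 1 << $ordinal;
        }
        return new self($enumClass, $bits);
    }

    // Set intersection, the equivalent of $set1 & $set2 above.
    public function intersect(self $other): self
    {
        return new self($this->enumClass, $this->bits & $other->bits);
    }

    // Set difference, the equivalent of $set1 & ~$set2 above.
    public function without(self $other): self
    {
        return new self($this->enumClass, $this->bits & ~$other->bits);
    }

    /** @return list<UnitEnum> the member cases, in declaration order */
    public function toArray(): array
    {
        $class = $this->enumClass;
        $members = [];
        foreach ($class::cases() as $ordinal => $case) {
            if ($this->bits & (1 << $ordinal)) {
                $members[] = $case;
            }
        }
        return $members;
    }
}

$set1 = EnumSet::of(Country::class, Country::USA, Country::Germany);
$set2 = EnumSet::of(Country::class, Country::USA, Country::Australia);
$set3 = $set1->intersect($set2);
$set4 = $set1->without($set2);
```

Intersection and difference are single integer operations here; only toArray() has to walk the list of cases.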
(Apologies if this posts twice, I may have accidentally hit send before
I'd finished writing it)
> - Do cases have a stable ordinal number / position and is that accessible?
>   This would be very interesting on implementing an optimized EnumSet using bit-array / bit-set internally
If you want to associate a stable value with each case, you can use a
"primitive-equivalent enum" and assign each case an integer manually. As
the RFC says:
> There are no auto-generated primitive equivalents (e.g., sequential
> integers).
I think that's a good design decision, because the language itself can't
actually guarantee a stable ordinal number if the enum's author makes
changes; e.g. given:
enum Colour { case RED; case GREEN; case BLUE; }
somebody might decide the cases should be in alphabetical order:
enum Colour { case BLUE; case GREEN; case RED; }
It seems unnecessarily confusing to have this be a breaking change
because code somewhere relied on the ordering. It's much clearer if the
author has to declare an intentional value, and can then maintain it
when making cosmetic changes to their own code:
enum Colour: int { case BLUE = 3; case GREEN = 2; case RED = 1; }
> - I often use metadata in enumerations and so I would be very
>   interested to allow constants.
Could you give an example of what you mean? Metadata on individual cases is
supported by methods, which map more cleanly to things like interfaces,
and the notion of each case as a singleton object, not a static class.
That said, I have a related question: can enum cases be used as the
value of constants? e.g.:
class OldMaid {
    public const SUIT = Suit::Spades;
    public const VALUE = CardValue::Queen;
}
If so, this leads to an interesting use case for constants on the enum
itself (not the cases) to define aliases:
enum Suit {
    case Hearts;
    case Diamonds;
    case Clubs;
    case Spades;
    const Tiles = self::Diamonds;
    const Clovers = self::Clubs;
    const Pikes = self::Spades;
}
// The constants and cases can be used interchangeably, since constants
// and cases are de-referenced with the same syntax
assert(Suit::Spades === Suit::Pikes);
Regards,
--
Rowan Tommins (né Collins)
[IMSoP]
On 05.12.20 at 16:21, "Rowan Tommins" rowan.collins@gmail.com wrote:
> > * Do cases have a stable ordinal number / position and is that accessible?
> >   This would be very interesting on implementing an optimized EnumSet using bit-array / bit-set internally
> If you want to associate a stable value with each case, you can use a
> "primitive-equivalent enum" and assign each case an integer manually.
Sorry for the confusion - I mean stable within the same process - not over different processes / systems.
> > * I often use metadata in enumerations and so I would be very
> >   interested to allow constants.
> Could you give an example what you mean? Metadata on individual cases is
> supported by methods, which map more cleanly to things like interfaces,
> and the notion of each case as a singleton object, not a static class.
I mean mapping something to something else, defined as a single associative array constant.
Something like:
enum Role {
    case User;
    case Admin;
    ...
}
enum Action {
    case Order_Edit;
    case Order_Read;
    private const BY_ROLE = [
        Role::User => [self::Order_Read],
        Role::Admin => [self::Order_Read, self::Order_Edit],
    ];
    public function isAllowed(User $user) {
        return in_array($this, self::BY_ROLE[$user->role]);
    }
}
Regards,
--
Rowan Tommins (né Collins)
[IMSoP]
> I mean mapping something to something else, defined as a single associative array constant.
> Something like:
>     enum Role {
>         case User;
>         case Admin;
>         ...
>     }
>     enum Action {
>         case Order_Edit;
>         case Order_Read;
>         private const BY_ROLE = [
>             Role::User => [self::Order_Read],
>             Role::Admin => [self::Order_Read, self::Order_Edit],
>         ];
>         public function isAllowed(User $user) {
>             return in_array($this, self::BY_ROLE[$user->role]);
>         }
>     }
That example can be rewritten directly as a static method and match
statement, which I think would look like this:
enum Action {
    case Order_Edit;
    case Order_Read;
    private static function byRole(Role $role) {
        return match($role) {
            Role::User => [self::Order_Read],
            Role::Admin => [self::Order_Read, self::Order_Edit],
        };
    }
    public function isAllowed(User $user) {
        return in_array($this, self::byRole($user->role));
    }
}
The scenario can also be modelled the other way around with case methods:
enum Action {
    public function isAllowed(User $user) {
        // Default case denying all, might be useful with a long list of cases
        return false;
    }
    case Order_Edit {
        public function isAllowed(User $user) {
            return $user->role === Role::Admin;
        }
    }
    case Order_Read {
        public function isAllowed(User $user) {
            return $user->role === Role::User
                || $user->role === Role::Admin;
        }
    }
}
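Per-case methods are draft syntax and, as noted later in the thread, were set aside for now. A rough equivalent of the default-deny logic above can be written as one method that matches on $this. The sketch assumes the enum syntax that shipped in PHP 8.1; the User class and the Order_Delete case are hypothetical additions for illustration.

```php
<?php
enum Role
{
    case User;
    case Admin;
}

// Hypothetical stand-in for the application's user object.
final class User
{
    public function __construct(public Role $role) {}
}

enum Action
{
    case Order_Edit;
    case Order_Read;
    case Order_Delete; // hypothetical case with no explicit rule

    public function isAllowed(User $user): bool
    {
        return match ($this) {
            self::Order_Edit => $user->role === Role::Admin,
            self::Order_Read => $user->role === Role::User
                || $user->role === Role::Admin,
            // Default arm denies everything that has no rule of its own.
            default => false,
        };
    }
}

var_dump(Action::Order_Read->isAllowed(new User(Role::User)));    // bool(true)
var_dump(Action::Order_Edit->isAllowed(new User(Role::User)));    // bool(false)
var_dump(Action::Order_Delete->isAllowed(new User(Role::Admin))); // bool(false)
```

Dropping the default arm would instead make match throw an UnhandledMatchError for any case without a rule, which may be preferable when every case is supposed to be covered.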
Regards,
--
Rowan Tommins (né Collins)
[IMSoP]
>> * I often use metadata in enumerations and so I would be very
>> interested to allow constants.
> Could you give an example what you mean? Metadata on individual cases is
> supported by methods, which map more cleanly to things like interfaces,
> and the notion of each case as a singleton object, not a static class.
> That said, I have a related question: can enum cases be used as the
> *value* of constants? e.g.:
>     class OldMaid {
>         public const SUIT = Suit::Spades;
>         public const VALUE = CardValue::Queen;
>     }
At present, no. They're "just" objects, and you can't assign an object to a constant. Unfortunately I'm not sure how to enable that without making them not-objects, which introduces all sorts of other complexity.
> I mean mapping something to something else, defined as a single associative
> array constant.
> Something like:
>     enum Role {
>         case User;
>         case Admin;
>         ...
>     }
>     enum Action {
>         case Order_Edit;
>         case Order_Read;
>         private const BY_ROLE = [
>             Role::User => [self::Order_Read],
>             Role::Admin => [self::Order_Read, self::Order_Edit],
>         ];
>         public function isAllowed(User $user) {
>             return in_array($this, self::BY_ROLE[$user->role]);
>         }
>     }
Because of their object-ness, I think you'd have to use a weak map defined at runtime:
class AccessControl {
    private WeakMap $perms;
    public function __construct() {
        $this->perms = new WeakMap();
        $this->perms[Role::User] = [Action::Order_Read];
        $this->perms[Role::Admin] = [Action::Order_Read, Action::Order_Edit];
    }
    public function isAllowed($user, $action): bool {
        return in_array($action, $this->perms[$user->role]);
    }
}
--Larry Garfield
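A self-contained variant of the weak map approach, assuming the enum syntax that shipped in PHP 8.1 (WeakMap itself has existed since PHP 8.0). It takes the role directly rather than a user object, which is a simplification for illustration only.

```php
<?php
enum Role
{
    case User;
    case Admin;
}

enum Action
{
    case Order_Read;
    case Order_Edit;
}

final class AccessControl
{
    // Keys are Role cases (objects); values are lists of allowed actions.
    private WeakMap $perms;

    public function __construct()
    {
        $this->perms = new WeakMap();
        $this->perms[Role::User] = [Action::Order_Read];
        $this->perms[Role::Admin] = [Action::Order_Read, Action::Order_Edit];
    }

    public function isAllowed(Role $role, Action $action): bool
    {
        // Strict mode compares the case objects by identity.
        return in_array($action, $this->perms[$role], true);
    }
}

$acl = new AccessControl();
var_dump($acl->isAllowed(Role::Admin, Action::Order_Edit)); // bool(true)
var_dump($acl->isAllowed(Role::User, Action::Order_Edit));  // bool(false)
```

The map has to be built at runtime because enum cases, being objects, are not valid array keys.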
Thank you everyone for the feedback so far! I've updated the RFC with a few changes, based on discussion here and elsewhere:
- Clarified that "enums behave like objects unless otherwise specified." That should clarify a lot of edge case questions. (Which is specifically one of the reasons to build off of objects. We can inherit answers to most edge cases.)
- Primitive-backed Cases have been renamed to Scalar Enums, because "Primitive-backed" is just too clumsy to say or write all the time.
- There's now formal internal interfaces defined for Enum, UnitEnum, and ScalarEnum to define the methods mentioned. These serve as both documentation and to allow user-space code to tell when a value it's dealing with is an enum, and a particular type of enum. (And therefore what enum methods are available.)
- Added a value() method to ScalarEnum, to make it easier to get at the scalar equivalent arbitrarily.
- Fixed a bunch of typos.
- Updated the reflection section to have getType() return a ReflectionType rather than a bare string.
(The patch will be updated for the above shortly as Ilija's time allows.)
--Larry Garfield
Thanks a lot for this RFC, Larry and Ilija! I can't imagine the amount of
thought and work put into this.
Enums are definitely a most-wanted PHP feature.
I played a bit with the early implementation, and love it so far. Here are
my thoughts on the RFC and the current implementation:
Serialization
> I guess what it comes down to is whether / how easily a class can return
> an existing instance when asked to unserialize, rather than setting
> properties on an existing instance. That is, given the string
> "C:4:Suit:6:{Spades}" can the class definition return the appropriate
> singleton for Suit::Spades rather than a newly constructed object?
> If this proves tricky to implement, it would probably be better to
> forbid serialization than using the default object format and breaking
> the singleton-ness of the case objects.
+1, totally agree with this statement. That's the first thing I tried when
playing with the implementation, and noticed that serialization is not
supported:
    echo unserialize(serialize(Status::Active));
    Notice: unserialize(): Error at offset 0 of 42 bytes
I would definitely expect strict equality to be maintained on enum cases
even after unserialization!
var_export()
Currently, var_export() returns a __set_state() syntax:
    var_export(Status::Active);
    Foo\Bar\Status::Active::__set_state(array(
    ))
I guess this should just return Foo\Bar\Status::Active.
Scalar Enums, ::cases()
The implementation does not support these yet, so I haven't had a chance to
play with them.
I share Pierre R.'s concerns, though:
> Does this mean that an enum can't have two cases with the same primitive
> value? I would very much like to be able to do so, for example, when you
> change a name and want to keep the legacy one for backward compatibility.
But I'd understand if you just disallow duplicate scalar values altogether,
this is probably the most sensible solution here.
Another idea that comes to mind is that cases() could return an iterator
instead of an array, having the cases as keys and the scalars as values,
but this would probably come as a surprise and be bad for DX.
Enum & UnitEnum interfaces
The implementation does not seem to support these yet. Taking the examples
from the RFC:
    Suit::Hearts instanceof Enum; // true => Parse error: syntax error, unexpected token "enum"
    Suit::Hearts instanceof UnitEnum; // true => FALSE
Best of luck with the RFC!
— Benjamin
> Thanks a lot for this RFC, Larry and Ilija! I can't imagine the amount of
> thought and work put into this.
> Enums are definitely a most-wanted PHP feature.
> I played a bit with the early implementation, and love it so far. Here are
> my thoughts on the RFC and the current implementation:
Yay!
Serialization
I've opened a task to just block serialization entirely for now. It's probably best to not support it at all than to have half-arsed buggy support, at least for now: https://github.com/Crell/enum-comparison/issues/46
var_export()
Good point. I've opened a task for that: https://github.com/Crell/enum-comparison/issues/47
Scalar Enums, ::cases()
Off hand, I'm not sure if any languages support duplicate enums. I can see the use case for renaming, but I suspect it's going to just be too complicated to support in practice. I'll make a note in the RFC that they must be unique.
Enum & UnitEnum interfaces
Yeah, we only added that to the RFC text an hour or three before you posted; Ilija has to catch up with the new text yet. :-) Please give him some time.
--Larry Garfield
Another update:
Based on feedback here, and after further discussion with Ilija and Nikita, we're going to try a different tack on some implementation details. Specifically:
- We're going to shift Enums to be a single class with a bunch of secret properties inside to hold the different case object instances, rather than a class per case. That should make the overall memory usage lower, especially for enums with a large number of cases. As an unfortunate side effect, this will preclude per-case methods, at least for now. :-(
- The ReflectionCase class will naturally go away at that point.
- For serialization, we'll introduce a new serialization marker, enum, which will make it feasible to round-trip an enum while maintaining singleton-ness. More specifically, the deserialize routine would essentially become $type::from($value), where $type and $value are pulled from the serialized version.
I was also wrong in one of my earlier statements; as currently implemented, enum cases are implemented as class constants that reference an object. (Something the engine is allowed to do even if user code cannot.) That means they do work as parameter defaults and can be assigned as a default value of an object property or constant. Huzzah!
Because it's the holidays these changes won't be immediate, but expect us to come back in a few weeks with the next draft.
Thank you everyone for your interest!
--Larry Garfield
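To illustrate the two points above (round-tripping without losing singleton-ness, and cases as constant or default values), here is a sketch assuming PHP 8.1 as eventually released, which may differ in details from the draft described in this message. The Card class and describe() function are hypothetical names.

```php
<?php
enum Suit: string
{
    case Hearts = 'H';
    case Spades = 'S';
}

// Hypothetical class holding an enum in a constant and a property default.
final class Card
{
    public const DEFAULT_SUIT = Suit::Spades;
    public Suit $suit = Suit::Spades;
}

// An enum case used as a parameter default.
function describe(Suit $suit = Suit::Hearts): string
{
    return $suit->name;
}

// Round trip of a bare case: the result is the same singleton.
$copy = unserialize(serialize(Suit::Spades));
var_dump($copy === Suit::Spades);              // bool(true)

// Round trip of an object graph that contains a case.
$card = unserialize(serialize(new Card()));
var_dump($card->suit === Card::DEFAULT_SUIT);  // bool(true)

echo describe(), PHP_EOL;                      // Hearts
```

Strict equality surviving the round trip is the property that was asked for earlier in the thread.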
On 08/12/2020 at 18:40, Larry Garfield wrote:
Hello,
I'm really glad to see all of you working hard on this, I have nothing
more interesting to say than "Thank you", all of you, very much.
And for the record, I didn't say so earlier because I didn't want to start
a flame war or troll, but now that Nikita has put it on the table, I'm free to
say that the implementation choices did make me fear for performance. Still,
they are implementation details, and the future can always improve them. Now I
don't fear anymore, thanks!
Have a good confined holiday, all!
Regards,
Pierre
> - We're going to shift Enums to be a single class with a bunch of secret properties inside to hold the different case object instances, rather than a class per case. That should make the overall memory usage lower, especially for enums with a large number of cases. As an unfortunate side effect, this will preclude per-case methods, at least for now. :-(
Imo, this is good. This whole sub-classing part looked to me as too much
(at least for this step). Thank you for all this work.
--
Aleksander Machniak
Kolab Groupware Developer [https://kolab.org]
Roundcube Webmail Developer [https://roundcube.net]
PGP: 19359DC1 # Blog: https://kolabian.wordpress.com
On 08.12.20 at 18:40, Larry Garfield wrote:
> For serialization, we'll introduce a new serialization marker, enum, which will make it feasible to round-trip an enum while maintaining singleton-ness. More specifically, the deserialize routine would essentially become $type::from($value), where $type and $value are pulled from the serialized version.
I wonder whether this mechanism could be generalized to a serialization
mechanism that would allow e.g. Database Repositories to pull entities
from a database upon unserialization.
My example would be something like this (simplified):
interface IdentifierBasedSerializable {
    static function serializeToIdentifier(self $object) : string;
    static function unserializeFromIdentifier(string $identifier) : self;
}
class ExternallyStoredEntity implements IdentifierBasedSerializable {
    static function serializeToIdentifier(self $object) : string {
        return $object->id;
    }
    static function unserializeFromIdentifier(string $identifier): self {
        if (!\array_key_exists($identifier, self::$objectCache)) {
            self::$objectCache[$identifier] = new self($identifier);
        }
        return self::$objectCache[$identifier];
    }
    private static $objectCache = [];
    public string $id;
    public function __construct(string $id) { $this->id = $id; }
}
Until now it is not possible to unserialize an object to the canonical
instance created earlier in the program execution.
Serialization and subsequent unserialization breaks object identity
($a->id === $b->id <=> $a === $b) that is often desired when working
with objects that represent external entities. The list of allowed enum
values is in this regard a very simple database of identifiers.
I know that passing a complex serialization string to a static
unserialization factory method has the serious problem that circular
references can not be handled. But using only an identifier as the sole
contents of the serialization string avoids this problem entirely.
Greets
Dennis
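The identity problem described above can be reproduced with an ordinary class. The sketch below is an illustration only: Entity and fromIdentifier() are hypothetical names, and the identifier lookup is done by hand, since unserialize() builds a new object for ordinary classes rather than consulting a registry.

```php
<?php
final class Entity
{
    /** @var array<string, self> Canonical instances, keyed by identifier. */
    private static array $cache = [];

    private function __construct(public string $id) {}

    // Returns the one canonical instance for a given identifier.
    public static function fromIdentifier(string $id): self
    {
        return self::$cache[$id] ??= new self($id);
    }
}

$a = Entity::fromIdentifier('order-42');

// Serializing the object itself yields an equal but distinct copy.
$copy = unserialize(serialize($a));
var_dump($copy->id === $a->id);               // bool(true)
var_dump($copy === $a);                       // bool(false)

// Serializing only the identifier and resolving it restores identity.
$id = unserialize(serialize($a->id));
var_dump(Entity::fromIdentifier($id) === $a); // bool(true)
```

Since only a plain string crosses the serialization boundary, circular references cannot arise, which matches the argument made above.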
Hi Larry,
On Sat, 5 Dec 2020 at 00:25, Larry Garfield larry@garfieldtech.com wrote:
Thanks for taking the topic. I love it.
Regarding the ::cases() method on UnitEnum, I guess it'd be more natural
to cast the enum into an array, like:
    (array) Suit;
but I realize it'd be harder to implement. The question is whether it was even
considered?
Regarding the ::from() method responsible for casting, was a natural cast
operator considered instead?
E.g.
    $suit = (Suit) $record['suit'];
Instead of:
    $suit = Suit::from($record['suit']);
Cheers,
Michał Marcin Brzuchalski
We didn't really get into explicit casting, since it's so rarely used in conventional PHP code these days. I'm not sure off hand if supporting (Suit)"H" or (string)Suit::Hearts would be easy or hard. That would be an Ilija question.
Assuming it's feasible to do, what do people feel about supporting that? IMO, cases(), from(), and values() need to be kept no matter what as they're more self documenting and can be passed around as callables. So the question is just whether we should also try to add casting as an alias to those operations.
--Larry Garfield
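One practical consequence of keeping from() and cases() as methods is that they compose with array functions. A small sketch, assuming the backed-enum syntax that shipped in PHP 8.1 (the draft discussed here may differ in details); the $records data is made up for illustration.

```php
<?php
enum Suit: string
{
    case Hearts = 'H';
    case Spades = 'S';
}

// Hypothetical rows, e.g. as fetched from a database.
$records = [['suit' => 'H'], ['suit' => 'S']];

// from() inside a closure, mapping each row to its case.
$suits = array_map(
    fn (array $record): Suit => Suit::from($record['suit']),
    $records
);
var_dump($suits === [Suit::Hearts, Suit::Spades]); // bool(true)

// from() passed directly as a callable.
$direct = array_map([Suit::class, 'from'], ['S', 'H']);
var_dump($direct[0] === Suit::Spades);             // bool(true)

// cases() feeding another array function to list the scalar values.
$values = array_map(fn (Suit $s): string => $s->value, Suit::cases());
var_dump($values === ['H', 'S']);                  // bool(true)
```

A cast operator could not be handed to array_map() this way, which is the point about callables made above.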
On 07/12/2020 at 16:26, Larry Garfield wrote:
Hello,
I deeply hate dark magic, and my motto is always "explicit is better".
So I'm NOT in favor of either magic/explicit casting or array-like syntax
on enums.
If it were only up to me, I'd remove array syntax from PHP altogether, except
for real numerically indexed static arrays, in favor of explicit
list/hashmap/etc. methods, but that's probably my Java past talking here :)
Regards,
Pierre
> Assuming it's feasible to do, what do people feel about supporting that? IMO, cases(), from(), and values() need to be kept no matter what as they're more self documenting and can be passed around as callables. So the question is just whether we should also try to add casting as an alias to those operations.
From my experience answering questions about SimpleXML on Stack
Overflow, I can confirm that people find magic behaviour of (string)
hard to discover and understand, and it's not uncommon to see someone
write $foo->__toString() because they're more familiar with methods.
That's even more true for other casts, e.g. (int)$foo and (float)$foo,
which can be supported by built-in classes but not userland ones, so are
even less discoverable.
I can see the appeal of overloading cast syntax, but I would personally
be on the "just use explicit methods" side.
Regards,
--
Rowan Tommins (né Collins)
[IMSoP]
I spoke with Ilija and it turns out it would be really hard to do anyway, so casts are out. I rescind my request for feedback on that point. :-)
--Larry Garfield
On Sat, Dec 5, 2020 at 12:25 AM Larry Garfield larry@garfieldtech.com
wrote:
Thanks for the proposal Ilija and Larry! Enums are long overdue. Some
initial thoughts, in no particular order:
- I think that serialization support needs to be part of the initial
proposals. Otherwise you wouldn't be able to store an enum in a property
(without poisoning the whole object graph as "non-serializable"). I expect
that a clean solution to this will require a new serialization modifier,
but don't think this should be overly hard to add. This should also not
introduce any serialization format compatibility concerns as long as it is
introduced in the same version that enums are, as any payloads using the
new format would only be meaningful on PHP versions that support enums.
- As "enum" becomes a reserved keyword, you can't have an interface called
"Enum"... If you wanted to, you could probably avoid a reserved keyword by
taking a page out of C++'s book and making the syntax "enum class Foo {}"
rather than "enum Foo {}", where "enum" would be a contextual keyword. I
think this is worth at least considering, because I expect that there's a
lot of existing enum libraries that will break due to this reserved
keyword. While they can now be replaced by native enums, this will cause
issues during migration and with code that is compatible with more than one
PHP version.
- Rather than WeakMap, the possibly more natural choice for using enum keys
is SplObjectStorage. Of course, SplObjectStorage, like anything that is
part of SPL, has some peculiarities... Of course, just allowing them as
array keys would be ideal, but I agree that this should not be covered by
this RFC. This is something I may look into.
- While not mentioned in the RFC, you mentioned in this discussion that
enum cases cannot be stored in constants:
> At present, no. They're "just" objects, and you can't assign an object
> to a constant. Unfortunately I'm not sure how to enable that without
> making them not-objects, which introduces all sorts of other complexity.
This should at the least be clarified in the RFC. Is this a limitation for
constants specifically, or anything using constexpr initializers? For
example, is writing "function foo(MyEnum $e = MyEnum::SomeDefaultCase)"
possible? It would be a significant limitation if it weren't.
Generally, I think the limitation on objects in constants is mostly
artificial and you should consider lifting it as part of this RFC.
Previously you simply couldn't create an object in a constexpr initializer,
so supporting this wasn't really relevant. With enums, this becomes
important.
- This has already been mentioned by others, but (in conjunction with the
previous point), I think that allowing class constants on enums is pretty
useful to allow case aliases. I agree that cases should be unique to start
with, but aliases should be possible, and class constants provide a very
neat way to provide this.
- While I originally liked the idea, after perusing the examples in the
RFC, I am not convinced that it is a good idea to allow methods (or
anything else) to be defined on a per-case level. Having methods on the
enum itself makes sense, but having them on each case seems like it
unnecessarily complicates the design, gives people multiple ways to write
the same thing and may encourage bad design.
I think there are two primary ways in which methods might be used: First,
defining a method for each case, such as your example in
https://wiki.php.net/rfc/enumerations#advanced_exclusive_values. In this
case you have the choice of either defining it on each case, or to define
it as a method on the whole enum. When would you choose one approach over
the other?
Defining the method on the whole enum seems generally superior to me,
because it guarantees that all cases have the method from an API
perspective (rather than just making it an incidental fact -- though I
guess you could add an abstract method to the enum?) Additionally it
requires a lot less code, especially if match is used. The example in the
RFC is even a bit skewed, because once PSR gets its dirty fingers on this
feature, all those "{" will get broken out on a new line and it will take
even more code.
The other usage is if a method is only defined by some of the cases, as
in the https://wiki.php.net/rfc/enumerations#state_machine example. This is
something I find very dubious from a design perspective, and not something
I would like to enable by making it simpler to implement. Do you know if
other languages have precedent for methods on individual enum cases?
- On the implementation side, a general concern I have is that this
requires generating a new class not just for each enum, but for each enum
case. Some usages of enums (say lexer tokens, AST node kinds, etc) may
require hundreds of enum cases, and will generate hundreds of separate
classes. Unlike objects, class entries are not cheap and are not designed
to be cheap.
- As another implementation note, existing switch/match jumptable
optimizations will not work for enums. This is pretty unfortunate, but I
don't have an idea on how we could make them work.
- I find the automatic downcast of enums to their scalar values a bit
problematic when taking the overall direction of the language into account.
We want fewer implicit casts, not more. While I'm sure this will work nicely
in some cases, it certainly won't in others. I daresay that passing an enum
to the $offset parameter of substr() doesn't make sense regardless of
whether the enum has an int backing it or not. Explicitly requiring a
->value() call doesn't seem like an undue burden to me.
Regards,
Nikita
Hi Nikita,
At present, no. They're "just" objects, and you can't assign an object
to a constant. Unfortunately I'm not sure how to enable that without
making them not-objects, which introduces all sorts of other complexity.
This should at the least be clarified in the RFC. Is this a limitation for
constants specifically, or anything using constexpr initializers? For
example, is writing "function foo(MyEnum $e = MyEnum::SomeDefaultCase)"
possible? It would be a significant limitation if it weren't.
Generally, I think the limitation on objects in constants is mostly
artificial and you should consider lifting it as part of this RFC.
Previously you simply couldn't create an object in a constexpr initializer,
so supporting this wasn't really relevant. With enums, this becomes
important.
I agree it should be clarified (and have phpt tests).
Another option would be to lift the limitation on objects in constant expressions,
but only for immutable objects
(currently just enum values).
There's a range of opinions on internals over what a constant would ideally be.
(something that is actually constant and not dependent on environment to make code as readable as possible,
or something that does not change at runtime, etc.)
So there may be objections if the change introduces the ability to write code like this,
but maybe more developers would find supporting only enums acceptable (continue throwing for non-enums):
define('CONFIGURATION', new stdClass());
class X { const CONFIGURATION = CONFIGURATION; }
X::CONFIGURATION->endpoints = [$dynamicUrl];
// do stuff
X::CONFIGURATION->endpoints = [$otherDynamicUrl];
Then again, this suggestion to only support enums may also be an unsatisfying compromise.
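For reference, here is a rough sketch of what the enum-only option could look like from userland. It assumes the enum syntax that later shipped in PHP 8.1, which was not settled when this thread was written. MyEnum and foo() come from the question quoted above; OtherCase, Alias and Settings are hypothetical names added for illustration.

```php
<?php
// Sketch only: assumes the enum syntax that later shipped in PHP 8.1.
// OtherCase, Alias and Settings are hypothetical names for illustration.
enum MyEnum
{
    case SomeDefaultCase;
    case OtherCase;

    // A class constant on the enum acting as an alias for a case.
    const Alias = self::OtherCase;
}

class Settings
{
    // An enum case used as a class constant initializer.
    const DEFAULT_CASE = MyEnum::SomeDefaultCase;
}

// An enum case used as a parameter default.
function foo(MyEnum $e = MyEnum::SomeDefaultCase): string
{
    return $e->name;
}

var_dump(foo());                                           // string(15) "SomeDefaultCase"
var_dump(Settings::DEFAULT_CASE === MyEnum::SomeDefaultCase); // bool(true)
var_dump(MyEnum::Alias === MyEnum::OtherCase);             // bool(true)
```

Because the cases carry no mutable state, none of these constants can be changed after the fact the way the stdClass example above can.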
Regards,
- Tyson
Thank you so much for moving this forward, Larry and Ilija! I myself have
tried to draft my proposal at https://wiki.php.net/rfc/enum_v2 which has
aspects both similar and different from yours. And "v2" is another
indicator that this feature is desired by many, so yay.
On Sat, Dec 5, 2020 at 2:25 AM Larry Garfield larry@garfieldtech.com
wrote:
A few comments:
- Having to write "case" everywhere is extremely verbose, I can't think of
a popular language that requires so much typing in this area. Can the
keyword be omitted completely?
- Performance concern: I suspect that most developers will not use the
advanced features from this proposal and your future plans much, instead
they will want to use it in very simple contexts, e.g.
"$order->calculateWeight(Weight::Gross)". Would every such call require an
object allocation and initialization? What about comparisons, would
expressions like "if ($cardType === Suit::Clubs) ..." require a new object?
- What do you think about the part of my proposal that allows omitting enum
name where it's obvious from context, e.g. "pick_a_card(Clubs)" if the
declaration of pick_a_card() requires a parameter of type Suit?
--
Best regards,
Max Semenik
I myself have tried to draft my proposal at https://wiki.php.net/rfc/enum_v2 which has
aspects both similar and different from yours.
I'm afraid that proposal would be a strong -1 from me, as "fancy
constants" are my least favourite type of enum. In particular, this
would be a deal-breaker:
Conversion from other types is not checked, thus enums can hold values
not covered by their constants.
The main reason I want enums is to make it impossible to represent
invalid states, so that they don't have to be accounted for at run-time.
Allowing someone to write (Month)13 without an immediate error would
completely invalidate that purpose.
- Performance concern: I suspect that most developers will not use the
advanced features from this proposal and your future plans much, instead
they will want to use it in very simple contexts, e.g.
"$order->calculateWeight(Weight::Gross)". Would every such call require an
object allocation and initialization? What about comparisons, would
expressions like "if ($cardType === Suit::Clubs) ..." require a new object?
As I understand it, Weight::Gross just looks up the same object each
time, so no allocation would be needed unless this happened to be the
first time you'd mentioned that enum.
Once you have an instance of an object, you're just passing around or
comparing a pointer (i.e. an integer) anyway, so I would expect it to
have roughly the same performance as passing or comparing a bare integer.
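A small sketch of that behaviour, assuming the enum syntax that later shipped in PHP 8.1 (where each case is a single shared object). Weight comes from the quoted example; the Net case and the variable names are hypothetical.

```php
<?php
// Sketch only: assumes the enum syntax that later shipped in PHP 8.1.
// The Net case and the variable names are hypothetical.
enum Weight
{
    case Gross;
    case Net;
}

$a = Weight::Gross;
$b = Weight::Gross;

// Both lookups yield the very same object, so === is an identity check.
var_dump($a === $b);                               // bool(true)
var_dump(spl_object_id($a) === spl_object_id($b)); // bool(true)
var_dump($a === Weight::Net);                      // bool(false)
```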
Regards,
--
Rowan Tommins (né Collins)
[IMSoP]
Did you discuss exhaustiveness checking already?
Did you discuss exhaustiveness checking already?
Nevermind, this is done by match already.
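As a rough illustration, assuming the enum syntax that later shipped in PHP 8.1: a match with no arm for a case throws UnhandledMatchError when that case is reached. Note that this is a runtime check, not a compile-time one. Suit appears earlier in the thread; the cases chosen and color() are hypothetical.

```php
<?php
// Sketch only: assumes the enum syntax that later shipped in PHP 8.1.
// The cases chosen and color() are hypothetical.
enum Suit
{
    case Hearts;
    case Spades;
    case Clubs;
}

function color(Suit $suit): string
{
    // Clubs is deliberately left out to show what happens.
    return match ($suit) {
        Suit::Hearts => 'Red',
        Suit::Spades => 'Black',
    };
}

try {
    color(Suit::Clubs);
    $caught = false;
} catch (\UnhandledMatchError $e) {
    // The missing arm is only detected when the value is matched.
    $caught = true;
}

var_dump($caught); // bool(true)
```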
Great work.
- Will enum methods be idempotent?
If not, shouldn't they be?
- Will enum cases be able to have attributes applied given the change in implementation?
- "Cases are not intrinsically backed by a scalar value."
Completely agree with not using an ordinal value. It is too easy to change an ordinal value and break some hidden dependency.
However, I would ask we consider a default string value, i.e. that this:
enum Size {
    case Small;
    case Medium;
    case Large;
}
Would be equivalent to:
enum Size {
    case Small = 'Small';
    case Medium = 'Medium';
    case Large = 'Large';
}
The justification is that, for use cases with a large number of cases, it would be too easy to introduce typos and/or copy-paste errors if the developer has to explicitly specify each value.
Also, the following would leave the values of Medium and Large undefined given that enumerations support only a single type at a time:
enum Size {
    case Small = 1;
    case Medium;
    case Large;
}
So, can enums get a default value of their name as a string when zero values are provided?
- "Class/Enum inheritance. - Enums are by design a closed list"
I would ask if this is really necessary to disable inheritance?
Consider the following as an example, where I use a known list for clarity but where what I am really more interested in is lists a developer maintains, i.e. their own app's list of errors:
enum OsErrors {
    case EPERM = 1;
    case EINTR = 4;
    case EIO = 5;
    case ENXIO = 6;
    case E2BIG = 7;
    ....
}
enum FileErrors extends OsErrors {
    case ENOENT = 2;
    case ENOEXEC = 8;
    case EBADF = 9;
    ....
}
enum ProcessErrors extends OsErrors {
    case ESRCH = 3;
    case ECHILD = 10;
    ....
}
Without inheritance a developer could not create a new error enum, such as SpeechRecognitionErrors, and include the base errors without having to duplicate them.
Now it is very possible that someone can focus on my example and explain why enums are not the right solution here, but please do not do that.
Instead, please consider that without inheritance nobody would be able to reuse a base set of enums without duplication of names and/or values.
So can we revisit the idea of disallowing inheritance?
- Someone else mentioned shortcut syntax, which I would like to mention again, although I realize implementation details might make this a non-starter.
So if I have a function that accepts a Size from above, e.g.:
function show(Size $size) {}
Then it would be great if we could call the function like this since its parameter was type-hinted:
show(::Medium)
Rather than always having to write:
show(Size::Medium)
So can we consider a shortcut syntax?
Again, great work!
-Mike
- Will enum methods be idempotent?
If not, shouldn't they be?
Can you clarify what you mean here? The only meaningful question I can
think of is "can they change the object's state?" That's mostly answered
in the RFC, most notably by specifying that enums may not have instance
properties, to avoid them having any state to change.
Unless I'm missing something, trying to define "idempotence" or "pure
functions" any more strictly than that would surely be a massive project
in itself - for a start, you'd need a whitelist of all built-in
operations which were side-effect free (i.e. no file writes,
configuration changes, output, etc, etc).
Regards,
--
Rowan Tommins (né Collins)
[IMSoP]
- Will enum methods be idempotent?
If not, shouldn't they be?
Can you clarify what you mean here?
For a given parameter set the method would always return the same value. Note I am not suggesting that values would be fixed across different executions, only within a given execution.
Said another way, idempotent and deterministic[1] are effectively equivalent.
In this example, the now() method would be neither idempotent nor deterministic:
enum DayParts {
    case Morning;
    case Afternoon;
    case Evening;
    case Night;
    public static function now() {
        $now = intval(date('H',time()));
        return match(true) {
            $now < 8 => static::Night,
            $now < 12 => static::Morning,
            $now < 18 => static::Afternoon,
            default => static::Evening,
        };
    }
}
The only meaningful question I can think of is "can they change the object's state?" That's mostly answered in the RFC, most notably by specifying that enums may not have instance properties, to avoid them having any state to change.
Unless I'm missing something, trying to define "idempotence" or "pure functions" any more strictly than that would surely be a massive project in itself - for a start, you'd need a whitelist of all built-in operations which were side-effect free (i.e. no file writes, configuration changes, output, etc, etc).
I do not believe it should be a massive project. I believe it could be implemented with a simple map that takes a hash of the input parameters and maps to their return value. For idempotency this value should be calculated the first time the method is called.
After the first call, every subsequent call would hash the input parameters, look up the pre-calculated value from the map, and then just return it without re-executing any code in the method except for the return statement (I am thinking of the desire to set a breakpoint in XDEBUG on return for all accesses, not just the first.)
The one potential concern is if the number of potential input permutations is large it could eat a lot of memory if the app actually used a lot of them. But then that is true on any use of memory in PHP, so if it became a problem for a developer I think it should be on them to rearchitect their solution to avoid using too much memory.
This is an important question to answer now because IF the answer to idempotency is "no" then restricting it later to be idempotent would require accepting a BC break. But if the answer is YES, the restriction could always be relaxed in a future version if that ever made sense.
-Mike
What you're describing is memoization. Memoization is only safe on idempotent functions. Please do not confuse the two terms, as they mean very different things.
However, as long as global variables exist in the language and there are file IO operations defined, we cannot guarantee that a given method is idempotent and thus safely memoizable. Saying "well this function/method really should be idempotent if you're doing it right" (even when correct) is insufficient justification for blindly memoizing it. That's true regardless of whether or not the method in question is on an enum.
Having a way for developers to flag a function as safe to memoize would be helpful, but is a completely different topic from Enums.
Forbidding enum-bound state is as close to guaranteed idempotence as PHP allows, which is what the current RFC does.
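To make the distinction concrete, here is a minimal sketch of why blind memoization is unsafe once a function reads outside state. Everything in it is hypothetical: memoize(), $clock and $dayPart are illustrative stand-ins, not part of the RFC or of PHP itself.

```php
<?php
// Sketch only: memoize(), $clock and $dayPart are hypothetical names.
function memoize(callable $fn): callable
{
    $cache = [];
    return function (...$args) use ($fn, &$cache) {
        // Key the cache on the arguments only; outside state is invisible here.
        $key = serialize($args);
        if (!array_key_exists($key, $cache)) {
            $cache[$key] = $fn(...$args);
        }
        return $cache[$key];
    };
}

// Stand-in for external state such as the clock, a global, or a file.
$clock = 7;
$dayPart = function () use (&$clock): string {
    return $clock < 12 ? 'Morning' : 'Afternoon';
};

$cached = memoize($dayPart);

$first  = $cached();  // computed from the current state: 'Morning'
$clock  = 15;         // the outside world changes
$second = $cached();  // served from the cache: still 'Morning'
$fresh  = $dayPart(); // the un-memoized call sees the change: 'Afternoon'

var_dump($first, $second, $fresh);
```

The arguments are identical on both calls, so the cache cannot tell that the right answer has changed.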
--Larry Garfield
- Someone else mentioned shortcut syntax, which I would like to mention
again, although I realize implementation details might make this a non-starter.
So if I have a function that accepts a Size from above, e.g.:
function show(Size $size) {}
Then it would be great if we could call the function like this since its
parameter was type-hinted:
show(::Medium)
Rather than always having to write:
show(Size::Medium)
So can we consider a shortcut syntax?
I'm not sure what value a shortcut syntax would bring, but it would surely
break, or at least make ambiguous, union types:
function show(Enum1|Enum2 $value) {}
show(::Medium)
Which enum would Medium resolve to?
- Benjamin
I'm not sure what value a shortcut syntax would bring, but it would surely break, or at least make ambiguous, union types:
function show(Enum1|Enum2 $value) {}
show(::Medium)
Which enum would Medium resolve to?
Good question.
What could happen is:
- It could resolve to the Enum that has a ::Medium,
- Or it could be disallowed for union type hints (probably the better option as adding a "Medium" to the other enum could inadvertently break code that calls it.)
- This suggestion could just be tabled.
Again, I am just asking if this is something we could consider because it would be nice to shorten line length in cases where expressions get really long.
Maybe a different question could be if a "use" statement could empower making explicit shorthands/aliases?
-Mike
P.S. I completely understand if either of these things are out of scope for this RFC.
The first step, for unit enumerations, is here:
https://wiki.php.net/rfc/enumerations
There's still a few bits we're sorting out and the implementation is mostly done, but not 100% complete.
I use 'Enumerations' quite extensively but have not found the lack of a
hard coded base for that a limitation at least partially because as
provided in databases the fundamental restrictions that imposes make
using them something of a 'bodge job'.
There are two areas I would highlight as causing problems. ...
1/ Dynamic Enumerations ... where the application may need to add or
delete values over time. I have a number of tables in the database which
provide a dynamic list of elements which are used to provide the lists
inside the PHP functionality. These are invariably managed by a numeric
index in addition to the text of each item, and historic values remain
in the database flagged as inactive.
2/ An area that PHP remains poor at supporting: internationalization.
Since the vast number of end users do not have English as a first
language, translations of the enumeration values become essential,
and the table approach obviously works nicely here since one simply
provides multiple sets of text in parallel and selects the language
needed as an option.
I am not saying that there is anything wrong with the proposal, only
that as with many aspects being proposed these days, there is a distinct
lack of consideration on just how some aspects of their use can be
expanded to cover internationalization and the example in the RFC has no
obvious way of supporting a different language?
--
Lester Caine - G8HFL
Contact - https://lsces.uk/wiki/Contact
L.S.Caine Electronic Services - https://lsces.uk
Model Engineers Digital Workshop - https://medw.uk
Rainbow Digital Media - https://rainbowdigitalmedia.uk