Hi All,
This has come up in passing a few times recently, but I'm not sure
there's ever been a dedicated discussion of it: would it be useful for
PHP to have a built-in Enumeration type, and if so, how should it look?
Many other languages have enum types, with varying functionality. The
central concept is always that there's some declared type with a bunch
of named constants, but beyond that there's a lot of variability: for
starters, is it a type of object, a special way of naming integers, or
an uncomparable type more like booleans and nulls?
Here are my thoughts on what I'd like from a PHP enum - I'm not using
"should" in a particularly strong sense, it's just a way of framing the
points of discussion:
1. enums should act like values, not mutable objects; an array-style
   copy-on-write behaviour would be possible, but it feels wrong to me to
   store any state in an enum variable, so just plain immutable would be
   my preference
2. the members of an enum should have a name which is accessible as a
   string, e.g. Weekdays::SUNDAY->getName() == 'SUNDAY'
3. there should be no accessible "value" for a member; the value of
   Weekdays::SUNDAY is simply Weekdays::SUNDAY, not 0 or 7 or 'SUNDAY'
   (I'm thinking that internally, each member would be represented as an
   object pointer, so there's no real benefit to forcing people to number
   everything like some languages do)
4. each enum member should be considered a singleton, in the sense that
   you can't construct or destroy one, only reference them; all possible
   instances would be created as soon as the enum was defined
5. following from (3) and (4), an enum value should compare equal only
   to itself, unlike in C# for instance, where it would be comparable to
   an integer based on its numeric value; similarly it shouldn't be
   possible to cast to or from an enum
6. an enum should be able to have additional fields; these would be
   initialised in the enum's definition; this is inspired by Java and
   Python's ability to pass parameters to the "constructor" of the enum,
   but it feels weird to me for any logic to happen in that constructor
   other than simple assignments, so I'm thinking of a simpler syntax and
   implementation. It also simplifies immutability if no userland code
   ever writes to the properties. There may be an important use case for
   constructor logic I'm missing though?
7. an enum should have default static methods for accessing all the
   members of the enum as an associative array
8. enums should be a first-class type, is_object(Weekdays::SUNDAY)
   should return false, for instance; maybe Weekdays::SUNDAY instanceof
   Weekdays should return true though
9. additional static and instance methods should be definable, bearing
   in mind the immutability constraints already discussed
Given the above, I think we might end up with something like this:
enum Weekdays {
    // if there are no fields to initialise, the member just needs its
    // name declaring
    member MONDAY;

    // positional arguments for populating fields in the order they are
    // defined; a bit like Java, but without the constructor
    member TUESDAY ( 2, 'Chewsdae' );

    // or maybe a named-parameter syntax to make things clearer
    member WEDNESDAY { $dayNumber = 3, $sillyName = 'Wed Nose Day' };

    // don't force people to write each entry on its own line; maybe
    // even the "member" keyword is too much?
    member THURSDAY, FRIDAY, SATURDAY, SUNDAY;

    public $dayNumber, $sillyName; // fields initialised for each member

    public static function getWeekend() {
        return [ self::SATURDAY, self::SUNDAY ];
    }

    public function getZeroIndexedDayNumber() {
        return $this->dayNumber - 1;
    }
}
$today = Weekdays::THURSDAY;
foreach ( Weekdays::getMembers() as $day_name => $day ) {
    echo $day->dayNumber;
    if ( $day == $today ) {
        echo "Today!";
    } else {
        echo $day_name; // equivalently: echo $day->getName();
    }
}

// Do we need a static method to access a member by name, or is this
// good enough?
$favourite_day = Weekdays::getMembers()[ $_GET['fav_day'] ];
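
For comparison, the core semantics listed above (members are singletons, carry a name, and compare equal only to themselves) can be roughly approximated in userland PHP. This is only a sketch under stated assumptions, not the proposed syntax: the class name WeekdaySketch, the method-call form WeekdaySketch::SUNDAY() and the day numbers are hypothetical stand-ins.

```php
<?php
// Hypothetical userland approximation; names and day numbers are illustrative only.
final class WeekdaySketch
{
    /** @var WeekdaySketch[]|null name => member, built once on first use */
    private static $members = null;

    private $name;
    private $dayNumber;

    // Private constructor: outside code can only reference members, never create them.
    private function __construct($name, $dayNumber)
    {
        $this->name = $name;
        $this->dayNumber = $dayNumber;
    }

    // Private __clone keeps each member a single instance.
    private function __clone()
    {
    }

    /** Returns all members as an associative array keyed by name. */
    public static function getMembers()
    {
        if (self::$members === null) {
            $names = ['MONDAY', 'TUESDAY', 'WEDNESDAY', 'THURSDAY',
                      'FRIDAY', 'SATURDAY', 'SUNDAY'];
            self::$members = [];
            foreach ($names as $index => $name) {
                self::$members[$name] = new self($name, $index + 1);
            }
        }
        return self::$members;
    }

    // Lets WeekdaySketch::SUNDAY() stand in for the proposed Weekdays::SUNDAY.
    public static function __callStatic($name, $arguments)
    {
        $members = self::getMembers();
        if (!isset($members[$name])) {
            throw new InvalidArgumentException('No such member: ' . $name);
        }
        return $members[$name];
    }

    public function getName()
    {
        return $this->name;
    }

    public function getDayNumber()
    {
        return $this->dayNumber;
    }
}

$today = WeekdaySketch::THURSDAY();
foreach (WeekdaySketch::getMembers() as $dayName => $day) {
    // Identity comparison: a member is equal only to itself.
    echo $day->getDayNumber(), ' ', ($day === $today ? 'Today!' : $dayName), "\n";
}
```

Unlike point 8 of the list, is_object() returns true for these members, which is one of the things a built-in type could do differently.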
So, what are anyone's thoughts? Am I rambling on too much as usual? ;)
Regards,
--
Rowan Collins
[IMSoP]
Hi,
2015-09-17 20:06 GMT-03:00 Rowan Collins rowan.collins@gmail.com:
> This has come up in passing a few times recently, but I'm not sure there's
> ever been a dedicated discussion of it: would it be useful for PHP to have a
> built-in Enumeration type, and if so, how should it look?
> [...]
A kitten is working on that: https://wiki.php.net/rfc/enum. Many points
on the linked RFC are compatible with the points you raised, so it might
be worth reading before even starting any new proposal.
Regards,
Márcio
Aha, I hadn't seen that, should have thought to search the wiki. Still,
interesting that we seem to mostly agree on the key decisions, although
I've gone into more detail about things that he's left as Future Scope,
which is fair enough.
The main thing I'd change if I was writing it is the implicit "ordinal"
property; I don't really see the point of insisting that the order I
write "Red, Green, Blue" has some meaning. On the other hand, I'd give
much more prominence to the name, e.g. by making values() return an
associative array. You're much more likely to want to say "the value
whose name is 'RED'" than "the value I defined first in the list", IMHO.
If you have some kind of initialiser syntax for the values - with or
without a constructor - you get to have ordinals or explicit values
(like the Flags example) if you want them, and just names if not:
enum RenewalAction {
    Deny( 0 ),
    Approve( 1 );

    public $ordinal;
}

enum Flags {
    a( 1 << 0 ),
    b( 1 << 1 ),
    c( 1 << 2 ),
    d( 1 << 3 );

    public $value;
}

enum PrimaryColours { RED, GREEN, BLUE }
Regards,
--
Rowan Collins
[IMSoP]
On 18.09.2015 at 01:56, Rowan Collins rowan.collins@gmail.com wrote:
> On the other hand, I'd give much more prominence to the name, e.g. by
> making values() return an associative array.
The reason it is not an associative array is that the names are not important.
But we actually need ordinal(), else we'll end up writing array_search($value, TheEnum::values()).
It's mainly important for the purpose of serializing/unserializing (manually or internally).
You should never rely on the ordinal value of the enum for anything. Enums are supposed to be value-less. It's a type. It's not a value. The only use of the value is when going external.
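
To make the trade-off concrete: without an ordinal() method, the position of a member has to be searched for in the list of values. A minimal sketch, using plain objects as hypothetical stand-ins for enum members and a hand-written $values array in place of TheEnum::values():

```php
<?php
// Hypothetical stand-ins for enum members; real members would come from TheEnum::values().
$red = new stdClass();
$green = new stdClass();
$blue = new stdClass();
$values = [$red, $green, $blue];

// array_search() takes the needle first; strict mode compares objects by identity.
$ordinal = array_search($green, $values, true);

// "Going external": store the integer, then map it back to the member later.
$stored = json_encode($ordinal);
$restored = $values[json_decode($stored)];

var_dump($ordinal);             // int(1)
var_dump($restored === $green); // bool(true)
```

The strict flag matters here: with loose comparison, two distinct but structurally identical objects would compare equal.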
Also, I honestly think the Flags example is a bad one. (At least as long as we don't have a possibility to typehint combined enums...). If you need that, use class constants. They provide exactly what you need.
Bob
Besides providing serialization mapping (or, the equivalent of a __toString() value if the enum is used in a string context), an important benefit to providing a value for enums would be to allow for changing or deprecating enums. So you might want to say something like:
enum FormEvents {
    PRE_SUBMIT,
    PRE_BIND = PRE_SUBMIT, // deprecated name, provide mapping to new name
}
Another good reason for wanting to provide values is so that you could write code like this:
$eventHandlers = [
    FormEvents::PRE_SUBMIT => function() { … },
    FormEvents::POST_SUBMIT => function() { … },
    ...
];
Although maybe the ideal solution here would be to allow for array indexes to have more types than just int and string (so that the array index actually is the enum, not some serialized representation).
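
PHP array keys are always integers or strings, so an object-like enum member could not be used directly as an index. One existing way to key a map by object identity is SplObjectStorage; the sketch below uses plain objects as hypothetical stand-ins for FormEvents::PRE_SUBMIT and FormEvents::POST_SUBMIT.

```php
<?php
// Hypothetical stand-ins for two enum members.
$preSubmit = new stdClass();
$postSubmit = new stdClass();

// SplObjectStorage maps objects (by identity) to arbitrary data.
$eventHandlers = new SplObjectStorage();
$eventHandlers[$preSubmit] = function () { return 'pre-submit handler'; };
$eventHandlers[$postSubmit] = function () { return 'post-submit handler'; };

$handler = $eventHandlers[$postSubmit];
echo $handler(), "\n"; // post-submit handler
```

This keeps the member itself as the key, rather than a serialized representation, at the cost of not being a plain array.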
-John
> The reason it is not an associative array is that the names are not
> important.
> ...
> You never should rely on the ordinal value of the enum for anything.
I feel like I'm missing something here. In my mind, the only absolute universal about all enum implementations is that you refer to values by constant names - e.g. Weekdays::SUNDAY. The name 'SUNDAY' is as fundamental and unchanging as the name 'Weekdays'.
We rely on names to reference classes and functions all the time, and to serialize properties of an object; so what is it about enums that makes having an integer accessible so important?
I note that Java does supply an ordinal(), but the docs say you should basically never use it.
Regards,
Rowan Collins
[IMSoP]
On 18.09.2015 at 10:42, Rowan Collins rowan.collins@gmail.com wrote:
> We rely on names to reference classes and functions all the time, and to
> serialize properties of an object; so what is it about enums that makes
> having an integer accessible so important?
Well, I think we should either have an ordinal or a name.
But not both.
Currently, after thinking about it, I'm in favor of just a name. And no ordinal.
Having both is, I think, unnecessary.
Bob
Bob Weinand wrote on 18/09/2015 15:23:
> Well, I think we should either have an ordinal or a name.
> But not both.
> Currently, after thinking about it, I'm in favor of just a name. And no ordinal.
> Having both is, I think, unnecessary.
Yeah, that was my instinct. It's a bit like surrogate keys on database
tables - if your table has a natural Primary Key, there's no need to add
an auto-increment field. :)
And the name kind of has to exist anyway, or you can't reference the
enum value at all. You could make it inaccessible, but then not being
able to get a string representation for debugging, or to do a dynamic
Weekdays::getByName($_GET['day']), would be annoying.
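
A lookup by name might look something like the sketch below. The name getByName comes from the example above, but its signature and the choice to throw on unknown names are assumptions, and the $members array is a hypothetical stand-in for Weekdays::getMembers(). Indexing the array directly with untrusted input would instead produce a notice or warning and null for an unknown name.

```php
<?php
// Hypothetical helper: resolve an untrusted string to a member or fail loudly.
function getByName(array $members, $name)
{
    if (!is_string($name) || !array_key_exists($name, $members)) {
        throw new InvalidArgumentException('Unknown member name');
    }
    return $members[$name];
}

// Stand-in for Weekdays::getMembers(): name => member.
$members = [
    'SATURDAY' => new stdClass(),
    'SUNDAY'   => new stdClass(),
];

$day = getByName($members, 'SUNDAY');
var_dump($day === $members['SUNDAY']); // bool(true)

try {
    getByName($members, 'FUNDAY');
} catch (InvalidArgumentException $e) {
    echo $e->getMessage(), "\n"; // Unknown member name
}
```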
Regards,
Rowan Collins
[IMSoP]
A few questions wrt the RFC: https://wiki.php.net/rfc/enum
> An enum value is only equal to itself.
I'm not sure I agree. How then do I store the enum into a DB and compare it
after reading?
switch ($db->query('select role from user where user_id = 123')->fetch()[0]) {
    case UserRole::Admin:
        include 'admin.php';
        break;
    case UserRole::User:
        include 'user.php';
        break;
    default:
        include 'login.php';
        break;
}
Would this not ALWAYS include login if UserRole is an enum (instead of a
final class with constants as I use now, which a switch like this will work
for)? Is this only possible with the magic value method? In which case I'd
have to check the value method, something like case
UserRole::Admin->ordinal() (or name, or value or whatever)?
> This means that if the name enum is used for a property, function,
> method, class, trait or interface there will now be a parse error instead.
Surely, a property can be named enum, since it can be named other reserved
words, for example?
$ php
<?php
class Foo {
    public $function = 'callMe';
    public $trait = 'useMe';
    public $class = 'instantiateMe';
}
$f = new Foo();
var_dump(get_object_vars($f));
array(3) {
  'function' =>
  string(6) "callMe"
  'trait' =>
  string(5) "useMe"
  'class' =>
  string(13) "instantiateMe"
}
Also, I really like the idea of being able to type hint the enum in the
switch(UserRole) that was mentioned. Not sure I like the idea of an
implicit throw though, consider the UserRole above, I might have 10
different roles, but the current switch is only valid for 5 of those roles.
In this situation, if I add a default: return Auth::NotAuthorized; will
that suppress the implicit throw? If so, how often will the implicit
throw really be hit? I know we have standards that all case statements must
have default statements, to make sure every case is handled, whether we
foresaw it or not.
On 18.09.2015 at 02:27, Ryan Pallas derokorian@gmail.com wrote:
>> An enum value is only equal to itself.
> I'm not sure I agree. How then do I store the enum into a DB and compare it
> after reading?
You could use UserRole::values()[$ordinal] and then compare against that.
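
That suggestion could be sketched as follows, assuming the ordinal is what gets stored in the database. UserRoleSketch, its values() method and pageFor() are hypothetical; the point is only that the stored integer is mapped back to a member first, and the member is then compared by identity.

```php
<?php
// Hypothetical stand-in for an enum with two members.
final class UserRoleSketch
{
    private static $values = null;
    private $name;

    private function __construct($name)
    {
        $this->name = $name;
    }

    /** Members in declaration order, so the array index acts as the ordinal. */
    public static function values()
    {
        if (self::$values === null) {
            self::$values = [new self('Admin'), new self('User')];
        }
        return self::$values;
    }

    public function getName()
    {
        return $this->name;
    }
}

function pageFor($storedOrdinal)
{
    $values = UserRoleSketch::values();
    $role = $values[$storedOrdinal] ?? null; // null for an unknown ordinal

    if ($role === $values[0]) {
        return 'admin.php';
    }
    if ($role === $values[1]) {
        return 'user.php';
    }
    return 'login.php';
}

echo pageFor(0), ' ', pageFor(1), ' ', pageFor(7), "\n"; // admin.php user.php login.php
```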
>> This means that if the name enum is used for a property, function,
>> method, class, trait or interface there will now be a parse error instead.
> Surely, a property can be named enum, since it can be named other reserved
> words, for example?
Obvious mistake in the RFC, I just changed it...
> Also, I really like the idea of being able to type hint the enum in the
> switch(UserRole) that was mentioned. Not sure I like the idea of an
> implicit throw though [...]
I'd really put that under Future Scope though. It's not really important to the feature of Enums themselves and can always be added later on.
Bob
On 18.09.2015 at 01:06, Rowan Collins rowan.collins@gmail.com wrote:
> This has come up in passing a few times recently, but I'm not sure there's ever been a dedicated discussion of it: would it be useful for PHP to have a built-in Enumeration type, and if so, how should it look?
I like enums in general, but I'd like to note that there's already an RFC in draft by Levi:
https://wiki.php.net/rfc/enum
As far as I know, the RFC is already fairly final and just lacks an implementation.
So, I'd consider bikeshedding an actual RFC first.
Bob
If we’re bikeshedding, one feature I would really like to see, with typehinting, is warnings if all cases of an enum aren’t handled in a switch. So, for example, given our example Weekdays enum, if I wrote this code:
switch (Weekday $someWeekday) {
    case Weekday::MONDAY: break;
    case Weekday::TUESDAY: break;
}
By providing the typehint, I’m indicating that I want to get a warning/error that the switch does not cover all enum values. This would be very handy if an enum value is added after initial development and someone misses a switch statement in cleanup.
The typehint would also allow generating a warning if someone did something like
switch (Weekday $someWeekday) {
    // case … all the weekdays: break;
    case 'I am not a Weekday': echo 'Generate a fatal error here because string is not a Weekday.';
}
-John
On 18.09.2015 at 01:52, John Bafford jbafford@zort.net wrote:
> If we’re bikeshedding, one feature I would really like to see, with
> typehinting, is warnings if all cases of an enum aren’t handled in a switch.
So, you mean like an implicit
default: throw new Error("Unhandled enum value Weekday::FRIDAY");
Possible, but then you can also just add
default: assert(0);
instead of a typehint.
Doing this at compile time won't really be possible, as, when the switch is encountered, the enum might not yet have been loaded. That's one of the consequences of PHP's lazy inclusion system...
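
As a sketch of the explicit alternative, using class constants as a stand-in for the enum (WeekdayConst and describe() are hypothetical names). Note that assert() is only evaluated when assertions are enabled through the zend.assertions setting, which production configurations typically turn off, so the sketch throws instead.

```php
<?php
// Class constants standing in for enum members (hypothetical).
final class WeekdayConst
{
    const MONDAY = 'MONDAY';
    const TUESDAY = 'TUESDAY';
    const FRIDAY = 'FRIDAY';
}

function describe($day)
{
    switch ($day) {
        case WeekdayConst::MONDAY:
            return 'start of the week';
        case WeekdayConst::TUESDAY:
            return 'second day';
        default:
            // Runs for any value without its own case, including members added later.
            throw new UnexpectedValueException('Unhandled enum value Weekday::' . $day);
    }
}

echo describe(WeekdayConst::MONDAY), "\n"; // start of the week

try {
    describe(WeekdayConst::FRIDAY);
} catch (UnexpectedValueException $e) {
    echo $e->getMessage(), "\n"; // Unhandled enum value Weekday::FRIDAY
}
```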
Bob
Compile-time might not be possible, true, but at least this would provide for runtime checks, and the benefit of explicitly stating intention is that intention is explicitly stated. Also, it would allow the sort of errors I’m proposing be generated to all be standardized across all applications, which would allow fancy error handlers, such as with Symfony, to offer helpful context-specific suggestions. Also, it’d provide static analyzers additional information by which to provide correctness alerts.
-John
On 18.09.2015 at 02:35, John Bafford jbafford@zort.net wrote:
> Compile-time might not be possible, true, but at least this would provide for runtime checks, and the benefit of explicitly stating intention is that intention is explicitly stated.
> [...] Also, it’d provide static analyzers additional information by which to provide correctness alerts.
Static analyzers etc. can all look at the cases and conclude that the type should indeed have been a certain Enum. I don't really see a particular advantage here.
And for the run-time checks, just add a default: assert(0); as said. I don't see loads of benefits from this extra syntax.
Also, as said, I'd pretty much prefer that to be separated out in a different RFC as it isn't crucial to the Enum feature itself.
Bob
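A minimal sketch of the runtime check discussed above, assuming plain class constants as a stand-in for an enum (the Weekday class and the describe() function are illustrative names, not part of any proposal). The default branch is what catches a member added after the switch was written:

```php
<?php
// Hypothetical stand-in for an enum: plain class constants.
final class Weekday
{
    const MONDAY = 'MONDAY';
    const TUESDAY = 'TUESDAY';
    const FRIDAY = 'FRIDAY'; // imagine this member was added later
}

// describe() is an illustrative name only.
function describe($day)
{
    switch ($day) {
        case Weekday::MONDAY:
            return 'start of the week';
        case Weekday::TUESDAY:
            return 'second day';
        default:
            // Runtime check: any value without its own case ends up here.
            throw new UnexpectedValueException('Unhandled enum value Weekday::' . $day);
    }
}

try {
    describe(Weekday::FRIDAY);
} catch (UnexpectedValueException $e) {
    echo $e->getMessage(), "\n"; // Unhandled enum value Weekday::FRIDAY
}
```

This only fires when the unhandled value is actually reached at runtime, which is the limitation John's typehint idea was trying to address.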
On 18.09.2015 at 01:06, Rowan Collins rowan.collins@gmail.com wrote:
This has come up in passing a few times recently, but I'm not sure there's ever been a dedicated discussion of it: would it be useful for PHP to have a built-in Enumeration type, and if so, how should it look?
I like enums in general, but I'd like to note that there's already a RFC in draft by Levi:
https://wiki.php.net/rfc/enum
As far as I know, the RFC is already fairly final and just lacks an implementation.
So, I'd consider bikeshedding an actual RFC first.
If we’re bikeshedding, one feature I would really like to see, with typehinting, is warnings if all cases of an enum aren’t handled in a switch. So, for example, given our example Weekdays enum, if I wrote this code:
switch(Weekday $someWeekday) {
case Weekday::MONDAY: break;
case Weekday::TUESDAY: break;
}
By providing the typehint, I’m indicating that I want to get a warning/error that the switch does not cover all enum values. This would be very handy if an enum value is added after initial development and someone misses a switch statement in cleanup.
The typehint would also allow generating a warning if someone did something like
switch(Weekday $someWeekday) {
//case … all the weekdays: break;
case 'I am not a Weekday': echo 'Generate a fatal error here because string is not a Weekday.';
}
I've looked into this, but not in the same syntax you've provided. I
was exploring a match construct that has different semantics:
match ($foo) {
case Weekday::Monday: {
echo "Monday";
} // it is not possible to fall-through with match/case
case Weekday::Tuesday: echo "Tuesday"; // single expression doesn't need braces
} // if it hits the end without a match a runtime error will be emitted
I think this fits into the dynamic behavior of PHP a little better
than forcing a case for each enum value at compile-time, even if it
was possible (I suspect it is not).
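The semantics described above (no fall-through, runtime error when nothing matches) can be approximated in ordinary PHP. This is only a sketch: matchStrict() is a hypothetical helper invented for illustration, not the proposed syntax, and strings stand in for enum members.

```php
<?php
// matchStrict() is a hypothetical helper, not the proposed construct.
// $arms is a list of [candidate, callable] pairs, tried in order.
function matchStrict($subject, array $arms)
{
    foreach ($arms as $arm) {
        list($candidate, $handler) = $arm;
        if ($subject === $candidate) {
            // Exactly one arm runs; there is no fall-through.
            return $handler();
        }
    }
    // Reaching the end without a match is a runtime error.
    throw new LogicException('No arm matched the given value');
}

$result = matchStrict('Tuesday', [
    ['Monday', function () { return 'It is Monday'; }],
    ['Tuesday', function () { return 'It is Tuesday'; }],
]);
echo $result, "\n"; // It is Tuesday
```

Pairs are used instead of an associative array so that the candidates could be objects, which cannot be array keys.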
Levi et al,
On 18.09.2015 at 01:06, Rowan Collins rowan.collins@gmail.com wrote:
This has come up in passing a few times recently, but I'm not sure there's ever been a dedicated discussion of it: would it be useful for PHP to have a built-in Enumeration type, and if so, how should it look?
I like enums in general, but I'd like to note that there's already a RFC in draft by Levi:
https://wiki.php.net/rfc/enum
As far as I know, the RFC is already fairly final and just lacks an implementation.
So, I'd consider bikeshedding an actual RFC first.
If we’re bikeshedding, one feature I would really like to see, with typehinting, is warnings if all cases of an enum aren’t handled in a switch. So, for example, given our example Weekdays enum, if I wrote this code:
switch(Weekday $someWeekday) {
case Weekday::MONDAY: break;
case Weekday::TUESDAY: break;
}
By providing the typehint, I’m indicating that I want to get a warning/error that the switch does not cover all enum values. This would be very handy if an enum value is added after initial development and someone misses a switch statement in cleanup.
The typehint would also allow generating a warning if someone did something like
switch(Weekday $someWeekday) {
//case … all the weekdays: break;
case 'I am not a Weekday': echo 'Generate a fatal error here because string is not a Weekday.';
}
I've looked into this, but not in the same syntax you've provided. I
was exploring a match construct that has different semantics:
match ($foo) {
case Weekday::Monday: {
echo "Monday";
} // it is not possible to fall-through with match/case
case Weekday::Tuesday: echo "Tuesday"; // single expression doesn't need braces
} // if it hits the end without a match a runtime error will be emitted
I think this fits into the dynamic behavior of PHP a little better
than forcing a case for each enum value at compile-time, even if it
was possible (I suspect it is not).
This is basically pattern matching. And while I'd LOVE to see support
for it in PHP, I think it's a bit off-topic here. Perhaps a new thread
could be opened for it (or ideally a RFC)...?
As far as enums, I wonder if they would be necessary if we supported
algebraic types, since you could define an "enum" as a virtual type:
const MONDAY = 0;
const TUESDAY = 1;
const WEDNESDAY = 2;
const THURSDAY = 3;
const FRIDAY = 4;
use MONDAY | TUESDAY | WEDNESDAY | THURSDAY | FRIDAY as WEEKDAY;
function foo(WEEKDAY $day) {
// must be an integer 0-4, or castable to 0-4 unless strict_types is on.
}
Just a thought...
Anthony
As far as enums, I wonder if they would be necessary if we supported
algebraic types, since you could define an "enum" as a virtual type:
const MONDAY = 0;
const TUESDAY = 1;
const WEDNESDAY = 2;
const THURSDAY = 3;
const FRIDAY = 4;
use MONDAY | TUESDAY | WEDNESDAY | THURSDAY | FRIDAY as WEEKDAY;
function foo(WEEKDAY $day) {
// must be an integer 0-4, or castable to 0-4 unless strict_types is on.
}
Just a thought...
Anthony
That's the kind of functionality I would expect from enums, but having a
dedicated syntax for it would be better for code organization and
documentation. It is much better to document:
/**
 * Days of the week for blah blah.
 */
enum Weekday {...}
Its purpose would be much clearer than documenting this:
use MONDAY | TUESDAY | WEDNESDAY | THURSDAY | FRIDAY as WEEKDAY;
Also, documentation generators like phpDocumentor or ApiGen could
generate a list of enum declarations with their descriptions and values.
And when type hinting a function parameter like in your example, it can
be a link to the enum definition in the documentation.
As far as enums, I wonder if they would be necessary if we supported
algebraic types, since you could define an "enum" as a virtual type:
const MONDAY = 0;
const TUESDAY = 1;
const WEDNESDAY = 2;
const THURSDAY = 3;
const FRIDAY = 4;
use MONDAY | TUESDAY | WEDNESDAY | THURSDAY | FRIDAY as WEEKDAY;
function foo(WEEKDAY $day) {
// must be an integer 0-4, or castable to 0-4 unless strict_types is on.
}
I think this is a fundamentally different style of enum to what Levi and I are thinking of.
In your example - and this seems to be what C# means by enum, for instance - it's a subset of a scalar type, usually an integer, with some names as mnemonics. Other than that it behaves exactly as the base type would, and can be readily cast, maybe even implicitly, to and from that base type.
The other style is where each enum type is a completely distinct type, not just a whitelist of values in some other domain. In a sense (and this is what got me thinking about enums recently), you can consider null to be an enum with only one value, and boolean to be an enum with two - you can define (int)true as 1, but true is not an integer, and 1 is not a boolean value.
I suspect both styles may have their place, but algebraic types certainly wouldn't take away my thirst for object-like enums.
Regards,
Rowan Collins
[IMSoP]
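To make the contrast between the two styles concrete, here is a rough userland sketch. The Weekday class and its static accessor methods are assumptions made for illustration; neither the RFC nor this thread defines them.

```php
<?php
// Hypothetical userland emulation of an "object-like" enum.
final class Weekday
{
    private static $instances = [];
    private $name;

    private function __construct($name)
    {
        $this->name = $name;
    }

    // Prevent copies, so each member stays a single shared instance.
    private function __clone()
    {
    }

    private static function member($name)
    {
        if (!isset(self::$instances[$name])) {
            self::$instances[$name] = new self($name);
        }
        return self::$instances[$name];
    }

    public static function MONDAY() { return self::member('MONDAY'); }
    public static function TUESDAY() { return self::member('TUESDAY'); }

    public function getName() { return $this->name; }
}

// Scalar style: the "enum" is just an integer with a mnemonic.
const MONDAY = 0;

var_dump(MONDAY === 0);                             // bool(true)
var_dump(Weekday::MONDAY() === Weekday::MONDAY());  // bool(true)
var_dump(Weekday::MONDAY() === Weekday::TUESDAY()); // bool(false)
var_dump(Weekday::MONDAY() === 0);                  // bool(false)
```

In the scalar style the constant is indistinguishable from the integer 0, whereas the object-style member is identical only to itself.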
- there should be no accessible "value" for a member; the value of
Weekdays::SUNDAY is simply Weekdays::SUNDAY, not 0 or 7 or 'SUNDAY' (I'm
thinking that internally, each member would be represented as an object
pointer, so there's no real benefit to forcing people to number
everything like some languages do)
Well, how are you supposed to serialize an enum value without some sort
of numerical representation or value? Maybe you could serialize it by
converting the enum to a string representation of its name, but that
wouldn't be efficient, and if a developer refactored their code base
and renamed an enum, the data wouldn't be properly assigned when
unserialized. Also, other languages don't force you to assign a value to
enum constants; if I recall correctly, languages like C/C++ automatically
assign an integer value to enumerations in the order they were typed, so:
enum BadDays {MONDAY, SUNDAY, THURSDAY=5, SATURDAY}
would be MONDAY = 0, SUNDAY = 1, THURSDAY = 5, SATURDAY = 6
So assigning a value is optional.
I see the enum RFC mentions boxing primitive types, but that doesn't seem
to solve what I'm describing here at all.
Enums should be a nice way of grouping constant values without hurting
performance (for everything else use a class). The RFC
proposes a nice syntax, but it doesn't fully cover
serialization/unserialization of enumerations.
Well, how are you supposed to serialize an enum value without some
sort of numerical representation or value? Maybe you could serialize
it by
converting the enum to a string representation of its name but that
wouldn't be efficient and also if a developer refactored its code base
and renamed an enum, when unserializing the data it wouldn't be
properly assigned.
I don't think serializing to a name is particularly more inefficient
than serializing to an integer - depending on the internal
implementation, of course. Or do you mean efficiency in terms of the
length of the string produced?
Nor do I see it as particularly likely that somebody will rename an enum
member - that would instantly break all code referring to it anyway.
They could however change the order of fields without realising that it
would make any difference, since most access would be of the form
TypeName::ValueName.
Regards,
--
Rowan Collins
[IMSoP]
I don't think serializing to a name is particularly more inefficient
than serializing to an integer - depending on the internal
implementation, of course. Or do you mean efficiency in terms of the
length of the string produced?
Exactly! Let's say you want to create a MembershipStatus
enum MembershipStatus{
INACTIVE=0,
ACTIVE=1
}
It would be more memory/space efficient to store 1 or 0 in a database or
serialized string, also it would be faster to parse than storing the
full name, eg (bad serialized example but just to show the idea):
e:16:"MembershipStatus":1:{i:0;}
instead of
e:16:"MembershipStatus":1:{s:8:"INACTIVE";}
Nor do I see it as particularly likely that somebody will rename an enum
member - that would instantly break all code referring to it anyway.
They could however change the order of fields without realising that it
would make any difference, since most access would be of the form
TypeName::ValueName.
Regards,
As for the renaming thing, I wouldn't be so sure, because IDEs can introduce
refactoring tools for enumerations that rename all occurrences of an
enum to another. That is, unless you are working on a public framework
that many different people are using; in that case you would refrain from
renaming the enumerations.
Jefferson Gonzalez wrote on 18/09/2015 03:20:
I don't think serializing to a name is particularly more inefficient
than serializing to an integer - depending on the internal
implementation, of course. Or do you mean efficiency in terms of the
length of the string produced?
Exactly! Lets say you want to create a MembershipStatus
enum MembershipStatus{
INACTIVE=0,
ACTIVE=1
}
It would be more memory/space efficient to store 1 or 0 in a database
or serialized string, also it would be faster to parse than storing
the full name, eg (bad serialized example but just to show the idea):
e:16:"MembershipStatus":1:{i:0;}
instead of
e:16:"MembershipStatus":1:{s:8:"INACTIVE";}
I would expect it to look more like this:
e:16:"MembershipStatus":8:"INACTIVE"
It seems to me that there are lots of things you could do to be
efficient - declared object properties could be serialized by their
ordinal position, rather than listing their names, for instance - but
that would make the serialization more fragile. It seems to me that
serializing enum values by order or "value" rather than name is much the
same thing.
The renaming thingy, I'm wouldn't be so sure because IDE's can
introduce refactoring tools for enumerations that rename all
occurrences of an enum to another. Unless you are working on a public
framework that many different people is using, in that case you would
restraint from renaming the enumerations.
Sure, but it seems more likely that somebody would add an extra value to
an enum in alphabetical order, thus messing up serialized ordinals, than
they are to refactor their code base to change the name of one of the
values.
And if you rename a class or property, serialized objects of that class
will already break, so having serialized enums break on the same action
seems perfectly natural, whereas having them silently unserialize to the
wrong value because you changed the ordinal positions of the values
seems rather dangerous.
Regards,
Rowan Collins
[IMSoP]
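The failure mode described above can be shown with plain arrays. This is only an illustration: the member lists are made up, and the array position stands in for an implicit ordinal.

```php
<?php
// Two versions of a hypothetical enum's member list. Version 2 inserts
// BANNED in alphabetical order, which shifts the position of INACTIVE.
$v1 = ['ACTIVE', 'INACTIVE'];
$v2 = ['ACTIVE', 'BANNED', 'INACTIVE'];

// Data written by version 1 of the code.
$storedOrdinal = array_search('INACTIVE', $v1, true); // 1
$storedName = 'INACTIVE';

// The same data read back by version 2.
$fromOrdinal = $v2[$storedOrdinal];
$fromName = in_array($storedName, $v2, true) ? $storedName : null;

echo "by ordinal: $fromOrdinal\n"; // by ordinal: BANNED
echo "by name: $fromName\n";       // by name: INACTIVE
```

The ordinal silently resolves to a different member, while the name either resolves correctly or fails visibly if the member no longer exists.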
Well, how are you supposed to serialize an enum value without some sort of
numerical representation or value? Maybe you could serialize it by
converting the enum to a string representation of its name but that
wouldn't be efficient and also if a developer refactored its code base
and renamed an enum, when unserializing the data it wouldn't be properly
assigned.
I don't think serializing to a name is particularly more inefficient than
serializing to an integer - depending on the internal implementation, of
course. Or do you mean efficiency in terms of the length of the string
produced?
Nor do I see it as particularly likely that somebody will rename an enum
member - that would instantly break all code referring to it anyway. They
could however change the order of fields without realising that it would
make any difference, since most access would be of the form
TypeName::ValueName.
Actually, you need to think about compatibility in a bigger perspective.
One of the first things I would want to do if PHP were to grow enums is
to add support for it to Apache Thrift. For those not familiar with
it, Thrift is basically an IDL for specifying messages and services
(APIs) that generates code for a lot of languages, including PHP. Same
principle as Google's Protocol Buffers / grpc.
Anyway, some of the point in using an IDL for your APIs is having a
graceful way to deal with changes over time, because you must always
deal with a certain amount of old clients or servers.
Thrift and Protobuf's enums are represented as integers internally,
and that is what goes on the wire. If you change the name of an enum
field in your IDL, you will get a source incompatibility, but that's
acceptable as long as you change the code. It only affects that single
client or server. What is not okay is to change the enum's integer
representation, because that breaks the protocol, and you can no
longer communicate with older clients or servers.
The point I'm trying to make is that an enum's normalized
representation should be an integer, and that is also how it should be
serialized. It also happens to be how most other languages chose to
represent enums, and deviating from that will cause all kinds of pain
the minute you need to do type conversions between formats or
languages.
- Stig
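A small sketch of the idea of explicit, stable wire integers, reusing the MembershipStatus example from earlier in the thread. The fromWire() and nameOf() methods are hypothetical names chosen for illustration; this is not how Thrift or Protocol Buffers generate PHP code.

```php
<?php
// Hypothetical class: the explicit integers are the wire representation.
final class MembershipStatus
{
    const INACTIVE = 0;
    const ACTIVE = 1;

    // Wire value => current source-level name. Renaming a name changes
    // source code only; the integer key is what gets transmitted.
    private static $names = [
        self::INACTIVE => 'INACTIVE',
        self::ACTIVE => 'ACTIVE',
    ];

    // Validates an incoming wire value instead of trusting any integer.
    public static function fromWire($value)
    {
        if (!is_int($value) || !isset(self::$names[$value])) {
            throw new UnexpectedValueException('Unknown MembershipStatus wire value');
        }
        return $value;
    }

    public static function nameOf($value)
    {
        return self::$names[self::fromWire($value)];
    }
}

echo MembershipStatus::nameOf(1), "\n"; // ACTIVE
```

Because each integer is written out by hand, adding or reordering members does not change the values already in use by older clients.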
The point I'm trying to make is that an enum's normalized
representation should be an integer, and that is also how it should be
serialized. It also happens to be how most other languages chose to
represent enums, and deviating from that will cause all kinds of pain
the minute you need to do type conversions between formats or
languages.
You can always serialize things however you want. Using serialize()
is just a convenience – there is absolutely nothing that prevents you
from using a custom serialization routine. Note that while Java has
built in serialization it is often not used, and instead libraries
like Google's GSON are used. You register a type with hooks for
serializing and deserializing, etc.
It sounds like this is what you need anyway. Since built-in
serialization happens differently in each language you'd probably want
something custom in each language.
You can always serialize things however you want. Using serialize()
is just a convenience – there is absolutely nothing that prevents you
from using a custom serialization routine. Note that while Java has
built in serialization it is often not used, and instead libraries
like Google's GSON are used. You register a type with hooks for
serializing and deserializing, etc.
It sounds like this is what you need anyway. Since built-in
serialization happens differently in each language you'd probably want
something custom in each language.
Whilst that might be the case with serialize(), by extension you also
lose the convenience of being able to use json_encode() on an already
properly structured object and immediately having a representation of that
object ready to send over the wire.
C#'s enums seem a good model to follow. 0-indexed by default, alternatively
you can specify the first key to change to (for example) 1-indexed or
specify all keys:
enum Days {Sat=1, Sun, Mon, Tue, Wed, Thu, Fri};
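On the json_encode() point, one possible middle ground for a userland enum-like class is the built-in JsonSerializable interface. The Day class below is a hypothetical stand-in; how a native enum type would interact with json_encode() is not settled anywhere in this thread.

```php
<?php
// Hypothetical enum-like value object. JsonSerializable is a built-in
// interface; json_encode() calls jsonSerialize() on such objects.
final class Day implements JsonSerializable
{
    private $name;

    public function __construct($name)
    {
        $this->name = $name;
    }

    // Chooses the member name as the JSON representation.
    public function jsonSerialize(): string
    {
        return $this->name;
    }
}

$event = ['title' => 'Standup', 'day' => new Day('MONDAY')];
echo json_encode($event), "\n"; // {"title":"Standup","day":"MONDAY"}
```

The same hook could return an integer instead, so the choice between name and number stays with the type's author rather than with the caller.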
Dan Cryer wrote on 22/09/2015 16:06:
C#'s enums seem a good model to follow.
It's worth pointing out that C#'s enums are basically the same as C's -
a typedef over int with a handful of helper methods in the standard
library. They don't even range-check on assignment, so that a "weekday"
variable can be given the value 42. [1]
This is very different to Java or Python, where enums are a special kind
of object, strictly typed, and comparable only with themselves. (Python
classes can relax these constraints by overloading various operators.)
Java's enums have no intrinsic "value", but can have arbitrary fields
set up by a constructor. They do have an ordinal() method, but "Most
programmers will have no use for this method." The toString() method
returns the name of the Enum constant. [2][3]
Python's enums have both a name and a value, but the value can be of any
type. Most of the examples in the manual do use integer values, though,
and there is a suggestion of how to implement auto-numbering. [4]
References:
[1] https://msdn.microsoft.com/en-us/library/cc138362.aspx
[2] https://docs.oracle.com/javase/tutorial/java/javaOO/enum.html
[3] http://docs.oracle.com/javase/7/docs/api/java/lang/Enum.html
[4] https://docs.python.org/3/library/enum.html
Regards,
Rowan Collins
[IMSoP]
Dan Cryer wrote on 22/09/2015 16:06:
C#'s enums seem a good model to follow.
It's worth pointing out that C#'s enums are basically the same as C's - a
typedef over int with a handful of helper methods in the standard library.
They don't even range-check on assignment, so that a "weekday" variable can
be given the value 42. [1]
This is very different to Java or Python, where enums are a special kind of
object, strictly typed, and comparable only with themselves. (Python classes
can relax these constraints by overloading various operators.)
Java's enums have no intrinsic "value", but can have arbitrary fields set up
by a constructor. They do have an ordinal() method, but "Most programmers
will have no use for this method." The toString() method returns the name of
the Enum constant. [2][3]
Python's enums have both a name and a value, but the value can be of any
type. Most of the examples in the manual do use integer values, though, and
there is a suggestion of how to implement auto-numbering. [4]
References:
[1] https://msdn.microsoft.com/en-us/library/cc138362.aspx
[2] https://docs.oracle.com/javase/tutorial/java/javaOO/enum.html
[3] http://docs.oracle.com/javase/7/docs/api/java/lang/Enum.html
[4] https://docs.python.org/3/library/enum.html
All good things to point out. This is the kind of thing I had hoped to
point out when I submitted it officially for discussion… but enums got
brought up before I could do that.
I would also like to point out that Swift has several kinds of
enums. Rust's enums are different as well. Enums are very
different in different languages.
Essentially I don't know if we will ever agree on which style of enums
to have, but the hope is that we can all agree that the common
functionality of all of them is useful and go forward to that point.
On 22.09.2015 at 12:38, Stig Bakken wrote:
The point I'm trying to make is that an enum's normalized
representation should be an integer, and that is also how it should be
serialized.
Makes sense to me.
Actually, you need to think about compatibility in a bigger perspective.
One of the first things I would want do if PHP were to grow enums, is
to add support for it to Apache Thrift. For those not familiar with
it, Thrift is basically an IDL for specifying messages and services
(APIs) that generates code for a lot of languages, including PHP. Same
principle as Google's Protocol Buffers / grpc.
Anyway, some of the point in using an IDL for your APIs is having a
graceful way to deal with changes over time, because you must always
deal with a certain amount of old clients or servers.
Thrift and Protobuf's enums are represented as integers internally,
and that is what goes on the wire. If you change the name of an enum
field in your IDL, you will get a source incompatibility, but that's
acceptable as long as you change the code. It only affects that single
client or server. What is not okay is to change the enum's integer
representation, because that breaks the protocol, and you can no
longer communicate with older clients or servers.
The point I'm trying to make is that an enum's normalized
representation should be an integer, and that is also how it should be
serialized. It also happens to be how most other languages chose to
represent enums, and deviating from that will cause all kinds of pain
the minute you need to do type conversions between formats or
languages.
- Stig
It's also worth noting that, if you keep performance in mind, enums with
values represented as integers should be a much more
performant/efficient implementation, since it would require fewer CPU
cycles and less memory to build and read them. On the other hand, the RFC
proposal treats the enum values as objects, which would require a
bigger zval and more CPU cycles to construct and read them.
So basically, from a performance perspective, enums expressed as:
class enum {
const FIELD1=0;
const FIELD2=1;
}
should be easier on the engine than:
class enum extends enumeration
{
public FIELD1 = new enum(value, name);
public FIELD2 = new enum(value, name);
...
}
The latter would be bloated; it emulates the Java way of doing
things, one of the reasons why the JVM consumes so much RAM and so many
CPU cycles for simple applications.
Again, enumerations should be a nice way of grouping/categorizing
values/flags for a specific use. Over-engineering the concept can lead
to bloatness.
Enums should offer the conveniences that we don't have with:
class MyConstants{
const SOME_CONST = 0;
const OTHER_CONST = 1;
}
These should be:
- dedicated syntax
- type hinting with range checking
- much better and concise documentation that clearly shows what type of
constants/flags a parameter can receive
Everything else besides that can be considered over-engineering and
bloatness (I over used the 'bloatness' term :D).
Jefferson Gonzalez wrote on 22/09/2015 20:28:
Is also worth noting that if you keep performance in mind, enums whith
values represented as integers should be a much more
performant/efficient implementation since it would require less cpu
cycles and memory to build and read them. In the other side the RFC
proposal is treating the enum values as objects which would require a
bigger zval and more cpu cycles to construct them and read them.
The internal implementation can never be quite as simple as an IS_LONG
zval if we want to provide anything at all beyond a C-style enum. I for
one do not want Days::MONDAY === Months::JANUARY to return true, and nor
should typeof(Days::MONDAY) return "long"; this requires at the very
least a struct with type name (enum "class") and value.
Memory efficiency is only trivially different, since all possible
instances exist in memory only once, so each zval pointing at
Days::MONDAY would just be a single pointer to the single instance of
that structure. Initial creation of the enum instances would be slower,
but again only needs to happen once when the enum is declared, so is
unlikely to be a major concern.
Every other action I can think of is either the same, slower, or
impossible if you use an int-based representation:
- Comparing two enum instances can again be optimised based on the "only
one instance" rule: you can directly compare the pointers.
- Getting the name of a class constant based on its definition (e.g. to
display an Exception's code in human-readable form) requires extremely
inefficient use of reflection. An int-based enum implementation would
have to do something similar, an object-based one could cache the
information in the instance.
- Absolutely any other additional behaviour or fields would be either
impossible with an int-based representation, or require exactly the same
lookups as an object-based implementation would provide anyway.
Looking at Levi's PoC branch, the actual approach taken is a hybrid
anyway: the _zend_enum struct holds the name and an "ordinal"
z_long directly, and only accesses the object representation if these two
pieces of information are not sufficient.
Regards,
Rowan Collins
[IMSoP]
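For reference, the reflection-based lookup mentioned above looks roughly like this in userland. ErrorCode and constantName() are hypothetical names used only for illustration; ReflectionClass::getConstants() is the built-in call doing the work.

```php
<?php
// Hypothetical constants class standing in for an int-based enum.
final class ErrorCode
{
    const NOT_FOUND = 404;
    const FORBIDDEN = 403;
}

// Recovering a name from a value means scanning every constant of the
// class via reflection; nothing on the integer itself records its name.
function constantName($class, $value)
{
    $constants = (new ReflectionClass($class))->getConstants();
    $name = array_search($value, $constants, true);
    return $name === false ? null : $name;
}

echo constantName('ErrorCode', 404), "\n"; // NOT_FOUND
```

If two constants share a value, array_search() returns only the first name, which is another way an int-based representation loses information.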
The internal implementation can never be quite as simple as an IS_LONG
zval if we want to provide anything at all beyond a C-style enum. I for
one do not want Days::MONDAY === Months::JANUARY to return true, and nor
should typeof(Days::MONDAY) return "long"; this requires at the very
least a struct with type name (enum "class") and value.
Right, if we think of the implementation as a class with constant
properties, then the Days enum would hold an array/hash of associative
zvals, the zvals structure should include a pointer to a
zend_enum_struct, this zend_enum_struct may hold a pointer to the parent
zend_class_entry/zend_enum_entry, a pointer to its name as stored on the
zend_class_entry/zend_enum_entry properties table, and the
ordinal/numerical value. So:
function something(Days $day);
something(Days::Monday);
would be a matter of comparing the zend_class_entry/zend_enum_entry
pointer referenced on Monday with that of the Days enum.
Again comparing Days::MONDAY === Months::JANUARY should be fast if we
only compare the pointer of the zend_enum_entry.
Memory efficiency is only trivially different, since all possible
instances exist in memory only once, so each zval pointing at
Days::MONDAY would just be a single pointer to the single instance of
that structure. Initial creation of the enum instances would be slower,
but again only needs to happen once when the enum is declared, so is
unlikely to be a major concern.
Initializing a zend_enum_struct that only holds a pointer to the parent
enum, a pointer to the name, and the ordinal value would be faster and
require less memory than initializing a new class for each enum value. It is
true that this only happens once, but every bit counts for better
performance. After all, PHP's model for every request is Execute/Die,
Execute/Die... it doesn't keep everything initialized and ready for use.
- Getting the name of a class constant based on its definition (e.g. to
display an Exception's code in human-readable form) requires extremely
inefficient use of reflection. An int-based enum implementation would
have to do something similar, an object-based one could cache the
information in the instance.
As I said before, if we store a pointer to the property name of the
zend_class_entry/zend_enum_entry directly on the zend_enum_struct we can
retrieve its name rapidly without going into the
zend_class_entry/zend_enum_entry and doing a properties lookup.
- Absolutely any other additional behaviour or fields would be either
impossible with an int-based representation, or require exactly the same
lookups as an object-based implementation would provide anyway.
Any other fancy stuff like Days::Monday->Something() would be in my
opinion over engineering the typical use of enumerations.
Looking at Levi's PoC branch, the actual approach taken is a hybrid
anyway: the _zend_enum struct directly holds the name and an "ordinal"
z_long directly, and only access the object representation if these two
pieces of information are not sufficient.
Nice that there's already work on an implementation :), I have missed
this feature so much in PHP.
Hi All,
This has come up in passing a few times recently, but I'm not sure
there's ever been a dedicated discussion of it: would it be useful for
PHP to have a built-in Enumeration type, and if so, how should it look?
For what it's worth, there's SplEnum ;)
--
Ben Scholzen
http://www.dasprids.de
Hi All,
This has come up in passing a few times recently, but I'm not sure
there's ever been a dedicated discussion of it: would it be useful for
PHP to have a built-in Enumeration type, and if so, how should it look?
Many other languages have enum types, with varying functionality. The
central concept is always that there's some declared type with a bunch
of named constants, but beyond that there's a lot of variability: for
starters, is it a type of object, a special way of naming integers, or
an uncomparable type more like booleans and nulls?
Here are my thoughts on what I'd like from a PHP enum - I'm not using
"should" in a particularly strong sense, it's just a way of framing the
points of discussion:
1. enums should act like values, not mutable objects; an array-style
copy-on-write behaviour would be possible, but it feels wrong to me to
store any state in an enum variable, so just plain immutable would be my
preference
2. the members of an enum should have a name which is accessible as a
string, e.g. Weekdays::SUNDAY->getName() == 'SUNDAY'
3. there should be no accessible "value" for a member; the value of
Weekdays::SUNDAY is simply Weekdays::SUNDAY, not 0 or 7 or 'SUNDAY' (I'm
thinking that internally, each member would be represented as an object
pointer, so there's no real benefit to forcing people to number
everything like some languages do)
4. each enum member should be considered a singleton, in the sense that
you can't construct or destroy one, only reference them; all possible
instances would be created as soon as the enum was defined
5. following from (3) and (4), an enum value should compare equal only
to itself, unlike in C# for instance, where it would be comparable to an
integer based on its numeric value; similarly it shouldn't be possible
to cast to or from an enum
6. an enum should be able to have additional fields; these would be
initialised in the enum's definition; this is inspired by Java and
Python's ability to pass parameters to the "constructor" of the enum,
but it feels weird to me for any logic to happen in that constructor
other than simple assignments, so I'm thinking of a simpler syntax and
implementation. It also simplifies immutability if no userland code ever
writes to the properties. There may be an important use case for
constructor logic I'm missing though?
7. an enum should have default static methods for accessing all the
members of the enum as an associative array
8. enums should be a first-class type, is_object(Weekdays::SUNDAY)
should return false, for instance; maybe Weekdays::SUNDAY instanceof
Weekdays should return true though
9. additional static and instance methods should be definable, bearing
in mind the immutability constraints already discussed
Given the above, I think we might end up with something like this:
enum Weekdays {
member MONDAY; // if there are no fields to initialise, the member just needs its name declaring
member TUESDAY ( 2, 'Chewsdae' ); // positional arguments for populating fields in the order they are defined; a bit like Java, but without the constructor
member WEDNESDAY { $dayNumber = 3, $sillyName = 'Wed Nose Day' }; // or maybe a named-parameter syntax to make things clearer
member THURSDAY, FRIDAY, SATURDAY, SUNDAY; // don't force people to write each entry on its own line; maybe even the "member" keyword is too much?
Member is even a little too much IMO. A comma-separated list syntax
would be better, simply because it's shorter and more similar to enum
syntax in other languages.
public $dayNumber, $sillyName; // fields initialised for each member
public static function getWeekend() {
return [ self::SATURDAY, self::SUNDAY ];
}
public function getZeroIndexedDayNumber() {
return $this->dayNumber - 1;
}
}
$today = Weekdays::THURSDAY;
foreach ( Weekdays::getMembers() as $day_name => $day ) {
    echo $day->dayNumber;
    if ( $day == $today ) {
        echo "Today!";
    } else {
        echo $day_name; // equivalently: echo $day->getName();
    }
}
// Do we need a static method to access a member by name, or is this
// good enough?
$favourite_day = Weekdays::getMembers()[ $_GET['fav_day'] ];
Not sure if this could be allowed, but simply using name interpolation
would be better. e.g.:
$favourite_day = Weekdays::$_GET['fav_day'];
Similar to $object->$name and nothing to do with StaticClass::$property
syntax.
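The semantics proposed above (one shared instance per member, a name available as a string, and a static getMembers() returning an associative array) can be roughly approximated in userland today. The following is only a sketch under assumptions: the byName() and getDayNumber() methods are hypothetical names invented for illustration, and a plain class cannot make is_object() return false as the proposal would like.

```php
<?php
// Hypothetical userland approximation of the proposed Weekdays enum.
// byName() and getDayNumber() are assumed names, not part of the proposal.
final class Weekdays
{
    /** @var array<string, Weekdays> one shared instance per member name */
    private static $members = [];

    private $name;
    private $dayNumber;

    // Private constructor: members can be referenced but never constructed.
    private function __construct($name, $dayNumber)
    {
        $this->name = $name;
        $this->dayNumber = $dayNumber;
    }

    // Private clone: prevents copies that would break identity comparison.
    private function __clone()
    {
    }

    // Returns all members as an associative array keyed by name.
    public static function getMembers()
    {
        if (self::$members === []) {
            $names = ['MONDAY', 'TUESDAY', 'WEDNESDAY', 'THURSDAY',
                      'FRIDAY', 'SATURDAY', 'SUNDAY'];
            foreach ($names as $index => $name) {
                self::$members[$name] = new self($name, $index + 1);
            }
        }
        return self::$members;
    }

    // Looks a member up by name; returns null for anything unknown.
    public static function byName($name)
    {
        $members = self::getMembers();
        return (is_string($name) && isset($members[$name])) ? $members[$name] : null;
    }

    public function getName()
    {
        return $this->name;
    }

    public function getDayNumber()
    {
        return $this->dayNumber;
    }
}

$today = Weekdays::byName('THURSDAY');
foreach (Weekdays::getMembers() as $dayName => $day) {
    // Identity comparison works because each member exists exactly once.
    echo $day->getDayNumber(), ' ', ($day === $today ? 'Today!' : $dayName), "\n";
}
```

Because there are no setters and the properties are private, the members behave as immutable values, which is the property the proposal cares most about.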
So, what are anyone's thoughts? Am I rambling on too much as usual? ;)
Regards,
Is there any interest in enum subtypes? As in, allowing each member to
also be a class? This would allow algebraic data typing, which would be
a pretty powerful addition to the language.
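As a rough illustration of what "each member is also a class" could mean, here is a userland sketch of a two-variant sum type. Everything in it (the Result, Ok and Err classes and the describe() function) is a hypothetical example rather than proposed syntax, and unlike a real algebraic data type nothing stops other code from adding further subclasses.

```php
<?php
// Hypothetical sum type: each "member" is its own class carrying its own data.
abstract class Result
{
}

final class Ok extends Result
{
    public $value;

    public function __construct($value)
    {
        $this->value = $value;
    }
}

final class Err extends Result
{
    public $message;

    public function __construct($message)
    {
        $this->message = $message;
    }
}

// Dispatches on the variant, a manual stand-in for pattern matching.
function describe(Result $result)
{
    if ($result instanceof Ok) {
        return 'ok: ' . $result->value;
    }
    if ($result instanceof Err) {
        return 'error: ' . $result->message;
    }
    throw new LogicException('Unknown Result variant');
}

echo describe(new Ok(42)), "\n";
echo describe(new Err('not found')), "\n";
```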
--
Stephen
member THURSDAY, FRIDAY, SATURDAY, SUNDAY; // don't force people to
write each entry on its own line; maybe even the "member" keyword is
too much?
Member is even a little too much IMO. A comma-separated list syntax
would be better, simply because it's shorter and more similar to enum
syntax in other languages.
The keyword just felt like it fitted better with the rest of the language, and made the list look less like it's floating when there are other things in the braces (fields, methods, etc). But as you say it's common in other languages to simply have a comma separated list at the start of the declaration, before anything else.
I didn't want to get too bogged down in the syntax, though, more to gauge reaction to the general direction, which can be summed up as "an enum is an immutable object, with an easily available name but no other default properties, and a mechanism for decorating with additional properties and methods as desired".
Regards,
Rowan Collins
[IMSoP]
Hi,
personally, I feel that you should be able to assign a numeric value to
each element (and have them implicit if not specified).
This is imho better for serialization (but it can be done with names
as well, yeah) - but more importantly, it also allows usage with
bitwise operators, so you could use them as "flags" (i.e. $weekend =
Days::SATURDAY | Days::SUNDAY).
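For reference, the flags usage described here already works with plain integer class constants. This is a minimal sketch, assuming each day is given its own bit; the specific numeric values are an assumption for illustration.

```php
<?php
// Hypothetical flag values: one distinct bit per day.
final class Days
{
    const MONDAY    = 1 << 0;
    const TUESDAY   = 1 << 1;
    const WEDNESDAY = 1 << 2;
    const THURSDAY  = 1 << 3;
    const FRIDAY    = 1 << 4;
    const SATURDAY  = 1 << 5;
    const SUNDAY    = 1 << 6;
}

// Combine members with bitwise OR, test membership with bitwise AND.
$weekend = Days::SATURDAY | Days::SUNDAY;

var_dump(($weekend & Days::SUNDAY) !== 0); // bool(true)
var_dump(($weekend & Days::MONDAY) !== 0); // bool(false)
```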
Regards
Pavel Kouřil
In my opinion this is the least valuable form of enums that exists in
any language I am aware of. However I was careful in the RFC to not
prevent this from being a possibility. I would much, much prefer enums
that are more like Rust's, Haskell's or Swift's. It's worth noting
Swift has at least three different kinds of enums, one of which would
allow the kind of behavior you are wanting.
It may not be valuable in your opinion, but I have seen this kind of
enum usage in a lot of open source projects. It is really useful for
representing more than one value in a single variable, which is more
efficient and uses less memory than doing something like:
$weekend = array(
    Days::SATURDAY,
    Days::SUNDAY,
);
And anyway, most new languages have over-engineered the purpose of
enums. Enums should just be a nice way of grouping constant values with
a human-readable name, where the ordinal value can optionally be set
for consistency in case it needs to be stored.
In any case this is a nice read:
https://en.wikipedia.org/wiki/Enumerated_type
personally, I feel that you should be able to assign a numeric value to
each element (and have them implicit if not specified).
This is imho better for serialization (but it can be done with names
as well, yeah) - but more importantly, it also allows usage with
bitwise operators, so you could use them as "flags" (i.e. $weekend =
Days::SATURDAY | Days::SUNDAY).
That's actually two different requests: firstly, that every enum should have a numeric value, rather than it being opt-in. And secondly, that enums should have overloaded operators that allow their use in contexts where integers would be expected.
I draw that distinction because overloaded operators are fairly rare in PHP (and largely unavailable in userland).
In some languages (e.g. C#) enums are just a name for an integer, and can be readily compared to, and cast from or to, ordinary integers. That's one way of allowing operators to work on them, but it leads to things like Days::Monday == Months::January returning true, which doesn't feel right to me.
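The same effect can be reproduced in PHP with integer constants standing in for integer-backed enums. This is just a sketch; the two classes and their values are hypothetical.

```php
<?php
// Hypothetical constants standing in for integer-backed enums.
final class Days
{
    const MONDAY = 1;
}

final class Months
{
    const JANUARY = 1;
}

// Both sides are the plain integer 1, so nothing distinguishes a day
// from a month, even under strict comparison.
var_dump(Days::MONDAY == Months::JANUARY);  // bool(true)
var_dump(Days::MONDAY === Months::JANUARY); // bool(true)
```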
I think an enum-like type specifically for bitsets, which overloaded bitwise operators without ever exposing the underlying integers, might be interesting, though.
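A rough userland sketch of that idea follows. Since PHP does not allow operator overloading in userland, the methods union() and intersect() stand in for | and &; the DaySet class, its method names and the bit values are all assumptions made for illustration. The point is that the integer mask stays private.

```php
<?php
// Hypothetical bitset type: the integer mask is never exposed to callers.
final class DaySet
{
    private static $bits = [
        'MONDAY' => 1, 'TUESDAY' => 2, 'WEDNESDAY' => 4, 'THURSDAY' => 8,
        'FRIDAY' => 16, 'SATURDAY' => 32, 'SUNDAY' => 64,
    ];

    private $mask;

    private function __construct($mask)
    {
        $this->mask = $mask;
    }

    // Builds a set from member names, e.g. DaySet::of('SATURDAY', 'SUNDAY').
    public static function of(...$names)
    {
        $mask = 0;
        foreach ($names as $name) {
            if (!is_string($name) || !isset(self::$bits[$name])) {
                throw new InvalidArgumentException('Unknown day name');
            }
            $mask |= self::$bits[$name];
        }
        return new self($mask);
    }

    // Stand-in for an overloaded | operator; returns a new immutable set.
    public function union(DaySet $other)
    {
        return new self($this->mask | $other->mask);
    }

    // Stand-in for an overloaded & operator.
    public function intersect(DaySet $other)
    {
        return new self($this->mask & $other->mask);
    }

    // True when every day in $other is also in this set.
    public function contains(DaySet $other)
    {
        return ($this->mask & $other->mask) === $other->mask;
    }

    public function equals(DaySet $other)
    {
        return $this->mask === $other->mask;
    }
}

$weekend = DaySet::of('SATURDAY')->union(DaySet::of('SUNDAY'));
var_dump($weekend->contains(DaySet::of('SUNDAY'))); // bool(true)
var_dump($weekend->contains(DaySet::of('MONDAY'))); // bool(false)
```

Because a DaySet can only be compared with another DaySet, a comparison such as days against months is rejected by the type hint instead of silently succeeding.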
Regards,
Rowan Collins
[IMSoP]
So, what are anyone's thoughts? Am I rambling on too much as usual? ;)
What happens when you need to switch language?
This is one area where a well formatted database system fills many
holes. Simply selecting a subset of entries from a table allows a user
to select the enumerated values in their language. We hit no end of
problems where English is encoded into the code rather than simply a set
of numbers.
--
Lester Caine - G8HFL
Contact - http://lsces.co.uk/wiki/?page=contact
L.S.Caine Electronic Services - http://lsces.co.uk
EnquirySolve - http://enquirysolve.com/
Model Engineers Digital Workshop - http://medw.co.uk
Rainbow Digital Media - http://rainbowdigitalmedia.co.uk
So, what are anyone's thoughts? Am I rambling on too much as usual? ;)
What happens when you need to switch language?
Yes, the names of enum members should be considered labels, not descriptions. They're strings only for the convenience of programmers working with them, just like variable names, which would more efficiently be written $1, $2, etc.
Your enum value might be State::OFF_MAP_E, which would be looked up in a database or translation library to the appropriate translation of "You are off the East side of the map".
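A minimal sketch of that lookup, assuming the member name is used as the key into a translation table. The table layout, the locale codes, the French text and the describeState() function are all hypothetical; a real application would more likely use a database or translation library as described above.

```php
<?php
// Hypothetical translation table keyed by locale, then by enum member name.
$translations = [
    'en' => ['OFF_MAP_E' => 'You are off the East side of the map'],
    'fr' => ['OFF_MAP_E' => "Vous êtes sorti de la carte par l'est"],
];

// Returns the translated description, falling back to the raw label
// when no translation exists for that locale.
function describeState($label, $locale, array $translations)
{
    return isset($translations[$locale][$label])
        ? $translations[$locale][$label]
        : $label;
}

echo describeState('OFF_MAP_E', 'en', $translations), "\n";
echo describeState('OFF_MAP_E', 'fr', $translations), "\n";
```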
Regards,
Rowan Collins
[IMSoP]