Avoiding enum reserved keyword

4 years ago by Nikita Popov

Hi internals,

I'm a bit concerned about the addition of the "enum" reserved keyword as
part of https://wiki.php.net/rfc/enumerations. The problem is that there
are quite a few existing enum libraries (such as
https://github.com/myclabs/php-enum) that define an Enum class. While the
implementation of enums in PHP 8 obsoletes these libraries, it still
constitutes a migration problem, especially for libraries supporting more
than one PHP version.

I don't believe that the keyword is strictly necessary: We can recognize
enum declarations as T_STRING T_STRING, where the former is checked to be
equal to "enum" by the parser. It so happens that this is syntactically
unambiguous at this time. We may be forced to introduce the keyword at a
later time, if it becomes ambiguous.
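
To make the parser-side idea concrete, here is a minimal sketch in Python rather than the actual php-src grammar. The token kinds, the `tokenize` helper, and `declaration_kind` are all hypothetical simplifications: "enum" is lexed as an ordinary T_STRING, and only the parser decides that T_STRING T_STRING with a leading "enum" starts an enum declaration.

```python
import re

# Hypothetical, heavily simplified token kinds (not the real php-src token set).
# Note that "enum" is NOT a keyword here: it lexes as a plain T_STRING.
TOKEN_RE = re.compile(
    r"\s*(?:(?P<T_CLASS>\bclass\b)|(?P<T_STRING>[A-Za-z_]\w*)|(?P<PUNCT>[{}();=]))",
    re.IGNORECASE,
)


def tokenize(src):
    """Split source text into (kind, text) pairs, skipping whitespace."""
    src = src.strip()
    tokens = []
    pos = 0
    while pos < len(src):
        m = TOKEN_RE.match(src, pos)
        if m is None:
            raise ValueError(f"unexpected input at offset {pos}")
        tokens.append((m.lastgroup, m.group(m.lastgroup)))
        pos = m.end()
    return tokens


def declaration_kind(tokens, i=0):
    """Classify the declaration starting at tokens[i], if any."""
    kind, text = tokens[i]
    next_kind = tokens[i + 1][0] if i + 1 < len(tokens) else None
    if kind == "T_CLASS" and next_kind == "T_STRING":
        return "class"
    # The contextual check: an identifier spelled "enum" followed by another
    # identifier is treated as an enum declaration by the parser.
    if kind == "T_STRING" and text.lower() == "enum" and next_kind == "T_STRING":
        return "enum"
    return None


print(declaration_kind(tokenize("enum Suit {}")))   # enum declaration
print(declaration_kind(tokenize("class Enum {}")))  # Enum is just a class name
print(declaration_kind(tokenize("enum();")))        # not a declaration
```

Because "enum" never becomes a reserved token in this sketch, a class named Enum keeps working, which is the point of the proposal.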

Another possibility would be to recognize T_ENUM in the lexer, but only if
it is followed by whitespace and an identifier. This would possibly be
friendlier for tooling using token_get_all(). It would not permit comments
in between the tokens though.
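
The lexer-side variant can be sketched in the same spirit. This is an illustrative Python toy, not the real PHP lexer: the token names other than T_ENUM and T_STRING, the `LEXER_RE` pattern, and the `lex` helper are assumptions. The lookahead emits T_ENUM only when "enum" is followed by whitespace and the start of an identifier.

```python
import re

# Hypothetical toy lexer; the real PHP lexer is generated by re2c and differs.
LEXER_RE = re.compile(
    r"""
    (?P<T_WHITESPACE>\s+)
  | (?P<T_COMMENT>/\*.*?\*/)
  | (?P<T_ENUM>\benum\b(?=\s+[A-Za-z_]))
  | (?P<T_CLASS>\bclass\b)
  | (?P<T_STRING>[A-Za-z_]\w*)
  | (?P<PUNCT>[{}();=])
    """,
    re.VERBOSE | re.IGNORECASE | re.DOTALL,
)


def lex(src):
    """Return the token kinds for src, dropping whitespace tokens."""
    kinds = []
    pos = 0
    while pos < len(src):
        m = LEXER_RE.match(src, pos)
        if m is None:
            raise ValueError(f"unexpected input at offset {pos}")
        if m.lastgroup != "T_WHITESPACE":
            kinds.append(m.lastgroup)
        pos = m.end()
    return kinds


print(lex("enum Suit {}"))          # "enum" becomes T_ENUM
print(lex("class Enum {}"))         # "Enum" stays T_STRING
print(lex("enum /* c */ Suit {}"))  # comment defeats the lookahead
```

In this sketch a comment between the two tokens makes "enum" fall back to T_STRING, which is the tradeoff described above. A naive lookahead like this one would also fire for `class Enum extends Base`, so a real rule would need extra care there.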

Thoughts?

Regards,
Nikita

4 years ago by Larry Garfield

If I understand correctly, neither of these proposals would change the user-facing syntax, right? Just the parser details?

I'm fine with that as a transition plan. I think long-term it's better to have all keywords behave the same, but if we could use one of these alternatives for now and then switch to a normal T_ENUM in 9.0 (thus not breaking a class named Enum until then), I'd be fine with that.

I can't see a comment between "enum" and "Suit" being useful, so that's an acceptable tradeoff for me.

--Larry Garfield

4 years ago by Matthew Brown

> Another possibility would be to recognize T_ENUM in the lexer, but only if
> it is followed by whitespace and an identifier. This would possibly be
> friendlier for tooling using token_get_all(). It would not permit comments
> in between the tokens though.

I like this option. I can't imagine anyone would want to write "enum /**
some comment */ Foo {...}"