Newsgroups: php.internals
Xref: news.php.net php.internals:112453
Date: Mon, 7 Dec 2020 11:24:21 +0100
From: nikita.ppv@gmail.com (Nikita Popov)
To: Larry Garfield
Cc: php internals
Subject: Re: [PHP-DEV] [RFC] Enumerations

On Sat, Dec 5, 2020 at 12:25 AM Larry Garfield wrote:

> Greetings, denizens of Internals!
>
> Ilija Tovilo and I have been working for the last few months on adding
> support for enumerations and algebraic data types to PHP. This is a
> not-small task, so we've broken it up into several stages. The first
> stage, unit enumerations, are just about ready for public review and
> discussion.
>
> The overarching plan (for context, NOT the thing to comment on right now)
> is here: https://wiki.php.net/rfc/adts
>
> The first step, for unit enumerations, is here:
>
> https://wiki.php.net/rfc/enumerations
>
> There's still a few bits we're sorting out and the implementation is
> mostly done, but not 100% complete.
> Still, it's far enough along to start
> a discussion on and get broader feedback on the outstanding nits.
>
> I should note that while the design has been collaborative, credit for the
> implementation goes entirely to Ilija. Blame for any typos in the RFC
> itself go entirely to me.
>
> *dons flame-retardant suit*
>

Thanks for the proposal Ilija and Larry! Enums are long overdue. Some initial thoughts, in no particular order:

* I think that serialization support needs to be part of the initial proposal. Otherwise you wouldn't be able to store an enum in a property (without poisoning the whole object graph as "non-serializable"). I expect that a clean solution to this will require a new serialization modifier, but I don't think this should be overly hard to add. This should also not introduce any serialization format compatibility concerns as long as it is introduced in the same version that enums are, as any payloads using the new format would only be meaningful on PHP versions that support enums.

* As "enum" becomes a reserved keyword, you can't have an interface called "Enum"... If you wanted to, you could probably avoid a reserved keyword by taking a page out of C++'s book and making the syntax "enum class Foo {}" rather than "enum Foo {}", where "enum" would be a contextual keyword. I think this is worth at least considering, because I expect that there are a lot of existing enum libraries that will break due to this reserved keyword. While they can now be replaced by native enums, this will cause issues during migration and with code that is compatible with more than one PHP version.

* Rather than WeakMap, the possibly more natural choice for using enum keys is SplObjectStorage. Of course, SplObjectStorage, like anything that is part of SPL, has some peculiarities... Just allowing them as array keys would be ideal, but I agree that this should not be covered by this RFC. This is something I may look into.
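
As a rough illustration of the SplObjectStorage point, here is a minimal sketch of enum cases used as keys. It assumes enum syntax of the form "enum Status { case Active; }" (the form that shipped in PHP 8.1; the draft RFC may differ in details). The Status enum, its cases and the $labels variable are hypothetical names chosen for the example.

```php
<?php
// Hypothetical unit enum, for illustration only (not taken from the RFC).
enum Status
{
    case Active;
    case Archived;
}

// SplObjectStorage maps objects to arbitrary data. Each enum case is an
// object, so it can serve as a key where a plain array would reject it.
$labels = new SplObjectStorage();
$labels[Status::Active] = 'shown';
$labels[Status::Archived] = 'hidden';

// Look the value up again by enum case.
echo $labels[Status::Active], PHP_EOL;
```

This relies only on the ArrayAccess behaviour of SplObjectStorage; it does nothing to remove the SPL peculiarities mentioned above.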

* While not mentioned in the RFC, you mentioned in this discussion that enum cases cannot be stored in constants:

> At present, no. They're "just" objects, and you can't assign an object to a constant. Unfortunately I'm not sure how to enable that without making them not-objects, which introduces all sorts of other complexity.

This should at the least be clarified in the RFC. Is this a limitation for constants specifically, or anything using constexpr initializers? For example, is writing "function foo(MyEnum $e = MyEnum::SomeDefaultCase)" possible? It would be a significant limitation if it weren't. Generally, I think the limitation on objects in constants is mostly artificial and you should consider lifting it as part of this RFC. Previously you simply couldn't create an object in a constexpr initializer, so supporting this wasn't really relevant. With enums, this becomes important.

* This has already been mentioned by others, but (in conjunction with the previous point), I think that allowing class constants on enums is pretty useful to allow case aliases. I agree that cases should be unique to start with, but aliases should be possible, and class constants provide a very neat way to provide this.

* While I originally liked the idea, after perusing the examples in the RFC, I am not convinced that it is a good idea to allow methods (or anything else) to be defined on a per-case level. Having methods on the enum itself makes sense, but having them on each case seems like it unnecessarily complicates the design, gives people multiple ways to write the same thing and may encourage bad design. I think there are two primary ways in which methods might be used:

First, defining a method for each case, such as your example in https://wiki.php.net/rfc/enumerations#advanced_exclusive_values. In this case you have the choice of either defining it on each case, or defining it as a method on the whole enum. When would you choose one approach over the other?
Defining the method on the whole enum seems generally superior to me, because it guarantees that all cases have the method from an API perspective (rather than just making it an incidental fact -- though I guess you could add an abstract method to the enum?). Additionally it requires a lot less code, especially if match is used. The example in the RFC is even a bit skewed, because once PSR gets its dirty fingers on this feature, all those "{" will get broken out on a new line and it will take even more code.

The other usage is if a method is only defined by *some* of the cases, as in the https://wiki.php.net/rfc/enumerations#state_machine example. This is something I find very dubious from a design perspective, and not something I would like to enable by making it simpler to implement.

Do you know if other languages have precedent for methods on individual enum cases?

* On the implementation side, a general concern I have is that this requires generating a new class not just for each enum, but for each enum case. Some usages of enums (say lexer tokens, AST node kinds, etc) may require hundreds of enum cases, and will generate hundreds of separate classes. Unlike objects, class entries are not cheap and are not designed to be cheap.

* As another implementation note, existing switch/match jumptable optimizations will not work for enums. This is pretty unfortunate, but I don't have an idea on how we could make them work.

* I find the automatic downcast of enums to their scalar values a bit problematic when taking the overall direction of the language into account. We want fewer implicit casts, not more. While I'm sure this will work nicely in some cases, it certainly won't in others. I daresay that passing an enum to the $offset parameter of substr() doesn't make sense regardless of whether the enum has an int backing it or not. Explicitly requiring a ->value() call doesn't seem like an undue burden to me.

Regards,
Nikita
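
P.S. For reference, a minimal sketch of the two preferences argued for above: a single method on the enum implemented with match, and explicit access to the backing scalar. It assumes the enum syntax as it shipped in PHP 8.1, where the scalar is read through a ->value property rather than a ->value() method; the draft RFC may differ. The Suit enum and its color() method are illustrative names, not the RFC's exact example.

```php
<?php
// Hypothetical backed enum; names are illustrative only.
enum Suit: string
{
    case Hearts = 'H';
    case Spades = 'S';

    // One method on the enum itself covers every case, so the method is
    // guaranteed to exist for all of them. match throws
    // UnhandledMatchError if a case is left without an arm.
    public function color(): string
    {
        return match ($this) {
            self::Hearts => 'Red',
            self::Spades => 'Black',
        };
    }
}

echo Suit::Hearts->color(), PHP_EOL;

// The backing scalar is requested explicitly; nothing is cast implicitly.
echo Suit::Spades->value, PHP_EOL;
```

Whether per-case methods would be shorter or clearer than this is the design question raised above; the sketch only shows the enum-level alternative.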