From: jordan.ledoux@gmail.com (Jordan LeDoux)
Date: Fri, 17 Dec 2021 14:04:12 -0800
To: Matt Fonda
Cc: PHP internals
Subject: Re: [PHP-DEV] [RFC] User Defined Operator Overloads (v0.6)

On Fri, Dec 17, 2021 at 12:26 PM Matt Fonda wrote:

> Thanks for the info. I share Stas's unease with having many different places we must look in order to understand what $foo * $bar actually executes. I'm also uneasy with the requirement of union typing in order for an operator to support multiple types. This will lead to implementations which are essentially many methods packed into one: one "method" for each type in the union, and potentially one "method" for each LHS vs. RHS. When combined, these two issues will make readability difficult.
> It will be difficult to know what $foo * $bar actually executes, and once we find it, the implementation may be messy.
>
> I agree that returning a union is a recipe for a problem, but the fact that the input parameter must be a union can imply that the return value must also be a union. For example, Num * Num may return Num, but Num * Vector3 may return Vector3, or Vector3 * Vector3 may represent dot product and return Num. But let's not get hung up on specific scenarios; it's a problem that exists in the general sense, and I believe that if PHP is to offer operator overloading, it should do so in a way that is type safe and unambiguous.
>
> Method overloading could address both issues (LHS always "owns" the implementation, and has a separate implementation for each type allowed on the RHS). But I see this as a non-starter because it would not allow scalar types on the LHS.
>
> It's difficult to think of a solution that addresses both of these issues without introducing more. One could imagine something like the following:
>
> register_operator(*, function (Foo $lhs, Bar $rhs): Foo { ...});
> register_operator(*, function (Bar $lhs, Foo $rhs): Foo { ...});
> register_operator(*, function (int $lhs, Foo $rhs): int { ...});
>
> But this just brings a new set of problems, including visibility issues (i.e. can't use private fields in the implementation), and the fact that this requires executing a function at runtime rather than being defined at compile time.
>
> I don't have any ideas that address all of these issues, but I do think they deserve further thought.

With respect, these are not things that were overlooked. Method overloading is something that I understand to be a complete non-starter within PHP. I do not want to speak for other people, but I have been told multiple times by multiple people that this is a feature to which there is significant resistance, to the point of being something which should be avoided.
Certainly, it is a separate feature from operator overloading, and shouldn't be included as part of this RFC.

As you noted, all of the alternatives have multiple *other* issues. I considered many different ways to implement this, and I decided that this particular way of doing it presented the fewest problems. The reason I made that decision was that problems such as visibility issues would affect nearly every implementation, whereas the issue of non-sibling type resolution would only affect a small subset of very complicated programs. So I chose to confine the issues to the more complex implementations, because these are likely also the ones where the developer is more experienced or has more resources to solve the issues presented.

In general, unioning types should be seen as a "code smell" with this feature, in my personal opinion. If you start to see 4, 5, or 6 different types in your parameters, it should be a signal that you want to re-examine how you are implementing them. I think it works well for this purpose, as many developers already try to refactor code which has very complicated type unions.

Given that method overloads were off the table, and that the only realistic way to provide for visibility concerns was to place the overloads on classes, I see the requirement of union typing the operators as a guard rail to help developers avoid implementations which are prone to error or make the program excessively complex to understand.

If we instead created a global register of type combinations, such as those suggested by Mel, the overloads would likely all live in one place (some kind of bootstrap or header file), but they would then be completely separated from the actual implementations.

I *did* consider all these issues quite extensively. I *think* that the solution I'm presenting creates the fewest issues for the smallest set of users.
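
As a rough sketch of the guard-rail idea above (an operator that lives on the class itself and accepts only a narrow union), here is a Python analogue. Python is used because this message names it as the closest existing design; the `Money` class, its fields, and its rules are hypothetical and are not taken from the RFC.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Money:
    """Hypothetical currency value object; amounts are stored in minor units."""

    cents: int
    currency: str

    def __mul__(self, factor: int | float) -> Money:
        # Only plain numbers are accepted: one narrow union, one return type.
        if isinstance(factor, bool) or not isinstance(factor, (int, float)):
            return NotImplemented
        return Money(round(self.cents * factor), self.currency)

    # A scalar on the left-hand side (3 * price) is routed to the same logic.
    __rmul__ = __mul__

    def __add__(self, other: Money) -> Money:
        # The class can read its own fields here, which a global registry
        # of closures could not do for private state.
        if not isinstance(other, Money):
            return NotImplemented
        if other.currency != self.currency:
            raise ValueError("cannot add amounts in different currencies")
        return Money(self.cents + other.cents, self.currency)


price = Money(1250, "USD")
tripled = price * 3
total = 2 * price + price
print(tripled)
print(total)
```

If the union on `factor` started to grow toward four or five unrelated types, each with its own branch, that growth would be the signal to reconsider the design rather than to add another `isinstance` check.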

In practice, the two most common usages for this feature (in my estimation) are likely to be userland scalar object implementations and currency objects. Both of these are very self-contained, and unlikely to want to interact with external objects.

The main applications that would be interested in doing that are complex mathematical libraries (the kind of application that would fit your example of Vector * Num). Such libraries are very likely to make subordinate calls within the operator overloads, as the implementations of the mathematics themselves are already very complex and likely used in multiple ways at different times (speaking from experience as someone who maintains a complex mathematics library).

For those kinds of applications, the library itself is inherently complex, and I very much doubt that operator overloads will be the main source of complexity and confusion. When dealing with such math, the more difficult parts to use are things that are related to the math itself, such as the idea that complex numbers don't have a <=> relationship to other numbers but do have a == relationship, or the concept of stochastic rounding for applications such as machine learning.

I am definitely open to improvements and suggestions; I just want to be clear that this wasn't overlooked. As you wrote out, the alternatives that are obvious to explore present problems that would be experienced on a more widespread basis, and I felt it was best to avoid that.

I looked at how other languages implement this feature as well, including Python, R, and C++, to examine how those programming communities interact with different language designs. This RFC is closest to the design of Python, as the concerns within Python are much more similar to the concerns within PHP. If you find another alternative to explore, I am happy to discuss it.

These same trade-offs exist in other languages which have this feature.
Again, I'd look at Python for the closest analogue to this RFC, where operator overloads are used extensively by many of the applications you would expect, but do not appear to present these unstoppable complexity problems to most applications. They are more widely problematic in C++, but several of the most common sources of pain with C++ operator overloading are entirely avoided (on purpose) in this RFC. You cannot overload the assignment operator, you cannot overload the logical operators, and you cannot implement == and != with different logic. Even Python allows you to define > and < with different logic (it doesn't even require a boolean return value). If this RFC were to be accepted, PHP would have some of the most restrictive and logically consistent operator overloads of any language I've investigated as part of this RFC.

Is my proposal perfect? I very much doubt that. There is always room for improvement. But an extreme amount of care went into trying to limit the amount of "gunk" this feature will generate, some of it not obvious at first glance.

Jordan
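
To make the Python comparison above concrete, the following minimal sketch shows `<` and `>` defined with unrelated logic and non-boolean return values, which Python permits. The `Column` class is hypothetical and is not taken from any real library.

```python
class Column:
    """Hypothetical query-builder column, for illustration only."""

    def __init__(self, name: str) -> None:
        self.name = name

    def __lt__(self, other: object) -> str:
        # Returns a string, not a bool: Python accepts any return value here.
        return f"{self.name} < {other!r}"

    def __gt__(self, other: object) -> list:
        # Deliberately unrelated to __lt__: Python does not tie the two together.
        return [self.name, "exceeds", other]


age = Column("age")
less = age < 18
greater = age > 18
print(less)
print(greater)
```

Nothing forces the two operators to agree with each other, which is the kind of inconsistency the restrictions described above are meant to rule out.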