Newsgroups: php.internals
From: ajf@ajf.me (Andrea Faulds)
To: internals@lists.php.net
Date: Fri, 3 Jun 2016 16:13:42 +0100
Subject: Re: [PHP-DEV] Re: [RFC] [PRE-VOTE] Union types
Message-ID: <9A.32.22497.9AE91575@pb1.pair.com>
References: <0A.C5.62101.1C860575@pb1.pair.com>

Hi,

Bob Weinand wrote:
> I think this is more of a presentation problem.
> As you say, there's not much of a better way to do that.

Well, that's why I question if we should do it at all, if there's no way to do it without this complexity.

> It's basically our weak casting rules, just applied to the most lossless type available.

Right, I think I can see the logic to that. But I don't think it works out so well in practice.

Consider the hypothetical example `int|string`. Under your rules, if I pass a float to that parameter, it becomes a string. That makes sense if the function just wants a number, in integer or string format, for that parameter. But what if the function wants either a number (in integer format) or a string, and does different things depending on which of these it gets?
Now the proposed weak typing rules are broken.

A real-world example of this is array indices. Given array keys can be either integers or strings, then if we wanted to take an array key as a parameter, surely we'd use the type declaration `int|string`, right? Well, it turns out that in PHP, $a[1.5] is not equivalent to $a["1.5"], it's equivalent to $a[1]. Yet, if I pass 1.5 to an `int|string` typed parameter, I'd get "1.5". So, we'd end up with an inconsistency:

    <?php

    function fetchValueForKeyInArray(array $arr, int|string $key) {
        return $arr[$key];
    }

    $key = 1.5;
    $arr = ["1.5" => "wrong", 1 => "right"];
    var_dump($arr[1.5]); // string(5) "right"
    var_dump(fetchValueForKeyInArray($arr, $key)); // string(5) "wrong"

    ?>

Trying to do the most "lossless" conversion works for some use cases, but it doesn't work for others, and it doesn't necessarily match existing behaviour in other parts of PHP.

This is a wider concern for me, in that union types have two different use cases. One is where you want to accept any type of some ad-hoc super-type (int|float as an ad-hoc "number", array|\Traversable as an ad-hoc "thing that can be foreach()'d"). The other is where you want to accept two completely unrelated types (int|string for two kinds of array key, int|array, etc.) and perhaps do different things depending on the types given. This latter use case strikes me as making a function overly complicated (make a separate function instead), and it interacts poorly with weak typing (if PHP decides to give you a different type, you get a completely different code path!), yet the addition of union types encourages writing such functions! (At least, for people who want to use type declarations throughout their code.)

Moreover, I think whatever weak typing precedence rules we choose, they can only work reasonably for one or the other of these use cases, not both. I think that `int|string` is a succinct example of that: whichever ruleset we choose, it will do the wrong thing for some use case; we're just deciding which one to privilege.
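The two use cases described above can be sketched roughly as follows. This is only an illustration, assuming a PHP version that accepts union type declarations (8.0 or later); `twice()` and `describeKey()` are hypothetical names and do not come from the RFC. Every call passes a value that already matches one member of the union, so the sketch does not depend on any particular coercion rule.

```php
<?php

// Use case 1: the union acts as an ad-hoc "number" super-type.
// Both members are handled by the same code path.
// (Hypothetical helper, for illustration only.)
function twice(int|float $n): int|float {
    return $n * 2;
}

// Use case 2: the union joins two unrelated types, and the
// function branches on which one it received.
// (Hypothetical helper, for illustration only.)
function describeKey(int|string $key): string {
    if (is_int($key)) {
        return "integer key " . $key;
    }
    return "string key \"" . $key . "\"";
}

var_dump(twice(2));
var_dump(twice(1.5));

var_dump(describeKey(1));
var_dump(describeKey("1.5"));

// describeKey(1.5) is deliberately not called: which branch a float
// ends up in is decided by the weak typing rules under discussion.
```

For `twice()`, it hardly matters which member of the union a coerced value lands on. For `describeKey()`, the coercion rule alone decides which branch runs.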
> It's our weak casting rules which are so complex; the RFC's combination is not particularly complex.

The RFC uses simple rules, but the resulting behaviour is complex. PHP's weak typing rules are complicated enough as they are, and cause enough problems, without the addition of yet more rules, this time for how weak typing should work within union types.

I don't think 2 days is enough time to discuss this before voting on it.

Thanks.

-- 
Andrea Faulds
https://ajf.me/