Newsgroups: php.internals
To: Anthony Ferrara
CC: Simon Schick, Kris Craig, Raymond Irving, "internals@lists.php.net"
Date: Wed, 7 Mar 2012 16:59:32 -0500
Subject: RE: [PHP-DEV] Scalar
Type Hinting
From: johncrenshaw@priacta.com (John Crenshaw)

> From: Anthony Ferrara [mailto:ircmaxell@gmail.com]
>
> John,
>
> On Tue, Mar 6, 2012 at 9:04 PM, John Crenshaw wrote:
> > A good number of issues with the current proposal were raised during the discussion on the mailing list. I don't feel like digging them all up right now, but off the top of my head I remember the following being raised and never saw any consensus for how to resolve them:
>
> I went over the replies to the initial POC thread that I posted
> (http://marc.info/?t=133066037200001&r=1&w=2) and I'll rebut your replies.
>

You've been spending a lot of time defending these proposals and trying to prove wrong feedback that raises concerns. This is preventing you from actually using the feedback to improve the proposals. You are losing out on perhaps the biggest advantage of the RFC process, which is that multiple minds can work together to hammer out an idea and make it really shine.

Most of your "rebuttals" focus entirely on whether the RFC contains sufficient information to make something technically work. That's not the point at all. I've read the RFC. I have no doubt that it "works", nor I think does anyone else. That's not the issue.

To quote a recent film: "Titan: Oh yeah? What's the difference? Megamind: PRESENTATION!". The code syntax is the UI that PHP presents to developers. Yes, this "works" (in the sense that it is possible to implement what the RFC describes), but there are serious usability and communication problems. Language is also one of the trickiest interfaces to work with because once you commit to something you are pretty much stuck with it forever (namespace separator). If every PHP developer is going to have to deal with this until the end of time it needs to be *awesome*.

> > - inconsistent syntax (one syntax for scalars, a different one for
> > classes)
>
> This is actually discussed in the RFC, as it is not inconsistent (it's actually consistent with what the patch tries to achieve). The syntax for classes and normal arrays is a strict check, where if the match fails an error is thrown. This syntax attempts to distinguish between that functionality by providing a different syntax altogether. And since it's casting the parameters, the syntax feels natural (for that goal).
>

You can mince words, but that doesn't change the problem. It is utterly inconsistent with the expectations it creates. You'll argue that it creates an expectation of a type cast, but you'd be wrong in far too many cases. The syntax is similar enough to the syntax for parameter types in other languages that developers will think of it as basically the same thing. The syntax differs, however, from existing parameter type syntax.

The behavioral difference is also a problem, being too different from parameter typing to be useful (it doesn't actually vet the parameter), and yet close enough to validate the confusion (behaves too similarly to an implicit conversion). In the end you have a bizarre syntax that looks like one thing but is conceptually another, but with a subtle behavioral difference that is invisible except when it fails to fail.

> > - conflicting syntax (I.E. array vs. (array), RFC simply "allows"
> > this, and ignores the confusion that this will create for users.)
>
> Actually, it doesn't simply allow that. It did it for a very specific reason. "array" is a strict check, and "(array)" is a casting check.
> One will fatal if a non-array is passed, and the other will attempt to convert the parameter to an array. Very different functionality, which are both internally consistent with the other syntax...
>

I know about the behavioral difference.
I'm not talking about a technical conflict, I'm talking about a conflict in the mind of the developer. Given function(array $a, (array) $b){} the difference between $a and $b is a very advanced distinction and will be completely lost by the average developer.

The confusion is made worse by the fact that function(array $a) works, but function(int $a) doesn't, but function((int) $a) does. I know why, but the average developer just learning PHP for the first time won't get it at all.

> > - different from the syntax used in the docs
>
> Actually, it's the exact same syntax used for casting in the docs.
> It's different from class type hints, because it's intended to be so.
> If you don't like it, that's fine. But it's intentionally different.
>

Sometimes I think you miss the point on purpose. In the docs substr is defined as:

    string substr ( string $string , int $start [, int $length ] )

If I wanted to write the same thing in my own code I would have to write:

    function my_substr ( (string) $string , (int) $start , (int) $length )

One syntax for the docs, a different one in my code.

> > - lack of sufficient function to justify a core change
>
> That's absolutely something to be considered. However, I see erroring on invalid casts as a bigger issue not the responsibility of a
> **casting** hint patch. So that's why I mention explicitly in the RFC and my blog post that solving that problem should be another RFC (culminating in a series of 3 RFCs that each work together very well to fill the overall need).
>

Well, if you argue that this proposal is actually a typecasting enhancement and not a parameter type enhancement, then yes, you'd be right. That's not how this is presented though, that's not where it is presented (Scalar Type Hinting discussion), that's not when it was presented, and that's not how it is getting perceived or measured.

As far as expanding type casting in this way, it doesn't make sense.
There is no precedent, it looks and will get treated like parameter hinting (which you are saying it isn't), there is no additional functional value, and very little value of any sort beyond casts (just a tiny documentation boost and a savings in the number of characters in the few cases where a dumb typecast is good enough.)

> > - chaos surrounding null (to accept and if so to cast or not? Creates
> > a conflict between consistency and implications of the syntax)
>
> That's absolutely a valid concern. And that's what should be being discussed, if people really feel that the current implementation is wrong...
>
> > - conflicts with references (RFC tries to address this by simply
> > disallowing references, which IMO just ignores the need that would
> > have caused this sort of code in the first place.)
>
> Actually, it explicitly disallows references. This was added because it was explicitly requested on the list in many discussions (including the initial one for the POC). But if you can make a case on why to implement it, and why it makes sense, then we can add it back.
>

Again, yes, I know that the RFC takes a position on this. My point is not that the RFC doesn't take a position, my point was that the position taken "ignores the need that would have caused this sort of code in the first place."

Reference parameters are basically return values. Just because a parameter will return a value doesn't mean that the type no longer matters. In fact, it means that the calling function should EXPECT the type to change, and EXPECT the value to be modified. One example that occurs to me off hand is preg_match, where the 3rd parameter is documented as an array reference. You can actually pass anything, but it's going to be an array when the function returns.

> > There were others, but I'm not making an exhaustive list.
>
> Well, the ones that I saw were either rectified, requested (and
> implemented) or (to me) showed a lack of understanding about the rationale and decisions (which is valid, but shouldn't really bear that much on the discussion, as I feel if you want to raise an issue, you should at least have a cursory understanding of what you're commenting about)...
>
> >> As far as it being crippled, I'm not sure what you mean, just because it's only doing casting?
> >>
> >
> > Yes, casting is barely better than doing nothing. If I want a dumb type cast I can do that already. What I can't do without massive boilerplate everywhere is an intelligent conversion that accepts safe conversions and gives a warning or error on unsafe conversions.
>
> Again, I personally see casting data-loss a bigger issue than just parameter hinting, which should get its own RFC to clean up. That's why I didn't include it here. On purpose...
>

You can't raise errors like this on lossy casts. Giving a warning if an implicit conversion loses data (123 + '456xyz', or substr('foo', 'bar')) makes sense, but if I explicitly write (int)'456xyz' or (int)23.7, I must not get a warning about data loss (otherwise it would be difficult/impossible to force a deliberately lossy conversion without warnings). Since we can't raise warnings on casts, you really won't be able to do it with casted parameters either (unless you're willing to accept a consistency break). These will be stuck in silence forever.

> > Well, the big issue for me is that I think the type casting hints starts from a fundamentally bad foundation. The premise here is based on a syntax that is ugly and strange, and the syntax opens a bunch of new problems that I don't see a way to resolve. The reason for selecting this syntax was to avoid a BC break from reserving some new keywords. I don't really see this as a good trade.
>
> To be perfectly clear, the reason for that syntax was **not** to avoid a BC break.
> It was to avoid a FC break. I wanted a casting hint, and I didn't want to restrict the possibility of adding strict hints later. And I wanted to clearly distinguish the current strict hints (array + class) and the new casting hints. To me, the most logical way of doing that was to re-use the casting syntax. Could we have done foo(int? $foo) or foo(*int $foo) or foo(@int $foo) or whatever?
>

Well, if that is the case, you did not succeed. The FC break you originally worked to avoid is something that can never happen anyway, and this does not avoid the most critical FC break.

First, "strict" types (in the sense of not allowing an integer to be passed to a string parameter) are never going to happen in PHP. Even C++ looks for an implicit conversion and uses that if available. PHP has well defined implicit conversions between scalars, and there is absolutely no chance that PHP is ever going to implement any form of typing that is more strict than typing in C++. A strict set of scalar type hints that doesn't juggle anything is just not in the cards.

Second, (bear with me) the form of typing that people seem to have settled on before this RFC was introduced is "the same typing as the core, but at the language level." To use a prior example, the core has:

    string substr ( string $string , int $start ){...}

and the following behavior:

    substr('foo', 'bar'); // E_WARNING
    substr(123, '1'); // '23'

I should be able to write an equivalent:

    function my_substr ( string $string , int $start ){...}

and I should get matching behavior:

    my_substr('foo', 'bar'); // E_WARNING
    my_substr(123, '1'); // '23'

Now here's the point: if somehow this proposal were accepted and incorporated into the language it would make it nearly impossible to later achieve the above. Having both would be confusing, because the behavioral difference is extremely narrow.
Your syntax at that point really just silences lossy implicit conversions and would actually be more similar to what one might expect from:

    function my_substr ( @string $string, @int $start ){...}

> Sure. But I'd argue that anything except foo(int $foo) would have lead to FAR more confusion.
>

I don't think any other syntax is even an option. This syntax is used by existing type hints in PHP, and in the PHP documentation, and in basically all generated library documentation, and used for parameter types in every syntactically similar language.

> This way, it's re-using an existing syntax that **every** intermediate developer is intimate with, and applying it to a case that's basically functioning along the same lines...
>

And reusing that syntax leads this straight into the dead end indicated previously when talking about the FC objective.

John Crenshaw
Priacta, Inc.