Newsgroups: php.internals Path: news.php.net Xref: news.php.net php.internals:82915 Return-Path: Mailing-List: contact internals-help@lists.php.net; run by ezmlm Delivered-To: mailing list internals@lists.php.net Received: (qmail 90243 invoked from network); 17 Feb 2015 01:44:37 -0000 Received: from unknown (HELO lists.php.net) (127.0.0.1) by localhost with SMTP; 17 Feb 2015 01:44:37 -0000 Authentication-Results: pb1.pair.com smtp.mail=drewparoski@gmail.com; spf=pass; sender-id=pass Authentication-Results: pb1.pair.com header.from=drewparoski@gmail.com; sender-id=pass Received-SPF: pass (pb1.pair.com: domain gmail.com designates 209.85.216.173 as permitted sender) X-PHP-List-Original-Sender: drewparoski@gmail.com X-Host-Fingerprint: 209.85.216.173 mail-qc0-f173.google.com Received: from [209.85.216.173] ([209.85.216.173:47704] helo=mail-qc0-f173.google.com) by pb1.pair.com (ecelerity 2.1.1.9-wez r(12769M)) with ESMTP id FA/C2-08417-40D92E45 for ; Mon, 16 Feb 2015 20:44:37 -0500 Received: by mail-qc0-f173.google.com with SMTP id w7so26723234qcr.4 for ; Mon, 16 Feb 2015 17:44:34 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type:content-transfer-encoding; bh=sJlIvlFwqmrvxwqxEHx8yhaCRbIN8VhciCtGpbPMeTA=; b=jDrxDJ2zQIHVR9MyPthZI7/IehHTS3/PmR64IUrLJvmr/FhL6LqtnXOqZexWw7+1Hn j8vA7ABDgF+BOwQ8X8haD6A9RVMI3qs8tEbwPCkXi2Igt52jWvB9UaJPlJBCwUYFjSbw zR/TOV3sf4XuRIcM5aQ5zLd8WPFoMF0N2MBDUbzcBfDPskmWwgog1Y5rkXykmEmo69MS 54nRRm5AKm0TtrWbDbrYC57rFRW8CWHxM88HRK+zmqpKnmB7fRD5cchI+PP3sYOucSLa 9BB+rjTZbz9b4PxwfLekHrG5D/gnVw60mgxMp4aTB3koK+2he3XyFIoSFnZ5rrC7XlV/ EG0A== MIME-Version: 1.0 X-Received: by 10.229.190.6 with SMTP id dg6mr105256qcb.16.1424137474288; Mon, 16 Feb 2015 17:44:34 -0800 (PST) Received: by 10.229.57.138 with HTTP; Mon, 16 Feb 2015 17:44:34 -0800 (PST) In-Reply-To: <54E290E5.3020508@lerdorf.com> References: <011801d04a07$83ab1c00$8b015400$@php.net> <016f01d04a3a$e9183220$bb489660$@php.net> <54E290E5.3020508@lerdorf.com> Date: Mon, 16 Feb 2015 17:44:34 -0800 Message-ID: To: Rasmus Lerdorf Cc: Sara Golemon , francois@php.net, Philip Sturgeon , Arvids Godjuks , Jefferson Gonzalez , Rowan Collins , PHP internals Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: quoted-printable Subject: Re: [PHP-DEV] Reviving scalar type hints From: drewparoski@gmail.com (Drew Paroski) On Mon, Feb 16, 2015 at 4:52 PM, Rasmus Lerdorf wrote: > On 02/16/2015 03:58 PM, Sara Golemon wrote: >> On Mon, Feb 16, 2015 at 2:50 PM, Fran=C3=A7ois Laupretre wrote: >>> Once again, anyone can take over version 0.3, if it is so great. Why do= n't you do it ? >>> I will play the game, stop working on my proposal, and vote 'yes' again= . >>> But don't ask me to do it in your place. >>> >> If nobody else does it, I will. >> >> I think Andrea's 0.3 proposal was extremely well balanced, served >> everyone's needs whether they would admit it or not, and who's only >> failing (subjectively termed) was the use of declare(). I think >> declare() is fine and not nearly as ugly as some have slandered it to >> be, but I'm willing to read the winds and modify it for v0.4. > > I still disagree strongly that it serves everyone's needs. The internal > API and APIs provided by extensions are completely messed up by this > approach. Userspace authors get the choice when they write their code. > Even if the caller has turned on strict, if you haven't added strict > types to your functions/methods you are fine. Internal API and extension > authors don't get this luxury. If the caller turns on strict, then > suddenly an API that was written explicitly to make use of the coercive > characteristics PHP has had for 20+ years breaks and there is nothing I > can do about it as an extension author. This is beyond just int->float > coercion, which is, of course, also missing. > > A simple example: > > number_format($num); > > So, for various inputs without strict enabled: > > int(1000) > 1,000 > > string(4) "1000" > 1,000 > > float(1000) > 1,000 > > string(5) "1000 " > Notice: A non well formed numeric value encountered in ... > 1,000 > > string(5) " 1000" > 1,000 > > string(9) "1000 dogs" > Notice: A non well formed numeric value encountered in ... > 1,000 > > string(3) "dog" > Warning: number_format() expects parameter 1 to be float, string given > NULL > > resource(5) of type (stream) > Warning: number_format() expects parameter 1 to be float, resource given > NULL > > This is the intended API. The function does some sanity checking for > things that clearly don't make sense based on what it was written to do > and fails hard. It also lets the user know about questionable data and > relies on coercion for the rest. You could argue that the "1000 dogs" > case should be more severe than a notice, but that is pretty minor I > think. The extension author has the ability to make that more severe if > she likes. > > > Now turn on strict and we get: > > int(1000) > number_format() expects parameter 1 to be float, integer given > > string(4) "1000" > number_format() expects parameter 1 to be float, string given > > float(1000) > 1,000 > > string(5) "1000 " > number_format() expects parameter 1 to be float, string given > > string(5) " 1000" > number_format() expects parameter 1 to be float, string given > > string(9) "1000 dogs" > number_format() expects parameter 1 to be float, string given > > string(3) "dog" > number_format() expects parameter 1 to be float, string given > > resource(5) of type (stream) > number_format() expects parameter 1 to be float, resource given > > That in itself might not be so terrible, but also not terribly useful > and it is nothing like what the extension author had intended. And how > do you think users will deal with internal functions that are now > suddenly strongly typed even though they were not designed to be? Do you > think they will go look at the source code for the function and mimic > the data sanity checks and put those in their userspace code? Doubtful, > and as people on irc and everywhere have told me, they will just cast. > No big deal. So, we run the same set of data through with strict enabled > and doing a number_format((float)$num): > > int(1000) > 1,000 > > string(4) "1000" > 1,000 > > float(1000) > 1,000 > > string(5) "1000 " > 1,000 > > string(5) " 1000" > 1,000 > > string(9) "1000 dogs" > 1,000 > > string(3) "dog" > 0 > > resource(5) of type (stream) > 5 > > > Out of the 3 scenarios, this is inarguably the worst outcome with the > last line there being the most blatant. Actually outputting an internal > resource id as if it was a valid number to be formatted without the > slightest notice or warning anywhere that something is wrong in the code. > > This is my fear with this approach. People will start littering their > code with casts and it will hide real bugs which is the complete > opposite of what motivated this in the first place. > > Without more thought into how we properly deal with internal/extension > code I don't really understand how so many people foresee this to work > perfectly in the real world. > > -Rasmus > I think you bring up a good point, Rasmus. Extension implementations have existed for many years in a world before user-land scalar typehints. I can see how the author of an extension might have written their code in a way that relies on assumptions about how argument coercion behaves. What would you say if zend_parse_parameters() API and friends were left alone, and a new flag or new versions of these APIs were introduced so that extension authors could gradually adopt the new parameter checking/coercion scheme at their leisure? When adopting the new parameter checking/coercion scheme, if an author needs/wants to do something different for parameter checking or coercion for a given extension, it occurred to me that they could just omit the typehint (i.e. use "z" for zend_parse_parameters()) and then do manual checking inside the function's body. I've seen this technique used in a number of extensions (and in pure PHP functions) and it doesn't seem to cause significant problems AFAICT. I know this isn't a fully baked solution to the issue you raise, but I'm curious to hear your thoughts about this approach. Regards, Drew Paroski