From: anatol.php@belski.net ("Anatol Belski")
To: "Andrea Faulds"
Cc: "Anatol Belski", "Yasuo Ohgaki", "PHP Internals"
Date: Wed, 19 Nov 2014 19:28:23 +0100
Subject: Re: [PHP-DEV] [RFC] Safe Casting Functions

On Wed, November 19, 2014 15:49, Andrea Faulds wrote:
>
>> On 19 Nov 2014, at 08:33, Anatol Belski wrote:
>>
>> while briefly
>> looking through the conversion examples, I see some weird results
>>
>> string(5) “31e+7” - shouldn't this be valid for int?
>
> The trend seems to be to consider things with exponents or decimal points
> as floats. Even though there’s a case for supporting it for ints, (int)
> and intval() don’t work with exponents, so to_int() shouldn’t either.
>
>> string(4) “0x10” - hex, but that's int, no?
>
> Supporting hex is a rather obscure use-case. Also, (int) and intval()
> don’t support it.
>
>> string(3) “010” - octal, but that's int, no?
>
> While allowing leading zeroes would be nice, octal causes problems. In
> particular, 0-prefixed strings aren’t handled consistently. Some things
> deal with them as decimal, others deal with them as octal. Because the
> user’s intent isn’t clear, we can’t support them, and I assume this is why
> FILTER_VALIDATE_INT doesn’t support them.
>
>> string(4) “10.0” - this would be casting to 10, so int valid
>
> Allowing .0 for an int doesn’t feel right. What do we do for “10.01”?
> Reject it? That seems rather arbitrary when we would be allowing “10.00”.
> So it’s not accepted.
>
>> object(Stringable)#2 (0) {} - and similar actually, what if __toString()
>> returns some int/float literal? that should pass as well, no?
>
> __toString() always errors if it doesn’t return a string, I see no reason
> to change that.

But in the other cases it converts strings to numbers. I mean like

class A {function __toString(){return '10';}}
$a = (string) (new A); //numeric literal

>
>> Generally I'd say no to this RFC. The current casting is not perfect,
>> but as for me - the one suggested is highly questionable as well. IMO as
>> long as there are no proper strict types in PHP, any other rule set for
>> casting would be just another coordinate system for the same, which
>> isn't worthwhile at least.
>
> Something like this RFC is a necessary prerequisite for strict types.
> Without it, there’s not a convenient way to do a safe conversion.
> If we just add strict types, people will blindly use (int) or intval() and
> magically, garbage input will be transformed (through the magic of
> ignoring everything in the string that doesn’t look like an int) into
> apparently sane input and apps will do dangerous things when presented
> with bad user input.
>
> --
> Andrea Faulds
> http://ajf.me/
>

IMHO it's a new rule set around the old thing. There's no way to foresee all
the scenarios. Say I expect an input to be less than 3. It's up to the
programmer whether to check that the input is (int)'3' > 3 and give up, or
to try sscanf('2e+22', '%f')[0] > 3. And that's not even mentioning regex.
Mechanisms for implementing such checks already exist, they are highly
customizable, and one can usually come up with a suitable one. Maybe the
proposed rule set would sometimes save a line, but that still depends on the
concrete use case.

Regards

Anatol
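
For illustration only (this is not code from the thread): a minimal sketch of the "less than 3" check built from mechanisms that already exist. The helper name isIntBelowThree and the sample input '2abc' are assumptions made up for this example. filter_var() with FILTER_VALIDATE_INT and its max_range option are standard PHP.

```php
<?php
// Hypothetical helper (name invented for this sketch): returns true only when
// $input is a plain decimal integer string whose value is below 3.
function isIntBelowThree(string $input): bool
{
    // FILTER_VALIDATE_INT rejects the whole string unless it is a clean integer;
    // max_range => 2 additionally enforces the "less than 3" rule.
    $value = filter_var($input, FILTER_VALIDATE_INT, ['options' => ['max_range' => 2]]);

    // filter_var() returns false on failure, so compare strictly (0 is a valid result).
    return $value !== false;
}

// Samples from the thread, plus '2abc' as an added illustration.
foreach (['2', '3', '010', '0x10', '10.0', '2abc'] as $s) {
    // (int) keeps whatever leading digits it finds; the filter judges the whole string.
    printf("%-5s (int) => %d, accepted: %s\n", $s, (int) $s, isIntBelowThree($s) ? 'yes' : 'no');
}
```

Note that '2abc' casts to 2 with (int) but is rejected by the filter, which is the "garbage in, sane-looking value out" case Andrea describes. Whether a few lines like these make dedicated safe casting functions unnecessary is exactly the point under discussion.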