Newsgroups: php.internals
From: benjamin.morel@gmail.com (Benjamin Morel)
Date: Tue, 9 Apr 2019 13:55:18 +0200
Subject: Re: [PHP-DEV] [RFC] Permit trailing whitespace in numeric strings
To: Nikita Popov
Cc: Derick Rethans, Andrea Faulds, PHP internals

> I should probably clarify what I mean by explicit and implicit here. By
> explicit I mean anything using (int) casts or doing so internally
> (implicitly ^^) -- this *must* produce an integer in some way and does not
> have the option of rejecting the input. By implicit I mean other places
> checking for numeric strings, such as "int" parameters. These *do* have the
> option of rejecting the input. Both cannot work the same way due to the
> different constraints.

Why? Wouldn't it be nice to align the behaviour of implicit and explicit
casting, so that (int) "abc" throws a TypeError?

Ben

On Tue, 9 Apr 2019 at 13:06, Nikita Popov wrote:

> On Tue, Apr 9, 2019 at 12:57 PM Derick Rethans wrote:
>
> > On Tue, 9 Apr 2019, Nikita Popov wrote:
> >
> > > On Thu, Apr 4, 2019 at 1:16 AM Andrea Faulds wrote:
> > >
> > > > I'm kinda unsure how to go forward because of these points. I would like
> > > > to see improved comparisons, and I would like to see the end of the
> > > > “non-well-formed” numeric string, and I think this whitespace RFC could
> > > > be helpful to both. But I can't see the future, I don't know whether
> > > > people will vote for removing leading or permitting trailing whitespace
> > > > and whether or not they will be influenced by or this will influence
> > > > opinion on the further improvements.
> > > > ¯\_(ツ)_/¯
> > > >
> > > > I'm torn between:
> > > >
> > > > * Vote on allowing trailing whitespace
> > > > * Vote on disallowing leading whitespace
> > > > * Vote on which of those two approaches to go for
> > > > * Trying to bundle everything together and voting on it as a package.
> > > >
> > > > I'm probably thinking too strategically.
> > > >
> > >
> > > Given the response on the mailing list (and also other places like Reddit),
> > > it seems like people feel pretty strongly that it's better to drop support
> > > for leading whitespace than add support for trailing whitespace. If we do
> > > this, I think we should couple this change with the removal of "non
> > > well-formed numeric strings", because they are so closely related (one
> > > change would forbid leading whitespace and the other trailing characters).
> > >
> > > One possible course of action would be:
> > >
> > > a) In PHP 7.4 throw a deprecation warning in is_numeric_string if there is
> > > leading whitespace (always).
> > > b) In PHP 7.4 throw a deprecation warning in is_numeric_string if there are
> > > trailing characters in mode 1 (mode -1 already throws a notice and 0
> > > already treats as non-numeric).
> > > c) In PHP 8.0 treat leading whitespace as non-numeric (always).
> > > d) In PHP 8.0 treat trailing characters as non-numeric (always), and
> > > remove the non well-formed distinction (mode -1).
> > >
> > > Notably this also affects (int) behavior in that (int) " 42" will be 0
> > > and (int) "42xyz" will be 0.
> > >
> > > A less aggressive alternative would be:
> > >
> > > a) In PHP 7.4 throw a deprecation warning in is_numeric_string if there is
> > > leading whitespace (unless mode is 1).
> > > b) In PHP 8.0 treat leading whitespace as non-numeric (unless mode is 1).
> > > c) In PHP 8.0 treat trailing characters as non-numeric (unless mode is 1).
> > > Remove non well-formed distinction (mode -1).
> > >
> > > This would keep the behavior of (int) as-is and only affect implicit
> > > numeric string checks.
> > >
> > > This discussion has mostly been around the implicit cases, what do people
> > > think about the desired behavior of (int)?
> >
> > I think there should be no difference in behaviour between implicit and
> > explicit cases.
>
> I should probably clarify what I mean by explicit and implicit here. By
> explicit I mean anything using (int) casts or doing so internally
> (implicitly ^^) -- this *must* produce an integer in some way and does not
> have the option of rejecting the input. By implicit I mean other places
> checking for numeric strings, such as "int" parameters. These *do* have the
> option of rejecting the input. Both cannot work the same way due to the
> different constraints.
>
> So to rephrase my question: While I think there is a consensus that
> "123xyz" and " 123" should not be accepted by an "int" parameter, it is
> not clear to me that there is also a consensus that (int) "123xyz" and
> (int) " 123" should result in 0 rather than 123.
>
> Regards,
> Nikita
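
For reference, a minimal PHP sketch of the explicit/implicit distinction discussed in this thread. It assumes the default coercive typing mode (no strict_types declaration), and takesInt() is a hypothetical helper introduced only for illustration, not something from the thread or from PHP itself. It shows only behaviour that, as far as I know, is the same on PHP 7 and PHP 8; what an "int" parameter does with "123xyz" or " 123" depends on the version and is exactly what is being debated, so it is left out.

```php
<?php
// Explicit conversion: an (int) cast always yields an integer and
// has no way of rejecting its input.
var_dump((int) "123");     // plain numeric string
var_dump((int) "123xyz");  // trailing characters are ignored, result is 123
var_dump((int) " 123");    // leading whitespace is skipped, result is 123
var_dump((int) "abc");     // nothing numeric to read, result is 0

// Implicit conversion: an "int" parameter checks for a numeric string
// and is free to reject the argument instead of producing a value.
// Hypothetical helper, for illustration only.
function takesInt(int $n): int
{
    return $n;
}

try {
    takesInt("abc");
} catch (TypeError $e) {
    // A non-numeric string is rejected rather than turned into 0.
    echo "rejected: ", get_class($e), "\n";
}
```

The cast silently maps "abc" to 0, while the parameter check raises a TypeError for the same string; Ben's question above is whether the cast should be brought in line with the latter.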