Newsgroups: php.internals
From: nikita.ppv@gmail.com (Nikita Popov)
Date: Tue, 9 Apr 2019 12:47:49 +0200
Subject: Re: [PHP-DEV] [RFC] Permit trailing whitespace in numeric strings
To: Andrea Faulds
Cc: PHP internals

On Thu, Apr 4, 2019 at 1:16 AM Andrea Faulds wrote:

> Nikita Popov wrote:
> > I'm always a fan of making things stricter, but think that in this
> > particular case there are some additional considerations we should keep
> > in mind.
> >
> > 1. What is more important to me here than strictness is consistency.
> > Either both " 123" and "123 " are numeric, or neither are. Making "123 "
> > numeric is a change we can easily do, because it makes the numeric string
> > definition more permissive and is thus mostly backwards compatible. Doing
> > the reverse change is certainly not compatible and will be a much harder
> > sell.
> >
> > 2. I believe that a large part of the motivation here is that by making
> > the numeric string definition slightly more lax (in a consistent manner),
> > we can make *other* things more strict, because this essentially
> > eliminates the only "somewhat reasonable" case of trailing characters.
> > The RFC already mentions two of them:
> >
> > a) We can hard reject "123foo" inputs to "int" arguments (and some other
> > places). Currently this is allowed with a notice. I think if we resolve
> > the trailing whitespace question, then there cannot be any reasonable
> > opposition to this change.
> > b) My own RFC on number to string comparisons would benefit from this.
> > From initial testing it has surprisingly little impact, but one of the
> > few cases that turned up was this comparison with a string that had
> > trailing whitespace.
> >
> > Personally I think both of those changes are a lot more valuable than a
> > stricter numeric string definition without leading/trailing whitespace.
>
> I'm kinda unsure how to go forward because of these points. I would like
> to see improved comparisons, and I would like to see the end of the
> “non-well-formed” numeric string, and I think this whitespace RFC could
> be helpful to both. But I can't see the future, I don't know whether
> people will vote for removing leading or permitting trailing whitespace
> and whether or not they will be influenced by or this will influence
> opinion on the further improvements. ¯\_(ツ)_/¯
>
> I'm torn between:
>
> * Vote on allowing trailing whitespace
> * Vote on disallowing leading whitespace
> * Vote on which of those two approaches to go for
> * Trying to bundle everything together and voting on it as a package.
>
> I'm probably thinking too strategically.

Given the response on the mailing list (and also other places like Reddit),
it seems like people feel pretty strongly that it's better to drop support
for leading whitespace than to add support for trailing whitespace.

If we do this, I think we should couple this change with the removal of
"non well-formed numeric strings", because the two are so closely related
(one change would forbid leading whitespace and the other trailing
characters). One possible course of action would be:

a) In PHP 7.4 throw a deprecation warning in is_numeric_string if there is
leading whitespace (always).
b) In PHP 7.4 throw a deprecation warning in is_numeric_string if there are
trailing characters in mode 1 (mode -1 already throws a notice and mode 0
already treats them as non-numeric).
c) In PHP 8.0 treat leading whitespace as non-numeric (always).
d) In PHP 8.0 treat trailing characters as non-numeric (always), and remove
the non well-formed distinction (mode -1).

Notably this also affects (int) behavior in that (int) " 42" will be 0 and
(int) "42xyz" will be 0.

A less aggressive alternative would be:

a) In PHP 7.4 throw a deprecation warning in is_numeric_string if there is
leading whitespace (unless mode is 1).
b) In PHP 8.0 treat leading whitespace as non-numeric (unless mode is 1).
c) In PHP 8.0 treat trailing characters as non-numeric (unless mode is 1).
Remove the non well-formed distinction (mode -1).

This would keep the behavior of (int) as-is and only affect implicit numeric
string checks.

This discussion has mostly been around the implicit cases; what do people
think about the desired behavior of (int)?

Regards,
Nikita
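
To make the difference between the two courses of action concrete, here is a rough model of their PHP 8.0 end states. It is written in Python rather than PHP so that it does not depend on any particular PHP version, and it is only a sketch: the names `accepts`, `cast_int`, `"aggressive"` and `"conservative"` are hypothetical, the number grammar is deliberately simplified (sign, digits, optional fraction), the whitespace set is an assumption about what is_numeric_string skips, and reading mode 1 as the mode behind (int) casts is an inference from the email, not something it states outright.

```python
import re

# Simplified grammar (assumption): optional sign, digits, optional fraction.
_NUMBER = re.compile(r"([+-]?\d+)(\.\d*)?")
# Assumed set of leading whitespace characters skipped by is_numeric_string.
_LEADING_WS = " \t\n\r\v\f"


def accepts(s: str, mode: int, plan: str) -> bool:
    """Return True if `s` counts as numeric in the PHP 8.0 end state of `plan`.

    mode: 1 = lenient mode, 0 = strict mode (mode -1 is removed in both plans).
    plan: "aggressive" (first list) or "conservative" (less aggressive list).
    """
    # Only the conservative plan keeps mode 1 exempt from the new rules.
    lenient = plan == "conservative" and mode == 1
    stripped = s.lstrip(_LEADING_WS)
    if stripped != s and not lenient:
        return False  # leading whitespace is non-numeric
    match = _NUMBER.match(stripped)
    if match is None:
        return False  # no numeric prefix at all
    if match.end() != len(stripped) and not lenient:
        return False  # trailing characters are non-numeric
    return True


def cast_int(s: str, plan: str) -> int:
    """Model of (int) $s, assuming the cast goes through mode 1."""
    if not accepts(s, mode=1, plan=plan):
        return 0
    match = _NUMBER.match(s.lstrip(_LEADING_WS))
    return int(match.group(1))  # integer part of the numeric prefix


if __name__ == "__main__":
    for value in (" 42", "42xyz", "42"):
        print(repr(value),
              cast_int(value, "aggressive"),
              cast_int(value, "conservative"))
```

Under this model the first plan gives 0 for both " 42" and "42xyz", matching the (int) examples above, while the less aggressive plan keeps returning 42 for both and only tightens the mode 0 checks.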