Newsgroups: php.internals
Xref: news.php.net php.internals:124964
Date: Fri, 16 Aug 2024 01:02:05 +0200
From: rob@bottled.codes ("Rob Landers")
To: internals@lists.php.net
Subject: Re: [PHP-DEV] [RFC] On the need of a `is_int_string` ?
Message-ID: <8307aee0-f7c2-4e30-9823-ed38e66d0c3a@app.fastmail.com>
List-Id: internals.lists.php.net

On Thu, Aug 15, 2024, at 17:42, Vincent Langlet wrote:
> Hi,
>
> When a string is used as an array key, it is sometimes cast to an int.
> As explained in https://www.php.net/manual/en/language.types.array.php:
> "Strings containing valid decimal ints, unless the number is preceded by a + sign, will be cast to the int type. E.g. the key "8" will actually be stored under 8. On the other hand "08" will not be cast, as it isn't a valid decimal integer."
>
> This behavior causes some issues, especially for static analysis. As an example https://phpstan.org/r/5a387113-de45-4bef-89af-b6c52adc5f69
> vs real life https://3v4l.org/pDkoB
>
> Currently most static analysis tools rely on one or more native PHP functions to describe types.
> PHPStan/Psalm support a `numeric-string` type thanks to the `is_numeric` function.
>
> I don't think there is a native function to know whether a key will be cast to an int. The implementation would be something similar (but certainly better and in C) to
> ```
> function is_int_string(string $s): bool
> {
>     if (!is_numeric($s)) {
>         return false;
>     }
>
>     $a[$s] = $s;
>
>     return array_keys($a) !== array_values($a);
> }
> ```
>
> Which gives:
> is_numeric('08') => true
> ctype_digit('08') => true
> is_int_string('08') => false
>
> is_numeric('8') => true
> ctype_digit('8') => true
> is_int_string('8') => true
>
> is_numeric('+8') => true
> ctype_digit('+8') => false
> is_int_string('+8') => false
>
> is_numeric('8.4') => true
> ctype_digit('8.4') => false
> is_int_string('8.4') => false
>
> Such a function would make it easy to introduce an `int-string` type in static analysis and the opposite, a `non-int-string` one (cf https://github.com/phpstan/phpstan/issues/10239#issuecomment-1837571316).
>
> WDYT about adding an `is_int_string` function then?
>
> Thanks

Hello,

At the risk of bikeshedding, it would probably be better to define it in the `array_*` space, maybe something like `array_key_is_string(string $key): bool`?

As for your function definition, it can be simplified a bit:

    return (($s[0] ?? '') > 0 || (($s[0] ?? '') === '-' && ($s[1] ?? '') > 0)) && is_numeric($s);

I believe that covers all the cases that I could think of:

01, -01, +01, 1, 1.2, -1, -1.2, ~1, ~01

— Rob
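
For comparison, here is a stricter userland check that avoids both the array round trip and `is_numeric`. It is only a sketch under stated assumptions: the name `string_key_casts_to_int` is made up for illustration and is not a proposed API, and the rule it encodes (canonical decimal form, within the platform's int range) is one reading of the manual passage quoted above. Unlike the one-liner, it treats "0" as an int key and rejects "1.2", which passes both the leading-character test and `is_numeric`.

```php
<?php

/**
 * Hypothetical helper (illustrative name, not an existing or proposed function):
 * returns true when PHP would store $key as an int array key.
 */
function string_key_casts_to_int(string $key): bool
{
    // Accept only the canonical decimal form: "0", or an optional "-"
    // followed by a non-zero digit and further digits.
    // This rejects "08", "-01", "+8", "8.4", "-0", "1e3" and padded input.
    // The D modifier stops "$" from matching before a trailing newline.
    if (preg_match('/^(?:0|-?[1-9][0-9]*)$/D', $key) !== 1) {
        return false;
    }

    // Digit strings outside the int range remain string keys. Casting such
    // a string to int cannot reproduce the original text, so the round trip
    // comparison filters them out.
    return (string) (int) $key === $key;
}

var_dump(string_key_casts_to_int('8'));   // bool(true)
var_dump(string_key_casts_to_int('0'));   // bool(true)
var_dump(string_key_casts_to_int('08'));  // bool(false)
var_dump(string_key_casts_to_int('1.2')); // bool(false)
```

A native implementation would presumably reuse the engine's own key normalisation rather than a regular expression; the sketch only aims to make the rule explicit enough to test against real array behaviour.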