Newsgroups: php.internals
From: misterdeviling@gmail.com (Vincent Langlet)
To: Rob Landers
Cc: internals@lists.php.net
Date: Fri, 23 Aug 2024 23:12:55 +0200
Subject: Re: [PHP-DEV] [RFC] On the need of a `is_int_string` ?

I later found a simpler implementation, which relies on array_keys:
```
// Integer-like string keys are cast to int, so check the type of the stored key.
function is_int_string(string $s): bool { return \is_int(array_keys([$s => null])[0]); }
```
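
As a quick sanity check, the sketch below runs that implementation against the four inputs from the table in the original proposal. It assumes PHP 7.1 or later (for the array destructuring in `foreach`); the `$expected` list is only an illustration built from that table.

```php
<?php
declare(strict_types=1);

// The one-liner from this message, written as a regular named function.
function is_int_string(string $s): bool
{
    return \is_int(array_keys([$s => null])[0]);
}

// Expected results copied from the table in the original proposal.
$expected = [
    ['08', false],
    ['8', true],
    ['+8', false],
    ['8.4', false],
];

foreach ($expected as [$input, $want]) {
    $got = is_int_string($input);
    printf(
        "is_int_string(%s) => %s%s\n",
        var_export($input, true),
        var_export($got, true),
        $got === $want ? '' : '  (differs from the table)'
    );
}
```

Because the check asks the engine which key it actually stored, it should follow the array-key rule by construction rather than re-implementing it.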

I thought `is_int_string` fit better in the same namespace as
`is_object`, `is_array`, `is_int`, `is_numeric`, ... but maybe there is a better
term than `int_string` to describe this category of string, since it is not good English (integish? integable? integerable?).
But indeed, it could be interesting to tie this function to the array namespace...

Anyway, this topic does not seem to interest many developers so far ^^'

On Fri, Aug 16, 2024 at 01:04, Rob Landers <rob@bottled.codes> wrote:
On Thu, Aug 15, 2024, at 17:42, Vincent Langlet wrote:
Hi,

When a string is used as an array key, it is sometimes cast to an int.
As explained in https://www.php.net/manual/en/language.types.array.php:
"Strings containing valid decimal ints, unless the number is preceded by a + sign, will be cast to the int type. E.g. the key "8" will actually be stored under 8. On the other hand "08" will not be cast, as it isn't a valid decimal integer."
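
To make the rule concrete, here is a minimal sketch that asks PHP which key type it actually stores for a few strings. `stored_key_type` is a hypothetical helper name, not something from the manual or this thread, and `get_debug_type` assumes PHP 8.0 or later.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: reports the type of the key PHP actually stores
 * when $key is used as an array key.
 */
function stored_key_type(string $key): string
{
    // Build a one-element array and read back the key the engine kept.
    return get_debug_type(array_key_first([$key => true]));
}

foreach (['8', '08', '+8', '8.4', '-5'] as $candidate) {
    printf("%-5s => %s\n", var_export($candidate, true), stored_key_type($candidate));
}
```

This should report `int` for '8' and '-5', and `string` for '08', '+8' and '8.4'.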

This behavior causes some issues, especially for static analysis. As an example, compare https://phpstan.org/r/5a387113-de45-4bef-89af-b6c52adc5f69
with real life: https://3v4l.org/pDkoB

Currently, most static analysis tools rely on one or more native PHP functions to describe types.
PHPStan and Psalm support a `numeric-string` type thanks to the `is_numeric` function.

I don't think there is a native function to know whether a key will be cast to an int. The implementation would be something similar to the following (but certainly better, and in C):
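```
function is_int_string(string $s): bool
{
    // Non-numeric strings are never cast to an int key.
    if (!is_numeric($s)) {
        return false;
    }

    // Store the string under itself; PHP casts integer-like keys to int.
    $a = [$s => $s];

    // The key differs from the value only when it was cast to int.
    return array_keys($a) !== array_values($a);
}
```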

Which gives:
is_numeric('08') => true
ctype_digit('08') => true
is_int_string('08') => false

is_numeric('8') => true
ctype_digit('8') => true
is_int_string('8') => true

is_numeric('+8') => true
ctype_digit('+8') => false
is_int_string('+8') => false

is_numeric('8.4') => true
ctype_digit('8.4') => false
is_int_string('8.4') => false

Such a function would make it easy to introduce an `int-string` type in static analysis, as well as its opposite, a `non-int-string` type (cf. https://github.com/phpstan/phpstan/issues/10239#issuecomment-1837571316).

WDYT about adding an `is_int_string` function then?

Thanks

Hello,
At the risk of bikeshedding, it would probably be better to define it in the `array_*` space, maybe something like `array_key_is_string(string $key): bool`?

As for your function definition, it can be simplified a bit:

return (($s[0] ?? '') > 0 || (($s[0] ?? '') === '-' && ($s[1] ?? '') > 0)) && is_numeric($s);

I believe that covers all the cases that I could think of:

01, -01, +01, 1, 1.2, -1, -1.2, ~1, ~01
— Rob
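
One way to check a candidate like the expression above is to compare it, input by input, with what the engine actually does to the key. The sketch below is only that: `key_is_cast_to_int` and `simplified_check` are hypothetical names introduced here, the inputs are the ones listed in the reply, and PHP 8 comparison semantics are assumed.

```php
<?php
declare(strict_types=1);

// Ground truth: ask PHP which key type it actually stores.
function key_is_cast_to_int(string $s): bool
{
    return \is_int(array_key_first([$s => null]));
}

// The simplified expression from the reply, wrapped in a hypothetical name.
function simplified_check(string $s): bool
{
    return (($s[0] ?? '') > 0 || (($s[0] ?? '') === '-' && ($s[1] ?? '') > 0)) && is_numeric($s);
}

// The cases listed in the reply.
$cases = ['01', '-01', '+01', '1', '1.2', '-1', '-1.2', '~1', '~01'];

foreach ($cases as $case) {
    $truth = key_is_cast_to_int($case);
    $candidate = simplified_check($case);
    printf(
        "%-6s cast to int: %-5s simplified: %-5s%s\n",
        var_export($case, true),
        var_export($truth, true),
        var_export($candidate, true),
        $truth === $candidate ? '' : '  <- differs'
    );
}
```

Tracing the expression by hand suggests the decimal inputs ('1.2', '-1.2') pass the simplified check although they stay string keys, so the comparison is worth running before relying on it.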